Refined model of the covariance/correlation matrix between securities
DOCTORAL THESIS
of the Université Paris XIII - Sorbonne Paris Cité
École Doctorale Érasme - Sciences économiques et de gestion
CEPN (UMR-CNRS 7234)
M. Sébastien VALEYRE
Modélisation fine de la matrice de covariance/corrélation des actions
Thesis presented and defended on 29 May 2019, at 10 a.m., at the Maison du Poumon, 66 boulevard Saint-Michel, Paris.

Jury:
- M. Jean-Luc Prigent, Professor, Université de Cergy-Pontoise, President of the jury
- M. Yannick Malevergne, Professor, Université Paris 1, Referee
- M. Bertrand Maillet, Professor, EM Lyon, Referee
- M. Jean-Michel Courtault, Professor, Université Paris 13, Examiner
- Mme Isabelle Rivals, Associate Professor, E.S.P.C.I., Examiner
- M. Sofiane Aboura, Professor, Université Paris 13, Thesis supervisor

Abstract

A new method is introduced to denoise the correlation matrix of stock returns, based on a constrained principal component analysis exploiting financial data. Portfolios, named "Fundamental Maximum Variance portfolios", are built to optimally capture a style of risk defined by a financial criterion ("Book", "Capitalization", etc.). The constrained eigenvectors of the correlation matrix, which are linear combinations of these portfolios, are then studied. Thanks to this method, several stylized facts of the matrix have been identified, among which: (i) the increase of the leading eigenvalues with the time scale, from 1 minute to several months, seems to follow the same law for all significant eigenvalues, with two regimes; (ii) a "universal" law seems to govern the composition of all the "Maximum Variance" portfolios.
According to this law, the optimal weights would be directly proportional to the stocks' rank under the financial criterion studied; (iii) the volatility of the volatility of the "Maximum Variance" portfolios, which are not orthogonal, would suffice to explain a large part of the diffusion of the correlation matrix; (iv) the leverage effect (the increase of the first eigenvalue when the market drops) exists only for the first mode and does not generalize to the other risk factors. The leverage effect on the betas, the sensitivities of stocks to the "market mode", makes the weights of the first eigenvector time-varying.
Keywords: correlation, filtering, constrained diagonalization, multifactor model, optimal portfolios, asset management, diffusion

To my wife Léa, to my daughter Gabrielle, to my parents Françoise and Dominique
Acknowledgments
I wish to pay tribute here and express my deep gratitude to all those who, directly or indirectly, contributed to the completion of this thesis. My thanks go first of all to my thesis supervisor, Professor Sofiane Aboura. Throughout this work, he offered me availability, attentiveness, trust, and precious, insightful advice matching his competence and human qualities. I also extend all my thanks to Professors Bertrand Maillet and Yannick Malevergne for the honor they did me by agreeing to referee this thesis, as well as to Professors Jean-Michel Courtault, Jean-Luc Prigent and Isabelle Rivals for the honor they did me by agreeing to serve as examiners. I also thank my colleagues, the research team of John Locke: Denis Grebenkov, Stanislav Kuperstein, Maxime Beucher, Edouard Limouse and Frédéric Herbette. Without them, this work would probably never have seen the light of day; their help and support were decisive. My thoughts also go to Thomas Dionysopoulos, with whom, following our discussions in January 2015, I began to consider the opportunity of publishing articles on the correlation matrix. Finally, on a more personal level, the love I have for my wife, Léa, and my daughter, Gabrielle, allowed me to overcome the difficult stages. I also think of the people who were decisive in my professional retraining, 12 years ago, from nuclear energy to market finance, of which this thesis is one of the milestones. I remember in particular my discussion with Professor Jean-Philippe Bouchaud, in May 2006, in the Jardin du Luxembourg, which was a crucial moment.
He was very kind and gave me good advice: I should absolutely not resume my studies in derivatives, stochastic calculus and differential equations; on the contrary, I should direct my studies toward research, empiricism and market inefficiencies, a subject that was not yet very fashionable before the financial crisis. I remember, through his clear-sightedness, that he had already anticipated the 2007-2008 subprime crisis, telling me that in his view the assumptions underlying the valuation of complex financial derivatives had become surrealistic. I have always been impressed by his simplicity, availability and kindness, given his exceptional success. I remember the many days I spent at that time reading his book and his numerous research articles, which allowed me to form my first intuitions about how financial markets work. I also have fond thoughts for Professors Jacques Prost and Pierre-Gilles de Gennes, just as unassuming, benevolent and exceptional, who, probably by recommending me, made it possible for Université Dauphine to accept me as a student, at 28, with my atypical background, into its master's program in finance. I also think of Joël Benarroch and François Bonnin, who taught me the profession of discretionary and systematic fund manager, and of the only professors at ESPCI who truly believed in me, Isabelle Rivals and Léon Personnaz, who trained me in statistics for a year in their laboratory; they had a great influence on my work. I also think of my closest friends Pierre, Alexandre, Charles-Henri, Olivier, Emmanuel and Antoine, and of my parents and my brothers and sisters, who believed in me and always supported me in this professional retraining; I thank them as well.

Foreword
This thesis began in January 2016 in the CEPN laboratory (UMR 7234, CNRS) of Université Paris-XIII. The research work was carried out at John Locke Investments, an independent, human-scale asset management company (15 employees), for which I continued to work full time as a researcher and as manager of the systematic funds John Locke Equity Market Neutral and John Locke Smart Equity. I was thus able to draw on my hands-on experience of financial markets to adapt my models to reality, and I had to concentrate on models of definite interest for asset management and for the two funds I manage. Modeling the correlation matrix of stocks is key at John Locke Investments: the optimal portfolios for trend following rely solely on the exploitation of the correlation matrix, which must be mastered, cleaned, inverted and modeled very carefully in order to amplify weak autocorrelations into robust performance. The skills of some managers may thus very well be limited to the proper modeling of the correlation matrix. The research papers also had to be practical and to provide intellectual support for convincing the funds' clients of the scientific foundation of my management models.

Three papers were presented at the AFFI conferences of 2016 (Liège, Belgium), 2017 (Valence) and 2018 (Paris), and a fourth will be presented at the June 2019 conference (Laval, Quebec):
- the paper "Emergence of Correlation between Securities at Short Time Scales" was presented at the 35th International Conference of the French Finance Association at ESCP in Paris, 20-24 May 2018.
It is presented first, in Chapter 1, because it explains the physical origin of the correlations between stocks;
- the paper "Fundamental Market Neutral Maximum Variance Portfolios" was submitted in January 2019 to the 36th International Conference of the French Finance Association in Laval, Quebec. The paper justifies the methodology used in the thesis to denoise the correlation matrix. It straddles several specialties (factor models, random matrices, Asset Pricing) and must be restructured and split into several projects to be publishable. It is presented in Chapter 2;
- the paper "The Reactive Beta Model" was presented at the 34th International Conference of the French Finance Association in Valence, 31 May and 1-2 June 2017. It is presented in Chapter 4;
- the paper "Should Employers Pay Better their Employees? An Asset Pricing Approach" was presented at the 33rd International Conference of the French Finance Association at HEC-Management School of the University of Liège, 23-25 May 2016. It is presented in the manuscript, in Chapter 6, as an application.

Table of Contents
I General Introduction
II Doctoral Dissertation
III General Conclusion

General Introduction

1. Introduction

The correlation matrix of stock returns is necessary for the risk analysis of a portfolio. A refined model is required to build robust optimal portfolios (maximizing the potential gain while minimizing the risk). Empirical measurements of the correlation matrix are noisy, because too few independent and homoscedastic returns are available for too many stocks. It is thus common to have to measure the correlations between 500 stocks or more with much less than one year of history¹, so as to be able to assume that the correlations remain roughly constant over the period. The sample shrinks further when one considers the correlations of monthly or even annual returns, which matter most to investors. Return autocorrelations are weak but sufficient to deform the matrix depending on the time horizon and to transform risk factors.
¹ This corresponds to fewer than 250 daily returns, which are only very approximately Gaussian.

The first eigenvector, the "market mode", carries a very large eigenvalue². The other eigenvectors have much smaller eigenvalues (below 20) and represent "long/short", "market neutral" portfolios, at first rather sectoral and then rather style-based. However, the instability of the correlation matrix, combined with measurement noise, makes the measured "long/short" eigenvectors hard to interpret. Thus, on the one hand, one should shorten the window over which the correlation is measured in order to hope for some stability of the correlations over the measurement period, but, on the other hand, one should lengthen the period and increase the frequency in order to reduce measurement noise. To filter the measurement noise inherent in principal component analysis, I constrained the analysis to the subspace of the principal risk factors.
² Close to the squared mean correlation (of order 0.4) multiplied by the number of stocks (of order 500 in my case).

2. Literature Review
This research work is organized around six relatively compartmentalized disciplinary fields, spanning finance, economics, econophysics and applied mathematics.
A good estimation of the correlations of stock returns is necessary for the risk analysis of a portfolio and for its optimization. A good understanding of the current or potential temporal variations of the correlations is also critical for managing a portfolio, in particular a "market neutral" fund that uses high financial leverage and for which style arbitrage is one of the drivers of performance. Style arbitrage is a trading strategy that consists in investing in the currently rewarding management styles. For example, if the management style that consists in buying small capitalizations and selling large capitalizations is profitable, the strategy will buy the small capitalizations and sell the large ones; in the opposite case, the strategy will buy the large ones and sell the small ones. More than 240 profitable management styles or risk factors have been published in the scientific literature described in Section 2.5. The value of market timing is not consensual (Lee (2017); Bender et al. (2018); Bass et al. (2017)), and some prefer to simply benefit from diversification. DeMiguel et al. (2017) show that in practice it is enough, to build a portfolio, to select 15 significant financial criteria out of more than 100. Market-timing strategies can be complex; they rely on forecasting models. Hodges et al. (2017) look for predictors of the factors in different economic regimes and market conditions. They find that using a combination of indicators on the business cycle, valuation, trend and dispersion would be more effective than using individual indicators. Dichtl et al. (2018) thus build a "long/short" portfolio using the parameter-optimization method introduced by Brandt et al.
(2009), using several valuation and trend indicators, and they show that market timing outperforms a portfolio invested equally across the management styles whose risk premia are positive. The fragility of these results comes from the risk of overfitting. Moreover, such strategies can in general suffer from correlation shocks between the different management styles, which can occur and generate volatility spikes. Quantitative strategies that are usually uncorrelated can thus suddenly become strongly correlated. This happened on 8-9 August 2007, when most style-arbitrage funds abruptly suffered very significant losses at the same time (Stein (2009)). During this event, later named the "quant crash", most of the affected funds employed quantitative "market neutral" strategies with no market exposure, which calls their "market neutral" status into question (Khandani and Lo (2011)). It appears that too many managers were invested in the same "crowded" strategies with too much leverage, and that they all tried to reduce their positions at the same time on the same signal. Such "crowding" risks affect a wide variety of strategies, such as the "Momentum" style (buying the stocks that have outperformed and selling the stocks that have underperformed), because they do not depend on independent estimates of the fundamental values of companies (Hong and Sraer (2016); Stein (2009)). Hundreds of billions of dollars are also managed directly using the Mean-Variance optimization introduced by Markowitz (1952), preferring to rely on simple assumptions about expected returns. The added value of managers would then come only from a better-adapted model of the correlation matrix and from a good ability to execute orders while minimizing market impact.
The "Min Variance" portfolio thus assumes that the expected returns are all identical and that the correlation matrix can be modeled simply, for example with a one-factor model (Clarke et al. (2013)). The "Max Diversification" portfolio introduced by Choueifaty and Coignard (2008) assumes that the expected returns are proportional to risk. These two portfolios require inverting the correlation matrix, which can be problematic if the matrix is not properly modeled. The "Equal Risk Contribution" portfolio introduced by Maillard et al. (2010) is less sensitive to measurement noise and is therefore more robust, but is no longer theoretically optimal. Benichou et al. (2017) introduce the "Agnostic Risk Parity" portfolio: the expected returns are no longer necessarily positive but depend on past returns. The portfolio then depends on the inverse of the square root of the correlation matrix multiplied by signals representing technical trend indicators. This "trend following" portfolio allocates the same risk to each eigenvector of the correlation matrix.
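As a minimal illustration of why these constructions require a properly modeled, well-conditioned matrix, the closed-form minimum-variance weights can be sketched as follows (a toy example with an invented 3-stock covariance matrix, not the portfolios studied in this thesis):

```python
import numpy as np

# Toy covariance matrix for 3 stocks (illustrative values only).
sigma = np.diag([0.2, 0.25, 0.3])        # volatilities
corr = np.array([[1.0, 0.5, 0.3],
                 [0.5, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])
cov = sigma @ corr @ sigma

# Minimum-variance portfolio: w proportional to cov^{-1} 1, normalized to sum to one.
ones = np.ones(3)
w = np.linalg.solve(cov, ones)
w /= w.sum()

port_var = w @ cov @ w                   # variance of the optimal portfolio
eq_var = (ones / 3) @ cov @ (ones / 3)   # equal-weight benchmark variance
print(w, port_var <= eq_var)
```

The solve step makes the point in the text concrete: any noise in the off-diagonal entries is amplified by the inversion and directly distorts the weights.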
Today the performances of the different management styles and of the sectors are closely followed by all market participants, who are no longer content with a binary view (will the market go up or down?) and feed on themselves through a herding effect well described in the literature (Guedj and Bouchaud (2005); Michard and Bouchaud (2005); Cont and Bouchaud (2000); Wyart and Bouchaud (2007); Lux and Marchesi (1999)). Thus, when a given management style falls, participants sell it at the same time and accentuate its fall. The slightest macroeconomic news impacts the indices but also the other risk factors. When the U.S. Federal Reserve announces that it is ready to raise interest rates, the leverage factor (selling indebted stocks, buying lightly indebted stocks) will be traded first, and other factors will then follow. Benzaquen et al. (2017) highlight the link between trading and the correlation matrix, starting from microstructure and the cross impact of transactions on prices: correlations would describe nothing but the interaction between stocks through the play of traders. Correlations are also known to increase with the time scale: monthly returns are more correlated than daily returns, which are more correlated than 1-minute returns (Epps (1979)). Bouchaud and Potters (2018) had already proposed, in their "some open problems" section, a lead to explain the dependence of correlations on frequency (the returns of stock i do not impact the returns of stock j instantaneously but with some delay), but had not developed it.
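The lagged-impact intuition can be illustrated with a two-asset toy simulation (entirely invented parameters, not the thesis model): asset j reacts to asset i with a one-step delay, so the correlation measured on aggregated returns grows with the aggregation scale, as in the Epps effect:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000                        # number of elementary (e.g. 1-minute) returns
r_i = rng.standard_normal(T)
noise = rng.standard_normal(T)
# Asset j responds to asset i with a one-step lag (toy lead-lag impact).
r_j = 0.5 * np.roll(r_i, 1) + noise
r_j[0] = noise[0]

def corr_at_scale(x, y, q):
    """Correlation of returns aggregated over blocks of q elementary steps."""
    n = (len(x) // q) * q
    xa = x[:n].reshape(-1, q).sum(axis=1)
    ya = y[:n].reshape(-1, q).sum(axis=1)
    return np.corrcoef(xa, ya)[0, 1]

for q in (1, 5, 20):
    print(q, corr_at_scale(r_i, r_j, q))
```

The contemporaneous (q = 1) correlation is near zero, while the lagged covariance is progressively captured as q grows, reproducing qualitatively the increase of correlations with the time scale.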
Since the seminal article of Markowitz (1952), "Mean Variance" optimization has become a rigorous method for constructing an investment portfolio. Two fundamental ingredients are needed: the expected return of each stock and the covariance matrix of the returns. The estimation of the covariance matrix has always been an important topic. The basic method simply aggregates historical returns and computes their historical covariances. Unfortunately, this creates well-documented problems (Jobson and Korkie (1980)). To explain it simply, when the number of stocks is large compared to the number of available observations, which is generally the case, the historical correlation matrix contains many errors. This implies that the most extreme coefficients take extreme values not because of reality but because of extreme errors. Portfolio optimizations will invariably place their biggest bets on these extreme errors, making the optimization extremely unreliable. Michaud (1989) calls this phenomenon "error-optimization". Alternatively, one can consider a heavily constrained estimator, such as the "single-factor model" of Sharpe (1963). These estimators of the correlation matrix contain few estimation errors, but on the other hand a lot of specification error and bias. An alternative is "Shrinkage", which consists of a mixture between the unconstrained estimate and the constrained estimate (Ledoit and Wolf (2003, 2012)). The APT ("Arbitrage Pricing Theory") of Ross (1976) generated growing interest in multifactor models.
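The shrinkage idea can be sketched as a convex combination of the noisy sample correlation matrix and a rigid target (here the identity; the mixing weight delta is fixed by hand for illustration rather than estimated optimally as in Ledoit and Wolf):

```python
import numpy as np

rng = np.random.default_rng(1)
n_stocks, n_obs = 50, 60            # many stocks relative to the observation count

returns = rng.standard_normal((n_obs, n_stocks))
sample = np.corrcoef(returns, rowvar=False)   # noisy unconstrained estimate
target = np.eye(n_stocks)                     # heavily constrained estimate
delta = 0.5                                   # hand-picked shrinkage intensity

shrunk = (1 - delta) * sample + delta * target

# Shrinkage pulls the extreme eigenvalues toward the center,
# improving the conditioning of the matrix before it is inverted.
cond_sample = np.linalg.cond(sample)
cond_shrunk = np.linalg.cond(shrunk)
print(cond_sample, cond_shrunk)
```

The mixture trades a little bias (from the rigid target) against a large reduction in estimation error, which is exactly the compromise described above.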
The asset-management industry standard is thus to use multifactor models. A few companies, such as APT, Barra and Axioma (Barra (1998)), which have become unavoidable in the asset-management industry, offer their clients covariance matrices better suited to portfolio optimization. These companies were accused of being at the origin of the 2007 "quant crash", already mentioned in Section 2.1, because they favored "crowding" by providing the same risk factors to all managers. Their methods rely on fundamental multifactor models combining around fifty sectoral factors and other risks. These factors use the returns of portfolios associated with observable financial criteria such as the "Dividend Yield", the "Book to Market" ratio or sector membership. Another approach is to use statistical factors from principal component analysis, described in Section 2.4, with a total number of factors of the order of 5. Connor (1995) shows that "fundamental" multifactor models explain 42% of returns (R² = 42% being the explanatory power of the model), whereas a simple principal component analysis with 5 factors already explains 39%. Connor (1995) sorts the factors by explanatory power: the sectors increase the R² by 18%, then the "Low Volatility" factor (close to the "Low Beta" factor) increases it by 0.9%, then the "Momentum", "Capitalization", "Liquidity", "Growth" and "Earning" factors each add less than 0.8%. There remain, in decreasing order of importance, rather minor factors: "Book to Market", "Earning Variability", "Leverage", foreign investment, labor cost and finally the "Dividend Yield". However, the selection of the necessary factors and the choice of their number have been the subject of many controversies (Roll and Ross (1980, 1984); Dhrymes et al.
(1984); Luedecke (1984); Trzcinka (1986); Conway and Reinganum (1988); Brown (1989)). Connor and Korajczyk (1993) propose a simple methodology to estimate the number of significant factors: if adding a factor does not significantly reduce the squared residual, the factor is not considered significant. Most academic studies rely on a historical analysis since 1967, exploiting the database of the Center for Research in Security Prices, which mainly gathers the stocks listed on the New York Stock Exchange since 1926.

Principal component analysis (PCA) originates in an article by Karl Pearson published in 1901. Also known as the Karhunen-Loève transform or the Hotelling transform, PCA was developed again and formalized in the 1930s by Harold Hotelling. The mathematical power of the American economist and statistician also led him to develop canonical analysis, a generalization of the factor analyses to which PCA belongs. Its fields of application are now numerous, ranging from biology to economic and social research and, more recently, image processing. Random matrix theory, under which the distribution of the eigenvalues obtained by PCA follows the Marchenko-Pastur law for large matrices, models the measurement noise of the correlations and shows that the small eigenvalues below a critical eigenvalue are underestimated and are not significant (Laloux et al. (1999); Plerou et al. (1999, 2002); Potters et al. (2005); Wang et al. (2011)). Bun et al.
(2016) apply a theoretical method introduced by Ledoit and Péché (2011), which they call the "Rotationally invariant estimator", to debias the empirical eigenvalues in a continuous way, and they show that the method seems more robust than the "Clipping" or "Shrinkage" filters, which are well documented by Ledoit and Wolf (2004, 2003). Allez and Bouchaud (2012) model the impact of noise on the first eigenvector and show that it rotates slightly around a fixed vector; the rotation angle depends on the ratio between the first eigenvalue and the others. Applying this model to the other eigenvectors, one understands that they also rotate around fixed axes, but with a much larger rotation angle: they are thus very noisy, which explains the difficulty of interpreting them. PCA with a linear constraint is an alternative to the filters derived from random matrix theory for eliminating measurement noise, and has been fully solved for a long time (Golub (1973)). In this case the constrained eigenvectors all belong to the solution subspace of the constraint: the constrained eigenvectors are simply the eigenvectors of a matrix that has been reduced and denoised. The whole difficulty is to define the factors forming the constrained subspace so that the constraints mainly impact only the noise of the eigenvalues. For this, one can draw inspiration from the Asset Pricing literature described in Section 2.5 and from the multifactor models described in Section 2.3.
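A minimal sketch of the "Clipping" filter mentioned above: eigenvalues below the Marchenko-Pastur upper edge (1 + sqrt(N/T))² are treated as noise and replaced by their average so that the trace is preserved (toy data made of pure noise, with invented sizes N and T):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 400                                # stocks, observations
returns = rng.standard_normal((T, N))          # pure-noise returns

corr = np.corrcoef(returns, rowvar=False)
eigval, eigvec = np.linalg.eigh(corr)          # ascending eigenvalues

# Marchenko-Pastur upper edge for a pure-noise correlation matrix.
lam_plus = (1 + np.sqrt(N / T)) ** 2

signal = eigval > lam_plus
# Replace the "noise" eigenvalues by their mean so the trace (= N) is preserved.
cleaned = eigval.copy()
if (~signal).any():
    cleaned[~signal] = eigval[~signal].mean()

corr_cleaned = eigvec @ np.diag(cleaned) @ eigvec.T
print(signal.sum(), np.trace(corr_cleaned))
```

On pure noise almost no eigenvalue exceeds the edge, so nearly the whole spectrum is flattened; on real stock returns the market mode and a few factors would survive the cut.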
Fama (1965) arrived at the efficient market theory, according to which prices follow random walks. Sharpe (1964) then derived the CAPM from more or less realistic assumptions, such as the absence of transaction costs and the rationality of investors. According to the CAPM, expected returns should theoretically be proportional to beta, the only risk that is not diversifiable and that must be rewarded. Since 1970, various anomalies with respect to this theory have been observed. The classical factors of Fama and French (1992, 1993) are invested long in the top 20%, according to the financial criterion studied, and short in the bottom 20%. These factors can capture an anomaly with respect to the efficient market theory if they generate gains significantly different from zero. The top 20% / bottom 20% construction is clearly suboptimal, according to Asness et al. (2013), but paradoxically remains the reference in the Asset Pricing field. The Fama and MacBeth (1973) regression is the most widely used method for detecting anomalies with respect to the CAPM. Several models have been developed to provide an economic interpretation of the many anomalies and to improve the CAPM. Fama and French (1993) proposed a three-factor model for the expected returns. Harvey and Liu (2018) listed 316 potential factors supposed to capture an anomaly, drawn from 313 articles since 1967. According to them, most factors may be the fruit of data mining and would not be robust. Most of these factors overlap, which is why some twenty may suffice, but there is no consensus on the significance level required to characterize the anomalies. Academic works first retained financial criteria such as "Capitalization", the "Price Earning Ratio", "Cash Flow", "Book to Market", growth and "Momentum".
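The top 20% / bottom 20% construction of Fama and French can be sketched in a few lines (random scores here stand in for a real financial criterion such as the Book-to-Market ratio):

```python
import numpy as np

rng = np.random.default_rng(3)
n_stocks = 100
score = rng.standard_normal(n_stocks)   # stand-in for e.g. the Book-to-Market ratio

order = np.argsort(score)
k = n_stocks // 5                       # quintile size (20%)

w = np.zeros(n_stocks)
w[order[-k:]] = 1.0 / k                 # long the top 20% by the criterion
w[order[:k]] = -1.0 / k                 # short the bottom 20%

# The factor is dollar-neutral: the long and short legs offset exactly.
print(w.sum(), (w > 0).sum(), (w < 0).sum())
```

The factor return at each date is then simply the dot product of these weights with the cross-section of stock returns; a mean significantly different from zero signals an anomaly.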
For example, small-capitalization stocks tend to outperform (Banz (1981)); average volume seems more adequate than size for Ciliberti et al. (2017). Another important anomaly is the "Value" premium: "Value" companies tend to outperform growth companies (Fama and French (1998)). Profitability, close to "Cash Flow", is also a significant explanatory variable of expected returns (Fama and French (2015)). The "Low Volatility" and "Low Beta" anomalies have also been revealed (Jordan and Riley (2013); Fu (2009); Ang et al. (2006)). The most popular anomaly remains "Momentum": stocks that have outperformed tend to keep outperforming (Jegadeesh and Titman (1993)). The anomalies are directly exploited in asset management, whose strategies are described in Section 2.1. Asness et al. (2013) thus explain that a basic and very popular investment strategy, simply allocated in part to "Momentum" and to the "Value" anomaly, achieves an in-sample Sharpe ratio above 1. The financial theories justifying such alternative risk premia (lack of liquidity, asymmetry) are called into question, because the anomalies tend to disappear once published. McLean and Pontiff (2016) offer several alternative explanations: the in-sample bias, with the problem of over-optimization, or the adaptation of markets. To my knowledge, no study has yet focused on evidencing the autocorrelations of risk-factor returns, which could constitute a more subtle and more robust inefficiency of financial markets. One explanation is that the autocorrelations are too difficult to characterize significantly. Articles exist, but the significance and robustness of their results are not convincing.
Hodges et al. (2017) thus look for predictors of the factors in different economic regimes and market conditions. They find that using a combination of indicators on the business cycle, valuation, trend and dispersion would be more effective than using individual indicators.
The instability of the population correlation matrix was first modeled with diffusion models in order to price derivatives (Possamai and Gauthier (2011)). The theoretical models were fitted to recover derivative prices without seeking to know the true dynamics of the empirical correlation matrix, because the latter is hard to measure with the desired precision. ARCH models were initially developed to describe the heteroscedasticity of inflation variations (Engle (1982)), but were then used to model the dynamics of stock volatility in order to price options (Duan (1995)). "Dynamic Conditional Correlation" models (DCC GARCH, Engle (2002, 2016)) extended the one-dimensional GARCH model and were developed to model the dynamics of correlations and volatilities. In the same way, the process introduced by Cox et al. (1985), which is very popular in finance for describing the dynamics of interest rates and of stock volatility for derivative pricing, was also extended from the Feller diffusion to model the dynamics of covariances: Wishart processes generalize the Feller diffusion to several dimensions. Gourieroux (2006) thus introduces a mean-reversion term into the Wishart process, making it stationary, and generalizes the Cox et al. (1985) process. Da Fonseca et al. (2007) generalize in the same way the Heston (1993) model to price multi-asset options. A Wishart process can be seen as the square of Brownian motions or, in its stationary version, of Ornstein-Uhlenbeck processes. Cuchiero et al. (2011) analyze the foundations of continuous affine stochastic processes on the universe of covariance matrices, motivated by the use of such models to price multi-asset options or to describe default intensities.
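As a one-dimensional sketch of the Feller/Cox-Ingersoll-Ross diffusion dX = kappa (theta - X) dt + sigma sqrt(X) dW that the Wishart process generalizes, one can simulate a path with a full-truncation Euler scheme (all parameters invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
kappa, theta, sigma = 2.0, 0.04, 0.2   # mean-reversion speed, long-run level, vol-of-vol
x0, dt, n_steps = 0.04, 1 / 252, 50_000

x = np.empty(n_steps + 1)
x[0] = x0
for t in range(n_steps):
    xp = max(x[t], 0.0)                # full truncation keeps the square root defined
    x[t + 1] = (x[t] + kappa * (theta - xp) * dt
                + sigma * np.sqrt(xp * dt) * rng.standard_normal())

# Mean reversion keeps the path fluctuating around theta
# (the discretized path may dip marginally below zero near the boundary).
print(x.mean())
```

The mean-reversion term kappa (theta - X) is precisely what Gourieroux (2006) adds in the matrix-valued (Wishart) setting to make the covariance dynamics stationary.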
Bru (1991) derives the stochastic equations describing the dynamics of the matrix and of its eigenvalues. Other random matrices are also much studied, such as Gaussian matrices, whose eigenvalue distribution follows Wigner's semicircle law. Ahdida and Alfonsi (2013) study random correlation matrices through the Wright-Fisher diffusion to model stock correlations. Algorithms have also been implemented to generate random walks among rotation matrices; this makes it possible to describe the diffusion of the eigenvectors of the correlation matrix. Kac's (1959) random walk is a rather efficient algorithm, but it contains no mean reversion, so that after some time the matrix no longer bears any link to the initial matrix. Other rather subtle phenomena, such as the leverage effect, remain poorly modeled in the literature. Asymmetric versions of DCC GARCH-type models have thus been developed to account for the leverage effect. Despite a substantial literature on the leverage effect (when prices fall, volatility rises, according to Black (1976); Christie (1982); Campbell and Hentschel (1992); Bekaert and Wu (2000); Bouchaud et al. (2001)), none addresses the reality and complexity of the phenomenon well described in Bouchaud et al. (2001). Numerous papers report that the betas, the sensitivities of stock prices to index variations, can vary (Blume (1971); Fabozzi and Francis (1978); Jagannathan and Wang (1996); Fama and French (1997); Bollerslev et al. (1988); Lettau and Ludvigson (2001); Lewellen and Nagel (2006); Ang and Chen (2007)), without establishing a precise relation between the leverage effect and the increase in betas. Highly leveraged stocks are more exposed to an unstable beta (Galai and Masulis (1976); DeJong and Collins (1985)).
Properly accounting for the variability of betas is also important for testing asset pricing models. Bali et al. (2017) thus claim that once betas are correctly estimated with a DCC GARCH model, the "Low Beta" anomaly disappears and the CAPM is finally verified empirically (the expected return would indeed be proportional to the beta when the latter is well measured).

4. Main contributions
The research focused on six specific topics, resulting in six draft articles. The main contributions for each of the six topics are the following:

– "Emergence of Correlation of Securities at Short Time Scales" (chapter 1): the article introduces a multi-factor lead-lag model that reproduces fairly faithfully the measured scale effect on the eigenvalues. The model is inspired by the impact model of Kyle (1985). It assumes that transactions on the risk factors impact stock prices with some delay. I derive, under certain assumptions, a simple formula describing the dependence of the eigenvalues on the time scale. The formula contains two parameters for each eigenvalue: the asymptotic eigenvalue and a relaxation time of the order of 1 minute, reflecting an average delay of the order of a few minutes between stocks and risk factors. Correlations thus emerge starting from one minute. However, this delay of a few minutes still impacts the eigenvalues of the correlation matrix of returns at 20 minutes and beyond, because of a power law explained by a relatively subtle mechanism, even though the phenomenon saturates. The article therefore identifies a significant market inefficiency, which could generate gains in the theoretical case of zero transaction costs.

– "The Fundamental Market Neutral Maximum Variance Portfolios" (chapter 2): the article introduces the "FCL" of a portfolio (the ratio between the variance of the portfolio and the variance it would have if the correlations between stocks were zero). The "FCL" is a concept close to eigenvalues and has the advantage of applying not only to eigenvectors but to any risk factor. The "FCL" is arguably an ideal measure to characterize the significance of a risk factor.
I also introduce the "fundamental Maximum variance" portfolio, which optimizes the "FCL" and can be interpreted as an eigenvector of the correlation matrix under the constraint of best capturing a given style defined by a financial criterion. I show that the optimal weights depend directly on the rankings of the stocks according to this criterion and follow a single universal law that applies to all financial criteria. I show that this optimization makes it possible to best replicate the correlation matrix, as well as its dynamics, from a few factors, by filtering out the noise. I relate the different "FCL"s, the constrained eigenvalues and the empirical eigenvalues. Finally, I show that the principal eigenvectors of the correlation matrix load on the factors with the highest "FCL"s. The "FCL"s are volatile and are well modeled by Ornstein-Uhlenbeck processes with a relaxation time of 60 days. The composition of the eigenvectors is therefore highly variable, which explains why their interpretation is difficult, with the exception of the first one. I also show, under certain assumptions, that the Sharpe ratio of the "maximum variance" portfolios is theoretically optimal. The results of this chapter were obtained in collaboration with Stanislav Kuperstein.

– "Time Scale Effect on Correlation at Long Time Horizon" (chapter 3): the article describes a subtler but more robust form of financial market inefficiency than the gaps between unconditional expected returns and the predictions of the CAPM. It concerns the autocorrelation of the returns of risk factors, which is explained by the illiquidity of financial markets and by the herding behavior of investors, who tend to buy products that have performed well.
This autocorrelation, which is not described in the literature, makes the eigenvectors and eigenvalues of the correlation matrix sensitive to the time scale.

– "The Reactive Beta Model" (chapter 4): the article describes a model of systematic leverage (correlations increase when the index falls), specific leverage (a stock's beta increases when it underperforms) and elasticity (when relative volatility increases, the beta increases). It turns out that a large part of the variability of betas is explained by these phenomena. The approach of normalizing the returns to correct for these small phenomena reduces the bias of certain factors ("Momentum" and "Low Beta") compared with a direct linear regression on the returns. Empirical tests show the superiority of the model over simple linear regression. Monte Carlo simulations also show the advantage of such a model over robust methods such as quintile regressions and symmetric or asymmetric DCC GARCH-type models. I show that my model seems best suited to the reality of markets, as it was designed to fit well-characterized and well-measured phenomena.

– "The Model of Diffusion of the Correlation between Securities" (chapter 5): the article identifies a few stylized facts characterizing the diffusion of the empirical eigenvectors of markets. The correlation of the eigenvectors of the matrix at time t, measured with the matrix at time t + τ, increases very slightly with τ. I study the distribution of the eigenvalues of the increments of the correlation matrix, which differs from Wigner's semicircle law and from the distribution resembling a pointed hat.
Standard stochastic equations (Wright-Fisher, Feller) that simulate the correlation matrix directly, as well as other simple methods simulating random trajectories of the rotation matrix around the identity with a mean-reversion term to model the diffusion of the eigenvectors, fail to reproduce the empirical distribution of the eigenvalues. The diffusion of the FCLs, defined in chapter 2, generates this distribution in a simple way. The results of this chapter were obtained in collaboration with Stanislav Kuperstein.

– "Should Employers Pay Better their Employees? An Asset Pricing Approach" (chapter 6): the compensation factor is identified as a significant common risk factor thanks to its measured "FCL", which is significantly greater than 1. The factor is thus as significant as the "Book" factor of Fama and French. The compensation factor also reveals a weak market anomaly: firms that pay their employees well share a common risk and tend to outperform the others. The article questions the methodology of Fama and French, which would not be fine enough to characterize such an anomaly. It thus seems very important to keep the factor beta-neutral at every instant, and not only on average, in order to measure the anomaly.

I. Doctoral dissertation

1. Emergence of Correlation between Securities at Short Time Scales

Emergence of correlations between securities at short time scales

Abstract

The correlation matrix is the key element in optimal portfolio allocation and risk management. In particular, the eigenvectors of the correlation matrix corresponding to large eigenvalues can be used to identify the market mode, sectors and style factors. We investigate how these eigenvalues depend on the time scale of securities returns in the U.S. market. For this purpose, one-minute returns of the largest 533 U.S.
stocks are aggregated at different time scales and used to estimate the correlation matrix and its spectral properties. We propose a simple lead-lag factor model to capture and reproduce the observed time-scale dependence of eigenvalues. We reveal the emergence of several dominant eigenvalues as the time scale increases. This important finding evidences that the underlying economic and financial mechanisms determining the correlation structure of securities depend as well on time scales.
How do the eigenvalues of securities correlation matrices emerge at different time scales? This fundamental question is important because cross-correlations change over different investment horizons while a reliable empirical determination of the correlation matrix remains difficult due to its time and frequency dependence. This was first evidenced by Epps, who demonstrated the decay of correlations among U.S. stocks when shifting from daily to intra-daily time scales (or frequencies) [1]. In other words, the price correlation decreases with the duration of the time interval over which price changes are measured. The economic argument behind the Epps effect is that the information is not instantaneously transmitted at shorter time intervals, where the average adjustment lag in the response of prices lies approximately between 10 and 60 minutes. This appears to reduce the scope of the Efficient Market Hypothesis [2] at short time scales given that tick data prices seem to adjust to new information only after a lag time, thus do not reflect all available information. Since its inception, the Epps effect has been confirmed by several studies, although its impact has progressively declined in the NYSE, indicating that the market becomes increasingly more efficient [3].

The dependence of securities cross-correlations on time scales can be captured via the eigenvalues of the correlation matrix. In particular, the largest eigenvalue reflects changes in the average correlation between stocks, whereas the corresponding eigenvector is associated to the "market mode". Kwapien et al. showed a significant elevation of the largest eigenvalue with increasing time scale using data from 1 minute to 2 days from NYSE, NASDAQ and Deutsche Börse (1997-1999) [4]. Using high-frequency stock returns from NYSE, AMEX and NASDAQ (1994-1997), Plerou et al.
supported the idea that the largest eigenvalue and its eigenvector reflect the collective response of the entire market to stimuli such as certain news breaks (e.g., central bank interest rate hikes) [5]. This is particularly true during periods of high volatility when the collective behavior is enhanced. Coronnello et al. confirmed that the largest eigenvalue, computed from 5-minute data, describes the common behavior of the stocks composing the LSE stock index (2002) [6].

As firms having similar business activities are correlated, some other eigenvectors can economically be interpreted as business sectors [7]. So, Gopikrishnan et al. computed the eigenvectors of cross-correlation matrices of 1000 U.S. stocks at a 30-minute scale (1994-1995) and a 1-day scale (1962-1996) [7]. They found that the correlations in a business sector, captured via an eigenvector, were stable in time and could be used for the construction of optimal portfolios with a stable Sharpe ratio. In the same vein, as similar trading strategies induce cross-correlations in stocks, some eigenvectors can be financially interpreted as style factors. The corresponding eigenvalues are thus expected to exhibit non-trivial dependence on time scales. However, an accurate statistical analysis of multiple eigenvalues at different time scales is challenging due to measurement noises. In fact, as the correlation matrix is estimated from time series of stocks' returns, its elements are unavoidably random and thus prone to fluctuations. These fluctuations become larger as the length of time series is reduced, i.e., when the time scale is increased. While the largest eigenvalue typically exceeds the level of fluctuations by two orders of magnitude, the other eigenvalues rapidly reach this level and become non-informative. Several researchers employed the random matrix theory to distinguish economically significant eigenvalues from noise [8, 9, 10, 11, 12].
In particular, Laloux et al. showed that only 6% of the eigenvalues carried some information of the S&P 500 (1991-1996), while the remaining 94% of the eigenvalues were hidden by noise [8]. Guhr and Kalber proposed an alternative statistical approach to reduce noise that they called "power mapping" [13]. Andersson et al. extended this work by comparing the power mapping approach to a standard filtering method discarding noisy eigenvalues for Markowitz portfolio optimization using daily Swedish stock market returns (1999-2003) [14].

In this paper, we consider the correlation matrix of financial securities and investigate the emergence of its eigenvalues at small time scales. As the financial literature on this critical issue remains sparse, this research fills the gap by investigating the eigenvalues at intraday time scales using 1-min returns. We propose a simple model, coined the "lead-lag factor model", as an adaptation of the well-known "one-factor market model" [15] to smaller time scales and to multiple sectors and style factors. In this model, stock returns are correlated to the returns of selected factors at earlier time steps. A detailed description of the eigenvalues as functions of the time scale is then derived. An empirical validation is performed on long time series of 1-min returns of a large universe of U.S. stocks. To get several significant eigenvalues at time scales from 1 minute to 2 hours, the correlation matrix was estimated over the whole available period (2013-2017) so that variations of cross-correlations over time were ignored (note that the dynamics of the eigenvalues and eigenvectors over time has been investigated elsewhere [16, 17, 18]). In spite of its simple character, the lead-lag factor model is shown to be able to reproduce the dependence of large eigenvalues on the time scale.

The paper is organized as follows. In Sec. 2, we estimate the correlation matrix of U.S.
stocks' returns at different time scales and present the empirical dependence of large eigenvalues on the time scale. To rationalize the observed behavior, we develop in Sec. 3 the lead-lag factor model and compare it to empirical results. Section 4 summarizes and concludes. Some derivations and more technical analysis of the lead-lag factor model are presented in Appendices.
We study the correlation structure of a universe that includes 533 U.S. stocks whose capitalization exceeded 1 billion dollars in 2013. For the considered period from 1st of January 2013 to 28th of June 2017, our database contains 338 176 1-min returns for each stock. We have also verified that the arithmetic aggregation of returns, $r_i(1) + \ldots + r_i(\tau)$, is almost identical to considering the product $(1 + r_i(1))\cdots(1 + r_i(\tau)) - 1$, given that the 1-min returns $r_i(t)$ are very small.

From the time series of 1-min returns, we estimate the correlation matrix over the whole available period, and then compute its eigenvalues. Then we aggregate the returns into 2-min, 4-min, ..., 128-min returns, producing time series with 169 088, 84 544, ..., 2 642 points, respectively. At each time scale τ, we repeat the computation to investigate the dependence of the eigenvalues on τ.

Figure 1a shows the four largest eigenvalues of the covariance matrix of 533 U.S. stocks' returns, computed by aggregating 1-min returns with the time scale τ, ranging from 1 minute to 128 minutes (2 hours). The first two eigenvalues exhibit almost linear growth with τ, the others show minor deviations from linearity at small τ but scale linearly with τ at large τ. This behavior reflects the diffusion-like growth of the variance of aggregated returns; in particular, if the returns were independent, the eigenvalues of the corresponding covariance
matrix, $C_{ij} = \tau\sigma_i^2\delta_{ij}$, would be just $\lambda_i = \tau\sigma_i^2$, and thus proportional to τ. Although correlations affect this linear growth, their effect is subdominant, at least for large eigenvalues, as witnessed by Fig. 1a. To highlight the effect of correlations, we focus on the eigenvalues of the correlation matrix. This choice is also justified from the financial point of view to level off the variability of stocks volatilities.

Figure 1: Four largest eigenvalues of the covariance matrix (a) and of the correlation matrix (b) for returns of 533 U.S. stocks, computed by aggregating 1-min returns with the time scale τ, varying from 1 minute to 128 minutes (2 hours).

Figure 1b shows the four largest eigenvalues of the correlation matrix of the same 533 U.S. stocks' returns. If the returns were independent, the correlation matrix would be the identity, and thus all its eigenvalues would be equal to 1. The growth of these eigenvalues with the time scale τ indicates strong cross-correlations between stocks. The largest eigenvalue can be naturally attributed to the market mode, whereas the next eigenvalues correspond to different sectors and style factors.

After a sharp growth at short time scales (few minutes), the eigenvalues slowly approach their long-time limits. The existence of these upper bounds is expected because the sum of eigenvalues of a correlation matrix is equal to its size (i.e., to the number of stocks, N). This saturation effect contrasts with the unlimited growth of eigenvalues of the covariance matrix (Fig. 1a). Finding the functional form of this approach and identifying its characteristic time scales present the main aim of our work. Recently, Benzaquen et al. proposed a multivariate linear propagator model for dissecting cross-impact on stock markets and revealing their dynamics [19].
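The estimation procedure described above (aggregate 1-min returns at scale τ, then diagonalize the covariance and correlation matrices) can be sketched in a few lines; the data below are synthetic stand-ins for the actual 533-stock universe:

```python
import numpy as np

def aggregate(r, tau):
    """Arithmetic aggregation: sum consecutive non-overlapping blocks of tau returns."""
    T, N = r.shape
    return r[: (T // tau) * tau].reshape(T // tau, tau, N).sum(axis=1)

def top_eigenvalues(M, k):
    """k largest eigenvalues of a symmetric matrix, in descending order."""
    return np.sort(np.linalg.eigvalsh(M))[::-1][:k]

# Toy data standing in for the 338 176 one-minute returns of 533 stocks
rng = np.random.default_rng(0)
r = rng.standard_normal((50_000, 10)) * 1e-4   # independent returns, for illustration

for tau in (1, 2, 4, 8):
    r_tau = aggregate(r, tau)
    cov_top = top_eigenvalues(np.cov(r_tau, rowvar=False), 1)[0]
    cor_top = top_eigenvalues(np.corrcoef(r_tau, rowvar=False), 1)[0]
    # for independent returns the covariance eigenvalues grow linearly with tau,
    # while the correlation matrix stays close to the identity
    print(tau, cov_top, cor_top)
```

With independent synthetic returns the covariance spectrum simply tracks τσ², which is the baseline against which the saturating growth of the empirical correlation eigenvalues is measured.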
Due to its very general form accounting for both cross-correlations and auto-correlations of stocks, the proposed model contains too many parameters, while the resulting formulas are not explicit. Our ambition is rather the opposite and consists in suggesting an explicit model, as simple as possible, that would capture the empirical results shown in Fig. 1b and thus provide a minimalistic framework for their financial interpretation.

3 The lead-lag factor model
We consider a trading universe with N assets. In a conventional one-factor model, the return of the i-th asset at time t, $r_i(t)$, is modeled as a combination of a specific, asset-dependent random fluctuation, $\varepsilon_i(t)$, and an overall market contribution, $R(t)$,

$r_i(t) = \varepsilon_i(t) + \beta R(t)$,   (1)

with a market sensitivity β (that we generalize below to other factors). The asset-specific random fluctuations $\varepsilon_i(t)$ are typically modeled as independent centered Gaussian variables with volatilities $\sigma_i$.

We propose a modification of this conventional model by incorporating the lead-lag effect, in which the i-th asset return at time t is influenced by a common factor $R(t-k)$ at earlier times $t-k$, with progressively decaying weights:

$r_i(t) = \varepsilon_i(t) + \beta \sum_{k=0}^{\infty} \alpha^k R(t-k)$,   (2)

where $0 \le \alpha < 1$ characterizes the relaxation time of the memory decay. Note that the upper limit of the sum in Eq. (2) is formally extended to infinity, bearing in mind that contributions for very large k are exponentially small. We will analyze the model in the stationary regime as $t \to \infty$ in order to eliminate transient effects.

The common term $R(t)$ can be interpreted as an idealized factor without auto-correlations in an efficient market that most stocks follow with a lead-lag delay. We model therefore $R(t)$ by independent centered Gaussian variables with volatility Σ. The term $R(t)$ can represent the market mode but also sectors or style factors, or any popular trading portfolio. Moreover, $R(t)$ can also be interpreted as being linked to the market order transactions for a particular strategy (market, sector or styles). In this light, our model can be seen as an extension of the Kyle model [20] that explains the impact of transactions on price for a single stock and without delay. Here, we consider multiple stocks and include an exponential decay of the impact.
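The lead-lag dynamics of Eq. (2) is straightforward to simulate. A minimal sketch (synthetic parameters, kernel truncated where $\alpha^k$ is negligible); a simple sanity check is that each return's variance should equal $\sigma^2 + \beta^2\Sigma^2/(1-\alpha^2)$:

```python
import numpy as np

def simulate_lead_lag(N, T, alpha, beta, sigma, Sigma, rng, kmax=200):
    """Simulate Eq. (2): r_i(t) = eps_i(t) + beta * sum_k alpha^k R(t-k).

    Returns a (T, N) array of stationary returns (the first kmax steps serve as burn-in;
    alpha^kmax is negligible for the values of alpha used here).
    """
    R = rng.standard_normal(T + kmax) * Sigma           # common factor returns
    kernel = alpha ** np.arange(kmax)                   # exponentially decaying weights
    common = np.convolve(R, kernel)[kmax : kmax + T]    # sum_k alpha^k R(t - k)
    eps = rng.standard_normal((T, N)) * sigma           # stock-specific fluctuations
    return eps + beta * common[:, None]

# Variance check: sigma^2 + beta^2 Sigma^2 / (1 - alpha^2) = 1 + 0.25/0.75 here
rng = np.random.default_rng(0)
r = simulate_lead_lag(N=4, T=200_000, alpha=0.5, beta=1.0, sigma=1.0, Sigma=0.5, rng=rng)
print(r.var(axis=0), 1.0 + 0.25 / 0.75)
```

All parameter values are invented for illustration; the empirical analysis in the paper estimates them from data rather than fixing them.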
While more sophisticated models with a power law decay of the impact were proposed [19, 21], we will show that our minimalistic model is enough to reproduce a slow growth of the eigenvalues of the correlation matrix. For the sake of clarity, we first analyze this basic lead-lag one-factor model and then discuss several of its straightforward extensions.

The one-factor relation (2) is the basic model for returns at the smallest time scale. We then consider the returns aggregated on the time scale τ:

$r^\tau_i(t) = \sum_{\ell=0}^{\tau-1} r_i(t-\ell)$,   (3)

with t being a multiple of τ. Under the former Gaussian assumptions, the covariance function of the aggregated returns reads (see A):

$C^\tau_{ij} = \langle r^\tau_i(t)\, r^\tau_j(t) \rangle = \tau \sigma^2 \delta_{ij} + \beta^2 \Sigma^2\, \frac{\tau(1-\alpha^2) - 2\alpha(1-\alpha^\tau)}{(1-\alpha)^2 (1-\alpha^2)}$,   (4)

where $\langle\cdots\rangle$ denotes the expectation, and $\delta_{ij} = 1$ for $i = j$, and 0 otherwise. Note that we set here $\sigma_i = \sigma$ for all assets for simplicity (this simplification will be relaxed below). As we consider the stationary regime, the covariance function does not depend on time t.

Denoting

$\kappa_\alpha(\tau) = \frac{\tau(1-\alpha^2) - 2\alpha(1-\alpha^\tau)}{1-\alpha^2}$,   (5)

one gets the correlation matrix

$C^\tau_{ij} / \sqrt{C^\tau_{ii} C^\tau_{jj}} = \begin{cases} 1 & i = j, \\ \rho(\tau) & i \ne j, \end{cases}$   (6)

with

$\rho(\tau) = \bigl(1 + \eta(\tau)/\gamma\bigr)^{-1}$,   (7)

where

$\gamma = \frac{\Sigma^2 \beta^2}{\sigma^2}$   (8)

and

$\eta(\tau) = \frac{\tau(1-\alpha)^2}{\kappa_\alpha(\tau)} = \frac{(1-\alpha)^2(1-\alpha^2)}{(1-\alpha^2) - 2\alpha(1-\alpha^\tau)/\tau}$.   (9)

The function η(τ), that will play the central role in our analysis, monotonously decreases from $\eta(1) = 1-\alpha^2$ to $\eta(\infty) = (1-\alpha)^2$.

Since the matrix $C^\tau - (1-\rho(\tau)) I$ has rank 1 (I being the identity N × N matrix), there are N − 1 eigenvalues $\lambda_i = 1 - \rho(\tau)$. In turn, the single largest eigenvalue of the correlation matrix $C^\tau$ can be obtained as follows: $N = \mathrm{Tr}(C^\tau) = \lambda_1 + (N-1)\lambda_i$, from which $\lambda_1 = 1 + (N-1)\rho(\tau)$. We get thus the complete description of the eigenvalues as functions of the time scale τ:

$\lambda_1 = 1 + (N-1)\rho(\tau)$,   (10)
$\lambda_i = 1 - \rho(\tau) \quad (i = 2, 3, \ldots, N)$.
(11)

In the limit of very large τ, one finds

$\rho(\infty) = \bigl(1 + (1-\alpha)^2/\gamma\bigr)^{-1}$.   (12)

This simplest lead-lag one-factor model predicts a monotonous growth of the largest eigenvalue (corresponding to the market mode) with the time scale τ, up to a saturation plateau. In turn, the other eigenvalues exhibit a monotonous decrease to a plateau. In spite of the exponential decay of the lead-lag memory effect in Eq. (2), the approach to the plateau is governed by a slow, 1/τ power law, in a qualitative agreement with the empirical observation (see Sec. 3.5 for quantitative comparison). In particular, this approach has no well-defined time scale.

While the basic model can potentially capture the behavior of the largest eigenvalue, it clearly fails to distinguish other eigenvalues. One needs therefore to relax some simplifying assumptions to render the model more realistic. We start by introducing arbitrary volatilities $\sigma_i$ and sensitivities $\beta_i$ of the i-th asset to the common factor $R(t)$:

$r_i(t) = \varepsilon_i(t) + \beta_i \sum_{k=0}^{\infty} \alpha^k R(t-k)$.   (13)

In this case, the computation is precisely the same, the only difference is that

$C^\tau_{ij} = \tau\sigma_i^2 \delta_{ij} + \Sigma^2 \beta_i \beta_j\, \frac{\kappa_\alpha(\tau)}{(1-\alpha)^2}$.   (14)

As a consequence, the structure of the correlation matrix is fully determined by $\beta_i$, whereas the dependence on the time scale τ is still represented by $\kappa_\alpha(\tau)$. The correlation matrix reads

$C^\tau_{ij} = \begin{cases} 1 & (i = j), \\ \rho_i(\tau)\,\rho_j(\tau) & (i \ne j), \end{cases}$   (15)

with

$\rho_i(\tau) = \bigl(1 + \eta(\tau)/\gamma_i\bigr)^{-1/2}, \qquad \gamma_i = \frac{\Sigma^2\beta_i^2}{\sigma_i^2}$.   (16)

The eigenvalues of this correlation matrix can be computed as follows. If all $\gamma_i$ are distinct, the components of an eigenvector are

$v_i = \frac{\rho_i\, Q}{\lambda - 1 + \rho_i^2} \quad (i = 1, \ldots, N), \qquad \text{with} \quad Q = \sum_{i=1}^{N} \rho_i v_i$,   (17)

When some $\gamma_i$ are identical, the analysis of eigenvalues becomes more involved (see B), but the largest eigenvalue still satisfies Eq. (18) and can thus be approximated by Eq. (19).
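The closed forms above can be checked numerically; the sketch below assumes the reconstructed expressions for η(τ) and ρ(τ) in Eqs. (7)-(9) and illustrative parameter values:

```python
import numpy as np

def eta(tau, alpha):
    # Eq. (9): eta(tau) = (1-a)^2 (1-a^2) / ((1-a^2) - 2a(1-a^tau)/tau)
    return (1 - alpha)**2 * (1 - alpha**2) / ((1 - alpha**2) - 2 * alpha * (1 - alpha**tau) / tau)

def rho(tau, alpha, gamma):
    # Eq. (7): pairwise correlation of tau-aggregated returns
    return 1.0 / (1.0 + eta(tau, alpha) / gamma)

alpha, gamma, N = 0.4, 0.2, 50

# eta decreases from 1 - alpha^2 at tau = 1 towards (1 - alpha)^2 as tau -> infinity
print(eta(1, alpha), 1 - alpha**2)
print(eta(100_000, alpha), (1 - alpha)**2)

# Spectrum of the flat correlation matrix: lambda_1 = 1 + (N-1) rho, the rest 1 - rho (Eqs. 10-11)
p = rho(16, alpha, gamma)
C = np.full((N, N), p) + (1 - p) * np.eye(N)
lam = np.sort(np.linalg.eigvalsh(C))[::-1]
print(lam[0], 1 + (N - 1) * p)
print(lam[1], 1 - p)
```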
In particular, if all $\gamma_i = \gamma$, one gets $\lambda_1 \approx N\rho(\tau)$, which is close to the exact solution (10). The eigenvalue λ then satisfies

$\sum_{i=1}^{N} \frac{\rho_i^2}{\lambda - 1 + \rho_i^2} = 1$.   (18)

This equation has N distinct solutions that can be characterized in terms of $\rho_i$ (see B). When N is large, the largest eigenvalue is expected to be large, and the asymptotic expansion of Eq. (18) yields

$\lambda_1 \approx \sum_{i=1}^{N} \rho_i^2 = \sum_{i=1}^{N} \bigl(1 + \eta(\tau)/\gamma_i\bigr)^{-1}$.   (19)

In turn, the other eigenvalues are below 1 (see B). As a consequence, such a lead-lag one-factor model cannot reproduce several eigenvalues larger than 1. For this purpose, one needs to consider multiple factors.

Now we consider a general lead-lag multi-factor model

$r_i(t) = \varepsilon_i(t) + \sum_{k=0}^{\infty} \alpha^k \sum_{f=1}^{F} \beta_{i,f}\, R_f(t-k)$,   (20)

where $\varepsilon_i(t)$ are independent centered Gaussian variables (representing random fluctuations specific to the stock i) with variance $\sigma_i^2$, F is the number of factors, $R_f(t)$ are independent centered Gaussian returns of the factor f with variance $\Sigma_f^2$, $\beta_{i,f}$ is the sensitivity of the stock i to the factor f, and α sets the relaxation time. Repeating the computation from A, one gets

$C^\tau_{ij} = \delta_{ij} + (1-\delta_{ij}) \sum_{f=1}^{F} \rho_{i,f}\, \rho_{j,f}$,   (21)

where

$\rho_{i,f}(\tau) = \frac{\Sigma_f\, \beta_{i,f}}{\beta_i}\, \rho_i(\tau)$   (22)

and

$\rho_i(\tau) = \bigl(1 + \eta(\tau)/\gamma_i\bigr)^{-1/2}, \qquad \gamma_i = \frac{\beta_i^2}{\sigma_i^2}, \qquad \beta_i^2 = \sum_{f=1}^{F} \Sigma_f^2\, \beta_{i,f}^2$.   (23)

Considering $\rho_{i,f}$ as the elements of an N × F matrix ρ, one can rewrite Eq. (21) in a matrix form

$C^\tau = (I - P) + \rho\rho^\dagger$,   (24)

where P is the diagonal matrix formed by $\rho_i^2$, and † denotes the matrix transpose.

The matrix ρ of size N × F plays the central role in the following analysis. As the elements of the matrix ρ are real, $\rho\rho^\dagger$, as well as $\rho^\dagger\rho$, are positive semi-definite matrices which have nonnegative eigenvalues. The rank of the matrix ρ is equal to that of the matrices $\rho\rho^\dagger$ and $\rho^\dagger\rho$ and thus cannot exceed min{F, N}.
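The matrix form (24) can be verified directly on synthetic sensitivities; a small sketch (all parameter values invented for illustration) building $\rho_{i,f}$ from Eqs. (22)-(23) and checking that $C = (I-P) + \rho\rho^\dagger$ has unit diagonal and off-diagonal entries $\sum_f \rho_{i,f}\rho_{j,f}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, F = 30, 3
beta = rng.normal(0.0, 1.0, size=(N, F))       # beta_{i,f}
Sigma = np.array([1.0, 0.7, 0.5])              # factor volatilities Sigma_f
sigma = rng.uniform(0.8, 1.2, size=N)          # idiosyncratic volatilities sigma_i
eta_tau = 0.6                                  # some value of eta(tau)

beta2 = (Sigma**2 * beta**2).sum(axis=1)       # beta_i^2 = sum_f Sigma_f^2 beta_{i,f}^2
gamma_i = beta2 / sigma**2                     # Eq. (23)
rho_i = (1 + eta_tau / gamma_i) ** -0.5
rho = (Sigma * beta) / np.sqrt(beta2)[:, None] * rho_i[:, None]   # Eq. (22)

P = np.diag((rho**2).sum(axis=1))              # diagonal entries rho_i^2
C = np.eye(N) - P + rho @ rho.T                # Eq. (24)
print(np.allclose(np.diag(C), 1.0))            # unit diagonal, as for a correlation matrix
```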
Given that F ≪ N, the correlation matrix $C^\tau$ appears as the perturbation of a diagonal matrix by a low-rank matrix. The eigenvalues of the correlation matrix are the zeros of the determinant

$\det\bigl(\lambda I - C^\tau\bigr) = \det\bigl(\lambda I - I + P - \rho\rho^\dagger\bigr)$.   (25)

Since $\rho\rho^\dagger$ is a low-rank perturbation, one can expect, as in the one-factor case of B, that most eigenvalues coincide with that of the unperturbed diagonal matrix I − P, i.e., they are given by $1 - \rho_i^2$ for some indices i. These eigenvalues are essentially hidden by noise and non-exploitable in practice. We are interested in large eigenvalues that (significantly) exceed 1. If λ exceeds 1, it cannot be equal to $1 - \rho_i^2$ for any i, the matrix $\lambda I - I + P$ is nonsingular, its inverse exists, so that one can rewrite Eq. (25) as

$\det\bigl(\lambda I - I + P\bigr)\, \det\bigl(I - \rho^\dagger (\lambda I - I + P)^{-1} \rho\bigr)$,   (26)

from which one gets a new equation on eigenvalues:

$\det\bigl(I - \underbrace{\rho^\dagger (\lambda I - I + P)^{-1} \rho}_{\phi(\lambda)}\bigr) = 0$   (27)

(here we used a general property: if $A \in \mathbb{C}^{m\times m}$ is a nonsingular matrix and $U, V \in \mathbb{C}^{m\times r}$, then $\det(A + UV^*) = \det(A)\det(I + V^* A^{-1} U)$, see [22]). Denoting the F × F matrix in the determinant as φ(λ), one can write explicitly its elements as

$\phi_{f,g}(\lambda) = \sum_{i=1}^{N} \frac{\rho_{i,f}\, \rho_{i,g}}{\lambda - 1 + \rho_i^2}$.   (28)

The solutions of Eq. (27) determine some eigenvalues λ of the correlation matrix in Eq. (21). As one typically deals with the situation N ≫ F, the reduction of the original determinant equation (25) for a matrix of size N × N to Eq. (27) for a matrix of size F × F is a significant numerical simplification of the problem. Most importantly, this formal solution allows one to get analytical insights onto the eigenvalues, as we did in the one-factor case in B. Note that in the one-factor case (F = 1), the determinant equation (27) is simply reduced to

$\det(I - \phi(\lambda)) = 1 - \phi_{1,1}(\lambda) = 1 - \sum_{i=1}^{N} \frac{\rho_i^2}{\lambda - 1 + \rho_i^2} = 0$,   (29)

i.e., we retrieve Eq.
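The reduction from the N × N determinant (25) to the F × F equation (27) can be illustrated numerically with synthetic $\rho_{i,f}$ (illustrative values only): each large eigenvalue of C makes $\det(I - \phi(\lambda))$ vanish, and is well approximated by the eigenvalues of $\rho^\dagger\rho$ when it is much larger than 1:

```python
import numpy as np

rng = np.random.default_rng(2)
N, F = 200, 3
rho = rng.uniform(-0.3, 0.3, size=(N, F))     # synthetic rho_{i,f}, mixed signs
p = (rho**2).sum(axis=1)                      # rho_i^2 (diagonal of P)
C = np.eye(N) - np.diag(p) + rho @ rho.T      # Eq. (24)
lam = np.sort(np.linalg.eigvalsh(C))[::-1]

def det_I_minus_phi(l):
    """Determinant of the F x F matrix I - phi(lambda), with phi from Eq. (28)."""
    phi = rho.T @ (rho / (l - 1 + p)[:, None])
    return np.linalg.det(np.eye(F) - phi)

print([det_I_minus_phi(l) for l in lam[:F]])  # all close to zero, Eq. (27)

# Eq. (30): for lambda >> 1 the large eigenvalues are approximated by eig(rho^T rho)
print(lam[:F])
print(np.sort(np.linalg.eigvalsh(rho.T @ rho))[::-1])
```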
(18).

If one searches for large eigenvalues, λ ≫ 1, one can neglect the matrix P − I in comparison to λI in Eq. (27), which yields

$\det\bigl(\lambda I - \rho^\dagger\rho\bigr) = 0$.   (30)

In other words, the large eigenvalues of the correlation matrix can be approximated by the eigenvalues of the matrix $\rho^\dagger\rho$ of size F × F. This symmetric positive semi-definite matrix has F nonnegative eigenvalues that correspond to F factors.

As we will discuss in detail in Sec. 3.5, empirical data exhibit the short-range memory effect (α is small) and the relatively small impact of the factors onto the variance of individual stocks as compared to the stock-specific fluctuations ($\gamma_i$ are small). In this situation, which is particular to the time series of securities returns at the considered time scales, one has $\eta(\tau)/\gamma_i \gg 1$ so that $\rho_i(\tau)$ in Eq. (23) can be approximated as

$\rho_i(\tau) \simeq \sqrt{\gamma_i/\eta(\tau)}$.   (31)

This approximation greatly simplifies the elements of the matrix $\rho^\dagger\rho$:

$(\rho^\dagger\rho)_{f,g} = \sum_{i=1}^{N} \underbrace{\frac{\Sigma_f\,\beta_{i,f}\,\rho_i(\tau)}{\beta_i}}_{=\,\rho_{i,f}}\; \underbrace{\frac{\Sigma_g\,\beta_{i,g}\,\rho_i(\tau)}{\beta_i}}_{=\,\rho_{i,g}} \;\approx\; \frac{N}{\eta(\tau)}\, \Gamma_{f,g}$,   (32)

where the matrix elements $\Gamma_{f,g}$ do not depend on the time scale:

$\Gamma_{f,g} = \frac{\Sigma_f\,\Sigma_g}{N} \sum_{i=1}^{N} \frac{\beta_{i,f}\,\beta_{i,g}}{\sigma_i^2}$.   (33)

As a consequence, all the elements of the matrix $\rho^\dagger\rho$ and thus its eigenvalues exhibit the same dependence on the time scale τ, expressed via the explicit function η(τ) given by Eq. (9). Denoting the eigenvalues of the matrix Γ as $\gamma_f$ (f = 1, ..., F), one gets the following approximation for large eigenvalues of the correlation matrix:

$\lambda_f \approx \frac{N\, \gamma_f}{\eta(\tau)} \quad (f = 1, \ldots, F)$.   (34)

From the explicit form (9) of η(τ), one deduces a slow, 1/τ, power law approach of the eigenvalues to the saturation level as the time scale τ increases.
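Under the small-$\gamma_i$ approximation the time dependence factorizes, which can be sketched by computing Γ and the predicted eigenvalues for synthetic inputs (all values invented for illustration, and η(τ) is the reconstructed form of Eq. (9)):

```python
import numpy as np

def eta(tau, alpha):
    """Reconstructed eta(tau) of Eq. (9)."""
    return (1 - alpha)**2 * (1 - alpha**2) / ((1 - alpha**2) - 2 * alpha * (1 - alpha**tau) / tau)

rng = np.random.default_rng(3)
N, F, alpha = 300, 2, 0.3
beta = rng.normal(0.0, 0.1, size=(N, F))     # small sensitivities -> small gamma_i
Sigma = np.array([1.0, 0.6])                 # factor volatilities
sigma = rng.uniform(0.8, 1.2, size=N)        # idiosyncratic volatilities

# Eq. (33): Gamma_{f,g} = (Sigma_f Sigma_g / N) sum_i beta_{i,f} beta_{i,g} / sigma_i^2
Gamma = np.outer(Sigma, Sigma) / N * ((beta.T / sigma**2) @ beta)
gamma_f = np.sort(np.linalg.eigvalsh(Gamma))[::-1]

# Eq. (34): every large eigenvalue scales with the same function 1/eta(tau)
for tau in (1, 8, 64):
    print(tau, N * gamma_f / eta(tau, alpha))
```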
Within this approximate computation, all large eigenvalues exhibit the same dependence on the time scale.

Table 1: Two adjustable parameters of the fitting formula (34) applied to the four largest eigenvalues of the correlation matrix of N = 533 U.S. stocks' returns. The corresponding relaxation time $t_\alpha$ in minutes is obtained as $1/\ln(1/\alpha)$; the fitted values of $t_\alpha$ for the four eigenvalues are 0.55, 0.72, 0.58 and 0.74 minutes.

In practice, one aims at constructing the factors $R_f$ to capture independent features of cross-correlations in the market. The sensitivities $\beta_{i,f}$ and $\beta_{i,g}$ of the stock i to factors $R_f$ and $R_g$ are thus expected to be "orthogonal", and this property can be formally expressed by requiring that the nondiagonal elements of the matrix Γ are negligible. In this case, the eigenvalues $\gamma_f$ are given by the diagonal elements

$\gamma_f = \Gamma_{f,f} = \frac{1}{N} \sum_{i=1}^{N} \frac{\Sigma_f^2\, \beta_{i,f}^2}{\sigma_i^2}$.   (35)

This is a kind of empirical mean of the squared sensitivities $\beta_{i,f}$, normalized by the squared volatilities $\sigma_i^2$.

We aim at applying the lead-lag factor model to fit the eigenvalues of the empirical correlation matrix of U.S. stocks' returns. The fitting formula (34) has two adjustable parameters: the relaxation parameter α in the function η(τ) and the amplitude $N\gamma_f$. Using the least-squares fitting algorithm implemented as the routine lsqcurvefit in Matlab, we apply the formula (34) separately to each empirical eigenvalue.

Figure 2 shows the fitting of the four largest eigenvalues. The good quality of the fit by the lead-lag factor model indicates that, in spite of numerous simplifying assumptions on which the model was built, it captures the overall behavior qualitatively well. In particular, the eigenvalues converge to limiting values, at least for the considered short time scales (up to 2 hours). Moreover, this saturation level is approached slowly, with the characteristic 1/τ power law dependence. The adjustable parameters are summarized in Table 1.
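The least-squares fit of formula (34) can be reproduced outside Matlab; a sketch using SciPy's curve_fit as a stand-in for lsqcurvefit, on synthetic eigenvalue data (the true parameter values below are invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def eigenvalue_model(tau, alpha, amplitude):
    """Fitting formula (34): lambda(tau) = amplitude / eta(tau), with eta from Eq. (9)."""
    eta = (1 - alpha)**2 * (1 - alpha**2) / ((1 - alpha**2) - 2 * alpha * (1 - alpha**tau) / tau)
    return amplitude / eta

# Synthetic data standing in for an empirical eigenvalue curve (1% multiplicative noise)
tau = np.array([1, 2, 4, 8, 16, 32, 64, 128], dtype=float)
true_alpha, true_amp = 0.4, 60.0
lam = eigenvalue_model(tau, true_alpha, true_amp) \
      * (1 + 0.01 * np.random.default_rng(6).standard_normal(tau.size))

popt, _ = curve_fit(eigenvalue_model, tau, lam, p0=(0.5, 50.0),
                    bounds=([0.0, 0.0], [0.999, np.inf]))
alpha_hat, amp_hat = popt
t_alpha = 1.0 / np.log(1.0 / alpha_hat)   # relaxation time in minutes (1-min base scale)
print(alpha_hat, amp_hat, t_alpha)
```

Each of the four empirical eigenvalue curves would be fitted separately in this way, yielding one (α, amplitude) pair per eigenvalue.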
Rewriting the attenuation factor $\alpha^k$ in the lead-lag factor model (2) as $\exp(-t/t_\alpha)$, with $t = k\tau$ and $t_\alpha = \tau/\ln(1/\alpha)$, where $\tau = 1$ min is the finest time scale of the time series used, one gets the relaxation time $t_\alpha$ in minutes. One can see that the attenuation factors $\alpha$ (and hence the relaxation times $t_\alpha$) for the four eigenvalues are close to each other. In other words, all the dominant eigenmodes evolve at comparable time scales. This is an important conclusion, which refutes a common

[Figure 2: Fitting by Eq. (34) of the four largest eigenvalues of the correlation matrix of $N = 533$ U.S. stocks' returns, computed by aggregating 1-min returns with the time scale $\tau$. The adjustable parameters $\alpha$ and $\gamma_f$ are summarized in Table 1.]

belief that the market mode (corresponding to the largest eigenvalue) evolves at a time scale that is significantly different from the other modes (sectors and style factors). The values of $t_\alpha$ are of the order of one minute, in agreement with predictions by Benzaquen et al. [19]. Remarkably, while the lead-lag memory effects vanish so rapidly, they impact the behavior of the eigenvalues at much longer time scales. In particular, if the lead-lag were ignored (by setting $\alpha = 0$), the largest eigenvalue would be $\simeq N\gamma_1$ and independent of the time scale $\tau$. For instance, using the estimated value of $\gamma_1$ and setting $\alpha = 0$, one would get a largest eigenvalue significantly smaller than the observed value at $\tau = 128$ min.

We investigated the dependence of the eigenvalues of the correlation matrix on the time scale $\tau$. Aggregating 1-min returns of the largest 533 U.S. stocks (2013-2017) to estimate the correlation matrix at different time scales, we showed that its large eigenvalues grow with $\tau$ and apparently saturate to limiting values.
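The conversion from the attenuation factor $\alpha$ to the relaxation time $t_\alpha$ used above is a one-liner; a small sketch (the value $\alpha = 0.5$ is illustrative):

```python
import math

def relaxation_time(alpha, tau0=1.0):
    """t_alpha = tau0 / ln(1/alpha): relaxation time, in units of tau0
    (the finest time scale, 1 minute here), of the lead-lag attenuation
    factor alpha, obtained by rewriting alpha**k as exp(-k*tau0/t_alpha)."""
    return tau0 / math.log(1.0 / alpha)

print(relaxation_time(0.5))   # 1/ln(2), about 1.44 minutes
```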
This growth reflects the important phenomenon that inter-stock correlations accumulate over time scales. To rationalize this phenomenon and to interpret the empirical observations, we developed the lead-lag factor model. In the one-factor case, each stock is considered to be partly correlated to a given lead-lag factor. Under several simplifying assumptions, we derived a simple formula for the large relevant eigenvalues. This formula, containing just two easily interpretable adjustable parameters, was then validated on empirical data.

The relaxation time of the stock market was estimated to be around 1 minute. A possible interpretation of this observation is that a transaction can generate a cascade of transactions that decays in 1 minute, so that the impact of a transaction on the price decays in 1 minute. As correlations emerge from the cross-impact of transactions on prices, we model this effect by extending the Kyle model to the impact of transactions on preferential portfolios with a lead-lag effect.

The small value of the observed relaxation time suggests that correlation measurements based on 5-minute returns should provide a good proxy of the correlation of daily returns for risk management, in line with the conclusion by Liu et al. on volatility estimation [23]. However, other phenomena are likely to occur at much larger time scales (from a day to a month), e.g., autocorrelations of the returns of financial factors (book, size, momentum) due to herding effects, or lack of liquidity. An accurate estimation of correlations at larger time scales remains a challenging problem because of the limited number of available returns and thus the higher impact of noise on the estimated correlation matrix.
To overcome this limitation, one can either consider time horizons over several decades (in which case neglecting the variations of correlations over time becomes debatable), or reduce the number of considered securities and thus the dimension of the correlation matrix (in which case the financial meaning of the estimated correlations may be debatable). A possible solution consists in constructing relevant financial factors and investigating how their correlations change with the time scale, as suggested by our factor-based model.

A Computation of the covariance matrix
The covariance matrix of aggregated centered Gaussian returns $r^\tau_i(t)$ defined by Eq. (3) is

$$C^\tau_{ij} = \langle r^\tau_i(t)\, r^\tau_j(t) \rangle \qquad (36)$$
$$= \tau \sigma_i^2 \delta_{ij} + \beta_i \beta_j \sum_{\ell_1,\ell_2=0}^{\tau-1} \sum_{k_1,k_2=0}^{\infty} \alpha^{k_1} \alpha^{k_2} \langle R(t-\ell_1-k_1)\, R(t-\ell_2-k_2) \rangle.$$

The first term in this expression comes from the uncorrelated stock-dependent fluctuations. The independence of the returns $R(k)$ implies

$$C^\tau_{ij} = \tau \sigma_i^2 \delta_{ij} + \beta_i \beta_j \sigma_m^2 \sum_{\ell_1,\ell_2=0}^{\tau-1} \sum_{k_1,k_2=0}^{\infty} \alpha^{k_1+k_2}\, \delta_{\ell_1+k_1,\,\ell_2+k_2}. \qquad (37)$$

To calculate these four sums, it is convenient to consider separately various terms depending on $\ell_1$ and $\ell_2$:

• there are $\tau$ terms with $\ell_1 = \ell_2$, which implies $k_1 = k_2$, whose contribution is
$$\tau \sum_{k=0}^{\infty} \alpha^{2k} = \frac{\tau}{1-\alpha^2}; \qquad (38)$$

• there are $\tau - 1$ terms with $\ell_1 = \ell_2 + 1$, which implies $k_1 = k_2 - 1$, whose contribution is
$$(\tau-1) \sum_{k=0}^{\infty} \alpha^{2k+1} = \frac{(\tau-1)\,\alpha}{1-\alpha^2}. \qquad (39)$$
Moreover, the same contribution comes from $\ell_1 = \ell_2 - 1$ and $k_1 = k_2 + 1$;

• similarly, there are $\tau - j$ terms with $\ell_1 = \ell_2 + j$, which implies $k_1 = k_2 - j$, whose contribution is
$$(\tau-j) \sum_{k=0}^{\infty} \alpha^{2k+j} = \frac{(\tau-j)\,\alpha^j}{1-\alpha^2}, \qquad (40)$$
and this contribution is doubled by the symmetry argument;

• finally, there is one term with $\ell_1 = \ell_2 + (\tau-1)$ and thus $k_1 = k_2 - (\tau-1)$, whose contribution is $\alpha^{\tau-1}/(1-\alpha^2)$.

Combining all these terms, one gets after simplifications Eq. (4).

B Analysis of the lead-lag one-factor model
We study in more detail the model (15) of the correlation matrix $C$, with $\rho_i(\tau)$ given by Eq. (16). This matrix is a perturbation of the identity matrix by a rank-one matrix, for which many spectral properties are known (see, e.g., [24]). This matrix combines both effects: the correlation coefficient $\rho$ and the impact of the exponential moving average (with the coefficient $\alpha$). We search for an eigenvector of this matrix as $v = (v_1, v_2, \ldots, v_N)^\dagger$. Writing explicitly $Cv = \lambda v$, we get

$$v_i (1 - \rho_i^2) + \rho_i Q = \lambda v_i \qquad (i = 1, \ldots, N), \qquad (41)$$

where

$$Q = \sum_{i=1}^{N} v_i \rho_i. \qquad (42)$$

First, we note that if $\rho_i = 0$ for some $i$, then the above equation reduces to $v_i = \lambda v_i$, which has two solutions: either $\lambda = 1$ and $v_i$ can be arbitrary, or $v_i = 0$. One can check that if $\rho_{i_1} = \ldots = \rho_{i_k} = 0$ for $k$ stocks, then the correlation matrix has the eigenvalue $\lambda = 1$ with multiplicity $k$. The corresponding eigenvectors can be chosen as an orthogonal basis in the subspace $\mathbb{R}^k$. In turn, the remaining $N - k$ eigenvalues are nontrivial, and can be determined as discussed below. In what follows, we focus on these nontrivial eigenvalues, i.e., we assume that all $\rho_i \neq 0$.

The equation (41) has two solutions: (i) either $\lambda = 1 - \rho_i^2$ and $Q = 0$; or (ii) $\lambda \neq 1 - \rho_i^2$ and

$$v_i = \frac{\rho_i Q}{\lambda - 1 + \rho_i^2}. \qquad (43)$$

In the latter case, one can substitute this expression into Eq. (42) to get an equation on the eigenvalue $\lambda$:

$$\sum_{i=1}^{N} \frac{\rho_i^2}{\lambda - 1 + \rho_i^2} = 1. \qquad (44)$$

This equation can be seen as a polynomial of degree $N$, which has $N$ (a priori complex-valued) zeros. Finally, $Q$ can be fixed by setting the normalization condition on $v$:

$$\sum_{i=1}^{N} v_i^2 = Q^2 \sum_{i=1}^{N} \frac{\rho_i^2}{(\lambda - 1 + \rho_i^2)^2} = 1. \qquad (45)$$

This is the generic situation. Let us return to the first option, namely, we suppose that $\lambda = 1 - \rho_k^2$ for some index $k$, which implies that $Q = 0$. If all $\rho_i^2$ are distinct, then $v_i = 0$ for all $i \neq k$, but, due to $Q = 0$, it would also imply that $v_k = 0$.
As a consequence, $v = 0$, but this is not an eigenvector. We conclude that, if all $\rho_i^2$ are distinct, then $\lambda$ cannot be given by $1 - \rho_i^2$, and this option is excluded.

Now, we consider the case when two or more values $\rho_i$ are identical. For instance, let us assume that $\rho_1 = \rho_2 \neq \rho_3, \ldots, \rho_N$. In this case, $\lambda = 1 - \rho_1^2$ is indeed an eigenvalue. In fact, one gets $Q = 0$ and thus $v_i = 0$ for $i > 2$. However, one has $Q = \rho_1 v_1 + \rho_2 v_2 = \rho_1 (v_1 + v_2) = 0$, implying that $v_2 = -v_1$. The normalization condition thus implies $v_1 = -v_2 = 1/\sqrt{2}$. We conclude that $\lambda = 1 - \rho_1^2$ is then a single eigenvalue. More generally, if $\rho_1 = \rho_2 = \ldots = \rho_k \neq \rho_{k+1}, \ldots, \rho_N$, then the eigenvalue $\lambda = 1 - \rho_1^2$ has multiplicity $k - 1$.

In general, it is convenient to denote $z_i = 1 - \rho_i^2$ and to order them in increasing order:

$$z_1 \leq z_2 \leq z_3 \leq \ldots \leq z_N, \qquad (46)$$

or, equivalently, by grouping the eventual identical values:

$$z_1 = \ldots = z_{i_1} < z_{i_1+1} = z_{i_1+2} = \ldots = z_{i_1+i_2} < \ldots < z_{i_1+\ldots+i_{m-1}+1} = \ldots = z_N. \qquad (47)$$

In other words, there are $i_1$ identical values $z_1 = \ldots = z_{i_1}$; $i_2$ identical values $z_{i_1+1} = \ldots = z_{i_1+i_2}$, etc. (note that when all $z_i$ are distinct, one has $i_1 = i_2 = \ldots = 1$). In this configuration, the correlation matrix has: the eigenvalue $z_1$ with multiplicity $i_1 - 1$ (if $i_1 > 1$); the eigenvalue $z_{i_1+1}$ with multiplicity $i_2 - 1$ (if $i_2 > 1$); etc. If for some $k$, $z_{i_k} = 1$, then this eigenvalue has multiplicity $i_k$. Finally, the remaining eigenvalues are determined as solutions of Eq. (44), which can be written as $f(z) = 1$, with

$$f(z) = \sum_{i=1}^{N} \frac{\rho_i^2}{z - 1 + \rho_i^2} = \sum_{i=1}^{N} \frac{1 - z_i}{z - z_i}. \qquad (48)$$

The terms with $z_i = 1$ (resulting in the eigenvalue $\lambda = 1$) are excluded from this sum. Moreover, if some $z_i$ are identical, the corresponding terms are just grouped together.
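The secular equation (44) can be checked numerically on the one-factor form $C_{ij} = \delta_{ij}(1-\rho_i^2) + \rho_i \rho_j$, which reproduces Eq. (41); a minimal sketch with illustrative, distinct $\rho_i$:

```python
import numpy as np

# illustrative, distinct sensitivities rho_i with |rho_i| < 1
rho = np.linspace(0.2, 0.8, 6)

# one-factor correlation matrix: C_ij = delta_ij (1 - rho_i^2) + rho_i rho_j
C = np.diag(1.0 - rho**2) + np.outer(rho, rho)
lam = np.sort(np.linalg.eigvalsh(C))
z = np.sort(1.0 - rho**2)

def f(x):
    # secular function of Eq. (48): f(z) = sum_i rho_i^2 / (z - z_i)
    return np.sum(rho**2 / (x - (1.0 - rho**2)))

for l in lam:
    assert abs(f(l) - 1.0) < 1e-6    # every eigenvalue solves f(lambda) = 1

# rank-one update interlacing: z_(i) <= lambda_(i) <= z_(i+1), largest >= 1
assert np.all(z <= lam + 1e-12) and np.all(lam[:-1] <= z[1:] + 1e-12)
assert lam[-1] >= 1.0
print(lam)
```

The asserted interlacing is exactly the one-eigenvalue-per-interval statement derived below from the monotonicity of $f$.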
As a consequence, the equation $f(z) = 1$ reduces to a polynomial of degree at most $N$ (the degree $N$ corresponding to the case when all $z_i$ are distinct). It is worth noting that the function $f(z)$ is decreasing everywhere:

$$f'(z) = -\sum_{i=1}^{N} \frac{\rho_i^2}{(z - z_i)^2} < 0. \qquad (49)$$

As a consequence, one gets immediately that each interval $(z_i, z_{i+1})$ (with $z_i < z_{i+1}$ and $z_{N+1} = \infty$) contains exactly one solution of the equation $f(z) = 1$, i.e., one eigenvalue. In particular, one gets the following bounds for the smallest eigenvalue:

$$z_1 \leq \min_{1 \leq i \leq N} \{\lambda_i\} \leq z_2. \qquad (50)$$

We conclude that all eigenvalues are positive if and only if $z_1 \geq 0$, i.e., $\rho_i^2 \leq 1$ for all $i$. In other words, the inequalities $\rho_i^2 \leq 1$ for all $i$ present the necessary and sufficient condition for the positive definiteness of the matrix. These conditions are evidently satisfied in our setting. Since $f(1) \geq 1$, one also gets the following bound for the largest eigenvalue:

$$\lambda_1 = \max_{1 \leq i \leq N} \{\lambda_i\} \geq 1 \qquad (51)$$

(note that the eigenvalues are ordered in descending order, $\lambda_1 \geq \lambda_2 \geq \ldots$, in contrast to the $z_k$). However, this bound is rather weak. In turn, since $\lambda_2 \leq z_N = 1 - \rho_N^2 < 1$, all other eigenvalues are below 1:

$$\lambda_i < 1 \qquad (i = 2, 3, \ldots, N). \qquad (52)$$

References

[1] T. W. Epps, Comovements in Stock Prices in the Very Short Run, J. Am. Stat. Asso. 74, 291-298 (1979).
[2] E. F. Fama, Efficient capital markets: A review of theory and empirical work, J. Finance 25, 383-417 (1970).
[3] B. Tóth and J. Kertész, Increasing market efficiency: Evolution of cross-correlations of stock returns, Physica A 360, 505-515 (2006).
[4] J. Kwapien, S. Drozdz, and J. Speth, Time scales involved in emergent market coherence, Physica A 337, 231-242 (2004).
[5] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. Nunes Amaral, T. Guhr, and H. E. Stanley, Random matrix approach to cross correlations in financial data, Phys. Rev. E 65, 066126 (2002).
[6] C. Coronnello, M. Tumminello, F. Lillo, S. Miccichè, R. N.
Mantegna, Sector Identification in a Set of Stock Return Time Series Traded at the London Stock Exchange, Acta Phys. Pol. B 36, 2653 (2005).
[7] P. Gopikrishnan, B. Rosenow, V. Plerou, and H. E. Stanley, Quantifying and interpreting collective behavior in financial markets, Phys. Rev. E 64, 035106(R) (2001).
[8] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Noise Dressing of Financial Correlation Matrices, Phys. Rev. Lett. 83, 1467 (1999).
[9] S. Pafka, M. Potters, and I. Kondor, Exponential Weighting and Random-Matrix-Theory-Based Filtering of Financial Covariance Matrices for Portfolio Optimization, arXiv:cond-mat/0402573v1 (2004).
[10] Z. Burda, A. Gorlich, A. Jarosz, and J. Jurkiewicz, Signal and noise in correlation matrix, Physica A 343, 295-310 (2004).
[11] M. Potters, J.-P. Bouchaud, and L. Laloux, Financial applications of random matrix theory: Old laces and new pieces, Acta Phys. Pol. B 36, 2767-2784 (2005).
[12] T. Conlon, H. J. Ruskin, and M. Crane, Random matrix theory and fund of funds portfolio optimisation, Physica A 382, 565-576 (2007).
[13] T. Guhr and B. Kälber, A new method to estimate the noise in financial correlation matrices, J. Phys. A 36, 3009 (2003).
[14] P.-J. Andersson, A. Öberg, and T. Guhr, Power Mapping and Noise Reduction for Financial Correlations, Acta Phys. Pol. B 36, 2611 (2005).
[15] S. A. Ross, The Arbitrage Theory of Capital Asset Pricing, J. Econ. Theory 13, 341-360 (1976).
[16] T. Conlon, H. J. Ruskin, and M. Crane, Cross-correlation dynamics in financial time series, Physica A 388, 705-714 (2009).
[17] R. Allez and J.-P. Bouchaud, Eigenvector dynamics: General theory and some applications, Phys. Rev. E 86, 046202 (2012).
[18] G. Buccheri, S. Marmi, and R. N. Mantegna, Evolution of correlation structure of industrial indices of U.S. equity markets, Phys. Rev. E 88, 012806 (2013).
[19] M. Benzaquen, I. Mastromatteo, Z. Eisler, and J.-P.
Bouchaud, Dissecting cross-impact on stock markets: an empirical analysis, J. Stat. Mech. 023406 (2017).
[20] A. S. Kyle, Continuous Auctions and Insider Trading, Econometrica 53, 1315-1336 (1985).
[21] J.-P. Bouchaud, J. D. Farmer, and F. Lillo, How markets slowly digest changes in supply and demand, in "Handbook of financial markets: dynamics and evolution", ed. by T. Hens and K. Reiner Schenk-Hoppe (Elsevier, North-Holland, 2009), pp. 57-160.
[22] Z. Xiong and B. Zheng, On the eigenvalues of a specially rank-r updated complex matrix, Comput. Math. Appl. 57, 1645-1650 (2009).
[23] L. Liu, A. Patton, and K. Sheppard, Does anything beat 5-minute RV? A comparison of realized measures across multiple asset classes, J. Econometr. 187, 293-311 (2015).
[24] G. Golub, Some modified matrix eigenvalue problems, SIAM Rev. 15, 318-334 (1973).

2. Fundamental Market Neutral Maximum-Variance Portfolios
The results of this chapter were obtained in collaboration with Stanislav Kuperstein.

The Fundamental Market Neutral Maximum-Variance Portfolios

January 17, 2019
Abstract

We introduce the Maximum-Variance portfolio, which maximises the exposure to a given fundamental signal while remaining market neutral. Using real stock data, we show that the Maximum-Variance portfolio weights are proportional to the stock rankings with respect to the signal, implying that the signal sensitivities are uniformly distributed among the stocks. Those signals are derived from financial factors, like Book, Size, etc., and the portfolio constructions are performed independently for each factor. We argue that this results in a large overlap between the subspaces spanned by the Maximum-Variance portfolios and the leading eigenvectors of the sample correlation matrix. Reducing the initial space allows us to reproduce the eigenvalues of the sample correlation matrix with remarkable accuracy. We can thus replicate any alternative beta strategy more efficiently than by using the mainstream top/bottom approach. Moreover, our method makes it possible to mimic the eigenvalue dynamics. The empirical analysis is carried out on the 500 largest U.S. stocks, at different time scales and with the 24 most popular factors, both fundamental and sectoral. Under certain hypotheses, the Maximum-Variance portfolio also optimises the Sharpe ratio, although our 18 years of data are insufficient for statistically significant backtesting results.
Keywords: Portfolio Management, Factor Investing, Alternative Risk Premia, Correlation.
JEL classification: C5, C61, G11, G12, G23, G4.
This paper develops a theoretical portfolio, coined "Maximum-Variance" as it is obtained by optimising the variance correlated to a signal while minimising the specific risk, to provide a sound rule for determining optimal alternative beta portfolios. Alternative beta factors refer to the traditional Fama and French ([1], [2], [3]) setting, where the long/short positions hold the top/bottom 20% of stocks ranked by their factor signal (i.e., size, book-to-market, etc.). The fact that such long/short positions are not optimal is a crucial issue for investors. This issue is not even addressed by the classical "long-only" optimal management, because the latter optimises the Sharpe ratio, which is not adapted at all to alternative beta. To that end, our solution, hereafter "Maximum-Variance", which intends to improve on the 20% top/bottom approach, formulates this important problem in terms of maximising the risk exposure to the targeted factor, while minimising specific risk and other systematic risks.

The objective of the Maximum-Variance approach is to provide an optimal solution for the alternative beta portfolios. The risk-based "long-only" portfolios, namely the "Minimum Variance", the "Risk Parity",

In contrast to the (non-optimal) heuristic rule-based portfolios, the Value-Weighted and the Equally-Weighted. A drawback is that they have no theoretical grounds, in contrast to Mean-Variance theory. It is also the case for the most popular Equally-Weighted heuristic portfolio, known as the "1/n-portfolio", which assigns equal weights to all constituents (see e.g. [4] and [5]). It is relevant in the absence of any information on expected returns and on the covariance matrix. It is Mean-Variance optimal only if asset classes have the same expected returns and covariances [6].
Such an equal weighting scheme, expressed in terms of market value, tends to produce a systematically higher allocation to undervalued stocks at the expense of overvalued ones, which explains its high performance [7]. A drawback is that the diversification is not optimal in terms of risks.
or the "Maximum Diversification", provided solutions to overcome the inefficiency of market-capitalisation-weighted indices, but cannot help with the alternative beta issues. The properties of the "Minimum Variance", the "Risk Parity", or the "Maximum Diversification" have been well documented in the asset management literature (see e.g. [8], [9], [10], [11], [6]). Such risk budgeting allocation approaches are known to be robust because they do not require any return forecasts. It is the case for the well-known Minimum Variance portfolio [8], which has the lowest risk of all portfolios as it is located in the left-most part of the efficient frontier. It has the unique property that the optimal security weights depend solely on the security covariance matrix, without regard to the expected returns; hence, it does not rely on any specific expected return estimate (see e.g. [12]), which makes it appear more robust than the mean-variance framework. Minimum-Variance strategies have gained popularity notably due to the empirical finding that low-volatility stocks tend to have returns that exceed on average the market returns [13]. A drawback is that the portfolio concentration is very sensitive to noise in the covariance matrix, whereas the weights could be more equally distributed. This issue is solved through the Risk Parity approach, which induces a more conservative way of allocating assets according to their risk contribution to the portfolio. The Maximum Diversification relies on the concept of the diversification ratio, defined as the ratio of the weighted average of volatilities divided by the portfolio volatility [11]. The maximisation of the diversification ratio is equivalent to the minimisation of the variance in a universe where all stocks have the same expected volatility.
In this case where all stocks have the same volatility, the Most Diversified portfolio becomes equal to the global Minimum Variance portfolio. The objective function is motivated by maximising the Sharpe ratio where expected asset returns are assumed to be proportional to asset volatility. All these portfolios have optimal risk-based weight equations determined by semi-closed-form analytical solutions, where Minimum-Variance weights are generally proportional to the inverse variance, while Maximum Diversification and Risk Parity weights are generally proportional to the inverse volatility [10]. It is also the case for the market neutral Maximum-Variance portfolio that we introduce in this paper. The difference is that the Maximum-Variance portfolio optimises, under a market neutral constraint, a ratio different from the Sharpe ratio, which appears not adapted to the alternative beta issue. To do so, we introduce the concept of the Factor "Correlation Level" (FCL) as the ratio between the variance of the portfolio and the variance the portfolio would have if the correlations between single stocks were zero. The FCL can also be simply interpreted as the average correlation between stocks within a given portfolio, or as a weighted average of the eigenvalues of the correlation matrix, with the weights given by the squares of the projections of the portfolio on the eigenvectors. In fact, the FCL ratio is closely related to the diversification ratio, which can also be defined as the square root of the ratio between the variance of the portfolio as if correlations were one and the variance of the portfolio.
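The FCL definition above fits in a few lines; a minimal illustration (the function name and the toy inputs are ours, not from the text):

```python
import numpy as np

def fcl(w, C, sigma):
    """Hypothetical sketch of the Factor Correlation Level: the ratio of
    the portfolio variance to the variance the portfolio would have if
    all cross-stock correlations were zero.
    w: portfolio weights, C: correlation matrix, sigma: volatilities."""
    x = w * sigma                  # volatility-scaled weights
    var = x @ C @ x                # portfolio variance
    var_uncorr = np.sum(x**2)      # variance with zero correlations
    return var / var_uncorr

# toy example: 3 stocks with uniform pairwise correlation 0.3
C = np.full((3, 3), 0.3)
np.fill_diagonal(C, 1.0)
w = np.array([1/3, 1/3, 1/3])
sigma = np.array([0.2, 0.2, 0.2])
print(fcl(w, C, sigma))   # 1 + (N-1)*0.3 = 1.6 for equal weights/vols
```

For equal weights and volatilities the ratio reduces to a normalized sum of the correlation matrix entries, which is why it tracks the average correlation inside the portfolio.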
We argue that the FCL plays a role equivalent to that of the Sharpe ratio in the context of alternative beta portfolios, but can be measured more accurately (the theoretical expected Sharpe ratio of a market neutral portfolio is null under the efficient market hypothesis, and, empirically, the expected returns are rarely significantly different from zero and remain very controversial in the literature [14]). We also show that the Maximum-Variance portfolio has a robust and theoretically optimal Sharpe ratio under some hypotheses that differ from the efficient market ones.

The Maximum-Variance portfolio is defined as the market neutral portfolio that optimises the FCL, when using a Two-Factor model for the returns. The Two-Factor model, including the dominant factor and the style of interest, is justified as the effect of additional orthogonal factors is small [15]. We argue that the parameters of the model can be fitted via a linear law depending on the ranking of the stocks according to the signal. In fact, this weighting scheme has already been implemented in [16] for the value and momentum factors, as a means to reduce the influence of outliers (lowest and highest ranked stocks), but without further theoretical or empirical justification. Remarkably, the law seems to be universal for any signal. Modelling the correlation matrix in this way introduces a constraint and avoids obtaining the sample eigenvectors as the optimal portfolios. The style of interest can be sectoral risk or include fundamental factors that have mainly been formed on firm characteristics such as capitalisation [17, 1, 2], book-to-market [18, 1, 2], low volatility/beta [19, 13] or momentum [20, 21], to quote only the most popular. The authors of [22] study more than 330 return predictive signals that are mainly accounting-based and show the large diversification benefits obtained by suitably combining these signals.
In [23] an out-of-sample approach is used to study the post-publication bias of discovered anomalies. The overall finding of this literature is that many discovered factors are likely false. In [24] more than 300 different factors were listed, claimed to capture an alternative risk premium, but the authors argue that most claimed research findings are likely false. In [14] 14 major factors were selected, and it was shown that the original market factor is by far the most important factor in explaining the cross-section of expected returns.

Market participants have extended this risk-based investment style with what is generally named the alternative risk premia, which correspond to any risk premia that can be earned by building long-short portfolios exposed to common risk factors, as opposed to the long-only exposure of the so-called "traditional risk premia" approach. Alternative risk premia can be aggregated to build multi-factor portfolios likely to extend the horizon of assets for a better diversification, while providing not only a more economically meaningful investment opportunity, but also a more transparent systematic risk exposure to investors. Generally, risk premia factor constructions are not optimal, as they take a top-down approach to gain access to the factor of interest and are beta market neutral. In the classical setting [1, 2, 3], factor construction is generally based on a long-short basket approach that is long/short the top/bottom 20% of stocks ranked by the factor signal (i.e., size, book-to-market, etc.). Other approaches from the industry, which are not necessarily disclosed, are often of a bottom-up type. Additionally, the methodology applies constraints on region-sector exposure, maximum constituent weights, liquidity and turnover, and makes little effort to discuss the problems of optimisation.
The optimised risk premia factors, obtained through the Maximum-Variance approach, constitute a set of portfolios that makes it easier to replicate any alternative beta strategy than a set of 20% top/bottom risk premia factors would. Moreover, the conventional way to identify factors to model the correlation matrix is through fundamental multi-factor models. These models specify that expected returns are linearly related to the weights of the common factors, but generally remain silent on the number of factors, which has induced some controversy in the financial economics literature. We argue in our paper that using the optimised factors should improve the explanatory power of the cross-section of single stock returns, and that the FCL should be a criterion of the importance of a factor, used to decide whether or not to select it into the model. Furthermore, when using enough factors, the optimised factors can also be used as an efficient filter.

Indeed, the multi-factor models of security market returns can be divided into three types (macroeconomic, fundamental and statistical), but the fundamental model remains more suited to assessing whether those factors are associated with risk premia. The fundamental model slightly outperforms the statistical model, as it explains 42.6% of the total explanatory power, against 39% for statistical models and only 10.9% for the macroeconomic model [25].

Recall that the multi-factor model, initially formulated in [26] through the Arbitrage Pricing Theory (APT), offers a testable alternative to the Capital Asset Pricing Model (CAPM), introduced initially in [27]. The two major differences between the APT and the CAPM are that the APT allows for more than just one generating factor and demonstrates that every equilibrium will be characterized by a linear relationship between each asset's expected return and its return's loading on the common factors [28].

Notice that while [28] suggests that APT results are invariant to rotation of the original factors, [29] brings a nuance by stating that the statistical tests for the number of priced factors are not invariant to rotation.

Some early influential papers were written on that particular topic in the financial economics literature. [30] explain that as one increases the number of securities, the number of factors increases. [31] demonstrate that returns can be linearly related to n factors if n eigenvalues of the covariance matrix of returns become large as the number of securities increases. [32] finds evidence that at most one eigenvalue dominates the covariance matrix, indicating that a one-factor model may describe security pricing. [33] obtain that the number of factors common across securities is limited to three or five. [34] find the existence of only one dominant factor and suggest that the effect of adding orthogonal factors is small. [35] shows that one market factor explains the major part of security returns. [36] explain that if n is the correct number of pervasive factors, then there should be no significant decrease in the cross-sectional mean square of idiosyncratic returns when moving from n to n + 1 factors. A wide collection of influential papers has also addressed this issue in the econophysics literature, reaching roughly comparable conclusions around the idea of one large eigenvalue (see e.g. [37], [38], [39]).

The projection of the eigenvectors into the subspace of the optimised factors could be a very good approximation of the true eigenvectors. The main constrained eigenvectors are combinations of the Maximum-Variance portfolios with the highest FCL. As the FCL can change abruptly depending on market conditions, the eigenvectors can also change abruptly, which explains the difficulty in directly interpreting the different eigenvectors.
The Maximum-Variance approach finally makes it possible to filter the correlation matrix; it relies on a diagonalisation method that yields orthogonal factors as close as possible to the eigenvectors of the correlation matrix. It corresponds to a dimension reduction introduced by economic constraints to filter noise. Roughly speaking, those economic constraints applied to these factors act like "constrained eigenvectors" and constitute a kind of filter of the correlation matrix, which consists in reducing the dimension of the traded universe from several hundred single stocks or more to a few tens of factors. The filter introduced by the economic constraints helps to interpret the first constrained eigenvectors and also allows one to capture small eigenvalues that are typically hidden by noise. The Maximum-Variance approach makes it possible to show that small eigenvalues mainly come from combinations of styles like capitalisation and book. The filter is complementary to the standard approach based on Random Matrix theory, which makes statements about the density of the eigenvalues of large random matrices. The empirical correlation matrix computed on a given realisation must be distinguished from the true correlation matrix of the underlying statistical process, as the large number of simultaneous noisy variables creates important systematic errors in the computation of the matrix eigenvalues [46]. To our knowledge, there has so far been no published study on how to build optimal portfolios for alternative beta, despite the current major issues at stake with the emergence of alternative risk premia vehicles as the low-cost and efficient alternative to hedge funds made available to any investor to manage assets. Building upon the body of evidence discussed above, this paper provides a novel methodology with a closed-form solution to compute optimal risk premia portfolios.
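The idea of constrained eigenvectors restricted to a factor subspace can be sketched as follows (a minimal illustration with synthetic data; the matrices `W` and `C` are toy stand-ins for the factor-portfolio weights and the stock correlation matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 100, 5

# toy stand-ins: W holds K factor-portfolio weight vectors (columns),
# C is a correlation matrix built from a few dominant factors plus noise
W = rng.normal(size=(N, K))
B = rng.normal(size=(N, 3))
A = B @ B.T + np.eye(N)
d = np.sqrt(np.diag(A))
C = A / np.outer(d, d)                # normalize to unit diagonal

# constrained eigenvectors: eigenvectors of C restricted to span(W)
Q, _ = np.linalg.qr(W)                # orthonormal basis of the subspace
mu, U = np.linalg.eigh(Q.T @ C @ Q)   # K constrained eigenvalues/vectors
V = Q @ U                             # lifted back to the N-dim stock space

lam = np.sort(np.linalg.eigvalsh(C))
# Poincare separation: each constrained eigenvalue is bounded by the
# matching unconstrained one, so a well-chosen subspace recovers the
# leading eigenvalues from below
assert np.all(np.sort(mu) <= lam[-K:] + 1e-10)
assert np.allclose(V.T @ V, np.eye(K))  # constrained eigenvectors orthonormal
print(np.sort(mu)[::-1])
```

With a subspace that overlaps well with the leading eigenvectors (here `W` is random, so the overlap is poor), the constrained eigenvalues approach the unconstrained ones, which is the replication property exploited in the text.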
To test the Maximum-Variance methodology, we use stock returns at multiple time scales (from 5 minutes to 100 days) from a panel of the 500 largest U.S. stocks since 2000 to compute the different Maximum-Variance portfolios and to test the improvement brought by the optimisation in the capture of the eigenvalues and their dynamics. This shows that optimising factors helps to increase the explanatory power of the cross-section of single stock returns and to better replicate any alternative beta strategy. We also test the improvement of the Sharpe ratio but, as expected, the measurements are not significant. We have restricted ourselves to the 24 most popular factors according to the literature (market mode, dividend yield, capitalisation, volume/capitalisation, STR,

[42] already introduced in 1971 the concept of constrained eigenvalues and eigenvectors in a subspace. They correspond to the eigenvalues and eigenvectors of a transformed matrix projected into the subspace.

The need to "clean" the empirical correlation matrix requires a device for distinguishing signal from noise [43]; it requires distinguishing meaningful eigenvalues (beyond the edge) from noisy ones (inside the bulk), given that all eigenvalues in the bulk of the Marčenko-Pastur spectrum are deemed as noise; to that end, [44] define a threshold that separates only those eigenvalues that are outside the noise band. The standard approach to clean up the empirical correlation matrix requires separating the largest eigenvalue, economically interpreted as the "market mode", from the bulk where all other eigenvalues reside and are buried under the noise [45].
The main objective of this paper is to show how one can "clean" a correlation matrix by using supplementary information regarding the time-series at hand. Two series having similar characteristics according to the new input information should have close correlations with the other time-series. Our method allows one to reduce the initial N-dimensional space to a smaller K-dimensional subspace, where N and K are the numbers of time-series and characteristics respectively. The method consists of two steps. First, we find K one-dimensional subspaces, one for each characteristic independently. Second, we determine the optimal eigenvectors in the K-dimensional subspace, which is the sum of the K smaller subspaces. The dimension is thus reduced either from N (the number of single stocks) to N-1 (the dimension of the subspace of returns orthogonal to the stock index, which is highly correlated to the first eigenvector), or from N to K (the number of characteristics), or from N to Q (the number of quantiles used to group stocks with similar characteristics). Our approach relies on the fact that constrained eigenvectors, namely those forced to belong to a given subspace, are also the eigenvectors of the matrix reduced to this subspace. If the chosen reduced space (for example, the K Maximum-Variance or the Q quantile portfolios) has a sufficient overlap with the space spanned by the leading eigenvectors, then the constrained eigenvalues will be close to the unconstrained eigenvalues of the correlation matrix.

In Section 2.1 we define the covariance and overlap matrices of N elementary single stocks. The covariance and overlap matrices between any K or Q portfolios can be generalized. We then define the Factor Correlation Level (FCL), which measures the variance of a portfolio when it is normalized by the overlap matrices.

In Section 2.2 we introduce the Two-Factor model, which can be implemented independently for each financial characteristic.
It helps us to generate one solution per characteristic, implemented independently from the other characteristics, that optimizes the FCL and is correlated to the characteristic of interest. In Section 2.3 we define the Fundamental Maximum-Variance portfolio as such a solution and we derive an explicit formula from the Two-Factor model parameters. In Section 2.4 we demonstrate that the Maximum-Variance portfolios are the portfolios replicating as well as possible the entire correlation matrix given a selection of characteristics. In Section 2.5 we describe a methodology to measure the parameters of the Two-Factor model, the factor loadings. In Section 2.6 we comment on an empirical universal law that we found for the optimal factor loadings, and in Section 2.7 we finally compare our methodology with the usual ordinary least squares (OLS) and the Fama-MacBeth regressions. In Section 2.8 we provide some plausible theoretical patterns of the correlation matrix of single-stock returns that explain why the overlap between the K Maximum-Variance portfolios and the space spanned by the leading eigenvectors is higher than we could have expected if the eigenvectors were randomly generated.

Let us first introduce the notations that we will use throughout the paper. $r_i(t)$ are the time series of $N$ single-stock returns with $i = 1, \ldots, N$. For a given functional $\varphi(r_i(t))$ we will denote by $E_{t^-}(\varphi(r_i(t)))$ the conditional expectation of $\varphi(r_i(t))$ based on the information available from the entire previous period. For instance, we will assume that $E_{t^-}(r_i(t)) = 0$, unless otherwise mentioned. With this assumption $\boldsymbol{\Sigma}$ is the volatility vector defined by $\Sigma_i(t) = \sqrt{E_{t^-}(r_i^2(t))}$. With these conventions the conditional covariance and correlation matrices of returns are given by:

$$H_{ij}(t) = E_{t^-}(r_i(t)\, r_j(t)) \quad \text{and} \quad C_{ij}(t) = \frac{H_{ij}(t)}{\Sigma_i(t)\,\Sigma_j(t)}. \qquad (1)$$

Notice that $\Sigma_i = \sqrt{H_{ii}}$ and so $C_{ii}(t) = 1$ for any $t$, as it should be for a correlation matrix. For the upcoming definitions we will also define $\boldsymbol{\Gamma}$ as the covariance matrix of positions. In the special case where the constituents of the base do not have any common position, as is the case for the elementary single stocks, the matrix $\boldsymbol{\Gamma}$ will correspond to the diagonal matrix of variances $\Gamma_{ij} \equiv \Sigma_i^2\, \delta_{ij}$, from which

$$C = \boldsymbol{\Gamma}^{-1/2}\, H\, \boldsymbol{\Gamma}^{-1/2}. \qquad (2)$$

Since the correlation matrix is less biased towards high-volatility stocks and is therefore better adapted to the application considered in this paper than the covariance matrix, the $\boldsymbol{\Gamma}^{-1}$ matrix will often be used to calculate vector products. For example, we will refer to two vectors $u_1$ and $u_2$ as $\boldsymbol{\Gamma}^{-1}$-orthogonal if they satisfy $u_1^T \boldsymbol{\Gamma}^{-1} u_2 = 0$. In financial terms this reduces the weights of companies with large volatilities. As volatilities change with time, thus generating heteroscedasticity, this is another reason to prefer the correlation over the covariance matrix.

A generic portfolio $p$ is determined by $N$ (time-dependent) weights $\omega^p_i(t)$ and its return is then given by $r_p(t) = \sum_i \omega^p_i(t)\, r_i(t)$. Among all possible $\boldsymbol{\omega}^p$'s the market-mode portfolio plays a special role. We will denote it by $\boldsymbol{\omega}_m(t)$ and the corresponding return by $r_m(t)$. In the paper, $r_m$ is taken to be the stock index return, which is very close to the value-weighted portfolio return. In the latter the weight of each stock is proportional to its market capitalisation, which we will denote by $Cap_i(t)$. The market-mode portfolio weights are therefore close to the principal component of the matrix $\sqrt{Cap_i}\, H_{ij}\, \sqrt{Cap_j}$ (with neither $i$ nor $j$ summations). $r_m(t)$ provides a good proxy of the accumulated gain or loss of all investors. It thus corresponds to the real systematic risk that all investors want to avoid in alternative beta vehicles in order to diversify and optimise their investments. The value-weighted portfolio differs slightly from the principal component of $C$.
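Relation (2) between the covariance and correlation matrices is easy to check numerically; the minimal sketch below uses arbitrary simulated returns:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))        # 200 observations of 5 simulated returns
H = np.cov(A, rowvar=False)              # covariance matrix H
# Γ^{-1/2}, with Γ the diagonal matrix of variances diag(H)
G_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(H)))
C = G_inv_sqrt @ H @ G_inv_sqrt          # C = Γ^{-1/2} H Γ^{-1/2}, eq. (2)
```

As expected, `C` has a unit diagonal and coincides with the correlation matrix computed directly from the data.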
The two portfolios produce highly correlated returns, but the value-weighted portfolio is invested primarily in companies with large capitalisation, whereas the principal component of $C$ is invested mainly in small firms, as they are better represented in portfolios with large $N$.

In our conventions a given stock beta describes the conditional sensitivity of the stock return to the stock index:

$$\beta_i(t) \equiv \frac{E_{t^-}(r_i(t)\, r_m(t))}{E_{t^-}\!\left((r_m(t))^2\right)} \quad \text{or} \quad \boldsymbol{\beta} \equiv (\Sigma_m)^{-2} \cdot H \boldsymbol{\omega}_m \qquad (3)$$

With a few obvious exceptions we will use the bold font for matrices and vectors with no explicit indices in the matrix notations. We will also need an additional time-series of returns derived from $\boldsymbol{\beta}$:

$$r_{m\star}(t) = \frac{\sum_{j=1}^N r_j\, \beta_j\, \Sigma_j^{-2}}{\sum_{k=1}^N \left(\beta_k\, \Sigma_k^{-1}\right)^2}, \qquad (4)$$

where the time-dependence on the right-hand side is, again, implicit. For reasons to be clarified below, we will refer to $r_{m\star}(t)$ as the return of the Maximum-Variance market-mode portfolio. This portfolio is optimal in a sense to be clarified shortly. The relation between $r_{m\star}(t)$ and $r_m(t)$ is straightforward. Starting from the stock index return $r_m(t)$ one can compute the betas as in (3). On the other hand, knowing the full set of $\beta_{i=1,\ldots,N}(t)$ and the stock returns $r_i(t)$, the index return might be reproduced (or, to be more precise, approximated) by $r_{m\star}(t)$. Importantly, if the betas' time-dependence is negligible, we may interpret (3) and (4) as outputs of the weighted least-squares (WLS) and the ordinary least-squares (OLS) regressions respectively, both based on the relation $r_i(t) = \beta_i\, r_m(t)$. In Appendix A we show that for a fixed time-period $T$, computing iteratively the (constant) betas from the market mode and the market return from the betas leads (after a sufficiently large number of iterations) to fixed-point values, where the identity $r_{m\star}(t) = r_m(t)$ holds and the product $\beta_i\, r_m(t)$ is merely the principal component (PC) of the correlation matrix.

It will later be useful to introduce two additional notations: the $\rho_H$-correlation of two portfolios $\boldsymbol{\omega}_{p_1}$ and $\boldsymbol{\omega}_{p_2}$, which denotes the conditional correlation between the corresponding returns $r_{p_{1,2}}(t) = \sum_i \omega^{p_{1,2}}_i(t)\, r_i(t)$, and the similarly defined position overlap $\rho_\Gamma$:

$$\rho_H(\boldsymbol{\omega}_{p_1}, \boldsymbol{\omega}_{p_2}) \equiv \frac{\boldsymbol{\omega}_{p_1}^T H\, \boldsymbol{\omega}_{p_2}}{\sqrt{\left(\boldsymbol{\omega}_{p_1}^T H\, \boldsymbol{\omega}_{p_1}\right)\left(\boldsymbol{\omega}_{p_2}^T H\, \boldsymbol{\omega}_{p_2}\right)}} \qquad \rho_\Gamma(\boldsymbol{\omega}_{p_1}, \boldsymbol{\omega}_{p_2}) \equiv \frac{\boldsymbol{\omega}_{p_1}^T \boldsymbol{\Gamma}\, \boldsymbol{\omega}_{p_2}}{\sqrt{\left(\boldsymbol{\omega}_{p_1}^T \boldsymbol{\Gamma}\, \boldsymbol{\omega}_{p_1}\right)\left(\boldsymbol{\omega}_{p_2}^T \boldsymbol{\Gamma}\, \boldsymbol{\omega}_{p_2}\right)}} \qquad (5)$$

A portfolio $p$ is market-neutral if its return is uncorrelated with $r_m(t)$, that is $E_{t^-}(r_p(t)\, r_m(t)) = 0$. In terms of the weights this amounts to $\rho_H(\boldsymbol{\omega}_m(t), \boldsymbol{\omega}_p(t)) = 0$. Using (3) this is also equivalent to

$$\boldsymbol{\beta}^T(t)\, \boldsymbol{\omega}_p(t) = 0. \qquad (6)$$

Employing these notations, we define the factor correlation level (FCL) mentioned earlier in the Introduction:

$$\lambda^{(0)}(\boldsymbol{\omega}_p) \equiv \frac{(\boldsymbol{\omega}_p)^T H\, \boldsymbol{\omega}_p}{(\boldsymbol{\omega}_p)^T \boldsymbol{\Gamma}\, \boldsymbol{\omega}_p}, \qquad (7)$$

where for the sake of shortness we omitted the time-dependence. It follows from (2) that, once $\boldsymbol{\Gamma}^{1/2}\boldsymbol{\omega}_p$ is chosen to be an eigenvector of the correlation matrix (2), the FCL is equal to the corresponding eigenvalue. Otherwise it can be interpreted as the weighted average of the eigenvalues of $C$ with the weights given by the squares of the eigenvectors' projections on $\boldsymbol{\omega}_p$. Indeed, if $e_{i=1,\ldots,N}$ are the eigenvectors of the correlation matrix in (2) and $\ell_{i=1,\ldots,N}$ are the corresponding eigenvalues, then $C = \sum_{i=1}^N \ell_i\, e_i e_i^T / \left(e_i^T e_i\right)$ and

$$\lambda^{(0)}(\boldsymbol{\omega}_p) = \sum_{i=1}^N (a^p_i)^2\, \ell_i, \quad \text{where} \quad a^p_i \equiv \rho_\Gamma\!\left(\boldsymbol{\Gamma}^{-1/2} e_i,\ \boldsymbol{\omega}_p\right) \quad \text{with} \quad \sum_{i=1}^N (a^p_i)^2 = 1. \qquad (8)$$

As we explained above, the main goal of this paper is to optimise the FCL (7) in a reduced subspace that contains the leading eigenvectors.
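The two properties of the FCL stated around (7) and (8) are easy to verify numerically: a portfolio built from an eigenvector of $C$ has an FCL equal to the corresponding eigenvalue, and, being a weighted average of eigenvalues, the FCL of any portfolio lies between the extreme eigenvalues. A minimal sketch on simulated returns:

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.standard_normal((300, 6))        # simulated returns
H = np.cov(R, rowvar=False)              # covariance matrix of returns
Gamma = np.diag(np.diag(H))              # Γ: diagonal matrix of variances

def fcl(w, H, Gamma):
    """Factor Correlation Level of portfolio weights w, eq. (7)."""
    return (w @ H @ w) / (w @ Gamma @ w)

# If Γ^{1/2} ω is an eigenvector of C = Γ^{-1/2} H Γ^{-1/2},
# the FCL of ω equals the corresponding eigenvalue.
Gi = np.diag(1.0 / np.sqrt(np.diag(H)))  # Γ^{-1/2}
vals, vecs = np.linalg.eigh(Gi @ H @ Gi)
w_top = Gi @ vecs[:, -1]                 # portfolio mapped from the leading eigenvector
```

For any other portfolio the FCL is the eigenvalue average of eq. (8), so it is bounded by the smallest and largest eigenvalues of $C$.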
To identify the subspace we use the financial information coming in the form of signals, as we review in the next section.

2.2 The Two-Factor Model

As briefly outlined in the Introduction, we want to optimise the FCL in (7) separately for each of the $K$ factors/styles. To estimate $\lambda^{(0)}$ we will approximate the returns using a Two-Factor model pertinent for a given factor, as described below. Each of these $K$ models employs the market mode $r_m(t)$ and an additional factor mode $r_{f_a}(t)$ for $a = 1, \ldots, K$ that captures the relevant style information encoded in a time-dependent signal $s_{a,i}(t)$. These $K$ different signals correspond to book, capitalisation and the other financial criteria listed in Table 14 of Appendix B. The signals determine (a priori time-dependent) rankings of the stocks. We will denote the rankings by $q_{a,i}(t)$ and we will drop the $a$-index all the way up to Section 2.4, focusing meanwhile on a single factor/style. (In this section we assume that the signals do not coincide.) The rankings take values among $1, \ldots, N$ and by definition $q_i(t) < q_j(t)$ if and only if $s_i(t) < s_j(t)$. For example, if $N = 3$ and $s_3(t) < s_1(t) < s_2(t)$, then $(q_1(t), q_2(t), q_3(t)) = (2, 3, 1)$.

The usual practice is to model $r_f(t)$ by the so-called benchmark portfolio return, $r_{b.m.}(t)$. The latter is built by buying the top 20% of the stocks and shorting the bottom 20%, while sizing the two legs to keep the portfolio beta market-neutral. The weights of the benchmark portfolio may appear either in the equal-weighted or the equal-risk-weighted version:

$$\omega^{b.m.}_i(t) = \begin{cases} \omega_+(t) \ \text{[equal-weighted]} \ \text{or} \ \omega_+(t)/\Sigma_i(t) \ \text{[equal-risk-weighted]}, & q_i(t) > 0.8\,N \\ 0, & 0.2\,N < q_i(t) \le 0.8\,N \\ -\omega_-(t) \ \text{or} \ -\omega_-(t)/\Sigma_i(t), & q_i(t) \le 0.2\,N \end{cases} \qquad r_{b.m.}(t) = \sum_{i=1}^N \omega^{b.m.}_i(t)\, r_i(t). \qquad (9)$$

Here $\omega_+(t)$ and $\omega_-(t)$ are fixed by the benchmark portfolio market-neutrality condition, see (6):

$$\sum_{i=1}^N \beta_i(t)\, \omega^{b.m.}_i(t) = 0, \qquad (10)$$

and by the overall normalisation $\max(\omega_+(t), \omega_-(t)) = 1/N$. Notice that the solutions for $\omega_+(t)$ and $\omega_-(t)$ depend on the choice of the version in (9). One may extend the benchmark portfolio construction to include stocks which are not part of our original selection of $N$ stocks. If the number of stocks is sufficiently large the specific risk will vanish and the new benchmark portfolio, $\boldsymbol{\omega}_f(t)$, will be an exogenous factor. We will denote by $r_f(t)$ the return of $\boldsymbol{\omega}_f(t)$. The Two-Factor model then relates the residual returns $r_i - \beta_i r_{m\star}$ and the factor return $r_f(t)$ as follows:

$$r_i(t) - \beta_i(t)\, r_{m\star}(t) = b_i\, r_f(t) + \epsilon_i(t) \quad \text{for} \quad i = 1, \ldots, N, \qquad (11)$$

where the last term stands for the idiosyncratic returns. We will refer to $b_i$ as factor loadings. Notice that $r_{m\star}$, rather than $r_m$, appears in this formula. The model has to satisfy two orthogonality conditions:

- All of the $\epsilon_i(t)$'s are uncorrelated with the factor return $r_f(t)$:

$$E_{t^-}\!\left(r_f(t)\, \epsilon_i(t)\right) = 0. \qquad (12)$$

- The vector $\boldsymbol{\Gamma}^{-1/2} b$ is an eigenvector of $\boldsymbol{\Gamma}^{-1/2} H_\epsilon\, \boldsymbol{\Gamma}^{-1/2}$, where $H_\epsilon(t) = E_{t^-}(\epsilon_i(t)\,\epsilon_j(t))$ is the covariance matrix of the idiosyncratic returns:

$$H_\epsilon(t)\, \boldsymbol{\Gamma}^{-1}(t)\, b = \ell_{\epsilon,b}(t) \cdot b, \qquad (13)$$

and the eigenvalue $\ell_{\epsilon,b}$ remains much smaller than the variance of the factor return:

$$\ell_{\epsilon,b} \ll \left(\Sigma_f\right)^2 \quad \text{where} \quad \left(\Sigma_f\right)^2 \equiv E_{t^-}\!\left(\left(r_f(t)\right)^2\right). \qquad (14)$$

We will clarify later in this section what we mean by sufficiently "small". Importantly, both requirements above are motivated by the principal component analysis of the correlation matrix. If the $\beta_i r_{m\star}(t)$ and the $b_i r_f$ terms were respectively the leading and the sub-leading terms in the PCA expansion of the correlation matrix $C$, the orthogonality of the modes (12) would not be an additional constraint but rather a direct consequence of the expansion. Moreover, it would also enforce $\ell_{\epsilon,b} = 0$. We show this in Appendix A. In our case neither the betas nor the market return are principal components of $C$, but we may adopt the orthogonality property as a good approximation for our model (see the previous subsection). Before closing the section we should stress that (11) is conceptually different from the multivariate factor model well explored in the literature. In that model there would be a single equation combining the market return, $K$ additional factor returns and one series of idiosyncratic returns. In our approach there are instead $K$ equations, one per factor, each with a different $\epsilon_i(t)$ term. This will allow us to optimise the FCL separately within each one of the $K$ Two-Factor models, thus eventually reproducing the largest eigenvalues of the correlation matrix. We come back to this issue in Appendix H.
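The benchmark construction of (9)-(10) can be sketched as follows for the equal-weighted version. The leg sizes $\omega_\pm$ are solved from beta-neutrality and the normalisation $\max(\omega_+, \omega_-) = 1/N$; the code assumes, for simplicity, that both legs have a positive aggregate beta (the data below are illustrative):

```python
import numpy as np

def benchmark_weights(q, beta):
    """Equal-weighted benchmark portfolio of eq. (9): long the top 20% of
    the ranking q (values 1..N), short the bottom 20%, with leg sizes
    w_plus, w_minus > 0 solved from beta-neutrality (eq. (10)) and the
    normalisation max(w_plus, w_minus) = 1/N.  Assumes both legs have a
    positive aggregate beta."""
    N = len(q)
    long_leg = (q > 0.8 * N).astype(float)
    short_leg = (q <= 0.2 * N).astype(float)
    b_long, b_short = beta @ long_leg, beta @ short_leg
    ratio = b_short / b_long                 # = w_plus / w_minus for neutrality
    w_plus = min(1.0, ratio) / N             # scale so max(w_plus, w_minus) = 1/N
    w_minus = min(1.0, 1.0 / ratio) / N
    return w_plus * long_leg - w_minus * short_leg

rng = np.random.default_rng(3)
N = 100
q = rng.permutation(N) + 1                   # rankings 1..N
beta = 0.5 + rng.random(N)                   # positive stock betas
w = benchmark_weights(q, beta)
```

The resulting portfolio is long 20 stocks, short 20 stocks, and has zero beta by construction.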
Given a fundamental signal, the corresponding Two-Factor model (11) generates the matrices $H(t)$, $\boldsymbol{\Gamma}(t)$ and $C(t)$ in terms of $r_m(t)$, $\beta_i(t)$, $r_f(t)$, $b_i$ and $\epsilon_i(t)$. Modelling $H(t)$ and $\boldsymbol{\Gamma}(t)$ by means of the Two-Factor model (11) helps to isolate, among the $N-1$ market-neutral portfolios whose weights $\omega_i(t)$ optimise (locally) the FCL (7), a single optimal portfolio that is highly correlated with the factor return $r_f$. We call this portfolio/vector the maximum-variance market-neutral factor and denote it by $\boldsymbol{\omega}^{(0)}_\star$ and the optimal FCL value by $\lambda^{(0)}_\star$ (as already mentioned we omit the factor index $a$; the $\star$ alludes to the FCL optimisation and the $(0)$ superscript will be clarified in subsection 2.4). To recapitulate:

$$\lambda^{(0)}_\star = \lambda^{(0)}\!\left(\boldsymbol{\omega}^{(0)}_\star\right) \quad \text{with} \quad \left.\frac{\delta \lambda^{(0)}(\boldsymbol{\omega})}{\delta \boldsymbol{\omega}}\right|_{\boldsymbol{\omega} = \boldsymbol{\omega}^{(0)}_\star} = 0 \quad \text{and} \quad \boldsymbol{\beta}^T \boldsymbol{\omega}^{(0)}_\star = 0, \qquad (15)$$

and for any other market-neutral portfolio $\boldsymbol{\omega}$ satisfying the optimisation condition in (15), yet different from $\boldsymbol{\omega}^{(0)}_\star$, one necessarily has

$$\rho_H\!\left(\boldsymbol{\omega}_f, \boldsymbol{\omega}\right) < \rho_H\!\left(\boldsymbol{\omega}_f, \boldsymbol{\omega}^{(0)}_\star\right) \qquad (16)$$

with the short-cut notations (5). (In fact the definition can be written as $\boldsymbol{\omega}^{(0)}_\star \equiv \operatorname{argmax} \rho_H(\boldsymbol{\omega}_f, \boldsymbol{\omega})$ over the optima of $\lambda^{(0)}(\boldsymbol{\omega})$ subject to $\boldsymbol{\beta}^T\boldsymbol{\omega} = 0$, but this definition is too cumbersome to be exploited.) Again, we dropped the explicit time-dependence in order to keep the formulae short and readable. The optimisation in fact takes place for any $t$, and thus $\lambda^{(0)}_\star = \lambda^{(0)}_\star(t)$ is also time-dependent. The weights $\boldsymbol{\omega}^{(0)}_\star$ are called the maximum-variance market-neutral portfolio, with the first part of the name coming from the fact that it maximises (optimises) the variance (the numerator in the FCL definition (7)), while maintaining market-neutrality and the correlation with the given signal. Importantly, the conditions (15) and (16) define $\boldsymbol{\omega}^{(0)}_\star$ only up to an overall rescaling $\omega_i \to \text{const} \cdot \omega_i$. The ambiguity might easily be eliminated, for instance, by setting the $\boldsymbol{\omega}^{(0)}_\star$-portfolio variance to one.

It is worth emphasising here that $\boldsymbol{\omega}^{(0)}_\star$ does not necessarily provide a maximum of the FCL. In Appendix C we argue that the Hessian matrix obtained from an FCL-like Lagrangian at one of its optimal points has both positive and negative directions, thus describing a saddle point rather than a (local) maximum, which in turn appears only if we pick the highest eigenvalue as the optimisation solution. Despite the somewhat verbose definition of $\lambda^{(0)}_\star$, it has a simple straightforward interpretation. First, notice that the unconstrained optimisation of (7) is equivalent to solving for the eigenvalues of the matrix $\boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2}$. To see this, one can simply redefine the vector $\boldsymbol{\omega}$ by $\boldsymbol{\omega} \to \boldsymbol{\Gamma}^{1/2} \boldsymbol{\omega}$. Second, as we explain in detail in Appendix D, finding constrained eigenvectors $v_i$ of a square matrix $M$ subject to an additional requirement $c^T v_i = 0$ is equivalent to the unconstrained diagonalisation of $P_c M P_c$, where $P_c$ is a projection operator defined by $P_c\, c = 0$ and $(P_c)^2 = P_c$. To be more specific, if $l_i \neq 0$ and $u_i$ are an eigenvalue/eigenvector pair of $P_c M P_c$, then $v_i = P_c u_i$ is a constrained eigenvector of $M$. In our case $M = \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2}$ and the constraint vector is $c = \boldsymbol{\Gamma}^{-1/2} \boldsymbol{\beta}$, where the $\boldsymbol{\Gamma}$-factor comes from the aforementioned redefinition of $\boldsymbol{\omega}$.

To summarize, the maximum-variance market-neutral portfolio $\boldsymbol{\omega}^{(0)}_\star$ can be found following these four steps: 1. Find all eigenvectors with non-vanishing eigenvalues of the $P_c$-projection of the correlation matrix, $P_c \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2} P_c$, where $c = \boldsymbol{\Gamma}^{-1/2} \boldsymbol{\beta}$ and thus

$$(P_c)_{ij} = \delta_{ij} - \frac{\left(\beta_i\, \Sigma_i^{-1}\right)\left(\beta_j\, \Sigma_j^{-1}\right)}{\sum_{l=1}^N \left(\beta_l\, \Sigma_l^{-1}\right)^2}; \qquad (17)$$

notice that the maximum number of such vectors is $(N-1)$. 2. Calculate the $P_c$-projection of each of these eigenvectors, with $P_c$ given in (17). 3.
Multiply the projected eigenvectors by $\boldsymbol{\Gamma}^{-1/2}$. 4. To arrive at $\boldsymbol{\omega}^{(0)}_\star$, select among the available vectors the vector/portfolio producing the strongest correlation with the factor return, see (16).

Let us now rewrite (17) in terms of the returns defined by the Two-Factor model (11). Upon (11) the matrix $P_c \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2} P_c$ becomes:

$$\left(P_c \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2} P_c\right)_{ij} = \Sigma_i^{-1}\Sigma_j^{-1}\, E_{t^-}\!\left(\left(b_i r_f + \epsilon_i\right)\left(b_j r_f + \epsilon_j\right)\right) = \left(\boldsymbol{\Gamma}^{-1/2}\left(\left(\Sigma_f\right)^2 \cdot b\,b^T + H_\epsilon\right)\boldsymbol{\Gamma}^{-1/2}\right)_{ij}. \qquad (18)$$

Here $(\Sigma_f)^2 = E_{t^-}((r_f(t))^2)$ is the conditional variance of the factor return. In deriving this result we used the first property of the Two-Factor model (12) and the relation

$$\sum_j (P_c)_{ij}\left(\frac{r_j}{\Sigma_j}\right) = \frac{1}{\Sigma_i}\left(r_i - \beta_i\, r_{m\star}\right), \qquad (19)$$

which follows directly from (17), as well as the definitions (3) and (4). Notice now that by virtue of the last property of the Two-Factor model, see (13), the vector $\boldsymbol{\Gamma}^{-1/2} b$ is an eigenvector of the matrix $P_c \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2} P_c$. Indeed, it is an eigenvector of all terms on the right-hand side of (18), so it should also be an eigenvector of their sum. Finally, $\boldsymbol{\Gamma}^{-1/2}$ times the $P_c$-projection of the vector $\boldsymbol{\Gamma}^{-1/2} b$ is equal to:

$$\left(\omega^{(0)}_\star\right)_i(t) \sim \Sigma_i^{-2}(t)\left(b_i - \frac{\sum_{j=1}^N b_j\, \beta_j(t)\, \Sigma_j^{-2}(t)}{\sum_{k=1}^N \beta_k^2(t)\, \Sigma_k^{-2}(t)}\, \beta_i(t)\right) \quad \text{or} \quad \boldsymbol{\omega}_\star(t) \sim \boldsymbol{\Gamma}^{-1}(t)\left(b - \frac{b^T \boldsymbol{\Gamma}^{-1}(t)\, \boldsymbol{\beta}(t)}{\boldsymbol{\beta}^T(t)\, \boldsymbol{\Gamma}^{-1}(t)\, \boldsymbol{\beta}(t)} \cdot \boldsymbol{\beta}(t)\right). \qquad (20)$$

This is precisely the maximum-variance market-neutral portfolio that we defined in this section. All we have to do in order to verify it is to calculate the correlation between the factor return $r_f(t)$ and the return of the portfolio (20). Using the definition of $r_{m\star}(t)$ in (4) and the Two-Factor model (11) we obtain:

$$\sum_{i=1}^N \left(\omega^{(0)}_\star\right)_i r_i(t) \sim \sum_{i=1}^N \frac{b_i}{\Sigma_i^2(t)}\left(r_i(t) - \beta_i(t)\, r_{m\star}(t)\right) = \sum_{i=1}^N \frac{b_i^2}{\Sigma_i^2(t)} \cdot r_f(t) + \sum_{i=1}^N \frac{b_i\, \epsilon_i(t)}{\Sigma_i^2(t)}. \qquad (21)$$

The property (12) then guarantees that

$$E_{t^-}\!\left(r_f \cdot \sum_{i=1}^N \left(\omega^{(0)}_\star\right)_i r_i\right) \sim b^T \boldsymbol{\Gamma}^{-1} b \cdot \left(\Sigma_f\right)^2 \qquad E_{t^-}\!\left(\left(\sum_{i=1}^N \left(\omega^{(0)}_\star\right)_i r_i\right)^2\right) \sim \left(b^T \boldsymbol{\Gamma}^{-1} b\right)^2 \left(\Sigma_f\right)^2 + b^T \boldsymbol{\Gamma}^{-1} H_\epsilon \boldsymbol{\Gamma}^{-1} b. \qquad (22)$$

Using (13) and the definition of $\rho_H$ from (5) we thus have:

$$\rho_H\!\left(\boldsymbol{\omega}_f, \boldsymbol{\omega}^{(0)}_\star\right) = \left(1 + \frac{\ell_{\epsilon,b}}{(\Sigma_f)^2\; b^T \boldsymbol{\Gamma}^{-1} b}\right)^{-1/2}. \qquad (23)$$

We see, finally, that (14) ensures a strong correlation between the return of the portfolio (20) and the factor return $r_f(t)$, and therefore for sufficiently small $\ell_{\epsilon,b}\left(\Sigma_f\right)^{-2}$ all other portfolios satisfying (15) will also fulfil the second part of the maximum-variance market-neutral portfolio definition, the inequality (16).

In Section 2.5 we will present a robust method to estimate the factor loadings $b_i$. We will see that the ratios $b_i/\Sigma_i$ can be well modelled empirically by the ranking of the stocks according to the signal. The estimate works well for most of the factors/styles, confirming the universality of the Two-Factor model. Before closing this subsection let us make two comments:

- There is a connection between the Two-Factor model (11) and the principal component analysis (PCA). In the paragraph following (4) we have already mentioned the way $r_m$, $r_{m\star}$ and $\boldsymbol{\beta}$ are related to the PC of the correlation matrix $C$. Similarly, the right-hand side of (11) alludes to the sub-leading term in the PCA. In Appendix A we present an expression for $b_i$ as if it were derived from the WLS regression of the left-hand side of (11) with respect to $r_f$.
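The closed-form weights (20) and the four-step constrained-eigenvector procedure can be checked against each other numerically. The sketch below builds a Two-Factor covariance $H = \Sigma_m^2\,\beta\beta^T + \Sigma_f^2\,b b^T + H_\epsilon$ with idiosyncratic variances chosen so that $H_\epsilon$ is proportional to $\boldsymbol{\Gamma}$, which makes condition (13) hold exactly; all numerical values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
beta = 0.5 + rng.random(N)              # stock betas
b = rng.standard_normal(N)              # factor loadings
s_m2, s_f2, eta2 = 0.04, 0.01, 0.2      # market/factor variances, idio level

# Idiosyncratic variances chosen so that H_eps = eta2 * Gamma exactly,
# hence condition (13) holds with eigenvalue l_{eps,b} = eta2.
x = s_m2 * beta**2 + s_f2 * b**2
d = eta2 / (1 - eta2) * x               # idiosyncratic variances
H = s_m2 * np.outer(beta, beta) + s_f2 * np.outer(b, b) + np.diag(d)
gamma = np.diag(H)                      # Γ as a vector of variances

# Closed-form maximum-variance market-neutral weights, eq. (20)
coef = (b / gamma) @ beta / ((beta / gamma) @ beta)
w = (b - coef * beta) / gamma

# Verify the constrained-eigenvector property of steps 1-3:
c = beta / np.sqrt(gamma)               # constraint vector c = Γ^{-1/2} β
P = np.eye(N) - np.outer(c, c) / (c @ c)
M = H / np.outer(np.sqrt(gamma), np.sqrt(gamma))   # Γ^{-1/2} H Γ^{-1/2}
v = np.sqrt(gamma) * w                  # back to the eigenvector variable Γ^{1/2} ω
u = P @ M @ v
lam = (v @ u) / (v @ v)                 # its constrained eigenvalue
```

Under this construction the market term is annihilated by the projector ($P_c\, \Gamma^{-1/2}\beta = 0$), so $\Gamma^{1/2}\boldsymbol{\omega}^{(0)}_\star$ is an exact eigenvector of the projected matrix and the portfolio is beta-neutral.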
In particular, this expression leads to thesecond eigenvector of C (and so ‘ (cid:15),bbb = 0 ), provided r f is replaced by the sub-leading term in the PCAexpansion of the correlation matrix. It is very important to stress, nevertheless, that we do not expect b to resemble any of C ’s eigenvectors, as it would have implied, among other things, that b is the samefor all factors. Instead, the procedure described in Section 2.5 leads to di(cid:27)erent results for b i . We use the ∼ sign to keep in mind the overall rescaling. The weights of the Maximum-Variance market-mode portfolio, that was (cid:28)rst mentioned below (4), are ( ω m ? ) j ( t ) = β j Σ − jN P k =1 (cid:0) β k Σ − k (cid:1) , (24)In Appendix E we argue that ωωω m ? optimizes the FCL but without any market-neutral constraint treatinginstead the market mode as yet another factor and replacing b i and r f in (11) by β i and r m ? respectivelyand drop the β i r m ? term on the left-hand side. The Maximum-Variance market-mode portfolio couldbe interpreted as the complementary portfolio of the Minimum Variance and Maximum Diversi(cid:28)cationportfolios. Contrary to these two popular portfolios, the weights (24) are higher for higher-beta stocks. Once the K styles are selected and the corresponding K di(cid:27)erent Maximum-Variance portfolios (cid:16) ωωω (0) ? (cid:17) a ( a = 1 , . . . , K ) are derived, we may also determine their best combinations to match the unconstrainedeigenvectors of the full N × N correlation matrix. In a very idealistic (and highly unrealistic) scenario eachfactor corresponds to a di(cid:27)erent PC of the stock returns. In practice, the vectors (cid:16) ωωω (0) ? (cid:17) a are not eigenvectorsof neither H nor ΓΓΓ − / H ΓΓΓ − / , and they are by no means expected to be orthogonal. Therefore, in orderto mimic the (largest) eigenvalues of the correlation matrix, we have to diagonalize H in the K -dimensionalsubspace spanned by (cid:16) ωωω (0) ? 
(cid:17) , · · · , (cid:16) ωωω (0) ? (cid:17) K subject to the ΓΓΓ -product normalisation. In other words, we haveto (cid:28)nd a new base of K vectors (cid:16) ωωω (1) ? (cid:17) a = R ba (cid:16) ωωω (0) ? (cid:17) b , which maximise (cid:16) ωωω (1) ? (cid:17) T a H (cid:16) ωωω (1) ? (cid:17) a for each a withthe constraint (cid:16) ωωω (1) ? (cid:17) T a ΓΓΓ (cid:16) ωωω (1) ? (cid:17) b = δ ab .This, in turn, is equivalent to the optimisation of (cid:16) RhR T (cid:17) aa for each a with the extra constraint of R γγγ R T = I K , where γ ab ≡ (cid:16) ωωω (0) ? (cid:17) T a ΓΓΓ (cid:16) ωωω (0) ? (cid:17) b and h ab ≡ (cid:16) ωωω (0) ? (cid:17) T a H (cid:16) ωωω (0) ? (cid:17) b . This is equivalent to sayingthat O = R γγγ / is the diagonalisation matrix of h : O γγγ − / h γγγ − / O T = λ (1) ? · · · ... . . . ... · · · λ (1) ? K , (25)where λ (1) ? a ’s are the eigenvalues to be compared to those of the empirical correlation matrix. This formulareduces to the diagonalisation of the full correlation matrix for K = N , since in this case γγγ = ΓΓΓ , h = H .Crucially, for any other K the matrix γγγ is not diagonal as it is no longer related to the covariance matrix by γγγ = diag ( h ) . Indeed, a necessary condition for γγγ = diag ( h ) to happen is to have zero overlap between thefactor portfolios (cid:16) ωωω (0) ? (cid:17) a , which is very unrealistic. Consequently, γγγ − / h γγγ − / does not have ones along itsmain diagonal, and so the trace is not equal to K .The matrices γγγ and h have a simple relation to the FCLs of Maximum-Variance portfolios: λ (0) ? a ≡ λ (0) (cid:16) ωωω (0) ? a (cid:17) = h aa γ aa for a = 1 , . . . , K , (26)where we used the FCL de(cid:28)nition (7). Alternatively, one can say that λλλ (0) ? are the eigenvalues of the diagonalmatrix diag ( γγγ ) − diag ( h ) diag ( γγγ ) − . 12e interpret ωωω (1) ? a and λ (1) ? 
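The constrained diagonalisation of (25) amounts to a generalized eigenvalue problem in the $K$-dimensional subspace. A minimal sketch with random stand-ins for the $K$ factor portfolios; since $\boldsymbol{\gamma}$ is not diagonal, $\boldsymbol{\gamma}^{-1/2}$ is computed via its symmetric eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(6)
N, K, T = 40, 5, 500
R = rng.standard_normal((T, N)) + 0.3 * rng.standard_normal((T, 1))  # add a market mode
H = np.cov(R, rowvar=False)
Gamma = np.diag(np.diag(H))
W0 = rng.standard_normal((N, K))     # columns: stand-ins for the K factor portfolios

h = W0.T @ H @ W0                    # h_ab: covariance of the K portfolio returns
g = W0.T @ Gamma @ W0                # γ_ab: the position-overlap matrix (not diagonal)

# γ^{-1/2} via the symmetric eigendecomposition of γ, then eq. (25)
gv, gU = np.linalg.eigh(g)
g_inv_sqrt = gU @ np.diag(gv ** -0.5) @ gU.T
lam, O = np.linalg.eigh(g_inv_sqrt @ h @ g_inv_sqrt)   # the λ^{(1)} eigenvalues
W1 = W0 @ g_inv_sqrt @ O             # the Γ-orthonormal combined portfolios ω^{(1)}
```

By construction the new portfolios are $\boldsymbol{\Gamma}$-orthonormal and diagonalize $H$ within the subspace, exactly the two requirements stated before (25).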
We interpret $\left(\boldsymbol{\omega}^{(1)}_\star\right)_a$ and $\lambda^{(1)}_{\star a}$ as the constrained eigenvectors and eigenvalues of the correlation matrix inside the subspace generated by all of the Maximum-Variance factors. Recall that we have already encountered constrained diagonalisation. The market-neutrality constraint is now replaced by the requirement to remain in the $K$-dimensional space spanned by $\left(\boldsymbol{\omega}^{(0)}_\star\right)_a$ for $a = 1, \ldots, K$. The analogue of $P_c$ would now be the projection matrix from the $N$-dimensional space onto the $K$-dimensional space spanned by the vectors $\left(\boldsymbol{\omega}^{(0)}_\star\right)_a$, which is by definition the same space as the one spanned by $\left(\boldsymbol{\omega}^{(1)}_\star\right)_a$:

$$P_{\boldsymbol{\omega}^{(0)}_\star} \equiv \sum_{a=1}^K P_a, \quad \text{where} \quad P_a \equiv \left(\boldsymbol{\omega}^{(1)}_\star\right)_a \left(\boldsymbol{\omega}^{(1)}_\star\right)^T_a \boldsymbol{\Gamma}. \qquad (27)$$

Here we used the fact that the $\left(\boldsymbol{\omega}^{(1)}_\star\right)_a$ are already $\boldsymbol{\Gamma}$-orthonormalised (see the end of this section's first paragraph).

Our approach is, in fact, similar to the model of the linear combination of atomic orbitals (LCAO) in quantum chemistry. Since the Schrödinger wave equation is hard to solve for a system with many electrons, the model limits the molecular orbital wave function to a linear combination of atomic orbitals. The Hartree-Fock method is then used to determine the coefficients. The analogues of the molecular and atomic orbitals are the $\boldsymbol{\omega}^{(1)}_\star$ and $\boldsymbol{\omega}^{(0)}_\star$ vectors respectively, and the FCL optimisation can be seen as the energy level minimisation of the molecular and atomic Hamiltonians: the former corresponds to $\lambda^{(1)}_\star$ and the latter to $\lambda^{(0)}_\star$. Overall the procedure of finding the eigenvalues of $C$ can be summarised in the diagram:

$$\boldsymbol{\lambda}^{(0)} \xrightarrow[\text{see (15)}]{\text{Optimisation}} \boldsymbol{\lambda}^{(0)}_\star \xrightarrow[\text{see (25)}]{\text{Diagonalisation}} \boldsymbol{\lambda}^{(1)}_\star \approx \boldsymbol{\lambda}^{Emp}, \qquad (28)$$

where the latter denotes the first $K$ eigenvalues of the empirical correlation matrix $C$. In Appendix G we prove that the $\boldsymbol{\lambda}^{(1)}$ are maximised at the Maximum-Variance portfolios (the first arrow in the diagram), assuming that the optimisation does not impact too much the off-diagonal elements of $h$. To be more explicit, we show that the ordered $\lambda^{(1)}_i$'s derived from $DhD$ are smaller than those obtained from the diagonalisation of $h$, provided $D$ is a diagonal matrix with entries smaller than one. This shows that the FCL optimisation (15) is the right way to reproduce the eigenvalues of the correlation matrix, $\boldsymbol{\lambda}^{Emp}$. That is to say, the Maximum-Variance factors allow one to replicate as well as possible any alternative beta strategy.

In Section 4 we show that $\boldsymbol{\lambda}^{(1)}_\star$ captures well $\boldsymbol{\lambda}^{Emp}$ and its dynamics, meaning that the financial and economic constraints help to remove some of the noise in the measurement of $\boldsymbol{\lambda}^{Emp}$. We also show that the constrained eigenvectors $\left(\boldsymbol{\omega}^{(1)}_\star\right)_a$ appear to be unstable portfolios. Indeed, they are invested mainly in the factors with the highest $\left(\lambda^{(0)}_\star\right)_a$, which in turn vary strongly with time, explaining the portfolio instability.

The goal of this section is to estimate the sensitivities $b_i$ appearing in the Two-Factor model (11). As we briefly mentioned below (11), knowing the covariance matrix of the residual returns, the factor return $r_f$, the sensitivities, as well as the market stock index and the betas, one may readily model the correlation matrix of the returns, $C$. To use (11) as a definition of $b_i$, one has to provide the $r_f(t)$ time-series. We may replace $r_f(t)$ by the benchmark return $r_{b.m.}(t)$, see (9), which should be a reasonable approximation for sufficiently large $N$. In general, one may start using $r_{b.m.}$ to extract $b_i$ from (11) using either OLS or WLS linear regression. This should be equivalent to picking the eigenvector of the market-neutral $P_c$-projection of the correlation matrix, $P_c \boldsymbol{\Gamma}^{-1/2} H \boldsymbol{\Gamma}^{-1/2} P_c$, whose respective Maximum-Variance portfolio return $\sum_i \left(\omega^{(0)}_\star\right)_i r_i(t)$ has the strongest correlation with $r_{b.m.}(t)$ among all eigenvectors. This method is, however, not practical for two reasons. First, most eigenvectors are determined with a lot of noise. Second, the correlation between the most correlated eigenvector and the signal could be insignificant.

Let us first discuss how to reduce the noise. The simplest way to achieve this goal is to group (aggregate) stocks whose signals, and hence rankings, are sufficiently close to each other with respect to the given style $F_1$, and to follow the analysis of the previous paragraph for groups rather than for single stocks.
If twostocks fall into two di(cid:27)erent groups with respect to the signal associated with factor F , but into the samegroups with respect to a di(cid:27)erent factor F , then the impact of the F signal on our eigenvector analysiswill be signi(cid:28)cantly suppressed. As we want the groups to be of the same size, the aggregation is equivalentto grouping the stocks into quantiles. The new parameter Q should be su(cid:30)ciently small in order to reducethe noise, but still large enough in order for the regression/eigenvector analysis results to be reliable. Wedenote the overall number of these quantiles by Q , meaning that at any time there are N/Q stocks in everygroup/quantile. We will employ a new notation q i ( t ) for the quantiles. That is q i ( t ) ∈ [1 , . . . , Q ] for any t ,in contrast to q i ( t ) ∈ [1 , . . . , N ] and q i = (cid:20) q i · QN (cid:21) . (29)To go on with the grouping idea we have to rede(cid:28)ne the sensitivities (factor loadings) b i ’s. Two stocks i and j belonging to the same quantile should now have identical sensitivities: b i = b j . This identi(cid:28)cation,however, seems to be far-fetched for stocks of di(cid:27)erent size classes. In view of (11) it makes sense to normalisethe factor loadings by the corresponding stock volatilities: Σ − i b i = Σ − j b j . This normalisation is yet anothermanifestation of the fact that the fundamental object describing the stocks dynamics is the correlation, ratherthe covariance, matrix. We thus have to rede(cid:28)ne the factor loadings as functions of the quantile rankings: b i → Σ i ( t ) B ( q i ( t )) . (30)This is one of the central formulae in this paper, and it is worth making two important comments. First,contrary to the original theoretical Two-Factor model the factor loadings are now time-dependent. Thistime-dependence, however, comes from the ranks and the volatilities, both varying much slower than theother functions, r m ( t ) for instance. 
We thus do not depart too far from (11). Second, we are about to argue that the function B(q̃) is surprisingly the same for most of the factors. This is one of our main observations. We will refer to B(q̃) as the quantile factor loading (or quantile sensitivity) of the q̃-th quantile.

Here we determine the market-neutral Maximum-Variance portfolio that optimizes the FCL in the subspace generated by the Q equal-risk-weighted quantile portfolios defined by:

(ω^(q))_i(t) ≡ Σ_i^{-1}(t) if q̃_i(t) = q, and 0 otherwise.   (31)

The Maximum-Variance portfolio will be the first market-neutral constrained eigenvector of the projection γ̃^{-1/2} h̃ γ̃^{-1/2} of Γ^{-1/2} H Γ^{-1/2} into the subspace of dimension Q. Here γ̃ and h̃ are the covariance matrix of positions and the covariance matrix of returns between the Q quantile portfolios ω^(q)(t), obtained in the same way as the γ and h matrices of Section 2.4. (We skip the factor superscript f, because until late in the section we consider only one factor at a time.) The first difference is that instead of reducing the dimension from N single stocks to K Maximum-Variance portfolios, we reduce here the dimension from N single stocks to the Q quantile portfolios. The second difference is that there is no overlap between the Q quantile portfolios, so that γ̃ is diagonal. The important point is that there is no need to use the unknown factor return when diagonalising γ̃^{-1/2} h̃ γ̃^{-1/2} to be sure that one of the constrained eigenvectors captures the signal well. Indeed, if Q is small enough, the first constrained eigenvector is very likely to capture the signal, if it is strong enough, as we will see later.
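A minimal numerical sketch of the construction (31), on synthetic inputs (all names and values below are hypothetical placeholders): build the Q equal-risk quantile portfolios and check that the position-overlap matrix γ̃ is indeed diagonal, since the portfolios do not overlap.

```python
import numpy as np

# Toy construction of the Q equal-risk quantile portfolios (31) and of the
# reduced Q x Q matrices. All inputs are synthetic; names are ours.
rng = np.random.default_rng(0)
N, Q = 40, 4

sigma = rng.uniform(0.1, 0.5, N)                 # stock volatilities
A = rng.standard_normal((N, N))
corr = A @ A.T
d = np.sqrt(np.diag(corr))
C = corr / np.outer(d, d)                        # a valid correlation matrix
H = np.outer(sigma, sigma) * C                   # covariance of returns
Gamma = np.diag(np.diag(H))                      # Gamma = diag(H)

q_tilde = np.repeat(np.arange(1, Q + 1), N // Q) # quantile labels, N/Q per group

# (31): weight 1/sigma_i inside the quantile, zero outside.
W = np.zeros((Q, N))
for q in range(1, Q + 1):
    mask = q_tilde == q
    W[q - 1, mask] = 1.0 / sigma[mask]

h_tilde = W @ H @ W.T          # covariance of returns of the quantile portfolios
gamma_tilde = W @ Gamma @ W.T  # position-overlap matrix; diagonal, no overlap
```

Because the supports of the ω^(q) are disjoint, the off-diagonal entries of γ̃ vanish identically, which is the second difference noted above.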
By identification with the theoretical portfolio (20) we can therefore determine B(q̃) as the first market-neutral constrained eigenvector of γ̃^{-1/2} h̃ γ̃^{-1/2}. We determine the betas reduced to Q dimensions, used in the market-neutral condition, as the beta of each quantile portfolio through (32):

β̃_q(t) ≡ Σ_{i: q̃_i(t)=q} Σ_i^{-1}(t) β_i(t).   (32)

By analogy with the "original" covariance matrices of returns and of positions of single stocks, we define the covariance matrices of returns and of positions of the Q quantile portfolios (see (1) and (2)):

Covariance matrix | Position overlap matrix | Correlation matrix | Dimensions | Indices
H | Γ = diag(H) | C = Γ^{-1/2} H Γ^{-1/2} | N × N | i, j, ...
h̃ | γ̃ | C̃ = γ̃^{-1/2} h̃ γ̃^{-1/2} | Q × Q | q_1, q_2, ...
(33)

It is important to recall here that, while the first line in Table (33) has no reference to any particular style/factor, the quantities in the second line are different for each factor. If the relevant signal is sufficiently strong and/or the Q parameter is correctly chosen, the first market-neutral constrained eigenvector of C̃ would be a good proxy of the market-neutral Maximum-Variance portfolio derived from the correlation matrix (18) computed from the returns modelled by (11). Therefore, by simple term identification, we may expect B(q̃) to be the first market-neutral constrained eigenvector of C̃. There are three important obstacles:

• First, the estimation of C̃ of the previous sections was based on the conditional expectations, see (1), which is a purely theoretical concept. In practice we have a single length-T time-series for each stock. The best way to build a "smoothed" covariance matrix h̃ from this data is to use the exponential moving average (EMA) with a parameter α satisfying 1 ≪ α^{-1} ≪ T. In Appendix I we explain in detail the EMA of all the matrices leading to our estimate of C̃(t).
In what follows we denote by ⟨C̃(t)⟩ the average of this matrix over the entire period T.

• Second, β̃_q(t) could depend on time, and we believe that subtracting β_i(t) r_m(t) from the stock returns used to estimate h̃ could lead to a minor improvement of the estimation of the market-neutral FCL. As the constrained eigenvectors of ⟨C̃(t)⟩ would be market neutral only on average and not at every time, subtracting β_i(t) r_m(t) simulates a hedge with the stock index, keeping the eigenvector returns hedged against the index at any time and not only on average, so that there is no contribution of the market mode to the FCL.

• Third, to estimate the market-neutral constrained eigenvector we have to consider, once more, the constrained eigenvectors of C̃. Recall that the latter is a Q × Q matrix, while the original P_c projection operator in (18) is N × N, as c = Γ^{-1/2} β is an N-vector. The new projection vector is determined through c̃ = γ̃^{-1/2} β̃, with β̃ defined in (32). Contrary to the correlation matrix C, however, it will be difficult to project out the market mode independently for each t, as we did for the Two-Factor model around (18). This is so because we use an EMA to compute the empiric correlation matrix C̃. (We estimate a market-neutral form of B(q̃); that is to say, it is possible that in reality the weights correspond to B(q̃) + k γ̃^{-1/2} β̃ for some constant k.) To make sense of the projection one would then have to "smooth" as well the corresponding P̃_c matrix built from β̃_q(t), ending up with a projection matrix that does not satisfy the basic property P² = P.

Figure 1: B_2(q̃) (blue line) and B(q̃) (red line) for two factors, Beta (left) and Cash (right), with Q = 10. In both cases the adjustment of the second step is quite small for all quantiles. The same holds for the remaining eleven factors.
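The EMA smoothing invoked in the first obstacle can be sketched as a simple recursion (a toy version on i.i.d. returns; the smoothing parameter α and the sample sizes below are illustrative, not those of Appendix I):

```python
import numpy as np

# Toy EMA covariance estimator, h(t) = (1 - alpha) h(t-1) + alpha r(t) r(t)^T,
# with 1 << 1/alpha << T as required in the text. Inputs are i.i.d. toy returns.
def ema_cov(returns, alpha):
    T, N = returns.shape
    h = np.zeros((N, N))
    for t in range(T):
        r = returns[t]
        h = (1 - alpha) * h + alpha * np.outer(r, r)
    return h

rng = np.random.default_rng(1)
T, N = 5000, 5
R = rng.standard_normal((T, N))        # true covariance is the identity
h_hat = ema_cov(R, alpha=0.01)
```

With 1/α = 100 well inside (1, T), the estimate h_hat fluctuates around the true identity covariance, which is the regime the condition 1 ≪ α^{-1} ≪ T is meant to guarantee.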
We will prefer instead to diagonalise ⟨C̃(t)⟩ with a constant projection defined by:

P̃_c = I_Q − β̃ (γ̃^{-1} β̃)^T / (β̃^T γ̃^{-1} β̃),  where β̃ ≡ ⟨β̃(t)⟩   (34)

is the mean of the betas over the T-period and I_Q is the Q × Q identity matrix. As we discussed in this section on two different occasions, the constrained diagonalisation of C̃ is equivalent to the regular diagonalisation of P̃_c C̃ P̃_c. This additional subtraction of the market mode should provide a better estimation of the factor sensitivities.

With these three points in mind we propose a two-step procedure to evaluate the factor loadings B(q̃).

First step. We calculate the EMA version of the correlation matrix C̃(t) from the Σ-normalised quantile portfolios ω^(q)(t) defined in (31), while subtracting the part of the portfolio returns explained by the stock index, and use P̃_c to find the (Q − 1) constrained eigenvectors of ⟨C̃(t)⟩ orthogonal to γ̃^{-1} β̃. Each of these eigenvectors, which we denote by B^(p) with p = 1, ..., Q − 1, and their eigenvalues, λ^B_p ≡ λ(B^(p)), gives rise to a portfolio v^(p) defined by (20) with factor loadings determined by

b_i(t) = Σ_i(t) B^(p)(q̃_i(t)).   (35)

These (Q − 1) portfolios are by construction market-neutral and, as we commented above, the portfolio built from B^(1)(q̃_i(t)), that is v^(1), has a good chance to mimic the Maximum-Variance portfolio ω^(0)_*, provided the signal is strong enough and the number of quantiles is properly chosen to capture it. Accordingly, the first constrained eigenvalue, λ^B_1, associated with B^(1), is a good estimate of λ^(0)_*, the optimal FCL from (15). The signal strength is directly translated into the value of λ^B_1.

Second step. At the final stage of the first step we get (Q − 1) portfolios. We used (20) to ensure their market-neutrality for any t.
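The first-step extraction of a constrained eigenvector can be illustrated on synthetic data. The sketch below uses the simplest symmetric projector P = I − uu^T/(u^Tu) to remove the constrained direction before diagonalising, mirroring the statement that the constrained diagonalisation of C̃ is equivalent to the regular diagonalisation of P̃_c C̃ P̃_c; the matrices are random placeholders, not estimated quantities.

```python
import numpy as np

# Constrained diagonalisation on toy data: remove the constrained direction u
# with the symmetric projector P = I - u u^T / (u^T u), then diagonalise P C P.
# Inputs are random placeholders standing in for the reduced matrices.
def first_constrained_eigvec(C, u):
    Q = C.shape[0]
    P = np.eye(Q) - np.outer(u, u) / (u @ u)
    vals, vecs = np.linalg.eigh(P @ C @ P)
    return vals[-1], vecs[:, -1]       # leading constrained pair

rng = np.random.default_rng(2)
Q = 10
A = rng.standard_normal((Q, Q))
C = A @ A.T / Q                        # toy symmetric positive matrix
beta = rng.uniform(0.5, 1.5, Q)        # stand-in for the reduced betas
lam, w = first_constrained_eigvec(C, beta)
```

By construction the leading eigenvector is exactly orthogonal to the constraint vector, and its eigenvalue is bounded above by the first unconstrained eigenvalue.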
This adjustment does not necessarily respect the optimisation involved in finding B(q̃). Therefore, these portfolios are not a priori optimal. The second step corrects the v^(p) portfolios slightly so that they become optimal while staying market neutral at any time. We may consider the covariance matrices (h̃, γ̃) of positions and of returns of the v^(p) portfolios for p = 1, ..., Q − 1. To obtain these matrices we proceed as in the previous step (including the EMA and the time average), but replacing the normalised quantile portfolios ω^(q)(t) with the (Q − 1) portfolios v^(p). These covariance matrices should be very close to diagonal, since the portfolios were by construction orthogonal to each other before being slightly adjusted. Let us denote by O the rotation matrix that diagonalises the new γ̃^{-1/2} h̃ γ̃^{-1/2} and by λ^{B_2}_p its eigenvalues. Then the improved sensitivities B_2^(p) are:

B_2^(p) = Σ_{p̄=1}^{Q−1} O_{p p̄} B^(p̄).   (36)

As above, for each vector B_2^(p) we have a corresponding market-neutral portfolio, v_2^(p), and v_2^(1) should provide an even better approximation for the Maximum-Variance market-neutral portfolio (20) than the v^(1) of the first step. The first eigenvalue λ^{B_2}_1 associated with O is expected to be a better estimation of the optimal λ^(0)_* than the previous λ^B_1. This step may be repeated again, but we will see that there is no need to do so, as the portfolio stabilises after the very first adjustment.

The two-step procedure can be summarized in the following diagram:

(r_i(t), q_i(t)) →(31) ω^(q)(t) →(33),(32) (γ̃, h̃, β̃) → (B^(p), λ^B_p) →(35) b_i →(20) v^(p)_i → (O_{p1 p2}, λ^{B_2}_p) →(36) B_2^(p) →(35) b_i →(20) (v_2^(p))_i = (ω^(0)_*)_i   (37)

with i ∈ [1, N], q ∈ [1, Q], p ∈ [1, Q − 1].
(38)

It is crucial to emphasize here that the definition of the Maximum-Variance market-neutral portfolio included the maximum-correlation condition (16), which we have not yet mentioned here. Instead we select the first constrained eigenvector and use it to construct ω^(0)_*. To verify the consistency of this approach we will find the correlation between the returns of the portfolio obtained at the end of (37) and the benchmark portfolio.

In Table 14 of Appendix B we list the fourteen factors used to estimate the quantile sensitivities. We set Q = 10. We summarise technical details in Appendix I, with the first two lines of Table 18 being the reference for the relevant calculation. Our results show that the difference between B^(p) and B_2^(p) is insignificant for all the factors. On Figure 1 we present both functions for two randomly selected factors with p = 1 to illustrate this fact. We thus may assume that the eigenvalues λ^{B_2}_p do not change much after the second-step adjustment. In order to avoid clustering of indices and superscripts we will omit the subscript 2 in B_2^(p) and λ^{B_2}_p.

On Figure 2 we demonstrate the B^(1) results for all fourteen factors. We see that, with a common overall normalisation, for nine factors this function is very well approximated by:

B_*(q̃, Q) ≈ 2 (q̃ − 1)/(Q − 1) − 1  for q̃ = 1, ..., Q.   (39)

This confirms the universal nature of our Ansatz (35). This is one of our main observations. As for the remaining factors, we will argue below that their signals capture a risk factor that is too weak, explaining the deviation of their B(q̃) functions from (39).

Figure 2: For nine factors (left) the function B(q̃) is very close to (39). For the remaining five factors (right) these functions have a different shape that could be explained either by noise or by convergence to another factor/risk. We will see below that these five signals capture risk factors that are too weak to measure B(q̃) properly. The normalisation is identical for all factors.

On Figure 4 we compare the double Heaviside function used to construct a benchmark portfolio (9), the B_*(q̃, Q) of (39), and a generic form of B^(p)(q̃). Since for Q = 10 (and in fact for any other suitable Q) the Heaviside function of the benchmark portfolio might be represented as a linear combination of the (Q − 1) generic vectors B^(p)(q̃), we may assume that the benchmark portfolio is, in turn, a linear combination of the (Q − 1) market-neutral quantile portfolios v^(p). This simple observation allows us to use (8) in the (Q − 1)-dimensional subspace spanned by the v^(p), in order to find the benchmark portfolio FCL as a weighted sum of the eigenvalues λ^B_p:

λ^(0)(ω_b.m.) = Σ_{p=1}^{Q−1} (a^b.m._p)² λ^B_p,  with  a^b.m._p ≡ ρ_Γ(γ̃^{-1/2} v^(p), ω_b.m.)  and  Σ_{p=1}^{Q−1} (a^b.m._p)² = 1.   (40)

Though this equation is only an approximation, we may use it to evaluate the signal strength for different factors. Ordering the eigenvalues λ^B_p, we see that:

(a^b.m._1)² > ( λ^(0)(ω_b.m.) − λ^B_2 ) / ( λ^B_1 − λ^B_2 ).   (41)

The right-hand side of this inequality may serve as an indication of the signal strength. The value of λ^B_1 may be seen as the optimised FCL in the (Q − 1)-dimensional space of market-neutral quantile portfolios. On the other hand, ω_b.m. is our proxy for the factor portfolio ω_f. Thus for a strong signal (strong meaning that the signal captures a risk that is high, so that both λ^(0)(ω_b.m.) and λ^B_1 should be high) and a properly chosen Q, one should get λ^(0)(ω_b.m.) closer to λ^B_1 than to λ^B_2, implying that the right-hand side of (41) is greater than 1/2. That in turn implies that a^b.m._1, which measures the position overlap between the benchmark portfolio and the first constrained eigenvector, is higher than 1/√2.
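Both the linear law (39) and the bound (41) are easy to verify numerically. The sketch below checks that B_*(q̃, Q) runs from −1 to +1 and sums to zero across quantiles, and that (41) holds for arbitrary overlaps a_p with Σ_p a_p² = 1 (all inputs are random toy values):

```python
import numpy as np

# The linear quantile loading (39) and a numerical check of the bound (41).
def B_star(q, Q):
    return 2.0 * (q - 1) / (Q - 1) - 1.0

Q = 10
B = B_star(np.arange(1, Q + 1), Q)     # runs linearly from -1 to +1

# (41): if lam = sum_p a_p^2 lam_p with sum_p a_p^2 = 1 and lam_p ordered,
# then a_1^2 >= (lam - lam_2) / (lam_1 - lam_2). Toy weights and eigenvalues:
rng = np.random.default_rng(3)
lam_p = np.sort(rng.uniform(1, 6, Q - 1))[::-1]
a2 = rng.standard_normal(Q - 1) ** 2
a2 /= a2.sum()                          # squared overlaps summing to one
lam = a2 @ lam_p
rhs = (lam - lam_p[1]) / (lam_p[0] - lam_p[1])
```

The bound follows from λ ≤ a_1² λ^B_1 + (1 − a_1²) λ^B_2, i.e. replacing every subleading eigenvalue by the largest of them, λ^B_2.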
This makes it very likely that the first eigenvector is the most correlated with the signal. For weaker signals with low λ^(0)(ω_b.m.) one should get λ^(0)(ω_b.m.) closer to λ^B_2 than to λ^B_1, meaning that the right-hand side of (41) is below 1/2. For weak signals it is very likely that the first eigenvector captures some noise or a risk that has nothing to do with the original signal. We believe that λ^B_2 increases with Q, so that for a weaker signal we should set a smaller Q in order to increase the chance that the first constrained eigenvector captures its signal well.

On Figure 3 we show the dependence between the time averages of the two sides of (41). The inequality holds for all fourteen factors and, even more impressively, the values are close to one exactly for those factors whose vector B is well-approximated by (39). On the same figure we show the connection between the position overlap ρ_Γ(γ̃^{-1/2} v^(1), ω_b.m.) and the ρ_H-correlation between the same portfolios, as defined in (5). The two quantities are closely related: for two portfolios with similar positions, the corresponding returns will be strongly correlated. As the graph shows, the points standing for the factors producing (39) are indeed close to (ρ_Γ, ρ_H) = (1, 1).

Figure 3: For the left graph the (x, y)-values are the right- and left-hand sides of the inequality (41), respectively, for different factors averaged over the T-time period. While the inequality holds for all factors, the values are close to the point (1, 1) (surrounded by the red ellipse) only for the "strong" factors, with measurements that resemble B^(1)(q̃) of (39). On the right graph we plot the same position overlaps versus the H-correlation coefficients (5). The red ellipse encircles the same factors as on the left graph.
In the previous subsection we showed how to estimate the factor loadings b_i by grouping the N stocks into Q quantiles. The final result,

b_i(t) = Σ_i(t) B_*(q̃_i(t))  with  B_*(q̃) = 2 (q̃ − 1)/(Q − 1) − 1,   (42)

is universal for the nine stronger factors out of fourteen. We find this result very surprising. Ignoring the impact of the different Σ_i's, the linear form of B_*(q̃) guarantees that, for a fixed range of factor loadings between b̄ and b̄ + δb, there is the same number of quantiles with loading in this range regardless of b̄. Extrapolating to the full 1, ..., N ranking, it means that the sensitivities have a uniform distribution. This is in contrast to our expectation of getting a distribution close to the Gaussian one, which would lead to higher values of B(q̃) for lower quantiles and smaller B(q̃) for larger q̃. Instead we observe a very good straight-line approximation.

By its very construction the estimate forces a sensitivity b_i(t) to depend on the quantile ranking q̃_i ∈ [1, ..., Q] rather than the standard ranking q_i(t) ∈ [1, ..., N]. Once we have arrived at the final estimate, however, we may take one step forward and adapt (39) for the full rankings q_i(t):

b_i(t) = Σ_i(t) B̄_*(q_i(t))  with  B̄_*(q) = 2 (q − 1)/(N − 1) − 1.   (43)

Here the expression for B̄_*(q) follows from (42) by replacing (q̃, Q) → (q, N). In what follows we will sometimes refer to B_*(q̃) and B̄_*(q) as the "multi-step" and (simply) the linear B-functions. They appear on Figure 5.

Figure 4: Schematic comparison between the benchmark weights (9), with the bottom/top double Heaviside function shown in black, the linear B_*(q̃, Q) function of (39) shown in red, and finally a generic form of B^(p)(q̃) for p > 1 in light blue. For simplicity we use the same normalisation for the two functions and ignore the Σ-factors in (9).
The double Heaviside function may be approximated by a linear combination of the Q − 1 functions B^(p)(q̃).

As we mentioned in the Introduction, a similar version of this weighting (without the volatility factor) was used in [16] for the Value (Book in our conventions) and Momentum factors. With the factor loadings (42) and (43) at hand we may now construct two portfolios using (20). From the discussion of the previous subsection it is clear that the former is the Maximum-Variance market-neutral portfolio of the pair (h̃, γ̃). It is thus natural to expect that the portfolio built from (43) is a good approximation for the Maximum-Variance market-neutral portfolio of the original pair (H, Γ). With this assumption in mind, we denote by λ^(0)_* the (optimised) FCL of this portfolio. We already referred to the (optimised) FCL of (42) as λ^B. We summarise the values of λ^(0)_* and λ^B in Table 1 for all fourteen factors.

In this section we would like to compare our findings with the most popular approaches known in the literature. In the previous section we argued that a straightforward linear regression between r_i(t) and one of the benchmark returns r_b.m.(t) is not the best tool to estimate the factor loadings. Instead we adopted the Ansatz (35) and avoided almost completely the use of r_b.m.(t). Nevertheless, it is interesting to compare the FCL with the classical R² that is optimized in the Fama-MacBeth regression [47] when using the linear regression analysis.
For each quantile portfolio, R²(q) is estimated for the market model (CAPM) and for the Two-Factor model:

• R²_FM(q), obtained from a multi-linear regression of the Two-Factor model (11), and
• R²_CAPM(q), obtained from a simple linear regression of the pair (r_i(t), r_m(t)).

The difference between the two coefficients may measure the improvement in the replication of the asset returns by the two returns, r_m(t) and r_f(t), of (11), over that with the single r_m(t) return of the CAPM within the given quantile q.

This approach was carried out by Fama and French in [2] for the celebrated Three-Factor Model. Table 4 of that paper summarizes the R² coefficients for the CAPM fitting within each group of the Small-Minus-Big (SMB) classification of market capitalizations and the High-Minus-Low (HML) classification of book-to-market ratios. The coefficients in this table are significantly smaller than their counterparts in Table 7a, where the multi-linear regression includes two additional returns: those of the SMB and HML benchmark portfolios. This increase confirms the predictive power of the Three-Factor Model.

In Figure 6 we plot the optimised FCL λ^(0)_* for all our factors as a function of ⟨R²_FM − R²_CAPM⟩, with the average taken over the Q quantiles. We see that the higher the FCL, the higher the improvement in R² that can be expected.

Figure 5: The comparison between the linear (blue) function B̄_*(q) and the multi-step (red) function B_*(q̃) of (43) and (42), respectively. The relation between q and q̃ appears in (29). The black dashed lines correspond to two different values of q: they intersect B_*(q̃) on the same "step", and yet yield different values of B̄_*(q). This may explain the results in Table 1.

In Appendix H we provide more details about the similarities and differences between the classical
approach of [47, 2] and the one of our paper. We also argue in Appendix D that the FCL optimisation might be interpreted as a constrained WLS regression analysis, in contrast to the unconstrained OLS of [47, 2]. Another important point is that the original Fama-MacBeth approach does not yield optimal factors, unless one repeats the regression analysis iteratively. Nevertheless, successive iterations will eventually converge to the eigenvectors of the empirical covariance matrix (either in its full or reduced version). In the best-case scenario these eigenvectors are, in turn, a noisy mixture of the initial signals; in the worst case, they will have no relation to the initial signals. Moreover, the number Q of quantiles in the Fama-MacBeth approach grows as the square of the number of factors K, making the dimensional reduction very difficult in the regime where Q ≪ N no longer holds. That explains why the 20% top-bottom approach has remained a standard tool in finance even though it is not optimal.

Our method of finding the eigenvalues of the population correlation matrix consists of two main steps. First, we optimise the FCL for each factor separately. This yields K Maximum-Variance portfolios (ω^(0)_*)_a for a = 1, ..., K. We denoted the optimal FCL values by (λ^(0)_*)_a. At the second step we optimise the FCL in the K-dimensional subspace spanned by the K portfolios. This results in yet another K pairs (ω^(1)_*)_a, (λ^(1)_*)_a.

The goal of this section is to test the second step using randomly generated factor portfolios (ω^(0)_*)_a. This is a robustness check of our scheme. We start with a random correlation matrix C^r. We denote its eigenvector/eigenvalue pairs by (e^r_i, ℓ^r_i) for i = 1, ..., N as in (8). Then the i-th component of the a-th factor portfolio is generated by:

(v^r_a)_i = Σ_{j=1}^N (ℓ^r_j)^{μ/2} (e^r_j)_i E_ja,   (44)

where E is an N × K matrix of standard normal random variables simulating our returns, and μ is a free positive parameter to be fixed soon. The intuition behind this Ansatz is that, for a given a, the portfolio v^r_a is a random linear mixture of the correlation matrix eigenvectors with weights determined by their eigenvalues. To put it differently, the linear combination is dominated by the leading eigenvectors, and this dominance is controlled by μ. In particular, in the limit μ → ∞ all of the vectors defined in (44) are proportional to the first eigenvector, while for μ = 0 there is no preference for any particular eigenvector. We identify (44) with the Maximum-Variance market-neutral portfolios ω^(0)_*:

v^r_a ∼ (ω^(0)_*)_a.   (45)

Table 1: Factor | λ^(0)_* | λ^B
Beta 6.35 6.10 •
Momentum 5.87 5.34 •
5Y Rates 5.38 5.35 •
Capitalisation 4.43 4.39 •
STR 4.10 4.14 •
Dividend 3.86 5.64
Euro 3.72 3.60 •
Sales 2.88 2.96 •
Liquidity 2.82 2.88 •
Book 2.49 2.60 •
Leverage 2.08 2.32
Earning 1.89 2.09
Cash 1.52 1.64
Growth 1.47 1.68

Table 1: Summary of the two FCLs λ^(0)_* and λ^B, corresponding to the Maximum-Variance portfolios based on (43) and (42), respectively. We see that λ^(0)_* outperforms λ^B for the three leading factors, while the situation is reversed for the smallest values. This can be explained by the fact that the multi-step function fails to capture the impact of two stocks that fall into the same quantile but nevertheless have significantly different correlations with the factor return, see Figure 5; this is obviously more relevant for larger FCLs. For smaller ones λ^B > λ^(0)_* instead, because λ^B comes from a genuine optimisation rather than from the "educated guess" (43). The bullet on the right-hand side denotes factors (styles) with strong signals, see Figure 3.

It is now straightforward to compute the FCL associated with v^r_a: the first formula in (8) is precisely what we need. We obtain:

λ^{r,(0)}_a = Σ_{j=1}^N (ℓ^r_j)^{μ+1} E²_ja / Σ_{j=1}^N (ℓ^r_j)^μ E²_ja.   (46)

Continuing along the lines of Section 2.5 we can also find the constrained eigenvalues (λ^{r,(1)})_a. As described in detail in Section 2.5, they are the optimised FCLs in the K-dimensional space spanned by the vectors v^r_a, and one can easily compute them from the diagonalisation of the matrix (γ^r)^{-1/2} h^r (γ^r)^{-1/2}, where

h^r_ab = (v^r_a)^T C^r v^r_b  and  γ^r_ab = Σ_{i=1}^N (v^r_a)_i C^r_ii (v^r_b)_i.   (47)

Following previous notations we denote the constrained eigenvalues by λ^{r,(1)}_i. To summarize so far, we showed how to derive the λ^{r,(0)}_a and λ^{r,(1)}_i eigenvalues from the random correlation matrix C^r and a Gaussian matrix E.
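The generation (44) of random factor portfolios and the closed form (46) for their FCLs can be sketched as follows (toy spectrum and basis; note that for a true correlation matrix C_ii = 1, so the position normalisation of the FCL reduces to v^T v):

```python
import numpy as np

# Toy version of (44) and (46): random factor portfolios tilted towards the
# leading eigenvectors, and their FCLs in closed form. Spectrum and basis are
# synthetic placeholders.
rng = np.random.default_rng(4)
N, K, mu = 200, 5, 1.4

ell = np.sort(rng.uniform(0.2, 5.0, N))[::-1]
ell *= N / ell.sum()                        # trace normalisation, Tr = N
e, _ = np.linalg.qr(rng.standard_normal((N, N)))   # orthonormal columns

E = rng.standard_normal((N, K))
V = e @ (ell[:, None] ** (mu / 2) * E)      # columns: the K portfolios of (44)

# (46): lam_a = sum_j ell_j^(mu+1) E_ja^2 / sum_j ell_j^mu E_ja^2
lam0 = (ell[:, None] ** (mu + 1) * E**2).sum(0) / (ell[:, None] ** mu * E**2).sum(0)
```

The closed form agrees with the direct Rayleigh quotient v^T C v / v^T v computed from the reconstructed matrix, which is how (46) follows from the first formula in (8).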
To finally test our method we still have to decide how to generate the correlation matrix. For sufficiently large N, the eigenvalues of a random covariance matrix Ω_ij with trace equal to N (Tr(Ω) = N) and of the corresponding correlation matrix Ω_ii^{-1/2} Ω_ij Ω_jj^{-1/2} are very close to each other. We therefore construct H^r instead of C^r. Furthermore, we set the eigenvectors e^r_i to be a random orthonormal basis (rows of an orthonormal matrix), while for the eigenvalues ℓ^r_i we take the empiric eigenvalues in the penultimate column of Table 5. Recall that these values were obtained from 5-minute returns over an almost six-year-long period. The Marčenko-Pastur ratio N/T is therefore very small, meaning that these empiric eigenvalues should not differ much from the eigenvalues of the "theoretical" population correlation matrix.

Figure 6: The FCL λ^(0)_* (the first column of Table 1) as a function of the mean difference R²_FM(q) − R²_CAPM(q). We see that the difference is always positive, indicating a better regression fit. Moreover, the improvement clearly works much better for factors with larger FCL. It means that the stronger the signal, the better the Two-Factor model (11) works. We used here the equal-weighted version of the benchmark portfolio, see (9), to be closer to the original computations of [2].

Notice as well that for i > K these eigenvalues are very noisy. It makes sense then to set all ℓ^r_i for i > K to the same constant value ℓ^c, which is fixed by the trace condition. To summarize:

H^r = Σ_{a=1}^K λ^Emp_a e^r_a (e^r_a)^T + ℓ^c Σ_{j=K+1}^N e^r_j (e^r_j)^T  with  ℓ^c = (N − Σ_{a=1}^K λ^Emp_a) / (N − K).   (48)

Overall our test follows the diagram:

( ℓ^r_i = (λ^Emp_a, ℓ^c) from Table 5 ;  e^r_i random, (e^r_k)^T e^r_l = δ_kl ) → ( λ^{r,(0)}_a, λ^{r,(1)}_a ).
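The construction (48) of the test matrix H^r can be sketched directly; the top eigenvalues below are hypothetical placeholders, not the values of Table 5.

```python
import numpy as np

# Sketch of (48): keep K leading eigenvalues, flatten the bulk to a constant
# fixed by the trace condition Tr(H) = N. The top eigenvalues are
# hypothetical placeholders, not the values of Table 5.
rng = np.random.default_rng(5)
N, K = 100, 5

lam_emp = np.array([30.0, 8.0, 4.0, 2.5, 1.8])
ell_c = (N - lam_emp.sum()) / (N - K)

e, _ = np.linalg.qr(rng.standard_normal((N, N)))   # random orthonormal basis
H = e[:, :K] @ np.diag(lam_emp) @ e[:, :K].T + ell_c * (e[:, K:] @ e[:, K:].T)
```

By construction the spectrum of H consists of the K retained values plus an (N − K)-fold degenerate bulk at ℓ^c, and the trace equals N.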
(49)

Here the a-index reminds us that we are interested in matching the first K eigenvalues only. We present our results on Figure 7. We found that the best matching between λ^{r,(1)}_a and λ^Emp_a occurs for μ = 1.4. This enabled us to reproduce the empirical link between the ordered FCLs and the ordered (constrained and unconstrained) eigenvalues (the blue line on Figure 7 is close to the yellow points).

Notice that so far we have had no need to generate the returns. This can be done by

r_i(t) = Σ_{j=1}^N √(ℓ^r_j) (e^r_j)_i ε_j(t),   (50)

where both e^r_j and ℓ^r_j are defined as in (48) and the ε_j(t) are T·N standard normal random variables. For T ≫ N the covariance matrix of these returns reduces to (48), but away from this regime the matrices are different. Starting from these returns we calculate the new correlation matrix, C^r, as well as its N unconstrained and K constrained eigenvalues, with the h^r and γ^r matrices computed from (47). We demonstrate the output of this calculation on Figure 7.

From the bottom-left graph of Figure 7 we learn that the projection into the subspace generated by the K random factors helps to reduce the noise and the bias for T ≪ N. It is also interesting (and somewhat intriguing) to see from the bottom-right plot that for T ≫ N the matching between the simulated constrained eigenvalues and the unconstrained eigenvalues of the population correlation matrix is much weaker than we could have expected from the empirical results of Table 5 of Section 4.2. The discrepancy between our simulation and the empirical measurements could be explained by the universal linear law, which our random generation does not respect.
To do so one would need to generate a random rotation matrix e_ij in (44) whose j-row elements have a uniform distribution instead of the more natural "bell-like" distribution.

Figure 7: Top left: theoretical distribution of λ^(0)_* with μ = 1.4. We simulate the λ^(0)_* distribution based on a random selection of signals. We suppose that the angle between the random factor and any unconstrained eigenvector is a Gaussian random variable with a standard deviation proportional to the square root of the unconstrained eigenvalue raised to the power μ. We apply a Monte Carlo simulation with a large number of trajectories. Top middle: within this model, we explain the relationship between the ordered values of λ^(0)_* and the true λ^Emp without the measurement noise; μ = 1.4 is the best fit among μ = 1, μ = 2.4 and μ = 3.4. Top right: within the model, we explain the relationship between the ordered values of λ^(1)_* and the true λ^Emp without the measurement noise; surprisingly, μ = 3.4 and μ = 2.4 are the best fits among μ = 1 and μ = 1.4. T = 50 (bottom left) and T = 80,000 (bottom right) are used to replicate the measurement noise of the simulated empirical eigenvalues; T = 80,000 is the number of 5-minute returns used in Table 5. In both cases the simulated constrained eigenvalues with μ = 3.4 are the best estimation of the true eigenvalues. For T = 50 the simulated empirical eigenvalues overestimate the true eigenvalues very significantly, whereas the simulated constrained eigenvalues with μ = 3.4 remain close to the true eigenvalues up to the 10th rank. For T = 80,000 the measurement noise is considerably reduced, and the real constrained eigenvalues perform better than the simulated ones, as they fit up to the 15th rank.
We suspect that the inconsistency (μ = 1.4 fits the λ^(0)_* in the top-middle plot but does not fit the λ^(1)_*, whereas μ = 3.4 does not fit the λ^(0)_* in the top-middle plot but fits the other plots) comes from a drawback of the random generation of factors, which takes into account neither the Maximum-Variance optimization nor the "universal law" that is an important pattern of the empirical correlation matrix and should play an important role.

Here we describe different variations of the Maximum-Variance and benchmark portfolios tested in the paper. In the last subsection we summarise all of the proposed improvements in a single table.
We argue that, under certain additional assumptions regarding the Two-Factor model (11), the Maximum-Variance portfolio (20) also has the largest Sharpe ratio under the market-neutral constraint that captures the signal. We adopt the following conjectures:
• We suppose that the signal captures a positive risk premium, so that E_{t−1}(r_f) > 0, while the residual returns are expected to have zero mean, E_{t−1}(ε_i(t)) = 0.

• The covariance matrix of the residual returns takes the form H_ε = kΓ, where k is a positive constant. In other words, different ε_i's are uncorrelated, E_{t−1}(ε_i ε_j) = 0 for i ≠ j, and their volatilities are proportional to the corresponding stock volatilities, E_{t−1}(ε_i²) = kΣ_i².

For the sake of simplicity we will also ignore the time-dependence of the portfolio weights. With these assumptions, and focusing meanwhile exclusively on a market-neutral portfolio with weights ω_i, we find that:

E_{t−1}( Σ_i ω_i r_i ) = (b^T ω) E_{t−1}(r_f)  and  E_{t−1}( (Σ_i ω_i r_i)² ) = (b^T ω)² (Σ_f)² + k (ω^T Γ ω).   (51)

The Sharpe ratio is therefore:

S_f ≡ E_{t−1}( Σ_i ω_i r_i ) / √( E_{t−1}( (Σ_i ω_i r_i)² ) ) = E_{t−1}(r_f) / √( (Σ_f)² + k (ω^T Γ ω)/(b^T ω)² ).   (52)

Maximising this expression with respect to ω, subject to the market-neutrality condition β^T ω = 0, we arrive at (20). We can conclude therefore that finding the highest Sharpe ratio of a market-neutral portfolio is equivalent to the FCL optimisation.

The same approach might be used as well if we weaken the second assumption above. We may presume instead that the covariance matrix of the residual returns has a small rank-1 off-diagonal market-neutral term: H_ε = k(Γ + ε u_ε u_ε^T). Requiring ω both to be market neutral and orthogonal to u_ε, ω^T u_ε = ω^T β = 0, the new solution is

ω_* ∼ Γ^{-1} ( b − (b^T Γ^{-1} β)/(β^T Γ^{-1} β) · β − (b^T Γ^{-1} u_ε)/(u_ε^T Γ^{-1} u_ε) · u_ε ).
(53)Here the (cid:28)rst two terms appear already in the Maximum-Variance market-neutral portfolio (20), while thelast one guarantees the u (cid:15) -orthogonality. This is a Maximum-Variance portfolio which is both market and u (cid:15) -neutral, as it optimises the FCL under these two constraints. ν The formula (43) for b i ( t ) might be generalised to b i ( t ) = Σ νi ( t ) B ? ( q i ( t )) , (54)where the original assumption (30) corresponds to ν = 1 , and B ? ( q ) is the same as in (43). For E ( r f ) > , theTwo-Factors model (11) implies that the excess expected return satis(cid:28)es ( E ( r i ) − β i E ( r m )) ∼ Σ νi B ? ( q i ( t )) .We will consider the following variations: • ν = 0 . This is equivalent to assuming that the excess expected return of a single stock does not dependon its volatility. Alternatively, one may say that the risk is not fairly rewarded. The model would be called a scalar strict factor model, if H (cid:15) were proportional to the identity matrix. ν = 1 . This choice makes sense from the economic point of view and was also brie(cid:29)y justi(cid:28)ed in theparagraph above (30). It implies that the reward is proportional to the single stock volatility. Thereis a direct link between the ν = 1 choice and the Maximum-Diversi(cid:28)cation portfolio of [11]. The mainhypothesis behind the construction in [11] is the proportionality between the expected return of a stockand its volatility, E ( r i ) ∼ Σ i . Together with the central CAPM result, E ( r i ) = β i E ( r m ) , it leads to β i ∼ Σ i . The analogue of the Maximum-Diversi(cid:28)cation hypothesis for the excess return should be ( E ( r i ) − β i E ( r m )) ∼ Σ i B ? ( q i ( t )) , and according to the Arbitrage Pricing theory [26] the same excessreturn is proportional to the factor sensitivity, ( E ( r i ) − β i E ( r m )) ∼ b i provided E ( r f ) > . Combiningthe two formulae we see that b i ∼ Σ i B ? ( q i ( t )) . 
We also argue that the Maximum-Variance portfolio with ν = 1 is equivalent to the constrained WLS regression (see Appendix D).
• ν = 2. This describes the situation where the most volatile stocks generate more reward than expected for ν = 1. It also corresponds to the common-practice OLS regression, but with constraints: comparing the two last columns of Table 17, we see that the ν = 2 Maximum-Variance portfolio is close to the constrained OLS regression.

In Section 3.1 we presented (53), an extension of the Maximum-Variance portfolio that is not only market-neutral but also neutral to another main factor of risk, the control variable, which remains to be defined. It is usually one of the popular Fama-French Book or Size factors. The extended portfolio was derived to obtain the optimal Sharpe ratio, which motivates developing a method to make the factors as decorrelated as possible. We presented a methodology, in Section 2.4, that transforms the initial risk-premia factors into the constrained eigenvectors of the correlation matrix, which would be the natural orthogonalized risk premia. Unfortunately, we show in the empirical validation part that these constrained eigenvectors form an unstable combination of the initial risk-premia factors. We believe that this instability is intrinsic to any methodology that claims to orthogonalize the factors. Here we present alternative methodologies, one of which will be implemented and tested in Section 4.

Multiply-sorted portfolios are implemented in [3]. More complex schemes are used in Asset Pricing models to remove bias when different characteristics are correlated [48, 49]. To solve the dependency problem between factors, an optimal procedure is proposed in [50] to find orthogonalized risk premia, inspired by the methodology attributed to [51]. Normalising factor returns by the square root of the covariance matrix is also used in agnostic risk parity [52].
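The market-neutral Maximum-Variance construction used throughout, combining (20) with the sensitivities b_i = Σ_i^ν B⋆(q_i) of (54), can be sketched numerically. This is a minimal illustration on synthetic inputs (the volatilities, betas and rankings below are invented for the example); the weights are written in the form ω ∼ Γ⁻²(b − μβ) with μ chosen so that the market-neutrality constraint β^Tω = 0 holds exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50

# Hypothetical inputs: stock volatilities, market betas and a ranking signal.
sigma = rng.uniform(0.1, 0.5, N)                 # Sigma_i
beta = rng.uniform(0.5, 1.5, N)                  # market betas
q = np.argsort(np.argsort(rng.normal(size=N)))   # rankings 0..N-1
B = (q - q.mean()) / q.std()                     # linear, zero-mean B*(q)

def max_var_market_neutral(sigma, beta, B, nu=1):
    """Market-neutral Maximum-Variance weights (sketch):
    omega ~ Gamma^{-2} (b - mu * beta), with mu fixed so that beta' omega = 0,
    and factor sensitivities b_i = Sigma_i**nu * B(q_i) as in (54)."""
    b = sigma**nu * B
    g2 = 1.0 / sigma**2                          # diagonal of Gamma^{-2}
    mu = (beta @ (g2 * b)) / (beta @ (g2 * beta))
    w = g2 * (b - mu * beta)
    return w / np.abs(w).sum()                   # constant gross-investment normalisation

w = max_var_market_neutral(sigma, beta, B, nu=1)
print(abs(beta @ w) < 1e-10)                     # market neutrality holds by construction
```

Changing `nu` to 0 or 2 reproduces the other model variants discussed above; only the sensitivities b_i change, not the structure of the solution.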
The new orthogonalized risk premia may diverge significantly from the original ones. Many "orthogonalisation" methods, though popular in the Asset Management industry, are yet to be documented. One of the most famous is known as the residual method, since it tries to remove the common part directly from the signals. Suppose that for any time t and stock i we are given the raw ranking q_i^r(t) of the variable to be priced and the ranking q_i^c(t) of the control variable. For fixed t, we regress q_1^r(t), ..., q_N^r(t) against q_1^c(t), ..., q_N^c(t), treating the regression residual as the new signal of the variable to be priced. The "orthogonalized" signal is therefore the part of the ranking that is not explained by the control variable. This is conceptually close to the so-called residual momentum strategy, first presented in [53], where the regression analysis is performed on the residual returns rather than on the rankings.

We select Book, Size and Beta as the three control variables. Book and Size are justified since, according to Fama and French, they are the best candidates to explain the cross-section of expected returns. The Beta signal was selected because, according to its empirical FCL, it appears to be the most important factor. As we will see later, the empirical results are very disappointing. Our interpretation is that there is no way to determine which control variable to select, as it requires preferring some signals over others. As we briefly discuss in Section 4, the hierarchy could be determined from the raw FCLs, though it may change very rapidly with time.
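The residual method described above amounts to a cross-sectional OLS regression per date. A minimal sketch on toy rankings (the data here is simulated; only the regress-and-keep-the-residual logic is from the text):

```python
import numpy as np

def residual_signal(q_raw, q_ctrl):
    """'Residual method' sketch: regress the raw ranking on the control
    ranking (with intercept) and keep the residual as the new signal."""
    X = np.column_stack([np.ones_like(q_ctrl, dtype=float), q_ctrl])
    coef, *_ = np.linalg.lstsq(X, np.asarray(q_raw, dtype=float), rcond=None)
    return q_raw - X @ coef

rng = np.random.default_rng(1)
q_ctrl = rng.permutation(100).astype(float)        # control-variable ranking
q_raw = 0.6 * q_ctrl + rng.normal(0, 10, 100)      # correlated raw ranking (toy data)
resid = residual_signal(q_raw, q_ctrl)
print(abs(np.corrcoef(resid, q_ctrl)[0, 1]) < 1e-8)  # residual is decorrelated
```

By construction the OLS residual is orthogonal to the control ranking, which is exactly the decorrelation the method aims for; in the paper's setting this regression would be repeated at every date t.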
Notation             Model                            ν    Clusters   Sectoral factors   Residual
MaxVar(1,6,9)        Maximum-Variance                 1    6          9                  -
MaxVar(0,6,9)        -"-                              0    6          9                  -
MaxVar(2,6,9)        -"-                              2    6          9                  -
MaxVar(1,6,30)       -"-                              1    6          30                 -
MaxVar(1,1,9)        -"-                              1    1          9                  -
MaxVar(1,6,9,Beta)   -"-                              1    6          9                  Beta
MaxVar(1,6,9,Book)   -"-                              1    6          9                  Book
MaxVar(1,6,9,Size)   -"-                              1    6          9                  Size
BM-ERW(6,9)          Benchmark (equal-risk weighted)  -    6          9                  -
BM-EW(6,9)           Benchmark (equal-weighted)       -    6          9                  -

Table 2: Different methods of portfolio construction tested in the paper. The equal-risk-weighted and equal-weighted benchmark portfolios were introduced in (9), and ν is the model parameter in (54), which is relevant for the Maximum-Variance models only. The next two columns give the number of industry clusters and of sectoral factors used for signal sorting. The last column specifies the residual choice for the orthogonalisation method discussed in Section 3.3; it is tested only for the three factors listed in this table.

Most academic papers do not take sectors into account as a control variable, even though doing so may reduce the noise of the measured risk premia and increase Sharpe ratios. As we will show in the next section, common factors extracted from industry returns explain significant cross-sectional returns, even surpassing the explanatory power of Book and Size. We estimate that a substantial fraction of the unconstrained factors' variance is explained by sectoral risk (the difference between the FCL with and without the sectoral constraint). This is in line with [54], which showed that the Sharpe ratio of the Value factor (Book in our conventions) is higher if the sectoral risk is completely removed. Conversely, the Momentum premium is found to be better explained when the sectors are taken into account [55]. We thus decided to investigate sectoral factors (either 9 or 30), while keeping our fourteen styles (Book, Size, etc.) sector-neutral.
An additional incentive for our choice is the fact that most alternative risk-premia vehicles are marketed as sector-neutral portfolios. Since the signals for the sector factors can only be binary (the company in question either belongs or does not belong to the sector), we can use neither the linear function B⋆(q) of (43) nor the step function B⋆(q) of (42). Instead we adopt the following weight function for a given sector s and a company i:

B_s(i) = 1 − n_s/N   if the company i belongs to the sector s,
       = − n_s/N     if it does not,   (55)

where n_s is the size of sector s and N = Σ_s n_s is the total number of stocks. Notice that the sum Σ_i B_s(i) vanishes for all s, exactly as it does for the B⋆(q) functions of (42) and (43).

Our style-portfolio construction is based on sorting with respect to the relevant signal. To impose sector-neutrality we rank the stocks separately within 6 industry clusters (more on this below) and then combine the rankings into a single sorted list. In other words, the six first-ranked stocks are followed by the six second-ranked stocks, etc. Since we deal with a sufficiently large number of stocks, the way stocks with identical rankings are ordered among themselves is unimportant. These six clusters are explained in Appendix F. They are not optimised but rather inspired by the Global Industry Classification Standard (GICS), which was also the basis for our selection of the 9 sectoral factors presented in the same appendix.

In this paper we have not considered any liquidity, turnover or leverage constraints. Liquidity is important in order to examine the full impact of trading. Turnover incurs brokerage fees and slippage. Leverage generates financing costs because of the gap between borrowing and lending rates. Moreover, we assume that all interest rates are zero, so that all stocks can be borrowed and shorted at zero interest.
The Sharpe ratio of those factors that require more (cid:28)nancing for the long legs than for theshort legs, for instance the Low-Beta (Beta below) factor, should be sensitive to this assumption.
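The sector weight function (55) can be checked in a few lines. This is a toy sketch (the sector labels are invented); it verifies that the cross-sectional sum of B_s vanishes, as claimed above:

```python
import numpy as np

def sector_weights(sectors, s):
    """Binary sector signal of eq. (55): 1 - n_s/N inside sector s,
    -n_s/N outside, so that the weights sum to zero."""
    sectors = np.asarray(sectors)
    n_s, N = int((sectors == s).sum()), len(sectors)
    return np.where(sectors == s, 1.0 - n_s / N, -n_s / N)

# Toy universe of 10 stocks in three hypothetical sectors.
sectors = ["Utilities"] * 3 + ["Energy"] * 5 + ["Finance"] * 2
B = sector_weights(sectors, "Utilities")
print(abs(B.sum()) < 1e-12)   # the cross-sectional sum vanishes
```

Here n_s = 3 and N = 10, so members of Utilities get weight 0.7 and the others −0.3; the long and short legs cancel exactly, which is the dollar-neutrality property the paper relies on.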
In Table 2 we summarize the different modifications of the Maximum-Variance and benchmark portfolios. For the former we test three values of the ν-parameter, 6 versus a single industry cluster, either 9 or 30 sectoral factors and, finally, three different control-factor selections. For the benchmark portfolios we explore the equal-weighted and the equal-risk-weighted schemes, while keeping 6 clusters and 9 sectoral factors.

The goal of this section is to test the ideas discussed so far on empirical stock-return data.
Our universe consists of 611 US stocks selected in 2013, all of which had a capitalisation above one billion dollars back in the period. Between 2000 and 2013 the universe therefore suffers from survivorship bias. It impacts only the daily data and the measured Sharpe ratio of the Capitalisation factor, overestimating it by 1. At the same time, it underestimates the Momentum Sharpe ratio by 0.2 and has a negligible influence on the market-neutral factors. By the end of November we expect to have unbiased results for a rolling universe that replicates the historical constituents of the S&P 500 for the daily data.

We have two types of data:
• Five-minute returns for the period 2013-2018, provided by John Locke Investments. The market relaxation time is believed to be around one minute [56], and thus the correlation should be sufficiently stable on scales between five minutes and one day. Nevertheless, we detect a weak autocorrelation that changes the correlation-matrix measurements as we move between the 1-day and 100-day time-scales.
• Daily returns for the period 2000-2018, provided by FactSet.

As we have already explained in the previous section, the stocks are divided into 30 industries according to their GICS classification (see Appendix F) and further grouped into 6 clusters, as shown in the same appendix. Finally, we have 14 signals for the styles summarised in Appendix B. The financial information is provided on a daily basis by FactSet. Importantly, the data is accessible with a one-day shift. For all but one case in Table 2, the 14 styles are accompanied by exactly 9 sectoral factors.
Since we always take into account the market mode, this amounts to capturing the first 24 eigenvectors of the empirical correlation matrix.

Table 3: ℓ₊ = ℓ₊(N, T) for λ_Emp (N = 610) and λ(1) (N = 24), with T the number of available returns (conditional and unconditional, for the five-minute, daily and 100-day returns). Notice the high value of the threshold for the unconditional empirical estimation based on 100 days. The table may easily be extended to the other time-scales of Tables 10 and 11; here we report only the extreme values.
In the theoretical part of the paper we used the conditional expectations E_{t-1}(···) to construct the covariance and other matrices. In practice we can only estimate these conditional expectations. A common practice is to use the [t, t − ∆t] period average of a given time-series as an estimate of the conditional expectation of the relevant quantity at time t. To get a better result one can use an Exponential Moving Average (EMA) with parameter α = 1/∆t. As we explain in Appendix I, for the 5-minute returns the optimal averaging period is one week (5 days). As for the daily returns, we do not possess sufficiently long historical data to estimate conditional expectations of the various eigenvalues. Nonetheless, we can use the daily data to form multi-day returns, which, as we explain below, lead to different results for the same eigenvalues. In what follows we will occasionally refer to conditional results as time-dependent, and to unconditional ones as time-independent.

We are interested in three different sets of eigenvalues: λ(0), λ(1) and λ_Emp. The first two are determined by the h and γ matrices as in (26) and (25), respectively. In Appendix I we explain in detail how to estimate the conditional (time-dependent) and unconditional (constant) matrices h and γ, based either on the 5-minute or on the (multi-)daily returns. In the last subsection of the same appendix we describe the derivation of the empiric eigenvalues λ_Emp. Our ultimate goal is to compare λ(1) and λ_Emp. As we will explain shortly, the latter are our best shot at the eigenvalues of the "true" (also called population) correlation matrix. Obviously, the closer λ(1) is to λ_Emp, the better our approach. To avoid further confusion we will stick to the dictionary:

λ(0): FCLs,   λ(1): constrained eigenvalues,   λ_Emp: empiric eigenvalues.   (56)

Here "constrained" refers to the fact that the λ(1) were obtained from the constrained diagonalisation, see Section 2.4.

Before proceeding to the other quantities investigated in this section, let us make an important comment. According to the celebrated paper [57] by Marčenko and Pastur, the eigenvalues ℓ_i of the matrix T⁻¹X^T X, with X a T × N matrix of IID standard normal variables, are non-uniformly distributed between two edges ℓ₋ and ℓ₊:

ℓ± = (1 ± √(N/T))²,   (57)

provided both N, T → ∞ with N/T held fixed. The larger this ratio, the more strongly the ℓ_i deviate from the "true" value of 1. In spite of the fact that our "true" eigenvalues are very different from 1 (in fact, we will see that the largest eigenvalue is a sizeable fraction of N) and the distribution is far from Gaussian, we may still adopt ℓ₊ as the threshold between significant and noisy eigenvalues. In what follows we refer to this value as the Marčenko-Pastur threshold.

               Five-minute returns               Daily and multi-daily returns
               conditional    unconditional      conditional    unconditional
λ_Emp          Figure 8       Table 5            -              -
λ(0)           Figure 8       Tables 6 and 7     -              Tables 8 and 9
λ(1)           Figure 8       Table 5            -              Tables 10 and 11
x_ab           Figure 9       Table 12           -              -
Sharpe ratio   -              -                  Figure 10      Table 13

Table 4: Summary of the figures and tables of this section according to the data used (either 5-minute or daily returns) and the time-dependence: the figures present conditional (time-dependent) variables, and the tables are reserved for unconditional (constant) results.

For λ_Emp we always use the full collection of stocks, so N = 611, as already mentioned at the beginning of this section. For λ(1), on the other hand, N = K, since we obtain these eigenvalues from the γ⁻¹hγ⁻¹ diagonalisation in the subspace spanned by the K Maximum-Variance portfolios. In this paper K = 14 (styles) + 9 (sectoral factors) + 1 (market) = 24. The situation is slightly more complicated for T. For instance, for the unconditional measurement based on the 5-minute returns we have 72 returns per day, 255 working days per year and, finally, six years of data, so that T ≈ 72 · 255 · 6 ≈ 1.1 · 10⁵. At the same time, for the respective conditional calculation T = 72 · 5, if we take one week as the averaging period to estimate the expectation. Conversely, for the daily returns T = 255 · 18, because now we have 18 years of data. In Table 3 we summarise the Marčenko-Pastur (MP) thresholds for the empiric and constrained eigenvalues and the different evaluation schemes. The thresholds are important, as they indicate how many eigenvalues we should consider as non-noisy. There are two lessons to learn from this table. First, the MP threshold for the unconditional empiric eigenvalues on the 100-day time-scale is very large; we will see that it leaves out only the market mode. Second, the one-day threshold, though much lower, is still well above the unconditional threshold for the 5-minute returns.
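The Marčenko-Pastur edge (57) is a one-liner; the sketch below evaluates it for (N, T) pairs in the spirit of Table 3 (the exact T values used in the paper's table are partly lost, so those below are illustrative assumptions built from the counts quoted in the text):

```python
import numpy as np

def mp_threshold(N, T):
    """Upper Marcenko-Pastur edge l+ = (1 + sqrt(N/T))**2 of eq. (57)."""
    return (1.0 + np.sqrt(N / T)) ** 2

# Constrained eigenvalues: K = 24 portfolios, ~one year of daily returns.
print(round(mp_threshold(24, 255), 2))      # -> 1.71
# Empirical eigenvalues: 610 stocks, one week of 5-minute returns.
print(round(mp_threshold(610, 72 * 5), 2))  # -> 5.3
```

The threshold grows quickly with N/T, which is exactly why the unconditional 100-day empirical estimation (huge N, small effective T) keeps only the market mode above noise.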
We conclude, therefore, that it is worth evaluating the empiric eigenvalues only from the 5-minute returns, which will serve as a reference for how well our method reproduces the "true" eigenvalues. Consequently, we will investigate the time-scale dependence of the eigenvalues using only the constrained-eigenvectors method, see the second row of Table 3.

Apart from the eigenvalues we consider two additional measurements: the transformation matrix O from the Maximum-Variance portfolios to the constrained eigenvectors introduced in (25), and the Sharpe ratio. Since the factors may capture different levels of risk, it is more illustrative to compare the rescaled components x_ab = O_ab √(λ_b^(1)/λ_a^(0)). In this paper the Sharpe ratio is measured as the ratio between the average and the standard deviation of the daily portfolio returns, multiplied by √255 to annualise the final result. To compare the Sharpe ratios of the different factors we exploit only the daily data from 2000 to 2017. We stick with the constant gross-investment normalisation, Σ_i |ω_i(t)| = const, as is commonly accepted in Asset Pricing. It was shown in [58] that, due to the non-Gaussian distribution of returns, normalisation by monthly look-back volatilities leads to anomalies and to Sharpe ratios that differ from those derived with the constant gross-investment normalisation. We verified that all our observations regarding the Sharpe ratio still hold under the normalisation method of [58].

     λ(1)⋆, Maximum-Variance                                                    λ(1), Benchmark            λ(1)⋆            λ_Emp    %
#    (1,6,9)  (0,6,9)  (2,6,9)  (1,1,9)  (1,6,9,Beta)  (1,6,9,Book)   BM-ERW(6,9)  BM-EW(6,9)   MaxVar(1,6,30)
1    100.35   100.27   100.17   102.33   100.07        99.63          102.93       100.53       101.06           109.02   -3
2    20.49    20.17    20.43    21.37    20.42         20.33          18.40        17.68        20.88            22.32    11
3    12.44    10.94    11.69    13.31    12.40         12.25          11.76        10.45        12.62            12.84    6
4    8.65     7.92     8.56     9.12     8.62          8.52           8.43         8.24         9.19             10.01    3
5    7.55     7.48     7.55     7.74     7.55          7.39           7.02         6.78         7.63             7.94     8
6    5.28     4.83     5.19     6.38     5.28          4.87           4.97         4.52         5.62             5.92     6
7    5.14     4.69     5.06     5.67     5.13          4.73           4.78         4.20         5.32             5.79     8
8    4.27     4.04     4.25     5.24     4.27          4.22           3.64         3.53         4.37             4.70     17
9    3.84     3.73     3.82     4.88     3.83          3.45           3.24         2.90         4.00             4.29     19
10   3.52     3.40     3.47     4.36     3.48          3.13           2.69         2.67         3.85             3.53     31
11   3.28     3.20     3.28     3.69     3.22          2.98           2.51         2.38         3.30             2.77     30
12   2.95     2.91     2.96     3.17     2.92          2.40           2.30         2.22         3.10             2.42     28
13   2.49     2.42     2.49     2.89     2.49          2.34           2.08         1.98         2.63             2.38     20
14   2.24     2.28     2.21     2.47     2.22          2.09           1.79         1.68         2.42             2.23     25
15   1.86     1.90     1.84     2.07     1.98          1.86           1.58         1.51         2.16             2.10     18
16   1.52     1.44     1.50     1.96     1.62          1.62           1.35         1.34         1.99             2.03     13
17   1.31     1.28     1.30     1.69     1.47          1.50           1.19         1.18         1.85             1.96     11
18   1.20     1.16     1.18     1.44     1.25          1.26           1.03         1.05         1.83             1.92     16
19   1.11     1.10     1.09     1.40     1.11          1.17           0.97         0.96         1.74             1.84     14
20   1.05     1.04     1.04     1.23     1.05          1.05           0.92         0.94         1.70             1.73     14
21   0.95     0.94     0.94     1.12     0.96          0.95           0.89         0.89         1.64             1.64     7
22   0.91     0.88     0.90     1.08     0.91          0.93           0.86         0.85         1.49             1.60     6
23   0.87     0.87     0.87     1.04     0.87          0.92           0.83         0.82         1.45             1.56     5
24   0.84     0.83     0.82     0.92     0.83          0.88           0.80         0.81         1.38             1.50     5

Table 5: The 2013-2018 conditional eigenvalues obtained with the different methods of Table 2. The λ_Emp column contains the sample eigenvalues. The last column shows the improvement of the MaxVar(1,6,9) method over the standard equal-risk-weighted benchmark portfolio. We see that Maximum-Variance increases all but the first eigenvalue, which corresponds to the market mode. Recall that to construct the Maximum-Variance market-neutral portfolio we used the stock index r_m(t) and its proxy r_m⋆(t), rather than the first principal component of the returns matrix; we believe that, owing to this approximation, the equal-risk-weighted portfolio performs better than ω^{m,(0)}⋆. For λ_Emp,i (i = 2, ...) the Maximum-Variance improvement comes largely from the sectoral factors; on the other hand, the increase in the last column for eigenvalues between 2 and 6 is due to the style factors, mostly Book. These issues are discussed in more detail in Section 4.3.1. Among all Maximum-Variance methods, MaxVar(1,1,9) has the best results.

In Table 4 we list all figures and tables containing the measurements of the eigenvalues, the transformation-matrix elements and the Sharpe ratio, both conditional and unconditional, based on the 5-minute or (multi-)daily returns.

                           λ(0)⋆, Maximum-Variance                    λ(0), Benchmark
Sectoral factor            (1,6,9)  (0,6,9)  (2,6,9)  (1,1,9)   BM-ERW(6,9)  BM-EW(6,9)
Utilities                  14.04    13.65    14.03    14.04     12.99        12.74
Energy                     10.27    9.08     9.25     10.27     9.81         8.67
Reits                      9.76     9.43     9.69     9.75      9.59         9.53
Finance                    7.74     7.09     7.64     7.74      7.87         7.70
Pharmacy                   5.75     5.18     5.62     5.76      5.62         5.39
IT                         5.33     4.97     5.22     5.33      5.22         5.02
Consumer                   4.56     3.98     4.47     4.56      4.61         3.71
Discretionary vs Staples   3.83     3.68     3.84     3.83      3.47         3.03
Industry                   3.53     3.56     3.57     3.53      3.14         2.97

Table 6: Different unconditional FCLs (λ(0)) for the sectoral factors, obtained for the 2013-2018 period with the different methods of Table 2. The improvement from the Maximum-Variance optimisation is very limited for the sectoral factors, as the signal is binary. The sectoral factors have higher FCLs than the style factors. The major sectoral factor is Utilities, expected to be a highly leveraged sector; despite its small size, it was used by traders to speculate on the FED policy, the major issue during the period. Energy and REITs were also very volatile due to the oil-price decline and the crisis of the Malls.
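The Sharpe-ratio convention described above (mean over standard deviation of daily portfolio returns, annualised by √255) is straightforward to implement. A minimal sketch on simulated daily returns (the return series below is invented; only the measurement convention comes from the text):

```python
import numpy as np

def annualised_sharpe(daily_returns, days_per_year=255):
    """Sharpe ratio as in the text: mean over std of the daily portfolio
    returns, annualised by sqrt(days_per_year); 255 matches the
    trading-day count used elsewhere in the paper."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(days_per_year) * r.mean() / r.std(ddof=1)

rng = np.random.default_rng(2)
r = 0.0004 + 0.01 * rng.normal(size=255 * 18)   # toy daily factor returns, 18 years
print(round(annualised_sharpe(r), 2))
```

Note that the measure is scale-invariant: multiplying the return series by a constant leverage leaves the Sharpe ratio unchanged, which is why the gross-investment normalisation matters only through the shape of the return distribution, not its scale.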
From Table 5 we see that the Maximum-Variance portfolios capture the "true" empirical eigenvalues of the average correlation matrix, measured with 5-minute returns, rather well: the λ(1)⋆(MaxVar(1,6,9)) are very close to λ_Emp. The first 24 unconstrained eigenvectors explain about 41% of the cross-sectional variance of the normalised returns (R² = (Σ_{k=1}^K λ_k^Emp)/N ≈ 41%), whereas the first 24 constrained eigenvectors ω(1)⋆(MaxVar(1,6,9)) obtained with the Maximum-Variance optimisation explain 37.62% (40.03% when the sector constraints are withdrawn, MaxVar(1,1,9)). We also find a significant difference between λ(0)⋆(MaxVar(1,6,9)) and λ(1)⋆(MaxVar(1,6,9)) (see Tables 7 and 6). This is the impact of the strong interaction between the economic factors (see Section 2.4). Moreover, all market-neutral factors have FCLs much smaller than the FCL of the market mode (70), which is close to 100. This is consistent with our choice of K Two-Factor models for all factors other than the market, rather than a single multi-factor model, see the last paragraph of Section 2.2.

We observe as well that the major market-neutral factors ω(0)⋆(MaxVar(1,6,9)) are mostly the sectoral ones: Utilities, a small-size sector, surprisingly has the highest λ(0)⋆(MaxVar(1,6,9)). This could be specific to the sample period, when the change in monetary policy played a crucial role: traders used to speculate by shorting the Utilities, believed to be highly indebted, in order to take positions on an upward interest-rate move. The presence of Utilities in the first eigenvector was confirmed by [59], in which Finance, Oil and Utilities were found to be its main components. We also see REITs as an important factor, most certainly due to the Malls crisis in 2017.
                 λ(0)⋆, Maximum-Variance                                                  λ(0), Benchmark
Style factor     (1,6,9)  (0,6,9)  (2,6,9)  (1,1,9)  (1,6,9,Beta)  (1,6,9,Book)   ERW(6,9)  EW(6,9)
10Y Rates        8.10     8.58     8.42     14.62    7.76          6.94           4.38      3.76
Beta             7.63     7.71     7.76     12.72    7.63          7.52           4.82      4.47
Momentum         6.10     5.69     5.81     9.39     6.09          4.29           4.39      3.95
Capitalisation   5.14     4.79     4.99     6.08     5.14          2.21           4.00      3.65
Dividend         4.28     4.59     4.45     6.01     4.27          4.27           2.47      2.16
Euro             3.85     3.97     3.89     6.82     3.86          3.36           2.49      2.27
Liquidity        3.70     3.23     3.17     6.24     7.03          6.73           3.13      3.00
Book             3.53     3.27     3.31     5.74     3.54          3.54           2.75      2.59
STR              3.44     3.46     3.42     5.51     3.39          3.05           2.29      2.09
Sales            3.35     3.04     3.28     4.58     3.36          1.53           2.71      2.45
Leverage         2.05     2.25     2.12     2.99     2.05          2.31           1.27      1.22
Earning          2.04     2.01     2.06     3.12     2.04          2.03           1.48      1.45
Cash             1.79     1.92     1.85     2.50     1.78          1.71           1.19      1.14
Growth           1.24     1.16     1.17     1.84     1.24          1.24           1.06      1.05

Table 7: The 2013-2018 unconditional FCLs obtained with the different methods of Table 2 for all styles. The six λ(0)⋆ columns correspond to the Maximum-Variance method, so we omit the common part of the method references; similarly, we shorten the last two benchmark column labels. Comparing the MaxVar(1,6,9) and BM-ERW(6,9) columns, one notices an improvement of 45% thanks to the Maximum-Variance optimisation. We also see that the sectoral constraints are suboptimal: MaxVar(1,1,9) appears to be the best method; but if the sector constraints are to be maintained, MaxVar(1,6,9) is on average the best. The ν = 1 model appears the most realistic for most factors, even if some exceptions could be real. The most important style factor is 10Y Rates (stocks most positively sensitive to an interest-rate increase versus those least sensitive), as the FED policy was a major issue over the period. Beta and Momentum are the two other major factors, whereas the traditional Fama-French factors, Capitalisation and Book, seem far less important.

The two most popular risk premia, Momentum and Beta, are only the 6th and 7th risk factors in λ(0)⋆(MaxVar(1,6,9)). The Book factor has a surprisingly low λ(0)⋆(MaxVar(1,6,9)) at the 5-minute time scale, whereas it is supposed to be an important factor, at least at longer time scales, according to the known results in the literature. The Growth factor is very close to noise, as its FCL is just slightly above that of a random signal; we see that the Maximum-Variance approach nevertheless manages to improve it.

4.3.1 Comparison between the Maximum-Variance and the benchmark portfolios

For the 14 styles the Maximum-Variance optimisation improves λ(0) by 45% compared to the benchmark portfolios (see Table 7). As we have already mentioned, the improvements for the sectoral factors are significantly weaker. This is actually not too surprising if we think of the Maximum-Variance function B⋆(q) as the "smoothed" version of the double Heaviside function used for the benchmark portfolio construction (9). No such "smoothing" option exists for a sectoral factor, whose signals are strictly binary, so the linear-function optimisation has a weaker impact.

Factor           1 day   5 days   10 days   20 days   40 days   80 days   100 days
Beta             6.35    7.99     8.77      9.34      10.37     11.64     14.45
STR              4.10    4.73     5.00      5.51      6.72      8.45      12.21
Momentum         5.87    7.09     7.64      8.20      9.00      9.60      10.80
Capitalisation   4.43    4.95     5.24      5.66      6.48      7.53      9.54
Book             2.49    2.94     3.40      3.99      4.89      5.50      6.40
Sales            2.88    3.47     3.85      4.19      4.85      5.26      6.30
Dividend         3.86    4.44     4.90      5.28      5.77      5.83      5.79
10Y Rates        5.38    6.24     6.26      6.06      5.80      5.80      5.42
Liquidity        2.82    2.93     3.12      3.34      3.77      4.11      4.91
Euro             3.72    4.12     4.15      4.08      3.82      3.74      3.55
Leverage         2.08    2.30     2.54      2.72      2.93      3.10      3.41
Earning          1.89    2.09     2.20      2.24      2.40      2.59      3.17
Cash             1.52    1.66     1.72      1.73      1.85      2.04      2.46
Growth           1.47    1.67     1.76      1.78      1.92      2.00      2.00

Table 8: The FCLs of the Maximum-Variance MaxVar(1,6,9) portfolios at different time scales are improved by 47% on average compared to the benchmark BM-ERW(6,9); the improvement ranges from 41% at the 1-day scale to 49% at the 100-day scale. Based on daily data from 2000 to mid-2018.

This observation provides a good explanation for the last column of Table 5. The sectoral-factor contribution is more significant for the leading (the first four) eigenvalues, while it decreases for the intermediate ones. As a result, our Maximum-Variance approach is more successful for λ(1)_i with i > 4. The λ(0) improvement is more significant for longer time scales, as one can see from Tables 8 and 9.

Surprisingly, the first constrained eigenvalue for the benchmark portfolios, λ(1)(BM-ERW(6,9)), is slightly higher than the first Maximum-Variance eigenvalue λ(1)⋆(MaxVar(1,6,9)), meaning that it is slightly better to model the market-mode portfolio as an equal-risk-weighted portfolio (9) than as the Maximum-Variance market-mode portfolio (70).
This is also evident from the fact that the FCL of the Maximum-Variance market-mode portfolio, λ^{m,(0)}⋆(MaxVar(1,6,9)), is slightly lower than the FCL of the equal-risk-weighted portfolio, λ^{m,(0)}(BM-ERW(6,9)). We believe this is related to our choice to stick with the β's derived from the stock index r_m(t), rather than the Maximum-Variance return r_m⋆(t) (4). We discussed the difference between the two available indices in detail in the paragraphs preceding (3).

The upshot of the last two paragraphs is that the R² of the cross-sectional regression with the 24 factors improves only slightly from the benchmark to the Maximum-Variance portfolios. At longer time scales the improvement is more significant and the R² is higher as well (Tables 10 and 11), thanks to quarterly returns being more strongly correlated than daily returns. We see that the FCLs of the styles are always improved by the Maximum-Variance portfolios relative to the benchmark portfolios, at every time scale (Tables 8 and 9), and, interestingly, the FCLs increase with the time scale. The Book factor becomes the fifth most important factor, after Beta, Capitalisation and Momentum, at the 100-day time scale.

Factor           1 day   5 days   10 days   20 days   40 days   80 days   100 days
Beta             4.13    4.98     5.44      5.94      6.82      7.90      10.12
STR              2.94    3.23     3.37      3.64      4.29      5.29      7.55
Momentum         4.27    4.91     5.18      5.49      5.97      6.39      7.31
Capitalisation   3.25    3.61     3.89      4.29      4.98      5.86      7.24
Book             1.80    1.97     2.22      2.58      3.11      3.39      3.79
Sales            2.36    2.80     3.13      3.45      3.98      4.26      5.10
Dividend         2.51    2.75     2.99      3.16      3.47      3.50      3.35
10Y Rates        3.07    3.42     3.35      3.26      3.17      3.12      2.73
Liquidity        2.18    2.19     2.31      2.49      2.85      3.03      3.39
Euro             2.47    2.64     2.67      2.62      2.47      2.48      2.49
Leverage         1.43    1.52     1.62      1.77      1.95      2.12      2.45
Earning          1.39    1.49     1.57      1.59      1.69      1.81      2.18
Cash             1.13    1.17     1.20      1.23      1.36      1.51      1.79
Growth           1.17    1.26     1.32      1.35      1.50      1.57      1.52

Table 9: The FCLs of the benchmark BM-ERW(6,9) at different time scales. Based on daily data from 2000 to mid-2018.

The different versions of the Maximum-Variance portfolios, MaxVar(1,6,9), MaxVar(0,6,9) and MaxVar(2,6,9), give similar results, as the volatilities Σ_k are not heterogeneous enough. We see that the sectoral constraints appear to be highly suboptimal, since the solution without any sectoral constraints, MaxVar(1,1,9), gives higher FCLs λ(0)⋆(MaxVar(1,1,9)) and higher constrained eigenvalues λ(1)⋆(MaxVar(1,1,9)) (Tables 7 and 5).

The application of the residual methods (columns MaxVar(1,6,9,Beta) and MaxVar(1,6,9,Book) of Table 7) is surprisingly efficient only for the following three factors: Liquidity, Momentum and Capitalisation. The immediate interpretation is that the signals of these three factors are strongly correlated with the Book signal. As for the λ(1)⋆ eigenvalues in Table 10, the variations MaxVar(1,6,9,Beta), MaxVar(1,6,9,Book) and MaxVar(1,6,9,Size) produce almost the same results as the Maximum-Variance method MaxVar(1,6,9). We conclude, therefore, that the residual method does not bring any major improvement. Apart from that, as we discussed at the end of Section 3.3, the control-variable selection has the drawback of breaking the rotational symmetry between the K factors.

4.3.2 Capturing the dynamics of eigenvalues and eigenvectors

Up to this point we have discussed only the means of the empirical and modelled correlation matrices and their eigenvalues. In this section we focus instead on the dynamics (time-dependence) of the eigen-systems. The Maximum-Variance FCL, λ(0)⋆, resonates more strongly with the factor-volatility jumps than the relevant benchmark portfolio does. We demonstrate this phenomenon in the top three graphs of Figure 9.
Eigenvalue   1 day   5 days  10 days  20 days  40 days  80 days  100 days
 1          152.08   164.58   163.63   166.58   164.34   165.96    174.74
 2           17.05    20.46    22.37    23.48    24.45    25.08     27.76
 3           12.67    13.41    14.09    14.80    15.91    17.00     21.24
 4            8.68    10.12    10.60    11.59    12.57    14.16     15.78
 5            7.31     7.76     7.97     8.07     8.67     9.70     11.72
 6            5.78     6.67     6.93     7.38     7.86     8.36      9.41
 7            5.21     5.76     6.12     6.53     7.14     8.08      8.04
 8            4.67     5.00     5.51     5.71     6.37     6.60      6.20
 9            4.07     4.28     4.50     5.01     5.27     4.92      4.56
10            3.88     4.21     4.12     4.13     4.47     4.51      4.39
11            3.41     3.62     3.61     3.53     3.64     3.75      3.64
12            2.96     3.12     3.02     3.18     3.59     3.62      3.30

Table 10: λ^?_(1)(MaxVar(1,6,9)) from the Maximum-Variance optimisation is a good proxy for the true eigenvalues of the correlation matrix. We see that the correlations increase sharply with the time scale and that the optimisation works at any time scale. Indeed MaxVar(1,6,9) improves the eigenvalues by an average of 23% compared to BM-ERW(6,9); the improvement goes from 16% at the daily scale to 27% at the 100-days scale. The trace is increased by 10% at the longer time scale. Daily data from 2000 to mid 2018.

The strong resonance helps the constrained eigenvalues λ^?_(1) to capture well the dynamics of the first empirical eigenvalues λ_Emp; see the bottom three graphs of Figure 9.

For large dimensions the first eigenvalue of a correlation matrix is linked to the average of its off-diagonal elements. In financial terms this implies that the high volatility of the first (market-mode) eigenvalue may be interpreted as an increased correlation between single stocks. Both the empirical and the Maximum-Variance eigenvalues reproduce this behaviour, as one can see in Figure 9. Moreover, the dynamics of the two eigenvalues (the constrained and the unconstrained) are very close to each other. The same holds for the second and the third eigenvalues.

The first eigenvector is well known as the market mode, but the second eigenvector has always remained difficult to interpret according to the literature. Our findings clarify the origin of this problem.
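The link between the top eigenvalue and the mean off-diagonal correlation can be checked numerically; for a constant-correlation matrix it is exact (a minimal sketch, not the paper's code):

```python
import numpy as np

# For a correlation matrix with a constant off-diagonal correlation rho,
# the top eigenvalue is exactly 1 + (N - 1) * rho, which is why, for
# large N, the first eigenvalue tracks the mean pairwise correlation.
N, rho = 400, 0.3
corr = np.full((N, N), rho)
np.fill_diagonal(corr, 1.0)

top = np.linalg.eigvalsh(corr)[-1]   # eigvalsh sorts eigenvalues ascending
print(abs(top - (1 + (N - 1) * rho)) < 1e-6)
```

A rising average correlation between stocks therefore translates directly into a rising market-mode eigenvalue.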
The largest components of the second eigenvector come from the factors with the highest FCLs. The latter, however, are very volatile, and so are the components of the second eigenvector. In general the eigenvalues of a large random matrix are expected to repel each other, so no crossover phenomenon usually happens. What we observe nevertheless is a crossover of the factor FCLs and of their components in the second and third eigenvectors. For example, as soon as the Financials sector FCL exceeds the Utilities FCL, the components of the second eigenvector change accordingly. This is shown in Figure 9. Beta, Rates and Utilities are the main contributors to the second eigenvector in the period from 2013 to 2018 (Table 12 and Figure 9).

Table 11: λ_(1)(BM-ERW(6,9)) calculated for the daily returns from 2000 to mid 2018.

Our optimisation method can be tested empirically only on those factors for which the value E(r_f) entering our Sharpe ratio (52) is sufficiently large. These factors are STR, Liquidity and Cash. We present the results in Table 13 and Figure 10. On the other hand, for the three most popular factors, Momentum, Book and Capitalisation, the value E(r_f) is too small, the benchmark Sharpe ratio is weak (or even negative), and therefore we cannot test our optimisation method on this period. Indeed, the annualised Sharpe ratio of Book over these 18 years is so low that it is not even statistically significant; the Sharpe ratio of Beta is similarly weak and that of Momentum is even negative. These three factors are nevertheless the most popular risk premia: Beta for quality, Momentum for trend and growth, and Book for value. According to most of the references on market anomalies and asset pricing in Table 14, these factors are substantially profitable, but over a much longer period (usually since 1960), though even this claim is controversial. The 2008 crisis generated exceptional losses for the Beta and Momentum factors, although these losses are not representative of a longer historical period (see Figure 10).
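The significance argument used here rests on the usual approximation t ≈ SR_annual · √years. A minimal sketch with illustrative numbers (not the paper's estimates):

```python
import math

def sharpe_tstat(sr_annual: float, years: float) -> float:
    """t-statistic of an annualised Sharpe ratio estimated over `years`
    of data, under the usual i.i.d.-returns approximation:
    t = SR_annual * sqrt(years)."""
    return sr_annual * math.sqrt(years)

# Illustrative only: a Sharpe ratio of 0.3 observed over 18 years stays
# well below the 2-sigma bar, while detecting a 0.2 Sharpe *improvement*
# at t > 2 requires on the order of 100 years of data.
print(round(sharpe_tstat(0.3, 18), 2))   # 1.27
print(sharpe_tstat(0.2, 100) >= 2.0)     # True
```

This is why a couple of decades of daily data cannot settle the profitability of weak risk premia.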
The factor returns are highly non-Gaussian, with extreme losses accumulating over short periods, so the t-statistics that are a common test in asset pricing should be interpreted with caution. Consequently, the Sharpe ratio estimation is very sensitive to the portfolio normalisation. At the end of Section 4.2 we already mentioned two possible ways to normalise the portfolios.

To summarise, the theoretical Sharpe ratio improvement in the framework of the Maximum-Variance optimisation could not be confirmed empirically in a conclusive manner, since a 20-year period is too short a sample to produce statistically significant results. This is not really disappointing: to verify empirically even a very optimistic annual Sharpe ratio improvement at a significant t-statistic, we would need at least 100 years of daily data. Despite all this, we firmly believe that, under the assumption that there is a substantial alternative risk premium, the Maximum-Variance portfolio has a higher expected Sharpe ratio than the standard top-bottom portfolio.

4.4.2 Skewness, Leverage effect and alternative risk premia

Based on our measurements the daily returns skewness is not necessarily significant. It is negative for the market mode. Momentum, Beta and Liquidity exhibit negative skewness (-0.35, -0.39 and -0.08 respectively),
Figure 8: Top: measurement, from 2013 to 2018, of λ^?_(0)(MaxVar(1,6,9)) and λ_(0)(BM-ERW(6,9)) for Momentum, Beta and 10Y Rates. λ^?_(0)(MaxVar(1,6,9)) corresponds to the optimal case (Maximum-Variance (1)) and λ_(0)(BM-ERW(6,9)) to the benchmark case (top-bottom 20% (4)). We see that the optimal case enters into resonance more easily, as in April to August 2016 when the three factors were excited. Bottom: measurement of the first three eigenvalues λ^?_(1)(MaxVar(1,6,9)) and λ_Emp. λ_Emp corresponds to the sample (but noisy) eigenvalue without any constraint; λ^?_(1)(MaxVar(1,6,9)) corresponds to the constrained eigenvalues obtained through the Maximum-Variance optimisation MaxVar(1,6,9). λ^?_(1)(MaxVar(1,6,9)) appears to be less noisy and to be a good proxy of the de-noised λ_Emp. We see how brutally the first eigenvalues increased in February 2018. We also see that the excitation of the second eigenvalue from April to August 2016 corresponds to the excitation of Momentum, Beta and 10Y Rates in the top graphs, and that a spike appears in July 2017 in the third eigenvalue, with an interaction with the second one.

while STR, Cash, Capitalisation and Book all have positive skewness. By the Central Limit Theorem argument, the skewness is expected to decrease as (time scale)^(-1/2). The negative skewness of factor returns may justify, theoretically and empirically, the presence of alternative risk premia (see, for example, [60]). In short, investors prefer to combine occasional strong gains with frequent small losses; this translates into a positive skewness of the returns distribution. Once the skewness becomes negative, the same investors want an alternative risk premium to compensate for the unattractive risk profile.

The Leverage Effect (LE), that is, the negative correlation between returns and volatility variations, is a well-known phenomenon in stock markets.
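A toy simulation of the leverage effect, with a regression of variance variations on returns in the spirit of the analysis below (the dynamics and parameters are purely illustrative, not the paper's model):

```python
import numpy as np

# Toy sketch of the Leverage Effect (LE): volatility is pushed up by
# negative returns, and the variation of the variance is then compared
# with the returns.
rng = np.random.default_rng(3)
n = 100_000
eps = rng.normal(size=n)
sigma = np.empty(n)
r = np.empty(n)
sigma[0] = 0.01
for t in range(n - 1):
    r[t] = sigma[t] * eps[t]
    # volatility rises after losses and relaxes back to its mean level
    sigma[t + 1] = 0.01 + 0.9 * (sigma[t] - 0.01) - 0.05 * r[t]
r[-1] = sigma[-1] * eps[-1]

dvar = np.diff(sigma ** 2)                # variation of the variance
corr = np.corrcoef(r[:-1], dvar)[0, 1]
print(corr < -0.5)                         # clearly negative: a leverage effect
```

With a positive feedback coefficient instead of a negative one, the same regression would produce a positive correlation and no LE.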
This generates a strong negative skewness over a large variety of time scales [61, 62, 63]. We believe that for any portfolio it is more natural to study the LE by analysing the variation of the portfolio's FCL rather than the variation of its volatility (or variance). The FCL is the Γ-normalised variance of the portfolio's returns. As an example, for the Maximum-Variance market portfolio the FCL variation accounts for the variation of the average correlation between single stocks rather than the variation of the average single-stock volatility. Moreover, in this case, the FCL variation is a good proxy of the variation of the first eigenvalue of the correlation matrix; see Section 4.3.2.

To analyse the LE for different factors, we start with the
Γ-normalised Maximum-Variance portfolio of a given factor. We then regress the monthly variations of its FCL against its monthly returns. The higher the coefficient of determination R², the stronger the evidence for the LE.

We found that the market mode is the only factor that exhibits a significant LE. All the other market-neutral factors do not exhibit any LE, as their R² is small; see Table 13. Without any LE, the skewness is expected to converge quickly to zero at larger time scales, and that could challenge the theory of alternative risk premia.

First eigenvector                 Second eigenvector
Factor           ω^?_(1)          Factor          ω^?_(2)
Market Mode       -9.17           Utilities         2.63
Capitalisation     0.10           10Y Rates        -1.65
Beta               0.07           Beta             -1.49
Energy            -0.07           Reits             1.37
Liquidity         -0.07           Energy           -0.92
Momentum           0.07           Dividend          0.73

Table 12: Risk composition, measured from 2013 to 2018, of the first and second conditional constrained eigenvectors ω^?_(1)(MaxVar(1,6,9)) and ω^?_(2)(MaxVar(1,6,9)) obtained through the Maximum-Variance optimisation. We present only the 6 highest risk contributions. The constrained first eigenvector of the average matrix is exposed to the market-mode risk and to Capitalisation, whereas the second eigenvector is exposed to the Utilities, Rates and Beta factors. If we focus instead on the eigenvectors of the conditional weekly matrix, we see that the second eigenvector changes and that the Beta factor reaches the top positions of the first and second eigenvectors.

Here we summarise some results and open problems relevant for realistic correlation matrix modelling, given the relatively precise measurements realised by means of our economic-constraints filter:

• The first eigenvalue of the correlation matrix of γ^(-1/2) h γ^(-1/2) is weakly volatile (see the yellow line in Figure 11) and its dynamics seems to be governed by a systematic factor; the market-mode FCL, λ^?_m(0), appears to be a good candidate.
This first eigenvalue is relatively stable over different time periods and may be interpreted as the average correlation between fundamental factors, as if the position overlaps were completely suppressed. The factor overlap (blue line in Figure 11) moves only moderately with time and seems to be correlated with the Momentum performance.

• The time dependence of ln(λ_(0))(t) can be modelled by an Ornstein-Uhlenbeck process with a relaxation period of 60 days. It is tempting to model the diffusion of the correlation matrix by the diffusion of the FCLs, while keeping constant the correlation matrix of γ^(-1/2) h γ^(-1/2), and then to compare the patterns with those of the classical Wishart process [64].

• It would be interesting to use the autocorrelation model of [65, 66] to reproduce the time scale dependency of λ_(0) and λ_(1) (see Figures 10 and 11). This autocorrelation model introduces a drift following an Ornstein-Uhlenbeck process that might be justified by the lack of liquidity and by the herding effect. As a consequence, the moving average of the factor returns may serve as a good proxy for the conditional expected returns, while the unconditional ones remain zero. The model captures an inefficiency of the stock market which is yet to be documented. It is different from the Epps effect [67, 56], which identifies lags between stocks at the intra-day time scale. The autocorrelation of [65, 66] could be more robust than the classical anomalies describing the discrepancy between the measured unconditional expected factor
returns and the theoretical CAPM. Recently it has been argued in [23] that certain stock market anomalies become weaker after a study describing them has been published. In the same spirit, most of the known anomalies were claimed to be fallacious and rather explained by over-fitting or selection bias [24].

Figure 9: Left and centre: risk composition, from 2013 to 2018, of the first constrained eigenvector ω^?_(1)(MaxVar(1,6,9)) and of the second conditional constrained eigenvector ω^?_(2)(MaxVar(1,6,9)). We see that the interaction between the market mode, Capitalisation and Beta factors makes the first constrained eigenvector oscillate around the market mode, while the second eigenvector is exposed to the Beta factor despite its relatively low FCL. In July 2017 a new risk factor, the finance factor, replaced Beta and Utilities; the increase of the FCL of the financial factor appears in the same period (right).

We introduced the Maximum-Variance optimisation to build Maximum-Variance portfolios that capture as purely as possible the different signals used for extracting risk premia. We introduced the factor correlation level (FCL) that the Maximum-Variance portfolio optimises at any time scale. Under certain assumptions the Sharpe ratio is also optimised, and the Maximum-Variance portfolios capture as well as possible the de-noised eigenvalues of the correlation matrix. An empirical test confirms the improvement from 5-minutes to 100-days time scales. The Maximum-Variance optimisation could therefore be used to reduce the dimension and to model and filter the correlation matrix in a proper way. It also opens new problems in the modelling of the correlation matrix, around the dynamics and the time scale dependency of the eigenvalues and eigenvectors.
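The Ornstein-Uhlenbeck modelling of ln(λ_(0))(t) suggested in the open problems above can be sketched numerically; the level and volatility below are assumed, illustrative values, and only the 60-day relaxation period comes from the text:

```python
import numpy as np

rng = np.random.default_rng(7)
tau, mu, sigma = 60.0, np.log(150.0), 0.05   # tau from the text; mu, sigma assumed
n_days = 50_000

eps = sigma * rng.normal(size=n_days)
x = np.empty(n_days)
x[0] = mu
for t in range(1, n_days):
    # Euler step of d ln(lam) = -(ln(lam) - mu) / tau dt + sigma dW
    x[t] = x[t - 1] - (x[t - 1] - mu) / tau + eps[t]

lam = np.exp(x)   # the modelled first-eigenvalue path
# The stationary standard deviation of an OU process is sigma * sqrt(tau / 2),
# and the daily autocorrelation is approximately 1 - 1 / tau.
print(abs(x.std() - sigma * np.sqrt(tau / 2)) < 0.05)
print(abs(np.corrcoef(x[:-1], x[1:])[0, 1] - (1 - 1 / tau)) < 0.01)
```

Such a path could serve as the FCL driver in the diffusion experiment mentioned above, with the correlation matrix of γ^(-1/2) h γ^(-1/2) held fixed.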
General idea
In this appendix we elaborate on the connection between principal component analysis (PCA) and the (ordinary, weighted or generalised) least squares approaches in a linear regression model. The results presented here appear rather scattered in the literature.

Let Z be an m × n matrix with m > n. We present it in the form

    Z = x y^T + E,        (58)

Figure 10: Cumulated gains at the same volatility for the Momentum, Beta and Book factors (top), and for the Liquidity, STR and Cash factors (bottom), from 2000 to May 2018. The three graphs on top cover the most popular risk premia: Beta for quality, Momentum for trend-and-growth and Book for value. These factors were shown to be significantly profitable, but over a longer period (usually since 1960). The 2008 crisis generated exceptional losses for the Beta and Momentum factors that are not representative of a longer historical period. Bottom: the most significant factors over the period from 2000 to 2018: Liquidity, STR and Cash. We tested the Maximum-Variance optimisation with MaxVar(1,6,9), MaxVar(0,6,9), MaxVar(2,6,9) and BM-ERW(6,9). We see that the Maximum-Variance MaxVar(0,6,9) slightly overperforms for the Beta factor, while the benchmark BM-ERW(6,9) slightly overperforms for the Momentum and Book factors, whereas the Maximum-Variance is theoretically expected to be the optimal solution for the Sharpe ratio. Our interpretation is that the backtest is noisy and not significant enough, as the period is too short given the weakness of the different risk premia.
Nevertheless, the Maximum-Variance optimisation is confirmed empirically when the risk premium is strong enough (the Cash, STR and Liquidity factors); and if investors are convinced that a risk premium could actually exist, they should use the FCL (empirical or theoretical) to determine the best way to capture it.

where x and y are m-by-1 and n-by-1 vectors respectively, and the error matrix E has the same dimensions as Z. There are two ways to minimise E: either one finds y = y_min that minimises Tr(E^T M_x E) for a given x and a symmetric positive-definite m × m matrix M_x, or x = x_min that does the same job for Tr(E M_y E^T), this time for a fixed y and a different symmetric positive-definite n × n matrix M_y. If the M's are unit matrices, the two minimised quantities are identical and in both cases we have the ordinary least squares (OLS), while for diagonal and general M's we have the weighted (WLS) and generalised (GLS) least squares respectively. Starting (say) with x_(0) we may determine y = y_(0) that minimises the square of E = Z − x_(0) y^T, and then x = x_(1) for E = Z − x y_(0)^T, etc. Proceeding this way we obtain the sequence

    x_(0) → y_(0) → x_(1) → y_(1) → x_(2) → y_(2) → ···        (59)

with the following recursive identities:

    y_(i) = Z^T M_x x_(i) / (x_(i)^T M_x x_(i))    and    x_(i+1) = Z M_y y_(i) / (y_(i)^T M_y y_(i)).        (60)
Factor          λ^?_(0)  ρ_H      S      S(·) − S(BM-ERW(6,9))      S(·) − S(MaxVar(1,6,9))     R²
                                         (1,6,9) (0,6,9) (2,6,9)    (Beta)   (Book)   (Size)
10Y Rates        8.10    0.95   -0.20     -0.02   -0.03   -0.04     -0.02    -0.05    -0.24    0.01
Beta             7.63    0.96    0.34     -0.07    0.02   -0.02     -0.00    -0.24    -0.46    0.05
Momentum         6.10    0.97   -0.23     -0.12   -0.12   -0.08     -0.02     0.03    -0.04    0.01
Capitalisation   5.14    0.96    1.44     -0.15   -0.33   -0.23     -0.11    -0.58     0.00    0.01
Dividend         4.28    0.95   -0.22     -0.05    0.04   -0.04     -0.00    -0.00     1.01    0.00
Euro             3.85    0.95    0.40      0.05    0.12    0.08      0.00    -0.06     0.16    0.00
Liquidity        3.70    0.94    0.77      0.17    0.16    0.15     -0.14    -0.39     0.01    0.08
Book             3.53    0.93    0.51     -0.13   -0.24   -0.19     -0.05     0.00    -1.34    0.01
STR              3.44    0.96    1.89      0.07    0.13    0.10      0.01     0.05    -0.31    0.05
Sales            3.35    0.94   -0.07      0.08    0.26    0.13      0.07     0.50     0.98    0.07
Leverage         2.05    0.89    0.46      0.05    0.04    0.08      0.00     0.05    -0.73    0.03
Earning          2.04    0.91    0.52     -0.04    0.03    0.04      0.00     0.01     0.61    0.00
Cash             1.79    0.89    1.13      0.01   -0.22    0.00      0.00    -0.35    -0.17    0.01
Growth           1.24    0.91    0.27     -0.04   -0.02    0.01      0.00     0.00     0.11    0.00
Table 13: Differences in the Sharpe ratios between the Maximum-Variance and the benchmark portfolios (see Table 2) from 2000 to April 2018. ρ_H is the correlation between the returns, which is very high: the Maximum-Variance and the 20% top-bottom portfolios are therefore highly correlated. The Sharpe ratio is statistically insignificant for most factors, except for STR (without cost), Cash and Liquidity. Capitalisation's Sharpe ratio is overestimated as it suffers from the survival bias of our data. The literature (see Table 14 for a partial list of references) shows that over a much longer period (around 50 years) Book, Momentum and Capitalisation are the main significant risk premia, even if there is no consensus on these anomalies, which appear to be very sensitive to the normalisation method (see the discussion at the very end of Section 4.2). The differences between the empirical Sharpe ratios are statistically insignificant (below one sigma) as the period is too short and the risk premia are too weak. The last column presents the R² coefficient of the linear regression between the monthly variations of λ^?_(0)(MaxVar(1,6,9)) and the monthly factor returns. The connection between these coefficients and the leverage effect was covered in Section 4.4.2.

Eliminating x we arrive at the relation between y_(i+1) and y_(i):

    y_(i+1) = κ_(i) · Z^T M_x Z M_y y_(i),    where    κ_(i) ≡ (y_(i)^T M_y y_(i)) / (y_(i)^T M_y Z^T M_x Z M_y y_(i)).        (61)

With a little algebra it can be shown that the sequence (59) converges to

    x_? = Z M_y^(1/2) v_i / (v_i^T v_i),    y_? = M_y^(-1/2) v_i.        (62)
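The convergence of the recursion (59)-(60) can be checked numerically in the simplest OLS case (M_x = M_y = I), where the limit is the top principal component; the planted rank-1 matrix below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=200); a /= np.linalg.norm(a)
b = rng.normal(size=30);  b /= np.linalg.norm(b)
Z = rng.normal(size=(200, 30)) + 60.0 * np.outer(a, b)  # strong rank-1 part

x = rng.normal(size=200)              # arbitrary starting vector x_(0)
for _ in range(100):
    y = Z.T @ x / (x @ x)             # first identity of (60), with M_x = I
    x = Z @ y / (y @ y)               # second identity of (60), with M_y = I

v = np.linalg.eigh(Z.T @ Z)[1][:, -1]  # leading eigenvector of Z^T Z
cos = abs(y @ v) / np.linalg.norm(y)   # |cosine| between y and v
print(cos > 1 - 1e-10)
```

The alternating regressions act as a power iteration on Z^T Z, which is why the limit y aligns with the leading eigenvector.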
Here v_i is the i-th eigenvector of the matrix M_y^(1/2) Z^T M_x Z M_y^(1/2), and its norm is fixed by the choice of x_(0). It is very important to notice that (62) implies

    E M_y y_? = 0    and    x_?^T M_x E = 0.        (63)

The relation to PCA becomes explicit if we replace the M's by identity matrices: x y^T is then just the i-th term of the PCA expansion of Z, namely Z v_i v_i^T / (v_i^T v_i), where v_i is the eigenvector of Z^T Z corresponding to its i-th largest eigenvalue λ_i (which is also an eigenvalue of Z Z^T).

Figure 11: Measure of the first eigenvalue of the correlation matrices of the covariance matrix h, of γ and of γ^(-1/2) h γ^(-1/2). A random selection of factors generates first eigenvalues for the correlation matrices of h and γ in agreement with the graph; the first eigenvalues of the average correlation matrices of h and γ are only slightly different. We can guess that the dynamics of the first eigenvalue of the correlation matrix of γ^(-1/2) h γ^(-1/2) is linked to the dynamics of the first eigenvalue of the correlation matrix of the single-stock returns. When the stock market gets stressed, the volatility increases, and the first eigenvalue of the correlation matrix of single-stock returns and the first eigenvalue of the correlation matrix of γ^(-1/2) h γ^(-1/2) both increase, meaning that the fundamental factors tend to become more volatile and more correlated when the market is stressed.

The Market Mode
To reproduce (3) and (4) one has to set (m, n) = (T, N) and (Z)_ti = r_i(t), as well as M_x = T^(-1) I_T and M_y = Γ^(-1). Then, plugging x_(0) = r_m(t) and i = 0 into (60), we arrive at y_(0) = β and then x_(1) = r^?_m(t), exactly as in the two formulae. The only differences are the conditional expectation in (3), replaced here by the regular mean, and the fact that we used time-dependent betas in (4) rather than constant ones as here. Notice that the square root of M_y is well-defined since this matrix is positive-definite.

The Two-Factor Model Loadings

To find b_i in the Two-Factor model (11) using the linear regression we need the first equation of (60) for i = 0, with (Z)_ti = r_i(t) − β_i r^?_m(t), M_x ∝ I_T, x = r_f(t) and y = b. This leads to

    b_i = [ Σ_{t=1..T} (r_i(t) − β_i r^?_m(t)) r_f(t) ] / [ Σ_{t=1..T} (r_f(t))² ] = (Σ_f)^(-2) ⟨(r_i − β_i r^?_m) r_f⟩.        (64)

The market-neutrality Σ_{i=1..N} β_i b_i = 0 follows directly from (3) and (4).

B Styles
In Table 14 we provide a brief description of all the financial styles used in the paper. The sectoral factors are presented in the next appendix.
C Maxima, minima and saddle points
Let M be an N × N positive matrix, (ℓ_i, v_i) its sorted (ℓ_1 > ℓ_2 > ··· > ℓ_N) eigenvalue/eigenvector pairs, and H_i the Hessian matrix of the Lagrangian

    L(v) = v^T M v / (v^T v)        (65)

computed at the local optimal point v = v_i. We search for the signature of H_i. Substituting v = v_i + δv into the Lagrangian, one can easily see that

    L(v) = ℓ_i + (1 / (v_i^T v_i)) · δv^T (M − ℓ_i · I_N) δv + O(δv³),        (66)

where, as expected, the linear term vanishes by virtue of M v_i = ℓ_i v_i. It immediately follows that the signature is

    (−, ..., −, 0, +, ..., +),  with i − 1 minus signs and N − i plus signs.        (67)

Here the flat direction corresponds to δv ∝ v_i. We conclude that v = v_i is a maximum (minimum) only for i = 1 (i = N). For any other 1 < i < N the solution v = v_i is a saddle point.

The derivation generalises trivially to the constrained eigensystems of Appendix D: instead of δv one has to consider P_c δu, and M must be replaced by P_c M P_c.

D Constrained eigensystems
Let us search for an eigensystem (pairs of eigenvectors and eigenvalues) of an n × n symmetric non-singular matrix M, that is, optimise v^T M v under the additional constraint v^T c = 0 for a unit vector c, c^T c = 1. It was shown in [42] that the (n − 1) constrained eigenvalues of M coincide with the non-zero eigenvalues of the matrix P_c M P_c, where P_c = I_n − c c^T is the projection matrix onto the subspace of vectors orthogonal to c. Moreover, if u is an eigenvector of P_c M P_c with a non-zero eigenvalue ℓ, then P_c u is necessarily an eigenvector of P_c M with exactly the same eigenvalue ℓ.

Style                        Definition                                                Long/Short            Literature
Dividend Yield               Annual dividend income per share divided by the           long high/short low   [68]
                             current share price
Capitalisation               Total market value of a company's shares                  long low/short high   [1, 17]
Liquidity                    Volume of transactions in value divided by                long low/short high   [69, 70, 71, 72, 73, 74, 75]
                             Capitalisation
Short-term Reversion (STR)   Short-term reversal based on a 20-day moving              long high/short low   [76]
                             average of returns
Momentum                     Based on the last 12 months minus 1 month                 long high/short low   [21]
                             moving average
Beta                         Based on the 90-day regression on daily returns           long low/short high   [13, 77, 78, 19, 79, 80, 81, 82, 83]
                             using the SP500
Leverage                     Debt to Equity                                            long high/short low   [84, 85]
Book                         Book to Price ratio                                       long high/short low   [1]
Cash                         Cash to Price ratio                                       long high/short low   [86, 87, 88]
Earning                      Price to Earning ratio                                    long high/short low   [18, 89]
Growth                       One-year change of Earnings divided by Equity             long high/short low   [90]
Euro                         Price sensitivity to the weekly change in Euro/dollar     long high/short low   [91]
                             based on the last 200 days
Rates                        Price sensitivity to the weekly change in the             long low/short high   [92]
                             10-year US Bond yield
Sales                        Sales to Price ratio                                      long low/short high   [89, 93, 94, 95]

Table 14: Summary of the basic information about the styles (non-sectoral factors) used in this paper. Apart from the definition and the relevant references, we also indicate the side of the long/short strategy (long the high values and short the low values, or the opposite).

To prove this statement one starts with the Lagrangian (see (1.4) of [42]):

    L(v, Λ, Λ_1) ≡ v^T M v − Λ (v^T v − 1) + 2 Λ_1 v^T c,        (68)

where Λ and Λ_1 are the Lagrange multipliers ensuring the normalisation of v and the orthogonality to c respectively. From the equations of motion with respect to the three variables one then finds that

    P_c M v = Λ v,        (69)

where P_c is the projection operator satisfying (P_c)² = P_c. This guarantees that the eigenvalues of P_c M coincide with those of P_c M P_c (recall that for any two square matrices A and B, the eigenvalues of AB coincide with those of BA). We see, therefore, that Λ is a constrained eigenvalue of M. Moreover, v in the last equation may be written as P_c u, where u is a standard eigenvector of P_c M P_c.

In Appendix A we discussed linear regressions with the cost functions Tr(E^T M_x E) and Tr(E M_y E^T). This can easily be extended to cost-function minimisation under a given constraint, that is to say, to the constrained WLS: the eigenvector v_i in (62) is simply replaced by a constrained eigenvector of the matrix M_y^(1/2) Z^T M_x Z M_y^(1/2). This is directly related to the Maximum-Variance portfolio of Section 2.5, where we had M_x = I, M_y = Γ^(-1) and Z^T Z = H.

E The Maximum-Variance Market-Mode portfolio
One way to exploit the approach of Section 2.3 is to treat the market mode as a factor. To this end we may replace b_i and r_f in (11) by β_i and r^?_m respectively, and drop the β_i r^?_m term on the left-hand side. This way we "discover" the Capital Asset Pricing Model (CAPM): r_i(t) = β_i r^?_m(t) + ε_i(t). As we discussed above, it is related to the PCA analysis of the correlation matrix.

Proceeding as above we arrive at the Maximum-Variance market-mode portfolio:

    (ω^?_m(0))_i ∼ Σ_i^(-2) β_i.        (70)

As we already mentioned below (4), the return of ω^?_m(0) is equal to r^?_m(t). The fastest way to obtain (70) is to replace b_i by β_i in (20) and to omit the irrelevant part of the market-neutrality projection.

It is worth comparing ω^?_m(0) with other market portfolios proposed in the literature. In [10] the 1-factor model was used to introduce the Minimum Variance and Maximum Diversification portfolios. The former is the optimal Sharpe ratio portfolio under the assumption that the expected stock returns are identical and positive. Maximum Diversification, on the other hand, is Sharpe-optimal if we assume that the expected returns are proportional to their volatilities. The Minimum Variance and Maximum Diversification portfolios may contain only long positions, although this condition is weak as all betas are close to one (see below). We will show later that the Maximum-Variance portfolio has the optimal Sharpe ratio if the expected returns are proportional to their betas, which is akin to the efficient market hypothesis, and if the residual return volatilities (Σ_i) and the stock volatilities are proportional. The latter is a meaningful and robust approximation that avoids problematic concentration due to fallacious correlations [40]. Table 15 brings together the three portfolios.

F Sectors
In Table 16 we report all the Sectors and Industries of the Global Industry Classification Standard (GICS) used to construct the sectoral factors and the clusters, as outlined in Subsection 3.4.

           Maximum Diversification                         Minimum Variance                     Maximum-Variance
Weight     (1 / (Σ_i σ_i)) (1 − β_i Σ_m ρ_L / Σ_i)         (1 / Σ_i²) (1 − β_i / β_L)           (ω^?_m(0))_i = β_i / Σ_i²
           if β_i Σ_m ρ_L / Σ_i < 1, and 0 otherwise       if β_i / β_L < 1, and 0 otherwise

Table 15: Summary of the weights of the two portfolios proposed in [10] (the first two columns) and of the one defined by (70). Notice that the (ω^?_m(0))_i > 0 restriction does not affect many stocks. The parameters ρ_L and β_L are fixed by the Sharpe ratio optimisation, see [10]. The Minimum and the Maximum-Variance portfolios are invested in the low-beta and the high-beta stocks respectively. In these two portfolios the weights are inversely proportional to the square of the volatilities, though for Minimum Variance these are the volatilities of the residual returns. Thus the Minimum and the Maximum-Variance portfolios can be seen as complementary.

Sector                       Industry                                                 GICS     Cluster
1 Energy                  1  Energy Equipment & Services                              101010   1
                          2  Oil Gas & Consumable Fuels                               101020   1
- Materials               3  Chemicals                                                151010   1
                             Construction Materials                                   151020   1
                             Containers & Packaging                                   151030   1
                             Paper & Forest Products                                  151050   1
                          4  Metals & Mining                                          151040   1
2 Industrials             5  Aerospace & Defence                                      201010   2
                             Building Products                                        201020   2
                             Electrical Equipment                                     201040   2
                             Trading Companies & Distributors                         201070   2
                          6  Machinery                                                201060   2
                          7  Commercial Services & Supplies                           202010   3
                             Professional Services                                    202020   3
                          8  Air Freight & Logistics                                  203010   3
                             Airlines                                                 203020   3
                             Marine                                                   203030   3
                             Road & Rail                                              203040   3
                          9  Transportation Infrastructure                            203050   3
3 Consumer Discretionary  10 Auto Components                                          251010   3
                             Automobiles                                              251020   3
                          11 Household Durables                                       252010   3
                             Leisure Products                                         252020   3
                             Textiles, Apparel & Luxury Goods                         252030   3
                          12 Hotels, Restaurants & Leisure                            253010   3
                             Diversified Consumer Services                            253020   3
                          13 Media                                                    254010   3
                          14 Distributors                                             255010   3
                             Internet & Direct Marketing Retail                       255020   3
                             Multiline Retail                                         255030   3
                             Specialty Retail                                         255040   3
4 Consumer Staples        15 Food & Staples Retailing                                 301010   4
                          16 Beverages                                                302010   4
                             Food Products                                            302020   4
                             Tobacco                                                  302030   4
                          17 Household Products                                       303010   4
                             Personal Products                                        303020   4
5 Health Care             18 Health Care Equipment & Supplies                         351010   4
                             Health Care Providers & Services                         351020   4
                             Health Care Technology                                   351030   4
                          19 Biotechnology                                            352010   4
                             Pharmaceuticals                                          352020   4
                             Life Sciences Tools & Service                            352030   4
6 Financials              20 Banks                                                    401010   5
                             Thrifts & Mortgage Finance                               401020   5
                          21 Diversified Financial Services                           402010   5
                             Consumer Finance                                         402020   5
                             Capital Markets                                          402030   5
                          22 Mortgage Real Estate Investment Trusts (REITs)           402040   5
                          23 Insurance                                                403010   5
7 Information Technology  24 Internet Software & Services                             451010   6
                             IT Service                                               451020   6
                          25 Software                                                 451030   6
                          26 Communications Equipment                                 452010   6
                             Technology Hardware Storage & Peripherals                452020   6
                             Electronic Equipment Instruments & Components            452030   6
                          27 Semiconductors & Semiconductor Equipment                 453010   6
- Telecommunication       28 Diversified Telecommunication Services                   501010   4
  Services                   Wireless Telecommunication Services                      501020   4
8 Utilities               29 Electric Utilities                                       551010   4
                             Multi-Utilities                                          551030   4
                             Water Utilities                                          551040   4
                             Independent Power and Renewable Electricity Producers    551050   4
9 Real Estate             30 Equity Real Estate Investment Trusts (REITs)             601010   5
                             Real Estate Management & Development                     601020   5

Table 16: The first two columns on the left describe the nine GICS sectors used for all but the MaxVar(1,6,30) method, see Table 2. The next two columns contain the 30 industries used to build the sectoral factors of MaxVar(1,6,30). The last column marks the six clusters employed in all but the MaxVar(1,1,9) method, which has no clustering.

G Proof of optimal capture of eigenvalues
Let us denote by λ_i(M) the i-th largest eigenvalue of an n × n Hermitian matrix M; in other words, λ_1(M) ≥ λ_2(M) ≥ ··· ≥ λ_n(M). In this appendix we demonstrate that:

    λ_k( Γ^{-1/2} D h D Γ^{-1/2} ) ≤ λ_k( Γ^{-1/2} h Γ^{-1/2} )   for any k = 1, ..., K.   (71)

To remind the reader, h is a covariance matrix (and so by definition positive definite), Γ is a correlation matrix, and finally D is a diagonal matrix whose (real) entries belong to the range (0, 1], i.e. 0 < λ_k(D) ≤ 1. As we will see below, this might be replaced by the weaker condition 0 < |λ_k(D)| ≤ 1, and D does not have to be diagonal, but rather only symmetric.

According to Lidskii [96], the following holds for two arbitrary n × n positive semidefinite Hermitian matrices U and V:

    ∏_{s=1}^{l} λ_{i_s}(UV) ≤ ∏_{s=1}^{l} λ_{i_s}(U) λ_s(V),   (72)

with any 1 ≤ i_1 < ··· < i_l ≤ n and l = 1, ..., n.   (73)

In particular, for l = 1 this theorem implies that

    λ_i(UV) ≤ λ_i(U) λ_1(V)   for any i = 1, ..., n.   (74)

We now apply this inequality for U = Γ^{-1/2} h Γ^{-1/2} and V = Γ^{1/2} D Γ^{-1} D Γ^{1/2}:

    λ_i( Γ^{-1/2} D h D Γ^{-1/2} ) = λ_i( h D Γ^{-1} D )
        = λ_i( Γ^{-1/2} h Γ^{-1/2} · Γ^{1/2} D Γ^{-1} D Γ^{1/2} )
        ≤ λ_i( Γ^{-1/2} h Γ^{-1/2} ) λ_1( Γ^{1/2} D Γ^{-1} D Γ^{1/2} ).   (75)

Here we used the fact that for two square non-singular matrices A and B, the eigenvalue spectra of AB and BA are always identical. To complete the proof we notice that:

    λ_1( Γ^{1/2} D Γ^{-1} D Γ^{1/2} ) ≤ ( λ_1( Γ^{1/2} D Γ^{-1/2} ) )² = ( λ_1(D) )² ≤ 1,   (76)

where in the first inequality we used (74) once more.

           | K = 2 ITPCSR                                     | 2-Step K = 1 TPCSR (OLS)                         | 2-Step K = 1 TPCSR (WLS)
ω_f(0)*    | b_i − (Σ_{k=1}^N b_k β_k / Σ_{k=1}^N β_k²) β_i   | b_i − (Σ_{k=1}^N b_k β_k / Σ_{k=1}^N β_k²) β_i   | Σ_i^{-2} [ b_i − (Σ_{j=1}^N b_j β_j Σ_j^{-2} / Σ_{k=1}^N β_k² Σ_k^{-2}) β_i ]
ω_m(0)*    | β_i − (Σ_{k=1}^N β_k b_k / Σ_{k=1}^N b_k²) b_i   | β_i                                              | Σ_i^{-2} β_i

Table 17: The weights of the market (second line) and of all other styles/factors (first line) in different approaches.

H Comparison with Other Cross-Sectional Regressions
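The l = 1 inequality (74), the workhorse of the proof above, is easy to sanity-check numerically. A minimal NumPy sketch (the construction of the random positive semidefinite test matrices is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 6

def random_psd(n):
    """Random positive semidefinite matrix (full rank almost surely)."""
    A = rng.normal(size=(n, n))
    return A @ A.T

U, V = random_psd(n), random_psd(n)

# U V is similar to the psd matrix U^(1/2) V U^(1/2), so its eigenvalues
# are real and non-negative
lam_UV = np.sort(np.linalg.eigvals(U @ V).real)[::-1]
lam_U = np.sort(np.linalg.eigvalsh(U))[::-1]
lam1_V = np.linalg.eigvalsh(V).max()

# inequality (74): lambda_i(U V) <= lambda_i(U) * lambda_1(V) for all i
print(np.all(lam_UV <= lam_U * lam1_V + 1e-8))  # True
```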
Two-pass cross-sectional regression (TPCSR) is frequently used to estimate the optimal factor loadings and weights of a general K-factor model. Recall that in our treatment we have K different two-factor models (11). Here we would like to outline the similarities and the differences between TPCSR and our model.

A good starting point is to notice that the model (11) might be re-written as r = B^T R_f + ε, where B^T = (β, b) is the (N, 2)-matrix of style loadings (market and a single style from our set of styles), R_f = (r_m(t), r_f(t)) is the (2, T)-matrix of factor returns, and ε are the idiosyncratic returns. These matrices can be trivially generalised to capture K > 2 different factors/styles; the matrices B and R_f will then be of size (K, N) and (K, T), respectively. With these conventions, TPCSR consists of recurrent time-series and cross-sectional regressions. Starting from a given set of factor returns (for example, with r_i^{b.m.}(t)), one fixes the factor loadings with the OLS linear regression of the time series r_i(t) against R_f. At the second step the B-matrix is used to determine a new matrix of factor returns, R_f, by means of the cross-sectional (cross-factorial, to be more precise) OLS linear regression. One can repeat this two-step procedure indefinitely, in which case the model goes under the name ITPCSR. As was pointed out in [97], the iteration has a fixed point, where the factor returns converge to the K first eigenvectors (the eigenvectors corresponding to the K largest eigenvalues) of the N × N covariance matrix. This is just a K > 1 generalisation of the PCA decomposition discussed in Appendix A.

The loadings B(0)* of our model defined in Section 2.5 might be seen as a two-stage implementation of the K = 1 version of the ITPCSR construction with the market-neutrality constraint.
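The fixed point quoted from [97] is easy to illustrate on simulated data before turning to the two stages themselves. A minimal K = 1 sketch (the one-factor toy market, its parameters, and the plain alternating-regression loop are assumptions for illustration, not the paper's procedure):

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 2000, 50

# one-factor toy market: r_i(t) = beta_i * f(t) + idiosyncratic noise
beta_true = rng.uniform(0.5, 1.5, size=N)
f = rng.normal(size=T)
r = np.outer(f, beta_true) + 0.5 * rng.normal(size=(T, N))  # (T, N)

# K = 1 ITPCSR: alternate time-series and cross-sectional OLS steps
g = rng.normal(size=T)             # arbitrary initial factor-return series
for _ in range(50):
    b = r.T @ g / (g @ g)          # time-series regression: loadings
    g = r @ b / (b @ b)            # cross-sectional regression: factor returns

# the fixed point aligns with the first eigenvector of the covariance matrix
v1 = np.linalg.eigh(np.cov(r, rowvar=False))[1][:, -1]
align = abs(b @ v1) / np.linalg.norm(b)
print(align > 0.99)  # True: the loadings converge to the leading eigenvector
```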
• For a short time scale of 90 days, we use an exponential moving average version of the K = 1 ITPCSR to determine the betas. In practice we do not repeat the iteration, so the betas are computed with respect to the stock index return r_m(t) rather than with respect to the Maximum-Variance market-mode portfolio return r_m*(t). This makes it possible to account for the variation of β(t), which is an important feature of financial markets [98]. It was shown in [87] that a time-varying beta makes the low-volatility anomaly disappear, thus improving the empirical validation of the CAPM. Our methodology supports a time-varying beta, which is not the case for the multi-factorial ITPCSR, which needs more than 90 days of data in order to estimate the eigenvectors as precisely as possible.

• At the second stage, for a long time scale of several years, we apply the K = 1 version of ITPCSR with the market-neutrality constraint. This second ITPCSR, when using WLS instead of OLS and aggregating similar stocks into Q = 10 different portfolios, leads to the first constrained eigenvector of the matrix γ̃^{-1/2} h̃ γ̃^{-1/2} (demonstrated in Appendix D).

We will thus refer to our approach as the two-stage K = 1 version of ITPCSR. Let us address the following crucial points:

• We used WLS instead of OLS, since it fits better the heterogeneity of the stock volatilities and corrects for the effect of heteroscedasticity. This can be seen in the Σ factors in (20).

• The fixed-point portfolio corresponds to the Maximum-Variance portfolio if the signal is strong enough (λ^{(0)}(ω^{b.m.}) should be high enough).

• In Table 17 we summarize the weights obtained using the standard K = 2 ITPCSR applied to the market and an additional factor (first column), and the weights derived using our method with the OLS linear regression (second column) and with WLS (last column).
The different versions give similar weights, except that the WLS weights are inversely proportional to the variance, and that in the 2-step K = 1 case the weights of the market mode do not interfere with the signal. By keeping K = 1 at every step we avoid mixing between the market return and the given style return. The K = 2 TPCSR with the same factor and the market mode will lose all the information about the original signal of this style after the first few iterations. In our approach, however, the factor return is kept market-neutral at every iteration, with equal weights for every stock in the same quantile, and thus the fixed-point return and the sensitivities preserve the connection to the initial signal input. Even more importantly, the market weights, the betas, are not affected at all at the second step.

• We obtain factors which are market-neutral with respect to the value-weighted stock index, while the factors of [97] for K = 2 (the market and an additional factor) become, after sufficiently many iterations, neutral with respect to the first component of the covariance matrix.

• For K ≫ 1 (including K = 24) the iterative procedure of [97] fails to incorporate the grouping of stocks into quantiles: for instance, Q = 10 necessitates K different portfolios. This is in contrast with our approach, where we group stocks separately for each factor, so that there are overall K × Q portfolios. As a result, the iteration can converge only to the noisy eigenvectors of the N × N correlation matrix, while quickly losing the connection to the initial financial information.

To conclude, the input of the initial benchmark portfolio is washed away by the iterative process of linear regressions, which yields a solution that is ultimately not optimal. As a consequence, the only available solution is the benchmark portfolio. Even if it is not optimized, ω^{b.m.} remains the reference method to capture financial signals.
As an example, in the mainstream Fama-French framework, ω^{b.m.} is implemented in its equal-weighted form and is neutral in nominal terms: Σ_i ω_i = 0 instead of Σ_i β_i ω_i = 0.

I Conditional and unconditional estimates based on the EMA
Estimating conditional volatilities and beta based on daily returns
To estimate the matrices γ(t), h(t) and all other matrices, we first have to find the volatilities Σ_i(t) and the betas β_i(t). As we discussed in the main text, we aim to estimate the conditional values referred to in Section 2.1 as E_{t−1}(···). A simple way to do this is to average the time-dependent variables over the period [t − Δt, t]. The better practice, however, is to use the exponential moving average (EMA) with α = Δt^{-1} instead of the moving window. Although it cannot account for all the delays, this approach still provides a good proxy for the conditional values.

We determine the variances with α_Σ^{-1} = 40 days, and then use the conditional volatility of the stock index to estimate the betas with α_β^{-1} = 90 days:

    Σ_i²(t + δt) = (1 − α_Σ) · Σ_i²(t) + α_Σ · r_i²(t),
    β_i(t + δt) = (1 − α_β) · β_i(t) + α_β · r_i(t) r_m(t) / Σ_m²(t).   (77)

Here δt = 1 day and, respectively, all the returns are daily. We neglected the daily mean of the returns and used the same moving average to find the index volatility as for the single stocks:

    Σ_m²(t + δt) = (1 − α_Σ) · Σ_m²(t) + α_Σ · r_m²(t).   (78)

Estimating h and γ for different time scales based on daily returns

In this appendix we outline the algorithm to approximate the unconditional values of the matrices h(τ) and γ(τ) at different time scales τ, from 1 day to 100 days. We will write down a single generalised formula that covers all the instances of these matrices discussed in this paper. The main differences between the instances are the portfolio weights, for example v_i of Section 2.5 or ω^{(1)*} of Section 2.4. Below we denote the portfolio weights simply by ω_{Ai}(t). As before, the index i stands for the single stocks, whereas the capital-letter indices should be replaced depending on the case at hand.
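The recursions (77)-(78) are straightforward to implement. A sketch in NumPy (the synthetic returns, the true betas, and the initial values are assumptions for illustration, not the paper's data):

```python
import numpy as np

def ema_vol_beta(r, r_m, alpha_sigma=1/40, alpha_beta=1/90):
    """EMA recursions (77)-(78) for stock variances, index variance and betas.

    r : (T, N) daily stock returns; r_m : (T,) daily index returns.
    The initialisations below are a choice for this sketch.
    """
    T, N = r.shape
    var = np.var(r, axis=0).copy()     # Sigma_i^2 at t = 0
    var_m = np.var(r_m)                # Sigma_m^2 at t = 0
    beta = np.ones(N)
    for t in range(T):
        # the beta update uses the index variance of the same day, eq. (77)
        beta = (1 - alpha_beta) * beta + alpha_beta * r[t] * r_m[t] / var_m
        var = (1 - alpha_sigma) * var + alpha_sigma * r[t] ** 2
        var_m = (1 - alpha_sigma) * var_m + alpha_sigma * r_m[t] ** 2
    return np.sqrt(var), beta

# synthetic check: recover known betas from simulated one-factor returns
rng = np.random.default_rng(2)
T, N = 5000, 4
beta_true = np.array([0.5, 1.0, 1.5, 2.0])
r_m = 0.01 * rng.normal(size=T)
r = np.outer(r_m, beta_true) + 0.005 * rng.normal(size=(T, N))
sigma, beta = ema_vol_beta(r, r_m)
print(beta)  # noisy EMA estimates scattered around [0.5, 1.0, 1.5, 2.0]
```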
For example, for the calculation of Section 2.4 one needs the factor indices, meaning A, B, ··· → a, b, ···, and the matrices h(τ) and γ(τ) are accordingly K × K.

First, recall that the Maximum-Variance weights (20), as well as the benchmark weights (9), are well-defined only up to an overall time-dependent normalisation. This issue was already mentioned at the end of Section 4.2. For the purposes of this section we keep constant the following quantity:

    Σ_{i=1}^N ω_{Ai}²(t) Σ_i²(t).   (79)

Notice that the normalisation constant is the same for all t and A.

Second, we introduce two sample means, the mean return of portfolio A and the mean (market-neutral) return of all single stocks:

    r̂_i(t) = r_i(t) − η · β_i(t) r_m(t),
    μ_A ≡ (1/T) Σ_{t=1}^T Σ_{i=1}^N ω_{Ai}(t) r̂_i(t),
    r̄ = (1/(TN)) Σ_{t=1}^T Σ_{i=1}^N r_i(t).   (80)

In all cases but one we set η = 0. With these formulae we define the following two auxiliary functions:

    E^h_{AB}(t, τ) ≡ [ Σ_{t'=t−τ}^{t} ( Σ_{i=1}^N ω_{Ai}(t') r̂_i(t') − μ_A ) ] · [ Σ_{t'=t−τ}^{t} ( Σ_{i=1}^N ω_{Bi}(t') r̂_i(t') − μ_B ) ],
    E^γ_{AB}(t, τ) ≡ Σ_{i=1}^N [ Σ_{t'=t−τ}^{t} ω_{Ai}(t') ( r_i(t') − r̄ ) ] · [ Σ_{t'=t−τ}^{t} ω_{Bi}(t') ( r_i(t') − r̄ ) ].   (81)

Here τ is the time scale used to estimate the matrices, see, for example, Table 10. In other words, the summation Σ_{t'=t−τ}^{t} gives the accumulated portfolio return for the period [t − τ, t]. Notice also that all the parentheses in (81) contain only zero-mean quantities.

Next, we use the exponential moving average with α_{h,γ}^{-1} = 3τ days to "smooth" these matrices:

    Ẽ^h_{AB}(t + δt, τ) ≡ (1 − α_{h,γ}) · Ẽ^h_{AB}(t, τ) + α_{h,γ} · E^h_{AB}(t, τ),
    Ẽ^γ_{AB}(t + δt, τ) ≡ (1 − α_{h,γ}) · Ẽ^γ_{AB}(t, τ) + α_{h,γ} · E^γ_{AB}(t, τ),   (82)

where, again, δt = 1 day.
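For a toy data set, the auxiliary functions (81) can be computed directly. A hedged sketch (here η = 0, as in all cases but one; the array-shape conventions and the random inputs are assumptions, not the paper's implementation):

```python
import numpy as np

def moments(returns, weights, mu, rbar, t, tau):
    """Auxiliary functions E^h_AB(t, tau) and E^gamma_AB(t, tau) of (81).

    returns : (T, N) daily stock returns r_i(t)   (eta = 0, so r-hat = r)
    weights : (T, K, N) portfolio weights omega_Ai(t) for K portfolios
    mu      : (K,) mean daily portfolio returns mu_A
    rbar    : mean single-stock return
    """
    r = returns[t - tau: t + 1]                # window [t - tau, t]
    w = weights[t - tau: t + 1]
    # E^h: demeaned daily portfolio returns, accumulated, then outer product
    daily = np.einsum('tki,ti->tk', w, r) - mu
    acc = daily.sum(axis=0)
    E_h = np.outer(acc, acc)
    # E^gamma: accumulate stock by stock, then sum the products over i
    acc_i = np.einsum('tki,ti->ki', w, r - rbar)
    E_g = acc_i @ acc_i.T
    return E_h, E_g

rng = np.random.default_rng(4)
T, K, N, tau = 60, 2, 5, 10
r = 0.01 * rng.normal(size=(T, N))
w = rng.uniform(size=(T, K, N))
mu = np.einsum('tki,ti->tk', w, r).mean(axis=0)        # mu_A of eq. (80)
E_h, E_g = moments(r, w, mu, r.mean(), t=50, tau=tau)
print(E_h.shape, E_g.shape)  # (2, 2) (2, 2)
```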
We are finally in a position to write down the expressions for h and γ:

    h_{AB}(τ) = (1/T) Σ_{t=1}^T  Ẽ^h_{AB}(t, τ) / [ Ẽ^γ_{AA}(t, τ) Ẽ^γ_{BB}(t, τ) ]^{1/2},
    γ_{AB}(τ) = (1/T) Σ_{t=1}^T  Ẽ^γ_{AB}(t, τ) / [ Ẽ^γ_{AA}(t, τ) Ẽ^γ_{BB}(t, τ) ]^{1/2}.   (83)

Let us explain these formulae. The time-dependent matrices E^h and E^γ in (81) follow directly from the definition of h and γ, and the EMA is the standard procedure to reduce the impact of extremely volatile one-day returns. For E^γ the measurement noise is further weakened for large N because, in contrast to E^h, it has a single summation over the stock index. Regardless of the noise, the volatility is a rough stochastic process close to the fractional Brownian motion [99]. This is precisely where the normalisation (79) becomes important. For larger Σ_i(t)'s, it keeps the portfolio weights lower, reducing the crisis impact on E^γ. This way the final results for h and γ are not overweighted by the contribution of large-volatility periods. The normalisation comes with a cost, though. The conditional volatilities defined in (77) have an α_Σ^{-1} = 40 days delay that brings an "artificial" volatility into both E^h and E^γ. To treat this problem, we can adjust the normalisation. In (79) the weights were normalised by the diagonal matrix Γ_ij = δ_ij Σ_i². It therefore makes sense to use Ẽ^γ for the last-step normalisation, and this is precisely what the square roots in (83) were introduced for. The procedure is similar to what is usually done to reduce the heteroscedasticity of a stochastic volatility process: one divides the stochastic function by its standard deviation. The formulae (83) can also be seen as weighted means over the entire time period. Since the product γ^{-1/2} h γ^{-1/2} is invariant under a simultaneous rescaling of the matrices h and γ, we do not have to divide by the sum of the weights, Σ_t ( Ẽ^γ_{AA} Ẽ^γ_{BB} )^{-1/2}.
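The final averaging step (83) reduces to a normalised time average of the smoothed moment series. A minimal sketch (the synthetic (T, K, K) input series are assumptions; only the normalisation logic is from the text):

```python
import numpy as np

def normalised_time_average(E_h, E_g):
    """Estimator (83): t-average of the smoothed moments, each normalised
    by sqrt(E~gamma_AA(t) * E~gamma_BB(t)).

    E_h, E_g : (T, K, K) smoothed series E~h_AB(t, tau) and E~gamma_AB(t, tau)
    """
    diag = np.sqrt(np.einsum('tkk->tk', E_g))      # (T, K): sqrt of E~gamma_AA
    norm = diag[:, :, None] * diag[:, None, :]     # (T, K, K) normalisers
    h = (E_h / norm).mean(axis=0)
    gamma = (E_g / norm).mean(axis=0)
    return h, gamma

# toy input: K = 3 portfolios, T = 100 days of smoothed moments
rng = np.random.default_rng(5)
T, K = 100, 3
A = rng.normal(size=(T, K, K))
E_g = np.einsum('tij,tkj->tik', A, A) + 3 * np.eye(K)  # psd, positive diagonal
E_h = 0.5 * E_g                                        # any symmetric series
h, gamma = normalised_time_average(E_h, E_g)
print(np.allclose(np.diag(gamma), 1.0))  # True: unit diagonal by construction
```

As the last assertion shows, the normalisation makes γ a bona fide correlation-like matrix with unit diagonal, which is exactly why the square roots in (83) are there.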
Finally, let us add a comment on α_{h,γ}. The EMA of E^h is, strictly speaking, unnecessary, as its impact becomes irrelevant after the t-averaging in (83). The EMA of E^γ, on the other hand, is important as it changes the normalisation. As we wrote above, for large N the function E^γ(t, τ) is not too noisy, and so we keep α_{h,γ} small compared to the other EMA parameters α_Σ and α_β.

Case                                                 | A, B, ··· | ω_{Ai}(t)                                                                                    | τ                                               | η
The first step of the B*(q) evaluation in Sec. 2.5   | q_1, q_2, ··· | (31)                                                                                     | one day                                         | 1
The second step of the B*(q) evaluation in Sec. 2.5  | p_1, p_2, ··· | Maximum-Variance portfolio v(p) derived from (20) and (35) with B*(q) found at the previous step | one day                                 | 0
Section 4                                            | a, b, ··· | Maximum-Variance portfolios (20) of Table 2 with the universal law (43)                      | 1, ···, 100 days (as in Tables 8, 9, 12 and 10) | 0
Section 4                                            | a, b, ··· | Benchmark portfolios (9) of Table 2                                                          | 1, ···, 100 days (as in Tables 8, 9, 12 and 10) | 0

Table 18: Different instances of applications of (83) in the paper.

Estimating h(t) and γ(t) based on 5-minute returns

In this appendix we describe the computation of h(t), γ(t) and the correlation matrix, both conditional (time-dependent) and unconditional (time-independent). We start with two successive EMA filters:

    E^C_ij(t + 5 min.) = (1 − α_C/n) · E^C_ij(t) + (α_C/n) · r_i(t) r_j(t),
    Ẽ^C_ij(t + 1 day) = (1 − α_C) · Ẽ^C_ij(t) + α_C · E^C_ij(t),   (84)

where α_C^{-1} = 5 days, and the factor n is the average number of available 5-minute returns in one day. The five-day averaging period is important because of the daily [100] and the weekly [101] U-patterns of the return variation.

With E^C_ij(t) at hand, we calculate the conditional empirical correlation matrix as:

    C^Emp_ij(t) = Ẽ^C_ij(t) / [ Ẽ^C_ii(t) Ẽ^C_jj(t) ]^{1/2}.   (85)
The time-dependent eigenvalues of this matrix, λ^Emp_i(t), appear in Figure 8, and the time-independent eigenvalues λ^Emp_i of the average of C^Emp_ij(t), that is, of

    C^Emp_ij = (1/T) Σ_t C^Emp_ij(t),   (86)

are given in Table 5.

To arrive at the conditional constrained eigenvalues λ^{(1)}(t) we first define

    E^h_{AB}(t) ≡ Σ_{i,j=1}^N ω_{Ai}(t) Ẽ^C_ij(t) ω_{Bj}(t),
    E^γ_{AB}(t) ≡ Σ_{i=1}^N ω_{Ai}(t) Ẽ^C_ii(t) ω_{Bi}(t).   (87)

Here ω_{Ai}(t) are the weights from the last two lines of Table 18. We assume that they evolve weakly compared to the returns; this simplifies the formulae greatly. We then proceed to the third and final EMA:

    Ẽ^h_{AB}(t + 1 day) = (1 − α_C) · Ẽ^h_{AB}(t) + α_C · E^h_{AB}(t),
    Ẽ^γ_{AB}(t + 1 day) = (1 − α_C) · Ẽ^γ_{AB}(t) + α_C · E^γ_{AB}(t).   (88)

Finally, the matrices we are interested in follow from Ẽ^h_{AB} and Ẽ^γ_{AB} similarly to (83):

    h_{AB}(t) = Ẽ^h_{AB}(t) / [ Ẽ^γ_{AA}(t) Ẽ^γ_{BB}(t) ]^{1/2},
    γ_{AB}(t) = Ẽ^γ_{AB}(t) / [ Ẽ^γ_{AA}(t) Ẽ^γ_{BB}(t) ]^{1/2}.   (89)

The dynamics of the eigenvalues of γ^{-1/2}(t) h(t) γ^{-1/2}(t) is shown in Figure 8, and the time-independent eigenvalues of

    (1/T) Σ_t γ^{-1/2}_ik(t) h_kl(t) γ^{-1/2}_lj(t)   (90)

are listed in Table 5.

References

[1] E. Fama and K. French, "The cross-section of expected stock returns," Journal of Finance 47 no. 2, (1992) 427–65.
[2] E. Fama and K. French, "Common risk factors in the returns on stocks and bonds," Journal of Financial Economics 33 no. 1, (1993) 3–56.
[3] E. F. Fama and K. R. French, "A five-factor asset pricing model," Journal of Financial Economics 116 no. 1, (2015) 1–22.
[4] R. Thaler and S. Benartzi, "Naive diversification strategies in defined contribution saving plans," American Economic Review 91 no. 1, (2001) 79–98.
[5] V. DeMiguel, L. Garlappi, F. J. Nogales, and R. Uppal, "A generalized approach to portfolio optimization: Improving performance by constraining portfolio norms," Management Science 55 no. 5, (2009) 798–812.
[6] D. B. Chaves, J. C. Hsu, F. Li, and O. Shakernia, "Risk parity portfolio vs. other asset allocation heuristic portfolios," Journal of Investing 20 no. 1, (2011) 108–118.
[7] J. Treynor, "Why market-valuation-indifferent indexing works," Financial Analysts Journal 61 no. 5, (2005) 65–69.
[8] H. Markowitz, "Portfolio selection," Journal of Finance 7 no. 1, (1952) 77–91.
[9] R. G. Clarke, H. de Silva, and S. Thorley, "Minimum-variance portfolio in the US equity market," Journal of Portfolio Management 33 no. 10, (2006) 33–10.
[10] R. G. Clarke, H. de Silva, and S. Thorley, "Risk parity, maximum diversification, and minimum variance: An analytic perspective," Journal of Portfolio Management 39 no. 3, (2012) 39–53.
[11] Y. Choueifaty and Y. Coignard, "Toward maximum diversification," The Journal of Portfolio Management 35 no. 1, (2008).
[12] R. G. Clarke, H. de Silva, and S. Thorley, "Minimum-variance portfolio composition," The Journal of Portfolio Management 37 no. 2, (2012) 31–45.
[13] A. Ang, R. Hodrick, Y. Xing, and X. Zhang, "The cross-section of volatility and expected returns,".
[14] C. R. Harvey and Y. Liu, "Lucky factors," Working paper (2014).
[15] D. A. Conway and M. R. Reinganum, "Stable factors in security returns: Identification using cross-validation," Journal of Business & Economic Statistics 6 no. 1, (1988) 1–15.
[16] C. S. Asness, T. J. Moskowitz, and L. Pedersen, "Value and momentum everywhere," Journal of Finance 68 no. 3, (2013) 929–985.
[17] R. W. Banz, "The relationship between return and market value of common stocks," Journal of Financial Economics 9 no. 1, (1981) 3–18.
[18] S. Basu, "Investment performance of common stocks in relation to their price-earnings ratios: A test of the efficient market hypothesis," Journal of Finance 32 no. 3, (1977) 663–82.
[19] F. Black, M. C. Jensen, and M. Scholes, "The capital asset pricing model: Some empirical tests," Studies in the Theory of Capital Markets (1972) 79–121.
[20] M. Carhart, "On persistence in mutual fund performance," Journal of Finance 52 no. 1, (1997) 57–82.
[21] N. Jegadeesh and S. Titman, "Returns to buying winners and selling losers: Implications for stock market efficiency," The Journal of Finance 48 no. 1, (1993) 65–91.
[22] J. Green, J. R. M. Hand, and X. F. Zhang, "The supraview of return predictive signals," Review of Accounting Studies 18 no. 3, (2013) 692–730.
[23] R. D. McLean and J. Pontiff, "Does academic research destroy stock return predictability?," Journal of Finance 71 no. 1, (2016) 5–32.
[24] C. Harvey, Y. Liu, and H. Zhu, "... and the cross-section of expected returns," Review of Financial Studies 29 no. 1, (2016) 5–68.
[25] G. Connor, "The three types of factor models: A comparison of their explanatory power," Financial Analysts Journal 51 no. 3, (1995) 42–46.
[26] S. Ross, "The arbitrage theory of capital asset pricing," Journal of Economic Theory 13 no. 3, (1976) 341–360.
[27] W. Sharpe, "Capital asset prices: A theory of market equilibrium under conditions of risk," Journal of Finance 19 no. 3, (1964) 425–442.
[28] R. Roll and S. Ross, "An empirical investigation of the arbitrage pricing theory," Journal of Finance 35 no. 5, (1980) 1073–1103.
[29] N. Chen, "Some empirical tests of the theory of arbitrage pricing," Journal of Finance 38 no. 2, (1983) 1393–1414.
[30] P. Dhrymes, I. Friend, and B. Gultekin, "A critical examination of the empirical evidence on the arbitrage pricing theory," Journal of Finance 39 no. 2, (1984) 323–346.
[31] G. Chamberlain and M. Rothschild, "Arbitrage and mean-variance analysis on large asset markets," Econometrica 51 no. 1, (1983) 1281–1301.
[32] C. Trzcinka, "On the number of factors in the arbitrage pricing model," Journal of Finance 41 no. 2, (1986) 347–368.
[33] S. J. Brown and M. I. Weinstein, "A new approach to testing asset pricing models: The bilinear paradigm," Journal of Finance 38 no. 2, (1983) 711–743.
[34] D. A. Conway and M. R. Reinganum, "Stable factors in security returns: Identification through cross validation," Journal of Business and Economic Statistics 6 no. 1, (1988) 1–15.
[35] S. J. Brown, "The number of factors in security returns," Journal of Finance 44 no. 5, (1989) 1247–1262.
[36] G. Connor and R. Korajczyk, "A test for the number of factors in an approximate factor model," Journal of Finance 48 no. 4, (1993) 1263–91.
[37] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, "Noise dressing of financial correlation matrices," Physical Review Letters 83 no. 1, (1998) 1467.
[38] P. Gopikrishnan, B. Rosenow, V. Plerou, and H. E. Stanley, "Quantifying and interpreting collective behavior in financial markets," Physical Review E 64 no. 1, (2001) 35106.
[39] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, T. Guhr, and H. E. Stanley, "Random matrix approach to cross correlations in financial data markets," Physical Review E 65 no. 1, (2002) 66126.
[40] R. Michaud, "The Markowitz optimization enigma: Is optimized optimal?," Financial Analysts Journal 45 no. 1, (1989) 31–42.
[41] R. Allez and J.-P. Bouchaud, "Eigenvector dynamics: general theory and some applications," Phys. Rev. E 86 no. 4, (2012) 046201.
[42] G. H. Golub, "Some modified matrix eigenvalue problems," SIAM Review 2 (1973) 318–334.
[43] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, "Random matrix theory," Risk Magazine 12 no. 3, (1999) 69.
[44] J.-P. Bouchaud and M. Potters, "Financial applications of random matrix theory: a short review," The Oxford Handbook of Random Matrix Theory (2011) 824–850.
[45] M. Potters, J.-P. Bouchaud, and L. Laloux, "Financial applications of random matrix theory: Old laces and new pieces," Acta Physica Polonica B 36 no. 9, (2005) 2767–2784.
[46] J. Bun, J.-P. Bouchaud, and M. Potters, "Cleaning large correlation matrices: tools from random matrix theory," Physics Reports 666 no. 1, (2017) 1–109.
[47] E. Fama and J. D. MacBeth, "Risk, return, and equilibrium: Empirical tests," Journal of Political Economy 81 no. 3, (1973) 607–36.
[48] M. Lambert and G. Hübner, "Size matters, book value does not! The Fama-French empirical CAPM revisited," (2014).
[49] M. Lambert and G. Hübner, "Comoment risk and stock returns," Journal of Empirical Finance 23 (2013) 191–205.
[50] R. F. Klein and V. K. Chow, "Orthogonalized factors and systematic risk decomposition," The Quarterly Review of Economics and Finance 53 no. 2, (2013) 175–187.
[51] P.-O. Löwdin, "On the nonorthogonality problem," Advances in Quantum Chemistry 5 (1970) 185–199.
[52] R. Benichou, Y. Lempérière, E. Sérié, J. Kockelkoren, P. Seager, J.-P. Bouchaud, and M. Potters, "Agnostic risk parity: Taming known and unknown-unknowns," Journal of Investment Strategies 6 no. 3, (2017) 1–12.
[53] D. Blitz, J. Huij, and M. Martens, "Residual momentum," Journal of Empirical Finance 18 no. 3, (2011) 506–521.
[54] R. Novy-Marx, "Operating leverage," Review of Finance 15 no. 1, (2011) 103–134.
[55] T. J. Moskowitz and M. Grinblatt, "Do industries explain momentum?," The Journal of Finance 54 no. 4, (1999) 1249–1290.
[56] S. Valeyre, D. Grebenkov, and S. Aboura, "Emergence of correlation of securities at short time scales," Physica A (submitted).
[57] V. A. Marčenko and L. A. Pastur, "Distribution of eigenvalues for some sets of random matrices," Math. USSR-Sbornik 1 no. 4, (1967) 457–483.
[58] A. Moreira and T. Muir, "Volatility-managed portfolios," Journal of Finance 72 no. 4, (2017) 1611–1644.
[59] M. Pelger, Essays in Financial Econometrics, Asset Pricing and Corporate Finance. PhD thesis, 2015.
[60] Y. Lemperiere, C. Deremble, T.-T. Nguyen, P. Seager, M. Potters, and J.-P. Bouchaud, "Risk premia: asymmetric tail risks and excess returns," Quantitative Finance 17 no. 1, (2017) 1–14.
[61] J.-P. Bouchaud, A. Matacz, and M. Potters, "The leverage effect in financial markets: retarded volatility and market panic," Physical Review Letters 87 no. 22, (2001) 228701.
[62] S. Valeyre, D. Grebenkov, S. Aboura, and Q. Liu, "The reactive volatility model," Quantitative Finance 13 no. 11, (2013) 1697–1706.
[63] S. Valeyre, D. Grebenkov, and S. Aboura, "The reactive beta model," Journal of Financial Research (in revision).
[64] C. Gourieroux, "Continuous time Wishart process for stochastic risk," Econometric Reviews 25 no. 2-3, (2006) 177–217.
[65] D. S. Grebenkov and J. Serror, "Following a trend with an exponential moving average: Analytical results for a Gaussian model," Physica A: Statistical Mechanics and its Applications 394 (2014) 288–303.
[66] D. Grebenkov and J. Serror, "Optimal allocation of trend following strategies," Physica A: Statistical Mechanics and its Applications 433 (2015) 107–125.
[67] T. W. Epps, "Comovements in stock prices in the very short run," Journal of the American Statistical Association 74 (1979).
[68] R. H. Litzenberger and K. Ramaswamy, "The effect of personal taxes and dividends on capital asset prices: Theory and empirical evidence," Journal of Financial Economics 7 no. 2, (1979) 163–195.
[69] V. T. Datar, N. Y. Naik, and R. Radcliffe, "Liquidity and stock returns: An alternative test," Journal of Financial Markets 1 no. 2, (1998) 203–219.
[70] R. Korajczyk and R. Sadka, "Pricing the commonality across alternative measures of liquidity," Journal of Financial Economics 87 no. 1, (2008) 45–72.
[71] W. Liu, "A liquidity-augmented capital asset pricing model," Journal of Financial Economics 82 no. 3, (2006) 631–671.
[72] M. Brennan and A. Subrahmanyam, "Market microstructure and asset pricing: On the compensation for illiquidity in stock returns," Journal of Financial Economics 41 no. 3, (1996) 441–464.
[73] Y. Amihud and H. Mendelson, "Asset pricing and the bid-ask spread," Journal of Financial Economics 17 no. 2, (1986) 223–249.
[74] V. Acharya and L. Pedersen, "Asset pricing with liquidity risk," Journal of Financial Economics 77 no. 2, (2005) 375–410.
[75] A. Frazzini and L. Pedersen, "Betting against beta," Journal of Financial Economics 111 no. 1, (2014) 1–25.
[76] N. Jegadeesh, "Evidence of predictable behavior of security returns," Journal of Finance 45 no. 3, (1990) 881–898.
[77] G. W. Douglas, Risk in the Equity Markets: An Empirical Appraisal of Market Efficiency. University Microfilms, Inc., Ann Arbor, Michigan, 1968.
[78] A. Ali, L.-S. Hwang, and M. A. Trombley, "Arbitrage risk and the book-to-market anomaly," Journal of Financial Economics 69 no. 2, (2003) 355–373.
[79] F. Black, "Capital market equilibrium with restricted borrowing," The Journal of Business 45 no. 3, (1972) 444–55.
[80] R. A. Haugen and A. J. Heins, "Risk and the rate of return on financial assets: Some old wine in new bottles," Journal of Financial and Quantitative Analysis 10 no. 05, (1975) 775–784.
[81] R. A. Haugen and N. L. Baker, "The efficient market inefficiency of capitalization-weighted stock portfolios," The Journal of Portfolio Management 17 no. 3, (1991) 35–40.
[82] H. Hong and D. Sraer, "Speculative betas," Journal of Finance 71 no. 5, (2016) 2095–2144.
[83] M. P. Baker, B. Bradley, and R. Taliaferro, "The low beta anomaly: A decomposition into micro and macro effects," Financial Analysts Journal (forthcoming).
[84] N.-F. Chen, R. Roll, and S. Ross, "Economic forces and the stock market," The Journal of Business 59 no. 3, (1986) 383–403.
[85] L. C. Bhandari, "Debt/Equity ratio and expected common stock returns: Empirical evidence," Journal of Finance 43 no. 2, (1988) 507–28.
[86] H. Desai, S. Rajgopal, and M. Venkatachalam, "Value-glamour and accruals mispricing: One anomaly or two?," The Accounting Review 79 no. 2, (2004) 355–385.
[87] T. G. Bali, R. F. Engle, and Y. Tang, "Dynamic conditional beta is alive and well in the cross section of daily stock returns," Management Science 63 no. 11, (2017) 3760–3779.
[88] Z. Da and M. C. Warachka, "Cashflow risk, systematic earnings revisions, and the cross-section of stock returns," Journal of Financial Economics 94 no. 3, (2009) 448–468.
[89] E. F. Fama and K. French, "Profitability, investment and average returns," Journal of Financial Economics 82 no. 3, (2006) 491–518.
[90] P. S. Mohanram, "Separating winners from losers among low book-to-market stocks using financial statement analysis," Review of Accounting Studies 10 no. 2-3, (2005) 133–170.
[91] M. Adler and B. Dumas, "International portfolio choice and corporation finance: A synthesis," Journal of Finance 38 no. 3, (1983) 925–84.
[92] K. C. Chan, N.-f. Chen, and D. A. Hsieh, "An exploratory investigation of the firm size effect," Journal of Financial Economics 14 no. 3, (1985) 451–471.
[93] W. C. Barbee, Jr., S. Mukherji, and G. A. Raines, "Do sales-price and debt-equity explain stock returns better than book-market and firm size?," Financial Analysts Journal 52 no. 2, (1996) 56–60.
[94] R. Balvers and D. Huang, "Productivity-based asset pricing: Theory and evidence," Journal of Financial Economics 86 no. 2, (2007) 405–445.
[95] R. Novy-Marx, "The other side of value: The gross profitability premium," Journal of Financial Economics 108 no. 1, (2013) 1–28.
[96] V. Lidskii, "The proper values of the sum and product of symmetric matrices (in Russian)," Dokl. Akad. Nauk SSSR 74 (1950) 769–772.
[97] G. Connor, R. Korajczyk, and R. T. Uhlaner, "A synthesis of two factor estimation methods," Journal of Financial and Quantitative Analysis 50 no. 04, (2015) 825–842.
[98] R. F. Engle, "Dynamic conditional beta," Journal of Financial Econometrics 14 no. 4, (2016) 643–667.
[99] J. Gatheral, T. Jaisson, and M. Rosenbaum, "Volatility is rough," Quantitative Finance 18 no. 6, (2018) 933–949.
[100] R. Allez and J.-P. Bouchaud, "Individual and collective stock dynamics: intra-day seasonalities,".
[101] H. Berument and H. Kiymaz, "The day of the week effect on stock market volatility," Journal of Economics and Finance 25 no. 2, (2001) 181–193.

Time Scale Effect on Correlation between Securities at Long Time Horizon
February 26, 2019
Abstract

The dependence of correlations on time scales larger than a week is difficult to measure due to insufficient data and strong noise. Here, such measurements are made possible by reducing the size of the correlation matrix to 24 risk factors. We observe that correlations continue to grow significantly with the time scale. We propose a model for the autocorrelation of the increments of the various risk factors that reproduces this scaling effect. The inefficiency that this model captures is more subtle than the alternative risk premia, but it appears to be more robust.
Valeyre, Grebenkov and Aboura (2018) measured and modeled a simple lead-lag effect on correlations at time scales from 1 minute to 1 day. However, such a measurement becomes too noisy above the 1-day time scale. This problem is well explained by random matrix theory (Laloux et al. (1998)). Valeyre et al. (2018) proposed a practical solution to this problem by reducing the size of the correlation matrix to 24 major risk factors that reproduce the largest eigenvalues and their dynamics. This solution reduces the noise and confirms the Epps effect up to the horizon of one year. The Epps effect (Epps (1979)) was initially observed only at intraday time scales, as noise was too problematic at daily scales. In our study, we model this effect differently by introducing an autocorrelation term into the returns of risk factors. This positive autocorrelation can be explained by the herd behavior of investors (Guedj and Bouchaud (2005); Michard and Bouchaud (2005); Cont and Bouchaud (2000); Wyart and Bouchaud (2007); Lux and Marchesi (1999)) and by a lack of market liquidity, which implies that a market move takes some time because an investor needs time for his transaction to be executed. We rely on the model developed by Grebenkov and Serror (2014) that describes autocorrelations between different stock indices and explains the performance of trend-following strategies and CTA funds over two centuries (Lemperiere et al. (2014)). This autocorrelation model represents an inefficiency that is more subtle but also more robust than the alternative risk premia. Harvey and Liu (2015) listed 316 potential factors from 313 articles published since 1967. The majority of these factors overlap with each other. Fama and French (2015) proposed a five-factor model (size, book, cash, momentum and accrual).
To justify such alternative risk premia (lack of liquidity, asymmetry), the financial theories are revised because these anomalies tend to disappear after their publication. McLean and Pontiff (2015) propose multiple explanations: sample bias, along with problems of optimization and adaptation to markets. To our knowledge, no prior work has attempted to reveal autocorrelations of risk factors that may represent a more subtle but also more robust inefficiency in financial markets. Strategies based on factor timing are nevertheless well documented, but they do not rely only on trend indicators, and there is no consensus about their robustness and profitability (Asness (2016); Lee (2017); Bender et al. (2018); Bass, Gladstone and Ang (2017); DeMiguel et al. (2017); Hodges et al. (2017); Dichtl et al. (2018); Brandt, Santa-Clara and Valkanov (2009)). Our autocorrelation model could support the profitability of factor-timing strategies based on trend-following signals.
We project the correlation matrix between single stocks onto the optimized subspace of 24 maximum variance portfolios introduced in Valeyre et al. (2018). The eigenvalues of the full correlation matrix, called "unconstrained eigenvalues", are then close to the "constrained eigenvalues" of the projected matrix

    C(τ) = γ^{-1/2} h(τ) γ^{-1/2}.

Here h and γ are the 24 × 24 covariance and overlap matrices of the 24 factors, also introduced in Valeyre et al. (2018), and τ is the time scale. We suppose that only the correlations between stocks depend on the time scale, whereas the matrix γ and the stocks' volatilities do not depend on τ. We also suppose that the correlation between factors (Corrcov(h(τ))) does not depend on τ (see the plot in Fig. 1 (bottom, right), which partly validates this hypothesis).

We suppose that there is an inertia in the returns of the Maximum variance portfolios, which represent the main elements of systematic trading. This inertia can be explained either by the herding effect or by a lack of liquidity. We start from the autocorrelation model developed in Grebenkov and Serror (2014) but apply it directly to the returns of the Maximum variance portfolios, which are normalized in such a way that the diagonal elements of the matrix γ(t) are equal to 1 at every time moment. Denoting r_{kt} the return of the k-th factor at day t, we get

    r_{kt} = ε_{kt} + κ_k Σ_{i=-∞}^{t-1} (1-χ)^{t-1-i} ξ_{ki},   (1)

where ε and ξ are Gaussian random variables without autocorrelation. One can show that

    C(τ) = γ^{-1/2} V(τ)^{1/2} h(1) V(τ)^{1/2} γ^{-1/2},   (2)

where V(τ) is a diagonal matrix whose k-th diagonal element is derived from Eq. (9) of Grebenkov and Serror (2014):

    V_k(τ) = 1 + [2(1-χ) κ̂_k² / (χ(1+κ̂_k²))] [1 − (1−(1−χ)^τ)/(τχ)],   (3)

with

    κ̂_k = κ_k / √(χ(2−χ)).   (4)

One can show that the eigenvalues at the time scale τ correspond to the eigenvalues at the time scale of 1 day multiplied by the same coefficient V(τ) if all the factors have the same autocorrelation κ_k = κ. The problem becomes more complicated when some factors are more autocorrelated than others. This seems to be the case, as the sectorial factors are less cross-correlated than the style factors. This changes the eigenvectors as well:

• The measurements (see Fig. 1) rely on the methods introduced in Valeyre et al. (2018) to estimate the returns of the maximum variance factors, γ(τ) and h(τ), and then C(τ). One can use the method introduced in Grebenkov and Serror (2014) to estimate the variograms. However, the variograms estimated from 20 years of historical daily data are very noisy, and the differences in measurement among factors are compatible with this noise. On average, they show that the factor returns are positively autocorrelated. The FCLs, introduced in Valeyre et al. (2018), increase for style factors but remain stable for sectorial ones. Finally, the eigenvalues increase with the time scale. The ranking of style factors according to their FCL is very sensitive to the time scale (see Fig. 3). Note that the Book and Capitalization factors appear in the 4th and 5th places after the Beta, STR and Momentum factors, whereas they were ranked as less relevant at the time scale of one hour.

• The simulation of the model (Fig. 2) is based on the matrices γ and h generated randomly only once by the method described in section 2.8 of Valeyre et al. (2018) from the empirical eigenvalues of the averaged correlation matrix of daily returns. The style and sectorial factors are generated randomly only once, with the same nonzero κ for the 13 style factors and κ = 0 for the 9 sectorial factors. One fixes χ to reproduce the measurements correctly. It turns out that these parameters also correspond to the autocorrelation model of the Dow Jones index, which could be estimated accurately thanks to its 120 years of data; see Grebenkov and Serror (2014).
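The variance-growth factor of Eqs. (1), (3) and (4) can be sanity-checked with a small Monte Carlo simulation of a single autocorrelated factor. This is a sketch with illustrative values of κ and χ (not the calibrated ones from the chapter): it simulates Eq. (1), aggregates returns over a horizon τ, and compares the empirical normalized variogram with the closed form of Eq. (3).

```python
import numpy as np

rng = np.random.default_rng(42)
chi, kappa = 0.1, 0.3        # illustrative parameters, not the calibrated values
T, tau = 200_000, 20

# Eq. (1): r_t = eps_t + kappa * sum_{i <= t-1} (1 - chi)^(t-1-i) * xi_i
eps = rng.standard_normal(T)
xi = rng.standard_normal(T)
trend = np.zeros(T)
for t in range(1, T):
    trend[t] = (1 - chi) * trend[t - 1] + xi[t - 1]
r = eps + kappa * trend

# empirical normalized variogram: Var(sum of tau returns) / (tau * Var(r))
blocks = r[: (T // tau) * tau].reshape(-1, tau).sum(axis=1)
v_emp = blocks.var() / (tau * r.var())

# Eq. (3), with kappa_hat^2 = kappa^2 / (chi * (2 - chi)) from Eq. (4)
k2 = kappa**2 / (chi * (2 - chi))
v_th = 1 + 2 * (1 - chi) * k2 / (chi * (1 + k2)) * (
    1 - (1 - (1 - chi)**tau) / (tau * chi))

print(v_emp, v_th)   # v_emp should fall within a few percent of v_th
```

The variance of the aggregated returns grows faster than τ whenever κ > 0, which is exactly the scaling effect invoked for the eigenvalues.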
Figure 1: Top left: measurement of the variograms of the style factors. Top middle: FCLs of the sectorial factors, which do not increase (i.e., these factors are only weakly autocorrelated). Top right: FCLs of the style factors, which grow; in particular, the book factor of Fama and French reaches the 5th place at long time scales (strong herding). Bottom left: constrained eigenvalues. Bottom middle: constrained eigenvalues without the largest one; one can see that correlations keep growing strongly with the time scale. Bottom right: the first eigenvalue of Corrcov(C), which is equivalent to ρ̄(τ), with ρ̄(τ) being the average correlation between factors.
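The caption's statement that the first eigenvalue of Corrcov(C) tracks the average correlation between factors can be illustrated on an exactly equicorrelated matrix, where the relation is analytic. This is a hypothetical example with K = 24 factors and ρ = 0.3 (values chosen for illustration only):

```python
import numpy as np

K, rho = 24, 0.3                       # 24 factors, hypothetical common correlation
C = np.full((K, K), rho)
np.fill_diagonal(C, 1.0)

lam_max = np.linalg.eigvalsh(C).max()  # largest eigenvalue
# for an equicorrelated matrix lam_max = 1 + (K - 1) * rho, so the average
# correlation can be read off from the first eigenvalue
rho_implied = (lam_max - 1) / (K - 1)
print(lam_max, rho_implied)            # 7.9 and 0.3 (up to rounding)
```

With heterogeneous correlations the identity becomes approximate, but the first eigenvalue remains a monotone proxy for the average correlation.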
References
Asness, C. S., "The Siren Song of Factor Timing aka Smart Beta Timing aka Style Timing", Journal of Portfolio Management (2016).
Bass, R., S. Gladstone, and A. Ang, "Total Portfolio Factor, not just Asset, Allocation", Journal of Portfolio Management, 43(5), 38 (2017).
Figure 2: Top left: simulation of the variograms of the style factors with the calibrated values of κ and χ. Top middle: simulation of the FCLs (note that κ = 0 for the sectorial factors). Top right: simulation of the eigenvalues. Bottom: empirical eigenvalues that were used as input for the Monte Carlo simulation generating γ and h(1). One can see that the model accurately reproduces the measurements.

Bender, J., X. Sun, R. Thomas, and V. Zdorovtsov, "The Promises and Pitfalls of Factor Timing", Journal of Portfolio Management, 44(4), 79 (2018).
Brandt, M. W., P. Santa-Clara, and R. Valkanov, "Parametric Portfolio Policies: Exploiting Characteristics in the Cross-Section of Equity Returns", Review of Financial Studies, 22(9), 3411 (2009).
Cont, R., and Bouchaud, J.-P., "Herd Behavior and Aggregate Fluctuations in Financial Markets", Macroeconomic Dynamics, 4(2), 170-196 (2000).
Dichtl, H., W. Drobetz, H. Lohre, C. Rother, and P. Vosskamp, "Optimal Timing and Tilting of Equity Factors", Working Paper (2018).
Epps, T. W., "Comovements in Stock Prices in the Very Short Run", Journal of the American Statistical Association, 74, 291-298 (1979).
Fama, E. F., and K. R. French, "A Five-Factor Model", Journal of Financial Economics, 116, 1-22 (2015).
Grebenkov, D., and J. Serror, "Following a Trend with an Exponential Moving Average: Analytical Results for a Gaussian Model", Physica A
(2014).

Factor          1 day   5 days  10 days  20 days  40 days  80 days  100 days
Beta             6.35    7.99     8.77     9.34    10.37    11.64     14.45
STR              4.10    4.73     5.00     5.51     6.72     8.45     12.21
Momentum         5.87    7.09     7.64     8.20     9.00     9.60     10.80
Capitalization   4.43    4.95     5.24     5.66     6.48     7.53      9.54
Book             2.49    2.94     3.40     3.99     4.89     5.50      6.40
Sales            2.88    3.47     3.85     4.19     4.85     5.26      6.30
Dividend         3.86    4.44     4.90     5.28     5.77     5.83      5.79
5Y Rates         5.38    6.24     6.26     6.06     5.80     5.80      5.42
Liquidity        2.82    2.93     3.12     3.34     3.77     4.11      4.91
Euro             3.72    4.12     4.15     4.08     3.82     3.74      3.55
Leverage         2.08    2.30     2.54     2.72     2.93     3.10      3.41
Earning          1.89    2.09     2.20     2.24     2.40     2.59      3.17
Cash             1.52    1.66     1.72     1.73     1.85     2.04      2.46
Growth           1.47    1.67     1.76     1.78     1.92     2.00      2.00

Figure 3: FCL at different time scales.

Guedj, O., and Bouchaud, J.-P., "Experts' Earning Forecasts: Bias, Herding and Gossamer Information", International Journal of Theoretical and Applied Finance, 8(7), 933-946 (2005).
Harvey, C. R., and Liu, Y., "Lucky Factors", Working paper, Duke University (2015).
Hodges, P., K. Hogan, J. R. Peterson, and A. Ang, "Factor Timing with Cross-Sectional and Time-Series Predictors", Journal of Portfolio Management, 44(1) (2017).
Laloux, L., Cizeau, P., Bouchaud, J.-P., and Potters, M., "Noise Dressing of Financial Correlation Matrices", Physical Review Letters, 83, 1467 (1998).
Lee, W., "Factors Timing Factors", Journal of Portfolio Management, 43(5) (2017).
Lemperiere, Y., C. Deremble, P. Seager, M. Potters, and J.-P. Bouchaud, "Two Centuries of Trend Following", Journal of Investment Strategies, 3(3) (2014).
Lux, T., and Marchesi, M., "Scaling and Criticality in a Stochastic Multi-Agent Model of a Financial Market", Nature, 397, 498-500 (1999).
McLean, R. D., and Pontiff, J.
, "Does Academic Research Destroy Stock Return Predictability?", Journal of Finance, forthcoming (2015).
Michard, Q., and Bouchaud, J.-P., "Theory of Collective Opinion Shifts: From Smooth Trends to Abrupt Swings", The European Physical Journal B, 47(1), 151-159 (2005).
DeMiguel, V., Martin-Utrera, F. J. Nogales, and R. Uppal, "A Portfolio Perspective on the Multitude of Firm Characteristics", LBS working paper (2017). Best Paper Award at the XXIV Finance Forum; presented at the 2017 EFA Annual Meeting and the 2018 AFA Annual Meeting.
Valeyre, S., S. Kuperstein, D. S. Grebenkov, and S. Aboura, "The Market Neutral Fundamental Maximum Variance Portfolios", Working paper (2018).
Valeyre, S., D. S. Grebenkov, and S. Aboura, "Emergence of Correlation between Securities at Short Time Scales", submitted to Physica A (2018).
Wyart, M., and Bouchaud, J.-P., "Self-Referential Behaviour, Overreaction and Conventions in Financial Markets", Journal of Economic Behavior & Organization, 63(1), 1-24 (2007).

The Reactive Beta Model

February 14, 2019
Abstract

We present a reactive beta model that accounts for the leverage effect and beta elasticity. For this purpose, we derive a correlation metric for the leverage effect to identify the relation between the market beta and volatility changes. An empirical test based on the most popular market neutral strategies is run from 2000 to 2015 with exhaustive data sets, including 600 US stocks and 600 European stocks. Our findings confirm the ability of the reactive beta model to remove an important part of the bias from the beta estimation and from the most popular market neutral strategies. To examine the robustness of the reactive beta measurement, we conduct Monte Carlo simulations over seven market scenarios against five alternative methods. The results confirm that the reactive model significantly reduces the bias overall when financial markets are stressed.

Keywords: Beta, Correlation, Volatility, Leverage effect, Market Neutral Strategies.
JEL classification: C5, G01, G11, G12, G32.

Introduction
Finding an appropriate measurement of market betas is of paramount importance for many financial applications, including for market neutral hedge fund managers who target a near-zero beta. Contrary to common belief, perfectly beta-neutral strategies are difficult to achieve in practice, as the mortgage crisis in 2008 exemplified, when most market neutral funds remained correlated with stock markets and experienced considerable unexpected losses. This exposure to the stock index (Banz, 1981; Fama and French, 1992, 1993; Carhart, 1997; Ang et al., 2006) is even stronger during down market conditions (Mitchell and Pulvino, 2001; Agarwal and Naik, 2004; Bussière et al., 2015). In such periods of market stress, hedge funds may even add no value (Asness et al., 2001).

In this paper, we derive a stock market beta measure that we implement to test the quality of hedging for four popular strategies in the hedge fund industry. The first and most important strategy captures the low beta anomaly (Black, 1972; Black et al., 1972; Haugen and Heins, 1975; Haugen and Baker, 1991; Ang et al., 2006; Baker et al., 2013; Frazzini and Pedersen, 2014; Hong and Sraer, 2016) that defies conventional wisdom on the risk and reward trade-off predicted by the CAPM (Sharpe, 1964). According to this anomaly, high beta stocks underperform low beta stocks. Similarly, stocks with high idiosyncratic volatility earn lower returns than stocks with low idiosyncratic volatility (Malkiel and Xu, 1997; Goyal and Santa-Clara, 2003; Ang et al., 2006, 2009). The related strategy consists of shorting high beta stocks and buying low beta stocks. The second important strategy captures the size effect (Banz, 1981; Reinganum, 1981; Fama and French, 1992), in which stocks of small firms tend to earn higher returns, on average, than stocks of larger firms. The related strategy consists of buying stocks with small market capitalization and shorting those with high market capitalization.
The third strategy captures the momentum effect (Jegadeesh and Titman, 1993; Carhart, 1997; Grinblatt and Moskowitz, 2004; Fama and French, 2012), where past winners tend to continue to show high performance. This strategy consists of buying the past year's winning stocks and shorting the past year's losing ones. The fourth strategy captures the short-term reversal effect (Jegadeesh, 1990), where the past month's winners tend to show low performance. This strategy consists of buying the past month's losing stocks and shorting the past month's winning stocks, which would be highly profitable if there were no transaction costs and no market impact. Testing the quality of the hedge of these strategies is equivalent to assessing the quality of the beta measurements, which is difficult to do directly, as the true beta is not known.

The implementation of all these strategies requires a reliable estimation of the betas to maintain the hedge. Ordinary least squares (OLS) estimation remains the most frequently employed method, even though it is impaired in the presence of outliers, especially for small companies (Fama and French, 2008), illiquid companies (Amihud, 2002; Acharya and Pedersen, 2005; Ang et al., 2013), and across business cycles (Ferson and Harvey, 1999). In these circumstances, the OLS beta estimator might be inconsistent. To overcome these limitations, our approach consists of renormalizing the returns to make them closer to Gaussian and thus to make the OLS estimator more consistent. In addition, many papers report that betas are time varying (Blume, 1971; Fabozzi and Francis, 1978; Jagannathan and Wang, 1996; Fama and French, 1997; Bollerslev et al., 1988; Lettau and Ludvigson, 2001; Lewellen and Nagel, 2006; Ang and Chen, 2007; Engle, 2016). This can lead to measurement errors that could create serious bias in cross-sectional asset pricing tests (Shanken, 1992; Chan and Lakonishok, 1992; Meng et al., 2011; Bali et al., 2017).
In fact, firms' stock betas do change over time for several reasons. A firm's assets tend to vary over time as it acquires or replaces businesses, which makes it more diversified. Betas also change for firms whose size changes, making them safer or riskier. For instance, financial leverage may increase when firms become larger, as they can issue more debt. Moreover, firms with higher leverage are exposed to a more unstable beta (Galai and Masulis, 1976; DeJong and Collins, 1985). One way to account for the time dependence of betas is to consider regime changes when the return history used in the beta estimation is long enough. Surprisingly, only one paper (Chen et al., 2005) suggests a solution to capture this time dependence and discusses regime changes for the beta using a multiple structural change methodology. That study shows that the risk related to beta regime changes is rewarded by higher returns. Another approach is to examine the correlation dynamics. Francis (1979) finds that "the correlation with the market is the primary cause of changing betas... the standard deviations of individual assets are fairly stable". This finding calls for special attention to the correlation dynamics addressed in our paper, which are apparently insufficiently investigated in other works.

Despite the extensive literature on this issue, little attention has been paid to the link between the leverage effect and the beta. The leverage effect is defined as the negative correlation between securities' returns and their volatility changes. This correlation induces residual correlations between stock overperformances and beta changes. In fact, earlier studies have heavily focused on the role of the leverage effect on volatility (Black, 1976; Christie, 1982; Campbell and Hentchel, 1992; Bekaert and Wu, 2000; Bouchaud et al., 2001; Valeyre et al., 2013).
Surprisingly, despite its theoretical and empirical underpinnings, the leverage effect has not been considered so far in beta modeling, even though it is a measure of risk. We aim to close this gap. Our paper starts by investigating the role of the leverage effect in the correlation measure by extending the reactive volatility model (Valeyre et al., 2013).

Note that we are not dealing with the restricted definition of the "leveraged beta" that comes from the degree of leverage in the firm's capital structure. Notice also that the market beta may be non-linearly related to the market return, which could lead to spurious inference in beta measurement (DeBondt and Thaler, 1987), while the leverage effect could possibly be a major explanation of such non-linearity (e.g., Garlappi and Yan (2011) relate leverage to default probability; Daniel, Jagannathan and Kim (2012) relate financial leverage to operating leverage; Choi (2013) relates leverage to economic conditions; Moreira and Muir (2017) relate leverage and volatility-managed portfolios; Liu, Stambaugh and Yuan (2018) relate leverage to the beta-idiosyncratic volatility relation). In this context, the time-variation effect in the conditional beta adds to this bias (Boguth et al., 2011).
In this section, we present the reactive beta model with three independent components. First, we take into account the specific leverage effect on the beta. Second, we consider the systematic leverage effect on the correlation. Third, we model the relation between the relative volatility and the beta via a nonlinear beta elasticity.

2.1 The leverage effect on beta
We first account for the relations among returns, volatilities, and the beta, which are characterized by the so-called leverage effect. This component takes into account the phenomenon whereby a beta increases as soon as a stock underperforms the index. Such a phenomenon can be fairly well described by the leverage effect captured in the reactive volatility model. We call the specific leverage effect the negative relation between specific returns and the risk (here, the beta), where the specific return is the nonsystematic part of the returns (a stock's overperformance). The specific leverage effect on the beta follows the same dynamics as the specific leverage effect introduced in the reactive volatility model.

2.1.1 The reactive volatility model

This section aims to capture the dependence of betas on stock overperformance (when a stock is overperforming, its beta tends to decrease). For this purpose, we rely on the methodology of the reactive volatility model (Valeyre et al., 2013) to derive a stable measure of the beta by using a renormalization factor that depends on the stock's overperformance. The model describes the systematic and specific leverage effects. Systematic leverage, which is due to the panic effect, and specific leverage, which is due to a retarded effect, have very different relaxation times and intensities. These two effects were investigated by Bouchaud et al. (2001), who introduced the measurement of the return-volatility correlation function at different time scales τ. They defined this measurement as

    L(τ) = E[r²(t+τ) r(t)] / E[r²(t)]²,

where r(t) is the daily return at day t, and they showed that it exhibits an exponential decay in τ with two parameters: the relaxation time and the initial amplitude, which describes the intensity of the leverage.
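The correlation function L(τ) defined above can be estimated directly from a return series. Below is a minimal sketch: for i.i.d. Gaussian returns (no leverage) the estimate should fluctuate around zero, whereas real index returns are reported to give large negative values at small τ.

```python
import numpy as np

def leverage_corr(r, tau):
    """L(tau) = E[r(t+tau)^2 * r(t)] / E[r(t)^2]^2 (Bouchaud et al., 2001)."""
    num = np.mean(r[tau:] ** 2 * r[:-tau])
    return num / np.mean(r ** 2) ** 2

# synthetic returns with no leverage effect, for illustration only
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_normal(1_000_000)
print(leverage_corr(r, 5))   # fluctuates around 0: no leverage in this sample
```

Note that L(τ) has the dimension of an inverse return, so its typical magnitude for real daily index data is much larger than the sampling noise seen here.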
The intensity measured is 9 times higher for the stock index than for single stocks, and the relaxation time is 6 times shorter for the stock index. The higher intensity and the shorter relaxation time for the stock index were explained by the panic effect, which occurs as soon as all single stocks decrease at the same time. The low intensity and the longer relaxation time for single stocks were explained by the retarded effect: on short time scales, the standard deviation of price differences is the criterion used by traders to assess risk, whereas on longer time scales, the standard deviation of returns is used. The retarded effect works as if traders needed time to take a change in price into account in their analysis of the risk. The reactive volatility model reproduces very well the measurement of L(τ) for the stock index and for single stocks.

We start by recalling the construction of the reactive volatility model, which explicitly accounts for the leverage effect on volatility. Let I(t) be a stock index at day t. It is well known that arithmetic returns, r_I(t) = δI(t)/I(t−1), are heteroscedastic, partly due to price-volatility correlations. Throughout the text, δ refers to the difference between successive values, e.g., δI(t) = I(t) − I(t−1). The reactive volatility model aims to construct an appropriate "level" of the stock index, L(t), to replace the original returns δI(t)/I(t−1) by δI(t)/L(t−1). For this purpose, we first introduce two "levels" of the stock index as exponential moving averages (EMAs) with two time scales: a slow level L_s(t) and a fast level L_f(t). In addition, we denote by L_is(t) the EMA (with the slow time scale) of the price S_i(t) of stock i at time t. These EMAs can be computed using standard linear relations:

    L_s(t) = (1 − λ_s) L_s(t−1) + λ_s I(t),   (1)
    L_f(t) = (1 − λ_f) L_f(t−1) + λ_f I(t),   (2)
    L_is(t) = (1 − λ_s) L_is(t−1) + λ_s S_i(t),   (3)

where λ_s and λ_f are the weighting parameters of the EMAs, which we set relying on the estimates by Bouchaud et al. (2001). The slow parameter corresponds to the relaxation time of the retarded effect for specific risk, whereas the fast one corresponds to the relaxation time of the panic effect for systematic risk. These two relaxation times are found to be rather universal, as they are stable over years and do not change among different mature stock markets. The appropriate levels, L(t) and L_i(t), accounting for the leverage effect on the volatility so as to correctly normalize the price differences, were introduced for the stock index and for individual stocks, respectively:

    L(t) = I(t) (1 + (L_s(t) − I(t))/I(t)) (1 + ℓ (L_f(t) − I(t))/L_f(t)),   (4)
    L_i(t) = S_i(t) (1 + (L_is(t) − S_i(t))/S_i(t)) (1 + ℓ_i (L_f(t) − I(t))/L_f(t)),   (5)

where in Eq. (5) the first factor accounts for the specific risk and the second for the systematic risk, and the parameters ℓ and ℓ_i quantify the leverage. The parameter ℓ was introduced by Valeyre et al. (2013) to reproduce the exponential fit of the return-volatility correlation function L(τ) at different time scales τ. The initial parameters of the exponential fit were estimated on 7 major stock indexes, from which ℓ was deduced to be approximately 8. If ℓ = ℓ_i, the correlation between the stock index and the individual stock i is not impacted by the leverage effect. In turn, if ℓ > ℓ_i, the correlation increases when the stock index decreases. Although ℓ_i can generally be specific to the considered i-th stock, we ignore its possible dependence on i and set ℓ_i = ℓ₀.
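Eqs. (1), (2) and (4) are straightforward to implement. The sketch below uses placeholder values for λ_s and λ_f (the calibrated values come from Bouchaud et al. (2001) and are not repeated here) and ℓ ≈ 8 as in the text; it checks that after an index drop the level L(t) stays above the price, which is the mechanism that makes the reactive volatility rise.

```python
import numpy as np

def ema(x, lam):
    """EMA of Eqs. (1)-(3): y_t = (1 - lam) * y_{t-1} + lam * x_t."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = (1 - lam) * y[t - 1] + lam * x[t]
    return y

def reactive_level(I, lam_s=0.02, lam_f=0.14, ell=8.0):
    """Level L(t) of Eq. (4); lam_s and lam_f are placeholder values."""
    Ls, Lf = ema(I, lam_s), ema(I, lam_f)
    return I * (1 + (Ls - I) / I) * (1 + ell * (Lf - I) / Lf)

# after a 1% index drop the level stays above the price, so the renormalized
# return |dI|/L is smaller than |dI|/I: the volatility "reacts" upward
I = np.concatenate([np.full(300, 100.0), np.full(30, 99.0)])
L = reactive_level(I)
print(L[-1] > I[-1])   # True
```

For a constant index the two EMAs equal the price and L(t) = I(t), so the renormalization is neutral in calm markets.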
Using the levels L(t) and L_i(t), we introduce the normalized returns:

    r̃_I(t) = δI(t)/L(t−1),   r̃_i(t) = δS_i(t)/L_i(t−1).   (6)

(In practice, a filtering function is introduced to attenuate the contribution of eventual outliers (extreme events or erroneous data). The filter is applied to z = (L_s(t) − I(t))/I(t) and z = (L_is(t) − S_i(t))/S_i(t) in Eqs. (4, 5) and is defined as F_φ(z) = tanh(φz)/φ; in the limit φ = 0, there is no filter: F₀(z) = z.)

We then estimate the renormalized volatilities σ̃_I and σ̃_i through the EMAs as:

    σ̃²_I(t) = (1 − λ_σ) σ̃²_I(t−1) + λ_σ r̃²_I(t),   (7)
    σ̃²_i(t) = (1 − λ_σ) σ̃²_i(t−1) + λ_σ r̃²_i(t),   (8)

where λ_σ is a weighting parameter that has to be chosen as a compromise between the accuracy of the estimated renormalized volatility and the reactivity of that estimation. Indeed, the renormalized returns are constructed to be homoscedastic only at short times, because the renormalization based on the leverage effect with short relaxation times (λ_s, λ_f) cannot account for long periods of changing volatility related to economic cycles. Since economic uncertainty does not change significantly over a period of two months (40 trading days), we set λ_σ to 1/40 = 0.025. This sample length leads to a statistical uncertainty of the order of 1/√40 ≈ 16%. Finally, these renormalized variances can be converted into the reactive volatility σ_I(t) of the stock index, quantifying the systematic risk governed by the panic effect, and the reactive volatility σ_i(t) of each individual stock, quantifying the specific risk governed by the leverage effect:

    σ_I(t) = σ̃_I(t) L(t)/I(t),   (9)
    σ_i(t) = σ̃_i(t) L_i(t)/S_i(t).   (10)

This reactive volatility captures a large part of the heteroscedasticity, i.e., a large part of the volatility variation is completely explained by the leverage effect. That was the main result of Valeyre et al. (2013): for instance, if the stock index loses 1%, L(t)/I(t) increases by ℓ × 1% = 8%, and the stock index volatility increases by 8%. This effect is enough to capture a large part of the VIX variation. In turn, if a stock underperforms the stock index by 1%, L_i(t)/S_i(t) increases by 1%, and the single stock volatility increases by 1%.

2.1.2 The specific leverage effect in the reactive beta model

The volatility estimation procedure naturally impacts the estimation of the beta. Many financial instruments rely on the estimated beta, β_i, which corresponds to the slope of a linear regression of the stock's arithmetic returns r_i on the index arithmetic returns r_I:

    r_i = β_i r_I + ε_i,   with r_i = δS_i(t)/S_i(t−1), r_I = δI(t)/I(t−1),   (11)

where ε_i is the residual random component specific to stock i. We consider another beta estimate, β̃_i, based on the reactive volatility model, in which the renormalized stock returns r̃_i are regressed on the renormalized stock index returns r̃_I:

    r̃_i = β̃_i r̃_I + ε̃_i,   with r̃_i = δS_i(t)/L_i(t−1), r̃_I = δI(t)/L(t−1).   (12)

We then obtain a reactive beta measure:

    β_i(t) = β̃_i(t) (σ_i(t) σ̃_I(t)) / (σ_I(t) σ̃_i(t)) = β̃_i (L_is(t) I(t)) / (L_s(t) S_i(t)),   (13)

which includes two improvements:

• β̃_i, which becomes less sensitive to price changes by accounting for the specific leverage effect;
• σ_i σ̃_I / (σ_I σ̃_i), which changes instantaneously with price changes.

When the short-term leverage effect in correlations is taken into account, the reactive term reduces to L_is(t) I(t) / (L_s(t) S_i(t)). This term has a significant impact, as the beta of underperforming stocks should increase.
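Putting Eqs. (4)-(6) and (11)-(13) together, a rough implementation of the reactive beta estimate might look as follows. This is a sketch only: λ_s, λ_f and ℓ are placeholder values, the regression of Eq. (12) is reduced to a plain OLS slope, and ℓ_i = ℓ is assumed so that the systematic leverage factors cancel as in Eq. (13).

```python
import numpy as np

def ema(x, lam):
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = (1 - lam) * y[t - 1] + lam * x[t]
    return y

def reactive_beta(S, I, lam_s=0.02, lam_f=0.14, ell=8.0):
    """Reactive beta of Eq. (13): beta_tilde * (L_is * I) / (L_s * S)."""
    Ls, Lf, Lis = ema(I, lam_s), ema(I, lam_f), ema(S, lam_s)
    panic = 1 + ell * (Lf - I) / Lf               # systematic leverage factor
    L = Ls * panic                                # Eq. (4)
    Li = Lis * panic                              # Eq. (5), assuming ell_i = ell
    rI = np.diff(I) / L[:-1]                      # renormalized returns, Eq. (6)
    ri = np.diff(S) / Li[:-1]
    beta_tilde = np.dot(ri, rI) / np.dot(rI, rI)  # OLS slope of Eq. (12)
    return beta_tilde * (Lis[-1] * I[-1]) / (Ls[-1] * S[-1])

# synthetic stock with true beta 1.5 against a random-walk index
rng = np.random.default_rng(1)
rI = 0.001 * rng.standard_normal(5000)
I = 100 * np.cumprod(1 + rI)
S = 50 * np.cumprod(1 + 1.5 * rI + 0.001 * rng.standard_normal(5000))
print(reactive_beta(S, I))   # close to the true beta of 1.5
```

On calm synthetic data the reactive correction is nearly neutral and the estimate reverts to the OLS beta; the correction matters precisely when prices move away from their slow levels.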
2.2.1 The systematic leverage effect ℓ′ for single stocks

We use the term systematic leverage effect to denote the negative relation between systematic returns and the risk (here, the correlation), where the systematic returns are the nonspecific part of the returns (stock index performance). The systematic leverage effect on the correlation follows the same dynamics as the systematic leverage effect introduced in the reactive volatility model (the phenomenon's duration is approximately 7 days for λ_f). All correlations are impacted together in the same way by the systematic leverage effect, and single stocks and their stock indexes should also shift in the same direction. This explains why the stock's beta will not change with respect to the index. The implication is that betas are not very sensitive to the systematic leverage effect, in contrast to the specific leverage effect.

We consider the impact of the short-term systematic leverage effect on the correlation. Assuming that the correlation between each individual stock and the stock index is the same for all stocks, one can define the implied correlation as:

ρ(t) = [σ_I²(t) − Σ_i w_i² σ_i²(t)] / [Σ_{i≠j} w_i w_j σ_i(t) σ_j(t)],   (14)

where w_i represents the weight of stock i in the index. Denoting

e_I(t) = L̂_s(t)/I(t) − 1,   e_i(t) = L̂_is(t)/S_i(t) − 1,   (15)
we use Eqs. (9, 10) to obtain:

ρ = { σ̃_I² (1+e_I)² [1 + ℓ (L_f−I)/L_f]² − [1 + ℓ′ (L_f−I)/L_f]² Σ_i w_i² (1+e_i)² σ̃_i² } / { [1 + ℓ′ (L_f−I)/L_f]² Σ_{i≠j} w_i w_j σ̃_i σ̃_j (1+e_i)(1+e_j) }.   (16)

If the weights w_i are small, we can ignore the second term of the numerator; in addition, if the e_i are small, then

Σ_{i≠j} w_i w_j σ̃_i σ̃_j (1+e_i)(1+e_j) ≈ (1+e_I)² σ̃²,

where σ̃ is an average of the σ̃_i. Keeping only the leading terms of the expansion in the small parameter (L_f − I)/L_f, one thus obtains

ρ ≈ (σ̃_I²/σ̃²) [1 + 2(ℓ − ℓ′)(L_f − I)/L_f].   (17)

This relation shows the dynamics of the implied correlation ρ induced by the leverage effect (accounted for through the factor (L_f − I)/L_f). We assume that the same dynamics are applicable to correlations between individual stocks, i.e.,

ρ_{i,j} = ρ̃_{i,j} [1 + 2(ℓ − ℓ′)(L_f − I)/L_f],   (18)

where the ρ̃_{i,j} are parameters specific to each pair of stocks i and j. From this relation, we derive a measure of the correlation, accounting for the leverage effect, between the single stock i and the stock index:

ρ_i = ρ̃_i [1 + (ℓ − ℓ′)(L_f − I)/L_f],   (19)

where the ρ̃_i are parameters specific to each stock i. Note that there is no factor 2 in front of (ℓ − ℓ′) in Eq. (19) because we have a one-factor model here. We use Eq. (19) in the reactive beta model (see Eqs. (34, 36) below) to take into account the varying nature of the correlation in the regression. We rescale the measurement by the normalization factor 1 + (ℓ − ℓ′)(L_f − I)/L_f and then recover the variation of the correlation through the denormalization factor 1/[1 + (ℓ − ℓ′)(L_f − I)/L_f]. We emphasize that the parameter ℓ in Eq. (4), which quantifies the systematic leverage for the stock index, is slightly different from the parameter ℓ′ in Eq. (5), which quantifies the systematic leverage for single stocks. According to Eq. (18), when the market decreases, correlations between stocks increase, as ℓ > ℓ′, and therefore the stock index volatility increases more than the single-stock volatility: δ(σ_i/σ_I) < 0. Once again, the beta is, in contrast to the correlation, weakly impacted by the systematic leverage effect, as all correlations increase in the same way.

Figure 1: Daily variations of the CBOE S&P 500 Implied Correlation Indices (ICI) since their inception, divided by their mean, versus daily variations of the leverage factor (L_f − I)/L_f. A linear regression (solid line) yields the coefficient 2(ℓ − ℓ′). Period: 2007-2015.

More precisely, this means that the impact of the increase in correlation on the beta measurement is compensated by a decrease in the relative volatility, δ(σ_i/σ_I) < 0, i.e., the single-stock volatility increases less than the stock index volatility. For this reason, the reactive beta model in Eqs. (34, 36) is not very sensitive to the choice of ℓ′. Nevertheless, we explain in this section how ℓ′ is calibrated using the implied correlation index. We measure the level of the systematic leverage effect ℓ′ for a single stock by regressing Eq. (17) with data from the market-implied correlation S&P 500 index. Figure 1 illustrates the slope of this regression. By regressing ρ/ρ̃ against (L_f − I)/L_f, where ρ̃ is the average of ρ, we deduce the empirical estimate

ℓ − ℓ′ = 0. ± 0. ,   (20)

with a statistically significant t-statistic. Since ℓ − ℓ′ ≪ ℓ (= 8), we deduce an important result, namely, that the systematic leverage impact on the correlation is more than 8 times smaller than the systematic leverage impact on the volatility.
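The implied correlation of Eq. (14) can be checked numerically: if the index variance is built from a common pairwise correlation, the formula recovers that correlation exactly. The helper below is a sketch; names and sample values are ours:

```python
import numpy as np

def implied_correlation(sigma_I, w, sigma):
    """Implied average correlation, Eq. (14).

    Uses the identity sum_{i != j} w_i w_j s_i s_j
    = (sum_i w_i s_i)^2 - sum_i w_i^2 s_i^2.
    """
    w, sigma = np.asarray(w, float), np.asarray(sigma, float)
    diag = np.sum(w**2 * sigma**2)
    off = np.sum(w * sigma)**2 - diag
    return (sigma_I**2 - diag) / off

# Consistency check: build the index variance from a common correlation of 0.4
w = np.array([0.3, 0.3, 0.4])
s = np.array([0.25, 0.30, 0.20])
var_I = np.sum(w**2 * s**2) + 0.4 * (np.sum(w * s)**2 - np.sum(w**2 * s**2))
print(implied_correlation(np.sqrt(var_I), w, s))  # 0.4 up to rounding
```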
The main consequence is that, although it is statistically significant, the leverage effect is not a major component of the correlation.

2.2.2 The systematic leverage effect component in the reactive model

As discussed above, the correlation increases when the stock index price decreases. This effect could generate a bias in the beta measurement, as stock index prices could fluctuate in the sample used to measure the slope. Our solution is to adjust the beta between renormalized returns through the correction factor L(t), defined as

L(t) = 1 + (ℓ − ℓ′) [L_f(t−1) − I(t−1)] / L_f(t−1).   (21)

The correction factor L(t) should be used to estimate the slope between the stock index and single-stock returns and should then be used to denormalize the slope to obtain the reactive beta, which depends directly on L(t).

We now consider the relative volatility, σ̃_i/σ̃_I, as an explanatory variable of β̃_i, because β̃_i is expected to be constant if the ratio σ̃_i/σ̃_I is constant. However, empirically, the ratio σ̃_i/σ̃_I can change dramatically between periods of high dispersion (i.e., when stocks are, on average, weakly correlated) and low systematic risk (i.e., when stock indexes are not stressed) and periods of low dispersion and high systematic risk. Figure 2 illustrates, for both European and US markets, that the dispersion among stocks decreases, on average, when markets become volatile. A linear regression of rescaled daily variations of σ̃_i yields:

δσ̃_i(t)/σ̃_i(t−1) ≈ c δσ̃_I(t)/σ̃_I(t−1) + ε_i,   (22)

where c < 1 denotes the fitted regression coefficient and ε_i is the residual (specific) noise. Using the standard rules for infinitesimal increments, we find from this regression the following:

δ(σ̃_i/σ̃_I) ≃ δσ̃_i/σ̃_I − σ̃_i δσ̃_I/σ̃_I² = (σ̃_i/σ̃_I)(δσ̃_i/σ̃_i − δσ̃_I/σ̃_I) ≃ (c − 1)(σ̃_i/σ̃_I) δσ̃_I/σ̃_I,   (23)

i.e., the relative volatility σ̃_i/σ̃_I is relatively stable, but its small variations can still impact the beta estimation. This empirical relation shows that when there is a volatility shock in the market, the stock index volatility increases much faster than the average single-stock volatility.

Because we want to take into account the impact of the relative volatility change on the beta measurement, we introduce the beta elasticity as the derivative of the beta with respect to the logarithm of the squared relative volatility:

f(β̃_i) = dβ̃_i / d ln[(σ̃_i/σ̃_I)²] = [dβ̃_i / d(σ̃_i/σ̃_I)] σ̃_i/(2σ̃_I).   (24)

Figure 2: Normalized daily variations of σ̃_i, δσ̃_i/σ̃_i = [σ̃_i(t) − σ̃_i(t−1)]/σ̃_i(t−1), versus normalized daily variations of σ̃_I, δσ̃_I/σ̃_I = [σ̃_I(t) − σ̃_I(t−1)]/σ̃_I(t−1), for the European market (blue crosses) and the US market (red pluses). The two gray lines show the linear regressions of the two datasets for the European and US markets, respectively. The time frame includes observations from the technology bubble burst, the U.S. subprime crisis, and the Euro debt crisis. Period: 1998-2015.

We expect that f(β̃_i) is positive and increasing with β̃_i. Indeed, we expect that a stock with a low beta should have a stable beta (less sensitive to its relative volatility increase), as the increase in this case is most likely due to a specific risk increase. In such a case, the sensitivity of the beta to the relative volatility is weak.
In the opposite case of a high beta, a stock that is highly sensitive to the stock index will face a beta decline as soon as its relative volatility decreases. Consequently, when there is a volatility shock in the market, δ(σ̃_i/σ̃_I) is negative, and therefore the beta of stocks with a high beta and a high f is reduced. In turn, stocks with a low beta are less impacted, because f is smaller and δ(σ̃_i/σ̃_I) is expected to be less negative.

When the correlation of the stock with the stock index is constant, we can use a linear model: f(β̃_i) = β̃_i/2. In fact, using the relation β̃_i = ρ̃_i σ̃_i/σ̃_I and the assumption that ρ̃_i is constant (i.e., it does not depend on σ̃_i/σ̃_I), one obtains from Eq. (24) f = ρ̃_i σ̃_i/(2σ̃_I) = β̃_i/2. In general, however, the correlation can depend on the relative volatility, and thus the function f may be more complicated. To estimate f, one needs the renormalized beta and the relative volatility. For a better estimation, we aim to reduce the heteroscedasticity even further by using an exponential moving regression of the returns r̃_i and r̃_I that are renormalized by the estimated normalized index volatility σ̃_I. We denote these renormalized returns as:

r̂_i(t) = r̃_i(t)/σ̃_I(t−1),   r̂_I(t) = r̃_I(t)/σ̃_I(t−1).   (25)

Computing the EMAs,

φ̂_i(t) = (1 − λ_β) φ̂_i(t−1) + λ_β r̂_i(t) r̂_I(t),   (26)

σ̂_I²(t) = (1 − λ_β) σ̂_I²(t−1) + λ_β [r̂_I(t)]²,   (27)

with λ_β = 1/90, we estimate the beta as:

β̂_i(t) = φ̂_i(t)/σ̂_I²(t).   (28)

Here, φ̂_i is an estimation of the covariance between stock index returns and single-stock returns that includes two normalizations: the levels L_i and L from the reactive volatility model, and σ̃_I to further reduce heteroscedasticity. We write β̂_i instead of β̃_i to stress this particular way of estimating the beta. Similarly, the hat symbol in Eq. (27) is used to distinguish σ̂_I(t), computed with renormalized index returns, from σ̃_I(t). In principle, the above estimate β̂ could be directly regressed on the ratio of the earlier estimates of σ̃_i and σ̃_I from Eqs. (7). However, to use the normalization by σ̃_I consistently, we consider the ratio of these volatilities obtained in the renormalized form, i.e., σ̂_i(t)/σ̂_I(t), where σ̂_I(t) is given in Eq. (27), and

σ̂_i²(t) = (1 − λ_β) σ̂_i²(t−1) + λ_β [r̂_i(t)]².   (29)

Figure 3: Relation between the beta β̂_i and the doubled logarithm of the relative volatility, 2 ln(σ̂_i/σ̂_I), from which the mean values ⟨β̂_i⟩ and 2 ln(⟨σ̂_i/σ̂_I⟩) were subtracted (the mean is obtained by averaging over time for each i). A linear regression is shown by the solid line. For better visualization, only 10,000 randomly selected points are shown (by circles) among 271,958 points from the European dataset. Period: 2014-2015.

Figure 3 illustrates the sensitivity of the beta to relative volatilities by plotting β̂_i(t) from Eq. (28) versus 2 ln(σ̂_i(t)/σ̂_I(t)) for all stocks i and times t from 2000 to 2015, although we only display the time frame of 2014-2015 for clarity of illustration. On both axes, we subtract the mean values ⟨β̂_i⟩ and 2 ln(⟨σ̂_i/σ̂_I⟩) averaged over all times in the whole sample. This plot enables us to measure the average of the f(β̃_i) in Eq. (24). To obtain the dependence of f on the beta, we estimate the slope between β̂_i(t) − ⟨β̂_i⟩ from Eq. (28) and 2 ln(σ̂_i(t)/σ̂_I(t)) − 2 ln(⟨σ̂_i/σ̂_I⟩) locally around each value of β̂_i. For this purpose, we sort all collected values of β̂_i and group them into successive subsets, each with 10,000 points. In each subset, we estimate this slope by a standard linear regression over the 10,000 points. This regression yields the value of f for that subset, which corresponds to some average value of β̂_i. Repeating this procedure over all subsets, we obtain the dependence of f on β̂_i, which is plotted in Figure 4. We show that f increases with the beta.
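The constant-correlation case of Eq. (24) can be verified numerically: writing u = ln[(σ̃_i/σ̃_I)²], a constant correlation ρ̃_i gives β̃_i(u) = ρ̃_i e^{u/2}, whose derivative is β̃_i/2. The values below are arbitrary illustrations:

```python
import numpy as np

rho = 0.5          # constant correlation rho_i (arbitrary illustrative value)
x = 1.6            # relative volatility sigma_i / sigma_I
u = np.log(x**2)   # logarithm of the squared relative volatility

def beta(u):
    # beta = rho * (sigma_i / sigma_I) = rho * exp(u / 2) under constant correlation
    return rho * np.exp(u / 2)

h = 1e-6
f_numeric = (beta(u + h) - beta(u - h)) / (2 * h)  # Eq. (24) by central difference
print(f_numeric, beta(u) / 2)  # both ~0.4, i.e., f = beta / 2
```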
For both European and US markets, we propose the following approximation of the function f with three different regimes:

f(β̃_i) = 0 for β̃_i < β₋;  f(β̃_i) = 0.6 (β̃_i − β₋) for β₋ < β̃_i < β₊;  f(β̃_i) = 0.6 (β₊ − β₋) for β̃_i > β₊,   (30)

where β₋ < β₊ are fitted thresholds. In the first regime, for low-beta stocks (mostly quality and value stocks), the beta elasticity is zero, which is equivalent to the constant beta case. In the intermediate regime, the elasticity increases linearly with β̃_i and is close to the constant correlation case with f(β̃_i) = β̃_i/2. In the third regime, for high-beta stocks (mostly speculative and growth stocks), the elasticity is constant. The shape of the beta elasticity is similar for the European market and the US market.

Figure 4: The function f from Eq. (24) versus the beta for the European market (blue crosses) and the US market (red pluses). This function is estimated locally for 4 different time periods. The solid black line shows the approximation (30). Period: 2000-2015.

2.3.2 The nonlinear beta elasticity component in the reactive model

According to Eq. (30), the sensitivity of the normalized beta to changes in the relative volatility is nonlinear. This elasticity could generate a bias in the beta estimation if the relative volatility changes in a sample used to measure the slope. Our solution is to adjust the beta between normalized returns through the correction factor F(t), defined as:

F(t) = 1 + 2 [f(β̃_i(t))/β̃_i(t)] Δ(σ̃_i/σ̃_I).   (31)

The function f is approximated by Eq. (30), ℓ − ℓ′ is given by Eq. (20), and

Δ(σ̃_i/σ̃_I) = [σ̃_i(t−1)/σ̃_I(t−1) − √κ_i(t−1)] / √κ_i(t−1),   (32)

with

κ_i(t) = (1 − λ_β) κ_i(t−1) + λ_β [σ̃_i(t)/σ̃_I(t)]²   (33)

being the EMA of the squared relative volatility (σ̃_i/σ̃_I)². The quantity Δ(σ̃_i/σ̃_I) quantifies deviations of the relative volatility from its average over the sample that will be used to estimate the beta. The correction factor F(t) should be used to estimate the slope between stock index and single-stock returns and should then be used to denormalize the slope to obtain the reactive beta, which depends directly on F(t).

In this section, we recapitulate the reactive beta model, which combines the three independent components described in the previous sections: the specific leverage effect on the beta, the systematic leverage effect on the correlation, and the relation between the relative volatility and the beta. Starting with the time series I(t) and S_i(t) for the stock index and individual stocks, one computes the levels L_f(t), L(t), and L_i(t) from Eqs. (2, 4, 5); the normalized stock index and individual stock returns r̃_I(t) and r̃_i(t) from Eqs. (6); the normalized stock index volatility σ̃_I(t) from Eq. (7); the renormalized stock index and individual stock returns r̂_I(t) and r̂_i(t) from Eq. (25); the associated volatilities σ̂_I(t) and σ̂_i(t) from Eqs. (27, 29); and the renormalized beta β̂_i(t) from Eq. (28). From these quantities, one re-evaluates the covariance between r̂_i and r̂_I by accounting for the leverage effects and excluding the other effects. In fact, we compute Φ̂_i(t) as an EMA of the normalized covariance of the normalized daily returns:

Φ̂_i(t) = (1 − λ_β) Φ̂_i(t−1) + λ_β r̂_i(t) r̂_I(t) / [L(t) F(t)],   (34)

where L(t) and F(t) are the two correction factors defined in Eq. (21) and Eq. (31), used to remove the biases from the systematic leverage effect and the beta elasticity. The parameter λ_β describes the look-back used to estimate the slope and is set to 1/90, as 90 days of look-back appears to us to be a good compromise. In fact, for a longer look-back, variations in beta, correlation, and volatilities are expected to occur due to changes in market stress and business cycles, and they are not taken into account properly by our reactive renormalization. In turn, for a shorter look-back, the statistical noise of the slope would be too high.

Finally, the stable estimate of the normalized beta is

β̃_i(t) = Φ̂_i(t)/σ̂_I²(t),   (35)

with σ̂_I²(t) defined in Eq. (27), from which the estimated reactive beta of stock i is deduced as

β_i(t) = β̃_i(t) [L_i(t) I(t) / (S_i(t) L(t))] L(t) F(t).   (36)

The estimation of the normalized stable beta β̃_i(t) is close to the slope estimated by an OLS, but with exponentially decaying weights to accentuate recent returns and with normalized returns to remove the different biases. Then, the normalized stable beta β̃_i(t) is "denormalized" by the factor that combines the three main components: the specific leverage effect on the beta, (L_i/S_i)(I/L); the systematic leverage effect, L(t); and the nonlinear beta elasticity, F(t). The final beta estimation β_i(t) is a reactive, dynamic, conditional estimation that is adjusted as soon as prices move, through the instantaneous variations of the three correction factors. Every term impacts the hedging of a certain strategy:

• the term with L(t) does not have a significant impact on the beta, as it is compensated in L_i/L, which models the short-term systematic leverage effect on the correlation in Eqs. (34, 36) (introduced in Sec. 2.2), whereas the levels L_i and L were introduced in the reactive volatility model. It could nevertheless have a noticeable impact on the correlation when the market decreases.
• the term with L_i I/(L S_i), which models the specific leverage effect on volatilities (introduced in Sec. 2.1.2), could impact the beta by 1% if the stock underperforms by 1%. This term impacts the hedging of the short-term reversal strategy.
• the term with F(t), which models the nonlinear beta elasticity, i.e., the sensitivity of the beta to the relative volatility (introduced in Sec. 2.3), could impact the beta as soon as the relative volatility changes. This term impacts the hedging of the low volatility strategy.

The reduced version of the reactive beta model, when only the leverage effect is introduced, without beta elasticity and stochastic normalized volatilities, defines an interesting class of stochastic processes that appears to be mean reverting, with a standard deviation linked to σ̃_i and λ_s and a relaxation time linked to 1/λ_s.

The reactive beta model is based on the fit of several well-identified effects. The implied parameters work universally for all stock markets (ℓ − ℓ′ is the only one that was fitted only on the US market, as implied correlations for other countries are not traded). We assumed that the average of daily returns is zero; that assumption makes sense because, at a daily time scale, the average of returns is completely negligible compared to the standard deviation. Here, we summarize the different parameters used in the reactive beta model:

• λ_f, which describes the relaxation time of 7 days for the panic effect;
• λ_s, which describes the relaxation time of 40 days for the retarded effect;
• ℓ = 8, which describes the leverage intensity of the panic effect;
• ℓ − ℓ′, based on implied correlations on the US stock market (Eq. (20));
• the different thresholds in the function f(β̃_i) from Eq. (30), which describes the nonlinear beta elasticity.

We used only daily returns. For the empirical calibration of ℓ − ℓ′, we chose the CBOE S&P 500 Implied Correlation Index (ICI), which is the first widely disseminated market-based estimate of the implied average correlation of the stocks that comprise the S&P 500 Index (SPX). This index begins in July 2009, with historical data going back to 2007. We take the front-month correlation index data from 2007 and roll it to the next contract until the previous one expires. We also use the daily S&P 500 stock index. For the empirical calibration of the other parameters of the reactive beta model, we use the daily S&P 500 stock index and the 600 largest US stocks from January 1, 2000, to May 31, 2015. For the European market, we consider the EuroStoxx50 index and the 600 largest European stocks over the same period. The same data are used for both the calibration of parameters and the empirical tests. For consistency, we kept the parameters of the reactive volatility model that describe the intensity of the panic effect (ℓ), the relaxation time of the panic effect (λ_f) and the relaxation time of the retarded effect (λ_s) identical to those calibrated in a period prior to 2000 by Bouchaud et al. (2001), since they are seen as being universal.
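One update step of the estimation in Eqs. (26)-(36) can be sketched as follows. This is an illustrative reading of the model, not the authors' code: the thresholds and slope in `f_elasticity`, the value of ℓ − ℓ′, and all numeric inputs are placeholder values, and a strictly positive beta is assumed in `F_correction`.

```python
LAM = 1.0 / 90  # lambda_beta, i.e., a 90-day look-back

def f_elasticity(beta: float, b_lo: float = 0.5, b_hi: float = 1.5,
                 slope: float = 0.6) -> float:
    """Piecewise beta elasticity, Eq. (30); thresholds are illustrative."""
    if beta < b_lo:
        return 0.0                              # first regime: zero elasticity
    return slope * (min(beta, b_hi) - b_lo)     # linear, then constant regime

def L_correction(l_minus_lp: float, L_f: float, I: float) -> float:
    """Systematic-leverage correction factor, Eq. (21)."""
    return 1.0 + l_minus_lp * (L_f - I) / L_f

def F_correction(beta: float, delta_rel_vol: float) -> float:
    """Beta-elasticity correction factor, Eq. (31); assumes beta > 0."""
    return 1.0 + 2.0 * f_elasticity(beta) / beta * delta_rel_vol

def ema_step(phi, var_I, r_i_hat, r_I_hat, L_t, F_t):
    """One update of Eqs. (34) and (27); returns the stable beta of Eq. (35)."""
    phi = (1 - LAM) * phi + LAM * r_i_hat * r_I_hat / (L_t * F_t)
    var_I = (1 - LAM) * var_I + LAM * r_I_hat**2
    return phi, var_I, phi / var_I

# Illustrative step: no leverage state (I at its level L_f) and a relative
# volatility 5% above its EMA average for a beta-1.2 stock.
L_t = L_correction(l_minus_lp=0.5, L_f=100.0, I=100.0)   # equals 1.0 here
F_t = F_correction(beta=1.2, delta_rel_vol=0.05)
phi, var_I, beta_tilde = ema_step(1.2e-4, 1.0e-4, 0.012, 0.010, L_t, F_t)
```

The denormalization of Eq. (36) then multiplies `beta_tilde` by the instantaneous factors (L_i I)/(S_i L), L(t) and F(t), mirroring the division applied inside the EMA.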
In this section, we show that exposure to common risk factors can sometimes lead to a high exposure of market neutral funds to the stock market index if the betas are not correctly assessed. Indeed, although market neutral funds should be orthogonal to traditional asset classes, this is not always the case during extreme moves (Fung and Hsieh, 1997). For instance, Patton (2009) tests zero correlation against nonzero correlation and finds that approximately 25% of market neutral funds exhibit significant non-neutrality, concluding that "many market neutral hedge funds are in fact not market neutral, but overall they are, at least, more market neutral than other categories of hedge funds." The reactive beta model can help hedge funds be more market neutral than others. To demonstrate this, we empirically test the efficiency of hedging using the most popular market neutral strategies (low volatility, short-term reversal, momentum and size):

• low volatility (beta) strategy: buying the stocks with the highest beta and shorting those with the lowest beta (estimated by the OLS);
• short-term reversal strategy: shorting the stocks with the highest one-month returns and buying those with the lowest one-month returns;
• momentum strategy: buying the stocks with the highest two-year returns and shorting those with the lowest two-year returns;
• size strategy: buying the stocks with the highest capitalization and shorting those with the lowest capitalization.

The construction of the four most popular strategies that target beta neutrality is explained in Appendix B. The different portfolios are dynamic. The efficiency of the hedge depends on the accuracy of the beta estimation.
For each strategy, we compare two different methods of estimating the beta that use only past information to avoid look-ahead bias: ordinary least squares (OLS), which corresponds to a specific case of our model with L_i = S_i, L = I, ℓ = ℓ′ = 0, and f = 0, with the same exponential weighting scheme, and our reactive method. We analyze two statistics:

• Statistic 1: the CorSTD, defined as the standard deviation of the 90-day correlation of the strategy with the stock index returns, describes the lack of robustness of the hedge and, consequently, the inefficiency of the beta measurement. The more robust the strategy is, the lower the CorSTD statistic. If the strategy were well hedged, the correlation would fluctuate around 0 within the theoretical standard deviation, and the CorSTD would be close to the value obtained with two independent Gaussian variables for 90-day correlations, approximately 1/√90 ≈ 10.5%.
• Statistic 2: the Bias, defined as the correlation of the strategy with the stock index returns over the whole period, describes the bias in the hedge of the strategy and, therefore, the bias of the beta measurement.

These statistics provide a proxy for assessing the quality of the beta measurement, which is very difficult to assess directly, as the true betas are not known.

Table 1 summarizes the statistics of the four strategies for the US and Europe markets. We see the highest bias for the low volatility strategy when hedged with the standard approach (−25.54% for the USA and −22.39% for Europe). The CorSTD is approximately 20%, i.e., twice as high as expected if the volatility were stable, which means that the efficiency of the hedge is time-varying. This could represent an important risk for funds-of-funds managers, where hidden risk could accumulate and arise especially when the market is stressed. Indeed, the bias seems to have been substantially larger in magnitude for both the USA and Europe when the market was stressed in 2008.
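The two statistics can be computed as below. The synthetic data are illustrative: for an independent (perfectly hedged) strategy, the Bias should be near 0 and the CorSTD near 1/√90 ≈ 10.5%. Non-overlapping 90-day windows are used here for simplicity.

```python
import numpy as np

def hedge_stats(strategy, index, window=90):
    """Bias (Statistic 2): full-sample correlation with the index.
    CorSTD (Statistic 1): standard deviation of the 90-day correlations,
    computed here on non-overlapping windows."""
    bias = np.corrcoef(strategy, index)[0, 1]
    cors = [np.corrcoef(strategy[t:t + window], index[t:t + window])[0, 1]
            for t in range(0, len(strategy) - window + 1, window)]
    return bias, float(np.std(cors))

rng = np.random.default_rng(42)
index = rng.normal(0.0, 0.01, 3600)
neutral = rng.normal(0.0, 0.01, 3600)  # an independent, i.e., well-hedged, strategy
bias, corstd = hedge_stats(neutral, index)
print(bias, corstd)
```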
The use of the reactive beta model reduces the bias in the low volatility factor, and the residual bias comes from the selection bias (see Appendix A). When using the OLS, the possible loss in 2008 would have been sizable during the stock decline for a fund invested entirely in the low volatility anomaly, given the bias of the hedge and the target annualized volatilities of the fund and the index.

We also see a significant bias for the short-term reversal strategy when hedged with the standard approach (approximately 13.09% in the USA and 13.05% in Europe), with a CorSTD of approximately 19%. The efficiency of the hedge depends on the recent past performance of the strategy: as soon as the strategy starts to lose, the efficiency declines and risk arises, as in 2009. Again, we see that the reactive beta model reduces the bias in the short-term reversal factor. The biases and CorSTD are lower for the momentum strategy (−6.27% in the USA, with a CorSTD of 18.28%) and are of the same magnitude for the size strategy (−7.56% in the USA, with a CorSTD of 17.00%). The reactive beta model further reduces the bias and the CorSTD. This is also valid for the European market.

We conclude that the reactive beta model reduces the bias of the low volatility factor when it is stressed by the market. The remaining residual is most likely explained by the selection bias (see Appendix A for a formal proof). The improvement is more significant for the momentum factor and the size factor in the U.S. only.

We also illustrate these findings by presenting the correlation between the stock index and the low volatility strategy (Figure 5) and the short-term reversal strategy (Figure 6), which are the strategies with the highest bias. A period surrounding the financial crisis was chosen (2007-2010). One can see that the beta computed by the OLS is highly positively exposed to the stock index in 2008. In turn, the exposure is reduced within the reactive model.
The improvement becomes even more impressive in extreme cases, when the strategies are stressed by the market. We see that in some extreme cases (a stress period with extreme strategies), the common approach could generate high biases, as for the short-term reversal strategies in 2008-2009 and the beta strategy in 2008. In each case, our methodology allows one to significantly reduce the bias.
Figure 5: Ninety-day correlation of the low volatility factor with the stock index in the European market (a) and in the US market (b). Solid and dashed lines present the proposed reactive beta model and the OLS methodology, respectively. The dotted horizontal line shows the selection bias derived in Appendix A. A time frame surrounding the financial crisis is chosen. Period: 2007-2010.
Figure 6: Ninety-day correlation of the short-term reversal factor with the stock index in the European market (a) and in the US market (b). Solid and dashed lines present the proposed reactive beta model and the OLS methodology, respectively. A time frame surrounding the financial crisis is chosen. Period: 2007-2010.

Strategy \ Method        OLS Bias   OLS CorSTD   Reactive Bias   Reactive CorSTD
US
  low volatility         -25.54%    21.73%       -16.79%         21.43%
  short-term reversal     13.09%    18.96%        -6.06%         18.50%
  momentum                -6.27%    18.28%        -2.95%         16.54%
  size                    -7.56%    17.00%        -1.84%         17.26%
Europe
  low volatility         -22.39%    19.97%       -14.68%         20.94%
  short-term reversal     13.05%    17.51%         0.64%         14.52%
  momentum                -4.42%    18.03%        -1.55%         17.23%
  size                     3.12%    17.15%         3.79%         15.63%

Table 1: Bias is defined as the correlation over the whole sample between the stock index and each of the OLS and Reactive strategies for the US and Europe markets. CorSTD is defined as the standard deviation of the 90-day correlation over the whole sample between the stock index and each of the OLS and Reactive strategies for the US and Europe markets. The residual bias for the low volatility strategy in the reactive method can be explained by the selection bias, as demonstrated in Appendix A. Period: 2000-2015.

This section presents a robustness check analysis comparing the quality of several methods of beta measurement against the reactive beta model. We build the comparative analysis on two important articles. Chan and Lakonishok (1992) enables the assessment of the robustness statistics of some alternatives to the classical ordinary least squares (OLS) method under the implicit assumption that betas are static and returns are homoscedastic. This section extends their work by including alternative dynamic beta estimators, to be consistent with our reactive model and with the work of Engle (2016), which demonstrates, using dynamic conditional betas, that betas are significantly time-varying.
The models and the methods are presented in detail in Appendix C.
In financial research, one often resorts to simulated data to estimate measurement error. For instance, Chan and Lakonishok (1992) built their main results on numerical simulation while applying real data for a simple comparison between betas estimated with OLS and quantile regression (QR).

The comparative analysis is based on a two-step procedure. The first step simulates returns using different models that capture some market patterns, and the second step estimates the beta from the simulated returns by using our reactive method and alternative methods. We tested the same estimators as used by Chan and Lakonishok (1992), including OLS, the minimum absolute deviation (ABSD), and the Trimean quantile regression (TRM). We also added two variations of the dynamic conditional correlation (DCC) model, which has become a mainstream model for measuring the conditional beta when the beta is stochastic (Bollerslev et al., 1988; Bollerslev, 1990; Engle, 2002; Cappiello et al., 2006). We analyze the measurement error, which we define as the difference between the measured beta and the true beta of the simulated data.

4.1.1 The first step: simulation

The first step simulates 30,000 paths of T = 1,000 consecutive returns for both the stock index (r_I) and the single stock (r_i). It also allows one to generate 1,000 conditional "true" expected betas per path (Fig. 7). To that end, following Chan and Lakonishok (1992), normally distributed residuals and Student-t distributed residuals are considered, in order to assess the robustness of the different methods to outliers.

In our setting, we implemented seven Monte Carlo simulations for the returns r_i and r_I. In the simulations, we target the realistic case of an unconditional single-stock annualized volatility of 40%, an unconditional stock index volatility of 15% and an unconditional beta of 1.
We also target the realistic case of a correlation between the index and the stock of 0.4, since the relative precision of the beta measurement is inversely proportional to the square root of the number of returns when the correlation is close to zero. First, we consider the naive version of the market model, based on Eq. (11), which we call "the basic market model". For the case of a constant beta, as in the paper by Chan and Lakonishok (1992), the simulated data are based on the hypothesis of a null intercept, and the beta is set equal to 1 to characterize the ideal case, with a Gaussian (MC1) or a Student-t distribution (MC2) for the residuals. In the simplest reactive version of the market model, which we call "the reduced reactive market model" (MC3 and MC4), normalized returns ˜r_i and ˜r_I are first generated randomly through Eq. (12) with a normalized beta set to 1. Then, based on the levels L_s and L_is, which are respectively the slow moving averages of the stock index and the stock prices defined in Eq. (1), we generate δI and δS defined in Eq. (6) and then r_i and r_I. Finally, we update L_s and L_is. This model is sufficient to capture the leverage effect on beta, with the beta increasing as soon as a single stock underperforms the stock index. Even if the normalized beta is set to unity, the denormalized beta in Eq. (13) becomes time dependent (Fig. 7). As previously, MC3 and MC4 differ by the distribution of the residuals: Gaussian (MC3) versus Student-t (MC4).

The full reactive market model (MC5) includes all the components described in Sec. 2, i.e., the leverage effect and the nonlinear beta elasticity. For the full version, we generated stochastic ˜σ_i and ˜σ_I, which generate ˜r_i and ˜r_I from Eq. (12), using the normalized beta fixed to F(t)L(t) (see definitions in Eqs. (31) and (21)).
This allows the generation of returns that capture the leverage effect pattern and the empirical nonlinear beta elasticity (Fig. 3 and Fig. 4).

We also used another way to generate random returns that captures a time-varying beta, through the implementation of the dynamic conditional correlation (DCC) model (Engle, 2002, 2016), which generalizes the GARCH(1,1) process to two dimensions. This is a mainstream model that has two variations, symmetric and asymmetric, the latter capturing the leverage effect. The symmetric and asymmetric versions of the DCC model are denoted MC6 and MC7, respectively.

To summarize, the seven Monte Carlo simulations are the following:

• MC 1: The basic market model in Eq. (12), where the residuals (ε_i) are normally distributed and the constant beta is set to 1.

• MC 2: The basic market model in Eq. (12), where the residuals (ε_i) follow a Student-t distribution (with three degrees of freedom) and the constant beta is set to 1.

• MC 3: The reduced reactive market model in Eq. (12), where the residuals (˜ε_i) are normally distributed with constant volatilities (˜σ_i, ˜σ_I) and the constant renormalized beta (˜β) is set to 1, but the denormalized beta is now time-dependent (Fig. 7). The conditional beta (β) is a mean-reverting process with a relaxation time 1/λ_s = 40 days. MC3 includes only the leverage effect and ignores the nonlinear beta elasticity.

• MC 4: The reduced reactive market model in Eq. (12), where the residuals (˜ε_i) follow a Student-t distribution (with three degrees of freedom) with constant relative volatility and a constant renormalized beta set to 1, as in MC3.

• MC 5: The full reactive market model in Eq. (12), where the residuals (˜ε_i) follow a Student-t distribution (with three degrees of freedom) whose standard deviation (s_i) is stochastic, and where the normalized stock index return (˜r_I) is Gaussian with a standard deviation (s_I) that is also stochastic.
We suppose that log(s_I) and log(s_i) − log(s_I) follow two independent Ornstein-Uhlenbeck processes (with a relaxation time of 100 days and a volatility of volatility of 0.04). In that way, the stock index annualized volatility can jump up to 40%. The normalized beta that was set to 1 in MC4 is now set to F(t)L(t) to take into account the nonlinear beta elasticity (see definitions in Eqs. (31) and (21)). Both the leverage effect and the stochastic normalized volatilities make the volatilities and the beta defined in Eq. (36) time-dependent (Fig. 7).

• MC 6: The symmetric DCC model in two dimensions, which generates volatilities of volatilities and a correlation of similar amplitude as MC5 (Fig. 7).

• MC 7: The asymmetric DCC (ADCC) model in two dimensions, which generates volatilities of volatilities and a correlation of similar amplitude as MC5 (Fig. 7).

In Fig. 7, we plot a Monte Carlo path generated for the true beta for MC3 to MC7 (MC1 and MC2 are excluded, as they generate a true beta of 1). We also plot the conditional correlation and volatilities, which are highly volatile and thus make the estimation of the conditional beta complicated.

4.1.2 The second step: measurements

The second step is devoted to the analysis of the measurement error of the beta estimations, defined as the difference between the measured beta and the true beta of the simulated data. In our setting, we test five alternative beta estimations that should replicate the true beta as closely as possible. Note that in all five configurations, we use an exponentially weighted scheme to give more weight to recent observations, to be in line with the reactive market model (1/λ_β = 90 days). Consequently, in a path of T = 1,000 generated returns, only the last 90 returns truly matter (note that Chan and Lakonishok (1992) is based on statistics from 35 returns with an equal-weight scheme).
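The exponentially weighted regression described above can be sketched as follows (our own minimal implementation, assuming a zero-intercept regression and λ_β = 1/90):

```python
import numpy as np

def ewma_beta(r_i, r_I, lam_beta=1.0 / 90.0):
    """Weighted OLS beta with weights (1 - lam_beta)**(T - t).

    The most recent return (t = T) receives weight 1; observations much
    older than 1/lam_beta = 90 days contribute almost nothing.
    """
    T = len(r_I)
    t = np.arange(1, T + 1)
    w = (1.0 - lam_beta) ** (T - t)
    return np.sum(w * r_I * r_i) / np.sum(w * r_I**2)
```

For a stock whose returns are exactly twice the index returns, the estimator recovers a beta of 2 regardless of the weighting.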
The first alternative method is the ordinary least squares (OLS) regression of the returns, which was also implemented in the empirical test based on real data. Note that the OLS would give the same measurement as our reactive method if the parameters were set differently (λ_s = 1, λ_f = 1, l′ = l = 0, f = 0). The square errors in the OLS are weighted by (1 − λ_β)^(T−t). The second method estimates the beta by using the minimum absolute deviation (MAD), which is supposed to be less sensitive to outliers because absolute errors are minimized instead of square errors. The absolute errors are weighted by (1 − λ_β)^(T−t). The third alternative is the beta computed from the Trimean quantile regression (TRM), which is reputed to be more robust to outliers according to Chan and Lakonishok (1992). The absolute errors are also weighted by (1 − λ_β)^(T−t). The fourth and fifth methods are the conditional betas computed from the DCC model. The DCC method was calibrated using the same exponential weights (1 − λ_β)^(T−t), introduced in the log-likelihood function to extract the optimal unconditional volatilities and correlations, while the other parameters, such as the relaxation time, the volatilities of volatilities and the volatilities of correlations, were set to the values used for the Monte Carlo simulation.

(We assumed that the average of daily returns was zero. That assumption makes sense because, at a daily time scale, the average of returns can be completely neglected compared to the standard deviation.)

We summarize the reactive method and the five alternative methods implemented to estimate the beta as follows:

• β_OLS: beta estimated by the ordinary least squares method;

• β_MAD: beta estimated by the minimum absolute deviation method;

• β_TRM: beta estimated by the trimean quantile regression;

• β_DCC: T-th conditional beta estimated by using the DCC model;
• β_ADCC: T-th conditional beta estimated by using the ADCC model;

• β_R: beta estimated by the reactive method in Eq. (36).

4.1.3 The statistics

For every path, we analyze the measurement error, defined as the difference between the measured beta, based on the different methods applied to T returns, and the true beta value at time T.

To assess the quality of the different methods, we use three statistics following Chan and Lakonishok (1992). The first statistic is the bias, which gives the average measurement error. Obtaining the bias is more informative than simply obtaining an average estimation of the beta because, in our case, the true beta is not always 1 but fluctuates around 1 for the time-varying models MC3 to MC7. Because we focus on capturing the leverage effect in the beta measurement, we also define winner (loser) stocks as stocks that have outperformed (underperformed) the stock index during the last month. Due to the leverage effect, the OLS method is expected to underestimate the beta for loser stocks and to overestimate the beta for winner stocks. It is therefore interesting to see how robust the improvement of the reactive beta estimation is, so we measure the average error among the loser stocks and among the winner stocks. The loser and winner biases are related to the hedging bias of the short-term reversal strategy measured on real data, and they can confirm the robustness of the empirical measurements. We also define the low (high) beta stocks as the stocks whose conditional true beta is lower (higher) than 1. We measure the average error among low and high beta stocks, which is related to the hedging bias of the low beta strategy measured from real data and can confirm the robustness of the beta measurement when adding the component describing the nonlinear beta elasticity.

The second statistic is the ABSolute Deviation (ABSD) of a measurement.
It reflects the average absolute errors, such that positive and negative errors cannot compensate each other. It is a measurement of robustness.

The third statistic, which is equivalent to the ABSD, is the inverse of the variance of the measurement errors (V_OLS/V_m), which characterizes the relative robustness of the alternative beta estimation. The alternative beta method (with subscript m) that brings the highest improvement is the one with the highest ratio.

The three statistics that were implemented are summarized as follows:

• Statistic 1: the bias, the winner bias and the loser bias, the low beta bias and the high beta bias;

• Statistic 2: the absolute deviation of measurement (ABSD);

• Statistic 3: the relative variance statistic V_OLS/V_m.

4.2 Empirical tests

We summarize the statistics in Table 2. We see that all methods are unbiased on average in most Monte Carlo simulations. However, this is misleading, as the biases from one group of stocks can be significant and can offset others.

4.2.1 Winner and loser bias

The estimated β_DCC and β_ADCC appear to be biased as soon as fat tails are included (MC2). The reactive beta is the only one to be unbiased for winner and loser stocks when the leverage effect is introduced in the Monte Carlo simulations (MC3, 4, 5). The biases for winner stocks and loser stocks are significant for all methods except for the reactive beta. The biases are amplified when a fat tail of the residual distribution is introduced (MC4). Winner/loser biases can reach 14%. This is in line with the empirical test implemented on real data, where we see that the reactive method reduces the hedging bias of the short-term reversal strategy (Tab. 1).

When all the components that deviate from the Gaussian market model are mixed in MC5 (fat tails, nonlinear beta elasticity, stochastic volatilities, leverage effect), we see a kind of cocktail effect, as bias is generated for most methods on average and not only in some groups of stocks. The reactive method provides the best results and is the only method that has no bias. β_MAD and β_TRM, which were supposed to be robust, appear to perform very poorly, with high bias (average, loser or winner) as soon as stochastic volatility is added, which is confirmed with MC6 and MC7.

We also see that the reactive model appears to be incompatible with the DCC and ADCC models. Indeed, MC5 generates a high bias for β_DCC and β_ADCC in the winner and loser stocks, even though the leverage effect and the dynamic beta are implemented in the ADCC. In the same way, MC6 generates a bias for the reactive method that is even amplified when the leverage effect is generated through MC7. One can wonder which model is the most realistic. Both the ADCC and the reactive model capture the volatility clustering and leverage effect patterns, but their dynamics are very different. For example, in the reactive model, volatility increases as soon as the price decreases, and it decreases as soon as the price increases. In contrast, the volatility in the ADCC increases only if the return is more negative than the unconditional standard deviation (γ(σ_i[ξ_i^−(t)] − ˜σ_i) > 0, see Eqs. (67, 69)). The reactive beta model has three components tailored to three well-identified effects (the specific leverage through the retarded effect, the systematic leverage through the panic effect, and the nonlinear beta elasticity), whose main parameters appear to be stable and universal for all markets. Bouchaud et al. (2001) measured most of the parameters for seven main stock indexes.
The relaxation time is approximately 1 week for the panic effect (λ_s = 0.), the relaxation time is 40 days for the retarded effect (λ_s = 0.), and the leverage parameter for the panic effect is l = 8. The systematic leverage parameter on correlation is l′ − l = 0. [...] (the decay factor is β = 0. for the univariate GARCH), 4 days for French stocks (the decay factor is β = 0.) and 14 days for Spanish stocks (the decay factor is β = 0.). It is not surprising to see this variation if simple autoregressive conditional heteroskedasticity cannot capture the complexity of the leverage effect. We think it would be better to apply autoregressive conditional heteroskedasticity to model the residual part of the heteroskedasticity of returns once the part due to the leverage effect is removed through the normalized returns of the reactive model. The relaxation time in that case is expected to be a couple of months.

4.2.2 High and low beta bias

The reactive beta is the only one that reduces the bias for low and high beta stocks when stochastic volatility is introduced and when the empirical nonlinear beta elasticity is implemented (MC5). This is in line with the empirical test applied to real data, where we see that the reactive method reduces the hedging bias of the low volatility strategy (Tab. 1).

4.2.3 ABSD and V_OLS/V_m

β_OLS, which is the theoretically optimal estimator for Monte Carlo returns simulated with the Gaussian market model (MC1), gives statistics similar to those of the reactive beta for MC3. In this case (MC3), the reactive method outperforms the other considered methods. The ABSD of 0.17 is entirely explained by the irreducible statistical noise that is intrinsic to any regression based on approximately 90 points with a weak correlation. When a fat tail is incorporated in the residuals (MC4), the ABSD of the reactive beta increases and becomes intermediate between the ABSDs of β_OLS, β_MAD and β_TRM.
β_MAD and β_TRM are more robust in the presence of fat tails. The reactive beta is expected to be as sensitive to outliers as the OLS. The reactive method could be further improved if a TRM regression were implemented instead of the classical OLS to measure the normalized beta between normalized returns. When stochastic volatility and correlation are introduced (MC5, MC6 and MC7), the reactive beta becomes as robust as β_MAD and β_TRM based on the ABSD.
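For reference, the three statistics compared throughout this section can be computed from the simulated paths as in this sketch (our own notation; `beta_ols` supplies the benchmark variance for Statistic 3):

```python
import numpy as np

def robustness_stats(beta_hat, beta_true, beta_ols):
    """Bias, ABSD and V_OLS/V_m for one estimation method over all paths."""
    err = beta_hat - beta_true           # measurement error per path
    err_ols = beta_ols - beta_true       # OLS error, used as the benchmark
    return {
        "bias": float(np.mean(err)),                            # Statistic 1
        "absd": float(np.mean(np.abs(err))),                    # Statistic 2
        "v_ols_over_vm": float(np.var(err_ols) / np.var(err)),  # Statistic 3
    }
```

The winner/loser and low/high beta biases are obtained by applying the same bias formula to the corresponding subsample of paths.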
Method        Bias    Winner Bias  Loser Bias  Low Bias  High Bias  ABSD   Vols/Vm

MC1 Gaussian basic market model
β_OLS        -0.00      -0.00       -0.00         –         –       0.16    1.00
β_Reactive    0.00      -0.05*       0.05*        –         –       0.18    0.79
β_DCC           –          –           –          –         –         –       –
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.00       0.00       -0.01         –         –       0.20    0.65
β_TRM        -0.00       0.00       -0.01         –         –       0.20    0.68

MC2 t-Student basic market model
β_OLS        -0.00       0.01       -0.01         –         –       0.28    1.00
β_Reactive    0.01      -0.06*       0.08*        –         –       0.31    0.82
β_DCC           –          –           –          –         –         –       –
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.00      -0.00       -0.00         –         –       0.22    2.18
β_TRM        -0.00      -0.00       -0.00         –         –       0.22    2.24

MC3 Gaussian reduced reactive market model
β_OLS        -0.00       0.07*      -0.07*      0.07*     -0.07*    0.19    1.00
β_Reactive   -0.00       0.02*      -0.02*      0.02*     -0.02*    0.17    1.27
β_DCC           –          –           –          –         –         –       –
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.01       0.06*      -0.08*      0.06*     -0.08*    0.22    0.73
β_TRM        -0.01       0.06*      -0.08*      0.06*     -0.08*    0.22    0.75

MC4 t-Student reduced reactive market model
β_OLS           –          –           –          –         –         –       –
β_Reactive   -0.01       0.02       -0.04*      0.03      -0.05*    0.31    1.30
β_DCC           –          –           –          –         –         –       –
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.03*      0.09*      -0.14*      0.10*     -0.14*    0.27    2.68
β_TRM        -0.03*      0.09*      -0.14*      0.10*     -0.14*    0.27    2.76

MC5 t-Student full reactive market model
β_OLS        -0.01       0.13*      -0.14*      0.14*     -0.22*    0.50    1.00
β_Reactive   -0.04*     -0.00       -0.07*      0.05*     -0.17*    0.41    1.42
β_DCC        -0.01       0.10*      -0.12*      0.20*     -0.32*    0.52    1.31
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.09*      0.04*      -0.22*      0.09*     -0.37*    0.38    2.43
β_TRM        -0.09*      0.04*      -0.22*      0.09*     -0.36*    0.37    2.46

MC6 Gaussian symmetric DCC model
β_OLS        -0.11*     -0.10*      -0.11*      0.06*     -0.27*    0.32    1.00
β_Reactive   -0.07*     -0.11*      -0.02       0.09*     -0.23*    0.33    0.93
β_DCC        -0.01      -0.00       -0.02*     -0.01      -0.01     0.16    4.09
β_ADCC          –          –           –          –         –         –       –
β_MAD        -0.14*     -0.13*      -0.15*      0.04*     -0.32*    0.34    0.89
β_TRM        -0.14*     -0.13*      -0.15*      0.04*     -0.32*    0.34    0.90

MC7 Gaussian asymmetric DCC model
β_OLS        -0.09*      0.03       -0.24*      0.09*     -0.25*    0.30    1.00
β_Reactive   -0.07*      0.02       -0.17*      0.10*     -0.21*    0.27    1.21
β_DCC        -0.04*      0.04*      -0.15*     -0.00      -0.08*    0.21    2.08
β_ADCC       -0.01      -0.01       -0.01      -0.00      -0.01     0.15    3.74
β_MAD        -0.13*     -0.02       -0.28*      0.06*     -0.29*    0.32    0.92
β_TRM        -0.13*     -0.01       -0.28*      0.06*     -0.29*    0.32    0.92
Table 2: Monte Carlo robustness tests. Statistics are provided for seven Monte Carlo simulations and six different methods to estimate the beta. We report the bias, which is the average error of the beta measurements. Winner/loser biases are the biases among winner/loser stocks. Low/high biases are the biases among low/high beta stocks. ABSD is the average of the error in absolute value. Vols/Vm is the variance of the error in the OLS case divided by the variance of the error of the given method. * indicates a bias greater than 3 standard deviations.
Figure 7: Simulated paths for models MC4–MC7. The true conditional beta (top), true conditional correlation (middle left), true conditional stock index volatility (middle right), true conditional single-stock volatility (bottom left), and true conditional relative volatility (bottom right) are plotted. The paths, limited to 500 days and independent from model to model, capture the same order of magnitude of variation in volatilities, beta and correlation.
Open problems in other fields
The estimated beta is used in a wide range of financial applications, including security valuation, asset pricing, portfolio management and risk management. This also extends to corporate finance in many applications, such as financing decisions, to quantify the risk associated with debt, equity and assets, and firm valuation, when discounting cash flows using the weighted average cost of capital. The most likely reason is that the beta describes systematic risk that cannot be diversified away and that should be remunerated. However, as explained, the OLS estimator of the beta is subject to measurement errors, which include the presence of outliers, time dependence, the leverage effect, and the departure from normality.
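As a reminder of how the estimated beta propagates into these applications, here is a purely textbook illustration (all figures below are hypothetical and not taken from the thesis) of the CAPM cost of equity feeding the weighted average cost of capital:

```python
# CAPM cost of equity from an estimated beta, then the WACC used to
# discount a firm's cash flows. All inputs are hypothetical.
beta_equity = 1.2        # estimated beta, subject to the measurement errors above
r_f = 0.03               # risk-free rate (assumed)
mrp = 0.05               # market risk premium (assumed)
cost_equity = r_f + beta_equity * mrp     # CAPM

D, E = 40.0, 60.0        # market values of debt and equity (assumed)
cost_debt, tax = 0.04, 0.25
wacc = E / (D + E) * cost_equity + D / (D + E) * cost_debt * (1 - tax)
```

With these numbers, a beta bias of 0.1 shifts the cost of equity by half a percentage point, an error that compounds over a long discounting horizon.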
Bali et al. (2017) apply the DCC model of Engle (2016) to assess the cross-sectional variation in expected stock returns. They estimate the conditional beta for the S&P 500 using daily data from 1963 to 2009. They test whether the betas have predictive power for the cross-section of individual stock returns over the next one to five days. They show that there is no link between the unconditional beta and the cross-section of expected returns. Most remarkably, they also show that the time-varying conditional beta is priced in the cross-section of daily returns. At the portfolio level, they indicate that a long-short trading strategy of buying the highest conditional beta stocks and selling the lowest conditional beta stocks yields average returns of 8% per year. Thus, the conditional CAPM is empirically valid, whereas the unconditional CAPM is not. Moreover, they show that improvements in beta measurement from unconditional to conditional betas would not have significant pricing impacts on the major anomalies (size, book, momentum...). Thus, one can see that the DCC greatly changes the pricing of the low volatility anomaly, which disappears, and improves the empirical validation of the CAPM, but does not change the pricing of the other major anomalies. We expect that the reactive method can bring further improvements. Indeed, as revealed by our robustness tests in Sec. 4, the leverage effect and the nonlinear beta elasticity are likely to generate bias in the DCC estimation. Because our reactive method was designed to correct for these biases, its use can help reveal pricing effects of the dynamic beta on the major anomalies. This point is an interesting perspective for future research.
To determine a fair discount rate for valuing cash flows, the firm's manager must select the appropriate beta of the project, given that the discount rate remains constant over time while the project may exhibit significant variation over time and a leverage effect due to the debt-to-equity ratio. As such, Ang and Liu (2004) discuss how to discount cash flows with time-varying expected returns in a traditional set-up. For instance, the traditional dividend discount model assumes that the expected return and the expected rate of cash-flow growth are constant, while they are in fact time-varying and correlated. In practice, in the first step, the manager computes the expected future cash flows from financial forecasts. In the second step, the manager uses a constant discount rate, usually relying on the CAPM for the discounting factor. In contrast, Ang and Liu (2004) derive a valuation formula that incorporates the correlation among stochastic cash flows, betas and risk premia. They show that the greater the magnitude of the difference between the true discount rate and the constant discount rate, the greater the project's misvaluation. They even show that when computing perpetuity values from the discounting model, the potential mispricing can become worse. They conclude that accounting for time-varying expected returns can lead to prices different from those obtained with a constant discount rate from the traditional unconditional CAPM. The impact of the leverage effect and of the nonlinear elasticity of the beta on potential mispricing deserves to be investigated. Indeed, our results seem to indicate that the mispricing might be higher for low and high beta stocks over a long period. This could be an interesting topic for future work.
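The order of magnitude of such misvaluation is easy to illustrate with a toy example (ours, not Ang and Liu's calibration): discount the same cash-flow stream with a constant rate and with a time-varying rate having the same average.

```python
import numpy as np

years = np.arange(1, 31)                 # a 30-year project (hypothetical)
cash = 100.0 * 1.02**years               # cash flows growing at 2% per year

r_const = 0.08                           # constant discount rate
# Time-varying expected return oscillating around the same 8% average
r_t = 0.08 + 0.03 * np.sin(2 * np.pi * years / 10.0)

pv_const = np.sum(cash / (1 + r_const)**years)
pv_var = np.sum(cash / np.cumprod(1 + r_t))   # compound the varying rates

mispricing = (pv_const - pv_var) / pv_var     # relative misvaluation
```

Even though both rate paths average 8%, the two present values differ noticeably; with stochastic cash flows correlated with betas and risk premia, Ang and Liu (2004) show that the gap can be much larger.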
Notice that in this paper, the reactive beta model is tailored for stocks. However, it could help remove some biases in contexts involving assets other than stocks, such as hedge funds or mutual funds. Indeed, the simple market neutral strategies (short-term reversal, momentum, size) can be extended to simple directional strategies (contrarian, trend following) to model the behavior of fund managers. Some of the identified biases in beta measurement described in Sec. 3.2 and captured by the market neutral strategies are also likely to occur in the directional ones.

An application of the reactive beta model to hedge funds would raise interesting questions about a better estimation of the nonlinearity features that stem from option-like strategies or higher moments, as documented in the literature. Fung and Hsieh (2001) warn that hedge funds employ dynamic trading strategies that have option-like returns even if the manager does not trade in derivatives markets. This means that asset pricing models of investment styles are not designed to capture the nonlinear returns that commonly characterize the hedge fund industry. Agarwal and Naik (2004) observe that hedge funds report large losses during crisis episodes, which suggests that they may be bearing significant left-tail risk, particularly during large market downturns. They find that the nonlinear option-like pay-offs of a wide range of equity-oriented hedge funds resemble a strategy of writing a put option on the equity index. Recall that hedge funds generally employ long-short dynamic strategies to capture non-standard risk premia, in contrast to mutual funds, which employ overall long, buy-and-hold strategies to capture standard risk premia such as the equity/bond risk premia.
Agarwal, Arisoy and Naik (2017) build a seven-factor model on an augmented version of Fung and Hsieh (2004) and find that hedge funds with greater leverage, longer time in existence and larger assets under management have more negative uncertainty betas. This echoes the findings of Bali, Brown and Tang (2017) for stocks, which provide evidence of significant nonlinearity in the uncertainty premium. Agarwal, Green, and Ren (2018) measure risk-adjusted hedge fund performance using a range of single- and multi-factor models and find that, surprisingly, hedge fund flows are better explained by the CAPM alpha than by more sophisticated models. This echoes the findings of Berk and van Binsbergen (2016), who first use capital flows of mutual funds to evaluate asset pricing models and conclude that the CAPM explains risk better than no model at all.

An application to mutual funds would also raise interesting questions about the estimation errors in the individual beta estimates, because the beta of individual stocks is exposed to estimation errors (see, e.g., Chordia, Goyal and Shanken (2015)). But since mutual funds are themselves diversified portfolios, this should alleviate the estimation error in the beta estimate. This is important because it addresses the controversy in the literature as to whether some expected return variations associated with factor loadings (betas) are due to economic risk or to mispricing effects linked to this measurement error. At the same time, using portfolios could also hide some precious information that exists at the individual stock level, as documented in the literature (Black et al., 1972; Fama and MacBeth, 1973). Such an investigation of alternative asset classes is left open for future research.
We propose a reactive beta model with three components that account for the specific leverage effect (when a stock underperforms, its beta increases), the systematic leverage effect (when a stock index declines, correlations increase), and the beta elasticity (when relative volatility increases, the beta increases). The three components were fitted and incorporated through elaborate statistical measurements. An empirical test is run from 2000 to 2015 with exhaustive data sets including both American and European securities. We compute the bias of hedging the most popular market neutral strategies (low volatility, short-term reversal, momentum and capitalization) using the standard approach to beta measurement and the reactive beta model. Our main findings emphasize the ability of the reactive beta model to significantly reduce the biases of these strategies, particularly during stress periods. We further extend the research design to include robustness checks based on simulated data to compare the reactive method with five alternative methods (ordinary least squares, minimum absolute deviation, trimean quantile regression, and dynamic conditional correlation with or without asymmetry) over seven Monte Carlo scenarios reflecting different market conditions, from calm (Gaussian residuals, no leverage effect, constant beta) to stressed (non-Gaussian residuals, leverage effect, nonlinear beta elasticity, stochastic volatility, nonconstant volatility of volatility and volatility of correlation). The overall results confirm that the reactive beta presents the lowest bias when stressed market conditions are included. Further, the reactive model can be useful in other empirical applications, such as asset pricing and corporate finance, and for alternative asset classes, such as hedge funds and mutual funds. This provides a good starting point for future research.

References
Acharya, V., and L. H. Pedersen. "Asset Pricing with Liquidity Risk." Journal of Financial Economics, 77 (2005), pp. 375-410.

Agarwal, V., and N. Y. Naik. "Risks and Portfolio Decisions Involving Hedge Funds." Review of Financial Studies, 17 (2004), pp. 63-98.

Agarwal, V., Y. E. Arisoy, and N. Y. Naik. "Volatility of Aggregate Volatility and the Cross-Section of Hedge Fund Returns." Journal of Financial Economics, 125 (2017), pp. 491-510.

Agarwal, V., T. C. Green, and H. Ren. "Alpha or Beta in the Eye of the Beholder: What Drives Hedge Fund Flows?" Journal of Financial Economics, 127 (2018), pp. 417-434.

Amihud, Y. "Illiquidity and Stock Returns: Cross-section and Time-series Effects." Journal of Financial Markets, 5 (2002), pp. 31-56.

Ang, A., and G. Bekaert. "International Asset Allocation with Time-Varying Correlations." Review of Financial Studies, 15 (2000), pp. 1137-1187.

Ang, A., and J. Chen. "Asymmetric Correlations of Equity Portfolios." Journal of Financial Economics, 63 (2002), pp. 443-494.

Ang, A., and J. Liu. "How to Discount Cashflows with Time-Varying Expected Returns." Journal of Finance, 59, 6 (2004), pp. 2745-2783.

Ang, A., and J. Chen. "CAPM over the Long Run: 1926-2001." Journal of Empirical Finance, 14 (2007), pp. 1-40.

Ang, A., R. Hodrick, Y. Xing, and X. Zhang. "The Cross-Section of Volatility and Expected Returns." Journal of Finance, 61 (2006), pp. 259-299.

Ang, A., R. Hodrick, Y. Xing, and X. Zhang. "High Idiosyncratic Volatility and Low Returns: International and Further U.S. Evidence." Journal of Financial Economics, 91 (2009), pp. 1-23.

Ang, A., A. Shtauber, and P. C. Tetlock. "Asset Pricing in the Dark: The Cross-Section of OTC Stocks." Review of Financial Studies, 26 (2013), pp. 2985-3028.

Asness, C., R. Krail, and J. Liew.
"Do Hedge Funds Hedge?" Journal of Portfolio Management, 28 (2001), pp. 6-19.

Baker, M., B. Bradley, and R. Taliaferro. "The Low Beta Anomaly: A Decomposition into Micro and Macro Effects." Working paper, Harvard Business School, 2013.

Bali, T. G., R. F. Engle, and Y. Tang. "Dynamic Conditional Beta Is Alive and Well in the Cross Section of Daily Stock Returns." Management Science, 63, 11 (2017), pp. 3760-3779.

Bali, T. G., S. J. Brown, and Y. Tang. "Is Economic Uncertainty Priced in the Cross-Section of Stock Returns?" Journal of Financial Economics, 126 (2017), pp. 471-489.

Banz, R. W. "The Relationship Between Return and Market Value of Common Stocks." Journal of Financial Economics, 9 (1981), pp. 3-18.

Bekaert, G., and G. Wu. "Asymmetric Volatility and Risk in Equity Markets." Review of Financial Studies, 13 (2000), pp. 1-42.

Berk, J. B., and J. H. van Binsbergen. "Assessing Asset Pricing Models using Revealed Preference." Journal of Financial Economics, 119 (2016), pp. 1-23.

Black, F. "Capital Market Equilibrium with Restricted Borrowing." Journal of Business, 4 (1972), pp. 444-455.

Black, F., M. Jensen, and M. Scholes. "The Capital Asset Pricing Model: Some Empirical Tests." In Studies in the Theory of Capital Markets, edited by Michael Jensen. New York: Praeger, 1972.

Black, F. "Studies in Stock Price Volatility Changes." American Statistical Association, Proceedings of the Business and Economic Statistics Section, 177-181, 1976.

Blume, M. E. "On the Assessment of Risk." Journal of Finance, 26 (1971), pp. 1-10.

Boguth, O., M. Carlson, A. Fisher, and M. Simutin. "Conditional Risk and Performance Evaluation: Volatility Timing, Overconditioning, and New Estimates of Momentum Alphas." Journal of Financial Economics, 102 (2011), pp. 363-389.

Bollerslev, T., R. Engle, and J. Wooldridge.
(cid:16)A Capital Asset Pricing Model with Time-Varying Covariances.(cid:17)Journal of Political Economy, 96 (1988), pp. 116-131.Bollerslev, T. (cid:16)Modelling the coherence in short-run nominal exchange rates: A multivariate generalizedARCH model.(cid:17) Review of Economics and Statistics, 72 (1990), pp. 498-505.Bouchaud, J.-P., A. Matacz, and M. Potters. (cid:16)Leverage E(cid:27)ect in Financial Markets: The Retarded VolatilityModel.(cid:17) Physical Review Letters, 87 (2001), pp. 1-4.BussiŁre, M., M. Hoerova, and B. Klaus. (cid:16)Commonality in Hedge Fund Returns: Driving Factors andImplications.(cid:17) Journal of Banking & Finance, 54 (2015), pp. 266-280.Campbell, J. Y., and L. Hentchel, (cid:16)No News is Good News: An Asymmetric Model of Changing Volatilityin Stock Returns.(cid:17) Journal of Financial Economics, 31 (1992), pp. 281-318.Cappiello, L., R. Engle, and K. Sheppard. (cid:16)Asymmetric Dynamics in the Correlations of Global Equity andBond Returns.(cid:17) Journal of Financial Econometrics, 4 (2006), pp. 537-572.Carhart, M. (cid:16)On Persistence in Mutual Fund Performance.(cid:17) Journal of Finance, 52 (1997), pp. 57-82.Chan, L., and J. Lakonishok. (cid:16)Robust Measurement of Beta Risk.(cid:17) Journal of Financial and QuantitativeAnalysis, 27 (1992), pp. 265-282.Chen, A-S., T.-W. Zhang, and H. Wu. (cid:16)The Beta Regime Change Risk Premium.(cid:17) Working paper, NationalChung Cheng University, 2005. hoi, J. (cid:16)What Drives the Value Premium? The Role of Asset Risk and Leverage.(cid:17) Review of FinancialStudies, 26 (2013), pp. 2845-2875.Chordia, T., Goyal, A., and J., Shanken. (cid:16)Cross-Sectional Asset Pricing with Individual Stocks: Betas versusCharacteristics.(cid:17) Working paper, Emory University, (2015).Christie, A. (cid:16)The Stochastic Behavior of Common Stock Variances - Value, Leverage, and Interest RateE(cid:27)ects.(cid:17) Journal of Financial Economic Theory, 10 (1982), pp. 
407-432.Daniel, K., R., Jagannathan, and S., Kim. (cid:16)Tail Risk in Momentum Strategy Returns.(cid:17) Working paper,National Bureau of Economic Research, (2012).DeBondt, W.M., and R.H., Thaler. (cid:16)Further Evidence on Investor Overreaction and Stock Market Season-ality.(cid:17) Journal of Finance, 42 (1987), pp. 557-581.DeJong, D., and D. W. Collins. (cid:16)Explanations for the Instability of Equity Beta: Risk-Free Rate Changesand Leverage E(cid:27)ects.(cid:17) Journal of Financial and Quantitative Analysis, 20 (1985), 73-94.Fabozzi, J. F., and C. J. Francis. (cid:16)Beta Random Coe(cid:30)cient.(cid:17) Journal of Financial and Quantitative Analysis,13 (1978), pp. 101-116.Engle, R. F. (cid:17)Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized AutoregressiveConditional Heteroskedasticity Models.(cid:17) Journal of Business and Economic Statistics, 20 (2002), pp. 339-350.Engle, R. F. (cid:16)Dynamic Conditional Beta.(cid:17) Journal of Financial Econometrics, 14, Issue 4 (2016), pp. 643-667.Fama, E.F., and J.D., MacBeth. (cid:16)Risk, Return and Equilibrium: Empirical Tests.(cid:17) Journal of PoliticalEconomy, 81 (1973), pp. 607-636.Fama, E. F., and K. R. French. (cid:16)The cross-section of expected returns.(cid:17) Journal of Finance, 47 (1992), pp.427-465.Fama, E. F., and K. R. French. (cid:16)Common risk factors in the returns on stocks and bonds.(cid:17) Journal ofFinancial Economics, 33 (1993), pp. 3-56.Fama, E. F., and K. R. French. (cid:16)Industry Costs of Equity.(cid:17), Journal of Financial Economics, 43 (1997), pp.153-193.Fama, E. F., and K. R. French. (cid:16)Dissecting Anomalies.(cid:17) Journal of Finance, 63 (2008), pp. 1653-1678.Fama, E. F., and K. R. French. (cid:16)Size, Value, and Momentum in International Stock Returns.(cid:17) Journal ofFinancial Economics, 105 (2012), pp. 457-472.Ferson, W. E., and C. R. Harvey. 
(cid:16)Conditioning Variables and the Cross-Section of Stock Returns.(cid:17) Journalof Finance, 54 (1999), pp. 1325-1360.Forbes, K., and R., Rigobon. (cid:16)No Contagion, only Interdependence: Measuring Stock Market Comovements.(cid:17)Journal of Finance, 57 (2002), pp. 2223-2261. rancis, J. C. (cid:16)Statistical Analysis of Risk Surrogates for NYSE Stocks.(cid:17) Journal of Financial and Quanti-tative Analysis, 14 (1979), pp. 981-997.Frazzini, A., and L. Pedersen. (cid:16)Betting Against Beta.(cid:17) Journal of Financial Economics, 111 (2014), pp. 1-25.Fung, W., and D. A. Hsieh. (cid:16)Empirical Characteristics of Dynamic Trading Strategies: The Case of HedgeFunds.(cid:17) The Review of Financial Studies, 10 (1997), pp. 275-302.Fung, W., and D.A., Hsieh. (cid:16)The Risk in Hedge Fund Strategies: Theory and Evidence from Trend Followers.(cid:17)Review of Financial Studies, 14 (2001), pp. 313-341.Fung, W., and D.A., Hsieh. (cid:16)Hedge Fund Benchmarks: A Risk-Based Approach.(cid:17) Financial Analyst Journal,60, (2004), pp. 65-81.Galai, D. and R. W. Masulis. (cid:16)The Option Pricing Model and the Risk Factor of Stock.(cid:17) Journal of FinancialEconomics, 3 (1976), 53-81.Garlappi, L., and H., Yan. (cid:16)Financial Distress and the Cross-Section of Equity Returns.(cid:17) Journal of Finance,66 (2011), pp. 789-822.Glosten, L. R., R. Jagannathan, and D. E. Runkle. (cid:16)On The Relation between The Expected Value and TheVolatility of Nominal Excess Return on stocks.(cid:17) Journal of Finance, 48 (1993), pp. 1779-1801.Goyal, A., and P. Santa-Clara, (cid:16)Idiosyncratic Risk Matters!(cid:17) Journal of Finance, 58 (2003), pp. 975-1007.Grinblatt, M., and T., Moskowitz. (cid:16)Predicting Stock Price Movements from Past Returns: The Role ofConsistency and Tax-Loss Selling.(cid:17) Journal of Financial Economics, 71 (2004), pp. 541-579.Haugen, R. A., and A. Heins. 
(cid:16)Risk and the Rate of Return on Financial Assets: Some Old Wine in NewBottles.(cid:17) Journal of Financial and Quantitative Analysis, 10 (1975), pp. 775-784.Haugen, R. A., and N. L. Baker. (cid:16)The E(cid:30)cient Market Ine(cid:30)ciency of Capitalization-Weighted Stock Port-folios.(cid:17) Journal of Portfolio Management, 17 (1991), pp. 35-40.Hong, Y., Tu, J., and G., Zhou. (cid:16)Asymmetries in Stock Returns: Statistical Tests and Economic Evaluation.(cid:17)Review of Financial Studies, 20 (2006), pp. 1547-1581.Hong, H., and D. Sraer. (cid:16)Speculative Betas.(cid:17) Forthcoming, Journal of Finance, (2016).Jagannathan, R., and Z. Wang, (cid:16)The Conditional CAPM and the Cross-Section of Expected Returns.(cid:17)Journal of Finance, 51 (1996), pp. 3-53.Jegadeesh, N. (cid:16)Evidence of Predictable Behavior of Securities Returns.(cid:17) Journal of Finance, 45 (1990), pp.881-898.Jegadeesh, N., and S. Titman. (cid:16)Returns to buying winners and selling losers: Implications for stock markete(cid:30)ciency.(cid:17) Journal of Finance, 48 (1993), pp. 65-91.Jiang, L., K., Wu, and G., Zhou. (cid:16)Asymmetry in Stock Comovements: An Entropy Approach.(cid:17) Journal ofFinancial and Quantitative Analysis, 53 (2018), pp. 1479-1507. oenker, R., and G. Bassett. (cid:16)Regression Quantiles.(cid:17) Econometrica, 46 (1978), pp. 33-50.Lettau, M., and S. Ludvigson. (cid:16)Resurrecting the (C)CAPM: a Cross-Sectional Test when Risk Premia areTime-Varying.(cid:17) Journal of Political Economy, 109 (2001), pp. 1238-1287.Lewellen, J., and S. Nagel. (cid:16)The Conditional CAPM does not Explain Asset Pricing Anomalies.(cid:17) Journal ofFinancial Economics, 82 (2006), pp. 289-314.Liu, J., R.F., Stambaugh, and Y., Yuan. (cid:16)Absolving Beta of Volatility’s E(cid:27)ects.(cid:17) Journal of Financial Eco-nomics, 128 (2018), pp. 1-15.Longin, F., and B., Solnik. (cid:16)Extreme Correlation of International Equity Markets.(cid:17) Journal of Finance, 56(2001), pp. 
649-676.Malkiel, B. G., and Y. Xu. (cid:16)Risk and Return Revisited.(cid:17) Journal of Portfolio Management, 23 (1997), pp.9-14.Meng, J. G., G. Hu, and J. A. Bai. (cid:16)A Simple Method for Estimating Betas When Factors Are Measuredwith Error.(cid:17) Journal of Financial Research, 34 (2011), pp. 27-60.Mitchell, M., and T. Pulvino. (cid:16)Characteristics of Risk and Return in Risk Arbitrage.(cid:17) Journal of Finance,56 (2001), pp. 2135-2175.Moreira, A., and T., Muir. (cid:16)Volatility-Managed Portfolios.(cid:17) Journal of Finance, 72 (2017), pp. 1611-1644.Patton, A. J. (cid:16)Are ‘Market Neutral’ Hedge Funds Really Market Neutral?(cid:17) Review of Financial Studies, 22(2009), pp. 2295-2330.Reinganum, R. (cid:16)Misspeci(cid:28)cation of Capital Asset Pricing: Empirical Anomalies based on Earnings Yieldsand Market Values.(cid:17) Journal of Financial Economics, 9 (1981), pp. 19-46.Roll, R. (cid:16)A Critique of the Asset Pricing Theory’s Tests Part I: On Past and Potential Testability of theTheory.(cid:17) Journal of Financial Economics, 4 (1977), pp. 129-176.Sharpe, W. F. (cid:16)Capital Asset Prices: A Theory of Market Equilibrium under Risk.(cid:17) Journal of Finance, 19(1964), pp. 425-442.Shanken, J. (cid:16)On the Estimation of Beta-Pricing Models.(cid:17) Review of Financial Studies, 5 (1992), 1-33.Sheppard, K. (cid:16)Univariate volatility modeling.(cid:17) Notes, Chapter 7, University of Oxford (2017).Valeyre, S., D. S. Grebenkov, S. Aboura, and Q. Liu. (cid:16)The Reactive Volatility Model.(cid:17) Quantitative Finance,13, (2013), pp. 1697-1706. Selection bias
Here, we provide some evidence that the bias in the beta of the low volatility factor comes from a selection bias: selecting the bottom-beta stocks yields the stocks whose beta is underestimated.

The measured beta β_im of stock i is obtained by a standard linear regression of the i-th stock returns, r_i, on the stock index returns, r_I,

r_i = β_im r_I + ε_i,   (37)

where ε_i is the residual return. We suppose that the measured beta of stock i, β_im, is affected by noise,

β_im = β_iT + η_i,   (38)

where β_iT is the true beta (which is unknown), and η_i is the measurement error inherent to the linear regression. The standard deviation of η_i, σ_{η_i}, depends on the average correlation between the single stock i and the stock index and on the number n of independent points used for the regression (which we set to n = 1/λ_β = 90):

σ_{η_i} = σ_{ε_i} / (σ_I √n),   (39)

where σ_{ε_i} is the standard deviation of the residual returns ε_i. Averaging the above relation over all stocks, we obtain

σ_η = ⟨σ_{ε_i}⟩ √λ_β / σ_I,   (40)

where ⟨σ_{ε_i}⟩ denotes the average. According to Eq. (37), the standard deviation of the stock returns, σ_i, is

σ_i = √(β_im² σ_I² + σ_{ε_i}²) ≈ σ_{ε_i},   (41)

because (β_im σ_I / σ_i)² ≪ 1 (stocks are much more volatile than the index). We thus obtain

σ_η ≈ ⟨σ_i⟩ √λ_β / σ_I.   (42)

The low volatility factor is 50% long the 30% top-β_im stocks and 50% short the 30% bottom-β_im stocks (here, we consider only one sector for simplicity). We adjust the most volatile leg to target a beta-neutral factor under the supposition that the η_i are null. In reality, when taking into account the difference between the measured and the true beta, we obtain the beta of the low volatility factor as

β_low factor = −50% ⟨β_iT | i ∈ Bottom⟩ + 50% (⟨β_im | i ∈ Bottom⟩ / ⟨β_im | i ∈ Top⟩) ⟨β_iT | i ∈ Top⟩.   (43)

This is essentially the beta-neutral condition that we impose when constructing the factor (see Appendix B).
Here, ⟨β_im | i ∈ Bottom⟩ is the average of the measured beta over the stocks i in the 30% bottom of the measured beta values β_im (and similarly for the other averages). Defining Δβ_B and Δβ_T through

⟨β_iT | i ∈ Bottom⟩ = ⟨β_im | i ∈ Bottom⟩ + Δβ_B,   (44)
⟨β_iT | i ∈ Top⟩ = ⟨β_im | i ∈ Top⟩ + Δβ_T,   (45)

we rewrite Eq. (43) as

β_low factor = −50% (⟨β_im | i ∈ Bottom⟩ + Δβ_B) + 50% (⟨β_im | i ∈ Bottom⟩ / ⟨β_im | i ∈ Top⟩)(⟨β_im | i ∈ Top⟩ + Δβ_T)
            = −50% Δβ_B + 50% (⟨β_im | i ∈ Bottom⟩ / ⟨β_im | i ∈ Top⟩) Δβ_T.   (46)

Given that ⟨β_im | i ∈ Bottom⟩ ≪ ⟨β_im | i ∈ Top⟩ (as the β_im in the top quantile are higher than the β_im in the bottom quantile), we obtain the following approximation:

β_low factor ≈ −50% Δβ_B.   (47)

If one knew the true β_iT values and used them for constructing the low volatility factor, the excess Δβ_B would be zero. However, the true values are unknown, and one uses the measured beta β_im, which creates a selection bias and a nonzero Δβ_B, as shown below.

To estimate Δβ_B, we consider the true beta β_iT and the measurement error η_i as independent random variables and replace the average over stocks by the conditional expectation

Δβ_B = ⟨β_iT − β_im | i ∈ Bottom⟩ ≈ E{β_iT − β_im | i ∈ Bottom} = B.   (48)

We have, then,

−B = E{η_i | i ∈ Bottom} = ∫ η P{η_i ∈ (η, η + dη) | i ∈ Bottom} = ∫ η P{η_i ∈ (η, η + dη), i ∈ Bottom} / P{i ∈ Bottom},   (49)

where the integrals run over (−∞, ∞) and we wrote the conditional probability explicitly. The denominator is precisely the probability determining the bottom quantile, P{i ∈ Bottom} = p, which we set to 0.3. We thus obtain

−B = (1/p) ∫ η P{η_i ∈ (η, η + dη), β_im − β̄ < Q},   (50)

where the event i ∈ Bottom is equivalently written as β_im < β̄ + Q, with Q the value of the measured beta that corresponds to the quantile p, and β̄ the mean of β_im. Using Eq. (38) and the assumption that β_iT and η_i are independent, one obtains

−B = (1/p) ∫ η P{η_i ∈ (η, η + dη), β_iT − β̄ < Q − η} = (1/p) ∫ η P{η_i ∈ (η, η + dη)} P{β_iT − β̄ < Q − η}.   (51)

To obtain quantitative estimates, we make the strong assumption that both β_iT and η_i are Gaussian variables, with means β̄ and 0 and standard deviations σ_β and σ_η, respectively.
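Before carrying out the integral analytically, the sign and size of this conditional bias can be checked by direct simulation under the same Gaussian assumptions. The parameter values below are illustrative placeholders, not the thesis's US estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (the thesis estimates these from US data)
sigma_beta = 0.30   # cross-sectional std of the true beta
sigma_eta = 0.15    # std of the beta measurement error
beta_mean = 1.0     # cross-sectional mean beta
p = 0.3             # bottom quantile used for the low volatility factor
n_stocks = 1_000_000

beta_true = rng.normal(beta_mean, sigma_beta, n_stocks)
eta = rng.normal(0.0, sigma_eta, n_stocks)
beta_meas = beta_true + eta                       # Eq. (38)

# Select the bottom p fraction on the *measured* beta ...
threshold = np.quantile(beta_meas, p)
bottom = beta_meas < threshold

# ... and measure the conditional bias -B = E{eta | i in Bottom}.
bias = eta[bottom].mean()
print(f"E[eta | bottom] = {bias:.4f}")  # negative: beta is underestimated
```

Conditioning on a low *measured* beta favors draws with negative η, so the average error in the bottom quantile is strictly negative, which is the selection bias quantified analytically next.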
We then obtain

−B = (1/p) ∫ dη η [exp(−η² / (2σ_η²)) / (√(2π) σ_η)] Φ((Q − η)/σ_β),   (52)

where

Φ(x) = ∫_{−∞}^{x} dy e^{−y²/2} / √(2π)   (53)

is the cumulative Gaussian distribution. Changing the integration variable to x = η/(√2 σ_η), one obtains

−B = (√2 σ_η / (p √π)) ∫ dx x exp(−x²) Φ((Q − √2 σ_η x)/σ_β).   (54)

Integrating by parts and omitting technical computations, we obtain

B = σ_η² / (p √(2π) σ_β √b) exp(−a/b),   (55)

where a = Q²/(2σ_β²) and b = 1 + σ_η²/σ_β². Setting

Q = √2 σ_β q,   q = erf⁻¹(2p − 1),   (56)

we obtain

B = σ_η / (p √(2π(1 + σ_β²/σ_η²))) exp(−q²/(1 + σ_η²/σ_β²)),   (57)

from which

β_low factor ≈ −σ_η / (2p √(2π(1 + σ_β²/σ_η²))) exp(−q²/(1 + σ_η²/σ_β²)).   (58)

From the data for the USA, we estimate the standard deviation of the measured beta, σ_β, the volatility of the stock index, σ_I, the volatility of the low volatility factor, and the ratio ⟨σ_i⟩/σ_I. Setting λ_β = 1/90, we obtain σ_η from Eq. (42). For p = 0.3 (the bottom 30%), we obtain q = erf⁻¹(−0.4) ≈ −0.37 and, thus, β_low factor from Eq. (58). Finally, we deduce the corresponding correlation ρ_low factor of the factor with the index.

B Construction of the beta-neutral factors
We implement the four most popular strategies as four beta-neutral factors constructed as follows. First, we split all stocks into six clusters of sectors of similar sizes to minimize sectorial correlations. For each trading day, the stocks of the chosen cluster are sorted according to the indicator (e.g., the capitalization) available the day before (we use the publication date and not the valuation date). The related indicator-based factor is formed by buying the first pN stocks in the sorted list and shorting the last pN stocks, where N is the number of stocks in the considered cluster and p is a chosen quantile level. As described in Sec. 3.2, we use one quantile level for the short-term reversal and long-term momentum factors and another, p = 0.3, for the capitalization and low volatility factors. The other stocks (with intermediate indicator values) are not included (their weight is set to 0). To reduce the specific risk, the weights of the selected stocks are set inversely proportional to the stock's volatility σ_i, whereas the weights of the remaining stocks are 0. Moreover, the inverse stock volatility is capped to limit the impact of extreme specific risk. For each trading day, we recompute the weight w_i as follows:

w_i = +μ₊ min{1, σ_mean/σ_i},  if i belongs to the first pN stocks in the sorted list,
w_i = −μ₋ min{1, σ_mean/σ_i},  if i belongs to the last pN stocks in the sorted list,
w_i = 0,  otherwise,   (59)

where σ_mean = (1/N)(σ_1 + … + σ_N) is the mean estimated volatility over the cluster of sectors. In this manner, the weights of low-volatility stocks are capped to avoid strongly unbalanced portfolios concentrated in such stocks.
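A minimal sketch of this construction, including the beta-neutral rescaling that the multipliers μ± provide. The cap of 1 on the inverse-volatility term and the leg normalization μ = 1/(2pN) are assumptions chosen so that the absolute weights sum to at most 1; the thesis's exact constants may differ, and the indicator is sorted in descending order here:

```python
import numpy as np

def factor_weights(indicator, sigma, beta, p=0.3):
    """Sketch of the beta-neutral factor weights for one cluster of sectors.

    indicator, sigma, beta: arrays over the N stocks of the cluster.
    The cap (1) and normalization mu = 1/(2 p N) are illustrative choices
    that keep sum(|w|) <= 1.
    """
    n = len(indicator)
    k = int(p * n)                           # number of stocks per leg
    order = np.argsort(indicator)[::-1]      # sort by the indicator, descending
    top, bottom = order[:k], order[-k:]

    inv_vol = np.minimum(1.0, sigma.mean() / sigma)  # capped inverse volatility
    mu = 1.0 / (2 * k)
    w = np.zeros(n)
    w[top] = mu * inv_vol[top]
    w[bottom] = -mu * inv_vol[bottom]

    # Rescale the leg with the larger aggregated beta so that sum(beta*w) = 0.
    beta_long = beta[top] @ w[top]
    beta_short = -(beta[bottom] @ w[bottom])
    if beta_long > beta_short:
        w[top] *= beta_short / beta_long
    else:
        w[bottom] *= beta_long / beta_short
    return w
```

The full factor would aggregate such weights over the six clusters; only one leg is ever shrunk, so the gross exposure never exceeds the unscaled bound.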
The two common multipliers, μ±, are used to ensure the beta market-neutral condition

∑_{i=1}^{N} β_i w_i = 0,   (60)

where β_i is the sensitivity of stock i to the market, obtained either by OLS or by our reactive method. In every case, the method to estimate beta uses rolling daily returns and only past information, to avoid look-ahead bias. If the aggregated sensitivity of the long part of the portfolio to the market is higher than that of the short part, its weight is reduced by the common multiplier μ₊ < 1/(2pN), which is obtained from Eq. (60) by setting μ₋ = 1/(2pN) (which implies that the sum of absolute weights ∑|w_i| does not exceed 1). In the opposite situation (when the short part of the portfolio has the higher aggregated beta), one sets μ₊ = 1/(2pN) and determines the reducing multiplier μ₋ < 1/(2pN) from Eq. (60). The resulting factor is obtained by aggregating the weights constructed for each supersector. We emphasize that the factors are constructed on a daily basis, i.e., the weights are re-evaluated daily based on updated indicators. However, most indicators do not change frequently, so the transaction costs related to rebalancing the factors are not significant.

C Description of alternative methods
C.1 Unconditional beta
The theory.
Chan and Lakonishok (1992) produce an empirical analysis of various robust methods for estimating a constant beta, as an alternative to ordinary least squares (OLS). Their method builds on the work of Koenker and Bassett (1978), which provides robust alternatives to the sample mean using linear combinations of order statistics, in order to handle non-Gaussian errors, which are the source of outliers. Instead of minimizing the sum of squared residuals, they consider an estimator based on minimizing a criterion with a penalty function ρ_θ on the residuals ε:

∑_{t=1}^{T} ρ_θ(ε_t),   (61)

where ρ_θ(ε_t) = θ|ε_t| if ε_t ≥ 0, and (1 − θ)|ε_t| if ε_t < 0, with 0 < θ < 1.

Chan and Lakonishok (1992) minimize the sum of absolute deviations of the residuals ε_it from the market model instead of the sum of squared deviations. The resulting minimum absolute deviations (MAD) estimator of the regression parameters corresponds to the special case θ = 1/2, where half of the observations lie above the regression line and half lie below. More generally, large or small values of the weight θ attach a penalty to observations with large positive or negative residuals. Varying θ between 0 and 1 yields a set of regression quantile estimates β̂(θ) analogous to the quantiles of a sample of data. However, they recognize that MAD is not a clearly superior method and suggest that it may be improved via linear combinations of sample quantiles such as trimmed means. For that reason, Chan and Lakonishok (1992) test different combinations of regression quantiles serving as the basis for the robust estimators.
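A minimal numerical sketch of the criterion (61), assuming a market model with an intercept and using a generic optimizer rather than the linear-programming formulation of Koenker and Bassett; `beta_trimean` is one example of a weighted combination of regression quantiles:

```python
import numpy as np
from scipy.optimize import minimize

def beta_quantile(r_stock, r_index, theta=0.5):
    """Regression-quantile beta for the market model with an intercept,
    minimizing the asymmetric absolute criterion (61).
    theta = 1/2 gives the minimum absolute deviations (MAD) beta."""
    def criterion(params):
        alpha, beta = params
        eps = r_stock - alpha - beta * r_index
        return np.sum(np.where(eps >= 0.0, theta, 1.0 - theta) * np.abs(eps))
    # The objective is convex and piecewise linear; Nelder-Mead is a simple,
    # derivative-free choice that handles the non-smooth kinks.
    res = minimize(criterion, x0=np.array([0.0, 1.0]), method="Nelder-Mead",
                   options={"xatol": 1e-6, "fatol": 1e-8})
    return res.x[1]

def beta_trimean(r_stock, r_index):
    """Tukey-trimean-style combination of regression quantiles."""
    return (0.25 * beta_quantile(r_stock, r_index, 0.25)
            + 0.50 * beta_quantile(r_stock, r_index, 0.50)
            + 0.25 * beta_quantile(r_stock, r_index, 0.75))
```

With i.i.d. errors, the slope of every regression quantile is consistent for the same beta (the quantiles differ only through the intercept), which is why such combinations are legitimate beta estimators.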
They discuss the general case of the trimmed regression quantile (TRQ), given as a weighted average of the regression quantile statistics:

β̂_α = (1 − 2α)⁻¹ ∫_α^{1−α} β̂(θ) dθ,   (62)

where 0 < α < 1/2 and 0 < θ < 1.

More specifically, Chan and Lakonishok (1992) suggest a more straightforward and equivalent method that considers estimators that are finite linear combinations of regression quantiles (QR) and are computationally simpler:

β_ω = ∑_{i=1}^{N} ω_i β̂(θ_i),   (63)

where the weights satisfy 0 < ω_i < 1, i = 1, …, N, and ∑_{i=1}^{N} ω_i = 1. A specific case of this weighted average is given by Tukey's trimean (TRM) estimator:

β̂_TRM = 0.25 β̂(1/4) + 0.5 β̂(1/2) + 0.25 β̂(3/4).   (64)

The application.
Their analysis is based mainly on simulated return data, although they add some tests with actual return data. The main advantages of a simulation are that the true values of the underlying parameters are known and that the extent of the departure from normality can be controlled. They begin with a baseline simulation of 25,000 replications using data generated from a normal distribution. They also consider the case where the residual term is drawn from a Student distribution with three degrees of freedom, in order to reproduce the leptokurtosis observed in daily return data. We follow the same methodology to assess the quality of the OLS, MAD and TRM beta estimators using Gaussian and Student-t residuals in the seven types of Monte Carlo simulations (MC1, …, MC7). To replicate the exponential weighting scheme of the reactive model (λ_β = 1/90), Eq. (61) is replaced by

∑_{t=1}^{T} (1 − λ_β)^{T−t} ρ_θ(ε_t).   (65)

C.2 Conditional Beta
The theory.
The first application of a time-varying beta was proposed by Bollerslev et al. (1988), where the beta was computed as the ratio of the conditional covariance to the conditional variance. Engle (2002) generalizes Bollerslev (1990)'s constant correlation model by making the conditional correlation matrix time-dependent with the Dynamic Conditional Correlation (DCC) model, which constrains the time-varying conditional correlation matrix to be positive definite and the number of parameters to grow linearly, via a two-step procedure. The first step estimates the GARCH variances univariately; these parameter estimates are then held constant. The second stage is estimated conditional on the parameters estimated in the first stage.

Hereafter, we extend the modeling of the DCC beta by including an asymmetric term in the conditional variance equation. For the asymmetry in the conditional variance, we select the GJR-GARCH(1,1) specification of Glosten et al. (1993), which assumes a specific parametric form with a leverage effect in the conditional variance (DCC-GJR beta). The basic idea is that negative shocks at period t − 1 have a stronger impact on the conditional variance at period t than positive shocks. Note that even though the conditional distribution is Gaussian, the corresponding unconditional distribution can still present excess kurtosis. We select the ADCC model of Cappiello et al. (2006) to incorporate asymmetry in the correlation. The case mixing asymmetry in the variance equation (GJR-GARCH) and in the correlation equation (ADCC) is examined (ADCC-GJR-GARCH). In our paper, the symmetric GARCH DCC will be called simply DCC, and the asymmetric ADCC-GJR will be called simply ADCC.

There is a rich literature documenting the existence of asymmetry in correlation, particularly during bear markets. To cite a few examples, Ang and Bekaert (2000) find evidence of a high-volatility, high-correlation regime that tends to coincide with bear markets. Longin and Solnik (2001) find that the correlation among large negative returns is much larger than the correlation among large positive returns. Forbes and Rigobon (2002) warn that the correlation can increase only because the volatility increases, even if the beta remains constant. To that end, there has been a controversy in the literature on the statistical significance of such an asymmetry. For this purpose, Ang and Chen (2002) develop a summary statistic that quantifies the degree of asymmetry in correlations across downside and upside markets relative to a particular model. They find that stocks of small firms, value firms, or low-past-return firms tend to exhibit greater asymmetric correlations. Hong, Tu and Zhou (2006) extend the Ang and Chen (2002) analysis to a model-free approach, so that if symmetry is rejected, then the data cannot be modeled by any symmetric distribution. They find that the betas can be asymmetric even if there is no asymmetry in the correlation. They also find strong evidence of asymmetries for both size and momentum portfolios, but no evidence for book-to-market portfolios. Jiang, Wu and Zhou (2018) extend the correlation-based test approach of Hong, Tu and Zhou (2006) and find that asymmetry is much more pervasive than previously thought. Indeed, they address asymmetry beyond the second moment, as the correlation coefficient is a measure of the linear dependence, captured by the market beta, between individual stock returns and the market portfolio return. In contrast to Hong, Tu and Zhou (2006), they find evidence of asymmetry in some portfolios sorted by the book-to-market ratio.

Let us consider r_i and r_I as the returns of a single stock and the stock index, respectively. We assume that their respective conditional variances follow a (GJR-)GARCH(1,1) specification. The stock return r_i is defined by its conditional volatility, σ_i, and a zero-mean white noise ξ_i(t):

r_i(t) = σ_i(t − 1) ξ_i(t).   (66)

The conditional variance specification of the stock return is the following:

σ_i²(t) = (1 − a − b − γ/2) σ̃_i² + a σ_i²(t − 1)[ξ_i(t)]² + b σ_i²(t − 1) + γ σ_i²(t − 1)[ξ_i⁻(t)]²,   (67)

where σ̃_i is the unconditional volatility, and a, b, and γ are parameters reflecting, respectively, the ARCH, GARCH and asymmetry effects. When γ = 0, the specification collapses to a GARCH model; otherwise, it stands for the GJR-GARCH model, where the asymmetric term is defined as ξ_i⁻(t) = ξ_i(t) if ξ_i(t) < 0, and ξ_i⁻(t) = 0 otherwise.

The stock index return r_I is defined by its conditional volatility, σ_I, and a zero-mean white noise ξ_I(t) that is correlated with ξ_i(t):

r_I(t) = σ_I(t − 1) ξ_I(t).   (68)

The conditional variance specification of the stock index return is the following:

σ_I²(t) = (1 − a − b − γ/2) σ̃_I² + a σ_I²(t − 1)[ξ_I(t)]² + b σ_I²(t − 1) + γ σ_I²(t − 1)[ξ_I⁻(t)]².   (69)

We define the normalized conditional variance diagonal terms as follows:

q_ii(t) = (1 − a_ρ − b_ρ − γ_ρ/2) + a_ρ ξ_i(t − 1) ξ_i(t − 1) + b_ρ q_ii(t − 1) + γ_ρ ξ_i⁻(t − 1) ξ_i⁻(t − 1),   (70)
q_II(t) = (1 − a_ρ − b_ρ − γ_ρ/2) + a_ρ ξ_I(t − 1) ξ_I(t − 1) + b_ρ q_II(t − 1) + γ_ρ ξ_I⁻(t − 1) ξ_I⁻(t − 1).   (71)

The normalized conditional covariance term q_iI(t) is given by:

q_iI(t) = (1 − a_ρ − b_ρ − γ_ρ/2) ρ̃ + a_ρ ξ_i(t − 1) ξ_I(t − 1) + b_ρ q_iI(t − 1) + γ_ρ ξ_i⁻(t − 1) ξ_I⁻(t − 1).   (72)

When γ_ρ = 0, the specification collapses to a DCC model; otherwise, it stands for the ADCC model, with the asymmetric terms defined as above. The conditional correlation between ξ_I(t + 1) and ξ_i(t + 1) is then updated by:

ρ_iI(t) = q_iI(t) / √(q_II(t) q_ii(t)).   (73)

The beta DCC and beta ADCC estimates are defined in the same way:

β_DCC(t) = ρ_iI(t) σ_i(t) / σ_I(t).   (74)

The log-likelihood function is optimized to calibrate the parameters ρ̃, σ̃_I and σ̃_i for estimation:

L_DCC = (1/2) ∑_{t=1}^{T} (L_V(t) + L_C(t)),   (75)
L_V(t) = −2 log(2π) − ξ_I²(t) − ξ_i²(t) − 2 log(σ_I(t)) − 2 log(σ_i(t)),   (76)
L_C(t) = −log(det(R(t))) − U'(t) R(t)⁻¹ U(t) + U'(t) U(t),   (77)

with det the determinant of a matrix, and

R(t) = [1, ρ_iI(t); ρ_iI(t), 1],   U(t) = [ξ_i(t); ξ_I(t)].   (78)

The application.
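For reference, the (GJR-)GARCH and (A)DCC recursions, Eqs. (66)-(74), are straightforward to simulate. The sketch below uses illustrative parameter values, not the calibrated US estimates; setting the asymmetry parameters to zero recovers the symmetric DCC case:

```python
import numpy as np

def simulate_dcc_gjr(T=2000, a=0.05, b=0.90, g=0.05,
                     a_r=0.02, b_r=0.95, g_r=0.01,
                     s_i=0.02, s_I=0.01, rho_bar=0.5, seed=0):
    """Sketch of the (GJR-)GARCH(1,1) + (A)DCC recursions, Eqs. (66)-(74).
    All parameter values are illustrative. g = g_r = 0 gives the symmetric
    GARCH + DCC case; returns the path of the conditional DCC beta."""
    rng = np.random.default_rng(seed)
    var_i, var_I = s_i ** 2, s_I ** 2    # conditional variances
    q_ii = q_II = 1.0                    # normalized diagonal terms
    q_iI = rho_bar                       # normalized covariance term
    xi_i = xi_I = 0.0                    # standardized shocks
    betas = np.empty(T)
    for t in range(T):
        neg_i, neg_I = min(xi_i, 0.0), min(xi_I, 0.0)
        # GJR-GARCH variance updates, Eqs. (67) and (69)
        var_i = (1 - a - b - g / 2) * s_i ** 2 + var_i * (a * xi_i ** 2 + b + g * neg_i ** 2)
        var_I = (1 - a - b - g / 2) * s_I ** 2 + var_I * (a * xi_I ** 2 + b + g * neg_I ** 2)
        # (A)DCC updates, Eqs. (70)-(72)
        c = 1 - a_r - b_r - g_r / 2
        q_ii = c + a_r * xi_i ** 2 + b_r * q_ii + g_r * neg_i ** 2
        q_II = c + a_r * xi_I ** 2 + b_r * q_II + g_r * neg_I ** 2
        q_iI = c * rho_bar + a_r * xi_i * xi_I + b_r * q_iI + g_r * neg_i * neg_I
        rho = q_iI / np.sqrt(q_ii * q_II)           # Eq. (73)
        betas[t] = rho * np.sqrt(var_i / var_I)     # Eq. (74)
        # draw the next pair of correlated standardized shocks
        z1, z2 = rng.standard_normal(2)
        xi_I = z1
        xi_i = rho * z1 + np.sqrt(max(0.0, 1.0 - rho ** 2)) * z2
    return betas
```

Because the update of the normalized matrix is a positive combination of positive semi-definite terms, the implied correlation stays in [−1, 1] along the path.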
For Monte Carlo simulation purposes:

• ξ_i(t) is either generated randomly in MC6 and MC7 according to a standard Gaussian, or measured through the returns r_i(t) and σ_i(t − 1) for the beta DCC estimation.
• γ = 0 for MC6 and the beta DCC estimation, but γ > 0 for MC7 and the beta ADCC, which captures the asymmetry term of the GJR-GARCH.
• ξ_I(t) is either generated randomly in MC6 and MC7 according to a standard Gaussian random variable correlated with ξ_i(t) (the correlation between ξ_i(t) and ξ_I(t) is ρ_iI(t − 1)), or measured through the returns r_I(t) and σ_I(t − 1) for the beta DCC estimation.
• γ_ρ = 0 for MC6 and the beta DCC, but γ_ρ > 0 for MC7 and the beta ADCC, which captures the asymmetry term of the ADCC.

The fixed parameters that are supposed to be known when testing the beta DCC are set to US market estimates from Sheppard (2017):

• fixed parameters for the univariate symmetric GARCH(1,1) process (MC6, i.e., DCC): b is the decay coefficient, and 1/(1 − b) is related to the number of days the process needs to mean revert; a describes the level of the volatility of the volatility.
• fixed parameters for the univariate asymmetric GJR-GARCH(1,1,1) process (MC7, i.e., ADCC): b is the decay coefficient, and 1/(1 − b) is related to the number of days the process needs to mean revert; a + γ/2 describes the level of the volatility of the volatility; γ describes the asymmetry.

The fixed parameters that are supposed to be known when testing the DCC and ADCC betas are set to US market estimates from Cappiello et al. (2006):

• fixed parameters for the symmetric cross-term process (MC6, i.e., DCC): b_ρ is the decay coefficient and is linked to the relaxation time; a_ρ describes the level of the volatility of the correlation.
• fixed parameters for the asymmetric cross-term process (MC7, i.e., ADCC): b_ρ is the decay coefficient and is linked to the relaxation time; a_ρ + γ_ρ/2 describes the level of the volatility of the correlation; γ_ρ describes the asymmetry.

The fixed parameters that are not known when testing the DCC beta and are estimated through the optimization of the log-likelihood are set by MC simulation to the unconditional correlation ρ̃, the unconditional stock index volatility σ̃_I, and the unconditional single stock volatility σ̃_i.

To replicate the exponential weighting scheme of the reactive model (λ_β = 1/90), Eq. (75) is replaced by

L_DCC = (1/2) ∑_{t=1}^{T} (1 − λ_β)^{T−t} (L_V(t) + L_C(t)).   (79)

The Model of Diffusion of Correlations between Securities
1. The results of this chapter were obtained in collaboration with Stanislav Kuperstein.
June 7, 2019
Abstract

The measurement of the diffusion of the correlation matrix between securities is almost impossible, as the diffusion of the population correlation is hidden by measurement noise. The use of five-minute returns and the reduction of the dimension of the matrix from 500 single stocks to the 23 main market-neutral risk factors, by taking into account financial information, allow us to measure some diffusion patterns. The empirical distribution of the eigenvalues of the increments of the correlation matrix is estimated, and the deformation of that distribution with the time scale is also studied. We introduce an alternative model based on a stochastic equation governing the volatilities of the risk factors. The non-orthogonality of the factors generates an interesting behavior and makes it possible to fit the empirical diffusion patterns of the eigenvectors: the eigenvectors of the matrix tend to be invested at time t in the risk factors that are the most volatile at time t and therefore diffuse as well, due to the endless rotation of the most volatile factors.
The measurement of the diffusion of the correlation matrix between securities is almost impossible, as the diffusion is hidden by measurement noise. Allez and Bouchaud (2012) show that the empirical main eigenvector oscillates around the population one and that the angle of oscillation is related to the ratio between the second and the first eigenvalue and to the ratio between the size of the matrix and the number of independent returns per instrument. We could extend this rule to the other eigenvectors, and we expect them to be measured with more and more noise as their eigenvalues become closer to those of their neighbors. As a result, the eigenvectors with small eigenvalues are not well defined or are chaotic. Valeyre et al. (2018) introduce a powerful method that takes advantage of financial information to filter the measurement noise by reducing the dimension of the correlation matrix of single stock returns to 24 main risk factors, among which 23 are market neutral. This method reproduces the main eigenvectors orthogonal to the market mode and their dynamics. We exclude the market mode from the analysis, as the first eigenvector, with its high eigenvalue, appears to be less noisy and less random; we therefore focus the study on the market-neutral subspace. When the method introduced by Valeyre et al. (2018) is applied with 5-minute returns to estimate the correlation matrix, we can properly measure the weekly variation of the correlation. This enables us to test, for the first time, how well the different mainstream and theoretical models of the literature, which describe how the population covariance matrix can change and be stochastic, reproduce the empirical statistics of these variations.

The Wishart process is a mainstream tool to model the dynamics of the covariance matrix.
This process can be interpreted as the "square" of a matrix of Brownian motions or, in its stationary version, as the square of a matrix of Ornstein-Uhlenbeck processes. Bru (1991) derived the stochastic equation that describes the dynamics of the eigenvalues, which repulse each other. Their distribution follows the Marčenko-Pastur law for high-dimensional matrices.

Motivated by the pricing of multi-asset options or by default intensities, Cuchiero et al. (2011) analyzed the foundation of stochastic continuous affine processes on the space of covariance matrices. The Wishart process in fact extends the Feller diffusion from one dimension to several. Gourieroux (2007) introduced a mean-reversion term and extended the process of Cox, Ingersoll and Ross (1985) from one dimension to several; the Cox-Ingersoll-Ross process is very popular in finance to model the dynamics of interest rates or of the volatility of single stocks. In that way, Gourieroux (2007) could properly model the risk of a portfolio while taking into account the risk that the correlation could change. Similarly, Fonseca, Grasselli and Tebaldi (2008) extended to several dimensions, in order to price basket options, the model of Heston (1993), where the volatility of the Brownian process that drives prices is stochastic and modeled by a CIR process.

Other stochastic matrices are very well documented. Ahdida and Alfonsi (2013) worked on a mean-reverting process on correlation matrices through the Wright-Fisher diffusion. Plenty of algorithms have also been documented to generate random walks on the ensemble of rotation matrices, which can be used to describe directly the diffusion of the eigenvectors of the correlation matrix. As an example, the walk of Kac (1959) is a very efficient algorithm that generates random paths, but it has no mean-reversion component, so that after a while the matrix loses its connection with the initial matrix.
Gaussian matrices have also been very well studied. The distribution of the eigenvalues of a symmetric Gaussian matrix follows the well-known Wigner semi-circle law. This is an important point: if the increment of a covariance or correlation matrix were well modeled by a Gaussian matrix, we would expect the distribution of the eigenvalues of the increment to be close to the semi-circle law.

In this paper, we first define in Section 2 the empirical diffusion patterns we want to reproduce. We then introduce our diffusion model in Section 3, and present how well the model captures the empirical patterns in Section 4. We finally compare with the results obtained with the mainstream models from the literature in Section 5.
Valeyre et al. (2018) proposed a practical solution to filter the noise of the correlation matrix between single stocks, noted C(t), by reducing the size of the correlation matrix from 500 or more single stocks to 24 major risk factors that reproduce the largest eigenvalues and their dynamics. These factors were named "Fundamental Maximum Variance Market Neutral" portfolios, as their construction was optimized to capture the empirical eigenvalues as well as possible.

We use the same data and factors as in Valeyre et al. (2018): we selected the 500 most liquid stocks of the US stock market from 2013 to 2018 and the K = 23 most popular market-neutral factors according to the literature (dividend yield, capitalization, volume/capitalization, STR, momentum, beta, leverage, sales to price, book to price, cash to price, price to earnings, growth of earnings, sensitivity to Euro-dollar, sensitivity to 10-year rates, energy, finance, IT, utilities, consumer, industry, pharmacy, consumer discretionary vs. staples, REITs). We thus exclude the market mode from the analysis and select only market-neutral factors.

Instead of analyzing the diffusion of the large and noisy matrix C(t), we analyze the diffusion of its proxy, the reduced matrix C_p(t) of Eq. (1), introduced in Eq. (25) of Valeyre et al. (2018). C_p(t) depends on the reduced matrices h(t) and γ(t), which can be interpreted as the overlap between the K factor positions and as the covariance between the K factor returns. They can be estimated accurately using 5-minute returns, based on Eq. (89) of Valeyre et al. (2018):

    C_p(t) = γ^{-1/2}(t) h(t) γ^{-1/2}(t).    (1)

Valeyre et al. (2018) argue that the main eigenvalues of C_p(t), measured from 5-minute returns with a lookback of one week, are very close to the main empirical eigenvalues of the correlation matrix of single stocks, and that the dynamics of the eigenvalues are also well reproduced.
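As a minimal illustration of the reduction of Eq. (1), the sketch below computes the proxy C_p from the overlap matrix γ and the factor covariance h via an inverse matrix square root. This is our own illustrative code, not the authors' implementation; the function names and the toy matrices are ours.

```python
import numpy as np

def inv_sqrt(m):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    return v @ np.diag(1.0 / np.sqrt(w)) @ v.T

def reduced_correlation(h, gamma):
    """C_p = gamma^{-1/2} h gamma^{-1/2} (Eq. 1), for K x K matrices h and gamma."""
    g = inv_sqrt(gamma)
    return g @ h @ g
```

For instance, when γ is diagonal, the diagonal of C_p is h_ii/γ_ii, which is the "factor correlation level" used later in the paper.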
Therefore C_p(t) can be used as a good proxy of C(t).

In this subsection, we introduce some diffusion patterns that we believe are important to reproduce. As the diffusion of the eigenvalues of the correlation matrix is already well documented in the literature, we focus on the diffusion of the eigenvectors. From the literature we only know that Allez and Bouchaud (2012) model the stability of the subspace generated by successive eigenvectors, based on the overlap between the new subspace and the old one; but they assumed that the population correlation matrix was constant and that measurement noise alone could explain the diffusion of the empirical eigenvectors.

We define the matrix O(t) of the eigenvectors that diagonalize C_p(t). These eigenvectors can be interpreted as the constrained and filtered eigenvectors of the large correlation matrix between single stocks. We have O^T(t) C_p(t) O(t) = Ω(t), where the diagonal matrix Ω(t) contains the eigenvalues of C_p(t), which can be interpreted as the constrained eigenvalues of the large correlation matrix between single stocks.

We define S(t, τ) in Eq. (2) as the increment of the correlation matrix corresponding to the time increment τ, expressed in the basis generated by the initial eigenvectors O(t). We define S̃(t, τ) in Eq. (3) as a tilted version that removes the impact of the change in eigenvalues Ω(t + τ) − Ω(t) over the time increment τ, so that the measurement is sensitive only to the diffusion of the eigenvectors of single stocks and not to the diffusion of the eigenvalues. We define Ŝ(t, τ) in Eq. (4) as a tilted version that removes the impact of the eigenvalues Ω(t) in the weighting, so that the measurement gives the same weight to all eigenvectors (major or minor ones).
    S(t, τ) = O^T(t) (C_p(t + τ) − C_p(t)) O(t),    (2)

    S̃(t, τ) = (Ω(t)/Ω(t + τ))^{1/2} O^T(t) C_p(t + τ) O(t) (Ω(t)/Ω(t + τ))^{1/2} − O^T(t) C_p(t) O(t),    (3)

    Ŝ(t, τ) = CorrCov( O^T(t) C_p(t + τ) O(t) ) − Id,    (4)

where CorrCov(·) denotes the operator that normalizes a covariance matrix into a correlation matrix. The matrices S(t, τ), S̃(t, τ) and Ŝ(t, τ) quantify whether the "old" eigenvectors O(t) are still close to being the "new" eigenvectors O(t + τ). S(t, τ) measures the way the portfolios that were initially fixed as eigenvectors become correlated with each other as time τ elapses. We can interpret the eigenvalues of S(t, τ) as a measure of how the eigenvectors of the proxy of the large correlation matrix between single stocks diffuse. The direct approach, which would have consisted in measuring the distance between old and new eigenvectors, would not have made sense, as eigenvectors with small eigenvalues are close to chaotic and change dramatically as τ changes a little, while the eigenvalues of S remain empirically continuous in τ.

More precisely, since the eigenvalues of S(t, τ) are also simply the eigenvalues of C_p(t + τ) − C_p(t), they can be interpreted as the eigenvalues of the increments of the large but cleaned correlation matrix between single stocks. The eigenvalues of Ŝ(t, τ) are also simply the eigenvalues of the change in correlation between the main risk factors.
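The increments of Eqs. (2) and (4) can be sketched as follows from two snapshots of the reduced proxy. This is our own sketch (the tilted version of Eq. (3) is omitted for brevity); the function name is ours.

```python
import numpy as np

def increment_matrices(Cp_t, Cp_tau):
    """Given two snapshots of the reduced correlation proxy, return the
    increment S (Eq. 2) and the equal-weight version S_hat (Eq. 4), both
    expressed in the eigenbasis O(t) of the first snapshot."""
    w, O = np.linalg.eigh(Cp_t)              # Cp_t = O diag(w) O^T
    S = O.T @ (Cp_tau - Cp_t) @ O            # Eq. (2)
    M = O.T @ Cp_tau @ O                     # "new" matrix in the old basis
    d = 1.0 / np.sqrt(np.diag(M))            # CorrCov normalization
    S_hat = d[:, None] * M * d[None, :] - np.eye(len(w))   # Eq. (4)
    return S, S_hat
```

When nothing happens between the two dates (Cp_tau equal to Cp_t), both increments vanish, which is the sanity check used below.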
Ŝ(t, τ) is particularly interesting and could reveal peculiar properties: as Ŝ(t, τ) is rather homogeneous and random, it could be close to a symmetric Gaussian matrix, and we can expect the distribution of its eigenvalues to look like a deformed version of the Wigner semi-circle law, without any tails or singularities.

We define two diffusion patterns, to be measured from real data, that we want to reproduce with stochastic models, for each of S, S̃ and Ŝ:

• |λ|_max(τ) is the largest eigenvalue in absolute value of S(t, τ), S̃(t, τ) or Ŝ(t, τ), depending on the time scale of the increment τ and averaged over the different t. The increase of |λ|_max(τ) with τ describes the way the portfolios that were initially set as eigenvectors become more and more correlated; |λ|_max(τ) converges toward an asymptotic value when τ tends to ∞, with a certain relaxation time;

• ρ_τ(λ) is the distribution of the eigenvalues of S(t, τ), S̃(t, τ) or Ŝ(t, τ) over all t. The shape of the distribution depends on τ, the time scale of the increment. The diffusion cannot be reduced to the largest eigenvalue |λ|_max(τ), and it is important to consider the distribution of all eigenvalues. Indeed, it matters whether the risk of a correlation change can be extreme and localized, concentrated in a single factor; that is, whether the distribution ρ_τ(λ) has tails, and whether such tails result from the distribution of the eigenvalues of the correlation matrix or from a more sophisticated phenomenon. That explains why it is interesting to measure the shape of ρ_τ(λ) for S, S̃ and Ŝ.

    λ_i(t) = h_ii(t) / γ_ii(t)    (5)

λ_1(t), ..., λ_K(t) are the time-dependent factor correlation levels, "FCL", of the K factors, which were introduced in Valeyre et al. (2018). The interpretation of Eq.
(5), which defines the "FCL", is that it corresponds to the conditional variance of the returns of the corresponding normalized risk factor. The "FCL" has very appealing properties: for example, it corresponds to the weighted average of the eigenvalues of the correlation matrix of the single-stock returns, with weights given by the squared projections of the risk factor onto the eigenvectors. Each market-neutral factor was determined with the Maximum Variance formula (Eq. 43 combined with Eq. 54 of Valeyre et al. (2018), with ν = 1), which optimizes the "FCL" and reproduces very well the empirical eigenvalues and their dynamics.

x_1(t), ..., x_K(t) are the logarithms of the ratio between λ_i(t) and λ̄_i, the unconditional "FCL" estimate (Eq. 6); according to Valeyre et al. (2018), they can be well modeled by independent Ornstein-Uhlenbeck processes (Eq. 7), where σ describes the volatility and α the inverse of the relaxation time:

    x_i(t) = ln( λ_i(t) / λ̄_i ),    (6)

    dx_i(t) = −α x_i(t) dt + σ dB_i(t).    (7)

The parameter α can be fitted from a normalized variogram, introduced in Grebenkov and Serror (2014), through Eq. (11). The left graph of Fig. 1 exhibits the normalized variogram of the daily variation of x_1(t), ..., x_K(t) for the K risk factors. The normalized variograms are fitted for most factors by an Ornstein-Uhlenbeck process with a relaxation time of 60 days. Some factors deviate, but this can be explained by the measurement noise of the method. The right graph of Fig. 1 shows the empirical eigenvalues of the unconditional correlation matrix ⟨C(t)⟩.

Figure 1: Left: normalized variogram of the daily variation of the logarithm of the FCL λ_1, ..., λ_K, using five-minute returns from 2013 to 2018. Each curve corresponds to a risk factor.
Most of the empirical measurements are fitted by the theoretical variogram obtained for an Ornstein-Uhlenbeck process with 1/α = 60. Right: eigenvalues of the empirical unconditional correlation matrix; we exclude from the analysis the first one, which corresponds to the market mode.

We define γ̄ and h̄ as the unconditional covariance and unconditional overlap matrices, whereas h(t) and γ(t) are the conditional, time-dependent matrices.

We introduce in Eq. (8) the model of diffusion of the correlation matrix, which is governed by the diffusion of the "FCL". C_p^sim(t) can be generated completely at random, so as to replicate the empirical patterns of C_p(t). Eq. (44) of Valeyre et al. (2018) describes how to generate the random unconditional matrices (h^sim, γ^sim and C̄_p^sim = (γ^sim)^{-1/2} h^sim (γ^sim)^{-1/2}) through the random selection of the K factors. They are generated randomly once, based on the eigenvalues of the empirical unconditional correlation matrix between single stocks; a summary is given in Appendix B. The unconditional empirical eigenvalues Ω̄ are based on the 5-minute data from 2013 to 2018 and are reported in Table 1; the first eigenvalue was excluded. Eq. (8) simulates the heteroscedasticity of the correlation matrix through changes in the "FCL", with the stochastic processes x_1(t), ..., x_K(t) updated step by step at random from t = 1 to T using the parameters of Eq. (7). This generates stochastic "FCL" that can be interpreted as volatilities of the K risk factors. For the numerical simulation we set T = 1071.

λ^Emp are the sample unconstrained eigenvalues of the correlation matrix between single stocks obtained from 2013-2018 with 5-minute returns; Ω̄ is set to these eigenvalues, excluding the first.

    C_p^sim(t) = diag(e^{x_1(t)}, ..., e^{x_K(t)})^{1/2} C̄_p^sim diag(e^{x_1(t)}, ..., e^{x_K(t)})^{1/2}    (8)

Fig.
2 displays the time-scale dependency of |λ|_max(τ) for the empirical S and the simulated S^sim. The simulation captures the measurement well; the fit could have been much better if the parameter σ had been optimized. The curve looks like a square-root law, which is characteristic of diffusion, but in fact it seems to converge to a value between 8 and 12 with an exponential decay and a relaxation time close to 60 days (1/α). We can interpret the asymptote between 8 and 12 as being close to the first eigenvalue of C̄_p^sim multiplied by a factor of order σ/√α. We also include |λ|_max(τ) obtained for S̃, where the impact of the diffusion of the eigenvalues Ω(t) of the correlation matrix is withdrawn from the measurement. We see only a small difference between |λ|_max(τ) obtained for S and for S̃, so the diffusion of the eigenvalues Ω(t) contributes less than the diffusion of the eigenvectors to the diffusion of the correlation matrix of single stocks. The most likely scenario explaining the diffusion of the correlation matrix is therefore a permanent rotation between the factors that matter for risk; in that scenario the distribution of the eigenvalues of the correlation matrix remains rather stable.

Fig. 3 exhibits |λ|_max(τ) for the empirical Ŝ and the simulated Ŝ^sim. At τ = 60 days |λ|_max(τ) is already substantial, but a further amount must be added to reach the likely asymptote, and another to estimate the first eigenvalue of the correlation matrix between the factors that were initially set as eigenvectors of the empirical conditional correlation matrix. Indeed, that first eigenvalue corresponds to the first eigenvalue of Id + Ŝ^sim(t, τ = ∞). It should be compared to the value of the first eigenvalue of the correlation matrix derived from the empirical unconditional γ̄^{-1/2} h̄ γ̄^{-1/2}, or to the value
that corresponds to the average of the first eigenvalue of the correlation matrix derived from the random matrix (γ^sim)^{-1/2} h^sim (γ^sim)^{-1/2}, based on a random selection of the K factors. It is therefore almost not worth trying to orthogonalize the random factors, as correlation changes will destroy a large part of the effect of the orthogonalization after 60 days or more.

Fig. 4 displays ρ_τ(λ), the empirical histogram of the eigenvalues of S and the histogram of the simulated S^sim, for different time scales τ from 1 to 30 days. The empirical and simulated histograms are very close. They have tails and are very different from the Wigner semi-circle law that we could have expected if the variations of the correlation matrix were Gaussian. The model appears to be realistic. This is confirmed by Fig. 5, obtained for S̃, which is very similar to Fig. 4, obtained for S. Fig. 6, obtained for Ŝ, which removes the impact of the eigenvalues Ω(t) in the weighting and measures the diffusion with the same weight for all eigenvectors (major or minor ones), still exhibits a
distribution with tails, and remains very different from the Wigner semi-circle law or from the pointed-hat-shaped distribution, for both the empirical and simulated cases.

Figure 2: |λ|_max(τ) based on S, S^sim, S̃ and S̃^sim. It corresponds to the largest eigenvalue in absolute value of the increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment. "Empirical S" and "Empirical S̃" are the empirical measurements; "Empirical S̃" is the case where the impact of the diffusion of the eigenvalues Ω(t) is artificially withdrawn. "Simulation S" and "Simulation S̃" were obtained with the "FCL" made stochastic with 1/α = 60. The simulation is generated by our model, which captures well the empirical measurements based on the 5-minute returns over the period 2013-2018.

To determine how well our model is adapted to reality, we compare it to other models selected among the mainstream models of the literature. We check whether the standard models of the literature can also generate histograms ρ_τ(λ), corresponding to S, S̃ and Ŝ, that are close to the empirical ones.

Feller diffusion with the Wishart process is used to model the diffusion of the covariance between the portfolios that were initially fixed as eigenvectors:

    C_p,t^sim(τ) = Ω̄^{1/2} (Id + σ B_t^T(τ) B_t(τ)) / (1 + σ τ L) Ω̄^{1/2}
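The Wishart-style benchmark can be sketched as below. This is our illustrative reconstruction (the extraction dropped some exponents, so the exact normalization is our assumption): the noise matrix B is Brownian at horizon τ, and the denominator 1 + στL makes the increment vanish on average. Parameter values are illustrative, not the fitted ones.

```python
import numpy as np

def wishart_increment(omega, sigma, L, tau, rng):
    """One draw of the Wishart-style increment
    S = Omega^{1/2} (Id + sigma B^T B)/(1 + sigma*tau*L) Omega^{1/2} - Omega,
    where B is an L x K matrix of Brownian motions at horizon tau and
    omega is the vector of unconditional eigenvalues (diagonal of Omega)."""
    K = len(omega)
    B = rng.standard_normal((L, K)) * np.sqrt(tau)   # B_ij(tau) ~ N(0, tau)
    core = (np.eye(K) + sigma * B.T @ B) / (1.0 + sigma * tau * L)
    sq = np.sqrt(omega)
    return sq[:, None] * core * sq[None, :] - np.diag(omega)
```

Since E[B^T B] = τL·Id, the expected value of the core is the identity and the increment averages to zero, which is the property checked below.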
Figure 3: |λ|_max(τ) based on Ŝ and Ŝ^sim, for both the Wishart model (the normalized version of Eq. 9) and the model governed by the stochastic "FCL". It corresponds to the largest eigenvalue in absolute value of the normalized increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment.

Here B_t(τ) is set as a Brownian matrix of size L × K with path t, as if the diffusion came only from the statistical measurement error of the correlation, as in Allez and Bouchaud (2012). Modeling the diffusion of the population eigenvectors by the measurement error of the empirical eigenvectors sounds odd. We simulated |λ|_max(τ) and ρ_τ(λ) for S̃^sim(t, τ) defined by Eq. (9):

    S̃^sim(t, τ) = Ω̄^{1/2} (Id + σ B_t^T(τ) B_t(τ)) / (1 + σ τ L) Ω̄^{1/2} − Ω̄.    (9)

Here we chose independent paths t and generated many paths t = 1, ..., T = 1071. Ω̄ is set to the empirical unconditional eigenvalues of the large correlation matrix between single stocks, reported in Table 1; the first empirical eigenvalue, which corresponds to the market mode, was excluded. L and σ were fitted to replicate approximately, without any optimization, the empirical measurement of |λ|_max(τ).

Fig. 7 shows the quality of the fit, whereas Fig. 8 exhibits how well the model reproduces the empirical histogram of the eigenvalues of S̃. When we simulate

    Ŝ^sim(t, τ) = CorrCov( Ω̄^{1/2} (Id + σ B_t^T(τ) B_t(τ)) / (1 + σ τ L) Ω̄^{1/2} ) − Id,

we see in Fig. 9 that the model cannot replicate the tails of ρ_τ(λ), the histogram of the increments, and generates only tail-free distributions close to the deformed Wigner semi-circle law.

Figure 4: ρ_τ(λ) based on S and S^sim for different time horizons τ.
The histograms correspond to the empirical distribution of the eigenvalues of the increments of the correlation matrix, for different time horizons τ of the increment. The simulation, obtained with the stochastic "FCL" (1/α = 60), captures well the measurement based on the 5-minute returns over the period 2013-2018, at every time scale.

In conclusion, the Wishart model cannot reproduce the empirical normalized change in correlation. We could have tested the large matrix C^sim of dimension N = 500 instead of C_p^sim of dimension K, but it would have generated the same disappointing results, with the inability to obtain for ρ_τ(λ) a distribution whose shape is very different from the Wigner semi-circle law.

This diffusion is used to model the diffusion of the correlation between the portfolios that were initially fixed as eigenvectors. We directly simulated the stochastic correlation matrix A(t, τ) generated by the stochastic process introduced by Ahdida and Alfonsi (2013). The different parameters were fitted to replicate approximately, without any optimization, the empirical measurement of |λ|_max(τ) (Fig. 10). We generated many paths t = 1, ..., T = 1071.

Figure 5: ρ_τ(λ) based on S̃ and S̃^sim for different time horizons τ. The histograms correspond to the empirical distribution of the eigenvalues of the increments of the correlation matrix, readjusted by the eigenvalue increments. The simulation, obtained with the stochastic "FCL" (1/α = 60), captures well the measurement based on the 5-minute returns over the period 2013-2018, at every time scale.

A(t, 0) was initialized at the identity matrix for every path, and the "mean matrix" of the mean-reversion term was also set to the identity. We also plot the histogram of the eigenvalues of S̃^sim(t, τ) defined by Eq.
(10) (Fig. 11), which looks realistic:

    S̃^sim(t, τ) = Ω̄^{1/2} A(t, τ) Ω̄^{1/2} − Ω̄.    (10)

When we simulate Ŝ^sim(t, τ) = A(t, τ) − Id, the Wright-Fisher diffusion does not prevent the distribution ρ_τ(λ) from looking like a Wigner semi-circle law (Fig. 12). In conclusion, the Wright-Fisher diffusion with a mean-reversion term cannot reproduce the empirical normalized change in correlation.

Figure 6: ρ_τ(λ) based on Ŝ and Ŝ^sim for different time horizons τ. The histograms correspond to the empirical distribution of the eigenvalues of the increments of the correlation matrix, readjusted by the eigenvalue increments. The simulation, obtained with the stochastic "FCL" (1/α = 60), captures well the measurement based on the 5-minute returns over the period 2013-2018.

A mean-reverting random walk on the ensemble of rotation matrices describes directly the diffusion of the eigenvectors of C_p^sim. We simulate directly O(t, τ), initialized at the identity matrix. We simulated |λ|_max(τ) and ρ_τ(λ) for S̃^sim(t, τ) = O^T(t, τ) Ω̄ O(t, τ) − Ω̄ for different paths t = 1, ..., T = 1071. We also test Ŝ^sim(t, τ) = CorrCov( O^T(t, τ) Ω̄ O(t, τ) ) − Id. Parameters were set to reproduce approximately, without any optimization, the empirical measurement of |λ|_max(τ), for the three following methods:

• the Gram-Schmidt algorithm described in Appendix C.1. We fit the parameters through Fig. 13, but Fig. 14 exhibits two abnormal bumps on the right and the left, and Fig. 15 exhibits an abnormal pointed-hat shape;
Figure 7: |λ|_max(τ) based on S̃ and S̃^sim. It corresponds to the largest eigenvalue in absolute value of the increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment. "Empirical S̃" is the empirical measurement; "Simulation S̃" is derived from the Wishart process of Eq. (9), based on a Gaussian matrix of dimension L × K with L = 30 and K = 23; the last parameter, σ, was fitted.

• algorithms based on the walk of Kac (1959), tuned to include a mean-reversion term, as described in Appendix C.1;

• a new stochastic differential equation, described in Appendix C.1.

In these two cases, we fit the parameters through Fig. 13 (Fig. 16), but Fig. 14 (Fig. 17) exhibits two abnormal bumps on the right and the left, and Fig. 15 (Fig. 21) exhibits an abnormal pointed-hat shape.
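The Gram-Schmidt based mean-reverting walk on SO(N) described in Appendix C.1 can be sketched as follows. This is our own illustrative implementation of the update O(t+1) = GS(O(t) + μ·I + ε·W); the helper names are ours, and the parameter values below are illustrative rather than the fitted ones.

```python
import numpy as np

def gs_rows(V):
    """Row-wise Gram-Schmidt that never reorders rows, so GS(V + dV) stays
    close to V for orthonormal V and small dV."""
    Q = V.copy()
    for i in range(len(Q)):
        for j in range(i):
            Q[i] -= (Q[i] @ Q[j]) * Q[j]
        Q[i] /= np.linalg.norm(Q[i])
    return Q

def ou_walk_step(O, mu, eps, rng):
    """One step of the mean-reverting walk O(t+1) = GS(O(t) + mu*I + eps*W):
    the mu*I term pulls the walk back toward the identity matrix."""
    N = len(O)
    W = rng.standard_normal((N, N))
    return gs_rows(O + mu * np.eye(N) + eps * W)
```

Note that the off-the-shelf QR routine cannot replace gs_rows here, for the reason given in Appendix C.1: it may mix rows, breaking the continuity of the walk.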
Classical model, in which the eigenvalues of the matrix, instead of the "FCL", are stochastic. In that model the eigenvectors are stable and do not diffuse.
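The following minimal sketch (our own, with illustrative parameters) shows why this classical benchmark produces no eigenvector diffusion: with a fixed eigenbasis and stochastic eigenvalues, the increment expressed in the initial eigenbasis stays exactly diagonal, so the off-diagonal diffusion patterns measured in Section 2 vanish by construction.

```python
import numpy as np

# Classical benchmark: C(t) = O diag(lambda(t)) O^T with a FIXED rotation O
# and stochastic eigenvalues.
rng = np.random.default_rng(1)
K = 5
O, _ = np.linalg.qr(rng.standard_normal((K, K)))     # fixed orthogonal eigenbasis
lam0 = np.linspace(3.0, 0.5, K)                      # eigenvalues at time t
lam1 = lam0 * np.exp(0.2 * rng.standard_normal(K))   # shocked eigenvalues at t + tau

C0 = O @ np.diag(lam0) @ O.T
C1 = O @ np.diag(lam1) @ O.T
S = O.T @ (C1 - C0) @ O          # increment in the initial eigenbasis (Eq. 2)
# Only the eigenvalues moved: S is diagonal, the eigenvectors did not diffuse.
off_diag = S - np.diag(np.diag(S))
```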
The measurement of the diffusion of the correlation matrix is almost impossible, as the diffusion is hidden by measurement noise. The use of five-minute returns and the reduction of the dimension of the matrix from 500 single stocks to 24 main risk factors allow us to measure some diffusion patterns. The distribution of the eigenvalues of the variation of the matrix is measured, and the deformation of the distribution with time scale is studied. The empirical patterns are not well reproduced by the standard stochastic models derived from the Wishart process or from standard random walks on rotation matrices. We introduce a new alternative model, based on a stochastic equation for the volatilities of the risk factors, that fits the empirical patterns. The eigenvectors of the matrix tend to be invested in the risk factors that are the most volatile, and therefore diffuse. Mainstream models cannot capture extreme changes of correlation localized in one direction. Our alternative model appears more robust and realistic.

Figure 8: ρ_τ(λ) based on S̃ and S̃^sim using the Wishart model (Eq. 9) for different time horizons τ. The histograms correspond to the empirical distribution of the eigenvalues of the increments of the correlation matrix, readjusted by the eigenvalue increments. The simulation was obtained with L = 30.

Figure 9: ρ_τ(λ) based on Ŝ and Ŝ^sim using the Wishart model (normalized version of Eq. 9) for different time horizons τ. The histograms correspond to the empirical distribution of the eigenvalues of the increments of the correlation matrix, readjusted by the eigenvalue increments. The simulation was obtained with L = 30.

A Variogram

    V_i(τ) = Var_u( l_i(u) − l_i(u − τ) ) / √( Var_u( l_i(u) − l_i(u − 1) ) · Var_u( l_i(u − τ) − l_i(u − τ − 1) ) )    (11)
where Var_u is the empirical variance over the sample period from T_1 to T_2.

B Generating C^sim(t) governed by stochastic FCL

In the paper we modelled the returns by

    r_i(t) = Σ_{j=1}^{N} √(ℓ_j) (e_j)_i ε_j(t),    (12)

where the ε_j(t) are T × N standard normal random variables, the e_j form a random orthonormal basis (the columns of an SO(N) matrix), and

    ℓ_i = λ_i^Emp for i ≤ K,    ℓ_i = ( N − Σ_{a=1}^{K} λ_a^Emp ) / ( N − K ) for i > K.    (13)

For large N the covariance matrix of the r_i(t), noted H, is close to the correlation matrix obtained from the same returns; in other words, H_ii ≈ 1. Next, we modelled the Maximum Variance market-neutral portfolios as

    ( ω_a^(0)⋆ )_i = Σ_{j=1}^{N} (ℓ_j)^{μ/2} (e_j)_i E_ja,    (14)

where E is an N × K matrix of standard normal random variables simulating our factor loadings, and μ is a free parameter fixed to match the observations. With these conventions the unconditional matrices become:

    h_ab = ( ω_a^(0)⋆ )^T H ω_b^(0)⋆  and  γ_ab = Σ_{i=1}^{N} ( ω_a^(0)⋆ )_i H_ii ( ω_b^(0)⋆ )_i.    (15)

These matrices are used to define

    C̄ = γ^{-1/2} h γ^{-1/2}.    (16)

Figure 11: ρ_τ(λ) based on S̃^sim using Ahdida and Alfonsi (2013) (Eq. 10) for different time horizons τ. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix. The simulation was obtained with the fitted parameters (k, a)
and with 100 iterations (paths).

Figure 12: ρ_τ(λ) based on Ŝ^sim using Ahdida and Alfonsi (2013) (normalized version of Eq. 10) for different time horizons τ. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix. The simulation was obtained with the fitted parameters (k, a) and with 100 iterations (paths).

Finally, we model the FCL dynamics by the following Ornstein-Uhlenbeck process:

    dx_a(t) = −α x_a(t) dt + σ dB_a(t).    (17)

Here the B_a(t) are independent Wiener processes, and the parameters α and σ are determined from the best match to the FCL variograms. The time variation is then mimicked by

    C^sim(t) = diag(e^{x_1(t)}, ..., e^{x_K(t)})^{1/2} C̄ diag(e^{x_1(t)}, ..., e^{x_K(t)})^{1/2}.    (18)

Figure 13: |λ|_max(τ) based on S̃^sim. It corresponds to the largest eigenvalue in absolute value of the increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment. S̃^sim is derived from a Gram-Schmidt algorithm.

Figure 14: ρ_τ(λ) based on S̃^sim using a Gram-Schmidt algorithm, for different time horizons τ. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix.

C Random Matrices from the literature
To measure the "distance" between two matrices we use the normalized version of the Hilbert-Schmidt (Frobenius) inner product, ⟨A, B⟩_HS ≡ Tr(A†B):

    d(A, B) = (1/(2N)) ⟨A − B, A − B⟩_HS.    (19)

If A is real and orthonormal (A A^T = A^T A = 1_{N×N}), while B = 1_{N×N}, then the above reduces to:

    d(A, 1_{N×N}) = 1 − Tr A / N.    (20)

The distance is therefore equal to 0 if and only if A = 1_{N×N}. Similarly, the maximal possible distance, d(A, 1_{N×N}) = 2, requires A = −1_{N×N} (notice that this is possible only for even N).

Figure 15: ρ_τ(λ) based on Ŝ^sim using a Gram-Schmidt algorithm, for different time horizons τ. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix.

C.1 (Ornstein-Uhlenbeck) random walk on SO(N)

Below we list various options to generate a random walk on SO(N) that does not depart "too far" from the identity matrix (a random walk around an arbitrary orthogonal matrix is a trivial generalisation).

C.1.1 Gram-Schmidt based algorithm

Let us denote by GS(V) the Gram-Schmidt orthogonalisation (and normalisation) procedure acting on the rows of a (square) matrix V. We require that the algorithm does not mix different rows and columns, so that GS(V + δV) is close to V if V is orthonormal and δV is sufficiently small. We can then generate the aforementioned walk with:

    O(t + 1) = GS( O(t) + μ · I + ε · W ),    (21)

where μ is the drift parameter, W is a random N × N matrix, and ε is the parameter controlling the random walk around the identity matrix I. Figure 22 presents the simulation output for N = 50 and the parameters listed in the caption.

The result seems satisfactory, but it comes with a performance cost: a hand-written Python code takes seconds for a single (!)
50 × 50 matrix and is therefore impractical for T = 70000. At the same time, the already existing numpy routine for the GS algorithm is ten times faster, but it unfortunately mixes the rows of the matrix and so cannot be used for the numerical simulation.

Figure 16: |λ|_max(τ) based on S_sim. It corresponds to the largest eigenvalue in absolute value of the increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment. S_sim is derived from the Kac walk for θ = 0. and n = 15 random 2 × 2 rotations per δt = 1.

C.2 Kac walk based approaches
To reduce the running time we have to avoid costly matrix operations (like, for instance, matrix products), since they have O(N³) complexity, and instead operate on selected rows/columns of O(t). For example, the diffusion can be modelled by a sequence of rotations

(O(t+1)_i, O(t+1)_j)^T = ((c, −s), (s, c)) (O(t)_i, O(t)_j)^T,  (22)

where c, s = cos(δ), sin(δ) for a small angle δ, and the pair of rows i, j is selected randomly. In terms of the columns of O(t), this is just the Kac walk often used to model diffusion on a sphere.

Repeating these so-called Givens rotations leads to a random walk on SO(N), though definitely with no “mean-reversion”. To introduce the drift, we somehow have to rotate O(t) back each time in order to bring it closer to the identity matrix. We tried two different approaches:

1. Identify a pair of indices i, j for which the value |O(t)_{i,j} − O(t)_{j,i}| is maximal. This corresponds to a plane where O(t) deviates the most from the identity matrix. Then rotate in this plane by a fixed portion γ of the angle needed to bring this part of O(t) maximally close to the identity matrix:

γ · arcsin[ (O_{j,i} − O_{i,j}) / √((O_{j,i} − O_{i,j})² + (O_{i,i} + O_{j,j})²) ].  (23)

Figure 17: ρ_τ(λ) based on S using the Kac walk with τ = 1, or days. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix; τ = 1, or days corresponds to the time horizon of the increment. The simulation was obtained with θ = 0., n = 15 random 2 × 2 rotations per δt = 1 and with 1000 iterations (paths) of the Kac walk.

Figure 18: ρ_τ(λ) based on S using the Kac walk with τ = 1, or days. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix; τ = 1, or days corresponds to the time horizon of the increment. The simulation was obtained with θ = 0.
, n = 15 random 2 × 2 rotations per δt = 1 and with 1000 iterations (paths) of the Kac walk.

2. Proceed as above, but select the i for which the value |O(t)_{i,i} − 1| is maximal and take the j with the largest |O(t)_{i,j} − O(t)_{j,i}|.

The two algorithms have overall four different parameters:
• The number of random consecutive Givens rotations, n_RW.
• The constant angle used for these rotations, δ.
• The number of consecutive “reversions” applied after the Givens rotations, n_Rev.
• The parameter γ needed to control the reversion/drift.

The first approach performs slightly better, but the search for the pair (i, j) has an O(N²) cost, while the latter has only O(N) complexity. Figure 23 demonstrates the N = 50 implementation of the second algorithm with the parameters described in the caption.

Figure 19: |λ|_max(τ) based on S_sim. It corresponds to the largest eigenvalue in absolute value of the increment of the proxy of the correlation matrix between single stocks, depending on τ, the time horizon of the increment. S_sim is derived from the new Stochastic Differential Equation for µ = 1. and σ = 0.

References
Ahdida, A., A. Alfonsi. “A mean-reverting SDE on correlation matrices.” Stochastic Processes and their Applications, 2013.
Allez, R., J.-P. Bouchaud. “Eigenvector dynamics: General theory and some applications.” Phys. Rev. E 86, 046202, 2012.
Bru, M.F. “Wishart processes.” J. Theoret. Probab., 4(4):725–751, 1991.
Cox, J., J. Ingersoll, S. Ross. “A theory of the term structure of interest rates.” Econometrica, 53:385–407, 1985.
Cuchiero, C., D. Filipović, E. Mayerhofer, J. Teichmann. “Affine processes on positive semidefinite matrices.” Ann. Appl. Probab., 21(2):397–463, 2011.
Fonseca, J. Da, M. Grasselli, C. Tebaldi. “Option pricing when correlations are stochastic: an analytical framework.” Springer, 2008.
Grebenkov, D., J. Serror. “Following a trend with an exponential moving average: Analytical results for a Gaussian model.” Physica A, 2014.
Gourieroux, C. “Continuous time Wishart process for stochastic risk.” Econometric Reviews, 25:2:177–217, 2007.

Figure 20: ρ_τ(λ) based on S using the new Stochastic Differential Equation with τ = 1, or days. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix; τ = 1, or days corresponds to the time horizon of the increment. The simulation was obtained for µ = 1, σ = 0., and with 1000 iterations (paths) of the Stochastic Differential Equation.

Figure 21: ρ_τ(λ) based on S using the new Stochastic Differential Equation with τ = 1, or days. The histograms correspond to the simulated distribution of the eigenvalues of the increments of the correlation matrix; τ = 1, or days corresponds to the time horizon of the increment. The simulation was obtained for µ = 1, σ = 0.
, and with 1000 iterations (paths) of the Stochastic Differential Equation.

Heston, S.L. “A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options.” The Review of Financial Studies, 6(2):327–343, 1993. doi:10.1093/rfs/6.2.327. JSTOR 2962057.
Kac, M. Probability and Related Topics in Physical Sciences. American Mathematical Soc., 1959.
Valeyre, S., S. Kuperstein, D. S. Grebenkov, S. Aboura. “The market neutral fundamental maximum variance portfolios.” Working Paper, 2018.

Figure 22: The OU walk defined by (21) for N = 50 and (µ, ε) = (0., .). The horizontal axis is t and the vertical axis corresponds to the distance between O(t) and the identity matrix as defined by (20). The matrix O(0) is a random orthogonal matrix, so the distance is close to 1. The process starts to oscillate around Id at t = 400. Notice that d = 0. is in fact a very small distance for N = 50.

Figure 23: The horizontal axis is t and the vertical axis corresponds to the distance between O(t) and the identity matrix as defined by (20). The parameters are (n_RW, δ, n_Rev, γ) = ( , . · π, , . ). In this simulation O(0) = I.

Should Employers Pay Better their Employees? An Asset Pricing Approach

ABSTRACT
We uncover a new anomaly in asset pricing that is linked to remuneration: the more a company spends on salaries and benefits per employee, the better its stock performs, on average. Moreover, companies adopting similar remuneration policies share a common risk, which is comparable to that of the value premium. For this purpose, we set up an original methodology that uses firm financial characteristics to build factors that are less correlated than in the standard asset pricing methodology. We quantify the importance of these factors from an asset pricing perspective by introducing the factor correlation level as a directly accessible proxy of the eigenvalues of the correlation matrix. A rational explanation of the remuneration anomaly involves the positive correlation between pay and employee performance.

JEL classification: G12, G32, J30, C4
Keywords: Anomalies, Asset Pricing, Remuneration, Performance, Factor Correlation.
“The wages of labour are the encouragement of industry, which, like every other human quality, improves in proportion to the encouragement it receives. Where wages are high, accordingly, we shall always find the workmen more active, diligent, and expeditious, than where they are low.” Adam Smith (1776).

I. Introduction
Should employers pay their employees better? Although this question might appear provocative, because lowering production costs remains a cornerstone of the contemporary economy, we present the first attempt to report the real effects of employee remuneration on asset pricing. Remuneration – defined as the annual salaries and benefits expenses (e.g., wages, bonuses, pension expenses, health insurance payments, etc.) per employee – is the basis of any employment contract. For instance, pay was shown to explain, on average, 65% of the variance in evaluations of overall job attractiveness (Rynes et al., 1983). Classical theory states that profit-maximizing firms choose the level of labor pay by setting the marginal cost of labor (i.e., the wage rate) equal to the marginal revenue product of labor (i.e., the marginal benefit). Beyond this paradigm, we provide strong evidence that firms that pay their employees better tend to over-perform on the stock market.

Our objective is to examine whether remuneration is an anomaly that can be priced in asset pricing models. Schwert (2003) defines anomalies as “empirical results that seem to be inconsistent with maintained theories of asset-pricing behavior (the CAPM). They indicate either market inefficiency (profit opportunities) or inadequacies in the asset-pricing model. After they are documented and analyzed in the academic literature, anomalies often seem to disappear, reverse, or attenuate.” Anomalies are typically identified either by regressing a cross-section of average returns (e.g., the seminal Fama and MacBeth (1973) approach uses the capitalization and book-to-market values), or by using a panel regression of the cross-section of returns with different factor returns through the F-statistic (Gibbons et al., 1989), or by using a portfolio-based approach that segregates individual stocks with similar capitalization and book-to-market values into different style portfolios (Fama and French, 1993).
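The third, portfolio-based identification can be sketched with a toy characteristic sort. The data, the function name and the equal-weight construction below are hypothetical simplifications, not the Fama-French double sort:

```python
import numpy as np

def high_minus_low(characteristic, next_returns, quantile=0.3):
    """Equal-weight long-short return: long the top `quantile` of stocks
    ranked by `characteristic`, short the bottom `quantile`.
    A toy version of a characteristic-sorted anomaly test."""
    lo, hi = np.quantile(characteristic, [quantile, 1.0 - quantile])
    return (next_returns[characteristic >= hi].mean()
            - next_returns[characteristic <= lo].mean())

# Hypothetical cross-section in which the characteristic truly predicts returns.
rng = np.random.default_rng(3)
n = 500
char = rng.normal(size=n)
rets = 0.01 * char + rng.normal(0.0, 0.02, size=n)
spread = high_minus_low(char, rets)  # positive on average when the characteristic is priced
```

A persistent positive spread of this kind, unexplained by existing factors, is what qualifies a characteristic as a candidate anomaly.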
In the latter case (which we refer to as the “FF approach”), the factors formed on small minus big market capitalization portfolios (SMB) and high minus low book-to-market portfolios (HML) explain an important part of the identified anomalies (Fama and French, 1996). Over recent decades, the growing number of discovered anomalies suggests that the standard asset pricing models fail to explain much of the cross-sectional variation in average stock returns. Meanwhile, the effect of remuneration on company performance has surprisingly never been tested, despite the fact that employers pay particular attention to labor costs in attempting to maximize profits.

This research contributes empirically to the asset pricing literature by introducing an observable firm characteristic, namely remuneration, as a candidate anomaly. More precisely, we focus on remuneration as a priced factor. Indeed, it remains unclear how far remuneration can explain the cross-section of returns, despite a sizeable literature on labor economics that relates labor to asset pricing. This branch of literature has intensively investigated the impact of labor decisions on the firm's value, notably through the operating leverage, which affects the riskiness of equity returns. However, to the best of our knowledge, there are no asset pricing studies that incorporate employees' wages as a pricing factor. Besides, based on the impressive list of anomalies analyzed by Harvey et al. (2016), we find only one paper that highlights income as a potential factor. Indeed, Gomez et al. (2015) analyze the relation between U.S. census division-level labor income and the cross-section of returns using the standard Fama and French (1993) approach. More specifically, these authors use per capita personal income (from the Bureau of Economic Analysis) as a new candidate factor and conclude that the cross-section of stock returns depends on the census district in which the headquarters of the firm are located. Unfortunately, as Harvey et al.
(2016) have noted, “most of the division level labor income have a non-significant t-statistic.
We do not count their factors”. Moreover, we use remuneration at the company level to generate results that are more realistic from an asset pricing perspective, which contrasts with Gomez et al. (2015), whose scope is limited to income per state and per division.

This research also contributes theoretically to the asset pricing literature by introducing a new methodology to build factors that is conceptually close to principal component analysis (PCA) but goes beyond its noise-induced limitations. This methodology presents many advantages compared with the conventional multi-factor approach developed by Fama and French (1992, 1993). We propose a new measure of the “explanatory power” of factors in which the relevance of a factor does not depend on the number of considered factors, in contrast to the R-squared argument of the FF setting. Hence, we introduce the Factor Correlation Level (FCL) as a metric of common risks that measures the ability of stocks within the factor to fluctuate in a common way. Importantly, it allows ordering the factors according to their capacity to account for the variability of stocks, and therefore according to their importance from an asset pricing perspective. In this respect, our ranking by the FCL indicator resembles principal component analysis. At the same time, this indicator is also linked to the R-squared value of the factor in the asset pricing model: higher FCLs correspond to higher R-squared values in the asset pricing model with one factor. The empirical validation of the FCL methodology is founded on an exhaustive testing protocol.
First, we use ten factors that summarize most of the existing factors: dividend, capitalization, liquidity, momentum, low-volatility, debt-to-book, sales-to-market, book-to-market, cash and, of course, the remuneration factor; those which are not present in this list remain correlated with some of these factors; we check that the performance associated with the remuneration factor is not explained by other major factors such as low-volatility, capitalization, book-to-market, or momentum. Second, we consider six “supersectors” that are used to split stocks into comparable groups, since remuneration varies strongly from one sector to another. Third, we employ a large data set of 3612 daily single stock close prices from January 2001 to July 2015 for the 569 biggest companies in Europe. For comparison, we also treat the same number of randomly selected companies in the U.S.A. whose capitalization exceeds 1 billion dollars. Although we do not access the remuneration data for these companies, the analysis of the other factors allows us to validate the FCL methodology on the U.S. market (often considered a benchmark) and to compare our predictions to those of the FF approach. Fourth, we perform several robustness checks to examine whether the results change under the tested variations; for instance, we perform a separate analysis with the 258 biggest companies from the U.K. to check for potential domestic biases; we also run the methodology on monthly data to check the role of the time scale; in the spirit of comparability, we evaluate the factor performances with seven incremental transitions from the standard FF approach to our methodology. Finally, we compare our results with the basic PCA and illustrate its limitations. Our main result indicates that a market neutral investment strategy based on the remuneration anomaly would likely deliver positive annual returns of 2.42% above the market. The remainder of the paper is organized as follows.
Section II offers a literature review that covers several fields of research. Section III describes the novel methodology. Section IV presents the data, whereas Section V presents the empirical results. Section VI discusses the advantages and limitations of our methodology and compares it with the FF approach. Section VII summarizes the main findings and concludes.
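As a rough numerical illustration of the FCL idea introduced above — a factor whose constituent stocks fluctuate in a common way should rank higher — one can score a factor portfolio by the average pairwise correlation of its stocks' returns. The function below is a hypothetical, simplified proxy, not the FCL estimator defined in Section III:

```python
import numpy as np

def common_risk_proxy(returns):
    """Average off-diagonal correlation of the stocks inside one factor
    portfolio; `returns` has shape (T, n). A simplified, hypothetical proxy
    for a factor correlation level: the more the stocks fluctuate in a
    common way, the higher the score."""
    corr = np.corrcoef(returns, rowvar=False)
    n = corr.shape[0]
    return (corr.sum() - n) / (n * (n - 1))  # mean of the off-diagonal entries

# Two toy portfolios: one driven by a strong common factor, one mostly noise.
rng = np.random.default_rng(2)
T, n = 2000, 15
common = rng.normal(size=(T, 1))
strong = common + 0.5 * rng.normal(size=(T, n))  # heavily co-moving stocks
weak = 0.2 * common + rng.normal(size=(T, n))    # mostly idiosyncratic noise
```

By this score, the `strong` portfolio ranks above the `weak` one, mimicking the ordering role that the FCL plays in the methodology.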
II. Literature review
A. Asset pricing
This article is mainly related to the asset pricing literature, in which previous studies have shown that the average returns of common stocks are related to firm characteristics such as capitalization, price-earnings ratio, cash flow, book-to-market, past sales growth and past returns. For example, stocks with lower market capitalization tend to have higher average returns (Banz, 1981). Another important anomaly is the value premium: value stocks have higher returns than growth stocks, which is likely because the market undervalues distressed stocks (Fama and French, 1998). More precisely, small stocks and value stocks have higher average returns than their betas can explain (Campbell and Vuolteenaho, 2004). Profitability and investment also add to the description of average returns (Fama and French, 2015). The low volatility anomaly was revealed for medium and big stocks in addition to growth stocks (Jordan and Riley, 2013). Stocks that are expected to have high idiosyncratic risk earn high returns in the cross-section (Fu, 2009). This result contradicts previous findings made by Ang et al. (2006), who posit that stocks with high idiosyncratic volatility have low average returns. Macroeconomic risk has also been connected with the cross-section of returns. For instance, the growth rate of industrial production is seen as a priced risk factor in standard asset pricing tests (Chen et al., 1986; Liu and Zhang, 2008). There is a size effect in bank stock returns that differs from the market capitalization effects documented in non-financial stock returns (Gandhi and Lustig, 2015). The most popular anomaly is momentum: stocks with low past returns tend to have low future returns, while stocks with high past returns tend to have high future returns (Jegadeesh and Titman, 1993).
Hence, the momentum strategy that buys past winners and sells past losers should earn abnormal returns in upcoming years. Return momentum has also been observed when spreads in average momentum returns decrease from smaller to bigger stocks (Fama and French, 2012). However, momentum strategies seem to produce losses specifically in January (Jegadeesh and Titman, 1993), probably based on taxation effects (Grinblatt and Moskowitz, 2004). Similarly, changes in book equity appear to be more informative about expected stock returns than price returns (Bali et al., 2013). Notably, certain stock market anomalies may appear and then disappear after publication in academic journals (McLean and Pontiff, 2015). In spite of the abundant literature, the work by Gomez et al. (2015) seems to be the sole article that considers income as a candidate anomaly, although it is still not an income per employee but rather per state and per division. Several models have been developed to provide economic interpretations of numerous stylized anomalies and to improve the performance of the CAPM. Simultaneously, the anomaly-based evidence against the CAPM has been questioned because anomalies have primarily been confined to small stocks (Cederburg et al., 2015). Campbell and Vuolteenaho (2004) introduced a two-beta model to explain the capitalization and book-to-market value anomalies in stock returns by splitting the CAPM beta into a cash-flow beta, with a higher price of risk, and a discount-rate beta. Fama and French (1993) proposed a three-factor model to capture the patterns in U.S. average returns associated with capitalization and value-versus-growth. Even after a theoretical rationale for the three-factor model was provided by Ferguson and Shockley (2003), many anomalies remain unexplained by the three-factor model (Fama and French, 2015). Although a four-factor model has been derived (Carhart, 1997), it has also failed to absorb all the momentum in U.S. average stock returns (Avramov and Chordia, 2006).
Recently, a five-factor model was introduced to capture capitalization, value, profitability, and investment patterns in average stock returns and is reputed to perform better than the three-factor model (Fama and French, 2015). In line with this criticism, doubt was cast on the set of anomalies to consider in a multi-factorial setup, given that Harvey et al. (2016) have summarized 316 potential factors by reviewing 313 papers published since 1967.

B. Corporate finance

This article is also related to the extensive literature on corporate finance, which has also continued to investigate the relation between remuneration and performance, although it has usually focused on managerial pay as opposed to the broader category of employees that we consider in the present study. This branch of literature typically examines the wage as a managerial incentive likely to reduce agency costs by designing an optimal job contract. In that sense, we may consider that solving the incentive problem leads to shareholder value creation affecting stock returns. Indeed, managers face both the discipline and the opportunities provided by the free market economy, which leads to the notion that there is no need for explicit contracts to resolve incentive problems (Fama, 1980). Nevertheless, market forces cannot act as a complete substitute for contracts (Holmstrom, 1999), because career concerns must be considered to design optimal contracts and to arrive at strong incentives (Gibbons and Murphy, 1992). The effects of incentives depend on how they are designed (Gneezy et al., 2011), given that managers have considerable power to shape their own pay arrangements – perhaps even to the point of hurting shareholder interests (Bebchuk et al., 2002). Indeed, public company disclosures do not provide a comprehensive measure of managerial incentives to increase shareholder value (O'Byrne and Young, 2010). Many explanations have been brought forward to justify top managers' remuneration.
Firms with abundant investment opportunities pay their executives better (Gaver and Gaver, 1995). The increase in the level of stock-option compensation can be explained by the inability of boards to evaluate its real costs (Hall and Murphy, 2003; Jensen et al., 2004). The capitalization of large firms explains many patterns in top manager pay across firms, over time, and between countries (Gabaix and Landier, 2008). Manager fixed effects, interpreted as unobserved managerial attributes and understood as a proxy for latent managerial ability, are important in explaining the level of executive remuneration (Graham et al., 2012). Overall, remuneration matters because it may affect a corporation's level of risk, as bonus-driven remuneration might encourage excessive risk-taking. However, pay and risk are correlated not because mis-aligned pay drives risk-taking, but rather because principal-agent theory predicts that riskier but more profitable firms must pay more remuneration than less risky firms to provide a risk-averse manager the same incentives (Cheng et al., 2015).

In the same vein, 38 out of 80 potential firm-level anomalies were shown to be insignificant in the broad cross-section of average stock returns (Hou et al., 2015). In addition, mistakes can easily be made in this field due to multiple testing or data mining methods. As noted by Harvey and Liu (2015), many discovered factors are likely to be false if their t-statistics do not exceed 3. Finally, these papers suggest that many claims in the anomalies literature are likely to be exaggerated regarding the associated t-statistics.

C. Labor economics

The labor economics literature treats this question through the “efficiency wage theory” by relating it to unemployment.
Yellen (1984) and Akerlof and Yellen (1990) did remarkable work with an analysis that is built – unlike most economic models – mainly on sociology and psychology, with experimentation that delivers salient stylized facts on human behavior in a working context. Efficiency wage theory maintains that raising wages is the best way to increase output per employee because it links pecuniary incentives to employee performance. In particular, the use of performance pay packages by employers has been shown to increase employee productivity (Lazear, 2000) and job satisfaction (Green and Heywood, 2008). There are several interesting studies that relate the labor market to asset pricing. All these empirical results emphasize the significant impact of labor decisions, in which wages play a prominent role, on the firm's value. Santos and Veronesi (2006) show that the labor income to consumption ratio is a strong predictor of long horizon returns. Danthine and Donaldson (2002) explain that operating leverage is more significant for the riskiness of equity returns than financial leverage. In other words, attention should be paid to wages, particularly because the priority nature of wages enhances the risk of dividends. In this spirit, Kuehn et al. (2013) note that a high value of unemployment makes wages inelastic, which gives rise to operating leverage. The impact of inelastic wages is even stronger in bad times, as it amplifies the equity risk premium. Gourio (2007) argues that because wages are smooth, revenues are more cyclical than costs, making profits more volatile. In particular, firms with high book-to-market or with low productivity, i.e. value firms, have more pro-cyclical earnings. Ochoa (2013) finds a positive and statistically significant relation between the reliance on skilled labor and expected returns.
In times of high volatility, firms with a high share of skilled workers earn an annual return of 2.7% above those with a high share of unskilled workers, notably because their labor is more costly to adjust. Labor decisions made by workers can affect firm risk (Donangelo, 2014), while hiring decisions can also be determinants of firm risk (Carlson et al., 2004; Belo et al., 2014). Indeed, Donangelo (2014) discusses the idea that mobile workers carry some of the firm's capital productivity when they leave an industry. He finds that portfolios that hold long positions in stocks of high-mobility industries (general workers) and short positions in stocks of low-mobility industries (industry-specific workers) earn an annual return spread of over 5%. Like Monika and Yashiv (2007), who explain that labor should matter since firms' market value embodies the value of hiring, Belo et al. (2014) argue that the market value of a firm reflects the value of its labor force because the firm can extract rents as compensation for the costs associated with adjusting its labor force. They find that long positions in stocks of low-hiring firms and short positions in high-hiring firms earn an average annual excess stock return of 5.6%. Favilukis and Xiaoji (2016) introduce infrequent renegotiation into a standard wage model, showing that it leads to smooth average wages. Due to this wage rigidity, they find that wage growth forecasts long-horizon excess equity returns.
D. Social sciences
This article is also broadly related to several streams of research in various social sciences, including sociology, psychology and human resources. In these fields, the wage acts as a motivator, since it typically reflects a social preference for rewards likely to affect the employee's performance. Sociological studies have developed a theory of social exchange in which there are equivalent rewards on both sides (Blau, 1955), which is consistent with the preference for reciprocity that is viewed as a social preference, as it depends on the behavior of the reference person (Fehr and Falk, 2002). Reciprocity induces agents to cooperate voluntarily with the principal when the principal treats them correctly; the evidence for reciprocity is based on a so-called gift exchange experiment.

Psychological studies highlight the exchange in working situations in which the perceived value of labor equals the perceived value of remuneration, based on the theory of equity (Adams, 1963). When there is no mismatch between effort and wages, employees may change their perceived effort and even their perceived level of remuneration by redefining the non-pecuniary component.

Human resources studies generally offer evidence that money is an important motivator for most people (Rynes et al., 2004), as pay can help one climb Maslow's motivational hierarchy of needs, including social esteem and self-actualization. Nevertheless, tangible rewards might also produce secondary negative effects on motivation (Baker, 1992) by forestalling self-regulation (Deci et al., 1999).
III. Methodology
In this section, we introduce a new methodology to build factors that combines advantages of the PCA and of the Fama and French (1993) approach. As would be the case with the PCA, our factors are built to be uncorrelated with the market index and with sectorial factors. For each factor, we introduce and estimate the Factor Correlation Level (FCL) that allows us to order the factors based on their importance and to select the most important ones in asset pricing models.

A. Conventional diagonalization of the covariance and correlation matrices

Identifying the common risks of multiple assets is necessary to diversify investments and can help to profit from style arbitrage opportunities. Conventional approaches, such as the PCA, attempt to diagonalize the empirical covariance (or correlation) matrix of the traded universe, i.e., to decorrelate assets by constructing independent linear combinations (portfolios) of assets. Each eigenvector of the covariance matrix represents the coefficients of one such combination, while the corresponding eigenvalue gives its variance. If the covariance matrix does not contain negative elements (i.e., if there are no negatively correlated assets), the eigenvector corresponding to the largest eigenvalue has positive elements that can be interpreted as the relative weights of stocks in the market mode. The classical long portfolio, following the market, can be constructed by investing in proportion to these weights. In turn, market neutral portfolios should be orthogonal to the market mode and therefore have both long and short positions (the latter corresponding to negative weights). The other eigenvectors capture different common risks of the traded universe, the most common of which include sectorial risks (e.g., banking sector, commodities, energy, etc.).

In mathematical terms, if the covariance matrix Ω of stocks were known precisely, it might be diagonalized to identify uncorrelated linear combinations of stocks and their variances to assess the related risks.
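This diagonalization can be sketched in a few lines of numpy. The sketch uses synthetic returns driven by a single common factor (an assumption for illustration only, not the estimation pipeline used later in the paper) and recovers the market mode as the leading eigenvector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns for n stocks driven by one common "market" factor,
# so that the covariance matrix has a dominant eigenvalue (the market mode).
n, T = 20, 5000
market = rng.normal(0.0, 0.01, size=T)
betas = rng.uniform(0.5, 1.5, size=n)
returns = np.outer(market, betas) + rng.normal(0.0, 0.01, size=(T, n))

# Empirical covariance matrix and its eigen-decomposition.
omega = np.cov(returns, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(omega)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # reorder them as decreasing
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# With no negative covariances, the leading eigenvector has elements of a
# single sign; fixed to be positive, they give the relative market-mode weights.
v1 = eigvecs[:, 0]
v1 = v1 if v1.sum() > 0 else -v1
```

Here `eigvals[0]` dominates the spectrum and `v1` has strictly positive entries, matching the market-mode interpretation above.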
For a traded universe with n stocks, let r_1, …, r_n denote the daily returns of these stocks at a given time. The covariance matrix has n eigenvalues λ_1, …, λ_n and n eigenvectors V_1, …, V_n satisfying Ω V_α = λ_α V_α (for each α = 1, …, n). Each eigenvector V_α determines one linear combination of stocks, (V_α)_1 r_1 + … + (V_α)_n r_n, which is decorrelated from the others, while the eigenvalue λ_α is its variance (under the condition that V_α is appropriately normalized).

The above eigenbasis can be interpreted as follows. For any linear combination of stocks with weights w_i, r_π = w_1 r_1 + … + w_n r_n = (w · r) (written as a scalar product), the variance of such a portfolio π can be expressed as

⟨r_π²⟩ = ⟨(Σ_{i=1}^n w_i r_i)²⟩ = Σ_{i,j=1}^n w_i w_j Ω_{i,j} = Σ_{i,j=1}^n w_i w_j Σ_{α=1}^n λ_α (V_α)_i (V_α)_j = Σ_{α=1}^n λ_α (w · V_α)²,  (1)

where ⟨…⟩ denotes the expectation, and the returns r_k were assumed to be centered. In other words, the variance is decomposed into a sum of variances λ_α of independent linear combinations, weighted by the squared projection of the weights w_i onto the corresponding eigenvector V_α. If the weights w_i are chosen in proportion to the elements of one eigenvector, i.e., w_i = c (V_α)_i for some α and c, then the orthogonality of V_α to the other eigenvectors yields

⟨r_π²⟩ = λ_α c² (V_α · V_α) = λ_α (w · w),  (2)

where we used the L²-normalization of the eigenvectors: (V_α · V_α) = 1. As expected, the variance of such a linear combination is fully determined by the corresponding eigenvalue λ_α. Notably, the above relation can also be written as

λ_α = ⟨r_π²⟩ / Σ_{i=1}^n w_i²  (3)

to estimate the variance of the linear combination whose weights are constructed close to an eigenvector.

As different stocks exhibit quite distinct volatilities, it is convenient to rescale the stock's return r_i by its realized volatility σ_i: ˜r_i = r_i/σ_i.
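The decomposition in Eq. (1) and the estimator in Eq. (3) can be checked numerically; the sketch below substitutes a random positive-definite matrix for an estimated Ω:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10

# A random symmetric positive-definite matrix standing in for Omega.
A = rng.normal(size=(n, n))
omega = A @ A.T / n

eigvals, eigvecs = np.linalg.eigh(omega)   # columns of eigvecs are the V_alpha

# Eq. (1): for any weights w, the portfolio variance w.Omega.w equals the
# sum over modes of lambda_alpha * (w . V_alpha)^2.
w = rng.normal(size=n)
var_direct = w @ omega @ w
var_modes = sum(lam * (w @ v) ** 2 for lam, v in zip(eigvals, eigvecs.T))

# Eq. (3): if w_i = c * (V_alpha)_i for one eigenvector, then
# lambda_alpha = <r_pi^2> / sum_i w_i^2.
c, alpha = 3.7, 4
w_eig = c * eigvecs[:, alpha]
lam_recovered = (w_eig @ omega @ w_eig) / np.sum(w_eig ** 2)
```

Both identities hold to machine precision: `var_direct` equals `var_modes`, and `lam_recovered` equals `eigvals[alpha]`.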
This rescaling is also known to reduce the heterogeneity of volatilities among stocks and heteroskedasticity (Andersen et al., 2000; Bouchaud et al., 2001; Valeyre et al., 2013). In other words, one can write

\[ \langle r_\pi^2 \rangle = \Bigl\langle \Bigl(\sum_{i=1}^n w_i \sigma_i \tilde{r}_i\Bigr)^2 \Bigr\rangle = \sum_{i,j=1}^n \tilde{w}_i \tilde{w}_j C_{i,j}, \quad (4) \]

where w̃_i = w_i σ_i and C_{i,j} = ⟨r̃_i r̃_j⟩ is the covariance matrix of the renormalized returns r̃_i or, equivalently, the correlation matrix of the returns r_i: Ω_{i,j} = σ_i σ_j C_{i,j}. To proceed, the eigenvalues and eigenvectors of Ω can be replaced by the eigenvalues λ̃_α and eigenvectors Ṽ_α of the correlation matrix C, satisfying C Ṽ_α = λ̃_α Ṽ_α, i.e.,

\[ \langle r_\pi^2 \rangle = \sum_{i,j=1}^n \tilde{w}_i \tilde{w}_j \sum_{\alpha=1}^n \tilde{\lambda}_\alpha (\tilde{V}_\alpha)_i (\tilde{V}_\alpha)_j = \sum_{\alpha=1}^n \tilde{\lambda}_\alpha (\tilde{w} \cdot \tilde{V}_\alpha)^2. \quad (5) \]

If the volatility-normalized weights w̃_i are chosen to be proportional to the elements of an eigenvector, w̃_i = c (Ṽ_α)_i, one obtains ⟨r_π²⟩ = λ̃_α c² (Ṽ_α · Ṽ_α) = λ̃_α c² = λ̃_α (w̃ · w̃), from which

\[ \tilde{\lambda}_\alpha = \frac{\langle r_\pi^2 \rangle}{\sum_{i=1}^n w_i^2 \sigma_i^2}, \quad (6) \]

where the L2-normalization of Ṽ_α was used: (Ṽ_α · Ṽ_α) = 1. As previously discussed, λ̃_α is the rescaled variance of the linear combination of the volatility-normalized returns r̃_i (given by the eigenvector Ṽ_α), each of which is decorrelated from the other such combinations. By construction, the variance λ̃_α is normalized, which facilitates the comparison of different factors and different markets. We emphasize that the diagonalizations of the covariance and correlation matrices are generally not equivalent; in particular, the eigenvalues λ_α, λ̃_α and the eigenvectors V_α, Ṽ_α are different (though in our case, their interpretations should be close). We choose the second option (i.e., Eq. (6)), which inherently reduces stock heterogeneity and heteroskedasticity due to rescaling.

Unfortunately, a straightforward diagonalization of the empirical covariance or correlation matrix estimated from stock price series is known to be very sensitive to noise (Laloux et al.
, 1999; Plerou et al., 1999, 2002; Potters et al., 2005; Wang et al., 2011; Allez and Bouchaud, 2012). In particular, only the few eigenvectors corresponding to the largest eigenvalues can be estimated reliably, as illustrated and further discussed in Sec. V.D. As a consequence, conventional diagonalization does not appear suitable for building various representative factors.

B. Our methodology: Indicator-based factors
We propose a different approach to building factors. We begin from the available economic and financial indicators regarding the traded companies, such as their capitalization, sales-to-market, dividend yields, etc. We expect that companies with comparable indicators – at least those with comparable indicators in the extreme quantiles of the indicator distribution – will exhibit correlations in their stock performance. This hypothesis allows us to construct and then test indicator-based factors beyond sectors. To minimize sectorial correlations, we split the stocks into six supersectors of similar sizes, as detailed in Appendix A. The following construction is performed separately for each supersector, and the data are then aggregated (see below).

We consider ten indicator-based factors:
1. The dividend factor, which is based on the dividend yield.
2. The capitalization (or size) factor, which is based on capitalization.
3. The liquidity factor, which is based on the ratio of the weekly exponential moving average to the total number of shares (i.e., capitalization/close price).
4. The momentum factor, which is based on the 3-year exponential moving average of past daily returns.
5. The low-volatility (or beta) factor, which is based on the sensitivity to the stock index.
6. The leverage factor, which is based on the debt-to-book value ratio.
7. The sales-to-market factor, which is based on the ratio of sales to market value at the end of the fiscal period.
8. The book-to-market factor, which is based on the ratio of the book value to the market value at the end of the fiscal period.
9. The remuneration factor, which is based on salaries and benefits expense per employee.
10. The cash factor, which is based on the ratio between the free cash flow and the latest market value.

We believe that considering these ten factors is sufficient and that including additional factors would not significantly change our results.
In particular, we might have included the investment and profitability factors following Fama and French (2015), but we expect that our ten factors already capture the common risk from these two factors. Indeed, sales and cash should be correlated with profitability, whereas the dividend yield and leverage ratio should be correlated with investment.

For each trading day, the stocks of the chosen supersector are sorted according to the indicator (e.g., remuneration) available the day before (we use the publication date and not the valuation date). The related indicator-based factor is formed by buying the first q n_s stocks in the sorted list and shorting the last q n_s stocks, where n_s is the number of stocks in the considered supersector, and 0 < q < 1 is a chosen quantile level. The other stocks (with intermediate indicator values) are not included (weighted by 0). In the simplest setting, one can choose equal weights:

\[ w_i = \begin{cases} +1, & \text{if } i \text{ belongs to the first } q n_s \text{ stocks in the sorted list}, \\ -1, & \text{if } i \text{ belongs to the last } q n_s \text{ stocks in the sorted list}, \\ 0, & \text{otherwise}. \end{cases} \quad (7) \]

To reduce the specific risk, common practice suggests investing inversely proportionally to the stock's volatility σ_i, i.e., setting w_i = ±1/σ_i or 0. Moreover, the inverse stock volatility should also be bounded to reduce the impact of extreme specific risk. Each trading day, we recompute the weight w_i as follows:

\[ w_i = \begin{cases} +\mu_+ \min\{1, \sigma_{\rm mean}/\sigma_i\}, & \text{if } i \text{ belongs to the first } q n_s \text{ stocks in the sorted list}, \\ -\mu_- \min\{1, \sigma_{\rm mean}/\sigma_i\}, & \text{if } i \text{ belongs to the last } q n_s \text{ stocks in the sorted list}, \\ 0, & \text{otherwise}, \end{cases} \quad (8) \]

where σ_mean = (σ_1 + ... + σ_{n_s})/n_s is the mean estimated volatility over the supersector. In this manner, the weights of low-volatility stocks are reduced to avoid strongly unbalanced portfolios concentrated in such stocks.
The two common multipliers, μ_±, are used to ensure the beta market-neutral condition:

\[ \sum_{i=1}^{n_s} \beta_i w_i = 0, \quad (9) \]

where β_i is the sensitivity of stock i to the market (obtained by a linear regression of the normalized stock and index returns based on the reactive volatility model (Valeyre et al., 2013); note that the use of standard daily returns leads to similar results, see Appendix B). If the aggregated sensitivity of the long part of the portfolio to the market is higher than that of the short part, its weight is reduced by the common multiplier μ_+ < 1/(2 q n_s), which is obtained from Eq. (9) by setting μ_− = 1/(2 q n_s) (which implies that the sum of absolute weights |w_i| does not exceed 1). In the opposite situation (when the short part of the portfolio has a higher aggregated beta), one sets μ_+ = 1/(2 q n_s) and determines the reducing multiplier μ_− < 1/(2 q n_s) from Eq. (9). This method of ensuring the market-neutral condition is better than leaving a residual beta (as in the FF approach) or withdrawing it by subtracting an appropriate constant from all weights. Indeed, under our approach, the factor remains invested only in stocks that are sensitive to this factor. In turn, subtracting a constant would affect all stocks, even those that were "excluded" and whose weights were set to 0 in Eq. (8). We also emphasize the difference with the conventional FF approach: our factors are built to be market-neutral under Eq. (9), whereas the FF portfolio is built to be delta-neutral (i.e., to have zero net investment):

\[ \sum_{i=1}^{n_s} w_i = 0. \quad (10) \]

The resulting factor is obtained by aggregating the weights constructed for each supersector. This construction is repeated for each of the ten factors listed above. We emphasize that the factors are constructed on a daily basis, i.e., the weights are re-evaluated daily based on updated indicators.
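A minimal sketch of the weighting scheme of Eqs. (8)-(9) for a single supersector follows. The function name and its inputs are hypothetical, the betas are assumed positive, and the leg with the larger aggregated beta is shrunk by a common multiplier, which is one way to satisfy Eq. (9); it is not the authors' production code:

```python
import numpy as np

def factor_weights(indicator, sigma, beta, q=0.15):
    """Sketch of Eqs. (8)-(9) for one supersector: long the top quantile,
    short the bottom quantile, with inverse-volatility caps and a common
    multiplier that enforces the beta-neutral condition sum_i beta_i w_i = 0.
    Assumes all betas are positive."""
    ns = len(indicator)
    k = int(q * ns)                                 # stocks per leg
    order = np.argsort(indicator)[::-1]             # descending sort
    long_idx, short_idx = order[:k], order[-k:]

    cap = np.minimum(1.0, sigma.mean() / sigma)     # min{1, sigma_mean/sigma_i}
    w = np.zeros(ns)
    w[long_idx] = cap[long_idx]
    w[short_idx] = -cap[short_idx]

    # Shrink the leg with the larger aggregated beta (Eq. (9)).
    b_long = beta[long_idx] @ cap[long_idx]
    b_short = beta[short_idx] @ cap[short_idx]
    if b_long > b_short:
        w[long_idx] *= b_short / b_long
    else:
        w[short_idx] *= b_long / b_short

    # Normalize so that the sum of absolute weights does not exceed 1.
    return w / (2 * k)
```

In the text, this construction is applied per supersector and the resulting weights are then aggregated across the six supersectors to form the factor.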
However, most indicators do not change frequently, so the transaction costs related to updating the factors are not significant.

The above procedure can be extended to construct factors from other quantiles, in addition to the first and the last. In this manner, we will consider three portfolios for each factor:

• Q1: long positions for stocks whose indicator belongs to the first 15% quantile and short positions for stocks in the last 15% quantile, as discussed above (for q = 0.15).
• Q2: long positions for stocks in the second 15% quantile and short positions for stocks in the next-to-last 15% quantile (i.e., positive weights are assigned to stocks ranging between 0.15 n_s and 0.30 n_s in the list, and negative weights are assigned to stocks ranging between 0.70 n_s and 0.85 n_s).
• Q3: long positions for stocks in the third 15% quantile (0.30 n_s – 0.45 n_s) and short positions for stocks in the third-to-last 15% quantile (0.55 n_s – 0.70 n_s).

To evaluate the common risk associated with each factor, we introduce the factor correlation level (FCL) as the square root of the ratio between the empirical variance of the indicator-based factor and the total empirical variance of its constituent stocks:

\[ {\rm FCL}(t) = \left( \frac{{\rm EMA}\{ r_\pi^2(t) \}}{{\rm EMA}\{ \sum_{i=1}^n w_i^2(t)\, \sigma_i^2(t) \}} \right)^{1/2}, \quad (11) \]

where r_π(t) is the daily return of the factor,

\[ r_\pi(t) = \sum_{i=1}^n w_i(t)\, r_i(t), \quad (12) \]

where w_i(t) is the weight of the stock i in the factor, and σ_i(t) is the volatility of the stock i estimated using the reactive volatility model (Valeyre et al., 2013). The exponential moving average (EMA) is used with a long averaging period of 200 days to reduce noise by smoothing the measurements. We emphasize that the above sum aggregates stocks from all supersectors. We also considered the standard volatility estimator based on a 40-day exponential moving average and obtained similar results (see Appendix B). The square root in Eq. (11) is taken to operate with volatilities instead of variances.
The estimator (11) is built analogously to Eq. (6) for the eigenvalues λ̃_α of the correlation matrix. This analogy relies on the assumption that the indicator-based weights w_i are close to an eigenvector of the correlation matrix. Since the true correlation matrix is unavailable, it is impossible to directly validate this strong assumption. We will therefore resort to indirect validations based on the empirical correlations of the constructed factors and on the profitability of trading strategies derived from such factors. Note also that the weights w_i depend on the choice of the quantile q, such that we expect slightly different results for different quantiles (see Fig. 4 below). Simultaneously, the analogy to the eigenvalues of the correlation matrix allows the various factors to be classified according to their "importance": larger values of the FCL mean a stronger volatility of the factor and therefore higher common risks. For example, when the correlation of small-capitalization firms increases while the volatility of individual stocks remains stable, the FCL of the capitalization factor will increase, and the volatility of the factor will increase. In general, the risk of a factor is proportional to the average individual volatility multiplied by the FCL. For this reason, the FCL can be interpreted as an average correlation measure between stocks within the factor that is also directly linked to the common risk level underpinning the factor.
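The FCL estimator of Eqs. (11)-(12) can be sketched in a few lines. The helper names `ema` and `fcl` are hypothetical, and the simple recursive EMA below (with the 200-day span mentioned in the text) is one common convention, not necessarily the exact smoothing used by the authors:

```python
import numpy as np

def ema(x, span=200):
    """Recursive exponential moving average of a 1-D series."""
    alpha = 2.0 / (span + 1.0)
    out = np.empty(len(x), dtype=float)
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out

def fcl(r, w, sigma, span=200):
    """Factor Correlation Level of Eq. (11):
    sqrt( EMA{r_pi(t)^2} / EMA{ sum_i w_i(t)^2 sigma_i(t)^2 } ),
    where r, w, sigma are (T, n) arrays of daily returns, weights
    and volatilities, and r_pi is the factor return of Eq. (12)."""
    r_pi = np.sum(w * r, axis=1)
    num = ema(r_pi**2, span)
    den = ema(np.sum(w**2 * sigma**2, axis=1), span)
    return np.sqrt(num / den)
```

For independent stocks (correlation matrix close to the identity), this estimator fluctuates around 1, consistent with the eigenvalue interpretation; common risk along the factor pushes it above 1.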
It must also be emphasized that the FCL estimator is dynamic, i.e., it can capture changes in the correlation structure of the market over time.
IV. Data
In this study, we use only liquid stocks (most with capitalization greater than 800 million euros), thus excluding microcap firms, which are typically the main focus of the labor studies we have cited. Thanks to European accounting regulations, remuneration must be reported by European companies on a regular basis and can thus be accessed through commercial databases such as FACTSET. Lacking such information for the U.S. market, we mainly focus on European companies. To reveal possible nation-specific features, the analysis is performed for two trading universes: (i) the 569 biggest companies in Europe (London Stock Exchange, Euronext, Eurex, Sixt), and (ii) the 258 biggest companies on the London Stock Exchange only. Although the twice-as-large European universe is expected to increase the statistical significance of the results, the consideration of the U.K.-bounded universe allows us to eliminate country biases and additional fluctuations (e.g., due to currency exchange rate variations). We will show that the major conclusions are similar for both universes. In addition, we validate our indicator-based methodology on a U.S. universe that includes 569 randomly selected companies whose capitalization is above 1 billion dollars. Note that the universe of the 1229 biggest firms in the U.S. studied by Fama and French (2008) is comparable to our European universe in terms of capitalization and liquidity.

All the companies that we include in the European and U.K. universes belong to the small (below 1 billion euros), mid (between 1 and 5 billion euros), large (between 5 and 20 billion euros), or big (above 20 billion euros) capitalization categories. The data set consists of 3612 daily single-stock close prices from January 2001 to July 2015.
Note that most Fama and French data begin in 1963, which leads to greater t-statistics.
We rely on daily prices (instead of the monthly prices that are commonly used in the literature) to have more precision in the temporal granularity of our FCL estimation. In addition, several economic and financial indicators are extracted from the FACTSET database: book-to-market, capitalization, sales-to-market, dividend yield, debt-to-book, free cash flow, salaries and benefit expenses, and the number of employees on an annual basis (see Table I). For the European universe, we partly offset geographical biases in each indicator by renormalizing it to its median in the country. For instance, remuneration is divided by its median by country, whereas the median by country is subtracted from the moving average of returns in the case of momentum.

         Capitalization    Number of employees    Remuneration
Europe   (13 ± 25) B€      (41 ± 78) thousand     (0.… ± 0.99) M€
U.K.     (11 ± 21) B£      (38 ± 87) thousand     (0.… ± 0.08) M£

Table I. Basic statistics (mean and standard deviation) regarding capitalization (in billions of euros/pounds), number of employees (in thousands), and remuneration (in millions of euros/pounds) from the FACTSET database. Since the minimal capitalization is approximately 800 million euros, the distribution is truncated at small capitalizations.
V. Empirical results
In this section, we present the main results of our methodology applied to the European, the U.K., and the U.S. universes. We mainly focus on the remuneration indicator, which has largely been ignored so far. We will show that remuneration yields a non-negligible common risk and represents a small anomaly. The possibility of revealing the role of the remuneration factor relies on the proposed FCL methodology.
A. Correlation between remuneration and capitalization
First, we inspect the empirical joint distribution of remuneration and capitalization. This inspection is important because a positive size-wage effect has already been well documented in the economic literature for microcapitalization firms (Lallemand et al., 2007). The wage gap due to firm size is approximately 35% (Oi and Idson, 1999) because large firms (but remaining in the microcapitalization category) demand a higher quality of labor and set a higher performance standard that must be supported by a compensating wage difference. Note that the magnitude and determinants of the employer-size wage premium vary across industrialized countries. Indeed, individual effects explain approximately 90% of inter-industry and firm-size wage differences in France (Abowd et al., 1999), while almost 50% of the firm-size wage differentials in Switzerland derive from a firm-size effect (Winter-Ebmer et al., 1999). In the U.K., larger firms pay better because of internal labor markets that reward effort and firm-specific capital (Belfield et al., 2004). A visual inspection of Figure 1 (top) suggests that there is almost no correlation between remuneration and capitalization within the class of liquid stocks (which excludes microcapitalization firms) and, in any case, the residual correlation is not significant. As a consequence, a larger firm from our sample does not necessarily pay its employees more. This result is consistent with the literature.

To confirm that the remuneration anomaly exists for different capitalizations, we split our sample into two groups: the above-median group of stocks whose capitalization exceeds the median size of our sample, and the below-median group with the remaining stocks.

Figure 1.
Remuneration versus capitalization (top) and remuneration versus capitalization per employee (bottom). Full circles and empty diamonds represent large U.K. and European companies, respectively. Both quantities are shown in local currency and plotted on a logarithmic scale to account for the significant dispersion in capitalization and remuneration. Solid and dashed lines indicate the linear regression between the logarithms of these quantities for the U.K. and European universes, respectively (the respective slopes are 0.34 and 0.30, and the R² goodness-of-fit values are 0.48 and 0.58). Since the records on the remuneration and capitalization of each company in the FACTSET database are updated at different moments of the year, the data were averaged over the period from 15/12/2014 to 30/07/2015. Similar results were obtained by taking the latest record for each company (not shown). Two subplots show the empirical distributions of capitalization (top) and remuneration (right) among the biggest European companies.
Figure 2.
Similar cumulative performance anomalies of the two remuneration factors for quantile Q1: one is constructed from stocks whose capitalization exceeds the median size of our sample, and the other is constructed from the remaining stocks. The cumulative performance of both factors after 15 years is approximately 9%, yielding an annualized performance of 0.6% (compared with 0.68% in Table IV). These curves are obtained for the European universe (the results for the U.K. universe are similar and thus not shown). The annualized performance of the remuneration factor is thus biased and cannot be fully explained by an unbiased random walk.

(We recall that both groups exclude microcapitalization firms.) For each group, we build its own remuneration factor. Figure 2 shows that the cumulative performances of both remuneration factors are statistically different from 0 and behave similarly. An apparent slight outperformance of the factor constructed for the below-median group is not significant and can be attributed to statistical fluctuations.

Further investigation of the size-wage effect compels us to explore this relation per employee. Figure 1 (bottom) reveals that remuneration is positively correlated with capitalization per employee, i.e., remuneration increases with the amount of capitalization per employee. One plausible explanation for this phenomenon might be that reducing the number of employees (in particular, underperforming employees) increases marginal remuneration. In summary, there is no correlation between capitalization and remuneration for both universes of firms with capitalization over 800 million euros. Simultaneously, remuneration increases with the amount of capitalization per employee – as if the cake had to be shared fewer times.

B. Remuneration as a common risk
The motivation for building indicator-based factors relies on the hypothesis that stocks with close indicator values behave similarly and thus share common risks. To verify this hypothesis, we compare three realizations of the remuneration factor built on different quantiles (Q1, Q2, and Q3), as described in Section III.B. Figure 3 shows a weak but highly significant correlation between the daily returns of the remuneration factors from quantiles Q1 and Q2 (top) and Q1 and Q3 (bottom), notwithstanding that these factors have no stocks in common. This is indirect proof that companies adopting similar remuneration policies (e.g., paying their employees well) share a common risk.
The weak correlation can be explained by a rapid decrease of the stock sensitivity to the remuneration factor with the quantile: the correlation level of (Q1, Q3) is measured to be half that of (Q1, Q2). The common risk is of the same order of magnitude as the residual risk, even for Q1. In summary, the stocks in the extreme quantiles are the most sensitive to the remuneration factor. This observation is also confirmed by the anomalies, which are more pronounced for the extreme quantiles, as shown in Figure 4.
C. Factor correlation level as a proxy of the eigenvalues
Ordering the factors based on their importance is central for the asset pricing analysis. As discussed in Sec. III.B, the relevance of indicator-based factors can be characterized using the factor correlation level (FCL) defined by Eq. (11). If the factor weights were approximately proportional to the elements of an eigenvector of the correlation matrix, the FCL would be an estimator of the volatility of this factor. The factors with larger FCLs would most likely have a greater impact on the portfolio returns for the same exposure. In general, the risk of a factor is proportional to the average individual volatility multiplied by the FCL. Thus, the FCL can be interpreted as an average correlation measure between stocks within the factor.
Using the daily returns of each factor and estimating the realized volatility of each stock, we compute the FCL for each factor based on Eq. (11). Figure 5 shows the time evolution of the FCLs for the ten indicator-based factors defined in Sec. III.B. For comparison, we plot the FCLs for the European and the U.S. universes (the FCLs for the U.K. universe behave similarly and are thus not shown). First, the FCLs exhibit strong variations over time. In particular, the FCLs of two factors can cross each other, i.e., the ordering of the factors based on their "importance" can evolve over time. For both universes, the low-volatility factor appears to be the most important, followed by the capitalization and momentum factors. Other factors are smaller but statistically significant. Averaging the FCL over 15 years allows us to order the factors according to their importance. Table II suggests the following order for the European universe: low-volatility (1.73), capitalization (1.72), momentum (1.41), sales-to-market (1.22), liquidity (1.19), book-to-market (1.13), dividend (1.09), leverage (1.07), remuneration (0.99), and cash (0.92).

Figure 3. (Top) Correlation between the daily returns of the two remuneration factors constructed on quantiles Q1 (0%–15% and 85%–100%) and Q2 (15%–30% and 70%–85%), which have no stocks in common. The daily returns of these factors are weakly correlated, but the correlation is significant: the slope and its 95%-confidence interval is 0.… ± 0.…. (Bottom) For comparison, the correlation between the daily returns of the remuneration factors Q1 and Q3 (30%–45% and 55%–70%) is shown, with the slope and its 95%-confidence interval 0.… ± 0.03. Both graphs were obtained for the European universe. Similar graphs for the U.K. universe yield the slopes 0.… ± 0.03 and 0.… ± 0.03 for the Q1-Q2 and Q1-Q3 correlations, respectively (graphs are not shown but are available upon request).

Figure 4. The cumulative performance of the remuneration factor for the three quantiles (Q1, Q2 and Q3) for the European universe (the graph for the U.K. universe is similar and is available upon request). The biases are more pronounced for Q1 than for Q2 or Q3, which might be explained by the possibility that stocks belonging to the extreme quantiles are the most sensitive to the remuneration anomaly.
All these FCLs are higher than the noise level of 0.78, which we estimated by building a "noise factor" according to an arbitrary non-financial indicator, such as alphabetic order. Even though the remuneration factor is relatively small, its magnitude remains statistically relevant in comparison with other well-known factors. For example, the FCLs of the book-to-market, dividend, leverage and cash factors are close to that of the remuneration factor. Their low values mean that these factors are not particularly volatile and that the related common risks are low. Conversely, the low-volatility factor (excluded from the FF approach) has the highest FCL and is thus identified as the first potential source of risk in a portfolio, after the market index and sectorial risks. Notably, the low-volatility factor is comparable to the capitalization factor and greatly exceeds the book-to-market factor, the two "major" factors identified in the Fama and French (1993) model.
Figure 5. Evolution of the factor correlation level (FCL) for ten factors (quantile Q1): the European (top) and U.S. (bottom) universes (the behavior for the U.K. universe is similar and available upon request). In our interpretation, the FCL is a measure of the "importance" of factors in asset pricing models. Thick lines highlight the three major factors: low-volatility, capitalization, and momentum. The mean FCLs averaged over 14 years are summarized in Table II. All FCLs are highly volatile, but this volatility is not linked to stock market volatility. In addition, we can see the jumps and cross-overs of FCLs. During the 2007–2008 financial crisis, several FCLs collapsed for the U.S. universe. Note that we could not construct the remuneration factor for the U.S. universe because of the lack of systematic remuneration data for U.S. companies.

FCL      Div.   Cap.   Liq.   Mom.   Low    Lev.   Sales  Book   Rem.   Cash   Market
Europe   1.09   1.72   1.19   1.41   1.73   1.07   1.22   1.13   0.99   0.92   10.41
U.K.     0.97   1.45   0.92   1.15   1.38   0.96   1.03   0.96   0.93   0.83   6.73
U.S.     1.49   1.73   1.49   1.62   2.10   1.15   1.41   1.12   −

Table II. The mean value of the FCL for ten factors (quantile Q1) averaged over the period from 10/08/2001 to 31/07/2015, for the European, U.K., and U.S. universes. According to these values, the main factors for asset pricing are the low-volatility factor (excluded from the FF approach), followed by the capitalization and momentum factors. We see that the book-to-market and remuneration factors are of the same order of magnitude, such that the remuneration factor should have the same importance in asset pricing models as the book-to-market factor. We also estimated the FCL of the market (last column). The FCL of a noise factor was estimated to be around 0.78.

D. Comparison with the principal component analysis
The principal component analysis (PCA), which is applied to decorrelate time series, consists in forming the empirical correlation matrix from daily stock returns and then finding its eigenvalues and eigenvectors. In practice, the number of stocks in a traded universe (typically 500–1000) is often comparable to the number of available historical returns per stock (for instance, 3612 daily returns in our dataset), which makes this general method strongly sensitive to noise, as discussed in (Laloux et al., 1999; Plerou et al., 1999, 2002; Potters et al., 2005; Wang et al., 2011; Allez and Bouchaud, 2012).

In order to illustrate this limitation, we apply the PCA to the European universe and compute numerically the 569 eigenvalues. Figure 6 shows the histogram of the square roots of the obtained eigenvalues, i.e., how many eigenvalues are contained in successive bins. The largest value, λ₁^{1/2}, corresponding to the market mode, was excluded from the plot for a better visualization of the other values. One can identify approximately ten well-separated single eigenvalues that are typically attributed to market sectors. In turn, the remaining part of the (smaller) eigenvalues, lying close to each other and thus almost indistinguishable, can be rationalized by using random matrix theory (Laloux et al., 1999). If the daily stock returns were distributed as independent Gaussian variables (with mean zero and variance one), the eigenvalues of the underlying empirical correlation matrix would asymptotically be distributed according to the Marcenko-Pastur density

\[ \rho(\lambda) = \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{2\pi q \lambda}, \quad (13) \]

where q = N/T is the ratio between the number of stocks, N, and the number of daily returns per stock, T. These eigenvalues lie between the two critical values, λ_min = (1 − √q)² and λ_max = (1 + √q)². As a consequence, the eigenvalues obtained by diagonalizing the empirical correlation matrix and lying below λ_max can be understood as the statistical uncertainty of the PCA. In other words, the PCA cannot reliably identify the factors with λ < λ_max. For our European universe, q = 569/3612 ≈ 0.16, so that √λ_max ≈ 1.4. The factors with FCL below 1.4 would thus be understood as statistical uncertainty in the PCA method.

Figure 6. Histogram of the square roots of the eigenvalues, λ^{1/2}, of the empirical correlation matrix obtained from the daily returns of 569 stocks in the European universe over the period from 10/08/2001 to 31/07/2015. The largest value, λ₁^{1/2}, corresponding to the market mode, was excluded from the plot for a better visualization of the other values.
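The Marcenko-Pastur threshold quoted above can be reproduced directly from Eq. (13) with the N and T of our data set; the short check below (a sketch, with an illustrative Monte Carlo seed) also verifies that the eigenvalues of a pure-noise correlation matrix indeed stay essentially below λ_max:

```python
import numpy as np

# Marcenko-Pastur edges for the eigenvalue spectrum of an empirical
# correlation matrix of N independent standardized return series of
# length T (N = 569 and T = 3612 as in the text).
N, T = 569, 3612
q = N / T
lam_min = (1 - np.sqrt(q))**2
lam_max = (1 + np.sqrt(q))**2
print(round(lam_max, 2), round(float(np.sqrt(lam_max)), 2))   # → 1.95 1.4

# Monte Carlo check: eigenvalues of a pure-noise correlation matrix
# do not (essentially) exceed lam_max.
rng = np.random.default_rng(0)
lam = np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(N, T))))
assert lam.max() < 1.1 * lam_max
```

The value √λ_max ≈ 1.4 is the PCA noise ceiling against which the FCLs of Table II can be compared.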
The crucial advantage of our method, in which the factors are built from firm-based indicators while market and sectorial correlations are eliminated by construction, is the possibility to go beyond this PCA limit and to identify the factors with smaller FCLs.
Moreover, this identification can be performed over time.

E. Net investment as a proxy of the exposure to the low-volatility factor
Building market-neutral portfolios requires nonzero net investment when the portfolio is exposed to the low-volatility anomaly. This anomaly is governed by the low-volatility factor, which is the most influential factor (after market and sectors) according to our FCL measurement (Table II), and unfortunately a residual exposure to the low-volatility factor cannot be easily reduced. As a result, most factors can still be correlated to the low-volatility factor. Thus, when the average beta of long stocks in a factor is significantly different from the average beta of short stocks, the factor is also exposed to the low-volatility factor with a nonzero net investment. The net investment is defined as the difference between long ($\omega_i > 0$) and short ($\omega_i < 0$) investments normalized by the total investment, i.e.,

$$\Delta = \frac{\sum_{i=1}^n w_i}{\sum_{i=1}^n |w_i|}. \qquad (14)$$

By construction, $\Delta$ can vary between $-1$ and $1$. Replacing the individual betas $\beta_i$ in the market-neutral relation (9) by the averages $\langle\beta_L\rangle$ and $\langle\beta_S\rangle$ for long and short stocks, the net investment $\Delta$ from Eq. (14) can also be expressed as

$$\Delta = \frac{\langle\beta_S\rangle - \langle\beta_L\rangle}{\langle\beta_S\rangle + \langle\beta_L\rangle}. \qquad (15)$$

When the average sensitivities for long and short stocks are similar, the net investment is close to 0. In turn, a net bias in $\Delta$ occurs when the average beta is different for long and short stocks. $\Delta$ is a proxy of the exposure to the low-volatility factor that is more reactive and more precise than the estimation obtained through the usual regression of returns.

The bias in the long and short betas in Eq. (15) may also be related to the sensitivity to the market (i.e., to the stock index) of a factor built with the FF approach (i.e., neutral in nominal terms but not in beta):

$$\beta_{FF} = \langle\beta_L\rangle - \langle\beta_S\rangle = -2\langle\beta\rangle\,\Delta, \qquad (16)$$

where $\langle\beta\rangle = (\langle\beta_S\rangle + \langle\beta_L\rangle)/2$ is the average beta of the universe estimated from our data. Fig. 7 shows that $\Delta$ is significantly negative for the low-volatility factor (around $-80\%$) and for the momentum factor, in particular.

Figure 7. Evolution of the net investment $\Delta$ for five indicator-based factors for the European universe: capitalization, momentum, low-volatility, book-to-market, and remuneration (the results for the U.K. universe are not shown but are available upon request). We recall that $\Delta$ is a proxy of the exposure to the low-volatility factor. The remuneration $\Delta$ is around zero, and the factor therefore has no correlation with the low-volatility factor. Other factors seem to be more exposed to the low-volatility factor.

In the FF approach, these factors would therefore also have a significant sensitivity to the market. In particular, the low-volatility factor built with the FF approach would be strongly correlated to the market. Moreover, $\Delta$ indicates that most factors have a residual correlation with the low-volatility factor that remains uncorrected by our method. Since 2003, the $\Delta$ of the book-to-market factor (one of the major anomalies investigated by Fama and French) has shrunk, and the related book-to-market anomaly has almost disappeared (see Table IV). Finally, the remuneration factor shows nearly zero net investment, i.e., it remains uncorrelated with the low-volatility factor.
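As an illustration of Eqs. (14) and (15), the net investment of a toy long-short portfolio can be sketched as follows. The weights and betas are hypothetical; for the beta-neutral relation (9) we assume, for simplicity, one aggregated long and one aggregated short position with weights inversely proportional to the average betas of each leg.

```python
import numpy as np

def net_investment(w):
    """Net investment Delta of Eq. (14): signed sum of the weights
    normalized by the total (absolute) investment."""
    w = np.asarray(w, dtype=float)
    return w.sum() / np.abs(w).sum()

# Hypothetical average betas of the long and short legs.
beta_L, beta_S = 1.2, 0.8

# Beta-neutral toy portfolio: the legs are scaled inversely to their betas,
# so that beta_L * w_L + beta_S * w_S = 0.
w = np.array([1.0 / beta_L, -1.0 / beta_S])

delta = net_investment(w)                              # Eq. (14)
delta_from_betas = (beta_S - beta_L) / (beta_S + beta_L)  # Eq. (15)
```

With these numbers both expressions give $\Delta = -0.2$: a long leg with the higher beta forces a net short bias, exactly as described in the text.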
F. Other inter-factor correlations
Correlations between factors matter as long as one needs uncorrelated portfolios for asset pricing purposes. The indicator-based factors were introduced to build as many uncorrelated portfolios as possible. At the same time, such an explicit construction does not guarantee truly uncorrelated combinations, such as the eigenvectors of the covariance (or correlation) matrix. Moreover, some indicators may capture the same economic or financial features of the company and may thus be correlated; in other words, different factors may approximate the same eigenvector and thus be highly correlated. In particular, adding new indicator-based factors does not necessarily help to capture new features and may thus be redundant. The choice of the ten indicator-based factors studied in this paper is judged as sufficient with respect to the trade-off between capturing information and remaining uncorrelated. Table III presents the correlation coefficients between the ten indicator-based factors estimated from their volatility-normalized daily returns. Clearly, many indicator-based factors remain correlated. If the same estimation was applied to ten independent Gaussian vectors of the same length ($m = 3612$ elements), the standard deviation of the estimated correlation coefficients would be $1/\sqrt{m} \approx 0.017$. Significant correlations are observed, in particular, between the remuneration factor and the sales-to-market ($-0.38$), dividend ($-0.23$), and momentum ($0.20$) factors. These correlations can be explained as follows. First, the companies with low sales-to-market ratios have a high margin and thus the ability to pay their employees well (strong negative correlation $-0.38$). In addition, a positive relation between firm size and the use of cost-cutting strategies, monotonically increasing and highly significant, is uncovered. Second, the companies that pay high dividends to shareholders tend to remunerate their employees less, yielding a negative correlation of $-0.23$, which is a direct representation of profit-sharing within firms. Indeed, dividend payments are charged on the profits of the business after all salaries and benefits expenses are paid out. Although this result appears intuitive, it remains important as it reveals the level of correlation between both quantities. The labor economics literature and the corporate finance literature are not very well documented on this particular issue. Finally, companies that perform well and show strong momentum can offer higher remuneration to their employees or, alternatively, the higher remuneration stimulates employees to work better and to imbue the company with momentum (positive correlation $0.20$).

         Div.   Cap.   Liq.   Mom.   Low    Lev.   Sales  Book   Rem.   Cash
Book.     …      …      …    -0.33  -0.33  -0.36  -0.36    ·    -0.13   0.05
Rem.    -0.23   0.05  -0.06   0.20  -0.03  -0.17  -0.38  -0.13    ·    -0.11
Cash     0.14  -0.01   0.05  -0.04   0.05  -0.02   0.23   0.05  -0.11    ·

Table III
Correlation coefficients between 10 indicator-based factors for the U.K. companies: dividend (1), capitalization (2), liquidity (3), momentum (4), low-volatility (5), leverage (6), sales-to-market (7), book-to-market (8), remuneration (9), and cash (10). These coefficients were estimated from daily returns of these factors over the period from 23/02/2001 to 27/07/2015. Daily returns of each factor were normalized by their volatility averaged over 20 days to reduce the effects of heteroskedasticity. Similar correlation coefficients were obtained for the European companies (available upon request).

Figure 8 shows the evolution of two correlation coefficients between volatility-normalized daily returns of the remuneration, low-volatility, and sales-to-market factors. The correlation between the remuneration and low-volatility factors remains close to zero, with eventual deviations beyond the Gaussian significance range (e.g., during the subprime and financial crises in 2007-2009). These two factors can be considered uncorrelated. In turn, the negative correlation between the remuneration and sales-to-market factors always remains beyond the Gaussian significance range.
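The sliding-window correlation between volatility-normalized returns used in Fig. 8 can be sketched as follows. The two series below are synthetic stand-ins for the actual factor returns (not the paper's data); the 20-day normalization window and 90-day correlation window follow the text.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic daily returns standing in for the remuneration and
# sales-to-market factors, with a built-in negative correlation.
n = 1000
common = rng.normal(size=n)
rem = pd.Series(0.7 * common + rng.normal(size=n))
sales = pd.Series(-0.7 * common + rng.normal(size=n))

def vol_normalize(r, span=20):
    """Divide daily returns by their volatility over ~20 days,
    as done in the paper to reduce heteroskedasticity effects."""
    return r / r.rolling(span).std()

# 90-day sliding-window correlation, as in Fig. 8.
window = 90
corr = vol_normalize(rem).rolling(window).corr(vol_normalize(sales))

# Gaussian significance band for a correlation estimated over `window` points.
band = 1.0 / np.sqrt(window)
```

Rolling estimates wandering inside the band $\pm 1/\sqrt{90} \approx \pm 0.1$ are compatible with zero correlation; the persistent negative values of the synthetic pair mimic the remuneration/sales-to-market behavior in Fig. 8.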
G. The anomaly of the remuneration factor and its interpretation
Table IV compares the remuneration anomaly with other factors in terms of the annualized bias (the annualized cumulative return between the last and the first observation days), the Sharpe ratio (the annualized bias normalized by the annualized volatility), and the t-statistic (the Sharpe ratio multiplied by the square root of the total duration in years). In particular, the t-statistic allows one to reject the null hypothesis of no bias at the 90% confidence level. The bias reveals the level of overperformance due to a particular factor. We observe a significant bias for the dominant capitalization and low-volatility factors, which have been previously documented. The anomaly of the book-to-market factor seems to have disappeared (see Table IV). In fact, the Sharpe ratio of this factor, which we estimated to be 0.49 for the earlier period, became much smaller in recent years (and even changed sign for the European universe).

Figure 8. Correlation coefficients between daily returns of the remuneration factor and of the low-volatility factor (solid line) or the sales-to-market factor (dashed line) for the largest U.K. companies. The coefficients were computed over a sliding window of 90 days. Prior to computation, the daily returns were renormalized by their average volatility over the previous 20 days. The mean values over 15 years are $-0.03$ and $-0.38$ (see Table III), respectively. Horizontal dashed lines show the standard deviation expected for uncorrelated Gaussian returns ($1/\sqrt{90} \approx 0.1$).

A bias of 1.21% means that companies that pay better should overperform their less-paying competitors by $2 \times 1.21\% = 2.42\%$ per year. The prefactor 2 appears if we assume that 50% is invested in high remuneration and 50% in low remuneration (i.e., there is no exposure to the low-volatility factor and volatility is nearly homogeneously distributed). This is one of the most important results in this paper, as it shows that a market-neutral investment style arbitrage strategy based on the remuneration anomaly is likely to deliver positive returns.
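The arithmetic linking the bias, the Sharpe ratio, and the t-statistic of Table IV can be checked numerically; the values below are those reported for the U.K. remuneration factor.

```python
import math

def t_statistic(sharpe, years):
    """t-stat = Sharpe ratio x sqrt(total duration in years), as in Table IV."""
    return sharpe * math.sqrt(years)

# Remuneration factor, U.K. universe (Table IV).
sharpe = 0.37
years = 14.5                    # January 2001 to July 2015
t = t_statistic(sharpe, years)  # ~1.40, as reported

# Expected over-performance of high- vs low-remuneration companies:
# twice the annualized bias, under the 50/50 long-short assumption above.
bias = 1.21                     # annualized bias, in %
spread = 2 * bias               # 2.42% per year
```

The recovered t-statistic ($0.37 \times \sqrt{14.5} \approx 1.40$) matches the table, confirming the internal consistency of the three quantities.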
Next, assuming that the bias in the remuneration factor consists of an intrinsic bias and contributions from the biases of other factors due to inter-factor correlations, the relative impacts of these biases can be estimated by multiplying them by the correlation coefficients in the 9th line of Table III. These relative impacts are summarized in the last line of Table IV. Since most contributions from other factors are negative, it might be surmised that the intrinsic remuneration bias is even higher than 1.21% (estimated to be around 2.85%) but that its value is reduced due to correlations with other factors. If we were able to build a remuneration factor fully decorrelated from other factors, we would most likely have obtained a t-statistic above 3 (around 3.29, see Table IV) that fulfills the requirements formulated by Harvey et al. (2016). Note also that there is no selection bias in our study (we have not analyzed all the different possibilities to finally retain the remuneration factor), such that the condition requiring a t-statistic greater than 3 when taking into account the number of possible anomaly candidates is not applicable.
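The decomposition of the remuneration bias into an intrinsic part and impacts from correlated factors can be reproduced directly from the rows of Tables III and IV:

```python
# Correlations of the remuneration factor with the nine other factors
# (9th line of Table III, U.K. universe) and their annualized biases in %
# (Table IV, U.K. lines), in the order: dividend, capitalization, liquidity,
# momentum, low-volatility, leverage, sales-to-market, book-to-market, cash.
corr_with_rem = [-0.23, 0.05, -0.06, 0.20, -0.03, -0.17, -0.38, -0.13, -0.11]
bias = [2.12, -4.29, -0.11, -2.81, -3.81, -1.01, 0.92, 0.34, 2.60]

rem_bias = 1.21  # observed annualized bias of the remuneration factor, %

# Relative impact of each factor on the remuneration bias
# (reproduces the last line of Table IV).
impacts = [b * c for b, c in zip(bias, corr_with_rem)]

# Intrinsic remuneration bias: observed bias minus all impacts (~2.85%).
intrinsic = rem_bias - sum(impacts)
```

The first impact, for instance, is $2.12 \times (-0.23) \approx -0.49$, matching Table IV, and the recovered intrinsic bias is approximately 2.85%.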
In any event, the observed bias of 1.21% with a Sharpe ratio of 0.37 indicates that a horizon of roughly $1/0.37^2 \approx 7$ years is needed for the anomaly to emerge from the noise.

H. The rationale behind the remuneration anomaly
(Based on the publicly available data from Fama and French, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html)

          Div.   Cap.   Liq.   Mom.   Low    Lev.   Sales  Book   Rem.   Cash
Europe
Bias, %   2.39  -5.72  -0.95  -1.60  -4.15  -1.95   0.08  -0.23   0.68   1.66
Sharpe    0.80  -1.69  -0.41  -0.42  -1.46  -0.74   0.03  -0.08   0.25   0.65
t-stat    3.04  -6.42  -1.57  -1.59  -5.57  -2.81   0.11  -0.30   0.97   2.46
U.K.
Bias, %   2.12  -4.29  -0.11  -2.81  -3.81  -1.01   0.92   0.34   1.21   2.60
Sharpe    0.65  -1.38  -0.05  -0.71  -1.25  -0.35   0.31   0.11   0.37   0.92
t-stat    2.48  -5.24  -0.18  -2.71  -4.77  -1.34   1.16   0.40   1.40   3.51
Impact, % -0.49 -0.21   0.01  -0.56   0.11   0.17  -0.35  -0.04   2.85  -0.29

Table IV
The annualized bias (the annualized cumulative return between the last and the first observation days, as a percentage), the Sharpe ratio (the annualized bias normalized by the annualized volatility), and the t-statistic (the Sharpe ratio multiplied by the square root of the total duration in years, i.e., by $\sqrt{14.5} \approx 3.81$) for the following 10 indicator-based factors (quantile Q1): dividend (1), capitalization (2), liquidity (3), momentum (4), low-volatility (5), leverage (6), sales-to-market (7), book-to-market (8), remuneration (9), and cash (10). These quantities are estimated for the period from January 2001 to July 2015, for the largest European companies (top lines) and for the largest U.K. companies (bottom lines). The last line shows the relative impacts of the biases of various factors on the remuneration bias (1.21%) for the U.K. companies. These impacts are obtained by multiplying the biases in the fourth line by the correlation coefficients from the 9th line of Table III. The annualized bias for the remuneration factor in the U.K. universe is 1.21% with a t-statistic of 1.40. Moreover, if we subtract all the impacts from the remuneration's annualized bias, we obtain an intrinsic remuneration bias of 2.85%. Therefore, we would have a t-statistic of approximately $2.85 \times 1.40/1.21 \approx 3.29$, fulfilling the requirements of Harvey et al. (2016).

Our analysis clearly reveals correlations between the remuneration policies of a company and the performance of its stock. Do higher wages imply better performances, or do better performances lead to higher wages? More generally, is the relation between remuneration and performance causal? In analogy to the chicken-or-egg causality dilemma, wages and performances are likely to be entangled, while the causality direction can change from firm to firm or even over time. Although these challenging questions are difficult to answer in a quantitative way, we provide two arguments in favor of a causal relation between wages and performances. First, the remuneration factor is built on the published balance sheets that reflect the wages of the past year. As a consequence, there is a significant one- or even two-year delay between earlier remunerations and current performances. In this way, we capture the impact of wages on performances. Second, the top and bottom remuneration quantiles are not static but change in time (half of the companies in each quantile are replaced in approximately 5 years). One can speculate that the management of a company competes with others by offering higher remuneration to attract the best employees who will make the performance of the company stronger.
Like football team managers, companies could buy success by investing in human resources (Simmons and Forrest, 2004).

In a survey paper, Yellen (1984) poses the question of why firms do not cut wages in an economy characterized by involuntary unemployment. Indeed, unemployed workers would prefer to work at the going real wage rather than being unemployed, but firms will not hire them at a lower wage simply because any reduction in wage would lower employee productivity. This is Yellen's most-cited paper, and it stipulates that the amount of effort that employees put into their job depends on the difference between the wage they are paid and what they perceive as a "fair wage". The bigger the difference, the less hard they tend to work, which highlights the idea that paying employees more than the market-clearing wage may boost productivity and end up being worthwhile for the employer. Paradoxically, cutting wages may end up raising labor costs since it will negatively affect productivity (Stiglitz, 1981). Hence, productivity is the main argument, which is confirmed by other theoretical papers that consider employees to be more productive in larger firms and thus explain why they demand higher wages (Idson and Oi, 1999). The other arguments are as follows. Given job contract incompleteness, not all duties of an employee can be specified in advance. For this reason, monitoring is a central instrument to control production costs (Alchian and Demsetz, 1972). Unfortunately, monitoring is too costly and sometimes inaccurate due to measurement error. Instead of relying on costly and imperfect monitoring, firms can offer higher wages to their employees to create an incentive for the employee not to lose their high wage by being fired (Shapiro and Stiglitz, 1984).
In this context, paying a wage in excess of the market-clearing wage can be seen as an efficient way to prevent employees from shirking. The attractiveness of wages to skillful workers also contributes to reducing their turnover. Moreover, raising wages partly eliminates job applications from less performing candidates who would fear competing with overperforming candidates. This adverse selection is a subtle support for the fair-wage hypothesis because paying fair wages will attract only the more skillful workers and deter lemons, and will thus help avoid costly monitoring devices in the recruitment process. In summary, the motivation for the fair wage-effort hypothesis is a simple observation of human nature arguing that employees who receive less than what they perceive to be a fair wage will not work as hard as a consequence. In the very same vein, Akerlof and Yellen (1990) set up a model of unemployment in which "people work less hard if they are paid less than they deserve, but not harder if they receive more than they deserve". The model puts into equations the fair wage-effort hypothesis to represent the idea that a poorly paid employee may be keen on taking revenge on his employer.

VI. Discussion
A. Fama and French approach
Fama and French (1993, 2015) use time series of 25 portfolios, each portfolio built with stocks of similar capitalization and book-to-market. They regress the monthly performance $R_i(t)$ of each portfolio $i$ on the returns $f_j(t)$ of different factors $j$:

$$R_i(t) = a_i + \sum_j b_{i,j} f_j(t) + \varepsilon_i(t),$$

where $a_i$ and $\varepsilon_i(t)$ are the portfolio-specific intercept and noise, and $b_{i,j}$ is the estimated sensitivity of the $i$-th portfolio to the $j$-th factor.

If the remuneration factor had to be investigated using the FF approach, how could one proceed? Five different portfolios might be built with stocks sorted according to remuneration, and then at least three major factors might be used: the market index, capitalization, and book-to-market factors (the factor returns, $f_j(t)$, would be estimated through the performance of the long-short portfolio, e.g., buying the high capitalization and shorting the low capitalization, or buying the high book-to-market and shorting the low book-to-market). The intercepts, $a_i$, of the 5 different portfolios might be measured with their t-statistics to assess whether the remuneration is an anomaly. One might also measure $a_{high} - a_{low}$ and its t-statistics, as in Table 2 of Fama and French (2008). Finally, the remuneration factor might be added to the regression panel and the $R^2$ of every portfolio might be measured to quantify how well the data fit the statistical model and how well the common factors explain the price returns.

Instead, we simply measure the average returns of the HML portfolio (see Table IV) built to be beta-neutral without any regression, as we construct our remuneration factor as uncorrelated to the main factors. That should be close to the $a_{high} - a_{low}$ of the FF approach, or close to the average return of the HML portfolio built to be delta-neutral (see Table I from Fama and French (2015)). This is due to the fact that the remuneration factor is not exposed to the market index, low-volatility and book-to-market factors. However, the FF approach would not account for the fact that remuneration depends on sectors (see Table V). Using the volatility of the portfolio, we can also measure the t-statistics to learn whether the anomaly is statistically significant, and we measure the FCL to quantify how well the common factors explain the price returns.

Sectors                  Median book-to-market   Median remuneration (in euros)
Consumer discretionary   0.31443                 22 859.96
Consumer staple          0.24681                 39 416.51
Energy                   0.81440                 137 625.91
Financial                0.87972                 126 498.10
Health                   0.24442                 51 452.06
Industrial               0.32765                 58 626.27
IT                       0.19867                 77 854.94
Material                 0.55733                 32 516.14
Telecom                  0.39122                 66 283.21
Utilities                0.32572                 47 014.69

Table V
Sectorial variations of the median book-to-market and of the median remuneration (in euros) for the U.K. universe in 2014. Both book-to-market value and remuneration vary substantially across sectors.

In Appendix B we compare the FF approach to our methodology. In particular, we show that the sectorial constraint and the beta-neutral property are the two key advantages of our factor construction: without them, the FF approach applied to the same period would give insignificant results for the remuneration factor (we recall that most Fama and French data begin from 1963, which leads to greater t-statistics).

B. Advantages and limitations of the methodology
Our methodology has several advantages over the FF approach:

1. The estimated FCL quantifying the relevance of a factor does not depend on the number of considered factors, in contrast to the $R^2$ argument of the FF approach (e.g., see Table 6 in Fama and French (1993)). Thus, one can select the most important factors (e.g., the stock index, low-volatility, capitalization, liquidity, and momentum factors) in asset pricing models.

2. The sensitivities of the different common risk factors to the market (i.e., to the stock index) are maintained at zero even for the low-volatility factor, which is an important feature because the market mode may have a hundred times greater impact on portfolio returns than other factors.

3. The factors are constructed to be sector-neutral, which allows one to better identify their impacts on price variations; this is important because intra-sector correlations are typically more important than within-factor correlations. Notably, the book-to-market factor of the FF approach also captures sectorial risk, as firms are not priced in the same way from one sector to another (see Table V). In particular, the remuneration is very different from one sector to another.

4. The weights ($w_i$) of the stocks that are close in capitalization (or in book-to-market, or in remuneration, etc., depending on the factor) are of the same order of magnitude, which reduces the specific risk of the factor.

5. Maintaining factors beta-neutral at any time reduces the noise of factors, even of those that are not supposed to be correlated to the stock index. In fact, we show in Appendix B that, in the case of factors uncorrelated to the stock index, the beta-neutral constraint reduces the volatility of the factor by 1.2% on an annualized basis.

6. Our method enables the inclusion of the low-volatility factor into the cross-section of average returns (in contrast to the FF approach) without any multiregression model. The low-volatility and capitalization factors were found to provide the largest anomalies (see Table IV). In addition, the low-volatility factor was also identified as the major contribution to risk, according to our measurement (see Fig. 5). Surprisingly, the capitalization factor, which had previously been considered the most important, now occupies the second position. Moreover, the book-to-market factor, identified by Fama and French (1993) as important, has eventually become a minor factor (just slightly more important than the remuneration factor) once the sectoral and market modes have been eliminated.

The main limitations of our methodology are related to the methodology itself. Indeed, although the indicator-based factors and their relevance assessment through the FCL were inspired by the eigenbasis, this construction does not pretend to yield true eigenvectors and eigenvalues of the covariance (or correlation) matrix. In particular, the correlations observed between several factors (e.g., the remuneration and sales-to-market factors) indicate that the performed decorrelation is not perfect. Although the construction of factors can be further refined to make them less correlated (e.g., by splitting the stocks into smaller groups than supersectors), it is difficult to quantitatively assess the quality of such improvements.
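The beta-neutral constraint of advantage 5 can be sketched as follows. This is a minimal illustration, not the paper's implementation: we assume each leg of a long-short portfolio is rescaled by a factor ($\mu_+$ for the long leg, $\mu_-$ for the short leg) chosen so that the aggregated beta vanishes; all weights and betas below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical long and short legs with volatility-based weights.
n = 20
w_long = rng.uniform(0.5, 1.5, n)        # positive weights
w_short = -rng.uniform(0.5, 1.5, n)      # negative weights
beta_long = rng.normal(1.0, 0.37, n)     # s_beta ~ 0.37, as in the paper
beta_short = rng.normal(1.0, 0.37, n)

# Rescale the legs so that mu_plus * sum(w_L * b_L) + mu_minus * sum(w_S * b_S) = 0.
b_L = np.dot(w_long, beta_long)          # positive
b_S = np.dot(w_short, beta_short)        # negative
mu_plus, mu_minus = 1.0 / b_L, -1.0 / b_S

w = np.concatenate([mu_plus * w_long, mu_minus * w_short])
beta = np.concatenate([beta_long, beta_short])
aggregated_beta = np.dot(w, beta)        # ~0 by construction
```

After the rescaling the portfolio is beta-neutral while each leg keeps its sign, which removes the random market exposure whose volatility cost is estimated at about 1.2% per year in Appendix B.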
VII. Conclusion
We identify a new anomaly in asset pricing that is statistically significant and economically relevant. It is linked to remuneration: the more a company pays for salaries and benefits expenses per employee, the better its stock performs. We show that remuneration is a common risk factor, although its magnitude appears relatively small compared with dominant factors such as low-volatility or capitalization. It also appears that only the companies that belong to the extreme quantiles are sensitive to the remuneration factor. To validate the abnormal performance associated with the remuneration factor, we check that this performance is not explained by other major factors such as low-volatility, capitalization, book-to-market, or momentum. This finding is an empirical contribution to asset pricing because employees' remuneration has not been accounted for so far, while it is a determinant element in social sciences including labor economics, sociology and management. These various strands of literature show that strong attention should be paid to wages and, more generally, to labor decisions that are likely to affect firms' value. The economic interpretation of our key finding is mainly based on a rational explanation of the remuneration anomaly: wages and employee performance are positively correlated. This argument is overall supported by the efficiency wage theory, which claims that raising wages is the best way to increase output per employee because it links pecuniary incentives to employee performance. But it is also supported by several studies highlighting the prominent role of operating leverage as a main source of riskiness of equity returns, comparable in magnitude to financial leverage.

For this purpose, we introduce an original methodology, coined "Factor Correlation Level" (FCL), to build indicator-based factors. The FCL describes the ability of stocks within the factor to move in a common way and thus reflects the common risk level underpinning each factor.
The FCL methodology is a theoretical contribution to the asset pricing literature. Indeed, it allows ordering the factors according to their capacity to account for the variability of stocks. This ranking can help fund managers to select the most important factors to set up an asset pricing model and well-balanced portfolios. The FCL approach is an alternative to the common practice in asset pricing studies, where factor selection depends on several statistical criteria that do not necessarily convey the same information.

The implications of this work are important, numerous, and go far beyond the asset pricing literature. A first investment-style implication of our finding is that the companies that pay better should overperform their competitors by 2.42% per year. In other words, a market-neutral investment style arbitrage strategy based on the remuneration anomaly would likely deliver positive returns. A second economic implication is that a company might operate better if it could attract the best human resources while keeping the company as competitive as possible by retaining only those employees who are productive. While we find that a company that pays its shareholders too much pays less to its employees (according to the negative correlation between the remuneration and dividend factors), attention should be brought by top managers to this trade-off between equity capital and labor remuneration. A third research implication is that our new methodology suggests the following ranking of the European stocks' factors according to their respective FCLs: low-volatility (1.73), capitalization (1.72), momentum (1.41), sales-to-market (1.22), liquidity (1.19), book-to-market (1.13), dividend (1.09), leverage (1.07), remuneration (0.99), and cash (0.92). In particular, the low-volatility factor, which is excluded from the FF approach, is the next most important component following the market factor (i.e., the stock index). The remuneration factor is comparable to the book-to-market factor and thus not negligible. We conclude that a five-factor model should encapsulate the first five anomalies ordered by their FCL.
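The FCL-based selection described above can be written directly, using the values reported in the conclusion:

```python
# FCL values for the European stocks, as reported in the conclusion.
fcl = {
    "low-volatility": 1.73, "capitalization": 1.72, "momentum": 1.41,
    "sales-to-market": 1.22, "liquidity": 1.19, "book-to-market": 1.13,
    "dividend": 1.09, "leverage": 1.07, "remuneration": 0.99, "cash": 0.92,
}

# Rank the factors by decreasing FCL and keep the top five,
# i.e., the anomalies a five-factor model should encapsulate.
ranked = sorted(fcl, key=fcl.get, reverse=True)
five_factor_model = ranked[:5]
```

The resulting top five are low-volatility, capitalization, momentum, sales-to-market, and liquidity, consistent with the selection suggested in advantage 1 of Section VI.B.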
Appendix A. Supersectors
Following the Global Industry Classification Standard (GICS), we constructed six supersectors, as summarized in Table VI. This redistribution was performed manually and aimed at minimizing intrasector correlations and at obtaining an almost equal number of stocks in each supersector. We emphasize that the final portfolios include stocks from all supersectors, i.e., this redistribution is only an intermediate technical step to improve the factors.
Appendix B. Comparison with FF approach
In order to highlight the advantages of our methodology as compared to the standard FF approach, it is instructive to consider incremental transformations from one method to the other. In this way, one can analyze the respective roles of the several proposed improvements. For this purpose, we implement the standard FF approach and its progressive modifications.

• A0 (the standard FF approach): According to Table I from Fama and French (2015), stocks are subdivided into two groups of small (below median) and large (above median) capitalization. Within each of the two groups, assets are ordered according to the chosen indicator (e.g., remuneration) and then split into three subgroups (top, medium and bottom 33%). The related portfolio is constructed by buying the top 33% and selling the bottom 33% of assets from the sorted list with equal weights. The two portfolios prepared in this way (for the small and large capitalization groups) are then merged into a single FF portfolio. To be comparable with our methodology, the portfolio is rebalanced on a daily basis (note that the original FF approach stipulated monthly rebalancing). The constructed portfolio is delta-neutral.

• A1: The same rules as A0, except for buying the top 15% and selling the bottom 15% of assets (as in our methodology);

• A2: The same rules as A1, except that the splitting into small and large capitalization groups is withdrawn;

• A3: The same rules as A2, except that we add sectorial and geographical constraints as in our methodology. In other words, assets are split into 6 supersectors (see Appendix A), the portfolio construction is performed individually for each supersector, and then the obtained portfolios are merged. In addition, we normalize the chosen indicator (e.g., remuneration) by the median per country to correct for geographical biases;

• A4: The same rules as A3, except that equal weights are replaced by volatility-based weights as in our methodology;

• A5: The same rules as A4, except that the volatility-based weights are rescaled by factors $\mu_\pm$ to get beta-neutral portfolios (betas are estimated through a standard methodology);

• A6 (our methodology): The same rules as A5, except that the standard volatility and beta estimations (by exponential moving averages) are replaced by the reactive volatility model.

1  Food & Staples Retailing; Food, Beverage & Tobacco; Health Care Equipment & Services; Household & Personal Products; Pharmaceuticals, Biotechnology & Life Sciences
2  Banks; Diversified Financials; Insurance
3  Consumer Durables & Apparel; Consumer Services; Media; Retailing
4  Materials; Real Estate
5  Energy; Transportation; Utilities
6  Automobiles & Components; Capital Goods; Commercial & Professional Services; Software & Services; Technology Hardware & Equipment; Telecommunication Services

Table VI
Six supersectors that we used to split stocks and to construct the indicator-based factors (from the FACTSET database). Note that we mixed very different industries to have 6 supersectors with approximately the same number of stocks. Even if different industries were grouped randomly into six supersectors, we show in Appendix B that our methodology would significantly reduce the sectorial risk of the different factors.

Each of these seven approaches (A0, ..., A6) has been applied to both the U.K. and European universes. We computed the mean return and volatility of the ten factor-based portfolios introduced in this paper.
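The A0 construction can be sketched on hypothetical data as follows; the capitalizations and indicator values are random stand-ins, and the equal weights are normalized so that each leg of each portfolio sums to $\pm 1$.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical cross-section: capitalization and a sorting indicator
# (e.g., remuneration) for 600 stocks.
n = 600
df = pd.DataFrame({
    "cap": rng.lognormal(0.0, 1.0, n),
    "indicator": rng.normal(0.0, 1.0, n),
})

def ff_portfolio(group, q=1 / 3):
    """A0 step: long the top-q and short the bottom-q assets by the
    indicator, with equal weights (each leg normalized to +/-1)."""
    ranked = group.sort_values("indicator")
    k = int(len(ranked) * q)
    w = pd.Series(0.0, index=ranked.index)
    w.iloc[-k:] = 1.0 / k    # top quantile, long
    w.iloc[:k] = -1.0 / k    # bottom quantile, short
    return w

# Split around the median capitalization, build one portfolio per group,
# then merge the two into a single FF portfolio.
small = df[df["cap"] <= df["cap"].median()]
large = df[df["cap"] > df["cap"].median()]
weights = pd.concat([ff_portfolio(small), ff_portfolio(large)])

net = weights.sum()  # delta-neutral: the long and short legs cancel
```

Steps A1 to A6 then modify this baseline one ingredient at a time (quantile size, capitalization split, sectorial constraints, volatility weights, beta-neutrality, reactive estimators).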
To be closer to the standard Fama and French framework, we presentresults on monthly basis, in contrast to the main text, in which daily basis was used. TableVII recapitulates the main findings for the European universe (similar results were obtainedfor the U.K. universe, available upon request).As expected, the change of quantiles (passage from the standard A0 approach to A1)almost does not affect the results. Similarly, a standard volatility/beta estimator and thereactive volatility/beta model lead to similar results (passage from A5 to A6). The mostsignificant changes are observed when passing from A2 to A3 and from A4 to A5. • In the former case, adding the sectorial constraints (see Appendix A) reduces secto-rial biases and allows one to better capture the indicator-based factors. To illustrate thispoint, let us suppose that remuneration is very high in the energy industry and is low (atapproximately the same level) in all other industries. If there was no sectorial constraint, theremuneration factor would be long on the energy industry and short in all other industries.In other words, it would be 100% invested in energy, with eventual high risks. In turn, the39 iv. Cap. Low Mom. Liq. Lev. Sales. Book. Rem. Cash. 
A Mean 0.35% -0.93% 0.30% -0.64% 0.68% 0.02% -0.46% -0.33% -0.03% 0.35%Std 3.20% 0.56% 4.90% 5.72% 3.92% 2.87% 3.16% 3.16% 2.04% 1.99%t-stat 1.46 -22.40 0.83 -1.49 2.32 0.10 -1.97 -1.40 -0.18 2.37 A Mean 0.37% -0.92% 0.34% -0.79% 0.75% 0.07% -0.53% -0.42% -0.02% 0.32%Std 3.15% 0.44% 4.81% 5.52% 3.77% 2.80% 2.98% 3.04% 1.98% 1.97%t-stat 1.58 -27.98 0.95 -1.91 2.66 0.32 -2.40 -1.87 -0.12 2.17 A Mean 0.37% -1.12% -0.21% -0.49% 0.31% -0.23% -0.49% -0.27% -0.07% 0.38%Std 3.41% 1.39% 4.80% 6.01% 3.85% 2.54% 3.18% 3.47% 2.00% 1.96%t-stat 1.45 -10.82 -0.59 -1.09 1.07 -1.20 -2.09 -1.05 -0.44 2.62 A Mean 0.41% -0.96% -0.19% -0.61% 0.31% -0.21% -0.40% -0.39% 0.00% 0.39%Std 2.65% 1.17% 3.85% 4.99% 3.35% 2.31% 3.05% 2.60% 1.91% 1.69%t-stat 2.06 -10.97 -0.68 -1.63 1.22 -1.23 -1.77 -2.03 0.02 3.11 A Mean 0.41% -0.96% -0.19% -0.60% 0.30% -0.21% -0.41% -0.40% 0.00% 0.40%Std 2.65% 1.17% 3.85% 4.98% 3.34% 2.31% 3.05% 2.59% 1.91% 1.68%t-stat 2.06 -10.97 -0.68 -1.62 1.22 -1.19 -1.79 -2.06 0.03 3.17 A Mean 0.41% -1.16% -0.86% -0.11% -0.34% -0.46% 0.02% -0.08% 0.22% 0.25%Std 2.09% 1.97% 1.90% 3.34% 2.37% 1.58% 1.95% 1.94% 1.53% 1.61%t-stat 2.61 -7.88 -6.04 -0.43 -1.94 -3.94 0.13 -0.58 1.92 2.06 A Mean 0.45% -1.17% -0.82% -0.16% -0.36% -0.40% -0.03% -0.10% 0.19% 0.24%Std 2.05% 1.91% 1.94% 3.33% 2.44% 1.59% 2.06% 2.00% 1.50% 1.60%t-stat 2.94 -8.19 -5.63 -0.66 -2.00 -3.36 -0.22 -0.66 1.73 1.98
Table VII
Progressive evaluation of factor performances with the incremental transition from the FF approach (A0, top) to our methodology (A6, bottom). For each factor, we present the mean monthly return (Mean) and its volatility (Std), as well as the corresponding t-statistic (t-stat).

In turn, the sectorial constraint reduces this risk to approximately 1/6 of its value because the strong concentration on energy only remains in the 5th supersector, while investments in the other industries are necessarily imposed for the other supersectors. For instance, if the annualized sectorial volatility is 12%, such an enforced diversification would reduce it to 2% on an annualized basis.

• In the latter case, we switch from delta-neutral to beta-neutral portfolios, i.e., we (partly) remove correlations with the stock market index. We put forward two possible origins to rationalize the significant decrease of volatility when passing from A4 to A5. First, if we suppose that stock betas follow a distribution with standard deviation s_β, the average aggregated beta of a random delta-neutral factor built with 2 × 15% × 500 = 150 stocks would be 0, while its standard deviation would be 2 s_β/√150 ≈ 0.16 s_β ≈ 6%, as s_β ≈ 0.37 from our data. As a consequence, the volatility added by the random exposure to the market index is around 6% × σ_m ≈ 1.2% on an annualized basis, where σ_m ≈ 21% is the annualized volatility of the market index. Second, our construction of beta-neutral portfolios reduces their leverage to ensure Eq. (9). Consequently, smaller investments lead to smaller volatility, as compared to the Fama and French construction with a constant investment.

One also observes that the volatilities of the factors progressively diminish when passing from A0 to A6. This observation indicates that our modifications better withdraw the other common risks and manage to concentrate on the risk of interest. Looking more specifically at the remuneration factor, one can observe a significant increase of the t-stat, from −0.18 (insignificant) to 1.73 (significant), when passing from the standard FF approach (A0) to our methodology (A6). In other words, implementing the above improvements allowed us to level up the remuneration factor from noise to a small but significant anomaly.
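The order-of-magnitude argument for the residual market exposure of a delta-neutral portfolio can be checked with a short Monte Carlo simulation. The sketch below is an illustration only: it assumes normally distributed betas with cross-sectional standard deviation s_β = 0.37 and equal weights of ±1/75 on the long and short legs (75 stocks each), which is one simple way to realize the 150-stock construction described above.

```python
import numpy as np

rng = np.random.default_rng(0)

s_beta = 0.37      # cross-sectional std of stock betas (value quoted in the text)
sigma_m = 0.21     # annualized volatility of the market index
n_trials = 50_000

# Delta-neutral weights: each leg fully invested, +1/75 per long stock,
# -1/75 per short stock, so the weights sum to zero.
weights = np.r_[np.full(75, 1 / 75), np.full(75, -1 / 75)]

# Betas drawn independently of the long/short assignment ("random" factor);
# the mean beta (set to 1 here) is irrelevant because the weights sum to zero.
betas = rng.normal(1.0, s_beta, size=(n_trials, 150))
agg_beta = betas @ weights   # aggregated beta of each random portfolio

print(agg_beta.mean())            # close to 0
print(agg_beta.std())             # close to 2*s_beta/sqrt(150), i.e. ~0.06
print(agg_beta.std() * sigma_m)   # added annualized volatility, ~1.2-1.3%
```

The 2/√150 ≈ 0.16 prefactor comes from the gross leverage of 2 (one unit long, one unit short) spread over 150 names: the variance of the aggregated beta is the sum of the squared weights times s_β².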
We complete this Appendix with the following general remark. The variability of the results presented in Table VII indicates their dependence on the chosen data analysis method and its parameters. The methodology therefore plays a crucial role, especially when dealing with small anomalies such as remuneration. This highlights the advantage of our method, which enabled us to detect and quantify such small features in the market behavior. At the same time, our methodology remains robust to certain changes in the construction of the factors, such as replacing the conventional volatility estimator with the reactive volatility model, using volatility-renormalized weights, or switching from daily to monthly returns.
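As a reading aid for Table VII, the reported t-stat values compare a mean monthly return to its sampling error. The minimal sketch below uses simulated data: the 15-year sample length and the return moments (loosely inspired by the A6 remuneration column) are hypothetical, not the thesis sample, and the one-sample t-statistic used here is the standard textbook form rather than a restatement of the exact computation behind the table.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical monthly factor returns: 0.19% mean, 1.50% volatility -- NOT the thesis data.
n_months = 180
returns = rng.normal(0.0019, 0.0150, n_months)

mean = returns.mean()
std = returns.std(ddof=1)                   # sample standard deviation
t_stat = mean / (std / np.sqrt(n_months))   # one-sample t-statistic

print(f"Mean = {mean:.2%}, Std = {std:.2%}, t-stat = {t_stat:.2f}")
```

Because the t-statistic scales the Mean/Std ratio by √n, the same monthly performance becomes more significant over a longer sample, which is why a small anomaly such as remuneration needs a careful methodology to surface at all.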
REFERENCES
Abowd, J.M., Kramarz, F., and Margolis, D.M. 1999. High wage workers and high wagefirms.
Econometrica
67, 251-333.
Adams, J.S. 1963. Toward an understanding of inequity.
Journal of Abnormal and SocialPsychology
67, 422-436.Akerlof, G.A. 1982. Labour contracts as a partial gift exchange.
Quarterly Journal of Eco-nomics
97, 543-69.Akerlof, G.A. and Yellen, J.L. 1990. The fair wage-effort hypothesis and unemployment.
Quarterly Journal of Economics
American Economic Review
62, 777-795.Allez R. and Bouchaud J.-P., 2012. Eigenvector dynamics: General theory and some appli-cations.
Physical Review E
86, 046202.Andersen T. G., Bollerslev T., Diebold F. X., and Labys P., 2000. Exchange rate returnsstandardized by realized volatility are (nearly) gaussian.
Multinational Finance Journal
Journal of Finance
61, 259-299.Avramov, D., Chordia, T. 2006. Asset pricing models and financial market anomalies.
Reviewof Financial Studies
19, 1001-1040.Babecky, J., Du Caju, P., Kosma, T., Lawless, M., Messina, J., and Room, T. 2012. How doEuropean firms adjust their labour costs when nominal wages are rigid?
Labour Economics
19, 732-801.Baker, G. 1992. Incentive contracts and performance measurement.
Journal of Political Econ-omy
Journal of Portfolio Management
Journal of Financial Economics
9, 3-18.Bebchuk, L., Fried, J.M., and Walker, D.I. 2002. Managerial power and rent extraction inthe design of executive compensation.
University of Chicago Law Review
69, 751-846.
Belfield, C.R., and Wei, X. 2004. Employer size-wage effects: evidence from matched employer-employee survey data in the UK.
Applied Economics
36, 185-193.Belo, F., Lin, X., and Bazdresch, S. 2014. Labor hiring, investment and stock return pre-dictability in the cross section.
Journal of Political Economy
Studies in the Theory of Capital Markets , New York: Praeger.Blau, P.M. 1955. The Dynamics of bureaucracy: A study of interpersonal relations in twogovernment agencies. (Chicago: Chicago University Press).Bouchaud J.-P., Matacz A., and Potters M., 2001. Leverage effect in financial markets: Theretarded volatility model.
Physical Review Letters
87, 1-4.Campbell, J.Y., and Vuolteenaho, T. 2004. Bad beta, good beta.
American Economic Review
94, 1249-1275.Carhart, M. 1997. On persistence in mutual fund performance.
Journal of Finance
52, 57-82.Carlson, M., Fisher, A., and Giammarino, R. 2004. Corporate investment and asset pricedynamics: Implications for the cross-section of returns.
Journal of Finance
59, 2577-2603.Cederburg, S., Davies, P., O’Doherty, M. 2015. Asset-pricing anomalies at the firm level.
Journal of Econometrics
Journalof Business
59, 383-403.Cheng, I.-H., Hong, H. and Scheinkman, J.E.A. 2015. Yesterday’s heroes: Compensationand risk at financial firms.
Journal of Finance
70, 839-879.Cochrane, J. 2005. Asset pricing. Revised edition.
Princeton University Press .Danthine, J., and Donaldson, J. 2002. Labor relations and asset pricing.
Review of EconomicStudies
69, 41-64.Deci, E.L., Koestner, R., and Ryan, R. 1999. A meta-analytic review of experiments exam-ining the effects of extrinsic rewards on intrinsic motivation.
Psychological Bulletin
Journal of Finance
Journal ofPolitical Economy
81, 607-636.Fama, E. 1980. Agency problems and the theory of the firm.
Journal of Political Economy
88, 288-307.Fama, E., and French, K.R. 1992. The cross-section of expected stock returns.
Journal ofFinance
47, 427-465.Fama, E.F., and French, K.R. 1993. Common risk factors in the returns on stocks and bonds.
Journal of Financial Economics
33, 3-56.Fama, E.F., and French, K.R. 1996. Multifactor explanations of asset pricing anomalies.
Journal of Finance
51, 55-84.Fama, E.F., and French, K.R. 1998. Value vs. Growth: The international evidence.
Journalof Finance
53, 1975-1999.Fama, E.F., and French, K.R. 2008. Dissecting anomalies.
Journal of Finance
63, 1653-1678.Fama, E.F., and French, K.R. 2012. Size, value, and momentum in international stock re-turns.
Journal of Financial Economics
Journal of Financial Economics
Review of Financial Studies
29, 148-192.Fehr, E. and Falk, A. 2002. Psychological foundations of incentives.
European EconomicReview
46, 687-724.Ferguson, M.F., and Shockley, R.L. 2003. Equilibrium anomalies.
Journal of Finance
Journal ofFinancial Economics
91, 24-37.Gabaix, X., and Landier, A. 2008. Why has CEO pay increased so much?
Quarterly Journalof Economics
Journal of Finance
70, 733-768.Gaver, J., and Gaver, K.M. 1995. Compensation policy and the investment opportunity set.
Financial Management
24, 19-32.Gibbons, M.R., Ross, S.A., and Shanken, J. 1989. A test of the efficiency of a given portfolio.
Econometrica
57, 1121-1152.Gibbons, M.R., and Murphy, K.J. 1992. Optimal incentive contracts in the presence of careerconcerns: Theory and evidence.
Journal of Political Economy
Journal of Economic Perspectives
25, 191-209.Gomez, J-P., Priestley, R., and Zapatero, F. 2015. Labor income, relative wealth concerns,and the cross-section of stock returns. Forthcoming,
Journal of Financial and QuantitativeAnalysis .Gourio, F. 2007. Labor leverage, firms heterogeneous sensitivities to the business cycle, andthe cross-section of returns. Working paper, Boston UniversityGraham, J.R., Li, S. and Qiu, J. 2012. Managerial attributes and executive compensation.
Review of Financial Studies
25, 144-186.Green, C., and Heywood, J.S. 2008. Does performance pay increase job satisfaction?
Eco-nomica
75, 710-728.Grinblatt, M., and Moskowitz, T. 2004. Predicting stock price movements from past returns:The role of consistency and tax-loss selling.
Journal of Financial Economics
71, 541-579.Hall, B.J., and Murphy, K.J. 2003. The trouble with stock options.
Journal of EconomicPerspectives
17, 49-70.Harvey, C.R., Liu, Y., and Zhu, H. 2016. ... and the cross-section of expected returns.
Review of Financial Studies, 29, 5-68.
Harvey, C.R., and Liu, Y. 2015. Lucky factors. Working paper, Duke University.
Holmstrom, B. 1999. Managerial incentive problems: A dynamic perspective.
Review ofEconomic Studies
66, 169-182.
Hou, K., Xue, C., and Zhang, L. 2015. Digesting anomalies: An investment approach.
Reviewof Financial Studies
28, 650-705.Idson, T.L., and Oi, W.Y. 1999. Workers are more productive in large firms.
AmericanEconomic Review
89, 104-108.Jegadeesh, N., and Titman, S. 1993. Returns to buying winners and selling losers: Implica-tions for stock market efficiency.
Journal of Finance
48, 65-91.
Jensen, M., Murphy, K.J. and Wruck, E. 2004. Remuneration: Where we’ve been, how we got to here, what are the problems, and how to fix them. Mimeo, Harvard University.
Jordan, B.D. and Riley, T.B. 2013. Dissecting the low volatility anomaly. Working paper, University of Kentucky.
Kuehn, L-A., Petrosky-Nadeau, N., and Zhang, L. 2013. An equilibrium asset pricing model with labor market search. Working paper, NBER.
Lallemand, T., Plasman, R., and Rycx, F. 2007. The establishment-size wage premium: Evidence from European countries.
Empirica
34, 427-451.Laloux L., Cizeau P., Bouchaud J.-P., and Potters M. 1999. Noise dressing of financialcorrelation matrices.
Physical Review Letters
83, 1467.Lazear, E.P. 2000. Performance pay and productivity.
American Economic Review
90, 1346-1361.Liu, X., and L. Zhang, 2008, Momentum profits, factor pricing, and macroeconomic risk.
Review of Financial Studies
21, 2417-2448.McLean, R.D. and Pontiff, J. 2015, Does academic research destroy stock return predictabil-ity?
Journal of Finance
71, 5-32.Monika, M., and Yashiv, E. 2007. Labor and the market value of the firm.
American Eco-nomic Review
97, 1419-1431.O’Byrne, S.F. and Young, S.D. 2010. What investors need to know about executive pay.
Journal of Investing
19, 36-44.
Ochoa, M. 2013. Volatility, labor heterogeneity and asset prices. Working paper, Federal Reserve Board of Washington D.C.
Oi, W.Y. and Idson, T.L. 1999. Firm size and wages.
Handbook of labor economics
3, 2165-2214.Plerou V., Gopikrishnan P., Rosenow B., Amaral L. A. N., and Stanley H.E., 1999. Universaland nonuniversal properties of cross correlations in financial time series.
Physical ReviewLetters
83, 1471.Plerou V., Gopikrishnan P., Rosenow B., Amaral L. A. N., Guhr T., and Stanley H. E.,2002. Random matrix approach to cross correlations in financial data.
Physical Review E
65, 066126.Potters M., Bouchaud J.-P., and Laloux L., 2005. Financial applications of random matrixtheory: Old laces and new pieces.
Acta Physica Polonica B
36, 2767-2784.Rynes, S., Schwab, D., and Heneman, H. 1983. The role of pay and market pay variability injob application decisions.
Organizational Behavior and Human Performance
31, 353-364.Rynes, S., Gerhart, B., and Minette, K. 2004. The importance of pay in employee motivation:Discrepancies between what people say and what they do.
Human Resource Management
43, 381-394.Santos, T., and Veronesi, P. 2006. Labor income and predictable stock returns.
Review ofFinancial Studies
19, 1-44.Schwert, G.W. 2003. Anomalies and market efficiency. Chapter 15 in Handbook of the eco-nomics of finance. eds. G. Constantinides, M. Harris, and R. M. Stulz, North-Holland,937-972.Shapiro, C., and Stiglitz, J.E. 1984. Equilibrium unemployment as a worker discipline device.
American Economic Review
74, 433-444.Simmons, R., and Forrest, D. 2004. Buying success: Relationships between team performanceand wage bills in the US and European sports leagues. In
International sports economics comparisons, ed. R. Fort and J. Fizel, 123-40. Westport: Praeger.
Smith, A. 1776. An Inquiry into the Nature and Causes of the Wealth of Nations. London: Methuen and Co., Ltd., ed. Edwin Cannan, 1904. Fifth edition.
Stiglitz, J.E. 1981. Alternative theories of wage determination and unemployment: The efficiency wage model. Working paper, Princeton University.
Valeyre S., Grebenkov D. S., Aboura S., and Liu Q., 2013. The reactive volatility model.
Quantitative Finance
13, 1697-1706.Wang D., Podobnik B., Horvatic D., and Stanley H. E., 2011. Quantifying and modeling long-range cross correlations in multiple time series with applications to world stock indices.
Physical Review E
83, 046121.Winter-Ebmer, R., and Zweimuller, J. 1999. Firm-size wage differentials in Switzerland:Evidence from job-changers.
American Economic Review
89, 89-93.Yellen, J.L. 1984. Efficiency wage models of unemployment.
American Economic Review.

II General Conclusion

Bibliography
Acharya, Viral and Lasse Pedersen (2005), “Asset pricing with liquidity risk.”
Journal of Financial Economics , 77, 375–410.Agarwal, Vikas (2004), “Risks and portfolio decisions involving hedge funds.”
Review of Financial Studies , 17, 63–98.Ahdida, Abdelkoddousse and Aurélien Alfonsi (2013), “A mean-reverting sdeon correlation matrices.”
Stochastic Processes and their Applications , 123,1472–1520.Allez, Romain and Jean-Philippe Bouchaud (2012), “Eigenvector dynamics:General theory and some applications.”
Phys. Rev. E , 86, 046202.Amihud, Yakov (2002), “Illiquidity and stock returns: cross-section and time-series effects.”
Journal of Financial Markets , 5, 31–56.Ang, Andrew and Joseph Chen (2007), “Capm over the long run: 1926-2001.”
Journal of Empirical Finance , 14, 1–40.Ang, Andrew, Robert Hodrick, Yuhang Xing, and Xiaoyan Zhang (2006),“The cross-section of volatility and expected returns.”
Journal of Finance, 61, 259–299.
Ang, Andrew, Robert Hodrick, Yuhang Xing, and Xiaoyan Zhang (2009), “High idiosyncratic volatility and low returns: International and further U.S. evidence.”
Journal of Financial Economics , 91, 1–23.Ang, Andrew, Assaf A. Shtauber, and Paul C. Tetlock (2013), “Asset pricingin the dark: The cross-section of otc stocks.”
Review of Financial Studies ,26, 2985–3028.Asness, Clifford S, Robert J Krail, and John M Liew (2001), “Do hedge fundshedge?”
The Journal of Portfolio Management , 28, 6–19.Asness, Clifford S., Tobias J. Moskowitz, and Lasse Pedersen (2013), “Valueand momentum everywhere.”
Journal of Finance , 68, 929–985.Baker, Malcolm, Brendan Bradley, and Ryan Taliaferro (2014), “The low-risk anomaly: A decomposition into micro and macro effects.”
FinancialAnalysts Journal , 70, 43–58.Bali, Turan G., Robert F. Engle, and Yi Tang (2017), “Dynamic condi-tional beta is alive and well in the cross section of daily stock returns.”
Management Science , 63, 3760–3779.Banz, Rolf W. (1981), “The relationship between return and market value ofcommon stocks.”
Journal of Financial Economics , 9, 3–18.Barra (1998),
Risk Model Handbook, United States Equity, Version 3 (E3) .Bass, Robert, Scott Gladstone, and Andrew Ang (2017), “Total portfoliofactor, not just asset, allocation.”
The Journal of Portfolio Management ,43, 38–53.Bekaert, Geert and Guojun Wu (2000), “Asymmetric volatility and risk inequity markets.”
Review of Financial Studies, 13, 1–42.
Bender, Jennifer, Xiaole Sun, Ric Thomas, and Volodymyr Zdorovtsov (2018), “The promises and pitfalls of factor timing.”
The Journal ofPortfolio Management , 44, 79–92.Benichou, Raphael, Yves Lempérière, Emmanuel Sérié, Julien Kockelkoren,Philip Seager, Jean-Philippe Bouchaud, and Marc Potters (2017),“Agnostic risk parity: Taming known and unknown-unknowns.”
Journalof Investment Strategies , 6, 1–12.Benzaquen, Michael, Iacopo Mastromatteo, Zlotan Eisler, and Jean-PhilippeBouchaud (2017), “Dissecting cross-impact on stock markets: an empiricalanalysis.”
Journal of Statistical Mechanics: Theory and Experiment , 2017,023406.Black, Fischer (1972), “Capital market equilibrium with restricted borro-wing.”
The Journal of Business , 45, 444–55.Black, Fischer (1976), “Studies of stock price volatility changes.”
Proceedingsof the 1976 Meetings of the American Statistical Association, Business andEconomics Statistics Section , 177–181.Black, Fischer, Michael C. Jensen, and Myron Scholes (1972), “The capitalasset pricing model: Some empirical tests.”
Studies in the Theory of CapitalMarkets, Praeger Publishers Inc.
Blume, Marshall E (1971), “On the assessment of risk.”
Journal of Finance ,26, 1–10.Bollerslev, Tim, Robert Engle, and Jeffrey Wooldridge (1988), “A capitalasset pricing model with time-varying covariances.”
Journal of PoliticalEconomy , 96, 116–31.Bouchaud, Jean-Philippe, Andrew Matacz, and Marc Potters (2001),“Leverage effect in financial markets: The retarded volatility model.”
Phys. Rev. Lett., 87, 228701.
Bouchaud, Jean-Philippe and Marc Potters (2018),
The Oxford Handbook ofRandom Matrix Theory , chapter Financial applications of random matrixtheory: a short review. Oxford University Press.Brandt, Michael W., Pedro Santa-Clara, and Rossen Valkanov (2009),“Parametric portfolio policies: Exploiting characteristics in the cross-section of equity returns.”
Review of Financial Studies , 22, 3411–3447.Brown, Stephen J. (1989), “The number of factors in security returns.”
TheJournal of Finance , 44, 1247–1262.Bru, Marie-France (1991), “Wishart processes.”
Journal of TheoreticalProbability , 4, 725–751.Bun, Joel, Jean-Philippe Bouchaud, and Marc Potters (2016), “Cleaning cor-relation matrices.”
Risk magazine .Bussiere, Matthieu, Marie Hoerova, and Benjamin Klaus (2015),“Commonality in hedge fund returns: Driving factors and implications.”
Journal of Banking & Finance , 54, 266–280.Campbell, John and Ludger Hentschel (1992), “No news is good news *1:An asymmetric model of changing volatility in stock returns.”
Journal ofFinancial Economics , 31, 281–318.Carhart, Mark (1997), “On persistence in mutual fund performance.”
Journalof Finance , 52, 57–82.Cederburg, Scott and Michael S. O’Doherty (2015), “Asset-pricing anomaliesat the firm level.”
Journal of Econometrics , 186, 113–128.Chan, Louis K. C. and Josef Lakonishok (1992), “Robust measurement ofbeta risk.”
Journal of Financial and Quantitative Analysis, 27, 265–282.
Chen, An-Sing and Tai-Wei Zhang (2005), “The beta regime change risk premium.”
Available at SSRN .Choueifaty, Yves and Yves Coignard (2008), “Toward maximum diversifica-tion.”
The Journal of Portfolio Management , 35, 40–51.Christie, Andrew A. (1982), “The stochastic behavior of common stock va-riances: Value, leverage and interest rate effects.”
Journal of FinancialEconomics , 10, 407 – 432.Ciliberti, Stefano, Emmanuel Sérié, Guillaume Simon, Yves Lemperiere, andJean-Philippe Bouchaud (2017), “The ’size premium’ in equity markets:Where is the risk?”
Available at SSRN .Clarke, Roger, Harindra de Silva, and Steven Thorley (2013), “Risk parity,maximum diversification,and minimum variance: An analytic perspective.”
The Journal of Portfolio Management , 39, 39–53.Connor, Gregory (1995), “The three types of factor models: A comparison oftheir explanatory power.”
Financial Analysts Journal , 51, 42–46.Connor, Gregory and Robert Korajczyk (1993), “A test for the number offactors in an approximate factor model.”
Journal of Finance , 48, 1263–91.Cont, Rama and Jean-Philipe Bouchaud (2000), “Herd behavior and ag-gregate fluctuations in financial markets.”
Macroeconomic Dynamics , 4,170–196.Conway, Delores A and Marc R Reinganum (1988), “Stable factors in secu-rity returns: Identification using cross-validation.”
Journal of Business &Economic Statistics , 6, 1–15.Cox, John C, Jonathan Ingersoll, and Stephen Ross (1985), “A theory of theterm structure of interest rates.”
Econometrica, 53, 385–407.
Cuchiero, Christa, Damir Filipović, Eberhard Mayerhofer, and Josef Teichmann (2011), “Affine processes on positive semidefinite matrices.”
Ann. Appl. Probab. , 21, 397–463.Da Fonseca, José, Martino Grasselli, and Claudio Tebaldi (2007), “Optionpricing when correlations are stochastic: an analytical framework.”
Reviewof Derivatives Research , 10, 151–180.DeJong, Douglas V. and Daniel W. Collins (1985), “Explanations for theinstability of equity beta: Risk-free rate changes and leverage effects.”
Journal of Financial and Quantitative Analysis , 20, 73–94.DeMiguel, Victor, Martin-Utrera Alberto, Francisco J. Nogales, and RamanUppal (2017), “A portfolio perspective on the multitude of firm characte-ristics.”
CEPR Discussion Papers .Dhrymes, Phoebus J, Irwin Friend, and N Bulent Gultekin (1984), “A criticalreexamination of the empirical evidence on the arbitrage pricing theory.”
Journal of Finance , 39, 323–46.Dichtl, Hubert, Wolfgang Drobetz, Harald Lohre, Carsten Rother, andPatrick Vosskamp (2018), “Optimal timing and tilting of equity factors.”
Available at SSRN .Duan, Jin-Chuan (1995), “The garch option pricing model.”
MathematicalFinance , 5, 13–32.Engle, Robert (1982), “Autoregressive conditional heteroscedasticity with es-timates of the variance of united kingdom inflation.”
Econometrica, 50, 987–1007.
Engle, Robert (2002), “Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models.”
Journal of Business & Economic Statistics, 20, 339–50.
Engle, Robert (2016), “Dynamic conditional beta.”
Journal of FinancialEconometrics , 14, 643–667.Epps, Thomas W. (1979), “Comovements in stock prices in the very shortrun.”
Journal of the American Statistical Association , 74, 291–298.Eynard, Bertrand, Taro Kimura, and Sylvain Ribault (2015), “Random ma-trices.”
ARXIV:1510.04430 .Fabozzi, Frank J. and Jack Clark Francis (1978), “Beta as a random coeffi-cient.”
The Journal of Financial and Quantitative Analysis , 13, 101–116.Fama, Eugene F. (1965), “Random walks in stock market prices.”
FinancialAnalysts Journal , 21, 55–59.Fama, Eugene F. (1968), “Risk, return and equilibrium: Some clarifying com-ments.”
Journal of Finance , 23, 29–40.Fama, Eugene F. and Kenneth R. French (1992), “The cross-section of ex-pected stock returns.”
The Journal of Finance , 47, 427–465.Fama, Eugene F. and Kenneth R. French (1993), “Common risk factors in thereturns on stocks and bonds.”
Journal of Financial Economics , 33, 3–56.Fama, Eugene F. and Kenneth R. French (1997), “Industry costs of equity.”
Journal of Financial Economics , 43, 153 – 193.Fama, Eugene F. and Kenneth R. French (1998), “Value versus growth: Theinternational evidence.”
The Journal of Finance , 53, 1975–1999.Fama, Eugene F. and Kenneth R. French (2008), “Dissecting anomalies.”
Journal of Finance, 63, 1653–1678.
Fama, Eugene F. and Kenneth R. French (2012), “Size, value, and momentum in international stock returns.”
Journal of Financial Economics, 105, 457–472.
Fama, Eugene F. and Kenneth R. French (2015), “A five-factor asset pricing model.”
Journal of Financial Economics , 116, 1 – 22.Fama, Eugene F. and James D. MacBeth (1973), “Risk, return, and equili-brium: Empirical tests.”
Journal of Political Economy , 81, 607–36.Ferson, Wayne E. and Stephen R. Foerster (1994), “Finite sample propertiesof the generalized method of moments in tests of conditional asset pricingmodels.”
Journal of Financial Economics , 36, 29–55.Ferson, Wayne E. and Campbell R. Harvey (1999), “Conditioning variablesand the cross section of stock returns.”
The Journal of Finance , 54, 1325–1360.Francis, Jack Clark (1979), “Statistical analysis of risk surrogates for nysestocks.”
Journal of Financial and Quantitative Analysis , 14, 981–997.Frazzini, Andrea and Lasse Pedersen (2014), “Betting against beta.”
Journalof Financial Economics , 111, 1–25.Fu, Fangjian (2009), “Idiosyncratic risk and the cross-section of expectedstock returns.”
Journal of Financial Economics , 91, 24 – 37.Fung, William and David A Hsieh (1997), “Empirical characteristics of dy-namic trading strategies: The case of hedge funds.”
Review of FinancialStudies , 10, 275–302.Galai, Dan and Ronald W. Masulis (1976), “The option pricing model andthe risk factor of stock.”
Journal of Financial Economics , 3, 53 – 81.Golub, Gene H. (1973), “Some modified matrix eigenvalue problems.”
SIAM Review, 15, 318–334.
Gourieroux, Christian (2006), “Continuous time Wishart process for stochastic risk.”
Econometric Reviews , 25, 177–217.Goyal, Amit and Pedro Santa-Clara (2003), “Idiosyncratic risk matters!”
Journal of Finance , 58, 975–1008.Graham, B. and D.L.F. Dodd (1934),
Security Analysis . McGraw-HillEducation.Grinblatt, Mark and Tobias J. Moskowitz (2004), “Predicting stock pricemovements from past returns: the role of consistency and tax-loss selling.”
Journal of Financial Economics , 71, 541–579.Guedj, Olivier and Jean-Philippe Bouchaud (2005), “Experts’ earning fore-casts: Bias, herding and gossamer information.”
International Journal ofTheoretical and Applied Finance , 08, 933–946.Harvey, Campbell R. and Yan Liu (2018), “Lucky factors.”
Available atSSRN .Haugen, Robert A. and Nardin L. Baker (1991), “The efficient marketinefficiency of capitalization–weighted stock portfolios.”
The Journal ofPortfolio Management , 17, 35–40.Haugen, Robert A. and A. James Heins (1975), “Risk and the rate of returnon financial assets: Some old wine in new bottles.”
The Journal of Financialand Quantitative Analysis , 10, 775–784.Heston, Steven L (1993), “A closed-form solution for options with stochas-tic volatility with applications to bond and currency options.”
Review of Financial Studies, 6, 327–43.
Hodges, Philip, Ked Hogan, Justin R. Peterson, and Andrew Ang (2017), “Factor timing with cross-sectional and time-series predictors.”
TheJournal of Portfolio Management , 44, 30–43.Hong, Harrison and David Sraer (2016), “Speculative betas.”
Journal ofFinance , 71, 2095–2144.Hong, Harrison and Jeremy Stein (1999), “A unified theory of underreac-tion, momentum trading, and overreaction in asset markets.”
Journal ofFinance , 54, 2143–2184.Jagannathan, Ravi and Zhenyu Wang (1996), “The conditional capm and thecross-section of expected returns.”
The Journal of Finance , 51, 3–53.Jegadeesh, Narasimhan and Sheridan Titman (1993), “Returns to buyingwinners and selling losers: Implications for stock market efficiency.”
Journalof Finance , 48, 65–91.Jobson, J. D. and Bob Korkie (1980), “Estimation for markowitz efficientportfolios.”
Journal of the American Statistical Association , 75, 544–554.Jordan, Bradford D. and Timothy Brandon Riley (2013), “Dissecting the lowvolatility anomaly.”
Available at SSRN .Kac, Marc (1959),
Probability and Related Topics in Physical Sciences .Lectures in applied mathematics (American Mathematical Society) ; 1.A,American Mathematical Society.Khandani, Amir E. and Andrew Lo (2011), “What happened to the quantsin august 2007? evidence from factors and transactions data.”
Journal ofFinancial Markets , 14, 1–46.Kyle, Albert (1985), “Continuous auctions and insider trading.”
Econometrica, 53, 1315–35.
Lakonishok, Josef, Andrei Shleifer, and Robert W. Vishny (1994), “Contrarian investment, extrapolation, and risk.”
The Journal of Finance, 49, 1541–1578.
Laloux, Laurent, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters (1999), “Noise dressing of financial correlation matrices.”
Phys. Rev. Lett. ,83, 1467–1470.Ledoit, Olivier and Sandrine Péché (2011), “Eigenvectors of some largesample covariance matrix ensembles.”
Probability Theory and RelatedFields , 151, 233–264.Ledoit, Olivier and Michael Wolf (2003), “Improved estimation of the cova-riance matrix of stock returns with an application to portfolio selection.”
Journal of Empirical Finance , 10, 603–621.Ledoit, Olivier and Michael Wolf (2004), “Honey, i shrunk the sample cova-riance matrix.”
The Journal of Portfolio Management , 30, 110–119.Ledoit, Olivier and Michael Wolf (2012), “Nonlinear shrinkage estimation oflarge-dimensional covariance matrices.”
The Annals of Statistics , 40, 1024–1060.Lee, Wai (2017), “Factors timing factors.”
The Journal of PortfolioManagement , 43, 66–71.Lettau, Martin and Sydney Ludvigson (2001), “Resurrecting the (c)capm: Across-sectional test when risk premia are time-varying.”
Journal of PoliticalEconomy , 109, 1238–1287.Lewellen, Jonathan and Stefan Nagel (2006), “The conditional capm doesnot explain asset-pricing anomalies.”
Journal of Financial Economics, 82, 289–314.
Luedecke, B. P. (1984), “An empirical investigation of arbitrage and approximate k-factor structure on large asset markets.” Dissertation, Department of Economics, University of Wisconsin.
Lux, Thomas and Michele Marchesi (1999), “Scaling and criticality in a stochastic multi-agent model of a financial market.”
Nature , 397, 498.Maillard, Sébastien, Thierry Roncalli, and Jérôme Teïletche (2010), “Theproperties of equally weighted risk contribution portfolios.”
The Journalof Portfolio Management , 36, 60–70.Malkiel, Burton G. and Yexiao Xu (1997), “Risk and return revisited.”
TheJournal of Portfolio Management , 23, 9–14.Markowitz, Harry (1952), “Portfolio selection.”
Journal of Finance , 7, 77–91.McCulloch, Robert and Peter Rossi (1991), “A bayesian approach to testingthe arbitrage pricing theory.”
Journal of Econometrics , 49, 141–168.McLean, R. David and Jeffrey Pontiff (2016), “Does academic research des-troy stock return predictability?”
Journal of Finance , 71, 5–32.Meng, J. Ginger, Gang Hu, and Jushan Bai (2011), “Olive: A simple methodfor estimating betas when factors are measured with error.”
Journal ofFinancial Research , 34, 27–60.Michard, Quentin and Jean-Philippe Bouchaud (2005), “Theory of collectiveopinion shifts: from smooth trends to abrupt swings.”
Eur. Phys. J. B , 47,151–159.Michaud, Richard O. (1989), “The markowitz optimization enigma: Is ’opti-mized’ optimal?”
Financial Analysts Journal , 45, 31–42.Mitchell, Mark (2001), “Characteristics of risk and return in risk arbitrage.”
Journal of Finance, 56, 2135–2175.
Patton, Andrew (2009), “Are ‘market neutral’ hedge funds really market neutral?”
Review of Financial Studies , 22, 2295–2330.Plerou, Vasiliki, Parameswaran Gopikrishnan, Bernd Rosenow, LuísA. Nunes Amaral, Thomas Guhr, and H. Eugene Stanley (2002), “Randommatrix approach to cross correlations in financial data.”
Phys. Rev. E , 65,066126.Plerou, Vasiliki, Parameswaran Gopikrishnan, Bernd Rosenow, Luís A.Nunes Amaral, and H. Eugene Stanley (1999), “Universal and nonuniversalproperties of cross correlations in financial time series.”
Phys. Rev. Lett. ,83, 1471–1474.Possamai, Dylan and Pierre Gauthier (2011), “Prices expansion in the wishartmodel.”
The IUP Journal of Computational Mathematics , 4, 44 – 71.Potters, Marc, Jean-Philippe Bouchaud, and Laurent Laloux (2005),“Financial applications of random matrix theory: Old laces and newpieces.”
ACTA PHYSICA POLONICA B , 36, 2767–2782.Reinganum, Marc R. (1981), “Misspecification of capital asset pricing:Empirical anomalies based on earnings’ yields and market values.”
Journalof Financial Economics , 9, 19–46.Roll, Richard (1977), “A critique of the asset pricing theory’s tests parti: On past and potential testability of the theory.”
Journal of FinancialEconomics , 4, 129–176.Roll, Richard and Stephen Ross (1984), “A critical reexamination of the empi-rical evidence on the arbitrage pricing theory: A reply.”
Journal of Finance, 39, 347–50.
Roll, Richard and Stephen A. Ross (1980), “An empirical investigation of the arbitrage pricing theory.”
The Journal of Finance, 35, 1073–1103.
Ross, Stephen (1976), “The arbitrage theory of capital asset pricing.”
Journalof Economic Theory , 13, 341–360.Shanken, Jay (2015), “On the Estimation of Beta-Pricing Models.”
TheReview of Financial Studies , 5, 1–33.Sharpe, William (1963), “A simplified model for portfolio analysis.”
Management Science , 9, 277–293.Sharpe, William (1964), “Capital asset prices: A theory of market equilibriumunder conditions of risk.”
Journal of Finance , 19, 425–442.Stein, Jeremy (2009), “Presidential address: Sophisticated investors and mar-ket efficiency.”
Journal of Finance , 64, 1517–1548.Tofallis, Chris (2008), “Investment volatility: A critique of standard betaestimation and a simple way forward.”
European Journal of OperationalResearch , 187, 1358 – 1367.Trzcinka, Charles (1986), “On the number of factors in the arbitrage pricingmodel.”
The Journal of Finance , 41, 347–368.Valeyre, Sebastien (2012), “Herding behavior also applies to fundamentals.”
The Hedge Fund Journal , 91.Valeyre, Sebastien (2018), “Time scale effect on correlation at larger horizon.”Valeyre, Sebastien, Denis Grebenkov, and Sofiane Aboura (2019a),“Emergence of correlations between securities at short time scales.”
PhysicaA: Statistical Mechanics and its Applications , 526, 121026.Valeyre, Sebastien, Denis Grebenkov, and Sofiane Aboura (2019b), “The reac-tive beta model.”
Journal of Financial Research , 42, 71–113.278aleyre, Sebastien, Denis Grebenkov, Sofiane Aboura, and Qian Liu (2013),“The reactive volatility model.”
Quantitative Finance , 13, 1697–1706.Valeyre, Sebastien, Denis Grebenkov, Qian Liu, Sofiane Aboura, and FrancoisBonnin (2016), “Should employers pay better their employees? an assetpricing approach.”
Available at SSRN .Valeyre, Sebastien and Stanislav Kuperstein (2018), “Model of diffusion ofthe correlation between securities.”Valeyre, Sebastien, Stanislav Kuperstein, Denis Grebenkov, and SofianeAboura (2018), “The market neutral fundamental maximum variance port-folios.”Wang, Duan, Boris Podobnik, Davor Horvatić, and H. Eugene Stanley (2011),“Quantifying and modeling long-range cross correlations in multiple timeseries with applications to world stock indices.”
Phys. Rev. E , 83, 046121.Wyart, Matthieu and Jean-Philippe Bouchaud (2007), “Self-referential be-haviour, overreaction and conventions in financial markets.”
Journal ofEconomic Behavior & Organization , 63, 1–24.279 itre : Modélisation fine de la matrice de covariance/corrélation des actions
Title : Refined model of the covariance/correlation matrix between securities
Summary : A new methodology is introduced to clean the correlation matrix of single-stock returns, based on a constrained principal component analysis that exploits financial data. Portfolios, named “Fundamental Maximum Variance Portfolios”, are constructed to capture in an optimal way the risk style defined by a financial criterion (“Book”, “Capitalization”, etc.). The constrained eigenvectors of the correlation matrix, which are linear combinations of these portfolios, are then analyzed. Thanks to this methodology, several stylized facts of the matrix are identified: i) the increase of the first eigenvalues with the time scale, from 1 minute to several months, seems to follow the same law, with two regimes, for all the significant eigenvalues; ii) a “universal” law seems to govern the composition of all the “Maximum Variance” portfolios: according to that law, the optimal weights are directly proportional to the ranking based on the studied financial criterion; iii) the volatility of the volatility of the “Maximum Variance” portfolios, which are not orthogonal, is enough to explain a large part of the diffusion of the correlation matrix; iv) the leverage effect (increase of the first eigenvalue when the stock market declines) occurs only for the first mode and does not generalize to the other risk factors. The leverage effect on the betas, the sensitivities of stocks to the “market mode”, makes the weights of the first eigenvector time-varying.
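The ranking-proportional weighting described in point ii) can be illustrated with a minimal sketch. This is not the thesis's actual estimation code: the `book_values` data are synthetic, and the centering and gross-exposure normalization are assumptions chosen only to show the idea of weights proportional to the cross-sectional ranking on a financial criterion, with a market-neutral (zero-sum) constraint.

```python
import numpy as np

# Synthetic financial criterion for 10 hypothetical stocks
# (illustrative stand-in for a "Book" or "Capitalization" score).
rng = np.random.default_rng(0)
book_values = rng.lognormal(mean=0.0, sigma=1.0, size=10)

# Cross-sectional ranking on the criterion: 0 = smallest, n-1 = largest.
ranks = book_values.argsort().argsort().astype(float)

# Weights proportional to the ranking, centered so they sum to zero
# (market neutral), then scaled to unit gross exposure (assumed convention).
weights = ranks - ranks.mean()
weights /= np.abs(weights).sum()
```

Under this construction the portfolio is long the stocks ranked highest on the criterion and short the lowest-ranked ones, with weights that vary linearly in the rank rather than in the raw criterion value.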
Key words : correlation, filter, constrained diagonalization, multifactor model, optimal portfolios, portfolio management, diffusion
Discipline : Economics / Portfolio Management

Centre d’Économie de l’Université Paris Nord
U.F.R. Sciences Économiques et Gestion
École Doctorale ERASME
Université Paris 13 – Campus Villetaneuse
99 avenue Jean-Baptiste Clément
93430 Villetaneuse