Evidence of Crowding on Russell 3000 Reconstitution Events

Abstract

We develop a methodology which replicates with great accuracy the FTSE Russell index reconstitutions, including the quarterly rebalancings due to new initial public offerings (IPOs). While using only data available in the CRSP US Stock database for our index reconstruction, we demonstrate the accuracy of this methodology by comparing it to the original Russell US indexes for the time period from 1989 to 2019. A Python package that generates the replicated indexes is also provided [31]. As an application, we use our index reconstruction protocol to compute the permanent and temporary price impact on the Russell 3000 annual additions and deletions, and on the quarterly additions of new IPOs. We find that the index portfolios following the Russell 3000 index and rebalanced on an annual basis are overall more crowded than those following the index on a quarterly basis. This phenomenon implies that transaction costs of indexing strategies could be significantly reduced by buying new IPO additions in proximity to quarterly rebalance dates.


Alessandro Micheli and Eyal Neuman
Department of Mathematics, Imperial College London
June 16, 2020


Keywords: crowding, indexing strategies, price impact, Russell Index.

FTSE Russell is, quoting the company web-page [24], a “global provider of benchmarks, analytics, and data solutions with multi-asset capabilities”. The company maintains a wide range of indexes varying by geographic region, weighting procedure and asset class.

In US markets, FTSE Russell's most prominent products are the Russell US indexes: the Russell 1000, 2000, 3000 and 3000E indexes track rosters of US companies across different market capitalizations. Part of the strength of the Russell US indexes resides in their modularity. As shown in Table 1, each index is composed according to different investment styles, therefore offering an extended and meticulous coverage of the US equity market. For example, the Russell 3000 measures the performance of the 3,000 largest public companies in the US by total market capitalization and represents approximately 98 percent of the American public equity market. On the other hand, the Russell 1000 Defensive Index is much more specialized, as it includes those Russell 1000 Index companies that are more stable and less sensitive to economic cycles, credit cycles and market volatility.

Such indexes are often used by portfolio managers as benchmarks for US equity market performance across different market segments. It does not come as a surprise, then, that the Russell US indexes are the go-to equity universe for a wide body of academic literature, including portfolio management research [19, 4, 5, 15, 16] as well as market microstructure, e.g. [7, 37, 14, 39, 8, 6, 9].

The rosters of securities in the Russell U.S. indexes have also received attention for the presence of the so-called “index effects”.

It has been empirically observed that the securities added to equity indexes receive positive returns concurrently with their index additions and shortly thereafter. The main indexes on which such effects are observed are the S&P500 and the Russell U.S. indexes, with many studies, such as [30, 13, 17, 15, 33], providing evidence in support of the existence of the aforementioned abnormal returns.

As for the Russell U.S. indexes, Madhavan [30] first analyzed the presence of statistically significant abnormal returns attributable to the annual reconstitution of the Russell 2000 and Russell 3000 indexes. Moreover, Madhavan explained the abnormal returns by microstructure effects such as price pressure and changes in liquidity. The mechanisms generating these abnormal returns were further investigated and tested by Chen in [17]. Cai and Houge [13] compared the performance of a buy-and-hold strategy of the Russell 2000 index to the returns of a portfolio following the annually rebalanced Russell 2000 index. The former was shown to be significantly more profitable over a time scale of 5 years. More recently, Onayev and Zdorovtsov [32] have found evidence of strategic predatory trading behaviour around the annual reconstitution, whereby closing prices of companies are manipulated in order to influence their index membership.

One of the main features distinguishing the Russell U.S. indexes from all other US equity indexes is their rebalance procedure. In general, the rebalance procedures of equity indexes are not necessarily publicly disclosed and sometimes present some degree of arbitrariness. For example, as discussed in [13, 33], Standard & Poor's maintains a proprietary selection process used to discern which stocks will belong to the new issue of the index and makes adjustments whenever it considers it to be necessary. Nonetheless, even though such procedures remain undisclosed, the S&P500 historical constituent securities are available to researchers via the WRDS database maintained by the Wharton School of the University of Pennsylvania.

Russell U.S. Indexes

| Broad market | Large cap | Small cap |
|---|---|---|
| Russell 3000E Index | Russell 1000 Index | Russell 2000 Index |
| Russell 3000E Value Index | Russell 1000 Value Index | Russell 2000 Value Index |
| Russell 3000E Growth Index | Russell 1000 Growth Index | Russell 2000 Growth Index |
| Russell 3000 Index | Russell 1000 Defensive Index | Russell 2000 Defensive Index |
| Russell 3000 Value Index | Russell 1000 Dynamic Index | Russell 2000 Dynamic Index |
| Russell 3000 Growth Index | Russell 1000 Growth-Defensive Index | Russell 2000 Growth-Defensive Index |
| Russell 3000 Defensive Index | Russell 1000 Growth-Dynamic Index | Russell 2000 Growth-Dynamic Index |
| Russell 3000 Dynamic Index | Russell 1000 Value-Defensive Index | Russell 2000 Value-Defensive Index |
| Russell 3000 Growth-Defensive Index | Russell 1000 Value-Dynamic Index | Russell 2000 Value-Dynamic Index |
| Russell 3000 Growth-Dynamic Index | | |
| Russell 3000 Value-Defensive Index | | |
| Russell 3000 Value-Dynamic Index | | |

Table 1: Russell US indexes by investment style and market sector. Table originally published in Section “Construction and Methodology” of [26].

On the other hand, FTSE Russell implements a publicly available, fully deterministic rebalance algorithm for its Russell indexes, but prefers not to publicly disclose their historical index compositions. Bloomberg L.P. terminals offer the list of the companies in the indexes, but neither the constituent securities nor the index weights are available.

The FTSE index compositions are available for purchase by financial institutions and funds. On the academic side, the WRDS database has recently started providing a Russell index historical dataset for 21 indexes, for a substantial annual fee. However, this dataset provides only information on index weights and companies' contributions to returns. It falls short of more refined information such as quarterly and annual ranking and rebalance days, and historical lists of securities for companies which are traded across different classes of shares. See Section 2 for a detailed description of these features and their importance to the index reconstitution. Since these features play a crucial role in index reconstitution, tracking them in a consistent framework is important for academic research (see e.g. the analysis in Section 5). There seems to be a gap in Russell data for research purposes, in the sense that there is no recognised source from which academic researchers and financial institutions' internal researchers can obtain detailed Russell US index data. Therefore, it is very likely that the notion and composition used to approximate the Russell US indexes, e.g.
the Russell 3000 or Russell 2000 indexes, could differ across academic papers.

Agreeing upon a common “definition” of what the Russell indexes are could benefit the financial research community as a whole and is one of the main goals of this work. On very elementary scientific grounds, sharing the initial data, common to all financial data analyses, allows for a higher degree of reproducibility of the results. In the first part of this paper (Sections 2–4) we develop a methodology which is based only on data available in the CRSP database of the Wharton Research Data Services (see Section 3 for more details on the database). This methodology allows us to replicate with great accuracy the Russell 1000, 2000 and 3000 index weights and returns, and to track new additions and deletions. We demonstrate the accuracy of this methodology by comparing our suggested reconstitution procedure with the original Russell US indexes for the time period June 1989 to June 2019 (see Section 4). A Python package named pyndex that generates the indexes according to our methodology is also provided in [31].

The impact of sharing such initial data might vary across studies, ranging from marginal to substantial; nonetheless, we remark that it could only be beneficial and would bring the research community a step closer to settling disputes on results based on the quality of the data being analysed. Similarly to the natural sciences, we remark that the validity of quantitative claims is settled only by referring to the observations of the phenomenon, which, by nature, strongly depend on the data analysed.

As a first application of our index reconstruction methodology, we study crowding on indexing strategies around Russell 3000 reconstitution events, which starting from 2004 occur every quarter.
Before we describe our analysis on this topic, we survey some existing literature on crowding in financial markets.

Over the last 20 years, the phenomenon of crowding in financial markets has increasingly gained attention both from academics and from financial institutions. It is the subject of many research works studying both theoretical and empirical aspects, including [18, 37, 9, 1, 11, 12, 29].

Crowding is often considered to be an explanation for sub-par performance of investments as well as for the development of systemic risk in financial markets. The presence of largely overlapping portfolios comes at the expense of portfolio managers, also in terms of transaction costs, as similar positions usually lead to similar trades.

Cont and Bouchaud [18] proposed a simple mathematical model in which the communication structure between agents gives rise to heavy-tailed distributions for stock returns. This established a theoretical connection between crowding and stock market shortfall. The aforementioned portfolio overlap was shown to be a considerable factor in the August 2007 Quant Meltdown. Using simulated returns of overlapping equity portfolios, Khandani and Lo [29] showed that the combined effects of portfolio deleveraging followed by a temporary withdrawal of market-making risk capital were among the main drivers of the 2007 Quant Meltdown. Caccioli et al. [11, 12] developed a mathematical model for a network of different banks holding overlapping portfolios. They investigated the circumstances under which systemic instabilities may occur as a result of various parameters, such as market crowding and market impact. Recently, Volpati et al. [37] measured significant levels of crowding in U.S. equity markets for Momentum signals as well as for Fama-French factor signals, albeit with smaller significance.

As already mentioned, we apply our index reconstruction methodology in order to measure the crowding effect on Russell indexes around reconstitution events.
It was reported by Madhavan [30] and others that, around annual index reconstitution events, there are significant abnormal returns on stocks which are new additions to or deletions from the index. These returns are caused by portfolio trading strategies which anticipate the change in the stock price for new additions or deletions. The returns of these stocks are typically decomposed into two parts. The first is called the temporary price impact, and it describes the returns that revert within one month from the index rebalance day. The second, the permanent price impact, captures sustained stock returns, which accumulate within two months from the index reconstitution announcement (see more details in Section 5). It follows that the reconstitution events of the Russell 3000 index serve as prominent examples for studying crowding effects of trading strategies.

Starting from 2004, the Russell indexes received quarterly additions to take into account the changes brought in the market by newly listed securities, that is, IPOs which took place between annual rebalancings (see additional details in Section 2). The practice of quarterly updates to the indexes has continued ever since and is still in use. To our knowledge, none of the papers that studied the Russell index reconstitution effect has dealt with these quarterly additions. The reconstruction methodology developed in this paper allows us to perform a more refined analysis of the crowding phenomenon on the Russell 3000 index around reconstitution dates. This is the second objective of this paper.

In Section 5 we compute the permanent and temporary price impact on the Russell 3000 stock additions and deletions, using the annual index portfolios generated with our protocol. We also track the quarterly additions to the Russell 3000 index.
We find that the price impacts of the aforementioned quarterly additions are overall compatible with the hypothesis that the majority of market participants track the Russell 3000 index on an annual basis rather than on a quarterly basis. Such findings are consistent with the belief that the portfolio strategies following the Russell 3000 index rebalance on an annual basis are more crowded than those following the Russell 3000 index rebalance on a quarterly basis.

This paper is structured as follows. In Section 2 we describe the precise methodology of the FTSE Russell index reconstitution, which includes the quarterly rebalancings due to new initial public offerings (IPOs). In Section 3 we describe the data that we use in order to reconstruct the indexes in this paper. Section 4 is dedicated to our methodology, which approximates the Russell US indexes, and to our results on index replication. In Section 5 we determine the temporary and permanent price impact generated by the annual index additions and examine the existence of crowded trades around the annual Russell 3000 reconstitution. In Section 6 we present the conclusions of this paper.

In this section we describe the main features of the original FTSE Russell index reconstitution methodology.

The FTSE Russell US 1000, 2000 and 3000 are equity capitalisation-weighted indexes that currently follow an annual rebalance procedure, which was first adopted in June 1989. As further discussed in [13], the indexes followed a quarterly rebalance schedule from 1979 to 1986 and a semi-annual one from 1987 to 1989. In the 2004 rebalance calendar, the Russell indexes received quarterly additions to take into account the changes brought in the market by newly listed securities, that is, IPOs which took place between annual rebalancings. The practice of quarterly updates to the indexes has continued ever since and is still in use.

We remark that the newly issued securities added at each quarter do not replace any other company already in the indexes; in fact, quoting the 2004 press release [10] of Russell Investments:

“As IPOs are added to Russell indexes each quarter, Russell will not delete existing index members to make room for them, but will continue to reconstitute the indexes fully each year at the end of the second quarter.”

Footnote: Russell Investments controlled the Russell US indexes until its index division was bought by LSE Group in 2015 and subsequently renamed FTSE Russell.

Inclusion in the Russell indexes is established systematically via a set of rules. On the rank day, which takes place in May, all U.S.-domiciled companies with stock prices greater than $1.00 are ranked according to their market capitalisation. The total market capitalisation of a company is computed by determining the shares of common stock, non-restricted exchangeable shares and partnership units/membership interests, while excluding any other form of shares, such as convertible preferred stocks, foreign securities and American Depositary Receipts (ADR).
Explicitly, as discussed in [26], exchangeable shares are shares which may be exchanged, on a one-for-one basis, at the owner's option at any time, while membership interests or partnership units embody an economic interest in a limited liability company or limited partnership.

For a company which is traded across different classes of shares, e.g. Berkshire Hathaway, FTSE Russell first determines its so-called pricing vehicle: the share class with the highest two-year trading volume as of the corresponding rank day in May. The total market capitalisation is then computed by multiplying the cumulative sum of shares across all classes by the close price of the pricing vehicle on the rank day. Only companies with a total market capitalisation higher than 30 million U.S. dollars are included in the ranking of the Russell US indexes.

Once the ranking has been established, the 3000 companies with the highest market capitalisation fall in the Russell 3000 index. The top 1000 companies in the Russell 3000 index in turn constitute the Russell 1000 index, while the bottom 2000 determine the Russell 2000 index. The top 4000 companies in the ranking with total market capitalisation higher than 30 million U.S. dollars, or all the available securities in case they are fewer than 4000, constitute the Russell 3000E.

The weight corresponding to each security admitted to the index is computed as follows. The outstanding shares of a security are adjusted to only include the number of available shares which can be traded by the public, the so-called “free float”. In fact, it is possible that some of the shares, such as those held by governments or other third parties, might not be available for trading. The adjustment is done based on information contained in governmental filings, such as those submitted to the Securities and Exchange Commission (SEC). Further information about the float-adjustment procedure for the rebalance year 2020 can be found in Section “Methodology Enhancements” of [27].
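The ranking and membership rules described above can be illustrated with a minimal sketch. The names `Company` and `assign_membership`, and the size parameters, are hypothetical and not part of the FTSE Russell methodology or of the pyndex package; applying the $1.00 price filter to the pricing vehicle's close is also an assumption made for illustration.

```python
from dataclasses import dataclass

MIN_MARKET_CAP = 30e6  # eligibility floor in U.S. dollars, as stated in the text
MIN_PRICE = 1.00       # minimum share price on rank day, as stated in the text


@dataclass
class Company:
    name: str
    total_shares: float   # eligible shares summed across all share classes
    vehicle_price: float  # rank-day close of the pricing vehicle (assumed input)

    @property
    def market_cap(self) -> float:
        # Total market cap = cumulative shares x pricing-vehicle close.
        return self.total_shares * self.vehicle_price


def assign_membership(companies, n_large=1000, n_broad=3000, n_extended=4000):
    """Rank eligible companies by market cap and split them into indexes."""
    eligible = [c for c in companies
                if c.vehicle_price > MIN_PRICE and c.market_cap > MIN_MARKET_CAP]
    ranked = sorted(eligible, key=lambda c: c.market_cap, reverse=True)
    broad = ranked[:n_broad]
    return {
        "Russell 3000E": [c.name for c in ranked[:n_extended]],
        "Russell 3000": [c.name for c in broad],
        "Russell 1000": [c.name for c in broad[:n_large]],
        "Russell 2000": [c.name for c in broad[n_large:]],
    }
```

The size parameters default to the index sizes quoted in the text and are exposed only so that the function can be exercised on a small toy universe.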
A market capitalization computed via the free float shares is called float-adjusted. Stocks in the Russell US indexes are weighted by their float-adjusted market capitalization, that is, the free-float shares times the closing price of the corresponding pricing vehicle.

The rebalance day is scheduled to be one month later than the ranking day, coinciding therefore with the end of June or beginning of July. On this day the new issues of the indexes officially replace the previous ones in the stock market. Minor adjustments to the indexes are made in the period between the ranking day and the rebalance day, for example in the case of mergers and spin-offs of companies. Since the FTSE Russell acquisition of the Russell U.S. indexes, which took place in 2015, the exact reconstitution calendar has been published on the FTSE Russell webpage. In order to reliably retrieve the reconstitution calendar prior to the FTSE Russell acquisition one has to consider the research literature. Specifically, in Table 2 we gather the rank days and rebalance days as described in Section “Index Construction and Sample Selection” of [13], Section 2 of [17] and Section 3.2 of [30]. It is documented on the FTSE Russell webpage [25] that, starting from 2017, the Russell U.S. indexes rank day has shifted towards the first half of May, in agreement with what is also observed for the year 2020 in Table 2. The academic sources, which are dated before 2017, unanimously agree on the rank day coinciding with the 31st of May.

Academic Sources and Annual Reconstitution Calendars

| Source | Rank Day | Rebalance Day | Year |
|---|---|---|---|
| FTSE Russell [27] | May 8 | June 26 | 2020 |
| Cai & Houge [13] | May 31 | June 30 | 2008 |
| Madhavan [30] | May 31 | July 1 | 2001 |
| Chen [17] | May 31 | June 30 | 2006 |

Table 2: A comparison of the Russell indexes reconstitution calendar across different sources.

In a similar fashion to the annual rebalance schedule, the ranking for inclusion of IPOs takes place at the end of Q3, Q4 and Q1. Approximately one month after each quarterly ranking date the index is extended with the new eligible IPOs, as can be seen from the 2019 quarterly rebalance calendar in Table 3. As discussed in Section “Defining Membership by size” of [26], the quarterly rebalance days are taken to be the third Fridays of September, December and March, and the corresponding rank days are set to be 5 weeks before each quarterly rebalance day.

Russell U.S. Quarterly Rebalance Calendar 2019

| Quarterly additions | 2019-Q3 Additions | 2019-Q4 Additions | 2020-Q1 Additions |
|---|---|---|---|
| Initial offering period | IPOs which initially price/trade between May 13 and Aug 16 | IPOs which initially price/trade between Aug 17 and Nov 15 | IPOs which initially price/trade between Nov 16 and Feb 14 |
| Rank date | 16 Aug 2019 | 15 Nov 2019 | 14 Feb 2020 |
| Rebalance date | 20 Sep 2019 | 20 Dec 2019 | 20 Mar 2020 |

Table 3: Quarterly IPO calendar for the 2019 Russell rebalance schedule.

The eligibility of the IPOs is established in two ways:

1. If the new issue released in the IPO belongs to a company which is already an index constituent, the following criterion is considered. FTSE Russell determines the value associated with the IPO by multiplying the number of shares released in the IPO by the price of the pricing vehicle of the company releasing the issue. If the IPO's value is larger than the market capitalisation of the company sitting at the bottom of the Russell 3000E index, the security released in the IPO is added to the index. The market capitalisation of the company at the bottom of the Russell 3000E index, before being used in the comparison with the IPO's value, is suitably adjusted to take into account the price variations of the stocks which have taken place since the annual rebalance day. Note that the index membership assigned to the new issue will be the same as that of the pricing vehicle. However, the new issue is added to the index as a separate entity; therefore, it does not contribute to the total market capitalization of its company.

2. If the new issue belongs to a company which is not in the index at the time of the IPO, then the market capitalisation of the IPO is established by multiplying the number of shares released by their price on the quarterly IPO ranking day. If such market capitalisation falls within any of the capitalisation breakpoints established at the annual ranking day, then the company is added to one or more of the indexes accordingly.

At every ranking day, either annual or quarterly, the index weights are recalculated based on the current capitalisation of the index constituents. The current annual cycle of the indexes is summarised in Fig. 1. Note that once a company is delisted from the market, the corresponding securities are not traded anymore. This implies that if any such company is part of any Russell US index, then the number of actively traded securities in the index might decrease during the course of the year. Nonetheless, FTSE Russell does not alter the current composition of the index, and therefore any stock in the index which is delisted is not replaced.

Finally, as discussed in Section “Long-Run Impact of Additions and Deletions” of [13], the index returns can be found as a weighted average of the daily returns of the stocks belonging to the index, under the assumption of dividend reinvestment.
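As an illustration of the second eligibility rule (a new issue from a company not yet in the index), the following minimal sketch checks an IPO's market capitalisation against breakpoints fixed at the annual ranking day. The function name `ipo_index_membership`, the layout of the `breakpoints` mapping and the treatment of values exactly on a boundary are assumptions made for this example; the breakpoint figures in the usage note are invented and are not Russell data.

```python
def ipo_index_membership(shares_released, rank_day_price, breakpoints):
    """Return the indexes that a non-constituent IPO would join.

    breakpoints maps an index name to (lower, upper) market-cap bounds
    set at the most recent annual ranking day; upper=None means the
    index has no upper bound. Boundary handling is an assumption.
    """
    # Market cap of the IPO: shares released x price on the quarterly rank day.
    ipo_cap = shares_released * rank_day_price
    joined = []
    for index_name, (lower, upper) in breakpoints.items():
        if ipo_cap >= lower and (upper is None or ipo_cap < upper):
            joined.append(index_name)
    return joined


# Purely illustrative breakpoints in U.S. dollars (hypothetical values).
example_breakpoints = {
    "Russell 1000": (2.5e9, None),
    "Russell 2000": (1.5e8, 2.5e9),
    "Russell 3000": (1.5e8, None),
}
```

With these hypothetical breakpoints, an IPO of 10 million shares priced at $50 has a capitalisation of $500 million and would join the Russell 2000 and Russell 3000, but not the Russell 1000.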
Footnote: The capitalisation breakpoints are determined by the market capitalisation on the annual ranking day of the lowest ranking company in the Russell 1000 and 3000 index. For the Russell 2000 index they are similarly determined by the highest and lowest market capitalisation.

[Figure 1 timeline, January to December: Annual Ranking, Minor Adjustments, Annual Rebalance; Q1, Q3 and Q4 IPO Ranking; Q1, Q3 and Q4 IPO Additions]

Figure 1: Russell US indexes annual reconstitution timeline starting from June 2004. The timeline for the years 1989-2004 is identical apart from the quarterly IPO additions, i.e. the blue circles.

For the index reconstitution we will adopt the data available in the Wharton Research Data Services (WRDS) database, a research platform provided by the Wharton School of the University of Pennsylvania. Specifically, we will limit ourselves to the financial data collected in the Center for Research in Security Prices (CRSP) U.S. Stock database, which offers highly accurate information for the U.S. stock market. As further discussed in Section “General Description: Coverage” of [35], this database contains end-of-day and month-end prices for:

• NYSE, starting from December 31, 1925,
• NYSE MKT, starting from July 2, 1962,
• NASDAQ, starting from December 14, 1972,
• Arca Exchanges, starting from March 8, 2006.

Moreover, the securities listed in this database are only equity securities of U.S. companies or of international companies which are traded in any of the aforementioned stock markets. The CRSP database contains the necessary information about the financial securities to be used in our analysis, such as prices, quote data and shares outstanding, as well as information about corporate actions, including IPOs. We refer the reader to Appendix A for further technical details regarding the CRSP databases and their content.

We now turn to the main features of our analysis, which reconstructs the Russell 1000, 2000 and 3000 indexes to a very high degree of accuracy using CRSP datasets. Our analysis will not be free from approximations to the original Russell US index reconstitution, whose methodology currently runs to more than 40 pages. Due to the restricted breadth of data we consider, which as mentioned in Section 3 is confined to the CRSP U.S. stock financial data, our reconstitution methodology departs in multiple ways from the one of Section 2. We will consider the time window starting from July 1989, the first year in which the annual rebalance schedule was applied, and terminating in June 2019.

As already discussed in Section 2, the exact annual reconstitution calendar of the Russell U.S. indexes is not available to the public for the entire time period we consider here. Therefore, we take the annual rank day to be the 31st of May and the annual rebalance day to be the last Friday of June, as supported by the academic sources cited in Table 2. Following Table 3 and the methodology in [26], we take the Q3, Q4 and Q1 rebalance days to fall on the third Friday of September, December and March respectively, and the corresponding rank days to be 5 weeks earlier. If any rank day, be it annual or quarterly, falls on a U.S. non-trading day, then we move it to the preceding trading day. Instead, a rebalance day, be it annual or quarterly, which falls on a U.S. non-trading day is shifted to the following trading day. We do so in order to avoid any look-ahead bias.
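The calendar conventions just described can be sketched with the Python standard library. The function names are hypothetical and are not taken from the pyndex package; the `holidays` argument is an assumed input (a set of exchange holidays), and by default only weekends are treated as non-trading days.

```python
from datetime import date, timedelta


def third_friday(year, month):
    """Third Friday of the given month (weekday(): Monday=0, Friday=4)."""
    first = date(year, month, 1)
    days_to_first_friday = (4 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_friday + 14)


def is_trading_day(d, holidays=frozenset()):
    return d.weekday() < 5 and d not in holidays


def preceding_trading_day(d, holidays=frozenset()):
    """Move a rank day backwards until it is a trading day."""
    while not is_trading_day(d, holidays):
        d -= timedelta(days=1)
    return d


def following_trading_day(d, holidays=frozenset()):
    """Move a rebalance day forwards until it is a trading day."""
    while not is_trading_day(d, holidays):
        d += timedelta(days=1)
    return d


def quarterly_calendar(year, month, holidays=frozenset()):
    """(rank day, rebalance day) for a quarterly IPO addition."""
    rebalance = third_friday(year, month)
    rank = rebalance - timedelta(weeks=5)
    return (preceding_trading_day(rank, holidays),
            following_trading_day(rebalance, holidays))


def annual_calendar(year, holidays=frozenset()):
    """(rank day, rebalance day): 31 May and the last Friday of June."""
    rank = date(year, 5, 31)
    june_end = date(year, 6, 30)
    last_friday = june_end - timedelta(days=(june_end.weekday() - 4) % 7)
    return (preceding_trading_day(rank, holidays),
            following_trading_day(last_friday, holidays))
```

For September 2019 the sketch gives a rank day of 16 August 2019 and a rebalance day of 20 September 2019, in agreement with Table 3.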
Such a choice of schedule may deviate from the real one, but given that the rank and rebalance days often take place at the end of May and around the end of June or the beginning of July respectively, we expect the deviation to be marginal and not to have any measurable effect on our final result.

Originally, as explained at length in Section 2, the pricing vehicle for each company is identified and then used to determine the market value of such company on the ranking day. Such a procedure requires extra work for companies with more than one share class: we would need to identify the pricing vehicle using the two-year trading volume of each share class. Figure 2 of [13] shows that it is possible to replicate, for the time period 1979-2004, to very high statistical accuracy the cumulative returns of the Russell 2000 index. This is done by computing the market capitalization of every company without determining its pricing vehicle, that is, by multiplying the total shares outstanding of each security by the corresponding share price. This motivates us to deviate from the original methodology and to approximate the market capitalization of each company as in [13], for all the Russell U.S. indexes and for the entire time period 1989-2019. Moreover, we remark that such market capitalization would differ from the original one only for companies which are traded across two or more share classes, and not for all the index constituents.

CRSP does not contain any information regarding cross-ownership or privately held shares. Such information is necessary in order to adjust for the free float, i.e. the fraction of shares which can be traded by the public.
Hence, instead of computing the weights of the stocks admitted to the index using the float-adjusted market capitalization, as in the original methodology, we use the same market capitalization that was used to establish the index ranking.

Starting from the reconstitution calendar of May 2004, we introduce quarterly ranking days and rebalance days in order to update our index with the IPOs taking place between rank days. Note that this is one of the main differences from previous index reconstruction papers such as [13]. As already discussed in Section 2, the way in which the original methodology considers adding newly issued securities to the index is two-fold, depending on whether they belong to a company listed in the index or not. Our methodology deviates from the original as follows. We add only the newly issued securities belonging to companies not listed in the index. We require each security to satisfy the standard eligibility requirements for admission to the index and its IPO to have taken place in the 3 months preceding the quarterly rank day. Once a new issue satisfies the eligibility requirements, it can be added to one or more Russell U.S. indexes only if its total market capitalization falls within the market capitalization breakpoints established during the most recent annual ranking day.

Finally, in accordance with the original FTSE Russell methodology, any company in the index which is deleted between rebalance days is never replaced.

Now we are ready to present our main results regarding index replication. We first concentrate on the results of index replication between 1989-2004, when the Russell indexes were rebalanced annually, without any quarterly IPO additions.
Then we focus on more recent results of index replication between 2004-2019, when quarterly rebalancings including company IPOs were introduced.

Cai and Houge [13] retrieved the roster of companies in the Russell 1000 and 2000 indexes for the time period 1979 to 2004 directly from Frank Russell Company. Figure 1 in [13] displays the total number of Russell 2000 membership changes for each annual rank date alongside the number of new issues, i.e. IPOs and spin-offs, picked up by the index each year. In Fig. 2 we also compute the annual number of constituent changes to the Russell 2000 index for each annual reconstitution. Specifically, the “Total Index Additions” bar at year $t$ counts the number of companies added to the Russell 2000 index during the year $t$ annual rebalance which were not in the previous release of the index. The “New Issues” bar at year $t$ quantifies the companies added to the Russell 2000 index during the year $t$ annual rebalance whose IPO took place between May of year $t-1$ and May of year $t$. CRSP does not offer enough information regarding corporate actions in order to include spin-offs, as was done in Fig. 1 of [13]. Over the years 1989 to 2004, where our methodology overlaps with [13], we see that there is a very good agreement in terms of the number of new issues added to the index and the total number of index additions. This indicates that, for the time period 1989-2004, our methodology does not significantly differ from the original methodology which generated the rosters of companies studied by Cai and Houge and which were originally retrieved from Frank Russell Company. Figure 2 extends the results of Cai and Houge to the time period in which IPOs were included in the original methodology, that is from the year 2004 until the most recent data.

[Figure 2 plot: number of companies (Russell) versus time; bars: Total Index Additions, New Issues]

Figure 2: The annual index changes between 1989-2019, as well as the total number of new IPOs taking place in the 12 months before the rebalance of each year and satisfying the requirements for index additions. Russell US indexes started receiving index additions due to IPOs from September 2004.

We visually compare the index returns generated with our methodology versus the original index returns. We retrieve the original daily returns for the Russell 1000, 2000 and 3000 from the Bloomberg L.P. terminal. As already discussed, the index daily returns can be computed as a weighted average of the stock returns using the index weights. Therefore, a correct combination of the stock selection and the corresponding index weights should be able to reproduce the original daily index returns. We remark that it would be very hard, if not impossible, to back-engineer the constituents and the corresponding weights given the original daily index returns for the entire time period considered. The daily index returns are too noisy to be used for any meaningful visual comparison, therefore we plot the trailing three-months (T3M) index gross returns. Let $0 \le t_0 < t_1$ and let $r_t$ be the daily net return from day $t-1$ to day $t$. The compounded gross return on $[t_0, t_1]$ is given by

$$\prod_{t \in [t_0, t_1]} (1 + r_t).$$

Hence, the trailing three-months gross return at time $t_1$ is computed by taking $[t_0, t_1]$ to be a 3-month time window.

Figure 3 compares the T3M index gross returns based on the daily returns of our replicated indexes and the original Russell 1000, 2000 and 3000 indexes for the time window between July 1989 and June 2019. The Russell U.S. indexes generated following our methodology consistently mimic the original Russell indexes for the entire duration of the time window considered.

The visual agreement of Fig. 3 is further assessed statistically via cross-correlations of daily returns of our replicated indexes against the original Russell U.S. 
indexes. As discussed in Section “What is the problem with cross-correlating simultaneous autocorrelated time series?” of [21], the significance thresholds of the cross-correlation between two time series have to be altered from the conventional cross-correlation limit if the time series considered individually present significant autocorrelations. If such autocorrelations are not taken into account, they may lead to the phenomenon of “spurious correlations”, in which, as also shown in Fig. 1 of [21], even two independent time series can present a significant correlation.

For the time windows 1989-2004 and 2004-2019, we checked that none of the daily returns of the indexes, generated or original, present significant autocorrelation at any non-zero lag. The correlations of generated and original indexes at zero lag are reported in Table 4, along with the 5% significance thresholds, which are given by $\pm 1.96/\sqrt{n}$, where $n$ is the sample size. All the cross-correlations are strongly significant.

Table 4 (columns: Years, Russell 3000, Russell 2000, Russell 1000, 5% Significance Limits): Cross-correlation at lag 0 days between daily net returns for the Russell 1000, 2000 and 3000 generated with our reconstitution procedure versus the original Russell indexes.

Similarly, as shown in Fig. 4, the normalised distributions of the daily returns overlap to a very good degree between June 1989 and June 2004. The agreement is also confirmed by the corresponding Q-Q plot.

Next, we turn to the time window ranging from June 2004 to June 2019. As already discussed at the beginning of this section, we consider adding to our index only securities issued by companies which are not listed in the index at the time of their IPO. This approximation allows us to avoid the extra work of considering different criteria for the securities belonging to companies already in the index. The full discussion of such criteria was given in Section 2. Figure 5 displays a justification for such approximation. 
We compare, for each year from 2004 to 2019, the number of new issues from companies which are not in the index, namely “New Issues not in Index”, to the number of new issues from companies which belong to the Russell 1000 or 2000 indexes, that is the “New Issues from Russell 1000” and “New Issues from Russell 2000” bars. Specifically, the “New Issues from Russell 1000” bar at year t counts the eligible securities issued by a company in the Russell 1000 index in the time window from year t to year t + 1. The same holds, mutatis mutandis, for the “New Issues from Russell 2000” and the “New Issues not in Index” bars. We remark that, given the hierarchical structure of the Russell 1000, 2000 and 3000 indexes, the sum of the new issues from companies which belong to the Russell 1000 and 2000 indexes is simply the total number of new issues from companies in the Russell 3000 index. For the entire duration of our analysis the “New Issues not in Index” IPOs are about two orders of magnitude more numerous than the “New Issues from Russell 1000” and “New Issues from Russell 2000” IPOs combined. In many years there are no new issues from companies belonging to the indexes, for example in 2007 or 2016.

Similarly to the time period 1989-2004, we compare the T3M index gross returns of the reproduced Russell indexes against those of the original ones. Again, Figure 3 compares the T3M gross returns arising from the Russell 1000, 2000 and 3000 indexes generated with our methodology versus the original indexes between June 2004 and June 2019. Similarly to the pre-2004 returns, our generated indexes closely track the original Russell index returns.

Moreover, for the time window 2004-2019, the daily returns do not show any autocorrelation for either the original or the generated time series. As reported in Table 4, the cross-correlation between the generated and original returns is strongly significant for each of the Russell 1000, 2000 and 3000 indexes.

When comparing the normalised distributions of the daily returns as in Fig. 6, we observe a very good agreement between our indexes and the original ones, which is supported by the daily returns histogram (left panel) and the Q-Q plot (right panel) for the Russell 3000.
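
The two comparisons used in this section, trailing three-month gross returns and zero-lag cross-correlation of daily returns, can be sketched as follows. This is a minimal illustration rather than the code of the pyndex package: the function names are hypothetical, the three-month window is approximated by 63 trading days, the significance limit is assumed to be the conventional $1.96/\sqrt{n}$, and the data are synthetic.

```python
import numpy as np
import pandas as pd


def trailing_gross_returns(daily_net_returns: pd.Series, window: int = 63) -> pd.Series:
    """Compounded gross return prod(1 + r_t) over a trailing window of trading days."""
    # Summing log(1 + r_t) and exponentiating equals the product of (1 + r_t).
    log_gross = np.log1p(daily_net_returns)
    return np.exp(log_gross.rolling(window).sum())


def zero_lag_correlation(generated: pd.Series, original: pd.Series):
    """Return the lag-0 correlation and the assumed 5% limit 1.96 / sqrt(n)."""
    paired = pd.concat([generated, original], axis=1, join="inner").dropna()
    n = len(paired)
    corr = paired.iloc[:, 0].corr(paired.iloc[:, 1])
    return corr, 1.96 / np.sqrt(n)


# Synthetic series standing in for the original and replicated index returns.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2004-06-01", periods=500)
original = pd.Series(rng.normal(0.0003, 0.01, len(dates)), index=dates)
generated = original + rng.normal(0.0, 0.001, len(dates))

t3m = trailing_gross_returns(generated)
corr, limit = zero_lag_correlation(generated, original)
```

A correlation far above `limit` is what "strongly significant" means in this setting; the threshold is only valid when neither series is autocorrelated, as checked in the text.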

Figure 3: We compare the T3M gross returns for the time period June 1989 to June 2019 of the Russell 1000, 2000 and 3000 indexes generated with our reconstruction methodology (in blue) versus the original Russell US indexes (in orange).

Figure 4: On the left panel we compare the normalised daily returns histogram of the Russell 3000 replicated index (orange area) with that of the original index (blue area), from June 1989 to June 2004. On the right panel the corresponding Q-Q plot between the two distributions (generated quantiles versus original quantiles) is presented.

Figure 5: Comparison of the number of new IPOs in each year between 2004-2018 in the following groups: “New Issues not in Index” for securities which are not in the Russell 3000 index, “New Issues from Russell 1000” and “New Issues from Russell 2000”.

Figure 6: On the left panel we compare the normalised daily returns histogram of the Russell 3000 replicated index (orange area) with that of the original index (blue area), from June 2004 to June 2019. On the right panel the corresponding Q-Q plot between the two distributions (generated quantiles versus original quantiles) is presented.

In this section we measure the temporary and permanent price impact for the annual additions and deletions in the Russell 3000 index. Moreover, we conduct a careful analysis of the temporary and permanent price impact for new IPOs which are added to the Russell 3000 index, estimating such quantities near the dates of quarterly and annual rebalancings. Studying this price impact allows us to test whether the majority of market participants follow the index rebalance annually or quarterly. Specifically, for each year from 2004 to 2018 we test the following hypotheses:

• whether the most recent Q3, Q4 and Q1 quarterly additions remaining in the Russell 3000 index at annual rebalance present a significantly different price impact compared to all the other additions in the index, near the date of the annual rebalance.

• whether near each of the Q3, Q4 and Q1 rebalancings, the quarterly additions have a price impact significantly different from other Russell 3000 index members which have not changed their index membership in the most recent annual rebalance.

As a result we shed light on crowded and less crowded trades on new stock additions to the Russell 3000 index, in proximity of the annual and quarterly rebalance dates.

Madhavan, in his seminal work [30], measured the mean permanent and temporary price impacts generated by the annual additions and deletions of securities to the Russell U.S. indexes. Specifically, Madhavan focused on the 1996-2001 period, that is, before the index reconstitution methodology was updated to include the quarterly IPO additions as discussed in Section 2. He computed the permanent and temporary price impact in terms of the log-returns produced by the securities within the following time intervals: for permanent impact, from the end of May until two months thereafter, and for temporary impact, from June 30 until one month thereafter. We recall that the end of May coincides with the annual rank day and June 30 can be considered the date of the reconstitution, as can be inferred from Table 2. It was found that for index additions over the period 1996-2001, the mean temporary impact and the mean permanent impact for the Russell 3000 index were 5.4% and 3.3%, respectively. 
For index deletions in the Russell 2000 index, the results were more modest, with a mean temporary impact of 0.7% and a mean permanent impact of −1.6% (see Table 1 and 2 therein).

Quantifying temporary and permanent market impact is of particular interest to the market microstructure literature as well as to financial institutions, since they are often found to be two of the main sources of transaction costs.

Analogously to [30], we determine the permanent and temporary price impacts associated with the annual reconstitution of the index, both for index additions and deletions. We adopt the methodology from Section 5.2 of [30] and quantify the temporary market impact as

$$R_{\mathrm{temp}} = \ln(p_1) - \ln(p_2) \qquad (5.1)$$

and the permanent market impact as

$$R_{\mathrm{perm}} = \ln(p_2) - \ln(p_0), \qquad (5.2)$$

where $p_0$, $p_1$ and $p_2$ are the stock prices at the annual rank day, one month thereafter and two months thereafter, respectively. Table 5 reports the measurements of the permanent and temporary market impact for annual additions and deletions. The columns $\bar{R}_{\mathrm{temp}}$ and $\bar{R}_{\mathrm{perm}}$ contain the mean temporary and permanent market impact, respectively, expressed as percentages with the corresponding standard errors in parentheses. The column $N^o$ contains the size of the sample considered. Note that we also included in this analysis the new IPOs which were added to the index on the annual rebalance, but not the ones that were added in the quarterly rebalancings of Q3, Q4 and Q1 of the same year.

We remark that after 2008 the temporary market impact often presents a negative sign, attributable to the 2010s bull market which marked a positive trend in the equity market, as has also been documented by financial news, e.g. [36, 20, 34]. Nonetheless, in many years deletions still present a combination of positive temporary price impact and negative market impact, regardless of the aforementioned positive trend. 
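
The per-security impact measures and the table entries (mean in percent with standard error) can be computed along the following lines. This is a hedged sketch: it assumes that $p_0$, $p_1$ and $p_2$ denote prices on the rank day, one month later and two months later, that the temporary impact is the part of the move that reverses between months one and two, and that the standard error is the sample standard deviation divided by $\sqrt{N}$. The function names and the toy prices are illustrative only.

```python
import numpy as np
import pandas as pd


def price_impacts(p0, p1, p2) -> pd.DataFrame:
    """Temporary and permanent impact per security, in log-return units."""
    p0, p1, p2 = (np.asarray(x, dtype=float) for x in (p0, p1, p2))
    return pd.DataFrame({
        # Assumed convention: temporary impact is the reversal from month 1 to month 2.
        "r_temp": np.log(p1) - np.log(p2),
        # Assumed convention: permanent impact is the move from rank day to month 2.
        "r_perm": np.log(p2) - np.log(p0),
    })


def mean_and_standard_error(x: pd.Series):
    """Mean and standard error of the mean, both expressed in percent."""
    return 100 * x.mean(), 100 * x.std(ddof=1) / np.sqrt(len(x))


# Toy prices for two hypothetical additions.
p0 = np.array([10.0, 20.0])   # annual rank day
p1 = np.array([11.0, 19.0])   # one month later
p2 = np.array([10.5, 19.5])   # two months later

impacts = price_impacts(p0, p1, p2)
mean_temp, se_temp = mean_and_standard_error(impacts["r_temp"])
```

Under these conventions the two measures add up to the log-return over the first month, $\ln(p_1) - \ln(p_0)$, which gives a quick consistency check on any implementation.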
Moreover, we also measure a more moderate price impact for deleted securities, analogously to what has been observed by [30] for the time period 1996-2001.

As discussed in Section 2, from the 2004 annual reconstitution the Russell U.S. indexes have started receiving quarterly additions of newly issued securities in order to provide a version of the indexes which better resembles the equity market. We therefore investigate whether such quarterly updates are really implemented by market participants via quarterly reconstitutions of the index portfolios.

We recall that the price impact measured on the index additions arises from the transactions generated by traders' portfolio rebalancings. In fact, close to the annual reconstitution period, market participants review their equity portfolios tracking the indexes: buy and sell orders are based on their beliefs about which constituents will be added to and deleted from their current portfolio composition. It follows that the securities which are already present in such an equity portfolio at the time of the annual review, and are believed to remain in the new roster of securities, will not see an excess of transactions comparable to those of the new additions and deletions. Indeed, this is the reason why the index effect literature focuses exclusively on annual index additions and deletions.

Under the hypothesis of a portfolio manager tracking the index at each quarterly rebalance, at the time of the annual reconstitution she would mainly have to buy shares of the securities which she believes will be added to the index, and which were not [...]

Table 5 (Price Impact in Russell 3000 Index; columns: Year, then $\bar{R}_{\mathrm{temp}}$, $\bar{R}_{\mathrm{perm}}$ and $N^o$ for Annual Additions and for Annual Deletions).

[...] The “Quarterly Additions” group, which appears in orange, are the securities which were added at the most recent Q3, Q4 and Q1 rebalancings and which remained in the index in the upcoming annual reconstitution. The group “New Additions”, in blue, represents any other security added to the index in 
the same year. We observe a very good agreement between the distributions at each year, supporting the hypothesis that securities in the “Quarterly Additions” group and those in the “New Additions” group are traded by market participants in a very similar fashion within the time frame of up to two months after the annual reconstitution date.

We further investigate the observed similarity between the “Quarterly Additions” group and the “New Additions” group near the annual reconstitution date, under minimal assumptions. We conduct a two-sample t-test assuming unequal variances and unequal sample sizes under the null hypothesis that the two groups are sampled from the same distribution. The t-statistic, which we denote by $t_{\mathrm{obs}}$, is defined in (B.1). Here, y refers to the log-returns of the “New Additions” group and z stands for the log-returns of the “Quarterly Additions” group according to (5.1) and (5.2). We apply a bootstrap algorithm with 10,000 repetitions for each year. We refer to Algorithm B.1 in Appendix B for the procedure used to calculate the p-values. In order to account for multiple hypothesis tests, one for each year from 2005 to 2018, we adjust the p-values given by Algorithm B.1 using the Benjamini-Hochberg procedure (see Section 3 of [2]). In Appendix C we describe the transformation that needs to be applied to the p-values (see equation (C.2) and Algorithm C.1 therein). When adjusting for multiple testing, the p-values for the permanent price impact tests and those for the temporary price impact tests are adjusted separately.

Table 6 reports the two-tailed adjusted p-values of our test statistic for each year from 2005 to 2018. Only in one year out of fourteen, namely 2006, were the mean permanent price impacts of the two groups found to be significantly different at the 0.05 significance level. 
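
The testing pipeline described above (unequal-variance t-statistic, bootstrap p-value, Benjamini-Hochberg adjustment across years) can be sketched as follows. Algorithm B.1 and Algorithm C.1 are not reproduced in this section, so the resampling scheme here is an assumption: both samples are shifted to the pooled mean so that the null of equal means holds, in the spirit of [22], and the adjusted p-values use the standard step-up formula. All function names and the synthetic returns are illustrative.

```python
import numpy as np


def welch_t(y, z):
    """Two-sample t-statistic with unequal variances and unequal sample sizes."""
    y, z = np.asarray(y, float), np.asarray(z, float)
    se = np.sqrt(y.var(ddof=1) / len(y) + z.var(ddof=1) / len(z))
    return (y.mean() - z.mean()) / se


def bootstrap_p_value(y, z, n_boot=10_000, seed=0):
    """Observed statistic and two-tailed bootstrap p-value (assumed scheme)."""
    rng = np.random.default_rng(seed)
    y, z = np.asarray(y, float), np.asarray(z, float)
    t_obs = welch_t(y, z)
    pooled_mean = np.concatenate([y, z]).mean()
    # Shift both samples to a common mean so that the null hypothesis holds.
    y0, z0 = y - y.mean() + pooled_mean, z - z.mean() + pooled_mean
    t_boot = np.array([
        welch_t(rng.choice(y0, len(y0)), rng.choice(z0, len(z0)))
        for _ in range(n_boot)
    ])
    return t_obs, float(np.mean(np.abs(t_boot) >= abs(t_obs)))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out


# Synthetic log-returns standing in for one year of the two groups.
rng = np.random.default_rng(1)
y = rng.normal(0.0, 0.05, 60)   # "New Additions"
z = rng.normal(0.0, 0.05, 25)   # "Quarterly Additions"
t_obs, p = bootstrap_p_value(y, z, n_boot=2_000)
```

In the setting of the text, one such p-value would be computed per year and the resulting vectors for permanent and temporary impact would be passed to `benjamini_hochberg` separately.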
As for the temporary market impact, the two groups were found to be significantly different in only three years out of fourteen, namely 2006, 2011 and 2016. Nonetheless, such discrepancy could have already been deduced from Figure 7, where the blue and orange temporary price impact distributions present visibly different features.

The reconstitution methodology introduced in Section 4 allows us to keep track of the quarterly index additions in the Russell 1000, 2000 and 3000 indexes at each quarter. This allows us to test the complementary hypothesis of whether the new quarterly additions receive any abnormal price impact soon after the corresponding quarterly rank day. In fact, if most market participants were to rebalance their index portfolio annually, the new quarterly additions would not see any significant excess of price impact compared, for example, to the securities which are already present in the index.

As already discussed in this section, the new annual additions present an excess of price impact measurable up to the end of July of the corresponding year, i.e. two months after the annual rank day. Moreover, as shown in Table 3, the Q3 rank day usually falls approximately in the middle of August. Hence, some new annual additions could continue to present a measurable excess of price impact in the proximity of the Q3 rank day.

The only securities in the index which can be safely considered devoid of the aforementioned excess of price impact are those which have not changed their index membership in the most recent annual rebalance. In fact, even securities which remained in the Russell 3000 index during the most recent annual rebalance, but have moved from the Russell 2000 index to the Russell 1000 index, might still present an excess of price impact. This effect is generated by the buy orders of those market participants following the Russell 1000 index. 
Therefore, we investigate whether the price impact measured on the new quarterly additions and on those securities that have not changed their index membership in the most recent annual rebalance present measurable differences.

For each quarter we conduct, similarly to what we did for the annual rebalance, a two-sample t-test assuming unequal variances and unequal sample sizes. Our t-test is done under the null hypothesis that the new quarterly additions and the securities that have not changed their index membership in the two most recent annual rebalancings are sampled from the same distribution. Again, we take 10,000 repetitions for the bootstrap resampling as in Algorithm B.1.

We define the mean test statistic $\bar{t}$ to be the mean of the bootstrap t-distribution created in Algorithm B.1, for the two-sample t-test measuring permanent price impact. Here y and z refer to the two-month log-returns starting on the quarter rank day, for stocks which are already in the index and for quarterly additions, respectively. The 95% confidence interval is also derived by using a bootstrap percentile method, as defined in (B.2). The 95% confidence interval needs to be further adjusted for multiple testing, as introduced by [3] and further discussed in Section “False Coverage Statement Rate-Adjusted CIs” of [28]. This is done analogously to the p-value corrections of Table 6; see Algorithm C.2 in Appendix C for the exact procedure. When adjusting for multiple testing, the 95% confidence intervals for the Q3, Q4 and Q1 permanent price impact are adjusted separately. Figure 8 presents the mean test statistic $\bar{t}$ for the two-sample t-test for permanent price impact (the blue line), along with the 95% confidence interval (the light blue region). The orange line shows the observed test statistic $t_{\mathrm{obs}}$ from (B.1).

Finding $t_{\mathrm{obs}}$ outside the 95% confidence interval would mean that we must reject the null hypothesis that the two samples come from the same distribution, and accept the alternative hypothesis that the distributions generating the two samples are different. We remark that only three years out of fifteen present two or more significant observed test statistics $t_{\mathrm{obs}}$, namely 2007 and 2015.

Figure 7: A comparison of the permanent and temporary impact distributions by year (2005-2018, price impact in %) for the “Quarterly Additions” group and the “New Additions” group, for the Russell 3000 index. The “Quarterly Additions” group are the securities which were added at the most recent Q3, Q4 and Q1 rebalancings and which remained in the index in the upcoming annual reconstitution. The “New Additions” represent any other security added to the index in the same year.

Table 6 (columns: Years, then $t_{\mathrm{obs}}$ and $p$ for Permanent Impact and for Temporary Impact): Observed test statistic and the corresponding two-tailed p-values at 0.05 level of significance, for each year from 2005 to 2018 under the null hypothesis that the “Quarterly Additions” group and the “New Additions” group are sampled from the same distribution.

Similarly, in Figure 9 the blue line shows the mean test statistic $\bar{t}$ from Algorithm B.1 for temporary price impact. Here y and z refer to the one-month log-returns starting on the quarter rank day, of stocks which are already in the index and of quarterly additions, respectively. The light blue region shows the 95% confidence interval, which is derived along the same lines as in Figure 8. The orange line shows the observed test statistic $t_{\mathrm{obs}}$ from (B.1) for temporary price impact. We observe that only four years out of fifteen present two or more significant observed test statistics $t_{\mathrm{obs}}$, namely 2011, 2013, 2015 and 2017. Nonetheless, the majority of the significant $t_{\mathrm{obs}}$ are only marginally significant. 
It is reasonable to believe that the results in the years 2007 and 2009 might have been biased by the unfolding of the 2007-2008 financial crisis.

Ultimately, no compelling evidence was found to conclude that the majority of market participants follow the quarterly index rebalancings, as shown by Figures 8 and 9. Moreover, the similarities between the price impact distributions observed in Figure 7 are in favour of the hypothesis that most market participants focus on the annual index rebalance, disregarding the quarterly index additions until the entire index portfolio has to be reviewed to take into account the changes brought by the annual index reconstitution.

The non-crowding phenomenon around quarterly rebalance dates points to a possibility for profitable trading strategies on IPO additions. A trader who wishes to track the new index additions could purchase new IPO additions around quarterly rebalance dates, with relatively low transaction costs. These IPOs could be sold later by the trader near the annual rebalance date, when the stock price is expected to experience a significant increase due to price impact.

Figure 8: Permanent price impact case (panels: Q3, Q4 and Q1 t-statistic, 2004-2018). The mean test statistic $\bar{t}$ is plotted in the blue line for the new quarterly additions versus those securities that have not changed their index membership in the most recent annual rebalance. The light blue bands are the 95% confidence intervals for the t-test statistic. The observed test statistic $t_{\mathrm{obs}}$ is presented in the orange line.

Figure 9: Temporary price impact case (panels: Q3, Q4 and Q1 t-statistic, 2004-2018). The mean test statistic $\bar{t}$ is plotted in the blue line for the new quarterly additions versus those securities that have not changed their index membership in the most recent annual rebalance. The light blue bands are the 95% confidence intervals for the t-test statistic. The observed test statistic $t_{\mathrm{obs}}$ is presented in the orange line.

This paper consists of two parts. In the first part we dealt with the reconstruction of Russell US indexes. We reviewed the main features of the Russell US index reconstitution methodology, from the index eligibility criteria to the quarterly IPO addition procedure. Our analysis focused on the years 1989-2019. We split our analysis into two time windows: the first is 1989-2004, and the second is 2004-2019, with 2004 being the year in which the quarterly IPO additions were introduced. By a careful choice of approximations to the aforementioned methodology we reproduced the Russell 1000, 2000 and 3000 indexes to a very high degree of accuracy, using only the CRSP US Stock database for our index reconstruction. We remark that the CRSP database, which is part of the Wharton Research Data Services (WRDS) database, is frequently used by researchers in the field and is available in many academic institutions.

The index constituents and their corresponding weights are released via a python package called pyndex [31], with the purpose of making this an accessible and standard platform for researchers in the field, as the Russell indexes historical data is often unavailable for academic studies.

In the second part of the paper, we studied the crowding phenomenon on strategies that track the Russell 3000 index. We measured the temporary and permanent price impact for the annual index additions and deletions from 2005 to 2018. 
We compared the permanent and temporary price impact affecting the securities added in the Q3, Q4 and Q1 quarterly rebalancings and remaining in the index versus the new index additions that did not belong to the Russell 3000 index at any time in the previous rebalance year. Such measurements suggested a larger presence (or crowding) of trading strategies that track the index additions annually compared to those that rebalance quarterly. This phenomenon implies that indexing strategies can experience reduced transaction costs by buying new IPO additions close to quarterly rebalance dates.

It was shown in [37] that common strategies which are based only on momentum signals are crowded and therefore would give rather poor profitability. Our findings add information on crowding phenomena, as we show that indexing strategies are indeed crowded on the one-year scale but much less crowded on the scale of months near quarterly rebalancings.

Wharton Research Data Services (WRDS) was used in preparing this paper. This service and the data available thereon constitute valuable intellectual property and trade secrets of WRDS and/or its third-party suppliers.

References

[1] P. Barroso, R. M. Edelen, and P. Karehnke. Institutional crowding and the moments of momentum. SSRN.3045019, 2017.
[2] Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1):289–300, 1995.
[3] Y. Benjamini and D. Yekutieli. False discovery rate–adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association, 100(469):71–81, 2005.
[4] E. N. Biktimirov, A. R. Cowan, and B. D. Jordan. Do demand curves for small stocks slope down? Journal of Financial Research, 27(2):161–178, 2004.
[5] A. L. Boone and J. T. White. The effect of institutional ownership on firm transparency and information production. Journal of Financial Economics, 117(3):508–533, 2015.
[6] G. Bormetti, L. M. Calcagnile, M. Treccani, F. Corsi, S. Marmi, and F. Lillo. Modelling systemic price cojumps with Hawkes factor models. Quantitative Finance, 15(7):1137–1156, 2015.
[7] F. Bucci, M. Benzaquen, F. Lillo, and J. P. Bouchaud. Slow decay of impact in equity markets: insights from the ANcerno database. arXiv:1901.05332, 2019.
[8] F. Bucci, F. Lillo, J. P. Bouchaud, and M. Benzaquen. Are trading invariants really invariant? Trading costs matter. Quantitative Finance, pages 1–10, 2020.
[9] F. Bucci, I. Mastromatteo, Z. Eisler, F. Lillo, J. P. Bouchaud, and C. A. Lehalle. Co-impact: crowding effects in institutional trading activity. Quantitative Finance, 20(2):193–205, 2020.
[10] BusinessWire. Russell indexes to add IPOs on a quarterly basis: change in methodology enhances market representation of index. Available at , 2004.
[11] F. Caccioli, J. D. Farmer, N. Foti, and D. Rockmore. Overlapping portfolios, contagion, and financial stability. Journal of Economic Dynamics and Control, 51:50–63, 2015.
[12] F. Caccioli, M. Shrestha, C. Moore, and J. D. Farmer. Stability analysis of financial contagion due to overlapping portfolios. Journal of Banking & Finance, 46:233–245, 2014.
[13] J. Cai and T. Houge. Long-term impact of Russell 2000 index rebalancing. Financial Analysts Journal, 64(4):76–91, 2008.
[14] F. Capponi and R. Cont. Trade duration, volatility and market impact. SSRN.3351736, 2019.
[15] Y. C. Chang, H. Hong, and I. Liskovich. Regression discontinuity and the price effects of stock market indexing. The Review of Financial Studies, 28(1):212–246, 2014.
[16] H. Chen, G. Noronha, and V. Singal. Index changes and unexpected losses to investors in S&P 500 and Russell 2000 index funds. SSRN.651950, 2005.
[17] H. L. Chen. On Russell index reconstitution. Review of Quantitative Finance and Accounting, 26(4):409–430, 2006.
[18] R. Cont and J. P. Bouchaud. Herd behavior and aggregate fluctuations in financial markets. Macroeconomic Dynamics, 4(2):170–196, 2000.
[19] M. Cremers, A. Pareek, and Z. Sautner. Short-Term Investors, Long-Term Investments, and Firm Value: Evidence from Russell 2000 Index Inclusions. Management Science, 2020.
[20] G. Davies. The great bull market reaches its 10th birthday. Available at , 2019.
[21] R. T. Dean and W. T. M. Dunsmuir. Dangers and uses of cross-correlation in analyzing time series in perception, performance, movement, and neuroscience: The importance of constructing transfer function autoregressive models. Behavior Research Methods, 48(2):783–802, 2016.
[22] B. Efron and R. J. Tibshirani. An Introduction to the Bootstrap. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis, 1994.
[23] H. G. Fong. The World of Risk Management. World Scientific, 2005.
[24] FTSE-Russell. Our story. Available at .
[25] FTSE-Russell. Russell U.S. indexes – annual reconstitution. Available at , 2017.
[26] FTSE-Russell. Russell U.S. equity indexes v4.5 – Construction and Methodology. Available at https://research.ftserussell.com/products/downloads/Russell-US-indexes.pdf , 2020.
[27] FTSE-Russell. Russell U.S. indexes review timetable - March 2020 and annual reconstitution timetable - June 2020. Available at https://research.ftserussell.com/products/index-notices/home/getnotice/?id=2595207&_ga=2.179740323.629831544.1587596713-1989757251.1586952995 , 2020.
[28] D. M. Groppe. Combating the scientific decline effect with confidence (intervals). Psychophysiology, 54(1):139–145, 2017.
[29] A. Khandani and A. W. Lo. What happened to the quants in August 2007?: Evidence from factors and transactions data. SSRN.1288988, 2008.
[30] A. Madhavan. The Russell reconstitution effect. Financial Analysts Journal, 59(4):51–64, 2003.
[31] A. Micheli. pyndex - Russell index reconstruction package. Available at https://github.com/alemicheli/pyndex .
[32] Z. M. Onayev and V. M. Zdorovtsov. Predatory Trading Around Russell Reconstitution. SSRN.1101341, 2008.
[33] A. Petajisto. The index premium and its hidden cost for index funds. Journal of Empirical Finance, 18(2):271–288, 2011.
[34] N. Randewich. Wall Street’s oldest-ever bull market turns 10 years old. Available at https://uk.reuters.com/article/usa-stocks-bull/rpt-wall-streets-oldest-ever-bull-market-turns-10-years-old-idUKL1N20V1RJ , 2019.
[35] Wharton Research Data Services. Overview of CRSP U.S. stock data. Available at .
[36] Wall Street Journal Staff. Inside a Decadelong Bull Run. Available at , 2019.
[37] V. Volpati, M. Benzaquen, Z. Eisler, I. Mastromatteo, B. Toth, and J. P. Bouchaud. Zooming in on equity factor crowding. arXiv:2001.04185, 2020.
[38] Y. Benjamini, R. Heller, and D. Yekutieli. Selective inference in complex research. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 367(1906):4255–4271, 2009.
[39] E. Zarinelli, M. Treccani, J. D. Farmer, and F. Lillo. Beyond the square root: Evidence for logarithmic dependence of market impact on size and participation rate. Market Microstructure and Liquidity, 01(02):1550004, 2015.

A CRSP US Financial Data

The analysis in this paper is based extensively on the financial data available in the CRSP dataset. We summarise some of the main features of the databases considered.

The financial information regarding the securities used for this study was retrieved from the CRSPQ:DSF dataset, which consists of the quarterly updated CRSP daily stock data. The CRSPQ:DSF dataset contains all the major daily financial indicators for the securities traded in the U.S. stock market, including stock closing prices, daily returns and shares outstanding. The CRSPQ:DSFHDR dataset, on the other hand, contains the metadata related to each security in CRSPQ:DSF. The information stored in the CRSPQ:DSFHDR file includes, for example, the first day of trading of each security, the name of the company to which the security belongs, and its Standard Industrial Classification.

Two labels are required in order to identify a company and its underlying securities. The identifier permco uniquely identifies a company in CRSP; it is neither reused once a company ceases to exist, nor changed if the company's name is modified. Each company in CRSP may be traded through one or more securities, so these must also be uniquely identified in order, for example, to compute the company's market capitalisation. CRSP provides a unique five-digit permanent identifier for each security, named permno, which neither changes during an issue's trading history, nor is reassigned after an issue ceases trading. For each security, the daily returns, assuming dividend reinvestment, are given by the column ret in the CRSPQ:DSF file.

As discussed in Section "Calculations" of [35], the market capitalisation of a given company in CRSP can be found by computing

$$\sum_{i} p_{it} \cdot v_{it},$$

where $p_{it}$ is the unadjusted close price on day $t$ of security $i$ and $v_{it}$ is the corresponding unadjusted number of its shares outstanding. The sum is taken over all the securities belonging to the given company. Here, $p_{it}$ and $v_{it}$ correspond to the columns prc and shrout of the CRSPQ:DSF dataset, respectively.

The selection of the securities with prices greater than $1 discussed in Section 2 is performed over the adjusted stock prices. Following Section "Adjusting for Stock Splits and Other Corporate Actions" of [35], the adjusted stock price of security $i$ is given by

$$\frac{p_{it}}{\beta_{it}},$$

where $p_{it}$ is the unadjusted close price of security $i$ on day $t$ and $\beta_{it}$ is the corresponding price adjustment factor. Here $\beta_{it}$ is stored in the column cfacpr of the CRSPQ:DSF dataset.

The share types in CRSP are identified through a two-digit code, named hshrcd, describing the type of shares traded. The file CRSPQ:DSFHDR stores the hshrcd for all the financial securities belonging to the CRSPQ:DSF dataset. Following Appendix A.2 in Chapter 3 of [23], common stocks are represented by a hshrcd equal to 10 or 11.

In order to establish which IPOs will be included in which quarterly addition, one has to consider the corresponding IPO date. In CRSP, the first day of trading corresponding to an IPO is stored in the begdat variable of the CRSPQ:DSFHDR dataset. The most widely used database in IPO research is SDC Platinum from Thomson Financial, which is currently not available in WRDS or CRSP. As reported in [35], a comparison between SDC and CRSPQ:DSFHDR indicates that the first trading days agree in 76% of cases. This confirms that the IPO dates in the CRSPQ:DSFHDR dataset can be reliably used for the index reconstruction.
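As an illustration of the quantities described in this appendix, the following sketch computes the company-level market capitalisation, the adjusted price filter and the common-stock filter with pandas. It is not taken from the pyndex package [31]. The toy data frames, the function names company_market_cap and eligible_securities, and the use of the absolute value of prc are assumptions made for this example; only the column names prc, shrout, cfacpr, hshrcd, permno and permco come from the text above.

```python
import pandas as pd

# Toy rows standing in for CRSPQ:DSF on a single day (all values are made up).
dsf = pd.DataFrame({
    "permno": [10001, 10002, 10003, 10004],
    "permco": [500, 500, 501, 502],
    "prc": [20.0, 10.0, 0.8, 5.0],
    "shrout": [1000, 500, 2000, 300],
    "cfacpr": [1.0, 2.0, 1.0, 1.0],
})

# Toy rows standing in for CRSPQ:DSFHDR (share codes are made up).
hdr = pd.DataFrame({
    "permno": [10001, 10002, 10003, 10004],
    "hshrcd": [11, 10, 11, 73],
})


def company_market_cap(dsf):
    """Sum prc * shrout over all securities (permno) of each company (permco)."""
    # Assumption: abs() guards against negative prc values, which CRSP uses
    # to flag bid/ask averages; the formula in the text uses prc directly.
    cap = dsf["prc"].abs() * dsf["shrout"]
    return cap.groupby(dsf["permco"]).sum()


def eligible_securities(dsf, hdr, min_price=1.0):
    """Return permnos with adjusted price above min_price and hshrcd in {10, 11}."""
    merged = dsf.merge(hdr, on="permno", how="inner")
    adjusted_price = merged["prc"].abs() / merged["cfacpr"]
    keep = (adjusted_price > min_price) & merged["hshrcd"].isin([10, 11])
    return merged.loc[keep, "permno"].tolist()


caps = company_market_cap(dsf)
eligible = eligible_securities(dsf, hdr)
print(caps)
print(eligible)
```

In this toy example company 500 has two securities, so its capitalisation is the sum of both, while securities 10003 and 10004 are excluded by the price and share-code filters, respectively.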

B Bootstrap Two-Sample t-test

In this section we summarise some useful results on the bootstrap two-sample $t$-test, which are taken from Chapter 16.2 of [22].

We consider two samples $z$ and $y$ of sizes $n$ and $m$, respectively, from possibly different probability distributions $F$ and $G$. We would like to test the null hypothesis $H_0 : F = G$. Let $x$ be the collection of all the observations in $y$ and $z$. We test $H_0$ with the following two-sample statistic $t(\cdot)$ for unequal variances and unequal sample sizes,

$$t_{obs} \equiv t(x) = \frac{\bar{z} - \bar{y}}{\sqrt{\bar{\sigma}_1^2/n + \bar{\sigma}_2^2/m}}, \tag{B.1}$$

with

$$\bar{\sigma}_1^2 = \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})^2, \qquad \bar{\sigma}_2^2 = \frac{1}{m-1} \sum_{i=1}^{m} (y_i - \bar{y})^2,$$

where $\bar{z}$ and $\bar{y}$ are the means of samples $z$ and $y$, respectively. Algorithm B.1 computes the bootstrap test statistic and the corresponding two-tailed $p$-values. In our analysis we take the number of bootstrap repetitions $N$ to be .

Moreover, as discussed in Chapter 13.3 of [22], given a level of significance $\alpha$, the corresponding confidence interval for the bootstrapped distribution of the test statistic $t$ can be found using the bootstrap percentile method. Let $\hat{\Phi}$ be the empirical cumulative distribution function of the bootstrap test statistic $t$. The $(1 - \alpha)$ confidence interval is given by

$$\left( \hat{\Phi}^{-1}(\alpha/2),\ \hat{\Phi}^{-1}(1 - \alpha/2) \right), \tag{B.2}$$

where $\hat{\Phi}^{-1}(\alpha/2)$ and $\hat{\Phi}^{-1}(1 - \alpha/2)$ by definition correspond to the $\alpha/2$ and $1 - \alpha/2$ percentiles, respectively.

C Multiple Testing

In this section we summarise some of the results regarding the Benjamini-Hochberg (BH) correction for independent multiple testing.

Algorithm B.1 Bootstrap test statistic for testing $F = G$

1. Draw $N$ samples of size $n + m$ with replacement from $x$. Call the first $n$ observations $z^*$ and the remaining $m$ observations $y^*$.

2. Evaluate $t(\cdot)$ on each sample,

$$t(x^{*,k}) = \frac{\bar{z}^* - \bar{y}^*}{\sqrt{(\bar{\sigma}_1^*)^2/n + (\bar{\sigma}_2^*)^2/m}}, \qquad k = 1, 2, \ldots, N,$$

where $\bar{\sigma}_1^*$ and $\bar{\sigma}_2^*$ are defined on $z^*$ and $y^*$ accordingly.

3. Approximate two-tailed $p$-values by

$$\hat{p}_{boot} = 1 - \frac{\sum_{j=1}^{N} \mathbb{1}\{-t_{obs} \le t(x^{*,j}) \le t_{obs}\}}{N}.$$
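Algorithm B.1, together with the percentile interval (B.2), can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in the paper: the function names, the default n_boot=2000, the fixed seeds, the synthetic Gaussian samples and the simple lower-rank quantile rule are all assumptions. The sketch also uses $|t_{obs}|$ in step 3, so that the two-tailed $p$-value remains meaningful when $t_{obs}$ is negative.

```python
import math
import random


def mean_and_variance(sample):
    """Return the sample mean and the unbiased (n - 1) sample variance."""
    size = len(sample)
    mean = sum(sample) / size
    variance = sum((value - mean) ** 2 for value in sample) / (size - 1)
    return mean, variance


def t_stat(z, y):
    """Two-sample statistic with unequal variances and sizes, as in (B.1)."""
    mean_z, var_z = mean_and_variance(z)
    mean_y, var_y = mean_and_variance(y)
    return (mean_z - mean_y) / math.sqrt(var_z / len(z) + var_y / len(y))


def bootstrap_test(z, y, n_boot=2000, alpha=0.05, seed=0):
    """Return (t_obs, bootstrap p-value, percentile interval of t)."""
    rng = random.Random(seed)
    n = len(z)
    pooled = list(z) + list(y)
    t_obs = t_stat(z, y)

    # Steps 1 and 2: resample the pooled data and re-evaluate the statistic.
    t_boot = []
    for _ in range(n_boot):
        resample = rng.choices(pooled, k=len(pooled))  # with replacement
        t_boot.append(t_stat(resample[:n], resample[n:]))

    # Step 3: share of bootstrap statistics outside [-|t_obs|, |t_obs|].
    bound = abs(t_obs)
    inside = sum(1 for value in t_boot if -bound <= value <= bound)
    p_boot = 1.0 - inside / n_boot

    # Percentile interval (B.2), using a lower-rank rule (an assumption).
    ordered = sorted(t_boot)
    lower = ordered[int(alpha / 2 * (n_boot - 1))]
    upper = ordered[int((1 - alpha / 2) * (n_boot - 1))]
    return t_obs, p_boot, (lower, upper)


# Synthetic samples with clearly different means (illustrative data only).
data_rng = random.Random(1)
z = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
y = [data_rng.gauss(3.0, 1.0) for _ in range(50)]
t_obs, p_boot, interval = bootstrap_test(z, y)
print(t_obs, p_boot, interval)
```

Because the two synthetic samples are well separated, the observed statistic falls far outside the bootstrap distribution obtained under the pooled null, and the bootstrap $p$-value is close to zero.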

As discussed in Section 2.b of [38], the $p$-values can be adjusted for multiple testing according to the BH procedure via Algorithm C.1. Section 3 of [2] clarifies that in the BH procedure the test statistics are assumed to be independent.

Algorithm C.1 Multiple testing at significance level $\alpha$

Let $H_{0,i}$ with $i = 1, \ldots, m$ be the null hypotheses, and $p_i$ be the corresponding $p$-values.

1. Sort the $p$-values as $p_{(1)} \le p_{(2)} \le \ldots \le p_{(m)}$ and let $p_{(k)}$ be the largest value such that

$$p_{(k)} \le \frac{k\alpha}{m}. \tag{C.1}$$

2. If no such $k$ exists, select no discovery. Otherwise, reject the $k$ hypotheses corresponding to $p_{(1)}, \ldots, p_{(k)}$, declaring these findings to be discoveries.

Let $H_{0,i}$ with $i = 1, \ldots, m$, be the null hypotheses, and $p_i$ be the corresponding $p$-values. One can alternatively compute the BH-adjusted $p$-values as follows:

$$P^{BH}_{(i)} = \min\left( \left( \min_{j \ge i} \frac{m\, p_{(j)}}{j} \right),\ 1 \right). \tag{C.2}$$

Then, $P^{BH}_{(i)} \le \alpha$ if and only if $H_{(i)}$ is among the discoveries when using the BH procedure at significance level $\alpha$.

As further discussed in Section "False Coverage Statement Rate-Adjusted CIs" of [28], the Benjamini-Hochberg procedure can be applied to confidence intervals for multiple comparisons as shown in the following algorithm.

Algorithm C.2 Adjusted Confidence Intervals for Multiple-Testing

1. Apply the BH procedure to the $p$-values from the family of $m$ tests, where $m$ is the total number of hypothesis tests.

2. For any $p$-value that is significant after the BH procedure, construct a confidence interval for the corresponding test with coverage $1 - \alpha'$, where $\alpha'$ is

$$\alpha' = \left( \frac{k}{m} \right) \alpha,$$

with $k$ defined as in (C.1).

In this paper we take $\alpha$ to be 0.05. Therefore, in order to compute the bootstrap confidence interval, adjusted to the Benjamini-Hochberg framework, we use $\alpha'$ in place of $\alpha$.
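Algorithm C.1, the adjusted $p$-values (C.2) and the level $\alpha'$ of Algorithm C.2 can be sketched in plain Python as follows. The function names and the five example $p$-values are hypothetical and serve only to illustrate the step-up rule; they are not results from the paper.

```python
def bh_discoveries(p_values, alpha=0.05):
    """Algorithm C.1: return (k, sorted indices of the rejected hypotheses)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            k = rank  # keep the largest rank that satisfies (C.1)
    return k, sorted(order[:k])


def bh_adjusted(p_values):
    """Equation (C.2): BH-adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # caps the adjusted values at 1
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, m * p_values[index] / rank)
        adjusted[index] = running_min
    return adjusted


def adjusted_alpha(k, m, alpha=0.05):
    """Algorithm C.2: level alpha' = (k / m) * alpha for the adjusted intervals."""
    return k * alpha / m


# Hypothetical p-values; 0.021 misses its own threshold 2 * 0.05 / 5 = 0.02
# but is still rejected because a larger p-value satisfies (C.1).
p_values = [0.035, 0.001, 0.025, 0.20, 0.021]
k, rejected = bh_discoveries(p_values)
adjusted = bh_adjusted(p_values)
alpha_prime = adjusted_alpha(k, len(p_values))
print(k, rejected, adjusted, alpha_prime)
```

The hypotheses whose adjusted $p$-value is at most $\alpha$ coincide with the discoveries of Algorithm C.1, which is the equivalence stated after (C.2).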