Dynamics of Investor Spanning Trees Around Dot-Com Bubble

Abstract

We identify temporal investor networks for Nokia stock by constructing networks from correlations between investor-specific net-volumes and analyze changes in the networks around the dot-com bubble. We conduct the analysis separately for households, non-financial institutions, and financial institutions. Our results indicate that spanning tree measures for households reflected the boom and crisis: the maximum spanning tree measures had a clear upward tendency in the bull markets when the bubble was building up, and, even more importantly, the minimum spanning tree measures pre-reacted to the burst of the bubble. At the same time, we find less clear reactions in the minimal and maximal spanning trees of non-financial and financial institutions around the bubble, which suggests that household investors can have a greater herding tendency around bubbles.


arXiv [q-fin.EC], Aug

Sindhuja Ranganathan, Mikko Kivelä, Juho Kanniainen
Industrial and Information Management/Tampere University of Technology, Tampere, Finland
Department of Computer Science, School of Science/Aalto University, Espoo, Finland
Current Address: Industrial and Information Management/Tampere University of Technology, Tampere, Finland
* [email protected]


Introduction

The strategic interaction and collection of individuals or agents in a financial setup can play a key role in determining their financial outcomes. Understanding how investors behave and operate has been a topic of interest in behavioral finance in the recent past. Earlier in the literature, investors’ trading strategies and investor behavior were studied at an aggregated level using conventional regression methodologies [1, 2, 3, 4, 5, 6, 7]. The evolution of networks of stocks and currency rates and their structural change have been successfully analyzed in the existing literature [8, 9, 10, 11, 12, 13]. The effects of economic and financial bubbles on the stock market have also been analyzed in the literature [14, 15, 16]. However, investor networks have been examined much less, and even though complex network methods have recently been applied to identify investor networks [17, 18], we still lack research on the dynamics of investor networks around a financial crisis.

This paper aims to be the first step towards understanding investor networks by focusing on the dynamics of correlation networks of investors over the Dot-com (IT Millennium) bubble, using unique investor transaction registry data which contains all the trades of Finnish households and institutions in the Helsinki Exchange. We especially focus on the question of how gradual and non-gradual changes in investor network structure are related to the stock price process. This research opens avenues to understanding the actual mechanisms of stock markets and to identifying domino effects that can propagate through investors and propel the stock markets into a crisis state.

In this paper, the investors’ correlation matrix is obtained using time series of investor-specific daily net volumes for Nokia, one of the most important technology companies around the millennium. At the same time, Nokia is the most liquid stock in our data sample from the Helsinki stock exchange, and there has been other research based on Nokia’s stock market data, for example in refs. 18, 19, 20. Investors’ correlation matrices are estimated for three main categories of investors: financial institutions, households and non-financial institutions. Correlation matrices can be interpreted as link-weighted networks, and the links in the resulting networks, where all nodes are connected, can be filtered with a multitude of different approaches [12, 21, 22, 23]. An elegant and popular method in stock market network analysis is to employ minimal or maximal spanning tree methods to find a “backbone” of the full correlation network [8, 9, 11, 24, 25, 26]. Several more complicated correlation matrix construction and filtering methods have been developed more recently [21, 23, 27, 28, 29, 30, 31], but utilizing these is left for future research.

The analysis of the dynamics of investor networks developed in this paper introduces two theoretical challenges when compared to other financial correlation networks. First, the set of investors is much larger than, for example, the number of stocks, and the set of active investors is strongly time-varying. The vast majority of methods developed for analyzing dynamic, or temporal, networks are based on the assumption that only the links change and the set of nodes is stable [32, 33]. Further, changes in the set of investors even limit the applicability of methods based on analyzing each network snapshot separately, because metrics that are sensitive to network size cannot be compared across different time windows where the number of investors can be exceedingly different. The second challenge is related to the widely varying sparsity of the time series, where a few investors are extremely active and many others trade very infrequently. The active investors could be investigated using a high temporal resolution and short observation window lengths, but the infrequent investors need a lower resolution and a longer time window. The conventional correlation analysis done here requires a single time resolution level and observation window length to be chosen, and this choice must be a compromise between the two extremes.

We construct minimum and maximum spanning trees for networks within six-month time windows with a displacement of one month. Our results with estimated correlations between households’ transactions show that the average weight of the maximum spanning tree increases and the average weight of the minimum spanning tree decreases before the tipping point of the stock prices (at which stock prices start to decline), after which they remain quite stable. In other words, as the bubble propagates, an investor has, on average, more and more positive correlations with other investors in the maximum spanning tree, but, at the same time, the correlations with the most distant investor, in terms of trading style, become even more negative in the minimum spanning tree. This suggests that households became polarized before the Nokia price crash in 2000. However, an equally strong effect cannot be observed for financial institutions: the average weights of the minimum and maximum spanning trees of institutional investors are not as clearly related to the evolution of the financial crisis.

Dot-com Bubble

In this paper, we analyze the behavior of Nokia’s investors around the dot-com bubble of year 2000. Bubbles are phenomena in which the prices of assets deviate from their fundamental values [34]. Generally, during bubbles, investors purchase shares anticipating future gains, and when bubbles collapse it leads to a sudden fall in prices, which was also the case in the dot-com bubble. Particularly, during the late 1990s, internet-based stocks dominated the equity markets and there were lots of investments in internet and technology based start-ups with extremely optimistic expectations. As people started pouring money into technology based start-up companies, the prices of their shares in the stock market grew very high. During early 2000, investments in these companies reduced drastically and many of the companies that were expected to generate profits failed, leading the bubble to burst. As a consequence, there was panic selling and the market slumped.

Bubbles have been studied quite extensively and from various perspectives in the literature. According to ref. 35, market prices during bubbles follow power-law acceleration and have log-periodic oscillations. The dot-com bubble had similar characteristics and ended in a crash (see, for example, ref. [36]). One perspective is that bubbles occur due to the uncertainty that prevails in the market [37]. In this regard, ref. 38 provides evidence that uncertainty is a plausible explanation for a sudden rise in the price of some stocks, as a high level of uncertainty matched high prices and high return volatility in the market during the dot-com bubble. The sudden rise and fall in market prices during the dot-com bubble was associated with variations in risks from various sources. Bakshi and Wu [39] show that with the rising valuation of the NASDAQ 100, return volatility as a risk measure increased, the estimates for the market price of diffusion risk became negative (from September 21, 1999, to January 5, 2000), and the market price of jump risk became unusually high. Another perspective is that bubbles occur when there are new innovations [40] that investors see as opportunity pulls, expecting high profits in the future. Other reasons for the occurrence of bubbles are lack of experience among traders [41], investors’ emotions [42], investors’ over-confidence [43] and public announcements [44]. There are several reasons for a bubble to burst. According to ref. 38, one of the reasons for the dot-com bubble to burst was that the expected profitability of technology stocks became low. Not all bubbles lead to crashes, but when a bubble crashes, it signals important information to the market. According to ref. 40, when a bubble bursts it signals that there is a need to implement the new innovations that emerged in the bubble period. This requires social and economic support to continue the growth of innovations which could benefit the economy.

Results

Next we describe how we construct a series of correlation networks of investors investing in Nokia stock around the Dot-Com Bubble, 1998–2002, and report the basic statistics related to the changes in these networks. We then continue to investigate the minimal and maximal spanning trees we extract from these fully connected networks. We report the results of our analysis separately for Finnish households, financial institutions and non-financial institutions.

Nodes in the networks: Active investors

The nodes in the networks we construct are investors, and in order to estimate the correlations between pairs of them we need to have enough data on their trading behavior. Figure 1a depicts the distribution of investors in the period 1998–2002 and shows that there are many investors who have traded on only a few days but relatively few investors who have traded on many days, making the data sparse. We take two steps to alleviate the problems related to sparse data in the network construction: First, we only consider active investors who have traded on a minimum of 20 days in a given time period. Second, the daily net-volumes of each active investor are averaged over a week (that is, we apply investor-specific simple moving averages).

We investigate our total sample period 1998–2002 by moving a 6-month sliding time window over it. Using the above definition of active investors for each 6-month time window, Figure 1b depicts the evolution of the number of active financial institutions, households and non-financial institutions over these time windows. We see that the numbers of active households and non-financial institutions had positive trends over the sample period while the number of active financial institutions remained rather stable. Importantly, the bubble “burst” did not have clear effects on the number of active traders.

Even though the number of investors in each time window can be stable, the set of investors can vary significantly. This is indeed the case, as shown by Figure 1c, where we use the Jaccard index to investigate the overlap of investors between every pair of subsequent time windows. Note that the activeness criterion (at least 20 observations in six months) is applied for each estimation period with a displacement of one month, and that this filtering has an effect on the Jaccard index values. We observe that the networks of households have lower similarity between each other compared to financial institutions, meaning that the turnover of active household investors is relatively high over time. This means that, especially for relatively inactive household investors, the networks in different time windows are bound to be very different, and if we observe any stability in network statistics, it cannot be explained only by the stability of the networks, but needs to be explained by some other organizing principles in the system.
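The activeness filter and the Jaccard comparison of consecutive windows described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy series, and the rule that a day counts as a trading day whenever the net volume is non-zero are assumptions made for illustration only.

```python
from typing import Dict, List, Set

MIN_ACTIVE_DAYS = 20  # activeness threshold stated in the text


def active_investors(net_volumes: Dict[str, List[float]],
                     start: int, width: int) -> Set[str]:
    """Investors with at least MIN_ACTIVE_DAYS trading days in the window.

    Assumption: a day is a trading day if the net volume is non-zero, so a
    day with equal buys and sells is not counted by this simplification.
    """
    active = set()
    for investor, series in net_volumes.items():
        window = series[start:start + width]
        if sum(1 for v in window if v != 0) >= MIN_ACTIVE_DAYS:
            active.add(investor)
    return active


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a & b| / |a | b|; taken to be 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy data: 147 trading days, three investors.
toy = {
    "A": [1.0] * 147,                  # trades every day
    "B": [1.0] * 25 + [0.0] * 122,     # trades only during the first 25 days
    "C": [0.0] * 117 + [-2.0] * 30,    # trades only during the last 30 days
}
first = active_investors(toy, start=0, width=126)
second = active_investors(toy, start=21, width=126)
overlap = jaccard(first, second)
```

In this toy case only investor A is active in both windows, so the overlap is low even though each window contains two active investors, which mirrors the point that a stable node count does not imply a stable node set.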

Links in the networks: Correlations in trading patterns

We use the Pearson correlation of the trading patterns of investor pairs inside each time window to construct links between the investors (for details, see Methods). The Pearson correlation coefficient has been used extensively in the network analysis of time series of stock prices [8] and it has some clear advantages also in the analysis of individual investor trading. Observations with exceptionally high trading volumes can represent days with the arrival of important information; these days are of interest to us because we want to analyze whether investors react to the information in the same way, and therefore it is a desired property that the measure is sensitive to exceptionally large values. In contrast to Pearson correlation, Kendall and Spearman correlations look at rank-order as opposed to metric information, and thus they do not weight these outlier days appropriately.

Not only do the nodes change between the different time windows, but the weights of the links (the correlations) are also relatively unstable. To quantify this, we measure the average absolute change in correlations between nodes that remain in two consecutive time windows (see Eq. 1 in Methods) and the average correlation between all pairs of nodes in Figure 2. Figure 2a demonstrates the average change in the correlations among pairs of investors who both appear in subsequent periods. The change in correlations between two consecutive time windows is of the same order as the standard deviation of the correlations inside the time windows. That is, the network is relatively unstable in its links, but, as we will see next, the global organization of the network and related statistics are still rather stable.
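As an illustration of the link-stability measure, the following Python sketch computes Pearson correlations for all investor pairs in two consecutive windows and then averages the absolute change over the pairs present in both. The helper names and the toy series are hypothetical, and the sketch assumes equal-length, non-constant series; it is meant as a reading aid rather than the procedure actually used in the paper.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def pearson(x: List[float], y: List[float]) -> float:
    """Plain Pearson correlation; assumes equal lengths and non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_edges(window: Dict[str, List[float]]) -> Dict[Edge, float]:
    """Correlation of every unordered investor pair, keyed by the sorted id pair."""
    return {(i, j): pearson(window[i], window[j])
            for i, j in combinations(sorted(window), 2)}


def mean_abs_change(rho_t: Dict[Edge, float], rho_next: Dict[Edge, float]) -> float:
    """Average |rho_next - rho_t| over the edges present in both windows."""
    common = rho_t.keys() & rho_next.keys()
    if not common:
        raise ValueError("the two windows share no investor pairs")
    return sum(abs(rho_next[e] - rho_t[e]) for e in common) / len(common)


# Hypothetical toy windows: investor C leaves and investor D enters.
window_t = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
window_next = {"A": [1, 2, 3, 4], "B": [8, 6, 4, 2], "D": [1, 3, 2, 4]}
change = mean_abs_change(correlation_edges(window_t),
                         correlation_edges(window_next))
```

Only the pair (A, B) survives from one toy window to the next, and its correlation flips sign, so the measure takes its largest possible value here.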

Minimum/maximum spanning trees

The correlation matrices of investors’ net trading volumes which we produce can be interpreted as weighted networks where all node pairs (i.e. investors) are connected. Particularly, investors $i$ and $j$ are connected by a weight of $\rho_{ij} \in [-1, 1]$. [...] $L_{\min}$, for the merged network of investors of the three categories. There is an obvious downward jump in $L_{\min}$ just before the tipping point, which is defined as the highest price of the Nokia stock during the sample period. Importantly, $L_{\min}$ is estimated using data from the past, and therefore no information about the forthcoming bubble burst was used. That is, the investors pre-reacted to the impending decline in the stock price, and next we focus on investigating which investor groups are behind this reaction. We visualize the maximum spanning trees in Figure 3b-d. There does not seem to be any clearly visible clustering of the categories similar to business sectors in stock networks or geographical regions in currency networks [8, 45, 46]. However, we can see that there might be some local tendency for nodes from the same category to be adjacent, but this observation is not investigated further here.

Figure 4 displays the average weights of minimum and maximum spanning trees, $L_{\min}$ and $L_{\max}$, around the crisis for networks containing nodes only from one of the three investor categories. Again, every data point is estimated with data over the previous 126 trading days (6 months), and the estimation windows are rolling by one month. Figure 4a shows that the average weight of the minimum spanning tree, $L_{\min}$, of the household network suddenly jumps down just some months prior to the turning point of the stock price evolution around the crisis. Particularly, the value of $L_{\min}$ was -0.32 on 03-April 2000 whereas it was -0.45 on 06-June 2000, after which the stock prices started to fall. Importantly, the difference is considerably large in comparison to other changes in the data sample, even though the estimates, -0.32 and -0.45, are based on partially overlapping estimation data (the length of the estimation period is six months and the analysis is run with a rolling window of 1 month). Another important observation is that the level of $L_{\min}$ does not recover back to the level it was at prior to the tipping point during the following two years. For non-financial and financial institutions, we see no obvious patterns in $L_{\min}$ around the crisis. Overall, weights in the minimum spanning tree among households are, on average, abnormally negative just around the turning point. This means that households, on average, have neighbors in the minimum spanning tree who are trading in an abnormally opposite way.

The dynamics of maximum spanning trees in Figure 4b provide a slightly different story compared to the minimum spanning tree dynamics. Particularly, we see that the average weight of the maximum spanning tree, $L_{\max}$, for households has a clearly positive trend prior to the spike of February 2000, after which it remains quite stable. Particularly, its value was 0.27 in 1998, and it increased almost to 0.6 in two years in bull markets, which is an increase of 122%. This means that there are investors that have been coming together when the bubble was building up. A positive pre-trend and a rather stable post-trend can also be identified for non-financial institutions, but they are weaker compared to households. Financial institutions, however, behave differently regarding $L_{\max}$: there is a peak in $L_{\max}$ for financial institutions just before the tipping point, which lasts half a year, but otherwise $L_{\max}$ is relatively stable over the period.
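To make the two spanning tree statistics concrete, the following Python sketch extracts a minimum or maximum spanning tree from a fully connected correlation network with Kruskal's algorithm and returns the average weight of its edges. The choice of Kruskal's algorithm, the function names and the toy correlations are assumptions for illustration; the paper does not state which spanning tree algorithm was used.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def spanning_tree(weights: Dict[Edge, float], maximize: bool) -> List[Edge]:
    """Kruskal's algorithm with a small union-find over investor ids."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    # Largest weights first for a maximum tree, smallest first for a minimum tree.
    for (u, v), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=maximize):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:   # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


def average_tree_weight(weights: Dict[Edge, float], maximize: bool) -> float:
    """Mean correlation over the N - 1 edges of the chosen spanning tree."""
    tree = spanning_tree(weights, maximize)
    return sum(weights[e] for e in tree) / len(tree)


# Hypothetical correlations between four investors.
rho = {("A", "B"): 0.8, ("A", "C"): 0.1, ("A", "D"): -0.5,
       ("B", "C"): 0.3, ("B", "D"): -0.2, ("C", "D"): 0.6}
l_max = average_tree_weight(rho, maximize=True)
l_min = average_tree_weight(rho, maximize=False)
```

In the toy network the maximum tree keeps the three most similar pairs that connect everyone, while the minimum tree keeps the most opposed pairs, so the two averages move apart when a group polarizes.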
Note that the average weights of the networks displayed in Figure 2b do not display peaks at the same times or of the same magnitude.

In the light of private information channels that investors use in trading in stock markets (see ref. 17), our results from the maximum spanning tree analysis would suggest that especially household investors’ connections to the most important neighbors in a connected graph became more and more important when the techno bubble was building up, which can indicate herding in stock markets. The existing literature also provides evidence that spanning trees for different financial networks react around financial crises, though with different data sets (and thus with different networks) compared to the present research (see refs. 9, 47 with data on stock returns, 13 with data on stock market indexes, and 48 with data on currency exchange rates).

Discussion

This paper examines the behavior of Finnish investors using shareholding registration records for Nokia stock in the Helsinki stock exchange from 1998 to 2002, which includes the period of the dot-com bubble. Analyses for households, non-financial institutions, and financial institutions are conducted using minimum and maximum spanning trees constructed from correlations between investor-specific net-volumes. We find that the spanning tree measures reflected the bubble with the data for households, and, in fact, they pre-reacted to the forthcoming bear markets, while non-financial and financial institutions show no equally clear reactions. Particularly, the average correlation of the households’ minimum spanning tree clearly jumped down a couple of months before the Nokia price started to have a negative trend. On the other hand, the average correlation in the households’ maximum spanning tree did not jump suddenly right before the burst of the bubble; rather, the average correlation had a considerably large upward trend in bull markets, increasing from 0.27 to almost 0.60 during the two years before the stock price crash, after which it stayed quite stable. This result on maximum spanning trees can reflect information channels between individual household investors: investors’ connections to the most important neighbors in a connected graph became more and more important when the techno bubble was building up, which can indicate herding in stock markets, especially among household investors.

There are some restrictions in our research on correlated investor networks, which are mainly related to how the networks are constructed. We used data on investors’ transactions with only one stock, because the other stocks in our data set are too illiquid to have enough data for estimating investor-specific networks. In future studies, multiple similar stocks could be pooled together or methods that function better under sparse data could be used. Another limitation is the way we used Pearson correlation between the investment time series to calculate the similarities between nodes. There are more sophisticated ways of inferring the latent relationships between the nodes in the literature [27, 28, 29, 30], but the particular difficulty in investor networks is the high variation in transaction frequencies between investors. The high-frequency nodes can be analysed with much higher temporal resolution than the low-frequency ones, and choosing a single resolution level is a compromise between these two extremes. Finally, the spanning tree analysis discards valuable data in a very aggressive way in order to make the system less complex, and there are multiple alternatives in the literature where more data is kept [21, 22, 23, 31]. In future research, we aim to build the network in a more sophisticated way, which allows us to analyze a large number of stocks with alternative methods.

The network of investors is dynamically changing, and the approach taken here, which is in line with the literature on stock correlation networks, was to calculate various static network metrics on snapshots of the network, and then inspect how these metrics change in time. Methods that do not rely on static networks but measure the dynamics of networks have been developed in the field of temporal networks [32, 33], but most of these approaches have been constructed for networks where the links change dynamically but the nodes are relatively stable. There are, of course, other systems with long temporal data and large changes in the set of nodes, such as citation networks and collaboration networks [49, 50, 51]. In some systems, such as contact networks of customers, the patterns of nodes leaving and entering the system can even be of main interest [52, 53, 54]. However, there are relatively few methods for analysing networks where both nodes and links change, and the temporal investor networks introduced here could serve as a good example for network analysis in future research.

Additionally, in the present paper, the sets of investors were based on the status of household, financial institution, or non-financial institution and on activeness, which is a rather arbitrary way to classify investors. Also, one could say that the observations of investor trading events are just realizations of a non-observable (psychological) process, making the identified temporal network unstable. In our future research, we will develop sampling methods to overcome these potential problems. Also, alternative inference techniques for the estimation of network edges are expected in future research.

Materials and methods

Data

The data used in this study is the central register of shareholdings for Finnish stocks from the Finnish central depository, provided by Euroclear Finland. It includes all the major publicly traded Finnish stocks from 1995. It consists of the shareholdings of all the Finnish and non-Finnish investors who traded in the Helsinki stock exchange, on a daily basis. The data contains investors’ trades and portfolios including all Finnish household investors, Finnish institutions, and foreign institutions. The records are exact duplicates of the official certificates of ownership and trades, and hence are very reliable. The Book Entry System entails compulsory registration of holdings for Finnish individuals (referred to as households) and institutions. Foreigners are partially exempt from registration as they can opt for registration in a nominee name, and thus they cannot be separated from each other, for which reason data about foreigners’ trades is excluded in the present paper. A more detailed description of the data set is provided in refs. 1, 18.

Our sample data consists of marketplace transactions of Nokia stock from 1 January 1998 to December 2002. Each data record has the following information: stock ticker, owner id, trading date, transaction registration date, number of shares traded, the price of the trade, buy/sell transaction type, and other investor-specific fields like the investor’s sector code, language code, gender, date of birth, and postal code. For our analysis, we have considered investors from different categories who have traded actively with Nokia.

Links in the network

Net volume traded by an investor $i$ on day $t$ is given as $V_{i,t} = V^{b}_{i,t} - V^{s}_{i,t}$, where $V^{b}_{i,t}$ is the number of shares of Nokia bought by investor $i$ on day $t$ and $V^{s}_{i,t}$ is the number of shares of Nokia sold by investor $i$ on day $t$. In contrast to the inference method introduced in ref. 18, we do not scale the net-volumes by $V^{b}_{i,t} + V^{s}_{i,t}$, because the scaled approach does not measure the magnitude of trades, i.e. the level of the scaled variable does not reflect exceptionally high or low traded net volumes. For example, suppose that on a given day for a given stock, investor A buys one share and sells zero and investor B buys exceptionally many shares, say 1,000,000, and sells zero. Then both investors’ scaled net-volumes would equal +1, although their trading behavior has been very different. The dependency between two investors, $i$ and $j$, is measured with Pearson correlation for $M$ different time windows of fixed width $W$. In our study, $W$ is set to 126 trading days (6 months) and the analysis is run with a rolling window of 1 month (21 trading days). As the total number of days in our data is 1252, these choices give us $M = 54$ time windows for the 6-month time window. Note that the data studied here is very sparse in the sense that for many investors most days are without any activity (see plot (a) in Fig. 1), but these silent days are here considered as decisions not to trade. That is, the inactive days are not considered as missing data in our calculation of the Pearson correlation coefficient. In our notation, $\rho^{(ij)}_{t}$ denotes the Pearson correlation coefficient between investors $i$ and $j$ estimated from daily net-volumes of $W$ days counted backwards from the day $t$.
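The net-volume definition, the scaled alternative that is deliberately not used, and the window count can be checked with a short Python sketch. The function names are hypothetical, and the window enumeration assumes that windows start every 21 trading days and must fit entirely inside the 1252-day sample.

```python
from typing import List


def net_volume(bought: float, sold: float) -> float:
    """Unscaled net volume V = V_b - V_s, as used in the text."""
    return bought - sold


def scaled_net_volume(bought: float, sold: float) -> float:
    """Scaled alternative (V_b - V_s) / (V_b + V_s), shown only for contrast."""
    return (bought - sold) / (bought + sold)


def window_starts(n_days: int, width: int, step: int) -> List[int]:
    """Start indices of all rolling windows that fit inside the sample."""
    return list(range(0, n_days - width + 1, step))


# The investor A / investor B example from the text.
small = (net_volume(1, 0), scaled_net_volume(1, 0))
large = (net_volume(1_000_000, 0), scaled_net_volume(1_000_000, 0))

# Window count for W = 126 trading days, a 21-day step and 1252 days of data.
n_windows = len(window_starts(n_days=1252, width=126, step=21))
```

Under these assumptions the enumeration yields 54 windows, in agreement with the value of $M$ stated above, and the scaled measure is identical for both example investors while the unscaled net volumes differ by six orders of magnitude.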
One could also use daily net-volumes of $W/2$ days in the past and $W/2$ days in the future, but we prefer to use only data from the past in order to analyze pre-reactions in the networks, so that no information about the forthcoming bubble burst is used.

The average absolute change in correlations between nodes that remain in two consecutive time windows is defined as

$$J_{\mathrm{edges}}(t) = \frac{1}{|e_t \cap e_{t+1}|} \sum_{(i,j) \in e_t \cap e_{t+1}} \left| \rho^{(ij)}_{t+1} - \rho^{(ij)}_{t} \right|, \qquad (1)$$

where $e_t$ denotes the set of edges in the network at time $t$ (i.e., $e_t = \{(u, v) \mid u, v \in n_t, u \neq v\}$).

Minimum and maximum spanning trees

For a network with $N_t$ nodes and edge set $\mathcal{E}_t$, a maximum spanning tree is a connected sub-network with the same nodes and a subset of $N_t - 1$ edges $\mathcal{E}^{\max}_t \subseteq \mathcal{E}_t$ such that the sum of the edge weights (here correlations), $\sum_{(i,j) \in \mathcal{E}^{\max}_t} \rho^{(ij)}_t$, is maximized. Similarly, for a minimal spanning tree we find a set of edges $\mathcal{E}^{\min}_t$ such that the sum of the edge weights is minimized.

Note that we do not transform the correlations into distances using the formula $d_{ij} = \sqrt{2(1 - \rho^{(ij)}_t)}$, which would turn minimal spanning trees into maximal ones and vice versa; the spanning tree structure is otherwise invariant to this transformation because it only reverses the rank-order of the edge weights. We also construct minimum spanning trees, which are complementary to the maximum ones.

The average weights of maximum and minimum spanning trees are defined as

$$L_{\max}(t) = \frac{1}{N_t - 1} \sum_{(i,j) \in \mathcal{E}^{\max}_t} \rho^{(ij)}_t$$

and

$$L_{\min}(t) = \frac{1}{N_t - 1} \sum_{(i,j) \in \mathcal{E}^{\min}_t} \rho^{(ij)}_t.$$

References

1. Grinblatt M, Keloharju M. The investment behavior and performance of various investor types: a study of Finland’s unique data set. Journal of Financial Economics. 2000;55(1):43–67.
2. Odean T. Are investors reluctant to realize their losses? The Journal of Finance. 1998;53(5):1775–1798.
3. Brennan MJ, Cao HH. International portfolio investment flows. The Journal of Finance. 1997;52(5):1851–1880.
4. Kaniel R, Saar G, Titman S. Individual investor trading and stock returns. The Journal of Finance. 2008;63(1):273–310.
5. Barrot JN, Kaniel R, Sraer D. Are retail traders compensated for providing liquidity? Journal of Financial Economics. 2016;120(1):146–168.
6. Hoffmann AO, Post T, Pennings JM. Individual investor perceptions and behavior during the financial crisis. Journal of Banking & Finance. 2013;37(1):60–74.
7. Chiang TC, Zheng D. An empirical analysis of herd behavior in global stock markets. Journal of Banking & Finance. 2010;34(8):1911–1921.
8. Mantegna RN. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems. 1999;11(1):193–197.
9. Onnela JP, Chakraborti A, Kaski K, Kertesz J. Dynamic asset trees and Black Monday. Physica A: Statistical Mechanics and its Applications. 2003;324(1):247–252.
10. Naylor MJ, Rose LC, Moyle BJ. Topology of foreign exchange markets using hierarchical structure methods. Physica A: Statistical Mechanics and its Applications. 2007;382(1):199–208.
11. Heimo T, Kaski K, Saramäki J. Maximal spanning trees, asset graphs and random matrix denoising in the analysis of dynamics of financial networks. Physica A: Statistical Mechanics and its Applications. 2009;388(2):145–156.
12. Emmert-Streib F, Dehmer M. Influence of the time scale on the construction of financial networks. PLoS One. 2010;5(9):e12884.
13. Song DM, Tumminello M, Zhou WX, Mantegna RN. Evolution of worldwide stock markets, correlation structure, and correlation-based graphs. Physical Review E. 2011;84(2):026108.
14. Zhou WX, Sornette D. A case study of speculative financial bubbles in the South African stock market 2003–2006. Physica A: Statistical Mechanics and its Applications. 2009;388(6):869–880.
15. Zhou WX, Sornette D. 2000–2003 real estate bubble in the UK but not in the USA. Physica A: Statistical Mechanics and its Applications. 2003;329(1):249–263.
16. Jiang ZQ, Zhou WX, Sornette D, Woodard R, Bastiaensen K, Cauwels P. Bubble diagnosis and prediction of the 2005–2007 and 2008–2009 Chinese stock market bubbles. Journal of Economic Behavior & Organization. 2010;74(3):149–162.
17. Ozsoylev HN, Walden J, Yavuz MD, Bildik R. Investor networks in the stock market. Review of Financial Studies. 2014;27(5):1323–1366.
18. Tumminello M, Lillo F, Piilo J, Mantegna RN. Identification of clusters of investors from their real trading activity in a financial market. New Journal of Physics. 2012;14(1):013041.
19. Kalev PS, Nguyen AH, Oh NY. Foreign versus local investors: Who knows more? Who makes more? Journal of Banking & Finance. 2008;32(11):2376–2389.
20. Lillo F, Miccichè S, Tumminello M, Piilo J, Mantegna RN. How news affects the trading behaviour of different categories of investors in a financial market. Quantitative Finance. 2015;15(2):213–229.
21. Tumminello M, Aste T, Di Matteo T, Mantegna RN. A tool for filtering information in complex systems. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(30):10421–10426.
22. Serrano MÁ, Boguñá M, Vespignani A. Extracting the multiscale backbone of complex weighted networks. Proceedings of the National Academy of Sciences. 2009;106(16):6483–6488.
23. Chi KT, Liu J, Lau FC. A network perspective of the stock market. Journal of Empirical Finance. 2010;17(4):659–667.
24. Vandewalle N, Brisbois F, Tordoir X, et al. Non-random topology of stock markets. Quantitative Finance. 2001;1(3):372–374.
25. Wang GJ, Xie C, Stanley HE. Correlation Structure and Evolution of World Stock Markets: Evidence from Pearson and Partial Correlation-Based Networks. Computational Economics. 2016; p. 1–29.
26. Birch J, Pantelous AA, Soramäki K. Analysis of correlation based networks representing DAX 30 stock price returns. Computational Economics. 2016;47(4):501–525.
27. Kenett DY, Preis T, Gur-Gershgoren G, Ben-Jacob E. Dependency network and node influence: application to the study of financial markets. International Journal of Bifurcation and Chaos. 2012;22(07):1250181.
28. Qian XY, Liu YM, Jiang ZQ, Podobnik B, Zhou WX, Stanley HE. Detrended partial cross-correlation analysis of two nonstationary time series influenced by common external forces. Phys Rev E. 2015;91:062816. doi:10.1103/PhysRevE.91.062816.
29. Nakajima J, West M. Dynamic network signal processing using latent threshold models. Digital Signal Processing. 2015;47:5–16.
30. Musmeci N, Nicosia V, Aste T, Di Matteo T, Latora V. The multiplex dependency structure of financial markets. arXiv:1606.04872 [physics.soc-ph]. 2016.
31. Kwapień J, Oświęcimka P, Forczek M, Drożdż S. Minimum spanning tree filtering of correlations for varying time scales and size of fluctuations. Physical Review E. 2017;95(5):052313.
32. Holme P, Saramäki J. Temporal networks. Physics Reports. 2012;519(3):97–125.
33. Holme P. Modern temporal network theory: a colloquium. The European Physical Journal B. 2015;88(9):1–30.
34. Kindleberger CP. Bubbles. In: The World of Economics. Springer; 1991. p. 20–22.
35. Johansen A, Sornette D. Log-periodic power law bubbles in Latin-American and Asian markets and correlated anti-bubbles in Western stock markets: An empirical study. arXiv preprint cond-mat/9907270. 1999.
36. Johansen A, Sornette D. The Nasdaq crash of April 2000: Yet another example of log-periodicity in a speculative bubble ending in a crash. The European Physical Journal B-Condensed Matter and Complex Systems. 2000;17(2):319–328.
37. Oechssler J, Schmidt C, Schnedler W. On the ingredients for bubble formation: informed traders and communication. Journal of Economic Dynamics and Control. 2011;35(11):1831–1851.
38. Pástor L, Veronesi P. Was there a Nasdaq bubble in the late 1990s? Journal of Financial Economics. 2006;81(1):61–100.
39. Bakshi G, Wu L. The behavior of risk and market prices of risk over the Nasdaq bubble period. Management Science. 2010;56(12):2251–2264.
40. Perez C. The double bubble at the turn of the century: technological roots and structural implications. Cambridge Journal of Economics. 2009;33(4):779–805.
41. Dufwenberg M, Lindqvist T, Moore E. Bubbles and experience: An experiment. The American Economic Review. 2005;95(5):1731–1737.
42. Andrade EB, Odean T, Lin S. Bubbling with excitement: an experiment. Review of Finance. 2015; p. rfv016.
43. Abreu D, Brunnermeier MK. Bubbles and crashes. Econometrica. 2003;71(1):173–204.
44. Corgnet B, Kujal P, Porter D. The effect of reliability, content and timing of public announcements on asset trading behavior. Journal of Economic Behavior & Organization. 2010;76(2):254–266.
45. Heimo T, Kumpula JM, Kaski K, Saramäki J. Detecting modules in dense weighted networks with the Potts method. Journal of Statistical Mechanics: Theory and Experiment. 2008;2008(08):P08007.
46. Wang GJ, Xie C, Chen YJ, Chen S. Statistical properties of the foreign exchange network at different time scales: evidence from detrended cross-correlation coefficient and minimum spanning tree. Entropy. 2013;15(5):1643–1662.
47. Coelho R, Gilmore CG, Lucey B, Richmond P, Hutzler S. The evolution of interdependence in world equity markets: Evidence from minimum spanning trees. Physica A: Statistical Mechanics and its Applications. 2007;376:455–466.
48. Jang W, Lee J, Chang W. Currency crises and the evolution of foreign exchange market: Evidence from minimum spanning tree. Physica A: Statistical Mechanics and its Applications. 2011;390(4):707–718.
49. Martin T, Ball B, Karrer B, Newman MEJ.
Coauthorship and citation patterns in the PhysicalReview. Phys Rev E. 2013;88:012814. doi:10.1103/PhysRevE.88.012814.50. Wu S, Das Sarma A, Fabrikant A, Lattanzi S, Tomkins A. Arrival and departure dynamics insocial networks. In: Proceedings of the sixth ACM international conference on Web search anddata mining. ACM; 2013. p. 233–242.51. Hric D, Kaski K, Kivel¨a M. Stochastic Block Model Reveals the Map of Citation Patterns andTheir Evolution in Time. arXiv:170500018 [physicssoc-ph]. 2017;.52. Dasgupta K, Singh R, Viswanathan B, Chakraborty D, Mukherjea S, Nanavati AA, et al. Socialties and their relevance to churn in mobile telecom networks. In: Proceedings of the 11thinternational conference on Extending database technology: Advances in database technology.ACM; 2008. p. 668–677. 10/153. Kawale J, Pal A, Srivastava J. Churn prediction in MMORPGs: A social influence basedapproach. In: Computational Science and Engineering, 2009. CSE’09. International Conferenceon. vol. 4. IEEE; 2009. p. 423–428.54. Saram¨aki J, Moro E, et al. From seconds to months: an overview of multi-scale dynamics ofmobile telephone calls. The European Physical Journal B-Condensed Matter and ComplexSystems. 2015;88(6):1–10. 11/15 ig 1.

The number of investors in Nokia stock during the full period 1998–2002 and the change in investors across the 6-month time windows. (a) Cumulative distributions of investors and their respective trading days during the full period. (b) The evolution of the number of investors trading Nokia in the 6-month time windows for households, non-financial institutions, and financial institutions. The numbers of investors $|n_t|$ differ greatly across categories, so they are normalized by the average number of investors over the full period, $\langle |n_t| \rangle$. (c) The change in investors measured using the Jaccard coefficient $J(t) = \frac{|n_{t+1} \cap n_t|}{|n_{t+1} \cup n_t|}$, where $n_t$ and $n_{t+1}$ represent the sets of nodes in the networks of months $t$ and $t+1$, respectively, for different investor categories and 6-month time windows. The more similar the consecutive networks are, the higher the value of $J(t)$. Results for each time window in panels (b) and (c) are plotted at the end of the window; that is, each point is estimated with data over the previous 126 trading days (6 months). The estimation windows roll forward by one month, and the resulting points are joined by solid lines. In panels (b) and (c), the green dotted vertical line marks the highest stock price of Nokia in the sample period, and the blue curves (with the axis on the right) show the Nokia stock price. In all panels, the lime-green curve corresponds to financial institutions, the cyan curve to households, and the orange curve to non-financial institutions.

Fig 2. The change in investor correlations of Nokia stock trading across the 6-month time windows during 1998–2002. (a) The average change in correlations between two consecutive time windows, $J_{\mathrm{edges}}(t)$ (see Eq. 1 in the Methods section). (b) The average edge weight, or correlation, in each time window. Every point is estimated with data over the previous 126 trading days (6 months), and the estimation windows roll forward by one month. The green dotted vertical line marks the highest stock price of Nokia in the sample period, and the blue curves (with the axis on the right) show the Nokia stock price. The lime-green curves correspond to financial institutions, the cyan curves to households, and the orange curves to non-financial institutions.

Fig 3. The minimum and maximum spanning trees of all investors. (a) Backward-looking average weight of the minimum spanning tree, $L_{\min}(t)$, for the merged set of investors with 6-month time windows during 1998–2002 (brown line). Every data point is estimated with data over the previous 126 trading days (6 months), and the estimation windows roll forward by one month. The green dotted vertical line marks the highest stock price of Nokia in the sample period, and the blue curves (with the axis on the right) show the Nokia stock price. Maximum spanning trees between (b) 8 July 1999 and 4 January 2000 (before the crisis), (c) 5 January 2000 and 6 July 2000 (during the crisis), and (d) 7 July 2000 and 4 January 2001 (after the crisis). Cyan nodes represent households, orange nodes non-financial institutions, and lime-green nodes financial institutions. Node sizes are based on the volume traded by the investor during the period; they are not comparable across panels.

Fig 4. Backward-looking average weight of the (a) minimum spanning tree, $L_{\min}(t)$, (b) maximum spanning tree, L max ( tt