Simplicial persistence of financial markets: filtering, generative processes and portfolio risk
Jeremy D. Turiel a,1, Paolo Barucca a, and Tomaso Aste a,b
a Department of Computer Science, UCL, Gower Street, WC1E 6BT London, UK; b Systemic Risk Centre, London School of Economics and Political Sciences, London, United Kingdom
Corresponding author. E-mail: [email protected] 21, 2020
We introduce simplicial persistence, a measure of the time evolution of network motifs in subsequent temporal layers. We observe long memory in the evolution of structures from correlation filtering, with a two-regime power law decay in the number of persistent simplicial complexes. Null models of the underlying time series are tested to investigate properties of the generative process and its evolutional constraints. Networks are generated with both the TMFG filtering technique and thresholding, showing that embedding-based filtering methods (TMFG) are able to identify higher order structures throughout the market sample, where thresholding methods fail. The decay exponents of these long memory processes are used to characterise financial markets based on their stage of development and liquidity. We find that more liquid markets tend to have a slower persistence decay. This is in contrast with the common understanding that developed markets are more random. We find that they are indeed less predictable in the dynamics of each single variable, but more predictable in the collective evolution of the variables. This could imply higher fragility to systemic shocks.
network theory | topological filtering | TMFG | long memory | complex systems | time series analysis | financial networks
1. Introduction
Networks representing the structure of the interactions within complex systems have been increasingly studied in the last few decades (1). Applications range from biological networks to social networks, infrastructures and finance (2). In finance, mainly due to the abundance of time-series data regarding economic entities and the lack of data on direct relationships between them, there has been an extensive focus on the estimation of pairwise interactions from pair correlations (3) of the stochastic time series characterising financial markets. The need to extract significant links from noisy correlation matrices has triggered the development of filtering techniques which yield sparse network structures (4–7) based on a limited set of statistical or topological hypotheses. There are three main approaches to network filtering: thresholding, statistical validation, and topological filtering. These methods show that a meaningful and consistent taxonomy of financial assets emerges from sparse network structures, in particular when applying topological methods. Thresholding methods remove edges which are less significant based on their strength (or its absolute value); quantile thresholding is one of these methods, and the one we use in this paper. This method considers the distribution of edge strengths and removes edges with strength below a certain quantile level. It is often applied to financial correlation matrices due to its lack of assumptions on the underlying distribution. Statistical validation, which constitutes a generalisation of simpler thresholding methods, has been used to establish the significance of edges in correlation matrices, with applications to economics and finance as well as other fields (8–11).
Statistical validation can be implemented by comparing empirical networks with random networks from constrained randomisations, which generate weighted ensembles of null models and make it possible to quantify the significance of observed realisations with respect to the ensemble statistics of the constrained null model. Topological filtering through the Minimum Spanning Tree (MST) technique was initially suggested by Mantegna (3), and was further extended to planar graphs with the Planar Maximally Filtered Graph (PMFG) (12) and more recently to chordal graphs with predefined motif structure, such as the Triangulated Maximally Filtered Graph (TMFG) in (13) and the Maximally Filtered Clique Forest (MFCF) in (14).
Market efficiency imposes the absence of temporal memory in log returns, but long memory in higher-order moments of returns and long-term dependence (autocorrelation) of absolute and squared returns have been observed and are now considered among the important stylised facts of markets (15), e.g. volatility clustering, a form of regime switching in the fluctuations observed in financial markets. In (16, 17), later extended in (18), it was shown that order signs obey a long memory process, balanced by anti-correlated volumes which guarantee market efficiency. In financial time series analysis, through the generalised Hurst exponent analysis, it was demonstrated that memory effects are related to the stage of maturity of the market, with more mature markets being more random (19).
With the present paper we provide the missing piece, connecting market structure and market memory by analysing the autocorrelation of market structures (20), through persistence of its filtered correlation matrix (21). We analyse a range of null models (22), corresponding to a range of parsimonious assumptions on the underlying generative processes, for groups of time series.
We compare topological and thresholding network filtering approaches on both null model-based time-series ensembles and real data to test the long memory properties of multivariate financial time series. Each null model preserves different aspects of the time series, which allows us to validate hypotheses about the long memory of market structures by ranking persistence decays of real time series against null models. We show how the edge and motif persistences (motifs such as triangles and tetrahedra) of these models decay in TMFG-filtered graphs and in graphs obtained by filtering correlation matrices through quantile thresholding. We show that topological filtering is a better suited tool to identify persistently correlated groups of securities throughout the market. We compare TMFG with quantile thresholding of the correlation matrix at fixed network density level, observing that quantile thresholding yields results analogous to planar filtering for edge persistence, but it fails to identify motifs distributed throughout the market sample, generating instead highly localised and clustered structures. Further, we demonstrate that our findings have a practical application by introducing an unsupervised technique to identify groups of stocks which share strong fundamental price drivers. This technique can be of particular use in less traded markets, where identifying structures with shared fundamental price drivers might otherwise require in-depth knowledge of the companies.
The rest of the paper is structured as follows. Section 2 describes the data, methods and definitions used for this work, Section 3 outlines the main findings, Section 4 discusses these findings and Section 5 concludes the work with suggestions for future work.
2. Materials and methods
A. Data.
We select the 100 most capitalised stocks from each of four stock markets: the NYSE and the Italian, German and Israeli markets (400 stocks in total). Markets range from highly liquid and more developed ones, such as the New York Stock Exchange and the Frankfurt Stock Exchange, to less liquid markets, such as the Italian Stock Exchange and the Tel Aviv Stock Exchange.
We investigate daily closing price data from Bloomberg for:
• New York Stock Exchange (3/01/2014 - 31/12/2018);
• Frankfurt Stock Exchange (3/01/2014 - 28/12/2018);
• Borsa Italiana (Italian Stock Exchange) (3/01/2014 - 28/12/2018);
• Tel Aviv Stock Exchange (5/01/2014 - 1/1/2019).
The data include 1258 daily price observations for the NYSE, 1272 for FSE and BI, and 1225 for TASE.
B. Time series null models.
We generate ensembles of null models which preserve an increasing number of properties of the real time series.
Random return shuffling
Individual stock log-return ($r_t = \log \mathrm{Price}_t - \log \mathrm{Price}_{t-1}$) time series are randomly shuffled, i.e. a random permutation along the time dimension of each variable is applied, to obtain a null model for noise and spurious correlations. This model maintains the overall statistics of the values of each time series but eliminates any correlation structure.
Rolling univariate Gaussian generator
We calculate the rolling mean $\mu_{t-\delta t,t}$ and standard deviation $\sigma_{t-\delta t,t}$ of the log-return series for each security separately. We then generate ensembles by sampling the return $r_t$ at each point in time from the (rolling) univariate Gaussian distributions with sample mean and standard deviation, $r_t \sim N(\mu_{t-\delta t,t}, \sigma_{t-\delta t,t})$, with $N(\mu, \sigma)$ being a normal distribution with mean $\mu$ and standard deviation $\sigma$. This is intended to simulate the process as a simple moving average with uncorrelated time-varying Gaussian random noise.
Stable multivariate Gaussian generator
We calculate the mean $\boldsymbol{\mu}$ (for each security) and covariance matrix $\Sigma$ over the whole length of the log-return time series. We then generate ensembles by sampling the vector of returns $\mathbf{r}_t$ at each point in time for all securities from the fixed multivariate Gaussian with empirical means and covariance matrix, $\mathbf{r}_t \sim N(\boldsymbol{\mu}, \Sigma)$. This is intended to represent an underlying fixed market structure with sampling noise.
Rolling multivariate Gaussian generator
After obtaining the log-return time series, we calculate the rolling mean $\boldsymbol{\mu}_{[t-\delta t,t]}$ (for each security) and covariance matrix $\Sigma_{[t-\delta t,t]}$ between the series. We generate ensembles by sampling the return $\mathbf{r}_t$ at each point in time for all securities from the (rolling) multivariate Gaussian distributions with sample means and covariance matrices, $\mathbf{r}_t \sim N(\boldsymbol{\mu}_{[t-\delta t,t]}, \Sigma_{[t-\delta t,t]})$. This is intended to capture the changing market structure and simulate the process as being generated by a multivariate Gaussian distribution with time-varying constraints on structural relations.
C. Correlation matrix estimation.
We then compute correlation matrices for the time series with exponential smoothing from rolling windows of $\delta = 126$ trading days with a smoothing factor of $\theta = 46$ days. This is done for all realisations of each null model ensemble and for the real data.
Correlations are noisy measures of co-movements of financial asset prices, which are often non-stationary within the observation window. Longer time windows benefit the measure's stability, as we have more observations to estimate the $N(N-1)/2$ pairwise correlations among $N$ assets. However, a longer observation window can come with the disadvantage of weighting more and less recent co-movements equally, with the risk of averaging over a period in which the values are non-stationary. In order to compensate for this effect, we apply the exponential smoothing method for Kendall correlations (23). This allows for more stable correlations, as the method applies an exponential weighting to the correlation window, prioritising more recently observed co-movements.
D. Filtering: quantile thresholding and TMFG.
We apply two filtering techniques with fundamental differences. The first filtering method is quantile thresholding, which corresponds to hard thresholding to generate an adjacency matrix through the binarisation of individual correlations. For a correlation value $v_q$ corresponding to the quantile level $q$ of the matrix values, the adjacency matrix is defined as
$$A_{i,j} = \begin{cases} 1, & \rho_{i,j} \geq v_q, \\ 0, & \rho_{i,j} < v_q. \end{cases}$$
This filtering technique is entirely value-based, with no structural or other constraints. We apply it by providing a quantile level $q$ which yields edge sparsity analogous to that of the corresponding TMFG filter.
The second filtering technique is the TMFG method (13). This topological filtering technique embeds the matrix with topological constraints on planarity in a graph composed of simplicial triangular and tetrahedral cliques. Edges are added in a constrained fashion with priority according to their (absolute) value. The graph essentially corresponds to tiling a surface of genus 0. This technique represents a filtering method that accounts for values, but also imposes an underlying chordal structural form which might help regularise the filtered graph, also for probabilistic modelling (24). Furthermore, this technique imposes higher order structures, namely triangles and tetrahedra, which are known to be a feature of financial markets and social networks.
E. Simplicial persistence.
We focus on the temporal persistence of tetrahedral and triangular simplicial complexes (motifs) in the TMFGs and in graphs filtered via quantile thresholding, constructed from correlations over rolling windows. TMFG networks can be viewed as trees of tetrahedral (maximal) cliques connected by triangular faces; these faces are triangular cliques with a different meaning in the taxonomy, called separators. If removed, separators split the graph into two parts. Not all triangular faces of the tetrahedral cliques are separators, and we will refer to those which are not as triangles.
This distinction is discarded for the results in Section B in order to account for all triangles in the filtered graph, as quantile thresholding does not distinguish between triangular faces and separators.
A motif corresponding to clique $\mathcal{X}_c$ is considered soft-persistent at time $t + \tau$ if and only if the motif is present at both the initial time $t$ and at $t + \tau$. A visual intuition for motif (triangle) persistence through time is provided in Figure 1.
Fig. 1.
Motif persistence visualisation.
Visual representation of a TMFG structure's motif (triangle) persistence in time. The green triangle in figure a) is persistent through figure c), while the other two triangles (present in figure a) within the red triangle) do not persist due to the rewiring of an edge. Figure b) shows one of the non-persistent triangles with a dashed contour. The rewired edge is also dashed. This visualisation aims to show the impact of edge rewiring on motif persistence and the difference between edge and motif persistence.
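The soft-persistence idea illustrated in Figure 1 can be sketched on a toy graph. The snippet below is a minimal illustration and not the authors' implementation: the helper names `triangles` and `soft_persistent`, the node labels and the two edge lists are hypothetical, and triangles are found by brute force rather than extracted from a TMFG.

```python
from itertools import combinations


def triangles(edges):
    """Return every 3-clique of an undirected graph given as an edge list."""
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted({n for e in edge_set for n in e})
    return {
        frozenset(trio)
        for trio in combinations(nodes, 3)
        if all(frozenset(pair) in edge_set for pair in combinations(trio, 2))
    }


def soft_persistent(motifs_t, motifs_t_tau):
    """Motifs present at both t and t + tau (intermediate layers are ignored)."""
    return motifs_t & motifs_t_tau


# Hypothetical layers: between t and t + tau the edge C-D is rewired to B-E.
layer_t = [("A", "B"), ("B", "C"), ("A", "C"), ("B", "D"),
           ("C", "D"), ("C", "E"), ("D", "E")]
layer_t_tau = [("A", "B"), ("B", "C"), ("A", "C"), ("B", "D"),
               ("B", "E"), ("C", "E"), ("D", "E")]

tri_t = triangles(layer_t)
tri_t_tau = triangles(layer_t_tau)
persistent_triangles = soft_persistent(tri_t, tri_t_tau)

# Edges are treated as 2-node motifs, so the same intersection applies.
edges_t = {frozenset(e) for e in layer_t}
edges_t_tau = {frozenset(e) for e in layer_t_tau}
persistent_edges = soft_persistent(edges_t, edges_t_tau)

triangle_persistence = len(persistent_triangles) / len(tri_t)
edge_persistence = len(persistent_edges) / len(edges_t)
print(sorted(sorted(tri) for tri in persistent_triangles))
print(edge_persistence, triangle_persistence)
```

In this toy case a single rewired edge removes one of seven edges but two of the three triangles, which is the gap between edge and motif persistence that the figure is meant to convey.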
We investigate the decay in the number of persistent motifs between filtered correlation networks with observation windows progressively shifted by one trading day, and we quantify how the average persistence decays with the time shift $\tau$.
Here we use a form of soft persistence, which differs from the hard persistence (survival) of motifs that is more common in the literature (25, 26). Specifically, the average motif persistence in the plateau regime is defined as
$$\langle P_m(\mathcal{X}_c) \rangle_{T,\mathcal{T}} = \frac{1}{T \cdot (\mathcal{T} - \tau_{plat})} \sum_{t=0}^{T} \sum_{\tau=\tau_{plat}}^{\mathcal{T}} P_m(\mathcal{X}_c^{t,t+\tau}), \qquad [1]$$
where $\tau_{plat}$ denotes the transition point to the plateau region. The average persistence for the entire clique set over $T$ starting points at time shift $\tau$ is defined as
$$\langle P_m(\mathcal{X}^{\tau}) \rangle_{T,C} = \frac{1}{T \cdot |C|} \sum_{t=0}^{T} \sum_{c \in C} P_m(\mathcal{X}_c^{t,t+\tau}), \qquad [2]$$
where, considering the motif sets $\mathcal{X}^{t}_C = \{\mathcal{X}^{t}_i\}_{i=1,\dots,C}$ and $\mathcal{X}^{t+\tau}_C = \{\mathcal{X}^{t+\tau}_i\}_{i=1,\dots,C}$, the binary persistence value of motif $c \in C$ at times $t$ and $t + \tau$ is
$$P_m(\mathcal{X}_c^{t,t+\tau}) = (\mathcal{X}_c \in \mathcal{X}^{t}_C) \wedge (\mathcal{X}_c \in \mathcal{X}^{t+\tau}_C). \qquad [3]$$
We obtain the power law fit for the decay law and identify two regimes: one with a faster decay followed by one with a slower decay. The transition point $\tau_{plat}$ is computed by minimising the unweighted average mean squared error (MSE) of the two fits over all possible transition points in time.
We also compare the decay exponents for multiple random stock selections over different markets to identify whether the steepness of motif decay (edge, closed triad or tetrahedral clique) is indicative of market stability/development stage. We further investigate more liquid markets such as the NYSE from both a quantitative and qualitative point of view. We classify motifs in the plateau by their soft persistence and study the sector structure of the most persistent motifs.
In order to further justify the analysis of motifs over individual edges, we test the null hypothesis that motifs are formed by edges in the network whose existence is not mutually dependent.
The assumption would imply that coexistence of edges in motifs is not statistically significant and that motif structures have no extra persistence beyond the individual edges that form them. The hypothesis being tested implies that motif persistence is simply the result of persistence characterising their component edges:
$$P_m(\chi_c^{t,t+\tau}) = P_m(\chi_{c_1}^{t,t+\tau}) \cdot P_m(\chi_{c_2}^{t,t+\tau}) \cdot P_m(\chi_{c_3}^{t,t+\tau}), \qquad [4]$$
where the motif and its edges are defined as $\chi_c^{t,t+\tau} = \{\chi_{c_1}^{t,t+\tau}, \chi_{c_2}^{t,t+\tau}, \chi_{c_3}^{t,t+\tau}\}$.
In order to provide an application to systemic risk, we construct a portfolio containing all stocks in the ten most persistent motifs in the plateau region, as defined in Equation 1 (for each market). We then compare its volatility with that of random portfolios with the same number of assets.
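The two-regime fit used in this section to locate $\tau_{plat}$ can be sketched as follows. This is an illustrative reading of the procedure under stated assumptions, not the authors' code: the function names are hypothetical, the fits are ordinary least squares in log-log space, each regime is required to contain at least `min_points` observations, and the input curve is synthetic.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares y = a + b * x; returns (a, b, mse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    mse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n
    return a, b, mse


def two_regime_fit(taus, persistence, min_points=3):
    """Scan every split point and keep the one with the lowest average MSE.

    taus must be strictly positive (tau = 0 has no logarithm).
    Returns (tau_plat, decay exponent, plateau exponent).
    """
    lx = [math.log(t) for t in taus]
    ly = [math.log(p) for p in persistence]
    best = None
    for k in range(min_points, len(lx) - min_points + 1):
        _, slope_decay, mse_decay = linear_fit(lx[:k], ly[:k])
        _, slope_plat, mse_plat = linear_fit(lx[k:], ly[k:])
        score = 0.5 * (mse_decay + mse_plat)  # unweighted average of the two MSEs
        if best is None or score < best[0]:
            best = (score, taus[k], slope_decay, slope_plat)
    _, tau_plat, alpha_decay, alpha_plateau = best
    return tau_plat, alpha_decay, alpha_plateau


# Synthetic persistence curve: exponent -0.5 up to tau = 50, then -0.1.
taus = list(range(1, 201))
persistence = [t ** -0.5 if t <= 50 else 50 ** -0.5 * (t / 50) ** -0.1
               for t in taus]
tau_plat, alpha_decay, alpha_plateau = two_regime_fit(taus, persistence)
print(tau_plat, round(alpha_decay, 3), round(alpha_plateau, 3))
```

The exhaustive scan is quadratic in the number of time shifts, which is unproblematic for a few hundred values of $\tau$; on noisy empirical curves a larger `min_points` may be needed to avoid degenerate splits.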
3. Results
The main findings of this work are described in this section, starting with an overview of results on the long memory of edges and simplicial complexes in TMFG-filtered correlation networks. The section continues with an analysis of null models of financial market structures, described in Section B, and a comparison with real data to gain insights into the generative process of the stochastic structure. We then suggest how soft persistence captures the underlying change in market structure by relating its decay exponent to the stage of development (a proxy for stability) or average traded volume in the market (a proxy for liquidity, which yields well-defined stable structures). We conclude the section with results on systemic risk applications to financial portfolios, where we show that the most persistent motifs correspond to stocks in the same sector and demonstrate that the portfolio of the 10 most persistent motifs is highly volatile and systemic.
A. Long-term memory of motif structures.
The plot in Figure (2) shows the power law decay (evident from the linear trend in log-log scale) in $\langle P_m(\mathcal{X}^{\tau}) \rangle_{T=200,C}$ vs. $\tau$, followed by a plateau region that also decays as a power law, but with a smaller exponent. We also observe that all motif decays have $\tau_{plat} \in [\delta t_{window}/\ ,\ \delta t_{window}]$, where $\delta t_{window}$ represents the length of the estimation window of the correlation matrix. The window used has $\delta t_{window} = 126$ trading days and a value of $\theta = 46$ for exponential smoothing, as per (23). The choice of $\delta t_{window}$ corresponds to roughly 6 months of trading and satisfies $N < \delta t_{window}$, with $N$ the number of assets in the correlation matrix. The correlation matrix is hence well-conditioned and invertible. On the other hand, the exponential smoothing with $\theta = 46$ mainly considers recent observations from the latest few months.
There are N − N −
Fig. 2.
Persistence Decay.
Decay of the persistence of triangular clique faces, separators and clique motifs for 100 NYSE stocks, as a function of time interval δt = [0, (average over 200 starting points). The two power-law regimes are identified by the minimum MSE sum of the fits.
In Figure (2) we notice that the minimum MSE for the two linear fits is achieved at the transition point between the decay phase and the plateau. The transition point $\tau_{plat}$ can therefore be identified by minimising a standard fit measure with two phases, which strengthens the unsupervised nature of our method. The method for the minimum MSE search is described in Section E.
B. Null models of persistence in filtered structures.
We report results for the edge and motif (triangle) persistence for real data as well as for the null models described in Section B. We compare real data with null models and TMFG filtering with quantile thresholding.
Figure 3 shows the decay in edge persistence for both filtering methods. We notice that the random shuffling null model lies at the bottom, as it should produce completely random structures with little residual persistence due to probabilistic combinatorics and structural filtering constraints in the TMFG. This shows that persistence is not an artifact of any of the filtering techniques used and not a mere result of return volatility of individual assets (which is preserved by return shuffling). From Figure 3 we also notice that the rolling univariate Gaussian model lies just above, as it does not account for structure at all and only preserves rolling means and standard deviations; this shows that persistence cannot merely be attributed to common long term trends or volatility variations. This null model carries some broad sense of structure and market direction, and it shows that persistence does not merely originate from overall market trends. We then find a second cluster, of structured models, with the rolling multivariate Gaussian at the bottom. This shows that market persistence goes beyond asset means and covariance, even after spurious structures have been removed. We then find the real data, just below the stable multivariate Gaussian. This shows that markets have slowly evolving structures.
Figure 4 shows the decay in triangular motif persistence for both filtering methods. We notice results analogous to those in Figure 3 for TMFG filtered graphs. Graphs filtered through quantile thresholding instead show a high level of noise in their top cluster (where structure is present). A higher number of motifs than in the TMFG is found, but the ranking of null models is at times inconsistent, as is the position of the decay curve for real data.
We would have expected some triangles to break when looking at edge persistence only, as well as to find that the clustering coefficient decreases in persistent graphs (as it does in TMFG graphs). The clustering coefficient for quantile thresholding-persistent graphs is also found to be much higher, suggesting that the filtered structure is highly localised and clustered, while that of the TMFG is more distributed, identifying systemic groups of stocks throughout the market structure.
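For reference, the four null models ranked in the comparison above (random shuffling, rolling univariate Gaussian, stable multivariate Gaussian, rolling multivariate Gaussian) can be sketched with the Python standard library. This is a rough illustration under assumptions, not the authors' generator: all function names are hypothetical, the trailing window is taken to hold `window` observations ending at $t$, multivariate draws use a hand-written Cholesky factor, and the input series are synthetic.

```python
import math
import random
from statistics import mean, stdev


def shuffle_returns(returns, rng):
    """Model 1: permute each series independently along the time axis."""
    return [rng.sample(series, len(series)) for series in returns]


def rolling_univariate(returns, window, rng):
    """Model 2: draw r_t from N(mu, sigma) of each series' own trailing window."""
    return [
        [rng.gauss(mean(s[end - window:end]), stdev(s[end - window:end]))
         for end in range(window, len(s) + 1)]
        for s in returns
    ]


def mean_and_cov(rows):
    """Sample means and covariance matrix of N equally long series."""
    t_len = len(rows[0])
    mu = [sum(r) / t_len for r in rows]
    dev = [[x - m for x in r] for r, m in zip(rows, mu)]
    cov = [[sum(a * b for a, b in zip(di, dj)) / (t_len - 1) for dj in dev]
           for di in dev]
    return mu, cov


def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (math.sqrt(a[i][i] - s) if i == j
                         else (a[i][j] - s) / low[j][j])
    return low


def sample_mvn(mu, low, rng):
    """One draw from N(mu, L L^T) built from independent standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mu)]


def stable_multivariate(returns, rng):
    """Model 3: one fixed mean vector and covariance for the whole sample."""
    mu, cov = mean_and_cov(returns)
    low = cholesky(cov)
    draws = [sample_mvn(mu, low, rng) for _ in range(len(returns[0]))]
    return [list(col) for col in zip(*draws)]


def rolling_multivariate(returns, window, rng):
    """Model 4: mean vector and covariance re-estimated on each trailing window."""
    draws = []
    for end in range(window, len(returns[0]) + 1):
        mu, cov = mean_and_cov([r[end - window:end] for r in returns])
        draws.append(sample_mvn(mu, cholesky(cov), rng))
    return [list(col) for col in zip(*draws)]


# Hypothetical input: 3 synthetic "log-return" series of 60 observations each.
rng = random.Random(0)
returns = [[rng.gauss(0.0, 0.01) for _ in range(60)] for _ in range(3)]
shuffled = shuffle_returns(returns, rng)
uni = rolling_univariate(returns, 20, rng)
stable = stable_multivariate(returns, rng)
rolling = rolling_multivariate(returns, 20, rng)
print(len(shuffled[0]), len(uni[0]), len(stable[0]), len(rolling[0]))
```

Each generator preserves strictly more of the empirical structure than the previous one, which is what makes the ordering of the persistence curves interpretable; an ensemble is obtained by calling a generator repeatedly with the same input.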
Fig. 3.
Edge persistence decay of null models.
Edge persistence decay with δτ for the time series null models of market returns and real data for the NYSE. We notice that for both TMFG filtering and quantile thresholding the real data lies between the rolling multivariate Gaussian ensemble and the stable multivariate Gaussian ensemble. This indicates that the real market structure does evolve slowly in time, but with persistence beyond what can be inferred from estimates of its covariance structure.
[Figure 3 panels: Edge Persistence (Quantile); Edge Persistence (TMFG). Legend: shuffled ensemble (1), multivariate rolling ensemble (4), univariate rolling ensemble (2), fixed gaussian ensemble (5), hamiltonian ensemble (3), real.]
Fig. 4.
Motif (triangle) persistence decay of null models.
Motif persistence decay with δτ for the time series null models of market returns and real data for the NYSE. We notice that for TMFG filtering the real data still lies between the rolling multivariate Gaussian ensemble and the stable multivariate Gaussian ensemble (as in Figure 3). We instead notice that the decay ordering is noisier for quantile thresholding, showing how the method's focus on individual connections affects its generalisation to motifs. This is despite the higher number of motifs in the quantile thresholding graph.
[Figure 4 panels: Triangle Persistence (Quantile); Triangle Persistence (TMFG). Legend: shuffled ensemble (1), multivariate rolling ensemble (4), univariate rolling ensemble (2), fixed gaussian ensemble (5), hamiltonian ensemble (3), real.]
C. Market classification via decay exponent.
We now consider how the decay exponent of TMFG graphs behaves across markets. Table (1) compares the decay exponents for cliques, triangular motifs and clique separators in the NYSE, German stock market, Italian stock market and Israeli stock market. The decay exponent $\alpha$ is obtained from the fit based on the following expression,
$$\langle P_m(\mathcal{X}^{\tau}) \rangle_{T,C} = \beta \cdot \tau^{\alpha} \qquad [5]$$
Table 1.
Exponents for the decay power law regime computed with MSE. The analysis refers to 100 randomly selected stocks amongst the 500 most capitalised, over time intervals τ = [0, and t = [0, ..., different initial temporal network layers. For all motif analyses in this work, triangles and separators constitute non-overlapping sets, as these represent theoretically and taxonomically different structures and decay characteristics.
Market     Clique    Triangular Motif    Clique Separator
NYSE       -0.392    -0.493              -0.245
Germany    -0.792    -0.598              -0.381
Italy      -0.785    -0.811              -0.174*
Israel     -1.024    -0.866              -0.728
* Result compromised by regimes not well identified for motif decay in large systems (≈ 100 stocks).
We notice from the results in Table (1) that the NYSE, which is clearly the most developed and liquid stock market, has the lowest decay exponent (in modulus, which corresponds to the slowest decay) for both cliques and triangles. This indicates that its correlations are more stable on a shorter time window.
Germany and Italy have similar values for clique exponents, with Germany seemingly more stable in terms of triangular motifs. Israel, a younger and less liquid stock market, follows with a faster decay in both tetrahedral cliques and triangular motifs. The ordering of these markets is not clearly identifiable in clique separators, as noise in the data does not allow the two decay regimes to be correctly identified in all markets (in this case for Italy). Separators have a distinct role and meaning in the graph's taxonomy, and further work should allow for a more thorough analysis of them.
We observe promising results for a monotonically increasing relation between the decay exponent and the average daily volume of the market. The robustness of this result will be investigated in future work.
In Table (1) the decay exponent is not adjusted by the probability that all edges in the clique must be present in the temporal layer for the clique to exist. We show in Table (2) that, when adjusted by the probability of all its edges existing simultaneously, triangular motifs have a slower decay than individual edges. The results in Table (2) are obtained from a set of randomly selected stocks different from those used for Table (1). This adds further confidence in the results and their generality.
We stress that Table (2) falsifies the hypothesis that motifs are formed by edges in the network whose existence is not mutually dependent (Equation 4). This is falsified by the lower decay exponent (in modulus) for the adjusted persistence of triangular motifs, observed in every market except Italy, whose edge exponent is flagged in Table (2) as unreliable.
We can then conclude that motifs are more stable structures across temporal layers of the network, with significant interdependencies in their edges' existence.
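The adjustment behind Table (2) can be checked with a short calculation. If each edge persistence decays as $\tau^{\alpha_e}$ and Equation 4 holds, the product of three edge terms decays as $\tau^{3\alpha_e}$, so dividing a triangle's exponent by three gives a per-edge equivalent to compare with the measured edge exponent. The text does not spell the adjustment out; the division by three below is an assumption, which the tabulated values happen to match to three decimals.

```python
# Exponents copied from Table 2: (edge, triangular motif, adjusted triangular motif).
table2 = {
    "NYSE": (-0.164, -0.398, -0.133),
    "Germany": (-0.265, -0.471, -0.157),
    "Italy": (-0.144, -0.458, -0.153),  # edge exponent flagged (*) as unreliable
    "Israel": (-0.397, -0.830, -0.277),
}

EDGES_PER_TRIANGLE = 3  # assumption: Equation 4 multiplies three edge terms

summary = {}
for market, (edge, motif, adjusted) in table2.items():
    per_edge = motif / EDGES_PER_TRIANGLE
    summary[market] = {
        "per_edge": round(per_edge, 3),
        # Agreement with the tabulated adjusted value up to rounding.
        "matches_table": abs(per_edge - adjusted) < 5e-4 + 1e-9,
        # Smaller modulus means slower decay than an independent edge.
        "slower_than_edge": abs(adjusted) < abs(edge),
    }
    print(market, summary[market])
```

Under this reading the adjusted motif exponent is smaller in modulus than the edge exponent for the NYSE, Germany and Israel; Italy is the exception, and its edge exponent is the one marked as compromised in the table.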
D. Sector analysis in persistent motifs.
Figure (5) provides a visualisation of the network components formed by the ten most persistent triangles in the NYSE. We observe that all strongly persistent triangles have elements which belong to the same industry sector. Table 3 shows this for the same ten triangles displayed in Figure (5). We notice that stock prices in the sectors in Table (3) are mostly driven by sector-wide fundamentals, which justify the persistent structure in the long term. Other motifs are constituted by ETFs and their main holdings ∗.
We also investigate whether motif persistence and motif structures can be easily retrieved from the original correlation matrix. The purpose of this is to check that our TMFG filtering method is not redundant and trivially replaceable. To test this, we consider the ten most persistent triangles across the plateau region and check their overlap with the ten most correlated triplets in each unfiltered correlation matrix. We find that no more than one triangle lies in the intersection between the two sets, in each temporal layer. We also check the correlation between motif persistence and the average sum or product (results are equivalent for our purpose) of its individual edges' correlation for all unfiltered correlation layers. We observe through the Pearson and Kendall correlation values that the two measures are only loosely related, as correlation explains no more than 20% of the variance in the set of variables with large persistence.
∗ The reason for the existence of these motifs is intuitive and does not affect our analysis, as ETF-related motifs are unlikely to be present in the network formed by a random selection of stocks or by stocks in a portfolio. These motifs are present here as we focus on the 100 most capitalised securities in the NYSE, which include ETFs.
Table 2.
Exponent for the power law decay regime identified by MSE in different sample markets. The analysis refers to 100 randomly selected stocks amongst the 500 most capitalised, over time intervals τ = [0, and t = [0, ..., different initial temporal network layers.
Market     Edge       Triangular Motif    Triangular Motif**
NYSE       -0.164     -0.398              -0.133
Germany    -0.265     -0.471              -0.157
Italy      -0.144*    -0.458              -0.153
Israel     -0.397     -0.830              -0.277
* Result compromised by regimes not well identified for edge decay in large systems (≈ 100 stocks).
** Motif exponent adjusted by the probability of simultaneous edge persistence in the motif.
Fig. 5.
Persistent NYSE motifs visualised.
Network representation of the ten most persistent triangular motifs in the TMFG layers for the 100 most capitalised stocks of the NYSE.
Table 3.
Motif components and Financial Times sector affiliation for the ten most persistent motifs in the NYSE's 100 most capitalised stocks.
Security 1                Security 2               Security 3              FT Sector
Biogen Inc                Gilead Sciences Inc      Celgene Corp            Biopharmaceutical
UnitedHealth Group Inc    Cigna Corp               Anthem Inc              Health Care
Biogen Inc                Gilead Sciences Inc      Amgen Inc               Biopharma/tech
Bank of America Corp      JPMorgan Chase & Co      Morgan Stanley          Financials-Banks
Vanguard FTSE ETF**       MSCI EAFE ETF            Vanguard FTSE ETF***    Index ETFs
Invesco QQQ Trust*        Amazon.com Inc           Alphabet Inc            Tech
ConocoPhillips            Schlumberger NV          Exxon Mobil Corp        Oil & Gas
NVIDIA Corp               Texas Instruments Inc    Broadcom Inc            Tech Hardware
Chevron Corp              Schlumberger NV          Exxon Mobil Corp        Oil & Gas
Chevron Corp              ConocoPhillips           Schlumberger NV         Oil & Gas
* ETF on NASDAQ - Top Holdings include Amazon, Facebook, Apple, Alphabet
** Vanguard FTSE Developed Markets Index Fund ETF Shares
*** Vanguard FTSE Emerging Markets Index Fund ETF Shares
E. Portfolio volatility and systemic risk of persistent motifs vs. random portfolios.
Portfolio volatility distribution for the 100 most capitalised stocks in
We check that a portfolio formed by the 10 most persistent motifs in each market has a highly enhanced out of sample volatility due to its stable correlations.
To do this, we consider the volatility of the motif portfolio and a distribution of volatilities for 10 randomly selected portfolios with the same number of stocks.
As expected, we observe the motif portfolio to yield a volatility $\mathrm{vol}_{motif}$ close to the higher end of the distribution, i.e. $(\mathrm{vol}_{motif} - \langle \mathrm{vol}_{random} \rangle) > \ \cdot\ \sigma(\mathrm{vol}_{random})$, throughout the considered markets. We should highlight that the volatility of portfolios is evaluated out of sample with respect to the period the persistence was calculated on, showing that this method is not only observational, but also predictive.
Due to the more theoretical nature of this work, we refer the interested reader to the work by some of the authors of this paper for a more thorough analysis of portfolio applications and forecasting (27).
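The comparison between the motif portfolio and random portfolios can be sketched as a z-score of volatilities. The snippet is a hedged illustration only: the function names, the equal portfolio weights and the synthetic returns are all assumptions, and it ignores the in-sample/out-of-sample split that the paper applies. Six synthetic stocks share a common factor to stand in for the members of persistent motifs.

```python
import random
from statistics import mean, stdev


def portfolio_volatility(returns, members):
    """Standard deviation of the equally weighted portfolio return series."""
    t_len = len(returns[members[0]])
    portfolio = [mean(returns[m][t] for m in members) for t in range(t_len)]
    return stdev(portfolio)


def volatility_z_score(returns, motif_members, n_random, rng):
    """(vol_motif - <vol_random>) / sigma(vol_random) for same-size random portfolios."""
    universe = sorted(returns)
    vol_motif = portfolio_volatility(returns, motif_members)
    vol_random = [
        portfolio_volatility(returns, rng.sample(universe, len(motif_members)))
        for _ in range(n_random)
    ]
    return (vol_motif - mean(vol_random)) / stdev(vol_random)


# Hypothetical data: 30 synthetic stocks; the first 6 load on a common factor,
# standing in for the stocks of persistently co-moving motifs.
rng = random.Random(1)
n_days = 250
factor = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
returns = {}
for i in range(30):
    loading = 1.0 if i < 6 else 0.0
    returns[f"S{i:02d}"] = [loading * f + rng.gauss(0.0, 1.0) for f in factor]

motif_members = [f"S{i:02d}" for i in range(6)]
z = volatility_z_score(returns, motif_members, n_random=10, rng=rng)
print(round(z, 2))
```

A positive z-score of several standard deviations is what a portfolio of strongly co-moving stocks should produce, because its common component does not diversify away; with only 10 random portfolios, as in the text, the estimate of the spread is itself noisy.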
4. Discussion
The power law decay of edge and simplicial soft persistences reported in Figure 2 suggests that market structures are characterised by a slow evolution which allows for long memory in temporal layers. This decay type is in contrast with an exponential decay of the persistence, which would instead imply short or no memory in the system. This observation is in line with the works by Bouchaud et al. and Lillo et al. in (16, 18, 19, 28), where power law decays in autocorrelation are identified as manifestations of long-memory processes in efficient markets. However, it extends the concept to higher order structures.
The comparison between soft persistence in correlation structures from real data and artificial data generated from different null models (Figs. 3 and 4) demonstrates that the persistence of real structures goes beyond all univariate null models, hence confirming long memory as a characteristic requiring structural constraints. We also demonstrate that real structures exceed the persistence of the rolling multivariate Gaussian, hence suggesting that pairwise covariances and moving averages do not suffice to induce the long memory present in real markets. As per the analysis on motif persistence beyond that of individual edges, we suggest that higher order relations in terms of structural evolution are present. The ordering of null models in Figure 3 further supports the validity of the persistence measure.
The comparison of simplicial persistence of triangles between quantile thresholding and TMFG filtered graphs, reported in Figure 4, reveals that quantile thresholding struggles to separate the decay of real structures from that of rolling Gaussian generated ones.
This could be attributed to the “local” nature of the method, which matches the pairwise interpretation of relations in generating from a rolling Gaussian. TMFG filtered graphs instead, perhaps due to their non-local embedding, provide a consistent ordering of null models with relatively low noise.

The ability to correctly identify persistent motifs throughout the market sample is essential, as the most persistent motifs were found to be highly systemic (Section E). Persistent structures in quantile thresholded graphs present higher and more stable clustering coefficients. This suggests a very localised and compact structure. TMFG filtered graphs instead present a lower clustering coefficient which decays with τ, as expected since some structures break. This is further evidence of the ability of the TMFG filtering method to identify meaningful persistent structures throughout the market. The issue with quantile thresholding is likely due to the method being merely value-based with no sensible structural constraint, unlike the TMFG.

The ranking of national markets based on their decay exponents in Table 1 can be interpreted in terms of the reduction of estimation noise in more liquid markets, as large deviations become less likely and correlations, as well as prices, become more reflective of the underlying generative processes and structures. Structures are perhaps clearer too, and deviations are exploited more quickly if they emerge. This suggests that more efficient and capitalised markets are characterised by structures which are more stable in time and better reflected by the data. The decay exponent ranking also leads to the conclusion that more developed markets are characterised by more meaningful underlying structures and cliques, suggesting that systemic risk may represent a greater threat in developed markets.

The results in Table 2 support the hypothesis that motifs constitute meaningful structures in markets, beyond their individual edges.
These results test the independence null model of individual edges in motif formation and show solid evidence to reject it. We can then conclude that highly persistent motifs are not a mere consequence of highly persistent individual edges, but also of the correlation in those edges existing concurrently. This result ties in with the above discussion on the issues with locality of filtering methods and generative processes.

Table 3 strengthens the importance of persistent motifs. Indeed, the ten most persistent motifs visualised in Figure 5 are representative of industry sectors in the NYSE. These sectors are not identified by the motifs with higher edge correlation, which are instead dominated by motifs often due to correlation noise in high volatility stocks. Persistence and the identification of persistent motifs are hence found to be non-trivial with respect to the correlation strength of individual edges or motifs. The impact on portfolio diversification of the motifs in Figure 5 indicates that these structures are highly relevant for systemic risk and portfolio volatility, with high predictive power provided by the long memory property of persistence, which is an intrinsic temporal feature. As these motifs are not characterised by noticeably strong correlations, a common variance optimisation of the portfolio is unlikely to optimise the weights to sufficiently minimise the risk from these highly systemic structures.

The systemic relevance of persistent motifs as well as their out of sample forecasting power are shown by the results in Section E and in (27), where significantly higher out of sample portfolio volatility is observed for the portfolio of persistent motifs. The motif portfolio volatility is significantly above both the mean and median of the random portfolios’ volatility distribution.

This is a first example of how just selecting stocks from the ten most persistent motifs forms a portfolio with higher long term volatility.
Clearly, when aiming to reduce systemic risk, the objective is the opposite: low volatility. The observations from Section E and (27) lay the ground for the construction of portfolios where asset weights aim to reduce the volatility originating from persistent correlations in motif structures.
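The independence null model for edges discussed in this section can be illustrated with a toy calculation. The sketch below is not the test behind Table 2. It makes the simplifying assumption that the persistence of a structure is the fraction of temporal layers in which it appears, and compares the observed frequency of a triangle with the product of the frequencies of its three edges, which is the value expected if edges appeared independently. The names `edge_key`, `frequency` and `triangle_vs_independence` are hypothetical.

```python
from itertools import combinations


def edge_key(a, b):
    """Canonical (sorted) representation of an undirected edge."""
    return (a, b) if a <= b else (b, a)


def frequency(layers, predicate):
    """Fraction of temporal layers for which predicate(layer) is true."""
    return sum(1 for layer in layers if predicate(layer)) / len(layers)


def triangle_vs_independence(layers, triangle):
    """Observed triangle frequency vs. the edge-independence expectation.

    layers: list of sets of canonical edges, one set per temporal layer
    triangle: three node labels
    Returns (observed, expected), where expected is the product of the
    three individual edge frequencies.
    """
    edges = [edge_key(a, b) for a, b in combinations(triangle, 2)]
    observed = frequency(layers, lambda layer: all(e in layer for e in edges))
    expected = 1.0
    for e in edges:
        expected *= frequency(layers, lambda layer, e=e: e in layer)
    return observed, expected


# Toy example: the three edges always appear together, in half of the layers.
full = {edge_key("A", "B"), edge_key("B", "C"), edge_key("A", "C")}
layers = [full, full, set(), set()]
observed, expected = triangle_vs_independence(layers, ("A", "B", "C"))
print(observed, expected)
```

In this toy case each edge is present in half of the layers, so independence would predict the triangle in one eighth of them, whereas it is observed in half. An observed frequency well above the independence expectation is the kind of evidence that points to motifs being more than the sum of their edges.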
5. Conclusion
The present work introduces the concept of simplicial persistence, focusing on the soft persistence in simplicial cliques. This measure is applied to a complex system with a slowly evolving stochastic structure, namely financial markets. The graph structures are obtained from Kendall correlations with exponential smoothing and filtered with the TMFG or through quantile thresholding. The slow evolution of these systems with time manifests long memory in their structure, with a two regime power law decay in persistence with time. The transition point between regimes is identified in an unsupervised way with mean-squared error minimisation.

Null models of market structure are then used to test hypotheses about the generative process underlying the system. Two persistence decay clusters are observed, where the least persistent corresponds to null models with no structural constraints and the upper one (most persistent) comprises the rolling multivariate Gaussian (lowest), real data, and the stable multivariate Gaussian (highest).

Simplicial persistence of higher order structures in real data and null models is hardly recognised by value-based thresholding methods, which are unable to identify persistent cliques throughout the market sample. Decay exponents for different markets are then observed to provide a ranking corresponding to their liquidity or stage of development, which suggests that, despite these systems being less predictable in their individual series, they are more stable and predictable in terms of structure.
The most persistent motifs are found to correspond to sectors where the price of stocks is mostly driven by sector-wide fundamentals.

Based on the ability of simplicial persistence to forecast and identify strongly correlated clusters of stocks, the impact of persistence-based systemic risk on portfolio volatility is verified with a comparison between the portfolio of the ten most persistent motifs and random portfolios of the same size (27).

The present work provides further evidence of how network analysis and complex systems can enhance our understanding of real world systems beyond traditional methods. Our results and methods lay the ground for future studies and modelling of the evolution of stochastic structures with long memory.
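As an illustration of the two regime power law decay and of the identification of the transition point by mean-squared error minimisation, the following sketch fits two straight lines in log-log space and scans all admissible breakpoints. It is one plausible reading of the procedure rather than the implementation used for the reported results: the ordinary least squares fit, the `min_points` parameter and the function names `fit_line` and `two_regime_power_law` are assumptions made for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b * x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def two_regime_power_law(taus, persistence, min_points=3):
    """Fit persistence ~ tau^(-alpha) separately on two regimes.

    Every split index k is tried; the one minimising the overall
    mean-squared error in log-log space is kept. tau_break is the first
    tau assigned to the second regime.
    """
    lx = [math.log(t) for t in taus]
    ly = [math.log(p) for p in persistence]
    best = None
    for k in range(min_points, len(lx) - min_points + 1):
        _, b1, e1 = fit_line(lx[:k], ly[:k])
        _, b2, e2 = fit_line(lx[k:], ly[k:])
        mse = (e1 + e2) / len(lx)
        if best is None or mse < best[0]:
            best = (mse, taus[k], -b1, -b2)
    mse, tau_break, exponent_1, exponent_2 = best
    return {
        "tau_break": tau_break,
        "exponent_1": exponent_1,
        "exponent_2": exponent_2,
        "mse": mse,
    }


# Synthetic decay: exponent 0.2 up to tau = 10, exponent 0.8 afterwards.
taus = list(range(1, 41))
persistence = [
    t ** -0.2 if t <= 10 else 10 ** -0.2 * (t / 10) ** -0.8 for t in taus
]
fit = two_regime_power_law(taus, persistence)
print(fit)
```

On real persistence curves the two fitted exponents would play the role of the decay exponents used to rank markets. An exponential decay, by contrast, would appear as a curve rather than a straight line in log-log space and would be poorly described by either segment.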
6. Acknowledgments
TA and JT acknowledge the EC Horizon 2020 FIN-Tech project for partial support and useful opportunities for discussion. JT acknowledges support from EPSRC (EP/L015129/1). TA acknowledges support from ESRC (ES/K002309/1), EPSRC (EP/P031730/1) and EC (H2020-ICT-2018-2 825215).
1. Newman M (2018) Networks. (Oxford University Press).
2. Strogatz SH (2001) Exploring complex networks. Nature
The European Physical Journal B-Condensed Matter and Complex Systems
Available at SSRN 3294548.
5. Cimini G, et al. (2019) The statistical physics of real-world networks. Nature Reviews Physics
arXiv preprint arXiv:1903.10805.
7. Masuda N, Kojaku S, Sano Y (2018) Configuration model for correlation matrices preserving the node strength. Physical Review E
Physica A: Statistical Mechanics and its Applications
PloS one
arXiv preprint arXiv:1902.07074.
11. Marcaccioli R, Livan G (2019) A Pólya urn approach to information filtering in complex networks. Nature Communications
Proceedings of the National Academy of Sciences
Journal of Complex Networks
arXiv preprint arXiv:1905.02266.
15. Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance
Studies in Nonlinear Dynamics & Econometrics
Phys. Rev. E
Handbook of financial markets: dynamics and evolution. (Elsevier), pp. 57–160.
19. Di Matteo T, Aste T, Dacorogna MM (2005) Long-term memories of developed and emerging markets: Using the scaling analysis to characterize their stage of development. Journal of Banking & Finance
Phys. Rev. E
Phys. Rev. E
Phys. Rev. E
The European Physical Journal B
Phys. Rev. E
BMC bioinformatics
Network Theory in Finance
International Conference on Complex Networks and Their Applications. (Springer), pp. 573–585.
28. Bouchaud JP, Gefen Y, Potters M, Wyart M (2004) Fluctuations and response in financial markets: the subtle nature of ‘random’ price changes.