Localization in covariance matrices of coupled heterogenous Ornstein-Uhlenbeck processes

Abstract

We define a random-matrix ensemble given by the infinite-time covariance matrices of Ornstein-Uhlenbeck processes at different temperatures coupled by a Gaussian symmetric matrix. The spectral properties of this ensemble are shown to be in qualitative agreement with some stylized facts of financial markets. Through the presented model, formulas are given for the analysis of heterogeneous time series. Furthermore, evidence is found for a localization transition in the eigenvectors related to small and large eigenvalues in the cross-correlation analysis of this model, and a simple explanation of localization phenomena in financial time series is provided. Finally, we identify both in our model and in real financial data an inverted-bell effect in the correlation between localized components and their local temperature: high and low temperature/volatility components are the most localized ones.


Paolo Barucca
Dipartimento di Fisica, Università La Sapienza, P.le A. Moro 2, I-00185 Roma, Italy
(Dated: November 12, 2018)

I. INTRODUCTION

Complex systems are hard to analyse since, by definition, the interactions among their components are not easily connected with their behaviours [1]. In these systems the absence of a well-defined general model makes correlation analysis an irreplaceable, if not unique, compass [2, 3]. Furthermore, the presence of noise makes benchmarking important, and random matrix theory (RMT) is fundamental to check the statistical validity of pair-correlations.

RMT has mainly focused on the effects of the finite length of time series. In particular, a careful analysis has been carried out on the spectral properties of random matrices in the case where the number of variables $N$ is large and the length of the signal $M$ is comparable, i.e. with a finite ratio $Q = M/N$ [4–8]. In this case the total time is not large enough to make the noise negligible: one needs to disentangle the properties induced by couplings from the ones brought by randomness.

Nevertheless, time series in complex systems are not only noisy and finite but also heterogeneous, which means that the variance of one time series can be very different from the variance of another. More generally, the marginal distribution of one variable may be qualitatively and quantitatively different from that of another variable.

In finance, on which we will focus our considerations, the volatilities of different assets, i.e. the index of the percentage change in stock prices, have a very broad distribution [9], i.e. there is a strong heterogeneity between the returns of different assets. Recent studies have shown that this distribution is similar to a log-normal, which is compatible with a fractal model of the market [10, 11]. This feature has been included in models based on the random matrix Wishart ensemble to improve the comparison with real matrices [12–14].

Summarising, complex systems are heterogeneous, disordered and noisy, and they have a non-trivial relationship between interactions and correlations: carefully studied benchmarks are needed to gain a more detailed insight. In the following we will see how these different features are interconnected, and we will point out how important it is to consider them together in order to predict their effects on cross-correlation analysis.

The aim of this article is to observe the consequences of heterogeneity in a simple ad hoc model that allows one to compute explicitly the relation between couplings and correlations.

In Section II we start from the basic dynamical model given by a set of independent Ornstein-Uhlenbeck (OU) processes at different temperatures. Then we turn to the interacting case where the OU processes are coupled through a given matrix. The ensemble we consider is the one given by the infinite-time covariance matrices of OU processes at different temperatures coupled by a Gaussian symmetric matrix. We also consider the stationary distribution of the induced time series and show the relation with the known Wishart-Laguerre ensemble of random matrices.

In Section III we show the results of numerical simulations in the asymptotic limit. Varying the heterogeneity, we compute the spectral density of eigenvalues, the inverse participation ratio (IPR), a standard index of eigenvector localization [15], and the component participation ratio (CPR), which measures the contribution of a given component to all the eigenvectors. We check the properties of this ensemble both in averaged and in single-sample eigenvectors. Moreover, we identify a steep change in eigenvector localization driven by heterogeneity, which might be an indicator of a transition from an extended phase towards a localized phase in the eigenvectors of the cross-correlation matrix of the model. Finally, we discuss the results with respect both to the known spectral properties of random matrix models and to the real localization properties widely observed in financial data [16, 17], and give theoretical perspectives.

II. COUPLED HETEROGENEOUS OU PROCESS

A. Independent OU processes

In the following we will consider signals extracted from the equilibrium distribution of a continuous-time stochastic dynamics. The interest of this model for applications relies on the hypothesis that in complex systems observations are samplings from a complicated noisy dynamics; in finance, for instance, daily prices are the result of all the small price adjustments given by all the transactions.

We would like to stress, though, that we do not want to model a particular asset dynamics in detail: each class of assets may require a different dynamics and more complicated non-linear interaction terms, which would not allow explicit formulas for the direct problem (from couplings to correlations) and the inverse problem (from correlations to couplings).

The aim is to construct a null model including a specific parametrization that separates couplings and temperatures, in order to distinguish explicitly their roles in the covariance matrix. We start our analysis from a limit case, the noisy dynamics of $N$ independent variables $x = \{x_1, x_2, \ldots, x_N\}$ following a standard OU process with a set of $N$ temperatures $T = \{T_1, T_2, \ldots, T_N\}$:

$$\dot{x}_i = -x_i + \sqrt{T_i}\,\eta_i(t), \tag{1}$$

where $\eta_i(t)$ is a delta-correlated Gaussian noise with $\langle \eta_i(t)\,\eta_j(t') \rangle = 2\,\delta_{ij}\,\delta(t - t')$. In this case the marginal equilibrium distribution for $x_i$ is

$$P_i(x_i) = \frac{e^{-x_i^2/(2T_i)}}{\sqrt{2\pi T_i}}. \tag{2}$$

If we now sample the values of all the $x_i$'s at $M$ times, we can compute the empirical covariance coefficients $C_{ij} = \overline{x_i x_j} - \overline{x_i}\,\overline{x_j}$, where $\overline{\,\cdot\,}$ indicates the average over the $M$ sampled times.

For an infinite value of the ratio $Q = M/N$ the covariance matrix converges towards a diagonal one, $C_{ij} = T_i \delta_{ij}$. For a finite value of $Q$, instead, the $N(N-1)/2$ off-diagonal elements of $C$ are random variables with zero mean and variance $T_i T_j / M$.

In this case the Pearson correlation matrix $c_{ij} = C_{ij}/\sqrt{C_{ii} C_{jj}}$ has exactly the same statistics as a matrix extracted from the widely used Wishart-Laguerre ensemble of random matrices, since its elements are the pair-correlations of $N$ normally distributed signals of length $M$. The heterogeneity we have put in the dynamics plays no role in the correlation matrix in this case.

B. Coupled OU processes

The generalisation to the coupled case is interesting. The dynamics now satisfies

$$\dot{x}_i = -\sum_j J_{ij}\, x_j + \sqrt{T_i}\,\eta_i(t), \tag{3}$$

where $J_{ij}$ is symmetric and positive-definite in order to ensure a finite limit to the process. In the following we will focus our analysis on the asymptotic limit, since in the present work we are not interested in the consequences of the interplay of finite $Q$ and heterogeneity but solely in the consequences of the latter. In this system there are two different methods [18] to obtain a closed formula for the asymptotic covariance matrix, $C_{ij} = \langle x_i x_j \rangle$, as a function of couplings and temperatures ($\langle \cdot \rangle$ indicates the average over an infinite time). Starting from the dynamics, with a few standard steps it is possible to find the implicit formula

$$\{C, J\} = 2\tilde{T}, \tag{4}$$

where $\tilde{T}_{ij} = T_i \delta_{ij}$ and $\{\cdot,\cdot\}$ denotes the matrix anti-commutator. From the spectral decomposition of $J$ it is possible to find a set of explicit formulas for the elements of $C$:

$$C_{ij} = 2 \sum_{a,b} \frac{u^a_i\, u^b_j}{\lambda_a + \lambda_b} \sum_k u^a_k\, u^b_k\, T_k, \tag{5}$$

where $u^a_i$ is the $i$-th component of the $a$-th eigenvector of $J$ and $\lambda_a$ is the $a$-th eigenvalue. In (4) $C$ and $J$ appear in a symmetric form, and the same symmetry must also hold in (5). This fact implies that (5) can be used to solve the inverse problem for this system, that is, finding the couplings $J$ given the covariances $C$. This symmetry is not surprising since it also holds in the familiar homogeneous case where $C = J^{-1}$, an ostensibly symmetric formula.
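As an illustration of how formula (5) can be evaluated in practice, the following is a minimal NumPy sketch, not the author's code. The function name `covariance_from_couplings`, the matrix size, the coupling strength and the random seed are all assumptions chosen for the example. It builds $C$ from the eigen-decomposition of $J$ and then measures how well the anti-commutator relation (4) is satisfied.

```python
import numpy as np

def covariance_from_couplings(J, T):
    """Infinite-time covariance C solving {C, J} = 2 diag(T), via Eq. (5).

    Hypothetical helper: J is a symmetric positive-definite array (N x N),
    T is a 1-D array of N temperatures.
    """
    lam, U = np.linalg.eigh(J)          # J = U diag(lam) U^T, eigenvectors in columns
    if lam.min() <= 0:
        raise ValueError("J must be positive-definite")
    # Temperatures rotated to the eigenbasis of J: sum_k u^a_k T_k u^b_k
    T_rot = U.T @ (T[:, None] * U)
    # In that basis the anti-commutator equation decouples element by element
    C_rot = 2.0 * T_rot / (lam[:, None] + lam[None, :])
    return U @ C_rot @ U.T              # rotate back to the original basis

# Small illustrative example (sizes and strengths are arbitrary choices)
rng = np.random.default_rng(0)
N = 6
A = rng.normal(size=(N, N))
J = np.eye(N) + 0.1 * (A + A.T) / 2     # weakly coupled, symmetric
T = np.exp(rng.normal(size=N))          # heterogeneous temperatures

C = covariance_from_couplings(J, T)
residual = np.abs(C @ J + J @ C - 2 * np.diag(T)).max()   # should be ~ machine precision
C_hom = covariance_from_couplings(J, np.ones(N))          # homogeneous case, T_i = 1
print("max residual of {C, J} = 2 T~:", residual)
```

With all temperatures equal to one, the same function should reproduce the homogeneous result $C = J^{-1}$, which gives a quick sanity check of the implementation.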
In Appendix A we examine the consequences for the Pearson correlation matrix $c$ in the case of small couplings.

We have thus defined two different random-matrix ensembles. The first, which we will examine in the next Section, is the set of infinite-time covariance matrices defined by formula (5) for coupling matrices $J$ sampled from a given random-matrix ensemble (for instance the Gaussian ensemble) and for sets of temperatures $T$ sampled from a distribution chosen at will. The second (Appendix B) is the set of finite-time empirical covariance matrices between signals sampled from the stationary distribution of the OU dynamics for a given infinite-time $C$.

III. SAMPLING MATRICES

Since we are interested in the consequences of heterogeneity, we directly use the infinite-time asymptotic formula (5), so that we avoid simulating the whole stochastic dynamics. Thus we generate a random coupling matrix $J = I + \epsilon K$, where $I$ is the identity matrix, $\epsilon$ is the strength of the coupling among signals and $K$ is a random Gaussian matrix whose elements have variance $N$. $J$ must be positive-definite for any $N$, so we discarded samples with non-positive eigenvalues, which have vanishing probability as $N$ goes to infinity. In principle it is possible to consider any kind of probability measure for couplings and temperatures; the main idea addressed here is to regard couplings as homogeneous, so that temperatures are the only source of heterogeneity.

FIG. 1. For fixed $N = 100$ and $\epsilon$ = 0.[...]/$\sqrt{N}$ we plot the spectral density $\rho(\lambda)$ of the correlation matrix $C$ for $D$ = 0.[...], [...] samples. Increasing $D$, we see that the lower edge of the spectrum becomes smaller and smaller and, conversely, that the upper edge increases.

Since in the financial context temperatures represent volatilities, which are typically log-normally distributed [10, 11], we choose to draw them from this kind of distribution:

$$p(T) = \frac{1}{T}\,\frac{e^{-(\log T - \mu)^2/(2D^2)}}{\sqrt{2\pi D^2}}. \tag{6}$$

Namely, we generate $N$ normally distributed random numbers $\xi_i$ and define $T_i = e^{\mu + D\xi_i}$. Then we fix $\epsilon$ and draw the coupling matrix $J$, diagonalise it and use (5) to obtain $C$. Varying $D$, $\epsilon$ and $N$ we observe some basic features of the matrix $C$. First we compute the eigenvalue distribution changing $N$ at fixed $D = 1$ and $\epsilon$ = 0.[...]/$\sqrt{N}$, and we notice that, as $N$ increases, the distribution rapidly converges towards an infinite-size spectrum. Once this is verified, we study the eigenvalue distribution varying $D$ alone.
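The sampling procedure just described can be sketched as follows. This is a hedged illustration rather than the code used for the figures: the function names, the prefactor `c = 0.3` in $\epsilon = c/\sqrt{N}$, the choice $\mu = 0$, the unit-variance normalisation of the off-diagonal entries of $K$ and the seed are all assumptions made for the example. The sketch also assumes that the spectrum of interest is that of the Pearson correlation matrix built from $C$.

```python
import numpy as np

def sample_covariance(N, c, mu, D, rng):
    """Draw one infinite-time covariance matrix C (hypothetical helper).

    J = I + eps*K with eps = c/sqrt(N); temperatures T_i = exp(mu + D*xi_i).
    The normalisation of K below is an assumption made for this sketch.
    """
    eps = c / np.sqrt(N)
    while True:
        A = rng.normal(size=(N, N))
        K = (A + A.T) / np.sqrt(2.0)            # symmetric Gaussian matrix
        lam, U = np.linalg.eigh(np.eye(N) + eps * K)
        if lam.min() > 0:                       # discard non-positive-definite samples
            break
    T = np.exp(mu + D * rng.normal(size=N))     # log-normal temperatures, Eq. (6)
    T_rot = U.T @ (T[:, None] * U)              # temperatures in the eigenbasis of J
    C = U @ (2.0 * T_rot / (lam[:, None] + lam[None, :])) @ U.T   # Eq. (5)
    return C, T

def correlation_spectrum(C):
    """Eigenvalues of the Pearson correlation matrix c_ij = C_ij / sqrt(C_ii C_jj)."""
    d = np.sqrt(np.diag(C))
    return np.linalg.eigvalsh(C / np.outer(d, d))

rng = np.random.default_rng(1)
for D in (0.0, 0.5, 1.0):
    C, T = sample_covariance(N=100, c=0.3, mu=0.0, D=D, rng=rng)
    ev = correlation_spectrum(C)
    print(f"D={D}: lambda_min={ev.min():.3f}, lambda_max={ev.max():.3f}")
```

Collecting the eigenvalues over many samples and building a histogram gives an estimate of the spectral density; a single sample, as printed here, only indicates where the edges of the spectrum lie for each $D$.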
The spectrum spreads on both edges, as is often observed in real data analysis (Fig. 1). Thus, introducing heterogeneity, we have new eigenvalues, both small and large, so we examine the related eigenvectors and check whether they are statistically different from the ones in the homogeneous bulk of the spectrum. We characterise the eigenvectors of $C$, $v^a_i$, through the IPR, a standard quantity in matrix analysis, defined by the formula

$$\mathrm{IPR}_a = \sum_i (v^a_i)^4. \tag{7}$$

FIG. 2. For each ordered eigenvalue we plot the mean value of the $n$-th IPR versus the eigenvalue index, averaged over 1000 samples for a system size $N = 100$ and a value of $\epsilon$ = 0.[...]/$\sqrt{N}$, for $D$ = .74 and $\mu$ = 7.74 (as obtained from real data volatilities). Crosses show the IPR averaged over 10 matrices of daily asset returns from NYSE from the first of June 1987 to the 31st of December 1998. The $J^{-1}$ line is the equal-temperatures case ($D = 0$). We see that the largest eigenvector, representing the market, is extended and falls exactly on the $D = 0$ line.

Obviously, IPR values depend on the sample. Since we want to characterise the typical behaviour, we take for each sample the set of ordered eigenvalues and consider their IPR, then we average the IPRs over samples (Fig. 2). The real data used are a set of 1017 daily asset returns from NYSE from the first of June 1987 to the 31st of December 1998. In order to compare qualitatively with data, we fixed the parameters of the log-normal distribution by evaluating the mean and standard deviation of the logarithm of the return variances, namely $\mu = \frac{1}{N}\sum_{k=1}^{N} \log(\sigma_k)$ and $D^2 = \frac{1}{N}\sum_{k=1}^{N} (\log(\sigma_k) - \mu)^2$, where the $\sigma_k$'s are the empirical variances. The figure we obtain shows localization at the edges, a common feature observed in real data analysis. In particular, the IPR shows agreement not only in the typical flat region related to the bulk, where its value fluctuates slightly above $3/N$, but also on the edges (see the IPR in [17]), where we observe an increase of the IPR.

We then evaluate the level spacings, $\lambda_{n+1} - \lambda_n$, where $\lambda_n$ is the $n$-th eigenvalue of the covariance matrix, and observe a clear left-shift in the spacings distribution [19, 20], meaning that the skewness of the spacings increases with heterogeneity (Fig. 3), approaching real data.

FIG. 3. Distribution $p(s)$ of level spacings normalised by their mean value, $s_n = \frac{\lambda_{n+1} - \lambda_n}{\langle \lambda_{n+1} - \lambda_n \rangle}$, where $\lambda_n$ is the $n$-th eigenvalue of the covariance matrix. Data are presented in a log-log scale. Crosses show the spacings averaged over 10 matrices of daily asset returns from NYSE from the first of June 1987 to the 31st of December 1998. The $J^{-1}$ line is the equal-temperatures case, $D = 0$. Null-model data are averaged over 10 samples for a system of size $N = 100$ with $\mu$ and $D$ parameters obtained from real data.

To observe the heterogeneity effect we also need to consider a matrix observable that depends not on the eigenvector, as the IPR does, but on the component, so we study the component participation ratio, which we define by the formula

$$\mathrm{CPR}_i = \sum_a (v^a_i)^4, \tag{8}$$

which is just the equivalent of the IPR for the transposed change-of-basis matrix. We investigate the relation between CPR and heterogeneity by evaluating the correlations between CPR and both $T$ and $1/T$, constructing the scatter plot $(\log(T_i), \mathrm{CPR}_i)$. For real data we decided to approximate the different temperatures with the diffusion terms [21], so we plot $(\log(D^{(e)}_i), \mathrm{CPR}_i)$ (Fig. 4), where $D^{(e)}_i = \frac{1}{T}\sum_{t=1}^{T-1} (r_i(t+1) - r_i(t))^2$, $r_i(t)$ being the return of asset $i$ at time $t$. The effect also holds when considering variances versus CPR.

The inverted-bell shape indicates that high and low temperature/volatility components are the most localized ones. This result depends both on the presence of couplings and on heterogeneous temperatures/volatilities: with no couplings the covariance matrix would be diagonal, so all the eigenvectors would be localized, and with too low heterogeneity the differences between diffusion terms would be negligible and would not affect localization so clearly.

An explanation for this effect can be obtained by considering the uncoupled case, where every eigenvector is sharply localized since the matrix is diagonal. If we now put a coupling between the components, the ones in the bulk, with closer eigenvalues, are likely to interact and spread, while the ones on the edges are related to more isolated eigenvalues, so they are less likely to mix with others and will stay more localized. This picture should hold until the couplings are large enough to counteract the differences in temperature.

FIG. 4. Scatter plot of the components in the plane $(\log(T_i), \mathrm{CPR}_i)$. We can see an inverted-bell shape that is absent in GOE matrices, i.e. with no heterogeneity. Real and null-model data are over 10 matrices of size $N = 100$. For null-model data we used the values for $\mu$ and $D$ obtained from real data.

IV. CONCLUSION

We have analysed a simple model of complex systems that provides a method for sampling random matrices. We have shown that our method gives results in agreement with the eigenvector localization that is ubiquitous in real data. This model suggests that heterogeneity among signals is likely to cause localization, as also indicated by known random band models [22, 23]. The analysis showed the peculiar characteristic that localization involves both the noisiest signals and the most deterministic ones: the inverted-bell effect. Another interesting aspect is the effect of heterogeneity on localization in the proposed model, which shows a non-trivial transition from a coupling-dominated phase, where spectral properties are the same as those of Wishart matrices, towards a heterogeneity-dominated phase, where localization on the edges of the spectrum occurs. A theoretical perspective is to establish whether the effect arises from a simple crossover or from a real phase transition, valid also in the thermodynamic limit, i.e. for infinite $N$, and possibly to characterise the two phases in more detail by examining other matrix properties as well. To improve the comparison with real data, especially in finance, another perspective is to characterise the case of finite time-samplings, i.e. finite ratio $Q = M/N$, and to check how the interplay of heterogeneities, couplings and finite time-samplings changes the properties of the covariance matrix in a benchmark case.

The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement No. 247328.

V. APPENDIX A

We showed in the general case how couplings, covariances and heterogeneities are related. Here we show, in the perturbative limit of small couplings, what happens when passing from the covariance to the correlation matrix.

We write $J = I + \epsilon K^{(1)}$, where $I$ is the identity matrix, $K^{(1)}$ is a random symmetric Gaussian matrix and $\epsilon$ is an arbitrarily small real number. At first order in $\epsilon$ the covariance matrix must satisfy the perturbative expression $C = \tilde{T} + \epsilon\,\Sigma^{(1)}$. Consequently, from (4), $K^{(1)}$ and $\Sigma^{(1)}$ satisfy

$$K^{(1)}_{ij}\,(T_i + T_j) = -2\,\Sigma^{(1)}_{ij}. \tag{9}$$

Furthermore $C_{ii} = T_i + \epsilon\,\Sigma^{(1)}_{ii}$, so for the covariance matrix we have

$$C_{ij} = T_i\,\delta_{ij} - \frac{\epsilon}{2}\, K^{(1)}_{ij}\,(T_i + T_j), \tag{10}$$

while the off-diagonal elements of the correlation matrix $c_{ij} = C_{ij}/\sqrt{C_{ii} C_{jj}}$ satisfy, at first order,

$$c_{ij} = -\frac{\epsilon}{2}\, K^{(1)}_{ij}\, \frac{T_i + T_j}{\sqrt{T_i}\sqrt{T_j}}, \qquad i \neq j, \tag{11}$$

the diagonal elements being $c_{ii} = 1$ by definition. The first-order expansion reveals a symmetry between $T$ and $1/T$ in the correlation matrix, which can be easily verified: the factor $(T_i + T_j)/\sqrt{T_i T_j} = \sqrt{T_i/T_j} + \sqrt{T_j/T_i}$ is unchanged when every temperature is replaced by its inverse. This expansion allows us to consider a simplified random-matrix ensemble for the covariance matrices of weakly coupled heterogeneous time series, for which analytical results can be obtained [24].

Moreover, in the case of strong heterogeneity, i.e. $T_i \gg T_j$, we have $c_{ij} = c_{ji} \simeq -\frac{\epsilon}{2}\sqrt{T_i/T_j}\; K^{(1)}_{ij}$, so even if there is a low probability for a large value of $|K^{(1)}_{ij}|$, the elements of the correlation matrix on the rows/columns related to variables with high or low temperature can be significantly bigger than the others. From the theory of Lévy matrices [25] we know that a large value of a specific pair-correlation coefficient, i.e. a large $c_{mn}$, implies the presence of eigenvectors concentrated on the two components involved, e.g. $m$ and $n$. Moreover, if the elements of a whole row are large compared to the rest of the matrix, there will be an eigenvector localized on the related component. A higher-order expansion shows the breaking of this high/low temperature symmetry in favour of the low-temperature components.
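The first-order result can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes the first-order covariance in the form $C_{ij} \approx T_i\delta_{ij} - \frac{\epsilon}{2} K^{(1)}_{ij}(T_i + T_j)$, which is what the anti-commutator relation (4) gives at order $\epsilon$, and compares it with the exact solution obtained from the spectral formula (5). The function names, matrix size, temperature spread and seed are hypothetical choices.

```python
import numpy as np

def exact_covariance(J, T):
    """Exact solution of {C, J} = 2 diag(T) through the spectral formula (5)."""
    lam, U = np.linalg.eigh(J)
    T_rot = U.T @ (T[:, None] * U)
    return U @ (2.0 * T_rot / (lam[:, None] + lam[None, :])) @ U.T

def first_order_covariance(K, T, eps):
    """First-order approximation: diag(T) - (eps/2) * K_ij * (T_i + T_j)."""
    return np.diag(T) - 0.5 * eps * K * (T[:, None] + T[None, :])

rng = np.random.default_rng(0)
N = 8
A = rng.normal(size=(N, N))
K = (A + A.T) / 2                        # random symmetric Gaussian matrix
T = np.exp(0.5 * rng.normal(size=N))     # heterogeneous temperatures

errors = []
for eps in (1e-2, 1e-3):
    C = exact_covariance(np.eye(N) + eps * K, T)
    errors.append(np.abs(C - first_order_covariance(K, T, eps)).max())

# If the neglected terms are O(eps^2), reducing eps tenfold should shrink
# the error by roughly a factor of one hundred.
ratio = errors[0] / errors[1]
print("errors:", errors, "ratio:", ratio)
```

A ratio close to one hundred is what one would expect if the remainder is quadratic in $\epsilon$; a ratio close to ten would instead signal a mistake in the first-order coefficient.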
At second order in $\epsilon$ we can write $C = \tilde{T} + \epsilon\,\Sigma^{(1)} + \epsilon^2\,\Sigma^{(2)}$ and $J = I + \epsilon K^{(1)} + \epsilon^2 K^{(2)}$. This higher-order expansion leads to the supplementary equation for $\Sigma^{(2)}$:

$$2\,\Sigma^{(2)}_{ij} = (T_i + T_j)\Big(-K^{(2)}_{ij} + \frac{1}{2}\sum_k K^{(1)}_{ik} K^{(1)}_{kj}\Big) + \sum_k K^{(1)}_{ik}\, T_k\, K^{(1)}_{kj}, \tag{12}$$

where we substituted $\Sigma^{(1)}$ with the expression found at first order (9). If we now divide by $\sqrt{T_i}\sqrt{T_j}$ we obtain the second-order correction to the correlation matrix $c$, which reads

$$\frac{T_i + T_j}{\sqrt{T_i}\sqrt{T_j}}\Big(-K^{(2)}_{ij} + \frac{1}{2}\sum_k K^{(1)}_{ik} K^{(1)}_{kj}\Big) + \frac{\sum_k K^{(1)}_{ik}\, T_k\, K^{(1)}_{kj}}{\sqrt{T_i}\sqrt{T_j}}. \tag{13}$$

The first two terms remain unchanged if we substitute $T_i$ with $1/T_i$ but the third one does not: it breaks the symmetry in favour of elements related to components with low temperatures. We stress the fact that $\epsilon$ is small regardless of the value of the system size $N$. If one performed the expansion for large $N$, then terms at all orders would have to be considered, since at higher orders matrix multiplication would involve sums over an increasing number of elements.

VI. APPENDIX B

For a given coupling matrix $J$ and set of temperatures $T$, the equilibrium distribution of the signals $x_i$ is a multivariate Gaussian, namely

$$P(\{x_i\}\,|\,J, T) = \frac{\exp\left(-\frac{1}{2}\, x^{\mathsf{T}} C^{-1} x\right)}{\sqrt{(2\pi)^N \det C}}, \tag{14}$$

where $C$ is the covariance matrix given by Eq. (5). The empirical covariance matrix between signals extracted from this distribution defines a correlated Wishart ensemble [13, 26–28], whose peculiarity is the separation of the quenched disorders given by couplings and temperatures.

VII. ACKNOWLEDGEMENTS

I want to thank C. Cammarota, B. Cerruti, A. Decelle, C. Lucibello, G. Parisi, J. Rocchi and B. Seoane for interesting discussions.

[1] R. K. Pan and S. Sinha, Phys. Rev. E, 046116 (2007).
[2] B. Podobnik, D. Wang, D. Horvatic, I. Grosse, and H. E. Stanley, EPL (Europhysics Letters), 68001 (2010).
[3] A. E. Biondo, A. Pluchino, A. Rapisarda, and D. Helbing, PloS one, e68344 (2013).
[4] M. Potters, J.-P. Bouchaud, and L. Laloux, Financial applications of random matrix theory: Old laces and new pieces, arXiv physics/0507111 (2005).
[5] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, T. Guhr, and H. E. Stanley, Phys. Rev. E, 066126 (2002).
[6] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Phys. Rev. Lett., 1467 (1999).
[7] Z. Burda and J. Jurkiewicz, Physica A: Stat. Mech., 67 (2004).
[8] A. Utsugi, K. Ino, and M. Oshikawa, Phys. Rev. E, 026110 (2004).
[9] P. Cizeau, Y. Liu, M. Meyer, C.-K. Peng, and H. Eugene Stanley, Physica A: Stat. Mech., 441 (1997).
[10] Y. Liu, P. Gopikrishnan, P. Cizeau, Meyer, and H. E. Stanley, Phys. Rev. E, 1390 (1999).
[11] J.-P. Bouchaud, M. Potters, and M. Meyer, The European Physical J. B (Cond. Matt.), 595 (2000).
[12] Z. Burda, J. Jurkiewicz, M. A. Nowak, G. Papp, and I. Zahed, Physica A: Stat. Mech., 694 (2004).
[13] Z. Burda, A. T. Görlich, and B. Wacław, Phys. Rev. E, 041129 (2006).
[14] G. Akemann, J. Fischmann, and P. Vivo, Physica A: Stat. Mech., 2566 (2010).
[15] J. Edwards and D. Thouless, Journal of Physics C: Solid State Physics, 807 (1972).
[16] A. Chakraborti, I. M. Toke, M. Patriarca, and F. Abergel, Quant. Fin., 991 (2011).
[17] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, and H. E. Stanley, Phys. Rev. Lett., 1471 (1999).
[18] H. Risken, Fokker-Planck Equation (Springer, 1984).
[19] B. Shklovskii, B. Shapiro, B. Sears, P. Lambrianides, and H. Shore, Physical Review B, 11487 (1993).
[20] O. Agam, B. L. Altshuler, and A. V. Andreev, Physical Review Letters, 4389 (1995).
[21] S. Siegert, R. Friedrich, and J. Peinke, Physics Letters A, 275 (1998).
[22] G. Casati, L. Molinari, and F. Izrailev, Phys. Rev. Lett., 1851 (1990).
[23] Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. Lett., 2405 (1991).
[24] P. Barucca, Quenched heterogeneities in disordered systems, Ph.D. thesis, Sapienza University (2014).
[25] P. Cizeau and J.-P. Bouchaud, Phys. Rev. E, 1810 (1994).
[26] V. A. Marčenko and L. A. Pastur, Sbornik: Mathematics, 457 (1967).
[27] Z. Burda, J. Jurkiewicz, and B. Wacław, Phys. Rev. E, 026111 (2005).
[28] S. H. Simon and A. L. Moustakas, Physical Review E 69