On the Eigenvalue Density of Real and Complex Wishart Correlation Matrices
Christian Recher, Mario Kieburg and Thomas Guhr
Fakultät für Physik, Universität Duisburg–Essen, Lotharstraße 1, 47048 Duisburg, Germany
(Dated: November 9, 2018)

Wishart correlation matrices are the standard model for the statistical analysis of time series. The ensemble averaged eigenvalue density is of considerable practical and theoretical interest. For complex time series and correlation matrices, the eigenvalue density is known exactly. In the real case, however, a fundamental mathematical obstacle made it forbiddingly complicated to obtain exact results. We use the supersymmetry method to fully circumvent this problem. We present an exact formula for the eigenvalue density in the real case in terms of twofold integrals and finite sums.
PACS numbers: 05.45.Tp, 02.50.-r, 02.20.-a
Keywords: Wishart correlation matrices, eigenvalue density, supersymmetry
Time series analysis is an indispensable tool in the study of complex systems with numerous applications in physics, climate research, medicine, signal transmission, finance and many other fields [1–3]. A time series is an observable such as the water level of a river, the temperature, the intensity of transmitted radiation, the neuron activity in electroencephalography (EEG), the price of a stock, etc., measured at usually equidistant times k = 1, …, n. Suppose we measure p such time series M_j, j = 1, …, p, for example, in the case of EEG, at p electrodes placed on the scalp or, in the case of temperatures, at p different locations. Our data set then consists of the p × n rectangular matrix M with entries M_{jk}. The time series M_j are usually real (labeled β = 1), but in some applications they can be complex (β = 2). Often, one is interested in the correlations between the time series. To estimate them, one normalizes the time series M_j to zero mean and unit variance. The correlation coefficient between the time series M_j and M_l is then given as the sample average

C_{jl} = \frac{1}{n} \sum_{k=1}^{n} M_{jk} M_{lk}^{*} \qquad \text{and} \qquad C = \frac{1}{n} M M^\dagger  (1)

is the correlation matrix. For real time series (β = 1), the complex conjugation is not needed and the adjoint is simply the transpose. We notice that C is a p × p real symmetric (β = 1) or Hermitean (β = 2) matrix. The eigenvalues of C provide important information, see recent examples in Refs. [4, 5]. As the empirical information is limited, it is desirable to compare the measured eigenvalue density with a "null hypothesis" that results from a statistical ensemble. The ensemble is defined [6] by synthetic real or complex time series W_j, j = 1, …, p, which yield the empirical correlation matrix C upon averaging over the probability density function

P_\beta(W, C) \sim \exp\!\left( -\frac{\beta}{2}\, \mathrm{tr}\, W^\dagger C^{-1} W \right),  (2)

that is, we have by construction

\int d[W]\, P_\beta(W, C)\, \frac{1}{n} W W^\dagger = C,  (3)

where the measure d[W] is the product of the differentials of all independent elements in W. To ensure that C is invertible, we always assume n ≥ p. When going to higher order statistics, the Gaussian assumption (2) is not necessarily justified, but it often is a good approximation. This multivariate statistical approach is closely related to Random Matrix Theory [7], and the matrices W W^\dagger are referred to as Wishart correlation matrices. One is interested in the ensemble averaged eigenvalue density of these matrices. In terms of the resolvent, it reads

S_\beta(x) = -\frac{1}{\pi p}\, \mathrm{Im} \int d[W]\, P_\beta(W, C)\, \mathrm{tr}\, \frac{1}{x^+ \mathbb{1}_p - W W^\dagger},  (4)

where \mathbb{1}_p is the p × p unit matrix. The argument x carries a small positive imaginary increment ε >
0, indicated by the notation x^+ = x + iε. The limit ε → 0 is implied. S_β(x) only depends on the eigenvalues Λ_j, j = 1, …, p of C. Hence we may replace C in Eq. (4) by the p × p diagonal matrix Λ = diag(Λ_1, …, Λ_p). We notice that the eigenvalues are positive definite, Λ_j > 0. The limit of large n and p has been studied in great detail, see Refs. [8, 9]. However, an exact closed-form result for finite n and p is only available in the complex case [10, 11]. Unfortunately, a deep, structural mathematical reason made it up to now impossible to derive such a closed-form result in the real case, which is the more relevant one for applications. We have three goals: We, first, introduce the powerful supersymmetry method [12–14] to Wishart correlation matrices for arbitrary C. This has, to the best of our knowledge, not been done before. We, second, use the thereby achieved unique structural clearness to derive a new and exact closed-form result for the eigenvalue density in the real case for finite n and p. We, third, show that our results are easily numerically tractable and compare them with Monte Carlo simulations.

Why does the real case pose such a substantial problem? This is best seen by going to the polar decomposition W = U w V, where U ∈ O(p), V ∈ O(n) for β = 1 and U ∈ U(p), V ∈ U(n) for β = 2, and where w is the p × n matrix containing the singular values or radial coordinates w_j, j = 1, …, p. In particular, one has W W^\dagger = U \tilde{w} U^\dagger with \tilde{w} = w w^\dagger. When inserting into (4), one sees that the non-trivial group integral

\Phi_\beta(\Lambda, \tilde{w}) = \int \exp\!\left( -\frac{\beta}{2}\, \mathrm{tr}\, U^\dagger \Lambda^{-1} U \tilde{w} \right) d\mu(U)  (5)

has to be done to obtain the joint probability density function of the radial coordinates w_j. Here, dμ(U) is the invariant Haar measure. For β = 2, this integral is the celebrated Harish-Chandra–Itzykson–Zuber integral and known explicitly [15, 16].
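Independently of any analytical treatment, the ensemble average in Eq. (4) can always be estimated by direct sampling: one draws W column-wise from a Gaussian with covariance C and histograms the eigenvalues of W W†. A minimal sketch for the real case (β = 1), assuming NumPy; the eigenvalues Λ_j used here are purely illustrative, not the values used in the paper's figures:

```python
import numpy as np

def wishart_eigenvalues(Lam, n, n_samples, rng=None):
    """Sample eigenvalues of W W^T, where the p x n real matrix W is drawn
    from the Gaussian ensemble (2) with C = diag(Lam) (beta = 1)."""
    rng = np.random.default_rng(rng)
    Lam = np.asarray(Lam, dtype=float)
    p = len(Lam)
    evs = []
    for _ in range(n_samples):
        # W = C^{1/2} G with G a p x n standard Gaussian matrix, so the
        # columns of W have covariance C and Eq. (3) holds on average.
        W = np.sqrt(Lam)[:, None] * rng.standard_normal((p, n))
        evs.append(np.linalg.eigvalsh(W @ W.T))
    return np.concatenate(evs)

# Illustrative parameters, loosely following Fig. 1 (p = 5, n = 200):
evs = wishart_eigenvalues([1.0, 1.5, 2.0, 2.5, 3.0], n=200, n_samples=200, rng=1)
hist, edges = np.histogram(evs, bins=60, density=True)
```

The normalized histogram approximates the eigenvalue density S_1(x); this is the kind of Monte Carlo comparison carried out for the figures.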
For β = 1, however, Φ_1(Λ, \tilde{w}) is not a Harish-Chandra spherical function; it rather belongs to the Gelfand class [17], and a closed-form expression is lacking. The only explicit form known is a cumbersome, multiple infinite series expansion in terms of zonal or Jack polynomials [6, 18]. This inconvenient feature then carries over to the eigenvalue density (4), but we will arrive at a finite series over twofold integrals.

The supersymmetry method is based on writing

S_\beta(x) = -\frac{1}{\pi p}\, \mathrm{Im}\, \frac{\partial Z_\beta(J)}{\partial J} \bigg|_{J=0}  (6)

as the derivative of the generating function

Z_\beta(J) = \int d[W]\, P_\beta(W, \Lambda)\, \frac{\det(x^+ \mathbb{1}_p + J \mathbb{1}_p - W W^\dagger)}{\det(x^+ \mathbb{1}_p - W W^\dagger)}  (7)

with respect to the source variable J at J = 0. One has the normalization Z_β(0) = 1 at J = 0. We consider the real and the complex case and use the latter as test. We map Z_β(J) onto superspace using steps which are by now standard, see Refs. [13, 14]. A particularly handy approach for applications such as the present one is given in Ref. [19]; we use the same conventions and find

Z_\beta(J) = \int d[\rho]\, I_\beta(\rho) \prod_{j=1}^{p} \mathrm{sdet}^{-\beta/2}\!\left( x^+ \mathbb{1}_{4/\beta} - J\gamma - 2\Lambda_j \rho \right).  (8)

The merit of this transformation is the drastic reduction in the number of degrees of freedom, because the variables to be integrated over form the 4/β × 4/β Hermitean supermatrix

\rho = \begin{bmatrix} \rho_1 & \rho_{12}^\dagger \\ \rho_{12} & i\rho_2 \end{bmatrix}.  (9)

For β = 2, ρ_1 and ρ_2 are scalar, real commuting variables and ρ_{12} is a complex anticommuting scalar variable. For β = 1, ρ_1 is a 2 × 2 real symmetric matrix, ρ_2 has to be multiplied with the 2 × 2 unit matrix, and we have

\rho_{12} = \begin{bmatrix} \chi & \chi^* \\ \xi & \xi^* \end{bmatrix},  (10)

where χ, ξ and χ*, ξ* denote anticommuting variables and their complex conjugates, respectively. We also introduced the matrix γ = diag(0_{2/β}, −\mathbb{1}_{2/β}) and the supersymmetric Ingham–Siegel integral

I_\beta(\rho) = \int d[\sigma]\, \mathrm{sdet}^{-n\beta/2}\!\left( \mathbb{1}_{4/\beta} + i\sigma \right) \exp(i\, \mathrm{str}\, \sigma\rho),  (11)

where σ has the same form as ρ.
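The logic of Eqs. (6) and (7) can be checked numerically before any supersymmetric manipulation: for a single fixed matrix, the J-derivative of the determinant ratio reproduces the resolvent trace that enters Eq. (4). A sketch assuming NumPy, with illustrative matrix sizes:

```python
import numpy as np

def Z_ratio(x, J, A):
    """Determinant ratio det(x + J - A) / det(x - A) of Eq. (7),
    evaluated for one fixed matrix A = W W^T (no ensemble average)."""
    I = np.eye(A.shape[0])
    return np.linalg.det((x + J) * I - A) / np.linalg.det(x * I - A)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 20))
A = W @ W.T                # a real Wishart matrix, p = 4, n = 20
x = 60.0 + 1e-3j           # x^+ = x + i*eps, chosen outside the spectrum

# Central finite difference of the J-derivative at J = 0 ...
h = 1e-6
dZ = (Z_ratio(x, h, A) - Z_ratio(x, -h, A)) / (2 * h)

# ... equals the resolvent trace tr (x^+ - W W^T)^{-1} of Eq. (4):
trace_resolvent = np.trace(np.linalg.inv(x * np.eye(4) - A))
assert np.allclose(dZ, trace_resolvent, rtol=1e-5)
```

The ensemble average and the factor −1/(πp) Im then turn this identity into Eq. (6).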
The supertrace and superdeterminant [20] are denoted by str and sdet. Starting from the generating function (8), we first consider the complex case β = 2. By introducing eigenvalue–angle coordinates for the supermatrix ρ, we rederive in a straightforward calculation the eigenvalue density S_2(x) as found in Ref. [10]. In the real case β = 1, the analogous approach leads to inconvenient Efetov–Wegner terms [14], and we thus proceed differently. Since the generating function remains invariant under rotations of the matrix ρ_1, we introduce its eigenvalues R = diag(r_1, r_2) and the diagonalizing angle as new coordinates. This yields the Jacobian |Δ_2(R)| = |r_1 − r_2|. The next step is to evaluate the Ingham–Siegel integral I_1(ρ). The supermatrix σ in Eq. (11) has the same form as ρ in Eq. (9). Doing the integral over σ followed by an expansion in the anticommuting variables of ρ according to Eq. (10) gives

I_1(\rho) \sim \det{}^{(n-3)/2}\!R \, \exp(-\mathrm{str}\,\rho)\, \Theta(R) \Bigg( \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n-2} - \left(\frac{\chi\chi^*}{r_1} + \frac{\xi\xi^*}{r_2}\right) \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n-1} + \frac{\chi\chi^*\,\xi\xi^*}{r_1 r_2} \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n} \Bigg) \delta(\rho_2).  (12)

In a simple, direct calculation, we also expand the product of the superdeterminants in Eq. (8) in the anticommuting variables of ρ. We collect everything and do the integration over the anticommuting variables. With the notation Q_j = x^+ − 2Λ_j iρ_2, we obtain

Z_1(J) \sim \int d[R] \int d\rho_2 \, |\Delta_2(R)| \det{}^{(n-3)/2}\!R \, \Theta(R)\, \exp(-(r_1 + r_2 - i\rho_2)) \prod_{j=1}^{p} \frac{J + Q_j}{\det^{1/2}(x^+ \mathbb{1}_2 - 2\Lambda_j R)} \Bigg( \det{}^{-1}\!R \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n} + \sum_{j=1}^{p} \frac{2\Lambda_j}{J + Q_j} \left( \frac{1}{(x^+ - 2\Lambda_j r_1) r_2} + \frac{1}{(x^+ - 2\Lambda_j r_2) r_1} \right) \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n-1} + \sum_{j \neq k} \frac{2\Lambda_j}{(J + Q_j)(x^+ - 2\Lambda_j r_1)} \, \frac{2\Lambda_k}{(J + Q_k)(x^+ - 2\Lambda_k r_2)} \left(\frac{\partial}{i\partial\rho_2}\right)^{\!n-2} \Bigg) \delta(\rho_2).  (13)

According to Eq. (6) we have to take the derivative with respect to J.

FIG. 1: Eigenvalue density for p = 5 and n = 200: analytical formula (solid lines) and Monte Carlo simulations (histogram with bin width 3).
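The J-derivatives of products ∏_l (J + Q_l) that arise in this step evaluate to elementary symmetric polynomials in the Q_j. A small numerical check of the first two such relations, assuming NumPy and 0-based indices (the helper name `esp` is ours):

```python
from itertools import combinations
import numpy as np

def esp(Q, m, omit=()):
    """Elementary symmetric polynomial E_{m;omit}(Q) of order m in the
    variables Q_j, with the indices listed in `omit` excluded."""
    idx = [j for j in range(len(Q)) if j not in omit]
    return float(sum(np.prod([Q[i] for i in c]) for c in combinations(idx, m)))

Q = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative values of Q_j
p = len(Q)
h = 1e-7

# d/dJ prod_l (J + Q_l) at J = 0 equals E_{p-1}(Q):
d_full = (np.prod(h + Q) - np.prod(-h + Q)) / (2 * h)
assert np.isclose(d_full, esp(Q, p - 1))

# d/dJ [prod_l (J + Q_l) / (J + Q_0)] at J = 0 equals E_{p-2;0}(Q):
f = lambda J: np.prod(J + Q) / (J + Q[0])
d_omit = (f(h) - f(-h)) / (2 * h)
assert np.isclose(d_omit, esp(Q, p - 2, omit=(0,)))
```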
This leads to the three relations

\frac{\partial}{\partial J} \prod_{l=1}^{p} (J + Q_l) \bigg|_{J=0} = E_{p-1}(Q)  (14)

\frac{\partial}{\partial J} \frac{1}{J + Q_j} \prod_{l=1}^{p} (J + Q_l) \bigg|_{J=0} = E_{p-2;j}(Q)

\frac{\partial}{\partial J} \frac{1}{(J + Q_j)(J + Q_k)} \prod_{l=1}^{p} (J + Q_l) \bigg|_{J=0} = E_{p-3;j,k}(Q),

where E_{m;i,k}(Q) denotes the elementary symmetric polynomial of order m in the variables Q_j, j = 1, …, p, with Q_i and Q_k omitted,

E_{m;i,k}(Q) = \sum_{\substack{1 \le i_1 < \cdots < i_m \le p \\ i_\nu \neq i,k}} Q_{i_1} \cdots Q_{i_m}.

Collecting all terms yields our final result (17), a finite sum of twofold integrals over r_1 and r_2, with the integration domain fixed by the ordering Λ_0 > Λ_1 > … > Λ_p > Λ_{p+1} with Λ_0^{-1} = 0 and Λ_{p+1}^{-1} = ∞. The real function f(r_1, r_2) is independent of ε and has no singularities. The singularities at the boundaries of the domain are integrable. Using the commercial software Mathematica [21], we evaluate our formula (17) numerically. For independent comparison, we also carry out Monte Carlo simulations with ensembles of 10 matrices.

FIG. 2: Eigenvalue density for p = 10 and n = 200: analytical formula (solid lines) and Monte Carlo simulations (histogram with bin width 3).

In Figs. 1 and 2, we show the results for p = 5 and p = 10 and n = 200 with the chosen empirical eigenvalues Λ_j, j = 1, …, 5 and Λ_j, j = 1, …, 10, respectively. The agreement is perfect.

In conclusion, we introduced the supersymmetry method for the first time to Wishart correlation matrices. We thereby derived exact expressions for the eigenvalue density in terms of low-dimensional integrals. This is a drastic reduction, as the original number of integrals is np. Our approach solves a serious mathematical obstacle in the real case. A presentation for a mathematics audience will be given elsewhere [22]. Here, we derived and discussed the formulae needed for applications. In the real case (β = 1), we obtained the previously unknown exact solution in terms of a finite sum of twofold integrals. We evaluated our formula numerically and confirmed it by comparing to Monte Carlo simulations.

We thank R. Sprik for fruitful discussions as well as A. Hucht, H. Kohler, and R. Schäfer for helpful comments. One of us (TG) greatly benefitted from the Program on High Dimensional Inference and Random Matrices in 2006 at SAMSI, Research Triangle Park, North Carolina (USA). We acknowledge support from Deutsche Forschungsgemeinschaft (Sonderforschungsbereich Transregio 12).

[1] C. Chatfield, The Analysis of Time Series (Chapman and Hall/CRC, Boca Raton, 2004), 6th ed.
[2] M. Müller, G. Baier, A. Galka, U. Stephani, and H. Muhle, Phys. Rev. E, 046116 (2005).
[3] P. Šeba, Phys. Rev. Lett., 198104 (2003).
[4] R. Sprik, A. Tourin, J. de Rosny, and M. Fink, Phys. Rev. B, 012202 (2008).
[5] L. Laurent, P. Cizeau, J. Bouchaud, and M. Potters, Phys. Rev. Lett., 1467 (1999).
[6] R. Muirhead, Aspects of Multivariate Statistical Theory (Wiley, New York, 1982), 1st ed.
[7] T. Guhr, A. Müller-Groeling, and H. Weidenmüller, Phys. Rep., 189 (1998).
[8] J. Silverstein, J. Multivariate Anal., 331 (1995).
[9] Vinayak and A. Pandey, Phys. Rev. E, 036202 (2010).
[10] G. Alfano, A. Tulino, A. Lozano, and S. Verdu, Proc. IEEE Int. Symp. on Spread Spectrum Tech. and Applications (ISSSTA '04) (2004).
[11] S. Simon and A. Moustakas, Phys. Rev. E, 065101 (2004).
[12] K. Efetov, Adv. Phys., 53 (1983).
[13] J. Verbaarschot, H. Weidenmüller, and M. Zirnbauer, Phys. Rep., 367 (1985).
[14] K. Efetov, Supersymmetry in Disorder and Chaos (Cambridge University Press, Cambridge, 1997), 1st ed.
[15] Harish-Chandra, Am. J. Math., 241 (1958).
[16] C. Itzykson and J. Zuber, J. Math. Phys., 411 (1980).
[17] I. Gelfand, Dokl. Akad. Nauk SSSR, 5 (1950).
[18] I. Macdonald, Symmetric Functions and Hall Polynomials (Clarendon Press, Oxford, 1998), 2nd ed.
[19] M. Kieburg, J. Grönqvist, and T. Guhr, J. Phys. A, 275205 (2009).
[20] F. Berezin, Introduction to Superanalysis (D. Reidel Publishing Company, Dordrecht, 1987), 1st ed.
[21] Wolfram Research Inc., Mathematica.