Statistical Properties of Cross-Correlation in the Korean Stock Market
Gabjin Oh, Cheoljun Eom, Fengzhong Wang, Woo-Sung Jung, H. Eugene Stanley, Seunghwan Kim
Division of Business Administration, Chosun University, Gwangju 501-759, Republic of Korea
Division of Business Administration, Pusan National University, Busan 609-735, Republic of Korea
Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA
Graduate Program for Technology and Innovation Management, Pohang University of Science and Technology, Pohang 790-784, Republic of Korea
Department of Physics, Pohang University of Science and Technology, Pohang 790-784, Republic of Korea
Asia Pacific Center for Theoretical Physics, Pohang 790-784, Republic of Korea

Abstract

We investigate the statistical properties of the correlation matrix between individual stocks traded in the Korean stock market using the random matrix theory (RMT) and observe how these properties affect the portfolio weights in the Markowitz portfolio theory. We find that the distribution of the correlation coefficients is positively skewed and changes over time. We find that the eigenvalue distribution of the original correlation matrix deviates from the eigenvalues predicted by the RMT, and that the largest eigenvalue is 52 times larger than the maximum eigenvalue predicted by the RMT. The $\beta$ coefficient, which reflects the largest eigenvalue property, is 0.8, while that for one of the eigenvalues in the RMT range is approximately zero. Notably, we show that the entropy function $E(\sigma)$ of the portfolio weights, as a function of the portfolio risk $\sigma$, is consistent for both the original and filtered correlation matrices with a power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \sim$ . 92, and that the exponent decreases significantly during the Asian currency crisis.
PACS numbers: 89.90.+n, 05.45.Tp, 05.40.Fb
Keywords: correlation matrix, random matrix theory, Markowitz portfolio theory
∗ Electronic address: [email protected]

I. INTRODUCTION

Financial markets are regarded as representative complex systems, in which various unexpected phenomena emerge from non-trivial interactions among heterogeneous agents [1]. Studying complex economic systems is difficult because the control parameters that govern them are unknown, and the parameters that are known cannot easily be applied. Nevertheless, much research has been conducted to understand the statistical properties of financial time series [2, 3]. In particular, the analysis of financial data by methods developed in statistical physics has become a very active research area for physicists and economists [4]. Analyzing the correlation coefficients between stock return time series has practical [5–7] as well as scientific value, because they contain a significant amount of information on the nonlinear interactions in the financial market and are an input parameter of the Markowitz portfolio theory. The correlation matrix between stocks, which has unexpected properties due to complex behaviors such as temporal non-equilibrium, mispricing, bubbles, and market crashes, is an important parameter for understanding the interactions in the financial market [8]. To analyze the correlation matrix, previous studies used various statistical methods, such as principal component analysis (PCA) [9], singular value decomposition (SVD) [10], and factor analysis (FA) [11]. Here, to analyze the actual correlation matrix, we employ the random matrix theory (RMT), which was introduced by Wigner, Dyson, and Mehta [12–15] and explains the statistical properties of energy levels in complex nuclei well. The RMT method is useful for eliminating the randomness in the actual correlation matrix [16–21].

Laloux et al. (1999) [22] and Plerou et al. (1999) [23] analyzed the correlation matrix of financial time series by the RMT method. The authors found that 94% of the eigenvalues of the correlation matrix can be predicted by the RMT, while the other 6% deviate from it. In addition, Plerou et al. (2002) [24] applied the RMT method to the United States stock market and observed that the correlation matrix consists of random and non-random parts, the latter of which carries useful information on the financial market. The eigenvectors deviating from the RMT were very stable over the whole period.

We investigate the statistical properties of the correlation matrix of 473 daily stock return time series traded in the Korean stock market from 3 January 1993 to 31 May 2003. We find that the distribution of the correlation coefficients is positively skewed and changes over time. Using the RMT method, we show that the correlation matrix contains meaningful information as well as a random component. Notably, we show that for both the original, $C_{\mathrm{original}}$, and filtered, $C_{\mathrm{filter}}$, correlation matrices the entropy function $E(\sigma)$ of the portfolio weights, as a function of the portfolio risk $\sigma$, is consistent with a power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with an exponent $\gamma \sim$ . 92. In the following section, we describe the data and methods used in this paper. In Section III, we present the results. Finally, we end with a conclusion.
II. DATA AND METHOD
In this paper, we investigate the statistical properties of the correlation matrix of 473 daily stock return series traded on the Korean stock market from 3 January 1993 to 31 May 2003. The data, obtained from the Korea Stock Exchange, cover 2845 days. To understand the non-trivial interactions, we calculated the correlation matrix between stocks for the whole period as well as for sub-periods, using a window of 250 data points shifted by 21 days. We propose a verification process to analyze the statistical properties of the correlation matrix between stock returns. First, we estimate the statistical properties of the correlation matrices using the RMT method. Second, we calculate the entropy of the portfolio weights using the Markowitz portfolio theory. Before demonstrating the verification process, we introduce the RMT, which was proposed by Wigner, Dyson, Mehta, and others, and the Markowitz portfolio theory (MPT) [30], introduced by Markowitz in 1952.

We created $N$ data sets ($N$ being the number of companies), each with $L$ data points following an iid(0, 1) process, and collected them in a matrix $G$. Here, $G$ is an $N \times L$ matrix with random elements, and the correlation matrix is defined by

\[ C_{\mathrm{random}} = \frac{1}{L} G G^{T}, \tag{1} \]

where $G^{T}$ is the transpose of $G$, and the correlation between elements is approximately zero. If $N \to \infty$ and $L \to \infty$ with $Q \equiv L/N$ fixed, the eigenvalue spectrum of the RMT is given by

\[ P_{\mathrm{random}}(\lambda) = \frac{Q}{2\pi} \frac{\sqrt{(\lambda_{+} - \lambda)(\lambda - \lambda_{-})}}{\lambda}, \tag{2} \]

where the eigenvalues $\lambda$ lie within $\lambda_{-} \leq \lambda \leq \lambda_{+}$, and the maximum and minimum eigenvalues of $C_{\mathrm{random}}$ are given by

\[ \lambda_{\pm} \equiv 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}. \tag{3} \]

If $L$ and $N$ are finite, the eigenvalue spectrum does not end sharply at the largest eigenvalue predicted by the RMT but decreases gradually beyond it.
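As an illustration of Eqs. (1)-(3), the following Python sketch generates an iid(0, 1) matrix, builds the random correlation matrix, and compares its eigenvalues with the theoretical bounds. It is a minimal sketch and not the procedure used in this study: the function names, the use of NumPy, the Gaussian choice for the iid(0, 1) process, and the seed are all assumptions made for illustration.

```python
import numpy as np


def marchenko_pastur_bounds(n_series, n_obs):
    """Return (lambda_minus, lambda_plus) from Eq. (3), with Q = L / N."""
    q = n_obs / n_series
    half_width = 2.0 * np.sqrt(1.0 / q)
    return 1.0 + 1.0 / q - half_width, 1.0 + 1.0 / q + half_width


def marchenko_pastur_density(lam, n_series, n_obs):
    """Evaluate Eq. (2); the density is zero outside the RMT bounds."""
    q = n_obs / n_series
    lam_minus, lam_plus = marchenko_pastur_bounds(n_series, n_obs)
    lam = np.asarray(lam, dtype=float)
    density = np.zeros_like(lam)
    inside = (lam > lam_minus) & (lam < lam_plus)
    x = lam[inside]
    density[inside] = (
        q / (2.0 * np.pi) * np.sqrt((lam_plus - x) * (x - lam_minus)) / x
    )
    return density


def random_correlation_eigenvalues(n_series, n_obs, seed=0):
    """Eigenvalues of C_random = G G^T / L (Eq. 1) for a Gaussian iid(0, 1) G."""
    rng = np.random.default_rng(seed)  # the seed is an arbitrary choice
    g = rng.standard_normal((n_series, n_obs))  # N x L matrix of random elements
    c_random = g @ g.T / n_obs
    return np.linalg.eigvalsh(c_random)  # eigenvalues in ascending order


if __name__ == "__main__":
    n, length = 473, 2845  # sizes quoted in the text
    lam_minus, lam_plus = marchenko_pastur_bounds(n, length)
    eig = random_correlation_eigenvalues(n, length)
    print(f"RMT bounds:      [{lam_minus:.3f}, {lam_plus:.3f}]")
    print(f"simulated range: [{eig.min():.3f}, {eig.max():.3f}]")
```

For $N = 473$ and $L = 2845$, Eq. (3) evaluates to $\lambda_{-} \approx 0.35$ and $\lambda_{+} \approx 1.98$. The simulated eigenvalues are expected to fall close to this interval, with small finite-size deviations at the edges.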
The MPT method, introduced by Markowitz in 1952 and generically known as the mean-variance theory, is used here to select the optimal portfolio among all stocks. The purpose of the MPT is to minimize the portfolio risk for a given portfolio return. The risk is quantified by the variance, defined as

\[ \Omega = \sum_{i=1}^{N} \sum_{j=1}^{N} \omega_i \omega_j C_{ij} \sigma_i \sigma_j, \tag{4} \]

where $\omega_i$ is the portfolio weight of stock $i$, which can be calculated using two Lagrange multipliers, $\sigma_i$ is the standard deviation of stock $i$, and $C_{ij}$ is the correlation coefficient between stock $i$ and stock $j$. In this work, we use the no-short-selling constraint for the portfolio weights [4], i.e., we assume that all the weights are non-negative ($\omega_i \geq 0$ for all $i = 1, \ldots, N$). We also normalize the portfolio weights such that $\sum_{i=1}^{N} \omega_i = 1$. The portfolio return $\mu$ is calculated by

\[ \mu = \sum_{i=1}^{N} \omega_i \mu_i, \tag{5} \]

where $\mu_i$ is the mean return of stock $i$. We next consider the portfolio weights because they determine the efficient frontier of the portfolio. Since $\sum_{i=1}^{N} \omega_i = 1$, the weights can be treated as a probability distribution, and we use Shannon's entropy to quantify their statistical properties:

\[ E = -\sum_{i=1}^{N} P_i \ln(P_i), \tag{6} \]

where $P_i$ is the portfolio weight $\omega_i$.

Using the eigenvalue distribution predicted by Eq. (2), we estimated the random part of the original correlation matrix and, following the previous paper [24], divided the matrix into two parts:

\[ C_{\mathrm{original}} = C_{\mathrm{random}} + C_{\mathrm{filter}}. \tag{7} \]

Based on how many random elements exist in the correlation matrix, we analyzed the non-trivial interactions between stocks. In addition, to estimate the eigenvalue properties, we created time series from the elements of each eigenvector,

\[ R(t) \equiv \sum_{i=1}^{N} V_i r_i(t), \tag{8} \]

where $r_i(t)$ is the return of stock $i$ at time $t$, and $V_i$ is the $i$th element of the eigenvector.
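The portfolio quantities of Eqs. (4)-(6) can be evaluated for any admissible weight vector with a few lines of Python. The sketch below only evaluates the risk, return, and entropy for given weights; it does not solve the constrained minimization itself, and the function names and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def portfolio_variance(weights, corr, sigma):
    """Eq. (4): sum over i, j of w_i w_j C_ij sigma_i sigma_j."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(sigma, dtype=float)
    cov = np.asarray(corr, dtype=float) * np.outer(s, s)  # covariance matrix
    return float(w @ cov @ w)


def portfolio_return(weights, mean_returns):
    """Eq. (5): weighted sum of the mean returns."""
    return float(np.dot(weights, mean_returns))


def weight_entropy(weights):
    """Eq. (6): Shannon entropy of weights that are non-negative and sum to one."""
    w = np.asarray(weights, dtype=float)
    p = w[w > 0]  # terms with zero weight contribute nothing (0 ln 0 = 0)
    return float(-np.sum(p * np.log(p)))


if __name__ == "__main__":
    n = 4
    equal = np.full(n, 1.0 / n)
    print(weight_entropy(equal))             # maximal entropy, ln(N)
    print(weight_entropy([1.0, 0.0, 0.0]))   # fully concentrated portfolio
```

The entropy is largest, $\ln N$, for equal weights and zero when the whole portfolio is placed in one stock, so it measures how evenly the portfolio is spread across stocks.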
To observe the eigenvalue properties separated by the RMT method, we created the time series $R_{\mathrm{Random}}(t)$ and $R_{\mathrm{Largest}}(t)$, which reflect the eigenvalue properties of $C_{\mathrm{random}}$ and $C_{\mathrm{filter}}$, respectively. Using the one-factor model, widely acknowledged in the financial literature as a pricing model, we then calculated the relationship between the created time series and the market factor, which influences all stocks in the market. The model is defined by

\[ r_i(t) = \alpha_i + \beta_i R_{\mathrm{Market}}(t) + \epsilon_i(t), \tag{9} \]

where $R_{\mathrm{Market}}$ is the KOSPI market index, $\alpha_i$ and $\beta_i$ are the regression coefficients of stock $i$, and $\epsilon_i(t)$ is the residual term. We use the $\beta$ coefficient as a measure of the relationship between the created time series and the market index.

III. RESULTS
In this section, we analyze the statistical features of the correlation matrix of 473 daily stock returns listed on the Korean stock market from 3 January 1993 to 31 May 2003 using the random matrix theory and the Markowitz portfolio theory. We present results on the statistical properties of the correlation matrix, such as its distribution, its eigenvalue spectrum, and the entropy of the portfolio weights calculated by the MPT. Fig. 1(a) and (b) show the distributions of the correlation coefficients of the original and random data sets. Fig. 1(c) shows the distributions of the correlation coefficients calculated for windows of 250 data points shifted by 21 days, and Fig. 1(d) displays the average correlation coefficient of each window in Fig. 1(c). In Fig. 1(a), the distribution of the correlation coefficients between stocks for the whole period is positively skewed and differs significantly from that for random interactions in Fig. 1(b). Fig. 1(c) shows that the distribution changes considerably over time. In particular, Fig. 1(d) shows that the mean value of the correlation coefficients increased significantly during the Asian currency crisis. In other words, the dynamic changes were caused by the complex behavior associated with the market crash, unlike the case of random interactions. Our findings confirm that the interactions in the Korean stock market deviate from those expected for random interactions.

We next decompose the original correlation matrix into the random part, $C_{\mathrm{random}}$, and the filtered part, $C_{\mathrm{filter}}$, using the RMT method to extract the meaningful information from the original correlation matrix. Fig. 2 shows the eigenvalue distribution of the correlation matrix in the Korean stock market. In Fig. 2, the solid line (orange) is the eigenvalue spectrum predicted by the RMT, and the red circles and blue circles indicate the eigenvalue distributions of the original time series and random data sets, respectively. The eigenvalue distribution predicted by the RMT is very similar to that of the random data, while that of the real time series behaves very differently. Moreover, the largest eigenvalue is 52 times larger than the largest eigenvalue predicted by the RMT. This ratio exceeds the factor of about 25 reported for the United States stock market [24].

To characterize the statistical properties of each eigenvalue, we created return time series using Eq. (8) and calculated the slopes $\beta$ between them and the KOSPI market index using Eq. (9). Fig. 3(a) and (b) show the distributions of the eigenvector elements corresponding to the largest eigenvalue and to one of the eigenvalues within the RMT range, respectively. Fig. 3(c) and (d) show the $\beta$ coefficients between the KOSPI market index and the created time series. We find that the $\beta$ between the market index and the time series of the largest eigenvalue is 0.8, while that for the time series created from the eigenvector of an eigenvalue within the RMT range is approximately zero. We argue that the largest eigenvalue explains the market properties well, whereas an eigenvalue within the range predicted by the RMT is uncorrelated with the market index.

We also decomposed the original correlation matrix according to the eigenvalues separated by the RMT method. Fig. 4 shows the distributions of the resulting correlation matrices. The red circles, blue diamonds, black squares, and pink triangles indicate the correlation matrices of the original, random, filter, and largest-eigenvalue parts, respectively. From these findings, we can expect that the distribution of the random correlation matrix $C_{\mathrm{random}}$ follows a Gaussian distribution, while the correlation matrix $C_{\mathrm{filter}}$, estimated by removing the random components from the original correlation matrix by the RMT method, has a distribution similar to that of the original correlation matrix.
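One common way to realize the decomposition of Eq. (7) is to split the eigen-decomposition of the correlation matrix at the upper RMT bound $\lambda_{+}$ of Eq. (3). The Python sketch below does this under stated assumptions: assigning every eigenvalue at or below $\lambda_{+}$ to the random part is an assumed rule, since the text does not spell out the exact criterion, and the function name `split_correlation` is hypothetical.

```python
import numpy as np


def split_correlation(corr, n_obs):
    """Split a correlation matrix into C_random + C_filter (Eq. 7).

    Assumption: eigenvalues <= lambda_plus (Eq. 3) form the random part,
    and eigenvalues above lambda_plus form the filtered part.
    """
    corr = np.asarray(corr, dtype=float)
    n_series = corr.shape[0]
    q = n_obs / n_series
    lam_plus = 1.0 + 1.0 / q + 2.0 * np.sqrt(1.0 / q)

    vals, vecs = np.linalg.eigh(corr)  # symmetric input, ascending eigenvalues
    is_random = vals <= lam_plus

    def rebuild(mask):
        # Sum of lambda_k * v_k v_k^T over the selected eigenpairs.
        return (vecs[:, mask] * vals[mask]) @ vecs[:, mask].T

    return rebuild(is_random), rebuild(~is_random), lam_plus


if __name__ == "__main__":
    # Synthetic one-factor returns, used only to exercise the function.
    rng = np.random.default_rng(0)
    market = rng.standard_normal(600)
    returns = 0.6 * market + rng.standard_normal((30, 600))
    c_random, c_filter, lam_plus = split_correlation(np.corrcoef(returns), 600)
    print(lam_plus, np.linalg.matrix_rank(c_filter))
```

Because the two parts are built from disjoint sets of eigenpairs of the same matrix, their sum reproduces the original correlation matrix up to floating-point error.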
We found that the correlation matrix reflecting the largest eigenvalue property differs clearly from that of the original time series.

To apply the RMT method to a portfolio optimization problem, we analyzed the portfolio weights estimated by the MPT with various correlation matrices. The input parameters needed to calculate the portfolio weights of each stock are the mean return, $\mu_i$, the standard deviation, $\sigma_i$, and the correlation matrix, $C_{ij}$, of the original stock returns. To isolate the effect of the correlation matrix among the input parameters, we apply the correlation matrices $C_{\mathrm{filter}}$ and $C_{\mathrm{random}}$ separated by the RMT method. Fig. 5(a) shows the efficient frontier lines created using $C_{\mathrm{original}}$, $C_{\mathrm{random}}$, and $C_{\mathrm{filter}}$. The efficient frontiers calculated with the original, $C_{\mathrm{original}}$, and filtered, $C_{\mathrm{filter}}$, correlation matrices are very similar, while that of the random correlation matrix $C_{\mathrm{random}}$ differs significantly from the original. In addition, at a given portfolio risk $\sigma$, the efficient frontier of $C_{\mathrm{random}}$ overestimates the portfolio return $\mu$ relative to that of the original correlation matrix.

We next calculated the entropy of the portfolio weights for each correlation matrix, $C_{\mathrm{original}}$, $C_{\mathrm{filter}}$, and $C_{\mathrm{random}}$. Fig. 5(b) shows the relationship between the portfolio risk, $\sigma$, and the entropy of the portfolio weights for the three types of correlation matrices on a log-log plot. We found that the entropy $E(\sigma)$ for both the original and filtered correlation matrices is approximately consistent with a power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \sim$ . 92, while the relationship between the entropy and the portfolio return $\mu$, presented in Fig. 5(c), does not follow a power law. To verify the stability over time of the result observed in Fig. 5, we also calculated the exponents for sub-periods, using windows of 500 data points shifted by 20 days. We find that while the relationship between the entropy of the portfolio weights and the portfolio risk follows a power law in each sub-period, the exponent $\gamma$ changes over time and lies within 1 . $\leq \gamma \leq$ . 23. In particular, the $\gamma$ value calculated during the Asian currency crisis decreases significantly.

IV. CONCLUSIONS
We investigated the statistical properties of the correlation matrix between the return time series of individual stocks traded in the Korean stock market using the RMT method and observed the effect of the correlation matrix when applied to the Markowitz portfolio theory. We found that the distribution of the correlation coefficients between stocks is positively skewed and changes dynamically over time. We found that the eigenvalue distribution of the correlation matrix deviates from that of the RMT, and that the largest eigenvalue is 52 times larger than the largest eigenvalue predicted by the RMT. The slope $\beta$ between the market index and the time series corresponding to the largest eigenvalue is 0.8, while that for an eigenvalue within the RMT range is approximately zero. Notably, we found that the entropy function $E(\sigma)$ of the portfolio weights, as a function of the portfolio risk $\sigma$, is consistent with a power-law function, $E(\sigma) \sim \sigma^{-\gamma}$, with the exponent $\gamma \sim$ . 92, while the relationship between the entropy and the portfolio return $\mu$ is not a power law. We find that the relationship between the entropy of the portfolio weights and the portfolio risk follows a power law in all sub-periods, but the exponent changes over time and lies within 1 . $\leq \gamma \leq$ . 23; $\gamma$ decreases significantly during the Asian currency crisis. As a next step, the portfolio weights of other stock markets should be studied rigorously because they play an important role in the portfolio risk and return.

[1] T. Lux and M. Marchesi, Nature 397, 498 (1999); T. Lux, J. Econ. Behav. Organizat. 33, 143-165 (1998).
[2] R. N. Mantegna et al., Nature 376, 46 (1995); R. N. Mantegna et al., Nature 383, 587 (1996); V. Plerou et al., Nature 421, 130 (2003); X. Gabaix et al., Nature 423, 267 (2003).
[3] G. Oh et al., J. Korean Phys. Soc. 48, 197 (2006); Y. Liu et al., Phys. Rev. E 60, 1390 (1999); K. Yamasaki et al., Proc. Natl. Acad. Sci. 102, 9424 (2005); W. C. Jun et al., Phys. Rev. E 73, 066128 (2006).
[4] R. N. Mantegna and H. E. Stanley, An Introduction to Econophysics: Correlation and Complexity in Finance (Cambridge University Press, Cambridge, U.K., 1999); J-P. Bouchaud and M. Potters,
Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management (Cambridge University Press, Cambridge, USA, 2004).
[5] E. Elton and M. Gruber, Modern Portfolio Theory and Investment Analysis (Wiley, New York, 1981).
[6] Y. Okhrin and W. Schmid, J. Econometrics 134, 235 (2006).
[7] D. Sornette, P. Simonetti, and J. V. Andersen, Phys. Rep. 335, 19 (2002).
[8] J. D. Noh, Phys. Rev. E 61, 5981 (2000).
[9] J. E. Jackson, A User's Guide to Principal Components (Wiley-Interscience, New Ed edition, 2003).
[10] J. E. Gentle, Singular Value Factorization (Springer-Verlag, Berlin, 1998).
[11] D. F. Morrison, Multivariate Statistical Methods (McGraw-Hill, New York, 1990).
[12] M. L. Mehta, Random Matrices (Academic Press, Boston, 1991).
[13] E. P. Wigner, Ann. Math. 53, 36 (1951); E. P. Wigner, Proc. Cambridge Philos. Soc. 47, 790 (1951).
[14] F. J. Dyson, J. Math. Phys. 3, 140 (1962); F. J. Dyson and M. L. Mehta, J. Math. Phys. 4, 701 (1963).
[15] M. L. Mehta, Nucl. Phys. 18, 395 (1960); M. L. Mehta and F. J. Dyson, J. Math. Phys. 4, 713 (1963); M. L. Mehta, Commun. Math. Phys. 20, 245 (1971).
[16] T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, Phys. Rep. 299, 190 (1998).
[17] A. M. Sengupta and P. P. Mitra, Phys. Rev. E 60, 3 (1999).
[18] P. Šeba, Phys. Rev. Lett. 91, 19 (2003).
[19] A. Utsugi, K. Ino, and M. Oshikawa, Phys. Rev. E 70, 026110 (2004).
[20] T. Guhr and B. Kälberzk, arXiv:cond-mat/0206577 (2002).
[21] S. Shari, M. Crane, A. Shamaie, and H. Ruskin, Physica A 335, 629 (2004).
[22] L. Laloux, P. Cizeau, J-P. Bouchaud, and M. Potters, Phys. Rev. Lett. 83, 1467 (1999).
[23] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, and H. E. Stanley, Phys. Rev. Lett. 83, 1471 (1999).
[24] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, T. Guhr, and H. E. Stanley, Phys. Rev. E 64, 066126 (2002).
[25] B. Rosenow, V. Plerou, P. Gopikrishnan, L. A. N. Amaral, and H. E. Stanley, Int. J. of Theoret. Appl. Finance 3, 399 (2000).
[26] P. Gopikrishnan, B. Rosenow, V. Plerou, and H. E. Stanley, Phys. Rev. E 64, 035106 (2001).
[27] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, and H. E. Stanley, Physica A 287, 374 (2000).
[28] B. Rosenow, P. Gopikrishnan, V. Plerou, and H. E. Stanley, Physica A 314, 762-767 (2002).
[29] B. Rosenow, P. Gopikrishnan, V. Plerou, and H. E. Stanley, "Random Matrix Theory and Cross-Correlations of Stock Prices," in Empirical Science of Financial Fluctuations: The Advent of Econophysics, edited by H. Takayasu (Springer-Verlag, Tokyo, 2002).
[30] H. M. Markowitz, J. Finance 7, 77 (1952).

FIG. 1: (a) and (b) show the distributions of the correlation coefficients between the stocks of 473 companies taken from the Korean stock market and of random data, respectively. (c) displays the distributions of the correlation coefficients for sub-periods of 251 data points shifted by 21 days, and (d) shows the average value of each correlation matrix in (c).
[Figure 1, panels (a)-(d). Axes: $P(C_{ij})$ versus $C_{ij}$ and $\langle C_{ij}(t) \rangle$ versus $t$; annotation: "Market Crash".]

FIG. 2: The distribution of the eigenvalues of the correlation matrix estimated using the 473 companies listed on the Korean stock market, of random data following the iid(0,1) process, and that predicted by the RMT method. The red circles, blue circles, and pink solid line indicate the original time series, the random data, and the theoretical line, respectively.

FIG. 3: (a) and (b) show the distributions of the eigenvector elements for the largest eigenvalue and for an eigenvalue within the RMT range, $\lambda_{-} < \lambda < \lambda_{+}$. (c) and (d) display the $\beta$ coefficients between the normalized market index and the time series created by Eq. (8) for these two eigenvalues. The $\beta$ value is approximately zero for the eigenvalue within the RMT range and 0.8 for the largest eigenvalue.
[Figure 3. Axes: $\rho(u)$ versus eigenvector element $u$; $R_{\mathrm{Random}}$ and $R$ versus $R_{\mathrm{Market}}$.]

FIG. 4: The distribution of the original correlation matrix, $C_{\mathrm{original}}$, and of those created by the random matrix theory, $C_{\mathrm{random}}$, $C_{\mathrm{filter}}$, and $C_{\mathrm{largest}}$. The red circles, blue diamonds, black squares, and pink triangles indicate the correlation matrices corresponding to the original, random, filter, and largest eigenvalue, respectively.
[Figure 4. Axes: $P(C_{ij})$ versus $C_{ij}$.]

FIG. 5: (a) shows the efficient portfolio frontier lines for the original, $C_{\mathrm{original}}$, random, $C_{\mathrm{random}}$, and filter, $C_{\mathrm{filter}}$, correlation matrices. (b) displays the relationship between the entropy of the portfolio weights and the portfolio risk. (c) displays the relationship between the entropy and the portfolio return.
[Figure 5. Axes: return ($\mu$) versus risk ($\sigma$); entropy versus risk ($\sigma$); legend: Original, Random, Filter.]

FIG. 6: The exponent of the power-law function, $E(\sigma) \sim \sigma^{\gamma}$, estimated from the relationship between the portfolio risk, $\sigma$, and the entropy of the portfolio weights.
[Figure 6. Axes: $\gamma$ versus time (1995 to 2001); annotation: "Asian currency crisis".]
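An exponent of a power law such as $E(\sigma) \sim \sigma^{-\gamma}$ is commonly estimated by a straight-line fit in log-log coordinates. The Python sketch below shows that approach. It is an assumption about the fitting procedure, which the text does not specify, and the function name `fit_power_law_exponent` and the synthetic data are hypothetical.

```python
import numpy as np


def fit_power_law_exponent(risk, entropy):
    """Least-squares fit of E = A * sigma**(-gamma) in log-log space.

    Returns (gamma, A). Both inputs must be strictly positive.
    """
    log_risk = np.log(np.asarray(risk, dtype=float))
    log_entropy = np.log(np.asarray(entropy, dtype=float))
    slope, intercept = np.polyfit(log_risk, log_entropy, 1)  # degree-1 fit
    return float(-slope), float(np.exp(intercept))


if __name__ == "__main__":
    # Synthetic data with a known exponent, used only to check the fit.
    sigma = np.linspace(0.01, 0.1, 20)
    entropy = 3.0 * sigma ** (-2.0)
    gamma, amplitude = fit_power_law_exponent(sigma, entropy)
    print(gamma, amplitude)
```

Applying the same fit to each sub-period window would give one exponent per window, which is the kind of time series of exponents that Fig. 6 plots.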