A Study of Correlations in the Stock Market
Chandradew Sharma ∗ and Kinjal Banerjee †
Department of Physics, BITS Pilani K.K. Birla Goa Campus, N.H. 17B Zuarinagar, Goa 403726, India.
(Dated: April 23, 2015)

We study the various sectors of the Bombay Stock Exchange (BSE) for a period of 8 years from April 2006 to March 2014. Using the daily returns over these eight years, we make a direct, model free analysis of the pattern of the sectorial index movements and the correlations among them. Our analysis shows significant auto correlation within the individual sectors and also strong cross-correlation among sectors. We also find that auto correlations in some of the sectors persist in time. This is a very significant result and has not been reported so far in the Indian context. These findings will be very useful in model building for the prediction of price movements of equities and derivatives, and for portfolio management. We show that the Random Walk Hypothesis is not applicable in modeling the Indian market and that Mean-Variance-Skewness-Kurtosis based portfolio optimization might be required. We also find that almost all sectors are highly correlated during large fluctuation periods and have only moderate correlation during normal periods.
I. INTRODUCTION
The stock market is an extremely complex system with various interacting components [1]. The movements of stock prices are somewhat interdependent and also depend on a wide multitude of external stimuli such as announcements of government policies, changes in interest rates, changes in the political scenario, announcements of quarterly results by the listed companies and many others. The overall result is a chaotic complex system which has so far proved very difficult to analyze and predict. In fact it is still not completely clear which generic features will appear in any stock market and which features depend on the social, political and economic climate of the country and/or of the world. So it is important to study each market individually so that finally we can be sure that certain behaviors or patterns are universal. Although some amount of work has been done in understanding the stock markets in Europe [2] and the United States [3], the proper mathematical and statistical study of emerging markets like India is in its infancy [4].

So far, there is no exact understanding of how much effect each external stimulus has on the stock prices, or even of how the self interactions of the various stocks or the various sectors drive the market. Broadly speaking, the price movement of a particular stock can be classified as (i) market (common to all stocks), (ii) sector (related to a particular business sector) and (iii) idiosyncratic (limited to an individual stock). While it is virtually impossible to develop any theory for the idiosyncratic movement, it is possible to analyze, study and build models for the other two types of stock movement. From an investor's point of view, the most important reason to understand the stock market is to get the maximum possible return on an investment with the minimum possible risk.
So a better understanding of the stock market will lead to better theories of portfolio management.

One important step in improving our understanding of the stock market is to study how the price movement of one stock affects the price of other stocks. One way to do this would be to see how the movement of one stock affects the others within the same sector. Another is to study how the overall prices of the various sectors are correlated. The goal of this study is to try to determine and quantify, from the available data, some of the possible correlations which might exist between the stock prices. This will not only enhance the understanding of the stock market as a whole but will also play a crucial role in investment decisions like portfolio management. The systematic, model independent analysis of the data that we carry out will also help in building more efficient and enhanced models which give adequate weightage to the various relations which exist between movements of stock prices across sectors in a market, and may help in forecasting future trends. Studies of such correlations have been carried out to a limited extent in the context of the New York Stock Exchange [5], but to the best of our knowledge, no such study exists for the Bombay Stock Exchange (BSE) [6].

To understand the financial market, it is very important to know the distribution of the return on a stock. Our data consists of the daily returns of 12 sectors of stocks of the BSE for Financial Year (FY) 2006 to FY 2013, i.e. 1990 days from 3rd April 2006 to 31st March 2014. We will be treating each sector as one entity in the rest of the paper. This approach is novel and has not been carried out before, at least in the context of Indian markets.

∗ [email protected]
† Corresponding Author; [email protected]

If $P_i(t)$ is the index of the sector $i = 1, \ldots, N$ at time $t$, then the (logarithmic) return of the $i$th sector over a time interval $t = 1$ to $t = T$ days is defined as

$$R_i(t) \equiv \ln P_i(t+1) - \ln P_i(t) \qquad (1)$$

In our case $T = 1990$, the number of days we have considered, and $N = 13$ because we look at the following 12 sectors: S&P BSE Auto (Auto), S&P BSE Bankex (Bankex), S&P BSE Consumer Durables (CD), S&P BSE Capital Goods (CG), S&P BSE FMCG (FMCG), S&P BSE Health care (HC), S&P BSE IT (IT), S&P BSE Metal (Metal), S&P BSE Oil and Gas (Oil and Gas), S&P BSE Power (Power), S&P BSE Realty (Realty) and S&P BSE Teck (Teck), together with the S&P BSE SENSEX (Sensex), which serves as the benchmark. The plot of the Sensex index and the log return over the time interval under consideration is given in Figure 1. From Figure 1, it is clear that we can divide the entire period into two sub-intervals: (i) FY 2006 - FY 2009, a large fluctuation period, and (ii) FY 2010 - FY 2013, a normal period. We shall discuss later in the paper how the cross correlations of the sectors are markedly different in these two periods. The mean return of the $i$th sector is given by

$$\bar{R}_i = \frac{1}{T} \sum_{t=1}^{T} R_i(t) \qquad (2)$$

Defining $R'_i(t) = R_i(t) - \bar{R}_i$, we can write the $k$th moment of the $i$th sector as

$$m_k(i) = \frac{1}{T} \sum_{t=1}^{T} \left( R'_i(t) \right)^k \qquad (3)$$

For example, the second moment gives the variance as

$$\sigma^2(i) = \frac{1}{T} \sum_{t=1}^{T} \left( R'_i(t) \right)^2 \qquad (4)$$

These definitions are used in the subsequent analysis.

Our paper is organized as follows. In section II we explore the individual sectors mentioned above and use the data to determine some features of the distribution of the returns and find significant deviations from normality. We then calculate the autocorrelation of log returns for all sector indices, to test the market efficiency, and find that there is significant autocorrelation in most of the sectors of BSE at lag 1.
The more surprising result is that the analysis of our data shows that the autocorrelations in some sectors persist at higher lags. In section III we analyze the cross-correlations among sectors in BSE. Our study spans FY 2006 - 2013, a time span which contains both a period of large fluctuations in index movements and a period of normal fluctuations. We find that almost all sectors are highly correlated during the period 2006 - 2009 and that they are moderately correlated during 2009 - 2013. We finally conclude in section IV with a summary of our results and their interpretation.

II. UNDERSTANDING BSE SECTORS
It is commonly believed that the distribution of the log return of a stock, or of the log change in an index, is a normal distribution. However, many empirical studies show deviations from this assumption. Consequently, any prediction based on the normal distribution will generally fail. In particular, if there is any deviation from normality, the Random Walk Hypothesis will not be valid. Therefore, it is essential to first understand the distribution of any stock or index movement before using any model. Let us first consider the distribution of returns for the various sectors.

The study of Skewness and Kurtosis is very useful to characterize the distribution. We know that if a distribution is normal, then the sample Skewness and sample excess Kurtosis will be close to zero [7]. Any significant deviation from zero indicates a deviation from normality. The sample skewness and excess kurtosis of the distribution of the $i$th sector can be written in terms of the moments (3) as

Sample Skewness:
$$G_1(i) = \frac{\sqrt{T(T-1)}}{T-2} \, \frac{m_3(i)}{m_2(i)^{3/2}} \qquad (5)$$

Sample excess Kurtosis:
$$G_2(i) = \frac{T-1}{(T-2)(T-3)} \left[ (T+1) \left( \frac{m_4(i)}{m_2(i)^2} - 3 \right) + 6 \right] \qquad (6)$$

FIG. 1. Behavior of S&P BSE SENSEX for April 2006-Mar 2014. Panels: (a) Variation of BSE index over time; (b) Variation of (log) return of BSE over time; (c) Variation of (log) return of BSE over time.
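As an illustration, the return definition (1), the central moments (3) and the shape statistics (5) and (6) can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function names and the short price series are hypothetical and are not taken from the BSE data.

```python
import math

def log_returns(prices):
    """Eq. (1): R(t) = ln P(t+1) - ln P(t) for a list of index levels."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def central_moment(returns, k):
    """Eq. (3): k-th moment about the sample mean, normalised by T."""
    T = len(returns)
    mean = sum(returns) / T
    return sum((r - mean) ** k for r in returns) / T

def sample_skewness(returns):
    """Eq. (5): G1 = sqrt(T(T-1)) / (T-2) * m3 / m2^(3/2)."""
    T = len(returns)
    m2 = central_moment(returns, 2)
    m3 = central_moment(returns, 3)
    return math.sqrt(T * (T - 1)) / (T - 2) * m3 / m2 ** 1.5

def sample_excess_kurtosis(returns):
    """Eq. (6): G2 = (T-1)/((T-2)(T-3)) * [(T+1)(m4/m2^2 - 3) + 6]."""
    T = len(returns)
    m2 = central_moment(returns, 2)
    m4 = central_moment(returns, 4)
    return (T - 1) / ((T - 2) * (T - 3)) * ((T + 1) * (m4 / m2 ** 2 - 3) + 6)

# Hypothetical index levels, for illustration only (not BSE data).
prices = [100.0, 101.0, 99.5, 102.0, 101.0, 103.5, 98.0, 100.5]
returns = log_returns(prices)
g1 = sample_skewness(returns)
g2 = sample_excess_kurtosis(returns)
```

For a real sector one would pass the full series of daily index levels; values of `g1` and `g2` far from zero, relative to their standard errors, signal a departure from normality.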
The Standard Error in Skewness (SES) and Standard Error in Kurtosis (SEK) are given by [7]

$$\mathrm{SES} = \sqrt{\frac{6T(T-1)}{(T-2)(T+1)(T+3)}} \, ; \qquad \mathrm{SEK} = 2 \, \mathrm{SES} \sqrt{\frac{T^2-1}{(T-3)(T+5)}} \qquad (7)$$

The sample skewness and sample excess kurtosis of the above sectors are displayed in Figure 2. The Standard Error in Skewness (SES = 0.06) and Standard Error in Kurtosis (SEK = 0.11) are calculated from the formulae given in (7). It is clear from Figure 2 that there is significant deviation from zero for the sample Skewness and sample excess Kurtosis in all sectors. Hence, based on the study of sample Skewness and sample excess Kurtosis, we can say confidently that each individual sector's return shows positive Kurtosis (fat tails) accompanied by Skewness. This clearly shows that the returns of none of the sectors are normally distributed.

FIG. 2. Sample Skewness and Sample excess Kurtosis of all the Sectors for April 2006-Mar 2014

To further strengthen this claim we perform the D'Agostino-Pearson omnibus test [7]. It is based on two quantities depending on both Skewness and Kurtosis. The quantities are defined as follows:

$$Z_{G_1}(i) = \frac{G_1}{\mathrm{SES}} \, ; \qquad Z_{G_2}(i) = \frac{G_2}{\mathrm{SEK}} \qquad (8)$$

$$DP(i) = (Z_{G_1})^2 + (Z_{G_2})^2 \qquad (9)$$

If the distribution of the $i$th sector is normal, $DP(i)$ should follow a $\chi^2$ distribution with 2 degrees of freedom. So if $DP(i) > \chi^2_{critical}$, then the distribution of the $i$th sector is not normal. The critical value $\chi^2_{critical}$ (2 df) is 13.82 at a significance level of 0.001. The DP values for all the sectors are much larger than 13.82. Therefore, the statistical results clearly indicate that the data does not satisfy the normality assumption, i.e. the change in index movement of individual sectors shows a large deviation from the normal distribution. This finding is also consistent with recent works [8] and shows that returns are driven by asymmetric and fat-tailed distributions. This also clearly indicates that the market cannot be modeled using the Random Walk Hypothesis [9]. For stock market modeling, or from the perspective of portfolio management, the mean-variance model [10] should be extended to mean-variance-skewness-kurtosis based portfolio optimization [11].

To explore the nature of the auto correlation, we look at the autocorrelation of the return time series for the various sectors. This study is important because if there is autocorrelation in the time series we can predict the immediate future based on present information. If there is no autocorrelation in the time series data, the data are uncorrelated and it is not possible to make future predictions confidently.

To emphasize, if there is autocorrelation in the time series at lag 1, it is possible to make predictions about the immediate future with a high degree of certainty. Here, we have estimated the sample auto covariance at lag $k$ for the finite $i$th time series $R_i(t)$ of $T$ observations by [12]

$$\gamma_{ik} = \frac{1}{T} \sum_{t=1}^{T-k} \left( R_i(t) - \bar{R}_i \right) \left( R_i(t+k) - \bar{R}_i \right) \qquad (10)$$

where $R_i(t)$ is given by our definition (1). The autocorrelation at lag $k$ can then be estimated as

$$\rho_{ik} = \frac{\gamma_{ik}}{\gamma_{i0}} \qquad (11)$$

The function $\rho_{ik}$ is known as the Auto Correlation Function (ACF).

We have used Bartlett's approximation [13] to estimate the variance of the ACF at lags $k$ greater than some value $q$ beyond which the autocorrelation function may be deemed to have died out. This is defined as [12]:

$$\mathrm{var}[\rho_{ik}] \approx \frac{1}{T} \left( 1 + 2 \sum_{\nu=1}^{q} \rho_{i\nu}^2 \right), \qquad k > q \qquad (12)$$

FIG. 3. ACF for FMCG, BSE SENSEX and REALTY sectors for April 2006-Mar 2014. Panels: (a) ACF for FMCG; (b) ACF for Sensex; (c) ACF for Realty.
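The autocovariance and autocorrelation estimators and Bartlett's approximation can be sketched as follows. This is a minimal standard-library sketch, not the authors' code: the function names and the alternating test series are hypothetical, and the cutoff of two standard errors used to flag a lag as significant is an assumed convention rather than something stated in the text.

```python
import math

def autocovariance(x, k):
    """Lag-k sample autocovariance, normalised by T (not T - k)."""
    T = len(x)
    mean = sum(x) / T
    return sum((x[t] - mean) * (x[t + k] - mean) for t in range(T - k)) / T

def acf(x, max_lag):
    """Sample autocorrelation rho_k = gamma_k / gamma_0 for k = 0..max_lag."""
    g0 = autocovariance(x, 0)
    return [autocovariance(x, k) / g0 for k in range(max_lag + 1)]

def bartlett_se(rho, q, T):
    """Bartlett standard error for lags k > q:
    sqrt((1 + 2 * sum_{v=1..q} rho_v^2) / T)."""
    return math.sqrt((1.0 + 2.0 * sum(r * r for r in rho[1:q + 1])) / T)

# Hypothetical alternating series with strong negative lag-1 autocorrelation.
series = [1.0, -1.0] * 50
rho = acf(series, 5)

# With q = 0 the band reduces to the white-noise value 1 / sqrt(T).
se_white = bartlett_se(rho, 0, len(series))

# Assumed convention: flag a lag when |rho_k| exceeds two standard errors.
significant = [k for k in range(1, 6) if abs(rho[k]) > 2 * se_white]
```

Applied to a sector's daily log returns, `significant` would list the lags at which the sample autocorrelation stands out from the band expected for an uncorrelated series.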
The standard error for the estimated autocorrelation $\rho_{ik}$ is:

$$SE[\rho_{ik}] = \sqrt{\mathrm{var}[\rho_{ik}]} \qquad (13)$$

We calculate the auto correlation of log returns for all sector indices of BSE. The results clearly show that there is significant autocorrelation in most of the sectors of BSE at lag 1. Therefore, a residual effect is confirmed in almost all sectors in BSE. A statistically significant ACF value at lag 1 indicates that an autoregressive component exists in the time series. In fact, we find that some auto correlation in most of the sectors persists over time.

Our results show that there is very little auto correlation in FMCG, weak autocorrelation in IT, Teck and Oil and Gas, and significant lag 1 auto correlation in Auto, Bankex, CD, CG, HC, Metal, Power, Realty, and also in Sensex. In Figure 3, we have plotted the ACF for three BSE sectors for illustration. The figure also shows how the ACF persists in time. This feature, obtained by the analysis of our data, is extremely striking and has not been reported in the literature before. Further analysis is required, in future works, to fully understand this feature.

FIG. 4. β of all the Sectors for April 2006-Mar 2014

The study of the ACF is an empirical test of the efficiency of the BSE market for the period under consideration. The persistence of auto correlation we see above clearly indicates that the BSE is not an efficient market. For example, Figure 3 shows significant and consistent autocorrelation in Realty (at lag 1 and at several higher lags up to 23) and in Sensex (at lag 1 and at several higher lags). The BSE therefore does not appear to be even weakly efficient. A departure from weak efficiency (i.e. deviation from random walk) may point towards possible market manipulation.

The autocorrelation exhibited by the BSE sectors agrees with the findings in [15]. Those authors also show that autocorrelation in returns might generate a momentum. Therefore, a BSE sector that outperformed other sectors in the past might continue to do so for some time interval. These features in the auto correlation may be crucial for portfolio management in Indian equity markets.

Financial market volatility is central to the theory and practice of asset pricing, asset allocation, and risk management. A popular assumption is that volatilities and correlations are constant, but we have seen that they vary significantly over time. Therefore, the study of β can be useful for investors [16]. The β factor is defined as

$$\beta_i = \frac{\mathrm{covariance}(i, \mathrm{Sensex})}{\mathrm{variance}(\mathrm{Sensex})} \qquad (14)$$

A β of 1 indicates that the security's price will move with the market. A β of less than 1 means that the security will be less volatile than the market. A β of greater than 1 indicates that the security's price will be more volatile than the market. As an example, from Figure 4 we can see that the β of the Bankex sector is 1.45 (45% more volatile than the market) while that of the HC sector is 0.49 (less volatile than the market). A systematic study of this parameter will be undertaken in a future work.

III. CROSS-CORRELATION AMONG BSE SECTORS
In the last section we have shown that there is significant autocorrelation in most sectors. Let us now try to see whether the movements of the indices in various sectors are also correlated, i.e. whether there exists any cross correlation between the sectors. Some studies of cross correlations in other markets have been carried out in different contexts [17] but, to the best of our knowledge, there exist no studies of the correlations between sectors, at least in the context of the Indian financial markets.
FIG. 5. Cross-Correlation matrix for April 2006-Mar 2014
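A cross-correlation matrix of this kind can be sketched as below. The text does not spell out the estimator, so the Pearson correlation coefficient of the return series is assumed here; the function names, sector labels and return values are hypothetical and serve only as an illustration.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equally long return series."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(series):
    """Nested dict C[a][b] holding the correlation between series a and b."""
    names = list(series)
    return {a: {b: correlation(series[a], series[b]) for b in names}
            for a in names}

# Hypothetical daily returns for three made-up sectors (not BSE data).
base = [0.010, -0.020, 0.015, 0.000, -0.005, 0.007]
series = {
    "SectorA": base,
    "SectorB": [2.0 * r for r in base],   # moves exactly with SectorA
    "SectorC": [-r for r in base],        # moves exactly against SectorA
}
C = correlation_matrix(series)
```

In this toy example `C["SectorA"]["SectorB"]` is 1 and `C["SectorA"]["SectorC"]` is -1 up to rounding; real sector returns would give intermediate values.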
To understand the interactions among the sectors, it is useful to study the spectral properties of the correlation matrix of sectorial index movements. The deviation of the eigenvalues of the correlation matrix from those of a random matrix provides signals about the underlying interactions between various sectors. The largest eigenvalue is identified as representing the influence of the entire market, common to all sectors. The remaining large eigenvalues are associated with the different sectors, as indicated by the composition of their corresponding eigenvectors [18]. This is what we do in the rest of this section.

If the time series of returns of $N$ sectors of length $T$ are mutually uncorrelated, then the resulting correlation matrix is random and is known as a Wishart matrix [19]. It is known that the empirical distribution of the eigenvalues of the Wishart matrix almost always converges to a probability distribution as $T \to \infty$ and $N/T \to a$, where $a$ is a constant such that $0 \le a < 1$. In that limit the distribution is continuous and supported on $\lambda \in [(1 - \sqrt{a})^2, (1 + \sqrt{a})^2]$ where $0 < a < 1$. For our data, the eigenvalues $\lambda$ of the Wishart matrix should lie between 0.84 and 1.17. We estimate the sample cross correlation matrix for our dataset, i.e. for $N = 13$ sectors and $T = 1990$ days, so that $N/T \to a \approx 0.0065$. In terms of its eigenvalues $\lambda_i$ and eigenvectors $u_i$, the cross correlation matrix can be written as

$$C = \sum_{i=1}^{N} \lambda_i u_i u_i^T \qquad (15)$$

We find the eigenvalues $\lambda_i$ of the cross correlation matrix. The eigenvectors corresponding to these eigenvalues are the principal components (PCs) of the cross correlation matrix. These eigenvectors can be expanded in a basis given by the 13 sectors we are considering. All the eigenvalues and the expansion of the PCs in our chosen basis are given in Figure 6.

FIG. 6. Eigenvalues and Eigenvectors of Correlation matrix for April 2006-Mar 2014

From Figure 6 we can see a significant deviation of the largest eigenvalue (that of PC1) from the largest eigenvalue predicted by random matrix theory (RMT). The largest eigenvalue of the cross correlation matrix is 9.11. Also, from the first column (corresponding to PC1) of Figure 6, the eigenvector of the largest eigenvalue shows a relatively uniform composition, i.e. all sectors contribute to it and all elements have the same sign.

A very useful visualization of what we discussed above is the scree plot [20], shown in Figure 7. Since the eigenvalue of PC1 is so large and PC1 affects all the sectors in the same proportion, we can say that the largest eigenvalue is associated with the collective response of the entire market to external information [1, 21], i.e. the largest eigenvalue is due to the existence of a market-induced correlation across all sectors. Since PC1 dominates to such a large extent, it is difficult to observe the correlations between sectors.

From the investment point of view, it is interesting to note that the Teck and the IT sectors are highly correlated all the time. Hence, it would be better to club these two sectors together for modelling and for portfolio diversification purposes.

The scree plot also gives some very useful information about periods of large fluctuations.
During times of large fluctuations we find that there is a large correlation among most of the sectors. As a comparison, consider Figure 8, where we compare the cross correlation matrices of a period of large fluctuation (April 2008 - March 2009) with a period of relatively small fluctuation (April 2012 - Mar 2013). As can be seen from the figure, although there exist significant cross correlations in both periods, the magnitude is smaller in the later period. This indicates that periods of large fluctuations can be studied using models where the correlation strength becomes large. Since periods of large fluctuations may correspond to crashes in the stock market, a systematic study of the cross correlation matrices of these periods will provide valuable insights into understanding and modeling crashes.

A more efficient way of analyzing this is the Principal Component analysis we performed previously in this section. Again, scree plots provide a more efficient and rigorous demonstration of the increase in correlations during periods of crisis. As can be seen in Figure 9, the eigenvalue of PC1 when the entire market is experiencing large fluctuations is 9.91, while it comes down to 6.72 during the period of relative calm. We can zoom in to the actual time of the crash (Jan 2008) in Figure 10 using the quarterly and monthly data and see that the eigenvalue of PC1 is even higher (11.32) during that time. This can provide an efficient and novel way of analyzing crashes of the stock market.

FIG. 7. Eigenvalue and Eigenvector matrix of Correlation matrix for April 2006-Mar 2014 and Scree Plot for April 2006-Mar 2014

FIG. 8. Comparison of Correlation Matrices. Panels: (a) April 2008 - March 2009; (b) April 2012 - Mar 2013.
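The comparison between the random-matrix bounds and the largest eigenvalue of a correlation matrix can be sketched as follows. This is an illustrative sketch under stated assumptions: the paper does not say how the eigenvalues were computed, so plain power iteration is used here as a simple stand-in that needs only the standard library; the function names and the 3x3 matrix are hypothetical.

```python
import math

def wishart_bounds(n_series, n_obs):
    """Support [(1 - sqrt(a))^2, (1 + sqrt(a))^2] with a = N / T."""
    a = n_series / n_obs
    return (1 - math.sqrt(a)) ** 2, (1 + math.sqrt(a)) ** 2

def mat_vec(matrix, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

def largest_eigenvalue(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a
    symmetric matrix with positive entries (assumed here)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))  # Rayleigh quotient
    return lam, v

# Bounds for N = 13 series and T = 1990 observations, as in the text:
# roughly 0.84 and 1.17.
lower, upper = wishart_bounds(13, 1990)

# Hypothetical correlation matrix of three strongly correlated series.
C = [[1.0, 0.8, 0.6],
     [0.8, 1.0, 0.7],
     [0.6, 0.7, 1.0]]
lam1, pc1 = largest_eigenvalue(C)
```

An eigenvalue `lam1` well above the upper bound, together with an eigenvector `pc1` whose entries all share one sign, is the signature of a market-wide mode described in this section; repeating the calculation on sub-periods would give the period-by-period comparison.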
IV. CONCLUSIONS
In this paper, we have carried out a model independent analysis of the BSE for a period of 1990 days. This time frame contains periods of both small and large fluctuations and thus provides a good sample to understand and study the generic behavior of the stock market. Also, the number of days chosen was large in order to avoid small sample size errors. Instead of studying the movement of individual stock returns as is usually done, we study the movement and behavior of groups of stocks, the grouping being done in terms of sectors. We look at 12 sectors of stocks and use the whole Sensex as the benchmark. The auto correlations in the return data capture how the stocks within the individual sectors interact among themselves, while the cross correlations look at how the sectors affect each other.

FIG. 9. Comparison of Scree Plots. Panels: (a) April 2008 - March 2009; (b) April 2012 - Mar 2013.

FIG. 10. Scree Plot for Jan-Mar 2008 and for Jan 2008

We found the presence of significant auto correlations in most of the sectors, clearly demonstrating that the movement of the stock prices cannot be modeled via a random walk. While this is usually an accepted feature of stock market models, our analysis of the departure from normality is rigorous. It is not just based on the non zero skewness and kurtosis but also on the D'Agostino-Pearson omnibus test. This comprehensively shows the existence of auto correlations. From an investor's point of view, this means that only mean-variance-skewness-kurtosis based methods of portfolio optimization will be useful.

A more interesting feature which we find in the study of auto correlations is that they persist over time. The ACF is significant for most sectors at lag one and there are certain sectors where this auto correlation persists at higher lags. This indicates that the BSE departs significantly from an efficient market and that the Efficient Market Hypothesis (EMH) cannot be used to model the stock price movement in BSE. This is a very interesting property of the stock market which has to be accounted for in future models. For financial markets to be meaningful and useful to the economy, they must be at least weakly efficient. Some of the reasons why BSE is not efficient may be (i) weak disclosure procedures, (ii) poor quality and quantity of company disclosures, (iii) almost no public awareness about securities and (iv) no transparent regulation, supervision and administrative rules. This feature of the BSE should be of great interest not just to investors but also to policy makers and market regulators. Further analysis of this will be done in a future work.

We also study the relative volatility of the sectors compared to the whole market, measured in terms of β.
This parameter, as we point out, should have a significant role in making investment decisions. How to use this parameter in building physics models of financial markets is a direction of future work.

The cross correlation was studied by doing the Principal Component analysis of the correlation matrix. Our findings show that there exists a very large cross correlation, but that this correlation is due to some external force which drives the market as a whole. The effect of sectors on each other is smaller but not insignificant and will be the focus of a future work. A very important feature following from our analysis is that the eigenvalue of PC1 increases during periods of large fluctuation of the market. This can have far reaching applications in studying and predicting crashes of financial markets.

Acknowledgement

We would like to thank Dr. Sitabhra Sinha for careful reading of the manuscript and helpful suggestions.