A General Class of Weighted Rank Correlation Measures
M. Sanatgar (a), A. Dolati (b) and M. Amini (c)
(a, b) Department of Statistics, College of Mathematics, Yazd University, Yazd
(c) Department of Statistics, Ferdowsi University of Mashhad
arXiv preprint [math.ST]
Abstract
In this paper we propose a class of weighted rank correlation coefficients extending Spearman's rho. The proposed class is constructed by giving suitable weights to the distance between two sets of ranks, so as to place more emphasis on items having low rankings than on those having high rankings, or vice versa. The asymptotic distribution of the proposed measures and the properties of the parameters estimated by them are studied through the associated copula. A simulation study is performed to compare the performance of the proposed statistics for testing independence, using asymptotic relative efficiency calculations.
Many situations exist in which n objects are ranked by two independent sources, where the interest is focused on agreement on the top rankings and disagreements on items at the bottom of the rankings, or vice versa. For example, every year a large number of students apply for higher education. The graduate committee of the university may like to choose the best candidates based on some criteria such as GPA and the average of their grades in the major courses that they passed during their bachelor's level. In such cases, to minimize the cost of interviewing all of the candidates, a measure which gives more weight to those who have higher grades is required. Measures of rank correlation such as Spearman's rho and Kendall's tau generally give a value for the overall agreement without giving explicit information about those parts of a data set which are similar. This problem motivates the definition of Weighted Rank Correlation (WRC) measures, which emphasize items having low rankings and de-emphasize those having high rankings, or vice versa. Salama and Quade ([17],[15]) first studied the WRC of two sets of rankings, sensitive to agreements in the top rankings and ignoring, to a certain degree, disagreements on the remaining items. For application in sensitivity analysis, Iman and Conover [8] proposed the top-down concordance coefficient, which centres on the agreement in the top rankings. Shieh [19] studied a weighted version of Kendall's tau which can place more emphasis on items having low rankings than on those having high rankings, or vice versa. Blest [2] introduced a rank correlation measure which gives more weight to the top rankings. Pinto da Costa and Soares [14] proposed a WRC measure that weights the distance between two ranks using a linear function of those ranks, giving more importance to higher ranks than to lower ones. Maturi and Abdelfattah [11] proposed a WRC measure with different weights to emphasize the agreement of the top rankings.
Coolen-Maturi [3] extended this index to more than two sets of rankings, but again the focus was only on the agreement on the top or bottom rankings. The behaviour of several WRC measures derived from Spearman's rank correlation was investigated by Dancelli et al. [4]. Starting from the formula of Spearman's rank correlation measure, this paper proposes a general class of WRC measures that weight the distance between two sets of ranks. Two classes of weights, which are polynomial functions of the ranks, are considered in order to place more emphasis on the items having low rankings than on those having high rankings, or vice versa. The first one, which extends Blest's rank correlation, places more emphasis on the agreement on the top rankings. The second one constructs a new class of WRC measures and places more emphasis on the bottom ranks. The rest of the paper is organized as follows. The proposed WRC measures are introduced in Section 2. The weighted correlation coefficients, which are the population versions of the proposed WRC measures, are introduced in Section 3. The quantiles of the proposed WRC measures for small samples and their asymptotic distributions for large samples are presented in Section 4. In Section 5, a simulation study is performed to compare the performance of the proposed statistics for testing independence, using the asymptotic relative efficiency and the empirical powers of the tests. Finally, some discussion and possible extensions are given in Section 6.
Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be a random sample of size $n$ from a continuous bivariate distribution and let $(R_1, S_1), \ldots, (R_n, S_n)$ denote the corresponding vectors of ranks. The well-known Spearman's rank correlation measure is given by
$$\rho_{n,s} = \frac{12}{n^3 - n} \sum_{i=1}^{n} R_i S_i - \frac{3(n+1)}{n-1}. \tag{1}$$
A drawback of Spearman's rank correlation is that it generally gives a value for the overall agreement of two sets of ranks without giving explicit information about those parts of the sets which are similar. For example, consider 3 consumers A, B and C who ranked 9 aspects of a product, attributing '1' to the most important aspect and '9' to the least important one. Their rankings are given in Table 1.

Table 1: Rankings of a product by 3 consumers
A   1 2 3 4 5 6 7 8 9
B   …
C   …

As we see, the top ranks of (A,B) are more similar than those of (A,C) and the bottom ranks of (A,C) are more similar than those of (A,B), but Spearman's rank correlation gives the same value 0.416 for the two sets of rankings (A,B) and (A,C). For the cases where the differences in the top ranks would seem to be more critical, Blest [2] suggests that these discrepancies should be emphasized. He proposed an alternative measure of rank correlation that attaches more significance to the early ranking of an initially given order. Blest's index is defined by
$$\gamma_n = \frac{2n+1}{n-1} - \frac{12}{n^2 - n} \sum_{i=1}^{n} \left(1 - \frac{R_i}{n+1}\right)^2 S_i. \tag{2}$$
The values of Blest's index for the two sets of rankings (A,B) and (A,C) are 0.241 and 0.591, respectively. As we see, in situations such as the rankings (A,C) where the bottom ranks should be emphasized, Blest's index is a suitable rank correlation measure. In the following we develop a general theory for weighted rank correlation measures by giving suitable weights to the distance between two sets of ranks, to place more emphasis on items having low rankings than on those having high rankings, or vice versa. Let $D_i = S_i - R_i$, $i = 1, \ldots, n$.
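As a small numerical illustration of formulas (1) and (2), the following sketch evaluates both indices with exact rational arithmetic. The rankings `B_top` and `B_bottom` are hypothetical and are not the rankings of Table 1; they are chosen so that one disagrees with the reference order only at the top and the other only at the bottom. The function names are likewise illustrative.

```python
from fractions import Fraction


def spearman_rho(r, s):
    """Spearman's rho from two rank vectors, following formula (1)."""
    n = len(r)
    total = sum(ri * si for ri, si in zip(r, s))
    return Fraction(12 * total, n**3 - n) - Fraction(3 * (n + 1), n - 1)


def blest_index(r, s):
    """Blest's index from two rank vectors, following formula (2)."""
    n = len(r)
    total = sum((1 - Fraction(ri, n + 1)) ** 2 * si for ri, si in zip(r, s))
    return Fraction(2 * n + 1, n - 1) - Fraction(12, n * n - n) * total


# Hypothetical rankings (assumed for illustration only).
A = [1, 2, 3, 4, 5, 6, 7, 8, 9]
B_top = [2, 1, 3, 4, 5, 6, 7, 8, 9]      # swap of the two top items
B_bottom = [1, 2, 3, 4, 5, 6, 7, 9, 8]   # swap of the two bottom items

for name, b in (("top swap", B_top), ("bottom swap", B_bottom)):
    print(name, spearman_rho(A, b), blest_index(A, b))
```

For these assumed rankings Spearman's rho cannot distinguish the two cases (both give 59/60), whereas Blest's index is lower for the swap at the top (583/600) than for the swap at the bottom (199/200), which is the behaviour the weighting is meant to produce.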
The most common form of Spearman's rank correlation coefficient between two sets of rankings $R_1, \ldots, R_n$ and $S_1, \ldots, S_n$ is given by (Kendall, [9])
$$\rho_{n,s} = 1 - \frac{2 \sum_{i=1}^{n} D_i^2}{\max\left(\sum_{i=1}^{n} D_i^2\right)}, \tag{3}$$
where $\max\left(\sum_{i=1}^{n} D_i^2\right) = (n^3 - n)/3$ is attained when $S_i = n + 1 - R_i$, $i = 1, \ldots, n$. Throughout the rest of the paper we assume, without loss of generality, that the sample pairs are arranged in increasing order of the X components, so that $R_i = i$ for $i = 1, 2, \ldots, n$ and $D_i = S_i - i$.

According to Blest's idea [2], if the points $(0,0)$ and $\left(\sum_{i=1}^{k} (n+1-i), \sum_{i=1}^{k} (S_i - i)\right)$, $k = 1, \ldots, n$, are plotted in the coordinate plane, then Spearman's rho is a normalized version of the sum of the bars determined by these points, i.e. $\sum_{k=1}^{n} \sum_{i=1}^{k} (S_i - i)$, taken as a measure of the disarray of the originally ordered data:
$$\rho_{n,s} = 1 - \frac{2 \sum_{k=1}^{n} \sum_{i=1}^{k} (S_i - i)}{\max\left(\sum_{k=1}^{n} \sum_{i=1}^{k} (S_i - i)\right)}. \tag{4}$$
By changing the order of summation, it is easy to see that $\sum_{k=1}^{n} \sum_{i=1}^{k} (S_i - i) = \frac{1}{2} \sum_{i=1}^{n} D_i^2$. While (4) and (3) are two different representations of the ordinary Spearman's rho, Blest's index is a normalized version of the area obtained by connecting the mentioned points to each other. Looking again at Blest's index, one can imagine that each bar $\eta_k = \sum_{i=1}^{k} (S_i - i)$, $k = 1, \ldots, n$, is assigned a certain weight, in contrast to Spearman's rho, which gives the same weight to every bar. We now consider a general class of WRC measures of the form
$$\nu_n = 1 - \frac{2 \sum_{i=1}^{n} w_i \eta_i}{\max\left(\sum_{i=1}^{n} w_i \eta_i\right)}, \tag{5}$$
where the positive constants $w_i$ are suitable weights.

To construct WRC measures which are sensitive to agreement on the top rankings (lower ranks), for $p = 1, 2, \ldots$ and $n > 1$ we choose the weights $w_i = (n+1-i)^p - (n-i)^p$. The class of WRC measures constructed by (5) is then
$$\nu^{(l)}_{n,p} = 1 + \frac{2 \sum_{i=1}^{n} (i - S_i)(n+1-i)^p}{\sum_{i=1}^{n} (n+1-2i)(n+1-i)^p}. \tag{6}$$
Alternatively, by choosing the weights $w_i = i^p - (i-1)^p$, one obtains measures which are sensitive to agreement on the bottom rankings (upper ranks). In this case the class of WRC measures (5) simplifies to
$$\nu^{(u)}_{n,p} = 1 + \frac{2 \sum_{i=1}^{n} (i - S_i)\left(n^p - (i-1)^p\right)}{\sum_{i=1}^{n} (n+1-2i)\left(n^p - (i-1)^p\right)}. \tag{7}$$
Let $\kappa_{n,p} = \sum_{i=1}^{n} i^p$. It is easily seen that
$$\sum_{i=1}^{n} (n+1-2i)(n+1-i)^p = 2\kappa_{n,p+1} - (n+1)\kappa_{n,p}$$
and
$$\sum_{i=1}^{n} (n+1-2i)\left(n^p - (i-1)^p\right) = 2\kappa_{n-1,p+1} - (n-1)\kappa_{n-1,p}.$$
The measures (6) and (7) can therefore be written in terms of $\kappa_{n,p}$ as
$$\nu^{(l)}_{n,p} = \frac{(n+1)\kappa_{n,p} - 2\sum_{i=1}^{n} S_i (n+1-i)^p}{2\kappa_{n,p+1} - (n+1)\kappa_{n,p}}, \tag{8}$$
and
$$\nu^{(u)}_{n,p} = \frac{-(n+1)\kappa_{n-1,p} + 2\sum_{i=1}^{n} S_i (i-1)^p}{2\kappa_{n-1,p+1} - (n-1)\kappa_{n-1,p}}. \tag{9}$$
Note that for $p = 1$, both $\nu^{(l)}_{n,p}$ and $\nu^{(u)}_{n,p}$ reduce to Spearman's rank correlation coefficient (1). For $p = 2$, the measure $\nu^{(l)}_{n,p}$ reduces to Blest's rank correlation coefficient (2). The coefficients $\nu^{(l)}_{n,p}$ and $\nu^{(u)}_{n,p}$ are asymmetric WRC measures; that is, the correlation of $(X, Y)$ is not the same as that of $(Y, X)$. One can obtain the symmetrized version of (6) as follows:
$$\nu^{(s.l)}_{n,p} = \frac{\nu^{(l)}_{n,p}(X,Y) + \nu^{(l)}_{n,p}(Y,X)}{2} = \frac{(n+1)\kappa_{n,p} - \sum_{i=1}^{n} \left[S_i (n+1-i)^p + i (n+1-S_i)^p\right]}{2\kappa_{n,p+1} - (n+1)\kappa_{n,p}}.$$
Similarly, the symmetrized version of (7) is given by
$$\nu^{(s.u)}_{n,p} = \frac{-(n+1)\kappa_{n-1,p} + \sum_{i=1}^{n} \left[S_i (i-1)^p + i (S_i-1)^p\right]}{2\kappa_{n-1,p+1} - (n-1)\kappa_{n-1,p}}.$$
For $p = 1$, $\nu^{(s.l)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$ are equal to Spearman's rank correlation (1). For $p = 2$, $\nu^{(s.l)}_{n,p}$ is the symmetrized version of Blest's index (2). Table 2 shows the values of $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$ and their symmetrized versions for the rankings in Table 1. The measures $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$ and their symmetrized versions take values in $[-1, 1]$. In particular, the value of these measures is equal to 1 when $S_i = i$ (a perfect positive dependency between two sets of ranks) and they take the value $-1$ when $S_i = n+1-i$ (a perfect negative dependency between two sets of ranks).

Table 2: Values of the Spearman's rho and WRC measures for three sets of rankings in Table 1.
Columns: p; then $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$, $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$ for (A,C); then the same four measures for (A,B).

In this section we introduce the weighted correlation coefficients $\nu^{(l)}_p$ and $\nu^{(u)}_p$ and their symmetrized versions $\nu^{(s.l)}_p$ and $\nu^{(s.u)}_p$ as the population counterparts of the WRC measures $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$, $\nu^{(s.l)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$. Each of these coefficients can be expressed as a linear functional of the quantity $A(u,v) = C(u,v) - \Pi(u,v)$, where $C$ is the copula [21] associated with the pair $(X, Y)$ and $\Pi(u,v) = uv$ is the copula of independent random variables. For $p = 1, 2, \ldots$, we have
$$\begin{aligned}
\nu^{(l)}_p &= 2(p+1)(p+2) \int_0^1\!\!\int_0^1 (1-u)^{p-1} \left(C(u,v) - uv\right) du\,dv, \\
\nu^{(u)}_p &= 2(p+1)(p+2) \int_0^1\!\!\int_0^1 u^{p-1} \left(C(u,v) - uv\right) du\,dv, \\
\nu^{(s.u)}_p &= (p+1)(p+2) \int_0^1\!\!\int_0^1 \left(u^{p-1} + v^{p-1}\right) \left(C(u,v) - uv\right) du\,dv, \\
\nu^{(s.l)}_p &= (p+1)(p+2) \int_0^1\!\!\int_0^1 \left((1-u)^{p-1} + (1-v)^{p-1}\right) \left(C(u,v) - uv\right) du\,dv.
\end{aligned} \tag{10}$$
Note that for $p = 1$, all of these coefficients reduce to Spearman's rho, given by
$$\rho_s = 12 \int_0^1\!\!\int_0^1 \left(C(u,v) - uv\right) du\,dv.$$
For $p = 2$, the coefficient $\nu^{(l)}_p$ reduces to Blest's correlation coefficient [5], given by
$$\gamma = 24 \int_0^1\!\!\int_0^1 (1-u)\, C(u,v)\, du\,dv - 2.$$
Remark 1.
A probabilistic interpretation can be given for the weighted correlation coefficients $\nu^{(l)}_p$ and $\nu^{(u)}_p$ and their symmetrized versions $\nu^{(s.l)}_p$ and $\nu^{(s.u)}_p$. We provide an illustration for $\nu^{(l)}_p$. For $p = 1, 2, \ldots$ consider the cumulative distribution function
$$F_p(u,v) = \left(1 - (1-u)^p\right) v, \qquad 0 \le u, v \le 1.$$
Let $(U, V)$ be a random vector with the joint distribution function $F_p$. For a copula $C$, the coefficient $\nu^{(l)}_p$ has the following representation:
$$\nu^{(l)}_p = \frac{2(p+1)(p+2)}{p} \int_0^1\!\!\int_0^1 \left(C(u,v) - uv\right) dF_p(u,v) = \frac{E_{F_p}\left[C(U,V) - \Pi(U,V)\right]}{E_{F_p}\left[M(U,V) - \Pi(U,V)\right]},$$
where $M(u,v) = \min(u,v)$ and $E_{F_p}$ denotes the expectation with respect to $F_p$. Thus, the coefficient $\nu^{(l)}_p$ can be considered as an average distance between the copula $C$ and the independence copula $\Pi$, where the average is taken with respect to the bivariate distribution function $F_p$. The proposed weighted correlation coefficients can be seen as average quadrant dependent (AQD) measures of association, studied in [1].

For $\nu^{(l)}_p$, it is more convenient to use the following alternative representation:
$$\nu^{(l)}_p = \frac{2(p+1)(p+2)}{p} \int_0^1\!\!\int_0^1 (1-u)^p (1-v)\, dC(u,v) - \frac{p+2}{p}, \tag{11}$$
which follows from the fact that
$$\int_0^1\!\!\int_0^1 C(u,v)\, dF_p(u,v) = P(W \le U, Z \le V) = P(U \ge W, V \ge Z) = \int_0^1\!\!\int_0^1 \bar{F}_p(u,v)\, dC(u,v),$$
where $(W, Z)$ and $(U, V)$ are two independent pairs distributed as the copula $C$ and the joint distribution $F_p$, respectively, and $\bar{F}_p(u,v) = P(U > u, V > v) = (1-u)^p (1-v)$ is the survival function associated with $F_p$. An alternative representation for $\nu^{(u)}_p$ is given by
$$\nu^{(u)}_p = \frac{2(p+1)(p+2)}{p} \int_0^1\!\!\int_0^1 (1-u^p)(1-v)\, dC(u,v) - (p+2). \tag{12}$$
In the following examples we provide the values of the weighted correlation coefficients $\nu^{(l)}_p$ and $\nu^{(u)}_p$ for some copulas. Example 1.
Let $C_\theta$ be a member of the Cuadras-Augé family of copulas [12] given by
$$C_\theta(u,v) = [\min(u,v)]^{\theta} [uv]^{1-\theta}, \qquad \theta \in [0,1]. \tag{13}$$
This family of copulas is positively ordered in $\theta \in [0,1]$; that is, for $\theta_1 \le \theta_2$ we have $C_{\theta_1}(u,v) \le C_{\theta_2}(u,v)$ for all $u, v \in [0,1]$. This family of copulas has no lower tail dependence, whereas the upper tail dependence parameter is given by $\lambda_U = \theta$ [12]. For this family of copulas, direct integration in (10) gives
$$\nu^{(u)}_p = \frac{(p+2)\,\theta}{p+3-\theta}, \tag{14}$$
and
$$\nu^{(l)}_p = \frac{\theta (p+2) \left(1 - p(p+1) B(p, 4-\theta)\right)}{p (2-\theta)}, \tag{15}$$
where $B(a,b) = \int_0^1 x^{a-1} (1-x)^{b-1} dx$ is the beta function. For $p = 1$ both expressions reduce to $\rho_s = 3\theta/(4-\theta)$. For every $\theta \in [0,1]$, $\nu^{(u)}_p$ is increasing in $p$ and $\nu^{(l)}_p$ is decreasing in $p$. The values of $\nu^{(l)}_p$ and $\nu^{(u)}_p$ as functions of $\theta$ are plotted in Figure 1 for selected values of $p$ up to $p = 10$. In particular, for this family of copulas it follows that for every $\theta \in [0,1]$ and $p = 1, 2, \ldots$,
$$\nu^{(l)}_p \le \nu^{(l)}_1 = \rho_s = \nu^{(u)}_1 \le \nu^{(u)}_p.$$

Figure 1: The values of $\nu^{(l)}_p$ and $\nu^{(u)}_p$, for selected values of $p$ up to 10, for the Cuadras-Augé family of copulas.

Example 2.
Let $C_\theta$ be a member of the Raftery family of copulas [12] given by
$$C_\theta(u,v) = \min(u,v) + \frac{1-\theta}{1+\theta} (uv)^{\frac{1}{1-\theta}} \left\{1 - [\max(u,v)]^{-\frac{1+\theta}{1-\theta}}\right\}, \qquad \theta \in [0,1].$$
This family of copulas is also positively ordered in $\theta \in [0,1]$ and has no upper tail dependence, whereas the lower tail dependence parameter is given by $\lambda_L = \frac{2\theta}{\theta+1}$ [12]. The values of $\nu^{(l)}_p$ and $\nu^{(u)}_p$ as functions of $\theta$ are plotted in Figure 2 for selected values of $p$ up to $p = 10$. For this family of copulas, as we see, for every $\theta \in [0,1]$ and $p = 1, 2, \ldots$,
$$\nu^{(u)}_p \le \nu^{(u)}_1 = \rho_s = \nu^{(l)}_1 \le \nu^{(l)}_p.$$

Figure 2: The values of $\nu^{(l)}_p$ and $\nu^{(u)}_p$, for selected values of $p$ up to 10, for the Raftery family of copulas (left panel: $\rho$ and $\nu^{(l)}_p$ against $\theta$; right panel: $\rho$ and $\nu^{(u)}_p$ against $\theta$).

The asymptotic behavior of the proposed WRC measures can in general be studied by standard results from the theory of empirical processes [22]. Before that, we mention the asymptotic formula for $\kappa_{n,p} = \sum_{i=1}^{n} i^p$ that we need in the sequel. By the definition of the Riemann integral, it holds that
$$\frac{\kappa_{n,p}}{n^{p+1}} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{i}{n}\right)^p = \int_0^1 x^p\, dx + O(n^{-1}). \tag{16}$$
Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be a random sample of size $n$ from a pair $(X, Y)$ of continuous random variables with joint distribution function $H$, marginal distribution functions $F$ and $G$ and associated copula $C$. Let $(1, S_1), (2, S_2), \ldots, (n, S_n)$ be the ranks of the rearranged sample. It is known (Rüschendorf, [16]) that the copula $C$ can be estimated by the empirical copula, defined for all $u, v \in [0,1]$ by
$$C_n(u,v) = \frac{1}{n} \sum_{i=1}^{n} I\left(\frac{i}{n+1} \le u, \frac{S_i}{n+1} \le v\right),$$
where $I(A)$ denotes the indicator function of the set $A$. The empirical versions of the weighted correlation coefficients $\nu^{(l)}_p$ and $\nu^{(u)}_p$ and their symmetrized versions $\nu^{(s.l)}_p$ and $\nu^{(s.u)}_p$, defined by (10), can be written in terms of the empirical copula $C_n$. By plugging $C_n$ into (11), the empirical version of $\nu^{(l)}_p$ is of the form
$$\tilde{\nu}^{(l)}_{n,p} = \frac{2(p+1)(p+2)}{p} \int_0^1\!\!\int_0^1 (1-u)^p (1-v)\, dC_n(u,v) - \frac{p+2}{p}.$$
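The rank-based measures (8) and (9) and the plug-in estimator obtained from representation (11) can be compared numerically. The sketch below is a minimal illustration, assuming the sample is already sorted by the X component so that the input list holds $S_1, \ldots, S_n$; the function names and the test permutations are hypothetical choices made for this illustration.

```python
def kappa(n, p):
    """kappa_{n,p}: the sum of i**p for i = 1..n (zero when n = 0)."""
    return sum(i**p for i in range(1, n + 1))


def wrc_lower(s, p):
    """nu^(l)_{n,p} via formula (8); s[i-1] is the Y-rank of the item with X-rank i."""
    n = len(s)
    num = (n + 1) * kappa(n, p) - 2 * sum(
        si * (n + 1 - i) ** p for i, si in enumerate(s, 1)
    )
    den = 2 * kappa(n, p + 1) - (n + 1) * kappa(n, p)
    return num / den


def wrc_upper(s, p):
    """nu^(u)_{n,p} via formula (9)."""
    n = len(s)
    num = -(n + 1) * kappa(n - 1, p) + 2 * sum(
        si * (i - 1) ** p for i, si in enumerate(s, 1)
    )
    den = 2 * kappa(n - 1, p + 1) - (n - 1) * kappa(n - 1, p)
    return num / den


def plug_in_lower(s, p):
    """Representation (11) with C replaced by the empirical copula C_n."""
    n = len(s)
    mean = sum(
        (1 - i / (n + 1)) ** p * (1 - si / (n + 1)) for i, si in enumerate(s, 1)
    ) / n
    return 2 * (p + 1) * (p + 2) / p * mean - (p + 2) / p


# Hypothetical permutations: a swap at the top and a swap at the bottom.
top = [2, 1, 3, 4, 5, 6, 7, 8, 9]
bottom = [1, 2, 3, 4, 5, 6, 7, 9, 8]
for p in (1, 2, 3):
    print(p, wrc_lower(top, p), wrc_lower(bottom, p),
          wrc_upper(top, p), wrc_upper(bottom, p))
```

Under these assumptions the lower measure penalizes the swap at the top more heavily as $p$ grows, the upper measure does the same for the swap at the bottom, and for long rank vectors `plug_in_lower` and `wrc_lower` differ only by a term of order $1/n$.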
By using the representation (8) and the identity (16), straightforward calculations give
$$\begin{aligned}
\tilde{\nu}^{(l)}_{n,p} &= \frac{2(p+1)(p+2)}{np} \sum_{i=1}^{n} \left(1 - \frac{i}{n+1}\right)^p \left(1 - \frac{S_i}{n+1}\right) - \frac{p+2}{p} \\
&= \frac{2(p+1)(p+2)\kappa_{n,p}}{np(n+1)^p} - \frac{2(p+1)(p+2)}{np(n+1)^{p+1}} \sum_{i=1}^{n} S_i (n+1-i)^p - \frac{p+2}{p} \\
&= \frac{(p+1)(p+2)\kappa_{n,p}}{np(n+1)^p} + \frac{(p+1)(p+2)\left(2\kappa_{n,p+1} - (n+1)\kappa_{n,p}\right)}{np(n+1)^{p+1}}\, \nu^{(l)}_{n,p} - \frac{p+2}{p} \\
&= \left(1 + O(n^{-1})\right) \nu^{(l)}_{n,p} + O(n^{-1}).
\end{aligned} \tag{17}$$
By using (12), a similar argument shows that the empirical version of the coefficient $\nu^{(u)}_p$ is given by
$$\tilde{\nu}^{(u)}_{n,p} = \frac{2(p+1)(p+2)}{np} \sum_{i=1}^{n} \left(1 - \left(\frac{i}{n+1}\right)^p\right) \left(1 - \frac{S_i}{n+1}\right) - (p+2) = \left(1 + O(n^{-1})\right) \nu^{(u)}_{n,p} + O(n^{-1}). \tag{18}$$
In the following we provide the asymptotic distribution of the WRC measures $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$ and their symmetrized versions $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$. As shown by Segers [18], $C_n$ converges weakly to $C$ as $n \to \infty$ whenever $C$ is regular, that is, the partial derivatives $C_1(u,v) = \partial C(u,v)/\partial u$ and $C_2(u,v) = \partial C(u,v)/\partial v$ exist everywhere on $[0,1]^2$ and $C_1$ and $C_2$ are continuous on $(0,1) \times [0,1]$ and $[0,1] \times (0,1)$, respectively. Moreover, the empirical copula process $\mathbb{C}_n = \sqrt{n}(C_n - C)$ converges weakly, as $n \to \infty$, to a centered Gaussian process $\hat{C}$ on $[0,1]^2$, defined for all $u, v \in [0,1]$ by
$$\hat{C}(u,v) = \mathbb{C}(u,v) - \frac{\partial}{\partial u} C(u,v)\, \mathbb{C}(u,1) - \frac{\partial}{\partial v} C(u,v)\, \mathbb{C}(1,v), \tag{19}$$
where $\mathbb{C}(u,v)$ is a Brownian bridge on $[0,1]^2$ with covariance function
$$E\left(\mathbb{C}(u,v)\, \mathbb{C}(s,t)\right) = C(\min(u,s), \min(v,t)) - C(u,v)\, C(s,t).$$
Theorem 1.
Suppose that $C$ is a regular copula. Then $\sqrt{n}(\nu^{(l)}_{n,p} - \nu^{(l)}_p)$, $\sqrt{n}(\nu^{(u)}_{n,p} - \nu^{(u)}_p)$, $\sqrt{n}(\nu^{(s.u)}_{n,p} - \nu^{(s.u)}_p)$ and $\sqrt{n}(\nu^{(s.l)}_{n,p} - \nu^{(s.l)}_p)$ are asymptotically centered normal with asymptotic variances given by
$$(\sigma^{(l)}_p)^2 = 4(p+1)^2(p+2)^2 \int_{[0,1]^4} (1-u)^{p-1} (1-s)^{p-1} E\left(\hat{C}(u,v) \hat{C}(s,t)\right) du\,dv\,ds\,dt, \tag{20}$$
$$(\sigma^{(u)}_p)^2 = 4(p+1)^2(p+2)^2 \int_{[0,1]^4} u^{p-1} s^{p-1} E\left(\hat{C}(u,v) \hat{C}(s,t)\right) du\,dv\,ds\,dt,$$
$$(\sigma^{(s.u)}_p)^2 = (p+1)^2(p+2)^2 \int_{[0,1]^4} \left(u^{p-1} + v^{p-1}\right)\left(s^{p-1} + t^{p-1}\right) E\left(\hat{C}(u,v) \hat{C}(s,t)\right) du\,dv\,ds\,dt,$$
$$(\sigma^{(s.l)}_p)^2 = (p+1)^2(p+2)^2 \int_{[0,1]^4} \left((1-u)^{p-1} + (1-v)^{p-1}\right)\left((1-s)^{p-1} + (1-t)^{p-1}\right) E\left(\hat{C}(u,v) \hat{C}(s,t)\right) du\,dv\,ds\,dt,$$
where $\hat{C}$ is the Gaussian process defined by (19).

Proof. We prove the result for $\sqrt{n}(\nu^{(l)}_{n,p} - \nu^{(l)}_p)$; similar arguments hold for $\sqrt{n}(\nu^{(u)}_{n,p} - \nu^{(u)}_p)$, $\sqrt{n}(\nu^{(s.u)}_{n,p} - \nu^{(s.u)}_p)$ and $\sqrt{n}(\nu^{(s.l)}_{n,p} - \nu^{(s.l)}_p)$. From (17) we have
$$\begin{aligned}
\sqrt{n}(\nu^{(l)}_{n,p} - \nu^{(l)}_p) &= \left(1 + O(n^{-1})\right) \sqrt{n}(\tilde{\nu}^{(l)}_{n,p} - \nu^{(l)}_p) + O(n^{-1/2}) \\
&= \left(1 + O(n^{-1})\right) 2(p+1)(p+2) \int_0^1\!\!\int_0^1 (1-u)^{p-1} \left[\sqrt{n}\left(C_n(u,v) - C(u,v)\right)\right] du\,dv + O(n^{-1/2}).
\end{aligned}$$
Since the integral on the right-hand side is a linear and continuous functional of the empirical copula process, the left-hand side is asymptotically centered normal with asymptotic variance $(\sigma^{(l)}_p)^2$, as stated in the theorem. Corollary 2.
Assume the null hypothesis of independence, i.e. $C(u,v) = uv$. Then $\sqrt{n}\,\nu^{(l)}_{n,p}$, $\sqrt{n}\,\nu^{(u)}_{n,p}$, $\sqrt{n}\,\nu^{(s.l)}_{n,p}$ and $\sqrt{n}\,\nu^{(s.u)}_{n,p}$ are asymptotically centered normal with asymptotic standard deviations
$$\sigma^{(l)}_p = \sigma^{(u)}_p = \frac{p+2}{\sqrt{3(2p+1)}} \quad \text{and} \quad \sigma^{(s.l)}_p = \sigma^{(s.u)}_p = \sqrt{\frac{p^2 + 10p + 7}{6(2p+1)}}.$$
Proof. For $C(u,v) = uv$ the covariance function of the limiting Gaussian process $\hat{C}$ takes the form $E(\hat{C}(u,v) \hat{C}(s,t)) = (\min(u,s) - us)(\min(v,t) - vt)$. An application of Theorem 1 and a routine integration give the result.

In order to use the proposed WRC measures for testing independence, one needs to find their distributions, or the quantiles of their distributions, under the hypothesis of independence. The following result provides the expectation and variance of $\nu^{(l)}_{n,p}$. Similar results can be found for $\nu^{(u)}_{n,p}$, $\nu^{(s.l)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$. Theorem 3.
Under the hypothesis of independence between two sets of ranks,
$$E(\nu^{(l)}_{n,p}) = 0, \qquad \mathrm{var}(\nu^{(l)}_{n,p}) = \frac{n(n+1)\kappa_{n,2p} - (n+1)(\kappa_{n,p})^2}{3\left(2\kappa_{n,p+1} - (n+1)\kappa_{n,p}\right)^2}.$$
Proof.
We note that the WRC measure $\nu^{(l)}_{n,p}$ given by (6) can be written as a linear function of a linear rank statistic, of the form $a_n + b_n \sum_{i=1}^{n} a(i, S_i)$, where
$$a_n = 1 + 2\left(\sum_{i=1}^{n} i (n+1-i)^p\right) \left(\sum_{i=1}^{n} (n+1-2i)(n+1-i)^p\right)^{-1}, \qquad b_n = -2\left(\sum_{i=1}^{n} (n+1-2i)(n+1-i)^p\right)^{-1},$$
and $a(i, S_i) = S_i (n+1-i)^p$. The mean and the variance of the quantity $S = \sum_{i=1}^{n} a(i, S_i)$ can be obtained, for example, by using Theorem 1 on p. 57 of [20]. See also [6].

Table 3: Variance of the normalized WRC measures $\sqrt{n}\,\nu^{(l)}_{n,p}$, $\sqrt{n}\,\nu^{(u)}_{n,p}$, $\sqrt{n}\,\nu^{(s.l)}_{n,p}$ and $\sqrt{n}\,\nu^{(s.u)}_{n,p}$, under the assumption of independence, for p = 1, 2, 3, 4, 5.

Index                          Exact variance                          Asymptotic variance
$\sqrt{n}\,\nu^{(l)}_{n,1}$    $n/(n-1)$                               1
$\sqrt{n}\,\nu^{(l)}_{n,2}$    $n(2n+1)(8n+11)/(15(n+1)^2(n-1))$       16/15
$\sqrt{n}\,\nu^{(u)}_{n,2}$    …                                       16/15
$\sqrt{n}\,\nu^{(s.l)}_{n,2}$  …                                       31/30
$\sqrt{n}\,\nu^{(s.u)}_{n,2}$  …                                       31/30
$\sqrt{n}\,\nu^{(l)}_{n,3}$    …                                       25/21
$\sqrt{n}\,\nu^{(u)}_{n,3}$    …                                       25/21
$\sqrt{n}\,\nu^{(s.l)}_{n,3}$  …                                       23/21
$\sqrt{n}\,\nu^{(s.u)}_{n,3}$  …                                       23/21
$\sqrt{n}\,\nu^{(l)}_{n,4}$    …                                       4/3
$\sqrt{n}\,\nu^{(u)}_{n,4}$    …                                       4/3
$\sqrt{n}\,\nu^{(s.l)}_{n,4}$  …                                       7/6
$\sqrt{n}\,\nu^{(s.u)}_{n,4}$  …                                       7/6
$\sqrt{n}\,\nu^{(l)}_{n,5}$    …                                       49/33
$\sqrt{n}\,\nu^{(u)}_{n,5}$    …                                       49/33
$\sqrt{n}\,\nu^{(s.l)}_{n,5}$  …                                       41/33
$\sqrt{n}\,\nu^{(s.u)}_{n,5}$  …                                       41/33

The exact and asymptotic variances of the normalized WRC measures $\sqrt{n}\,\nu^{(l)}_{n,p}$, $\sqrt{n}\,\nu^{(u)}_{n,p}$, $\sqrt{n}\,\nu^{(s.l)}_{n,p}$ and $\sqrt{n}\,\nu^{(s.u)}_{n,p}$ under the assumption of independence are provided in Table 3, for p = 1, 2, 3, 4,
5. According to the results of Table 3, the variances of the symmetrized versions of the WRC measures appear to be smaller than those of their asymmetric versions, so the symmetrized versions are more appropriate for testing independence of two random variables. Under the assumption of independence, all $n!$ orderings of a set of ranks $(S_1, S_2, \ldots, S_n)$ are equally likely to occur. After calculating the value of the WRC measures between the two rankings $(1, 2, \ldots, n)$ and $(S_1, S_2, \ldots, S_n)$ for every ordering, their quantiles are obtained by noting that each atom of the discrete distribution has probability $1/n!$. Writing $N = n!$ and letting $x_i$ be the value of the specific WRC measure in the $i$-th ordering, $i = 1, 2, \ldots, N$, the $r$-quantile is calculated as $(1-\gamma) x_{(j)} + \gamma x_{(j+1)}$, where $j/N \le r < (j+1)/N$, $x_{(j)}$ is the $j$-th order statistic, and $\gamma = 0.5$ if $Nr = j$ and $\gamma = 1$ otherwise. This algorithm is discussed in Hyndman and Fan [7] as one of the common methods for calculating quantiles of discontinuous samples in statistical packages. The quantiles of the normalized WRC measures $\sqrt{n}\,\nu^{(l)}_{n,p}$, $\sqrt{n}\,\nu^{(u)}_{n,p}$ and their symmetrized versions $\sqrt{n}\,\nu^{(s.l)}_{n,p}$, $\sqrt{n}\,\nu^{(s.u)}_{n,p}$ for $p = 1, \ldots, 5$ and $n = 5, 6, \ldots, 10$ are given in Tables 4-5.

Table 4: The quantiles of normalized WRC measures $\sqrt{n}\,\nu^{(l)}_{n,p}$ and $\sqrt{n}\,\nu^{(s.l)}_{n,p}$ for $p = 1, \ldots, 5$ and $n = 5, 6, \ldots, 10$.

             $\sqrt{n}\,\nu^{(l)}_{n,p}$         $\sqrt{n}\,\nu^{(s.l)}_{n,p}$
n    p    90%    95%    97.5%  99%      90%    95%    97.5%  99%
5    1    1.565  1.789  2.012  2.012    1.565  1.789  2.012  2.012
5    2    1.565  1.845  2.012  2.124    1.565  1.826  2.012  2.124
5    3    1.600  1.909  2.045  2.185    1.618  1.931  2.030  2.185
5    4    1.658  1.984  2.116  2.214    1.636  1.984  2.097  2.214
5    5    1.730  2.041  2.161  2.226    1.676  2.041  2.147  2.226
6    1    1.470  1.890  2.030  2.170    1.470  1.890  2.030  2.170
6    2    1.550  1.830  2.040  2.230    1.550  1.850  2.030  2.210
6    3    1.599  1.897  2.103  2.275    1.599  1.934  2.101  2.275
6    4    1.674  1.974  2.177  2.340    1.628  1.974  2.161  2.340
6    5    1.791  1.985  2.227  2.368    1.727  1.972  2.233  2.368
7    1    1.465  1.795  1.984  2.268    1.465  1.795  1.984  2.268
7    2    1.500  1.831  2.079  2.268    1.488  1.819  2.079  2.268
7    3    1.569  1.891  2.129  2.323    1.551  1.877  2.112  2.323
7    4    1.656  1.980  2.196  2.378    1.620  1.954  2.204  2.366
7    5    1.725  2.078  2.253  2.436    1.707  2.019  2.248  2.424
8    1    1.414  1.751  2.020  2.290    1.414  1.751  2.020  2.290
8    2    1.467  1.818  2.073  2.312    1.452  1.818  2.065  2.312
8    3    1.541  1.895  2.144  2.372    1.514  1.885  2.129  2.367
8    4    1.617  1.991  2.214  2.443    1.572  1.968  2.198  2.431
8    5    1.704  2.067  2.291  2.500    1.644  2.035  2.275  2.481
9    1    1.400  1.750  2.050  2.300    1.400  1.750  2.050  2.300
9    2    1.445  1.800  2.070  2.330    1.430  1.795  2.070  2.330
9    3    1.523  1.884  2.152  2.404    1.488  1.861  2.137  2.395
9    4    1.611  1.973  2.238  2.479    1.548  1.938  2.212  2.466
9    5    1.694  2.061  2.314  2.550    1.612  2.019  2.284  2.532
10   1    1.399  1.744  2.012  2.319    1.399  1.744  2.012  2.319
10   2    1.434  1.789  2.072  2.350    1.416  1.779  2.065  2.343
10   3    1.511  1.873  2.156  2.429    1.469  1.844  2.137  2.416
10   4    1.599  1.965  2.245  2.511    1.528  1.920  2.216  2.491
10   5    1.683  2.054  2.333  2.587    1.589  1.998  2.298  2.561
∞         …

Efficiency of the Tests of Independence
In this section we compare the Pitman asymptotic relative efficiency (Pitman ARE) and the empirical power of tests of independence based on the proposed WRC measures.
Consider a parametric family $\{C_\theta\}$ of copulas with $\theta = \theta_0$ corresponding to the independence case. Let $T_{1n}$ and $T_{2n}$ be two test statistics for testing $H_0: \theta = \theta_0$ versus $H_1: \theta > \theta_0$ that reject the null hypothesis for large values of $T_{1n}$ and $T_{2n}$. Suppose that $T_{1n}$ and $T_{2n}$ satisfy the following regularity conditions:

(1) There exist continuous functions $\mu_i(\theta)$ and $\sigma_i(\theta)$, $\theta > \theta_0$, $i = 1, 2$, such that for $\theta_n = \theta_0 + h/\sqrt{n}$, $h > 0$, it holds that
$$\lim_{n \to \infty} P_{\theta_n}\left(\frac{\sqrt{n}\left(T_{in} - \mu_i(\theta_n)\right)}{\sigma_i(\theta_n)} < z\right) = \Phi(z), \qquad z \in \mathbb{R},\ i = 1, 2,$$
where $\Phi(z)$ is the standard normal distribution function;

(2) The function $\mu_i(\theta)$ is continuously differentiable at $\theta = \theta_0$, and $\mu_i'(\theta_0) > 0$, $\sigma_i(\theta_0) > 0$, $i = 1, 2$.

Under these conditions, the Pitman ARE of $T_{1n}$ relative to $T_{2n}$ is equal to
$$ARE(T_{1n}, T_{2n}) = \left[\frac{\frac{\partial}{\partial\theta} \mu_1 \big|_{\theta = \theta_0}}{\frac{\partial}{\partial\theta} \mu_2 \big|_{\theta = \theta_0}}\right]^2 \cdot \frac{\sigma_2^2(\theta_0)}{\sigma_1^2(\theta_0)}.$$
For more details see [13]. In the following we compute the ARE of the proposed WRC measures relative to Spearman's rho for the Cuadras-Augé family of copulas given by (13).
Table 5: The quantiles of normalized WRC measures $\sqrt{n}\,\nu^{(u)}_{n,p}$ and $\sqrt{n}\,\nu^{(s.u)}_{n,p}$ for $p = 1, \ldots, 5$ and $n = 5, 6, \ldots, 10$.

             $\sqrt{n}\,\nu^{(u)}_{n,p}$         $\sqrt{n}\,\nu^{(s.u)}_{n,p}$
n    p    90%    95%    97.5%  99%      90%    95%    97.5%  99%
5    1    1.565  1.789  2.012  2.012    1.565  1.789  2.012  2.012
5    2    1.593  1.873  2.012  2.180    1.565  1.901  2.012  2.180
5    3    1.663  1.982  2.120  2.222    1.633  1.982  2.098  2.222
5    4    1.749  2.053  2.176  2.232    1.690  2.053  2.162  2.232
5    5    1.865  2.102  2.205  2.235    1.814  2.102  2.197  2.235
6    1    1.470  1.890  2.030  2.170    1.470  1.890  2.030  2.170
6    2    1.582  1.834  2.058  2.254    1.582  1.862  2.072  2.254
6    3    1.664  1.976  2.158  2.332    1.623  1.973  2.141  2.332
6    4    1.794  1.992  2.223  2.368    1.722  1.983  2.232  2.368
6    5    1.818  2.090  2.288  2.395    1.786  2.022  2.300  2.395
7    1    1.465  1.795  1.984  2.268    1.465  1.795  1.984  2.268
7    2    1.528  1.858  2.095  2.284    1.528  1.858  2.087  2.284
7    3    1.611  1.940  2.182  2.365    1.579  1.936  2.187  2.357
7    4    1.715  2.070  2.249  2.427    1.676  2.012  2.240  2.410
7    5    1.818  2.137  2.316  2.487    1.763  2.078  2.332  2.461
8    1    1.414  1.751  2.020  2.290    1.414  1.751  2.020  2.290
8    2    1.491  1.847  2.097  2.338    1.472  1.838  2.088  2.338
8    3    1.578  1.946  2.183  2.417    1.546  1.925  2.167  2.413
8    4    1.675  2.044  2.270  2.491    1.620  2.023  2.238  2.472
8    5    1.764  2.132  2.354  2.543    1.713  2.078  2.341  2.528
9    1    1.400  1.750  2.050  2.300    1.400  1.750  2.050  2.300
9    2    1.469  1.825  2.094  2.356    1.444  1.812  2.094  2.350
9    3    1.563  1.925  2.195  2.442    1.516  1.897  2.176  2.434
9    4    1.662  2.027  2.289  2.527    1.587  1.990  2.260  2.507
9    5    1.745  2.123  2.374  2.603    1.660  2.073  2.338  2.584
10   1    1.399  1.744  2.012  2.319    1.399  1.744  2.012  2.319
10   2    1.450  1.812  2.093  2.374    1.429  1.795  2.081  2.366
10   3    1.548  1.912  2.195  2.467    1.493  1.874  2.171  2.450
10   4    1.646  2.014  2.294  2.558    1.560  1.964  2.262  2.531
10   5    1.737  2.111  2.390  2.637    1.630  2.046  2.351  2.609
∞         …

Example 3.
Suppose that the copula of $(X, Y)$ is a member of the Cuadras-Augé family of copulas given by (13) in Example 1. Let $T_{1n} = \nu^{(l)}_{n,p}$ and $T_{2n} = \rho_{n,s}$. By Theorem 1, for the test statistic based on the WRC measure $\nu^{(l)}_{n,p}$, regularity conditions (1) and (2) are satisfied with $\theta_0 = 0$, $\mu_p(\theta) = \nu^{(l)}_p$ and $\sigma^{(l)}_p$ given in Corollary 2 (for $\theta = 0$). By using (15) and differentiating with respect to $\theta$ one gets
$$\frac{d\nu^{(l)}_p}{d\theta}(0) = \frac{p+5}{2(p+3)}.$$
Since $\nu^{(l)}_1 = \rho_s$ and $\nu^{(l)}_{n,1} = \rho_{n,s}$, from Corollary 2 we have
$$ARE(\nu^{(l)}_{n,p}, \rho_{n,s}) = \frac{4(2p+1)(p+5)^2}{3(p+2)^2(p+3)^2}.$$
Similarly, for $\nu^{(s.l)}_{n,p}$, $\nu^{(u)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$ we have
$$ARE(\nu^{(s.l)}_{n,p}, \rho_{n,s}) = \frac{8(2p+1)(p+5)^2}{3(p+3)^2(p^2+10p+7)}, \qquad ARE(\nu^{(u)}_{n,p}, \rho_{n,s}) = \frac{16(2p+1)}{3(p+3)^2},$$
and
$$ARE(\nu^{(s.u)}_{n,p}, \rho_{n,s}) = \frac{32(2p+1)(p+2)^2}{3(p+3)^2(p^2+10p+7)}.$$
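The ARE expressions of Example 3 can be evaluated directly from the Pitman formula, which gives a quick consistency check for any value of $p$. The sketch below is an illustration under stated assumptions: the slopes at $\theta = 0$ are taken as $(p+5)/(2(p+3))$ for the lower measures and $(p+2)/(p+3)$ for the upper measures (obtained by differentiating the Cuadras-Augé expressions (15) and (14)), the variances come from Corollary 2, and all function names are hypothetical.

```python
from fractions import Fraction


def asym_var(p, symmetrized=False):
    """Asymptotic variance of sqrt(n) * nu_{n,p} under independence (Corollary 2)."""
    if symmetrized:
        return Fraction(p * p + 10 * p + 7, 6 * (2 * p + 1))
    return Fraction((p + 2) ** 2, 3 * (2 * p + 1))


def slope_cuadras_auge(p, kind):
    """d nu_p / d theta at theta = 0 for the Cuadras-Auge family (assumed closed forms)."""
    if kind == "l":
        return Fraction(p + 5, 2 * (p + 3))
    return Fraction(p + 2, p + 3)


def pitman_are(p, kind, symmetrized=False):
    """Pitman ARE of the WRC-based test relative to the test based on Spearman's rho."""
    efficacy = slope_cuadras_auge(p, kind) ** 2 / asym_var(p, symmetrized)
    reference = slope_cuadras_auge(1, "l") ** 2 / asym_var(1)  # Spearman's rho
    return efficacy / reference


for p in range(1, 6):
    print(p,
          float(pitman_are(p, "l")), float(pitman_are(p, "u")),
          float(pitman_are(p, "l", True)), float(pitman_are(p, "u", True)))
```

With these assumptions every measure has ARE equal to 1 at $p = 1$, and for $p > 1$ the upper-tail measures gain efficiency over Spearman's rho while the lower-tail measures lose it, in line with the upper tail dependence of this family.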
Table 6 shows the ARE of the tests of independence based on the WRC measures, relative to the test based on Spearman's rho, for the Clayton copula, as a family of copulas with lower tail dependence, and for the Cuadras-Augé family of copulas, as a family of copulas with upper tail dependence. As we see, for the Clayton family of copulas the largest Pitman ARE is attained by a symmetrized lower measure $\nu^{(s.l)}_{n,p}$, and for the Cuadras-Augé family of copulas by a symmetrized upper measure $\nu^{(s.u)}_{n,p}$. The same results remain true when Kendall's $\tau$ is used instead of Spearman's $\rho$, since $ARE(T_\tau, T_\rho) = 1$.

Table 6: ARE of the tests of independence based on the WRC measures $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$, $\nu^{(s.l)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$, for $p = 1, 2, \ldots, 13$, relative to Spearman's rho, for the Cuadras-Augé and Clayton families of copulas.
Columns: p; then $\nu^{(l)}_{n,p}$, $\nu^{(u)}_{n,p}$, $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$ for Cuadras-Augé; then the same four measures for Clayton.

In the following we compare the power of the tests based on the WRC measures $\nu^{(s.l)}_{n,p}$ and $\nu^{(s.u)}_{n,p}$ for $p = 2, 3, 4, 5$ for testing $H_0: C(u,v) = uv$ against $H_1: C(u,v) > uv$. Monte Carlo simulations were carried out for the Gumbel, Clayton, Frank and Normal copulas with various degrees of dependence, with samples of size $n = 50$ at a fixed significance level. Each table reports the copula parameter $\theta$ and the value of the corresponding Spearman's $\rho_s$ (as the level of dependence). Tables 7-10 show the power of the tests obtained under alternatives defined by the Gumbel, Clayton, Frank and Normal copulas. The Clayton copula has lower tail dependence, the Gumbel copula has upper tail dependence and the Normal copula has neither. We see that for the Clayton family of copulas, for all degrees of dependence in terms of Spearman's rho $(\rho_s)$, a test based on a symmetrized lower measure $\nu^{(s.l)}_{n,p}$ has the maximum power. For the Gumbel family of copulas, a test based on a symmetrized upper measure $\nu^{(s.u)}_{n,p}$ has the maximum power. For the Normal family of copulas the behavior of all tests of independence is the same. It seems that the members of the proposed class $\nu^{(u)}_{n,p}$ defined by (7) perform very well, compared with Kendall's tau and Spearman's rho, if there exists a higher dependence in the upper tail. If there exists a higher dependence in the lower tail, the members of the proposed class $\nu^{(l)}_{n,p}$ defined by (6) have a better performance.

Table 7: The power of tests of independence based on Kendall's tau ($\tau_n$), Spearman's rho ($\rho_{n,s}$) and the WRC measures $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$, $p = 2, 3, 4, 5$, computed from 50,000 samples of size 50 for the Clayton copula with different values of the parameter ($\theta$) and different levels of dependence in terms of Spearman's rho ($\rho_s$).
Columns: $\theta$; $\rho_s$; four columns for $\nu^{(s.l)}_{n,p}$; four columns for $\nu^{(s.u)}_{n,p}$; $\rho_{n,s}$; $\tau_n$.

Table 8: The power of tests of independence based on Kendall's tau ($\tau_n$), Spearman's rho ($\rho_{n,s}$) and the WRC measures $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$, $p = 2, 3, 4, 5$, computed from 50,000 samples of size 50 for the Gumbel copula with different values of the parameter ($\theta$) and different levels of dependence in terms of Spearman's rho ($\rho_s$).
Columns: $\theta$; $\rho_s$; four columns for $\nu^{(s.l)}_{n,p}$; four columns for $\nu^{(s.u)}_{n,p}$; $\rho_{n,s}$; $\tau_n$.

Table: The power of tests of independence based on Kendall's tau ($\tau_n$), Spearman's rho ($\rho_{n,s}$) and the WRC measures $\nu^{(s.l)}_{n,p}$, $\nu^{(s.u)}_{n,p}$, $p = 2, 3, 4, 5$, computed from 50,000 samples of size 50 for the Normal copula with different values of the parameter ($\theta$) and different levels of dependence in terms of Spearman's rho ($\rho_s$).
Columns: $\theta$; $\rho_s$; four columns for $\nu^{(s.l)}_{n,p}$; four columns for $\nu^{(s.u)}_{n,p}$; $\rho_{n,s}$; $\tau_n$.

In this paper we have presented a class of weighted rank correlation measures extending Spearman's rank correlation coefficient. The proposed class was constructed by giving suitable weights to the distance between two sets of ranks, to place more emphasis on items having low rankings than on those having high rankings, or vice versa. The asymptotic distributions of the proposed measures, in general and under the null hypothesis of independence, were derived. We also carried out a simulation study to compare the performance of the proposed measures with Spearman's and Kendall's rank correlation measures. Another line of research is the extension of the results to situations where n objects are ranked by m > 2 sources.

References
[1] J. Behboodian, A. Dolati and M. Ubeda-Flores, Measures of association based on average quadrant dependence, J. Probab. Statist. Sci. 3 (2005), pp. 161–173.
[2] D. C. Blest, Rank correlation: an alternative measure, Austr. NZ J. Statist. 42 (2000), pp. 101–111.
[3] T. Coolen-Maturi, A new weighted rank coefficient of concordance, J. Appl. Statist. 41 (2014), pp. 1721–1745.
[4] L. Dancelli, M. Manisera and M. Vezzoli, On two classes of Weighted Rank Correlation measures deriving from the Spearman's ρ