A Robust Spearman Correlation Coefficient Permutation Test
Han Yu*
Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263

Alan D. Hutson
Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263
[email protected]

Abstract
In this work, we show that the tests of Spearman's correlation coefficient about H_0: ρ_s = 0 found in most statistical software packages are theoretically incorrect and perform poorly when bivariate normality assumptions are not met or the sample size is small. The historical works on these tests make an unverifiable assumption that approximate bivariate normality of the original data justifies using classic approximations. In general, there is a common misconception that tests about ρ_s = 0 are robust to deviations from bivariate normality. In fact, we found that under certain scenarios violation of the bivariate normality assumption has severe effects on type I error control for the most commonly utilized tests. To address this issue, we developed a robust permutation test of the general hypothesis H_0: ρ_s = 0. The proposed test is based on an appropriately studentized statistic. We show that the test is theoretically asymptotically valid in the general setting where two paired variables are uncorrelated but dependent. This desired property was demonstrated across a range of distributional assumptions and sample sizes in simulation studies, where the proposed test exhibits robust type I error control across a variety of settings, even when the sample size is small. We demonstrate the application of this test in real-world examples of transcriptomic data from TCGA breast cancer patients and a data set of PSA levels and age.

Keywords: rank correlation · studentized · small sample · non-normality

The concept of correlation and regression was originally conceived by Galton when studying how strongly the characteristics of one generation of living things manifested in the following generation [1]. The ideas prompting the development of a more mathematically rigorous treatment of correlation were developed by Karl Pearson in 1896, which yielded the well-known Pearson product moment correlation coefficient [2], given as

    ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y) = E[(X − µ_X)(Y − µ_Y)] / √( E(X − µ_X)² E(Y − µ_Y)² ),   (1)

where X and Y are two random variables from a non-degenerate joint distribution F_XY, Cov(X, Y) denotes the covariance, and µ_X, µ_Y and σ_X, σ_Y are the population means and standard deviations, respectively. If we let (X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n) denote n paired i.i.d. observations, then the sample Pearson correlation coefficient is given as

    r(X, Y) = Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / ( √(Σ_{i=1}^n (X_i − X̄)²) √(Σ_{i=1}^n (Y_i − Ȳ)²) ).   (2)

Shortly after Pearson's work was published, Spearman introduced the rank correlation coefficient in 1904, with the advantages of being robust to extreme values and to disparities of distributions between the two variables [3]. It should be noted, however, that K. Pearson, in his biography of Galton, says that the latter "dealt with the correlation of ranks before he even reached the correlation of variates, i.e. about 1875", but Galton apparently published nothing explicitly [4]. Mathematically, Spearman's correlation coefficient is defined as the Pearson correlation coefficient on the ranks of (X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n) and denoted as

    ρ_s(X, Y) = ρ(a, b),   (3)

where a_i = Rank(X_i) and b_i = Rank(Y_i) are the ranks of X_i and Y_i, respectively, i = 1, 2, …, n. In general, when discussing the Spearman correlation coefficient, little attention is given to its population measure.

*Corresponding author
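The definition above, Spearman's r_s as the Pearson correlation of the ranks (Equation 3), can be checked numerically against the classical shortcut formula. The following is a minimal pure-Python sketch (the helper names are ours), assuming no ties in either sample:

```python
# Spearman's r_s computed two ways: as the Pearson correlation of the ranks
# (Equation 3) and via the shortcut 1 - 6*sum(d_i^2)/(n(n^2 - 1)), which is
# valid when there are no ties. Standard library only.
from math import sqrt

def ranks(values):
    # Ranks 1..n; assumes all values are distinct (no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

def spearman_shortcut(x, y):
    a, b = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return 1 - 6 * d2 / (n * (n * n - 1))

x = [3.1, 1.2, 5.6, 4.4, 2.0]
y = [10.0, 4.0, 8.0, 12.0, 5.0]
# Both routes must agree in the absence of ties.
assert abs(spearman(x, y) - spearman_shortcut(x, y)) < 1e-12
```

With ties present, the shortcut no longer applies exactly and midranks would be substituted in `ranks`; the Pearson-of-ranks route remains valid.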
However, if we consider that a_i/n converges to F_X(X_i), then one may consider the population measure linked to Spearman's sample correlation coefficient as

    ρ_s(X, Y) = E{[F_X(X) − E F_X(X)][F_Y(Y) − E F_Y(Y)]} / √( E{[F_X(X) − E F_X(X)]²} E{[F_Y(Y) − E F_Y(Y)]²} )
              = E[(F_X(X) − 1/2)(F_Y(Y) − 1/2)] / √( E[(F_X(X) − 1/2)²] E[(F_Y(Y) − 1/2)²] ),   (4)

where F_X and F_Y are the marginal cumulative distribution functions (CDFs) of X and Y, respectively. The sample estimator of ρ_s can be obtained by replacing the original observations with their ranks in Equation 2:

    r_s(X, Y) = r(a, b) = Σ_{i=1}^n (a_i − ā)(b_i − b̄) / ( √(Σ_{i=1}^n (a_i − ā)²) √(Σ_{i=1}^n (b_i − b̄)²) ) = 1 − 6 Σ_{i=1}^n (a_i − b_i)² / (n(n² − 1)).   (5)

For samples from a bivariate normal population, there is also a known relation between the Spearman and Pearson correlation coefficients [5], which is

    E(r_s) = 6 / (π(n + 1)) { sin⁻¹(ρ) + (n − 2) sin⁻¹(ρ/2) }.   (6)

While Pearson's ρ measures the linear relationship between two random variables, Spearman's ρ_s is often described as measuring a monotonic association between X and Y; thus it may be considered a more general measure of association, albeit one that measures the linear association between F_X(X) and F_Y(Y). Spearman's correlation coefficient is also less sensitive to extreme values because it is rank based. Due to these advantages, it is widely used as a measure of association between two measurements. It is often of interest to test whether two random variables are correlated, i.e. H_0: ρ_s = 0, for which the common methods include a t-distribution based test, incorrectly based on the approximate bivariate normality of the ranks; a test based on Fisher's Z transformation, again assuming approximate bivariate normality of the ranks; and what we term the naive permutation test. The t-distribution based test is commonly used when the sample size is large, with the t-statistic defined as

    t = r_s √( (n − 2) / (1 − r_s²) ).

Under bivariate normality assumptions this statistic approximately follows Student's t distribution with n − 2 degrees of freedom under H_0. For the test based on Fisher's Z transformation, the statistic is defined as

    Z = arctanh(r_s) = (1/2) ln( (1 + r_s) / (1 − r_s) ).

Under bivariate normality assumptions the transformed Z statistic approximately follows the normal distribution N(0, 1.06/(n − 3)) under H_0.
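The two classical statistics just described transcribe directly into code. The sketch below assumes the 1.06/(n − 3) null variance commonly quoted for Fisher's Z applied to Spearman's r_s; it takes r_s and n as given and returns the standardized statistics, not p-values:

```python
# Classical large-sample statistics for H0: rho_s = 0 computed from r_s.
# Both rely on approximate (bivariate) normality, which is the assumption
# this paper argues is untenable for ranks.
from math import sqrt, log

def t_statistic(r_s, n):
    # t = r_s * sqrt((n - 2) / (1 - r_s^2)); refer to t with n - 2 df.
    return r_s * sqrt((n - 2) / (1 - r_s ** 2))

def fisher_z(r_s, n):
    # Z = arctanh(r_s) = 0.5 * ln((1 + r_s)/(1 - r_s)); standardized by
    # the approximate null standard deviation sqrt(1.06 / (n - 3)).
    z = 0.5 * log((1 + r_s) / (1 - r_s))
    return z / sqrt(1.06 / (n - 3))

t_stat = t_statistic(0.7, 5)
z_stat = fisher_z(0.7, 5)
assert t_stat > 0 and z_stat > 0
```

Either statistic would then be compared to its reference distribution (t with n − 2 df, or the standard normal) for a one- or two-sided p-value.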
For small sample size scenarios, naive permutation tests are also often used, in which X and Y are randomly shuffled separately to simulate the sampling distribution of r_s under H_0. This is an exact test of independence between X and Y under the null exchangeability assumptions, but it may be an invalid test of H_0: ρ_s = 0, since ρ_s = 0 does not imply G(F_X(X), F_Y(Y)) = G(F_X(X)) G(F_Y(Y)), where G(·, ·) denotes the joint CDF of (F_X(X), F_Y(Y)). These tests are so widely used that they are often the default options in common statistical software packages such as R [6] and SAS [7]. However, there is little discussion of the fact that these tests rely on the untenable assumption that the underlying distribution of the ranks is bivariate normal, which is in fact an impossibility. Even among those who noted this assumption, there is a misconception that the above tests are robust to such deviations because Spearman's ρ_s is rank based. This is exemplified in a discussion by Fieller et al. in their article [8]:

    Conversely, starting from any bivariate distribution φ(X, Y) we can always find monotonic transformations X = f(x), Y = g(y) to standardized normal variates x and y. The resulting bivariate distribution ψ(X, Y) will not necessarily be bivariate normal, but we think it likely that in practical situations it would not differ greatly from this form. This is a field in which further investigation would be of considerable interest.

However, as we will show in Section 3, all the commonly used tests about ρ_s = 0 discussed above, including the naive permutation test, are not even asymptotically valid when the exchangeability assumptions are violated under H_0. In some cases, the type I error can drift severely away from the desired level as the sample size increases. This undesirable feature is all the more notable in the era of "big data".
Another variation of this approach is the Fisher-Yates coefficient, which transforms the original X and Y to their corresponding normal quantiles before testing [9]. Although the marginal distributions of the transformed variates take a pseudo-normal form, the joint normality of the transformed values is not guaranteed.

In terms of our modified permutation test, it is important to note the classic large sample result in Serfling, where a "distribution free" large sample normal approximation for the sampling distribution of Pearson's sample correlation coefficient r is derived using the multivariate delta method [4]. This method guarantees that the type I error converges to α as n → ∞ given finite fourth moments. A straightforward way to obtain a similar result for Spearman's correlation is to replace the i.i.d. sample with the corresponding ranks. The test is asymptotically valid because the ranks are asymptotically independent, as we will discuss in Section 2. Even though large sample approximations about these estimators are asymptotically valid, they tend to suffer inflated type I errors in the small sample setting, e.g. n < 50.

To address this issue, we propose a studentized permutation test for Spearman's correlation ρ_s, which extends the work of DiCiccio and Romano for Pearson's correlation coefficient ρ [10]. We will show that the proposed test is asymptotically valid under general assumptions and is exact under the exchangeability assumption that G(F_X(X), F_Y(Y)) = G(F_X(X)) G(F_Y(Y)), i.e., more simply, when X and Y are independent. We show that our newly proposed test has robust type I error control even when the sampled distribution is dependent (non-exchangeable) and non-normal. Importantly, the type I error is well controlled when the sample size is small. This will be illustrated by a set of simulation studies.
Finally, we will demonstrate the application of this test in real-world examples of transcriptomic data of TCGA breast cancer patients, as well as a data set of PSA levels and age.

In this section, we start by reviewing the robust permutation test for Pearson's correlation coefficient as proposed by DiCiccio and Romano [10]. Towards this end, we define G_n to be the set of all permutations π of {1, …, n}. For testing independence between two random variables X and Y, the permutation distribution of any given test statistic T_n(X^n, Y^n) is defined as

    R̂_n^{T_n}(t) = (1/n!) Σ_{π ∈ G_n} I{ T_n(X^n, Y^n_π) ≤ t },   (7)

where Y^n_π represents {Y_{π(1)}, …, Y_{π(n)}}. In this setting, the permutation group G_n generates all possible pairwise combinations between X^n and Y^n. A level α one-sided permutation test rejects if the observed statistic T_n(X^n, Y^n) is larger than the 1 − α quantile of the permutation distribution. The permutation test is exact when the exchangeability assumptions hold, that is, when the distribution of (X^n, Y^n) is invariant under the group of transformations G_n. The test using the Pearson correlation coefficient ρ̂ is exact when used as a metric of dependence for testing the null hypothesis of independence given as H_0: P = P_X × P_Y, where P_X and P_Y are the marginal distributions of X and Y, respectively. The null hypothesis of independence is not equivalent to the test of zero correlation given as H_0: ρ = 0, except under limiting assumptions such as the data being distributed as bivariate normal random variables. In other words, in the general setting two random variables can be dependent but uncorrelated. In such cases, DiCiccio and Romano [10] have shown that, with finite fourth moments, the permutation distribution of √n ρ̂ converges to N(0, 1), but its sampling distribution converges to N(0, τ²), where

    τ² = µ_22 / (µ_20 µ_02), with µ_pq = E[(X − µ_X)^p (Y − µ_Y)^q].

Thus the test will not be level α unless τ = 1.
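For a very small sample, the permutation distribution in Equation 7 can be computed exactly by enumerating all n! rearrangements. A minimal pure-Python sketch (our own illustration, feasible only for tiny n) using Pearson's r as the statistic T_n:

```python
# Exact permutation CDF of Equation 7: the fraction of the n! pairings
# (x, y_pi) whose statistic T_n is at most t. Standard library only.
from itertools import permutations
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_cdf(x, y, t):
    # Enumerates all n! permutations pi and counts T_n(x, y_pi) <= t.
    count = total = 0
    for perm in permutations(y):
        total += 1
        if pearson(x, list(perm)) <= t:
            count += 1
    return count / total

x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 0.5, 3.5, 2.5]
# The CDF is 1 above the largest attainable statistic and 0 below the smallest.
assert permutation_cdf(x, y, 1.001) == 1.0
assert permutation_cdf(x, y, -1.001) == 0.0
```

In practice the full enumeration is replaced by random sampling of permutations, since n! grows far too quickly.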
In light of this result, DiCiccio and Romano proposed a studentized correlation test statistic, which has been shown to control type I error asymptotically at level α when the two random variables are dependent but uncorrelated [10]. Specifically, the studentized statistic is defined as

    S_n = √n ρ̂_n / τ̂_n, where τ̂_n² = µ̂_22 / (µ̂_20 µ̂_02), and µ̂_pq = (1/n) Σ_{i=1}^n (X_i − X̄)^p (Y_i − Ȳ)^q.

The permutation distribution and sampling distribution of S_n both converge to the standard normal distribution asymptotically. It should be noted that even though the results presented in DiCiccio and Romano [10] are based on large sample approximations, the behavior of this test for small to moderate sample sizes is quite good, as borne out in their simulation results.

Spearman's coefficient ρ_s(X, Y) is the Pearson correlation coefficient of the ranks of X and Y, that is, ρ(a, b). When there are ties in the data, the ranks are typically taken as averages. Unlike Pearson's correlation coefficient ρ, which measures the linear relationship between two random variables, Spearman's correlation coefficient ρ_s measures a monotonic association, and thus is far less restrictive. Note that Spearman's correlation coefficient is also the linear measure of association between F_X(X) and F_Y(Y). It is also less sensitive to non-normality and extreme values.

Despite the above advantages, it is a misconception that tests of ρ_s = 0 built on the bivariate normality assumptions underlying the original data will be robust to deviations from that assumption. In fact, even when the original data are independent, their ranks will be dependent. Thus, tests of ρ_s typically suffer similar issues as tests of ρ. We also emphasize that "normality" refers to joint normality as opposed to marginal normality, because two random variables that are marginally normal can have a joint non-normal distribution.
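The studentized Pearson statistic S_n defined earlier in this section is a direct transcription of the moment formulas. A minimal sketch (our own, standard library only):

```python
# DiCiccio-Romano studentized statistic S_n = sqrt(n) * rho_hat / tau_hat,
# with tau_hat^2 = mu_22 / (mu_20 * mu_02) and mu_pq the centered sample
# cross-moments, exactly as in the text.
from math import sqrt

def mu_pq(x, y, p, q):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) ** p * (b - my) ** q for a, b in zip(x, y)) / n

def studentized_pearson(x, y):
    n = len(x)
    rho_hat = mu_pq(x, y, 1, 1) / sqrt(mu_pq(x, y, 2, 0) * mu_pq(x, y, 0, 2))
    tau_hat = sqrt(mu_pq(x, y, 2, 2) / (mu_pq(x, y, 2, 0) * mu_pq(x, y, 0, 2)))
    return sqrt(n) * rho_hat / tau_hat

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
s_n = studentized_pearson(x, y)
```

Under bivariate normality τ = 1 and S_n reduces to √n ρ̂_n; the ratio τ̂_n is what buys robustness to dependence without correlation.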
Therefore, the Fisher-Yates coefficient, which back-transforms a variable's ranks through the normal quantile function, does not provide what one may heuristically consider a simple correction. In Section 3, we will empirically show that violation of the joint normality assumption can have severe effects on type I error control. In addition, it is in fact impossible for the joint distribution of the ranks to be bivariate normal.

Our approach is to replace (X_i, Y_i) with their ranks (a_i, b_i) in order to develop a Spearman's correlation permutation test analogous to the Pearson's correlation permutation test, with some subtle differences. The studentized permutation test of Pearson's ρ only requires finite fourth moments and i.i.d. observations. Although the pairs of ranks (a_i, b_i) are no longer i.i.d. observations, we can show that they asymptotically satisfy this condition. When the (X_i, Y_i) are paired i.i.d. observations, the (F_X(X_i), F_Y(Y_i)) are i.i.d. as well. Since a_i = Rank(X_i) = n F̂_X(X_i) and F̂_X(X_i) → F_X(X_i) as n → ∞, we have a_i/n → F_X(X_i). Therefore, the paired observations (a_i, b_i) are asymptotically i.i.d. Consequently, the exchangeability condition holds at least asymptotically, and the test is asymptotically exact. Intuitively, when n is sufficiently large, knowing (a_i, b_i) lends little knowledge about the ranks of another pair (a_j, b_j). It can also be shown that the correlation between a_i and a_j (i ≠ j) is approximately −1/(n − 1). Therefore, for a sample sequence X^n, the correlation matrix of the ranks converges to I_n as n → ∞.

Specifically, the one-sided studentized permutation test of H_0: ρ_s = 0 versus H_1: ρ_s > 0 is performed by the following steps.
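Before listing the formal steps, the whole procedure (rank, studentize, permute) can be sketched compactly. This is our own minimal pure-Python illustration, not the implementation in the perk package, and it assumes no ties:

```python
# Studentized Spearman permutation test sketch for H0: rho_s = 0 vs
# H1: rho_s > 0: rank both variables, studentize r_s with tau_hat computed
# on the ranks, then permute one rank vector to build the null distribution.
import random
from math import sqrt

def ranks(values):
    # Ranks 1..n; assumes distinct values (no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def studentized_stat(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    mu = lambda p, q: sum((x - ma) ** p * (y - mb) ** q
                          for x, y in zip(a, b)) / n
    r_s = mu(1, 1) / sqrt(mu(2, 0) * mu(0, 2))
    tau = sqrt(mu(2, 2) / (mu(2, 0) * mu(0, 2)))
    return r_s / tau

def spearman_perm_test(x, y, n_perm=999, seed=0):
    rng = random.Random(seed)
    a, b = ranks(x), ranks(y)
    observed = studentized_stat(a, b)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(b)  # re-shuffling in place still yields uniform permutations
        if studentized_stat(a, b) > observed:
            exceed += 1
    return exceed / n_perm  # one-sided p-value for rho_s > 0

p = spearman_perm_test([3.1, 1.2, 5.6, 4.4, 2.0, 0.7, 6.3, 2.8],
                       [10.0, 4.0, 8.0, 12.0, 5.0, 3.0, 9.0, 6.0])
assert 0.0 <= p <= 1.0
```

Because the √n factor is constant across permutations, it cancels from the comparison and is omitted from the statistic here, matching the R_s defined in the steps below.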
The test is implemented in the R package perk (permutation tests of correlation coefficients), which will be available on CRAN (The Comprehensive R Archive Network, https://cran.r-project.org/) and GitHub (https://github.com/hyu-ub/perk).

• For n paired i.i.d. observations (X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n), calculate the ranks within each random variable, (a_1, b_1), (a_2, b_2), …, (a_n, b_n).
• Estimate Spearman's ρ_s by r_s using Equation 5.
• Estimate the variance of the sample estimate r_s by τ̂_n² = µ̂_22 / (µ̂_20 µ̂_02), where µ̂_pq = (1/n) Σ_{i=1}^n (a_i − ā)^p (b_i − b̄)^q.
• Calculate the studentized statistic R_s = r_s / τ̂_n.
• Randomly shuffle (b_1, b_2, …, b_n) B times. For each permutation, calculate the permuted studentized statistic R_s^k, k ∈ {1, …, B}.
• Calculate the p-value as p = (1/B) Σ_{k=1}^B I(R_s^k > R_s).
• Reject H_0 if p < α.

We examined the type I error control of all of the tests introduced above using distributions commonly found in the literature, across a wide range of settings [10, 11]. For our simulation study, we focused on testing H_0: ρ_s = 0 versus H_1: ρ_s > 0, with sample sizes starting at n = 10. Each scenario was evaluated over repeated Monte Carlo replications, with a fixed number of permutations for each permutation test. We compared the t test, Fisher's Z transformation (Fisher's Z), the Fisher-Yates method, Serfling's large sample normal approximation (Asymp Norm), the naive permutation test (Permute), and the studentized permutation test (Stu Permute). Type I error control at the nominal α level was examined. Simulation scenarios 1 through 5 are from DiCiccio and Romano; two additional distributions were studied as well:

1. Multivariate normal (MVN) with mean zero and identity covariance.
2. Exponential, given as (X, Y) = r Sᵀu, where S is a fixed diagonal scaling matrix, r ∼ Exp(1), and u is uniformly distributed on the two-dimensional unit circle.
3. Circular, given as the uniform distribution on a two-dimensional unit circle.
4. t: X = W + Z and Y = W − Z, where W and Z are i.i.d. t-distributed random variables.
5. Multivariate t-distribution (MVT) with 5 degrees of freedom.
6. Mixture of two bivariate normal distributions, given as (X, Y) = W Z_1 + (1 − W) Z_2, where W ∼ Bernoulli(0.5), Z_1 ∼ N(0, [[1, ρ], [ρ, 1]]), and Z_2 ∼ N(0, [[1, −ρ], [−ρ, 1]]). We select a range of ρ's (0.1, 0.3, 0.6, and 0.9) to simulate different degrees of dependency between X and Y (MVN 1, MVN 3, MVN 6, MVN 9).
7. Mixture of four bivariate normal distributions (MVN 45), given as (X, Y) = Σ_{k=1}^4 I(W = k) Z_k, where P(W = k) = 0.25, k = 1, 2, 3, 4. In addition, Z_k ∼ N(µ_k, I), where the µ_k are the four corner points (±5, ±5)ᵀ.

Figure 1: Density plots of a synthetic sample (n = 1,000) from each simulation distribution.

The results in Table 1 and Figure 2 show that the large sample asymptotic normal approximation has inflated type I error rates for all distributions at the smaller sample sizes. The t test, Fisher's Z test, Fisher-Yates, and naive permutation tests tend to be over-conservative for the exponential and circular distributions, while for the t scenario the type I error is consistently inflated. Note that, for these tests, such deviation cannot be corrected by increasing the sample size. Instead, they may converge to an arbitrary level, either lower or higher than α.

For MVN 1 through MVN 9, we simulated a range of dependencies among uncorrelated X and Y, where MVN 1 has the weakest and MVN 9 the strongest dependency (Figure 1). The above four tests showed type I error rate inflation that becomes increasingly severe as the dependency increases. This demonstrates that the failure to control the type I error results from the data being dependent, which can occur when the underlying distribution is non-normal.

The MVN 45 scenario is a case where the dependency of the original data is remedied by using the ranks. In this case, the ranks are distributed as if they came from a bivariate normal distribution, regardless of the distance between the centers of the individual Gaussian sub-populations.
Therefore, all four of those tests show good control of the type I error rate for MVN 45. On the other hand, the studentized permutation test robustly controls the type I error for all distributions examined, even when n is as small as 10. This demonstrates a clear advantage of the proposed test over all other commonly used tests for Spearman's correlation coefficient.

Figure 2: Type I error rate of testing H_0: ρ_s = 0 versus H_1: ρ_s > 0.

Table 1: Type I error rate of testing H_0: ρ_s = 0 versus H_1: ρ_s > 0, by distribution and sample size, for the t test, Fisher's Z, Fisher-Yates, Asymp Norm, Permute, and Stu Permute tests.

As an illustration of our approach, we tested H_0: ρ_s = 0 versus H_1: ρ_s > 0 using The Cancer Genome Atlas (TCGA) breast cancer RNA sequencing (RNA-seq) data. Gene abundance was RSEM normalized [12]. Fibroblast growth factor (FGF) 2, FGF4, FGF7, and FGF20 are representative paracrine FGFs binding to heparan-sulfate proteoglycan and fibroblast growth factor receptors (FGFRs), whereas FGF19, FGF21, and FGF23 are endocrine FGFs binding to Klotho and FGFRs. FGFR1 is relatively frequently amplified and overexpressed in breast and lung cancer, and FGFR2 in gastric cancer. Moreover, FGF2 activates human dermal fibroblasts through transcriptional downregulation of TP53 (Figure 3, left). The marginal normality of the data was examined by the Shapiro-Wilk test and the bivariate normality by the Henze-Zirkler test. The p-values of the Shapiro-Wilk tests for the log-transformed TP53 and FGFR1 abundances are 0.6077 and 0.0644, respectively. The p-value of the Henze-Zirkler test is 0.3478. Although not statistically significant, the marginal distribution of FGFR1 abundance likely deviates from a normal distribution.

Table 2 shows the results of hypothesis testing for the estimated Spearman's correlation ρ̂_s. Only the result of the studentized permutation test is non-significant at level α, suggesting there is no evidence of a positive correlation between TP53 and FGFR1. In fact, the biology does not support a positive correlation either, since FGFR1 mediates negative regulation of TP53 by FGF2 at the transcriptional level [13]. Indeed, if we include all samples from the TCGA breast cancer cohort, then all tests fail to reject H_0, with p-values over 0.5, except for the Fisher-Yates test. Together with the results from the simulations, the result of the studentized permutation test is clearly more reliable.

Figure 3: Scatter plot of log-transformed TP53 versus FGFR1 abundance for the TCGA data (left), and age versus −log(PSA) for the PSA data (right).

Table 2: Results of testing H_0: ρ_s = 0 versus H_1: ρ_s > 0 for the TCGA breast cancer data and the PSA data (p-values for the t test, Fisher's Z, Fisher-Yates, Asymp Norm, Permute, and Stu Permute tests).

The testing methods were also applied to a data set of age and baseline prostate-specific antigen (PSA) levels [14]. The data consist of the age and PSA levels of 480 subjects, of which 473 have complete paired observations. The sample Spearman's correlation coefficient between age and PSA is negative. Since the alternative hypothesis of the proposed test is H_1: ρ_s > 0, we applied a negative log transformation to the PSA levels. As in the TCGA example, the marginal normality of the data was examined by the Shapiro-Wilk test, and the bivariate normality by the Henze-Zirkler test. The p-values of the Shapiro-Wilk tests for age and log-transformed PSA levels are 0.0208 and 0.0301, respectively. The p-value of the Henze-Zirkler test is smaller still. These results indicate that the distribution is not bivariate normal. Figure 3 (right) shows the scatter plot of age versus −log(PSA). Table 2 shows that all tests reject H_0 and conclude there is a non-zero correlation between age and PSA. This is an example where all tests give consistent results. Although the normality tests are significant, the deviation may have been remedied by using the ranks in this specific example.

Conventional tests of Spearman's correlation rely on normality assumptions, including the t test, Fisher's Z transformation, and the naive permutation test, and they fail to control type I error rates when those assumptions are violated. This was illustrated in our simulation studies (Section 3). This defect cannot be remedied by transforming the marginal distributions, such as with the Fisher-Yates coefficient. Notably, the deviation from bivariate normality can result in convergence of the type I error rate to an arbitrary level as n → ∞.
This indicates that, in scenarios where two random variables are uncorrelated but dependent, the type I error will not be controlled at the desired level no matter how large the sample size is. On the other hand, Serfling's test based on the delta method guarantees that the type I error rate converges to α as long as the fourth moments are finite. However, it typically suffers an inflated type I error when the sample size is under 50.

In this work, we present a robust Spearman's correlation permutation test based on a studentized statistic for testing H_0: ρ_s = 0 versus H_1: ρ_s > 0. The proposed approach is inspired by the work of DiCiccio and Romano [10], which was developed for Pearson's correlation. Through extensive simulation studies and real-world applications, we show that the proposed test controls the type I error even when the sample size is as small as 10 and the normality assumption is violated. Therefore, the test is valid in general settings. In addition, the studentized statistic can also be used in bootstrap tests, so as to test more general point null hypotheses [11]. In conclusion, the proposed studentized permutation test should be used routinely for testing a non-zero Spearman's correlation coefficient.

Acknowledgments
Project Data Sphere, LLC. Neither Project Data Sphere, LLC nor the owner(s) of any information from the website have contributed to, approved, or are in any way responsible for the contents of this publication.
References

[1] Jeffrey M. Stanton. Galton, Pearson, and the peas: A brief history of linear regression for statistics instructors. Journal of Statistics Education, 9(3), 2001.
[2] Karl Pearson. VII. Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia. Philosophical Transactions of the Royal Society of London, Series A, 187:253–318, 1896.
[3] Charles Spearman. The proof and measurement of association between two things. 1961.
[4] Maurice Kendall and Alan Stuart. The Advanced Theory of Statistics, 2:240–80. London: Charles Griffin, 1979.
[5] P. A. P. Moran. Rank correlation and product-moment correlation. Biometrika, 35(1/2):203–206, 1948.
[6] R Core Team et al. R: A language and environment for statistical computing, 2013.
[7] SAS Institute. Base SAS 9.4 Procedures Guide. SAS Institute, 2015.
[8] Edgar C. Fieller, Herman O. Hartley, and Egon S. Pearson. Tests for rank correlation coefficients. I. Biometrika, 44(3/4):470–481, 1957.
[9] Ronald A. Fisher and Frank Yates. Statistical Tables: For Biological, Agricultural and Medical Research. Oliver and Boyd, 1938.
[10] Cyrus J. DiCiccio and Joseph P. Romano. Robust permutation tests for correlation and regression coefficients. Journal of the American Statistical Association, 112(519):1211–1220, 2017.
[11] Alan D. Hutson. A robust Pearson correlation test for a general point null using a surrogate bootstrap distribution. PLoS ONE, 14(5):e0216287, 2019.
[12] Bo Li and Colin N. Dewey. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics, 12(1):323, 2011.
[13] Masaru Katoh. FGFR inhibitors: Effects on cancer cells, tumor microenvironment and whole-body homeostasis. International Journal of Molecular Medicine, 38(1):3–15, 2016.
[14] Christopher J. Sweeney, Yu-Hui Chen, Michael Carducci, Glenn Liu, David F. Jarrard, Mario Eisenberger, Yu-Ning Wong, Noah Hahn, Manish Kohli, Matthew M. Cooney, et al. Chemohormonal therapy in metastatic hormone-sensitive prostate cancer. New England Journal of Medicine, 373(8):737–746, 2015.