On Hodges and Lehmann's "6/π result"
arXiv preprint [math.ST]

Marc Hallin (a), Yvik Swan (b), and Thomas Verdebout (c)

(a) ECARES, Université libre de Bruxelles, and ORFE, Princeton University
(b) Université du Luxembourg
(c) Université Lille Nord de France, laboratoire EQUIPPE
Abstract
While the asymptotic relative efficiency (ARE) of Wilcoxon rank-based tests for location and regression with respect to their parametric Student competitors can be arbitrarily large, Hodges and Lehmann (1961) have shown that the ARE of the same Wilcoxon tests with respect to their van der Waerden or normal-score counterparts is bounded from above by $6/\pi \approx 1.910$. In this paper, we revisit that result, and investigate similar bounds for statistics based on Student scores. We also consider the serial version of this ARE. More precisely, we study the ARE, under various densities, of the Spearman-Wald-Wolfowitz and Kendall rank-based autocorrelations with respect to the van der Waerden or normal-score ones used to test (ARMA) serial dependence alternatives.

Keywords: Asymptotic relative efficiency, rank-based tests, Wilcoxon test, van der Waerden test, Spearman autocorrelations, Kendall autocorrelations, linear serial rank statistics
1. Introduction
The Pitman asymptotic relative efficiency $\mathrm{ARE}_f(\phi_1/\phi_2)$ under density $f$ of a test $\phi_1$ with respect to a test $\phi_2$ is defined as the limit (when it exists), as $n_1$ tends to infinity, of the ratio $n_{2;f}(n_1)/n_1$, where $n_{2;f}(n_1)$ is the number of observations it takes for the test $\phi_2$, under density $f$, to match the local performance of the test $\phi_1$ based on $n_1$ observations. That concept was first proposed by Pitman in the unpublished lecture notes [28] he prepared for a 1948-49 course at Columbia University. The first published rigorous treatment of the subject was by Noether [25] in 1955. A similar definition applies to point estimation; see, for instance, [6] for a more precise definition. An in-depth treatment of the concept can be found in Chapter 10 of Serfling [31], Chapter 14 of van der Vaart [32], or in the monograph by Nikitin [24].

Preprint submitted to Elsevier, October 30, 2018

The study of the AREs of rank tests and R-estimators with respect to each other or with respect to their classical Gaussian counterparts has produced a number of interesting and sometimes surprising results. Considering the van der Waerden or normal-score two-sample location rank test $\phi_{\mathrm{vdW}}$ and its classical normal-theory competitor, the two-sample Student test $\phi_{\mathcal{N}}$, Chernoff and Savage in 1958 established the rather striking fact that, under any density $f$ satisfying very mild regularity assumptions,

$$\mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_{\mathcal{N}}) \geq 1, \tag{1.1}$$

with equality holding at the Gaussian density $f = \phi$ only. That result implies that rank tests based on Gaussian scores (that is, the two-sample rank-based tests for location, but also the one-sample signed-rank ones, traditionally associated with the names of van der Waerden, Fraser, Fisher, Yates, Terry and/or Hoeffding; for simplicity, in the sequel, we uniformly call them van der Waerden tests) asymptotically outperform the corresponding everyday-practice Student $t$-test; see [1].
That result readily extends to one-sample symmetric and $m$-sample location, regression and analysis of variance models with independent noise.

Another celebrated bound is the one obtained in 1956 by Hodges and Lehmann, who proved that, denoting by $\phi_W$ the Wilcoxon test (same location and regression problems as above),

$$\mathrm{ARE}_f(\phi_W/\phi_{\mathcal{N}}) \geq 0.864, \tag{1.2}$$

which implies that the price to be paid for using rank or signed-rank tests of the Wilcoxon type (that is, logistic-score-based rank tests) instead of the traditional Student ones never exceeds 13.6% of the total number of observations. That bound moreover is sharp, being reached under the Epanechnikov density $f$. On the other hand, the benefits of considering Wilcoxon rather than Student can be arbitrarily large, as it is easily shown that the supremum over $f$ of $\mathrm{ARE}_f(\phi_W/\phi_{\mathcal{N}})$ is infinite; see [20].

Both (1.1) and (1.2) created quite a surprise in the statistical community of the late fifties, and helped dispel the wrong idea, by then quite widespread, that rank-based methods, although convenient and robust, could not be expected to compete with the efficiency of traditional parametric procedures.

Chernoff-Savage and Hodges-Lehmann inequalities since then have been extended to a variety of more general settings. In the elliptical context, optimal rank-based procedures for location (one- and $m$-sample cases), regression, and scatter (one- and $m$-sample cases) have been constructed in a series of papers by Hallin and Paindaveine ([7], [11], and [13]), based on a multivariate concept of signed ranks. The Gaussian competitors here are of the Hotelling, Fisher, or Lagrange multiplier forms. For all those tests, Chernoff-Savage results similar to (1.1) have been established (see also [26, 27]).
Hodges-Lehmann results also have been obtained, with bounds that, quite interestingly, depend on the dimension of the observation space: see [7].

Another type of extension is in the direction of time series and linear rank statistics of the serial type. Hallin [5] extended Chernoff and Savage's result (1.1) to the serial context by showing that the serial van der Waerden rank tests also uniformly dominate their Gaussian competitors (of the correlogram-based portmanteau, Durbin-Watson or Lagrange multiplier forms). Similarly, Hallin and Tribel [19] proved that the 0.864 lower bound in (1.2) no longer holds for the AREs of the Wilcoxon serial rank tests with respect to their Gaussian competitors, and is to be replaced by a slightly lower 0.854 one. Elliptical versions of those results are derived in Hallin and Paindaveine ([8], [9], [10]).

Now, AREs with respect to Gaussian procedures such as $t$-tests are not always the best evaluations of the asymptotic performances of rank-based tests. Their existence indeed requires the Gaussian procedures to be valid under the density $f$ under consideration, a condition which places restrictions on $f$ that may not be satisfied. When the Gaussian tests are no longer valid, one rather may like to consider AREs of the form

$$\mathrm{ARE}_f(\phi_J/\phi_K) = 1/\mathrm{ARE}_f(\phi_K/\phi_J) \tag{1.3}$$

comparing the asymptotic performances (under $f$) of two rank-based tests $\phi_J$ and $\phi_K$, based on score-generating functions $J$ and $K$, respectively. Being distribution-free, rank-based procedures indeed do not impose any validity conditions on $f$, so that $\mathrm{ARE}_f(\phi_J/\phi_K)$ in general exists under much milder requirements on $f$; see, for instance, [17] and [18], where AREs of the form (1.3) are provided for rank-based methods in linear models with stable errors under which Student tests are not valid.

Obtaining bounds for $\mathrm{ARE}_f(\phi_J/\phi_K)$, in general, is not as easy as for AREs of the form $\mathrm{ARE}_f(\phi_J/\phi_{\mathcal{N}})$.
The first result of that type was established in 1961 by Hodges and Lehmann, who in [21] show that

$$0 \leq \mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}}) \leq 6/\pi \approx 1.910 \tag{1.4}$$

or, equivalently,

$$0.524 \approx \pi/6 \leq \mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_W) \leq \infty \tag{1.5}$$

for all $f$ in some class $\mathcal{F}$ of density functions satisfying weak differentiability conditions. Hodges and Lehmann moreover exhibit a parametric family of densities $\mathcal{F}_{HL} = \{f_\alpha \mid \alpha \in [0, \infty)\}$ for which the function $\alpha \mapsto \mathrm{ARE}_{f_\alpha}(\phi_W/\phi_{\mathrm{vdW}})$ achieves any value in the open interval $(0, 6/\pi)$ (equivalently, $\alpha \mapsto \mathrm{ARE}_{f_\alpha}(\phi_{\mathrm{vdW}}/\phi_W)$ achieves any value in the open interval $(\pi/6, \infty)$). The lower and upper bounds in (1.4) and (1.5) thus are sharp in the sense that they are the best possible ones. The same result was extended and generalized by Gastwirth [2].

Note that, in case $f$ has finite second-order moments (so that $\mathrm{ARE}_f(\phi_W/\phi_{\mathcal{N}})$ is well defined), since $\mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_{\mathcal{N}}) = \mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_W) \times \mathrm{ARE}_f(\phi_W/\phi_{\mathcal{N}})$, Hodges and Lehmann's "6/π result" implies that the ARE of the van der Waerden tests with respect to the Student ones, which by the Chernoff-Savage inequality is larger than or equal to one, actually can be arbitrarily large, and that this happens for the same types of densities as for the Wilcoxon tests. This is an indication that, when Wilcoxon is quite significantly outperforming Student, that performance is shared by a broad class of rank-based tests and R-estimators, which includes the van der Waerden ones.

In Section 2, we successively consider the traditional case of nonserial rank statistics used in the context of location and regression models with independent observations, and the case of serial rank statistics; the latter involve ranks at time $t$ and $t-k$, say, and aim at detecting serial dependence among the observations. Serial rank statistics typically involve two score functions and, instead of (1.3), yield AREs of the form

$$\mathrm{ARE}^*_f(\phi_{J_1 J_2}/\phi_{J_3 J_4}). \tag{1.6}$$

To start with, in Section 2.1, we revisit Gastwirth's classical nonserial results. More precisely, we provide (Proposition 2.2) a slightly different proof of the main proposition in [2], with some further illustrations in the case of Student scores. In Section 2.2, we turn to the serial case, with special attention for the so-called Wilcoxon-Wald-Wolfowitz, Kendall and van der Waerden rank autocorrelation coefficients. Serial AREs of the form (1.6) typically are the product of two factors to which the nonserial techniques of Section 2.1 separately apply; this provides bounds which, however, are not sharp. Therefore, in Section 3, we restrict to a few parametric families (the Student family, indexed by the degrees of freedom, the power-exponential family, and the Hodges-Lehmann family $\mathcal{F}_{HL}$) for which numerical values are displayed.

2. Asymptotic relative efficiencies of rank-based procedures

The asymptotic behavior of rank-based test statistics under local alternatives, since Hájek and Šidák [4], is obtained via an application of Le Cam's Third Lemma (see, for instance, Chapter 13 of [32]). Whether the statistic is of the serial or the nonserial type, the result, under a density $f$ with distribution function $F$, involves integrals of the form

$$\mathcal{K}(J) := \int_0^1 J^2(u)\,\mathrm{d}u, \qquad \mathcal{K}(J, f) := \int_0^1 J(u)\,\varphi_f(F^{-1}(u))\,\mathrm{d}u,$$

and, in the serial case,

$$\mathcal{J}(J, f) := \int_0^1 J(u)\,F^{-1}(u)\,\mathrm{d}u,$$

where, assuming that $f$ admits a weak derivative $f'$, $\varphi_f := -f'/f$ is such that the Fisher information for location $\mathcal{I}(f) := \int_0^1 \varphi_f^2(F^{-1}(u))\,\mathrm{d}u$ is finite. Denote by $\mathcal{F}$ the class of such densities. If local alternatives, in the serial case, are of the ARMA type, $f$ is further restricted to the subset $\mathcal{F}_2$ of densities $f \in \mathcal{F}$ having finite second-order moments. Differentiability in quadratic mean of $f^{1/2}$ is the standard assumption here, see Chapter 7 of [32]; but absolute continuity of $f$ in the traditional sense, with a.e. derivative $f'$, is sufficient for most purposes.
We refer to [4] and [16] for details in the nonserial and the serial case, respectively.

2.1. The nonserial case

In location or regression problems, or, more generally, when testing linear constraints on the parameters of a linear model (this includes ANOVA etc.), the ARE, under density $f \in \mathcal{F}$, of a rank-based test $\phi_{J_1}$ based on the square-summable score-generating function $J_1$ with respect to another rank-based test $\phi_{J_2}$ based on the square-summable score-generating function $J_2$ takes the form

$$\mathrm{ARE}_f(\phi_{J_1}/\phi_{J_2}) = \frac{\mathcal{K}(J_2)}{\mathcal{K}(J_1)}\, C_f^2(J_1, J_2), \quad \text{with } C_f(J_1, J_2) := \frac{\mathcal{K}(J_1, f)}{\mathcal{K}(J_2, f)}, \tag{2.1}$$

provided that $J_1$ and $J_2$ are monotone, or the difference between two monotone functions. Those ARE values readily extend to the $m$-sample setting, and to R-estimation problems. In a time-series context with innovation density $f \in \mathcal{F}_2$, and under slightly more restrictive assumptions on the scores, they also extend to the partly rank-based tests and R-estimators considered by Koul and Saleh in [22] and [23].

Gastwirth (1970) bases his analysis of (2.1) on an integration by parts of the integral in the definition of $\mathcal{K}(J, f)$. If both $J_1$ and $J_2$ are differentiable, with derivatives $J_1'$ and $J_2'$, respectively, and provided that $f$ is such that

$$\lim_{x \to \infty} J_1(F(x)) f(x) = 0 = \lim_{x \to \infty} J_2(F(x)) f(x),$$

integration by parts in those integrals yields, for (2.1),

$$\mathrm{ARE}_f(\phi_{J_1}/\phi_{J_2}) = \frac{\mathcal{K}(J_2)}{\mathcal{K}(J_1)} \left( \frac{\int_{-\infty}^{\infty} J_1'(F(x))\, f^2(x)\,\mathrm{d}x}{\int_{-\infty}^{\infty} J_2'(F(x))\, f^2(x)\,\mathrm{d}x} \right)^2. \tag{2.2}$$

In view of the Chernoff-Savage result (1.1), the van der Waerden score-generating function

$$J(u) = J_{\mathrm{vdW}}(u) = \Phi^{-1}(u) \tag{2.3}$$

(with $u \mapsto \Phi^{-1}(u)$ the standard normal quantile function) may appear as a natural benchmark for ARE computations. From a technical point of view, under this integration by parts approach, the Wilcoxon score-generating function

$$J(u) = J_W(u) = u - 1/2 \tag{2.4}$$

(the Spearman-Wald-Wolfowitz score-generating function in the serial case) is more appropriate, though.
Convexity arguments indeed will play an important role, and, being linear, $J_W$ is both convex and concave. Since $J_W'(u) = 1$ and $\mathcal{K}(J_W) = 1/12$, equation (2.2) yields

$$\mathrm{ARE}_f(\phi_J/\phi_W) = \frac{1}{12\,\mathcal{K}(J)} \left( \frac{\int_{-\infty}^{\infty} J'(F(x))\, f^2(x)\,\mathrm{d}x}{\int_{-\infty}^{\infty} f^2(x)\,\mathrm{d}x} \right)^2. \tag{2.5}$$

Bounds on $J'(F(x))$ then readily yield bounds on AREs, irrespective of $f$.

That property of Wilcoxon scores is exploited in Propositions 2.2 and 2.3 for nonserial AREs, and in Proposition 2.4 for the serial ones; those bounds are mainly about AREs of, or with respect to, Wilcoxon (Spearman-Wald-Wolfowitz) procedures, but not exclusively so.

Assume that $f \in \mathcal{F}_0 := \{f \in \mathcal{F} \mid \lim_{x \to \pm\infty} f(x) = 0\}$. Then, integration by parts is possible in the definition of $\mathcal{K}(J_W, f)$, yielding

$$\mathcal{K}(J_W, f) = \int_{-\infty}^{\infty} f^2(x)\,\mathrm{d}x.$$

Assume furthermore that $J$ (the difference of two monotone increasing functions) is differentiable, with derivative $J'$, and that

$$f \in \mathcal{F}_J := \{f \in \mathcal{F}_0 \mid \lim_{x \to \pm\infty} J(F(x)) f(x) = 0\},$$

so that (2.2) holds. Finally, assume that $J$ is skew-symmetric about $1/2$. Defining the (possibly infinite) constants

$$\kappa_J^+ := \sup_{u \geq 1/2} |J'(u)| \quad \text{and} \quad \kappa_J^- := \inf_{u \geq 1/2} |J'(u)|,$$

we can always write

$$\mathrm{ARE}_f(\phi_J/\phi_W) \leq (\kappa_J^+)^2 / 12\,\mathcal{K}(J) \tag{2.6}$$

while, if $J$ is non-decreasing (hence $J'$ is non-negative), we further have

$$(\kappa_J^-)^2 / 12\,\mathcal{K}(J) \leq \mathrm{ARE}_f(\phi_J/\phi_W) \leq (\kappa_J^+)^2 / 12\,\mathcal{K}(J). \tag{2.7}$$

The quantities appearing in (2.6) and (2.7) often can be computed explicitly, yielding ARE bounds which are, moreover, sharp under certain conditions (see below).

For example, if $J$ is convex on $[1/2, 1)$, its derivative $J'$ is non-decreasing over $[1/2, 1)$, so that

$$\kappa_J^- = J'(1/2) \geq 0 \quad \text{and} \quad \kappa_J^+ = \lim_{u \to 1} J'(u) \leq +\infty. \tag{2.8}$$

It follows that, under the assumptions made,

$$(J'(1/2))^2 / 12\,\mathcal{K}(J) \leq \mathrm{ARE}_f(\phi_J/\phi_W) \leq \big(\lim_{u \to 1} J'(u)\big)^2 / 12\,\mathcal{K}(J). \tag{2.9}$$

The lower bound in (2.9) is established in Theorem 2.1 of [2].

The double inequality (2.9) holds, for instance (still, under $f \in \mathcal{F}_J$), when the scores $J = \varphi_g \circ G^{-1}$ are the optimal scores associated with some symmetric and strongly unimodal density $g$ with distribution function $G$; such densities indeed are log-concave and have monotone increasing score functions that are convex over $[1/2, 1)$. Symmetric log-concave densities take the form

$$g(x) = K e^{-\mu(x)}, \qquad K^{-1} = \int_{-\infty}^{\infty} e^{-\mu(x)}\,\mathrm{d}x \tag{2.10}$$

with $x \mapsto \mu(x)$ a convex, even (that is, $\mu(x) = \mu(-x)$) function; assume it to be twice differentiable, with derivatives $\mu'$ and $\mu''$. Then, $\varphi_g(x) = \mu'(x)$, so that

$$J(u) := \varphi_g(G^{-1}(u)) = \mu'(G^{-1}(u)), \qquad \mathcal{K}(J) = \int_{-\infty}^{\infty} \big(\mu'(x)\big)^2 g(x)\,\mathrm{d}x = \mathcal{I}(g),$$

where $\mathcal{I}(g)$ is the Fisher information of $g$ (which we assume to be finite), and

$$J'(u) = \mu''(G^{-1}(u)) / g(G^{-1}(u)), \quad \text{hence} \quad J'(1/2) = \frac{\mu''(0)}{g(0)} = \frac{\mu''(0)}{K}.$$
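The bounds above can be checked numerically from formula (2.1). The following is only a minimal sketch, not the authors' computation: the helper names (`midpoint`, `k_norm`, `k_cross`, `are`, `LOCATION_SCORES`) are hypothetical, the integrals over $(0,1)$ are approximated by a plain midpoint rule, and the three densities are chosen because $u \mapsto \varphi_f(F^{-1}(u))$ has a closed form for them (Gaussian: $\Phi^{-1}(u)$; logistic: $2u-1$; Cauchy: $\sin(2\pi(u-1/2))$).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; inv_cdf is its quantile function


def midpoint(g, n=20_000):
    """Midpoint-rule approximation of the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))


def j_wilcoxon(u):
    """Wilcoxon score-generating function (2.4)."""
    return u - 0.5


def j_vdw(u):
    """van der Waerden score-generating function (2.3)."""
    return STD_NORMAL.inv_cdf(u)


# u -> phi_f(F^{-1}(u)) in closed form for three symmetric densities f
# (an illustrative choice, not a list taken from the paper).
LOCATION_SCORES = {
    "gaussian": STD_NORMAL.inv_cdf,
    "logistic": lambda u: 2.0 * u - 1.0,
    "cauchy": lambda u: math.sin(2.0 * math.pi * (u - 0.5)),
}


def k_norm(j):
    """K(J): integral of J(u)^2 over (0, 1)."""
    return midpoint(lambda u: j(u) ** 2)


def k_cross(j, score_f):
    """K(J, f): integral of J(u) * phi_f(F^{-1}(u)) over (0, 1)."""
    return midpoint(lambda u: j(u) * score_f(u))


def are(j1, j2, score_f):
    """ARE_f(phi_J1 / phi_J2) as in (2.1)."""
    c = k_cross(j1, score_f) / k_cross(j2, score_f)
    return k_norm(j2) / k_norm(j1) * c * c


if __name__ == "__main__":
    for name, score_f in LOCATION_SCORES.items():
        value = are(j_wilcoxon, j_vdw, score_f)
        print(f"{name:9s} ARE(W/vdW) = {value:.4f}  (bound 6/pi = {6 / math.pi:.4f})")
```

Each printed value should stay below $6/\pi$, which is the lower bound in (2.9) for the van der Waerden scores read in terms of $\mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$. The midpoint rule is used only to keep the sketch dependency-free; a proper quadrature routine would be preferable for heavier-tailed scores.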
Specializing (2.9) to this situation, we obtain the following proposition.
Proposition 2.1.
If the square-integrable score-generating function $J$ is of the form $\varphi_g \circ G^{-1}$ with $g$ given by (2.10), $\mu$ even, convex, and twice differentiable, then, under any $f \in \mathcal{F}_J$,

$$\left( \frac{\mu''(0)}{K} \right)^2 \leq 12\,\mathcal{I}(g)\,\mathrm{ARE}_f(\phi_J/\phi_W) \leq \big(\lim_{u \to 1} J'(u)\big)^2 = \big(\lim_{x \to \infty} \mu''(x)/g(x)\big)^2. \tag{2.11}$$

With $\mu(x) = x^2/2$ (so that $K^{-1} = \sqrt{2\pi}$) in (2.10), $g$ is the standard Gaussian density; $\mu''(0) = 1$, $\mathcal{I}(g) = 1$, and the lower bound in (2.11) becomes $(\mu''(0)/K)^2 = 2\pi$, that is, $\mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_W) \geq 2\pi/12 = \pi/6$, whereas the upper bound is trivially infinite. This yields the Hodges-Lehmann result (1.4).

Turning back to (2.6) and (2.7), but with $J$ concave (and still non-decreasing) on $[1/2, 1)$, $J'$ is nonincreasing, so that $\kappa_J^+ = J'(1/2)$ and

$$\mathrm{ARE}_f(\phi_J/\phi_W) \leq (J'(1/2))^2 / 12\,\mathcal{K}(J). \tag{2.12}$$

Not much can be said on the lower bound, though, without further assumptions on the behavior of $J$ around $u = 1$.

Replacing, for various score-generating functions $J$ and densities $f$, the quantities appearing in (2.6), (2.9) or (2.12) with their explicit values provides a variety of bounds of the Hodges-Lehmann type. Below, we consider the van der Waerden tests $\phi_{\mathrm{vdW}}$, based on the score-generating function (2.3), and the Cauchy-score rank tests $\phi_{\mathrm{Cauchy}}$, based on the score-generating function

$$J_{\mathrm{Cauchy}}(u) = \sin(2\pi(u - 1/2)). \tag{2.13}$$

Figure 1: $\mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}_f(\phi_{\mathrm{Cauchy}}/\phi_{\mathrm{vdW}})$ under various families of densities: symmetric stable (indexed by their tail parameter $\alpha$), Student-$t$ (indexed by their degrees of freedom $\nu$) or Pareto (indexed by their shape parameter $\alpha$).

Proposition 2.2. For all symmetric densities $f$ in $\mathcal{F}_{\mathrm{vdW}}$, $\mathcal{F}_{\mathrm{Cauchy}}$ and $\mathcal{F}_{\mathrm{vdW}} \cap \mathcal{F}_{\mathrm{Cauchy}}$, respectively,

(i) $\mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}}) \leq 6/\pi$;
(ii) $\mathrm{ARE}_f(\phi_{\mathrm{Cauchy}}/\phi_W) \leq 2\pi^2/3$;
(iii) $\mathrm{ARE}_f(\phi_{\mathrm{Cauchy}}/\phi_{\mathrm{vdW}}) \leq 4\pi$.

Proof. The van der Waerden score (2.3) is strictly increasing, and convex over $[1/2, 1)$. One readily obtains $\mathcal{K}(J_{\mathrm{vdW}}) = 1$ and

$$J'_{\mathrm{vdW}}(u) = \sqrt{2\pi}\, \exp\{(\Phi^{-1}(u))^2/2\},$$

hence $\kappa^-_{\mathrm{vdW}} = J'_{\mathrm{vdW}}(1/2) = \sqrt{2\pi}$. Plugging this into the left-hand side inequality of (2.9) gives $\mathrm{ARE}_f(\phi_{\mathrm{vdW}}/\phi_W) \geq 2\pi/12 = \pi/6$, which is (i). Alternatively, one can directly apply (2.11).

The Cauchy score is concave over $[1/2, 1)$, but not monotone (being of bounded variation, however, it is the difference of two monotone functions). Direct inspection of (2.13) nevertheless reveals that $\mathcal{K}(J_{\mathrm{Cauchy}}) = 1/2$ and

$$J'_{\mathrm{Cauchy}}(u) = 2\pi \cos(2\pi(u - 1/2)),$$

hence $\kappa^+_{\mathrm{Cauchy}} = J'_{\mathrm{Cauchy}}(1/2) = 2\pi$. Substituting this in (2.6) yields (ii), since $(2\pi)^2/(12 \cdot 1/2) = 2\pi^2/3$. The product of the upper bounds in (i) and (ii) yields (iii).

Remarkably, those three bounds are sharp. Indeed, numerical evaluation shows that they can be approached arbitrarily well by taking extremely heavy tails such as those of stable densities $f_\alpha$ with tail index $\alpha \to 0$, Student densities with degrees of freedom $\nu \to 0$, or Pareto densities with $\alpha \to 0$; see also the family $\mathcal{F}_{HL}$ of densities $f_{a,\epsilon}(x)$ defined in equation (3.1).

Figure 1 provides plots of $\mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}_f(\phi_{\mathrm{Cauchy}}/\phi_{\mathrm{vdW}})$ for various densities. Inspection of those graphs shows that both AREs are decreasing as the tails become lighter; the sharpness of bounds (i) and (iii), hence also that of bound (ii), is graphically confirmed.

The bounds proposed in Proposition 2.2 are not new, and have been obtained already in [2]. One would like to see similar bounds for other score functions, such as the Student ones

$$\begin{aligned} J_{t_\nu}(u) &= \frac{(\nu + 1)\, F^{-1}_{t_\nu}(u)}{\nu + \big(F^{-1}_{t_\nu}(u)\big)^2}, & 0 < u < 1, \\ &= \frac{1 + \nu}{\sqrt{\nu}} \sqrt{\big(1 - \mathrm{IB}_\nu(1 - 2u)\big)\, \mathrm{IB}_\nu(1 - 2u)}, & 1/2 \leq u < 1, \end{aligned} \tag{2.14}$$

where $\mathrm{IB}_\nu(v)$ denotes the inverse of the regularized incomplete beta function evaluated at $(1, v, \nu/2, 1/2)$ and $F^{-1}_{t_\nu}$ stands for the Student quantile function with $\nu$ degrees of freedom. Note that $\lim_{v \to -1} \mathrm{IB}_\nu(v) = 0$, so that $\lim_{u \to 1} J_{t_\nu}(u) = 0$. Since $J_{t_\nu}(1/2) = 0$ and $J'_{t_\nu}(1/2) > 0$, this means that, on $[1/2, 1)$, $J_{t_\nu}$ is a redescending function; in general, it is neither convex nor concave on $[1/2, 1)$.

Differentiating (2.14), we get, for $u \geq 1/2$,

$$J'_{t_\nu}(u) = \frac{\sqrt{\pi}\,(\nu + 1)\,\Gamma\!\left(\frac{\nu}{2}\right)}{\sqrt{\nu}\,\Gamma\!\left(\frac{\nu + 1}{2}\right)} \big(2\,\mathrm{IB}_\nu(1 - 2u) - 1\big)\, \mathrm{IB}_\nu(1 - 2u)^{\frac{1 - \nu}{2}}, \tag{2.15}$$

from which we deduce that

$$\lim_{u \to 1} J'_{t_\nu}(u) = \begin{cases} 0 & 0 < \nu < 1, \\ -2\pi & \nu = 1, \\ -\infty & 1 < \nu. \end{cases}$$

Except for the $\nu = 1$ case, which is covered by (ii) and (iii) in Proposition 2.2, these limits do not provide exploitable values for $\kappa^+$. For $\nu < 1$, however, one can check from (2.15) that $\max_{u \geq 1/2} |J'_{t_\nu}(u)| = J'_{t_\nu}(1/2)$, so that

$$\kappa^+_{J_{t_\nu}} = \sqrt{\pi}\,(\nu + 1)\,\Gamma\!\left(\frac{\nu}{2}\right) \Big/ \sqrt{\nu}\,\Gamma\!\left(\frac{\nu + 1}{2}\right).$$

Elementary though somewhat tedious algebra yields

$$\mathcal{K}(J_{t_\nu}) = (\nu + 1)/(\nu + 3).$$

Plugging this into (2.6), we obtain, for $\nu \leq 1$, the following additional bounds.

Proposition 2.3.
For all $0 < \nu \leq 1$ and all symmetric densities $f$ in $\mathcal{F}_{J_{t_\nu}}$ and $\mathcal{F}_{J_{t_\nu}} \cap \mathcal{F}_{J_{\mathrm{vdW}}}$, respectively,

(iv) $\mathrm{ARE}_f(\phi_{t_\nu}/\phi_W) \leq \dfrac{\pi\,\Gamma^2(\frac{\nu}{2})\,(\nu + 3)(\nu + 1)}{12\,\nu\,\Gamma^2(\frac{\nu + 1}{2})}$, and

(v) $\mathrm{ARE}_f(\phi_{t_\nu}/\phi_{\mathrm{vdW}}) \leq \dfrac{\Gamma^2(\frac{\nu}{2})\,(\nu + 3)(\nu + 1)}{2\,\nu\,\Gamma^2(\frac{\nu + 1}{2})}$.

Inequality (iv) is sharp, the bound being achieved, in the limit, under very heavy tails (stable densities with $\alpha \downarrow 0$, or Student-$t_\mu$ densities with $\mu \downarrow 0$). Since this is also the case, under the same sequences of densities, for inequality (i) in Proposition 2.2, inequality (v) is sharp as well. The upper bounds (iv) and (v) are both decreasing functions of the tail index $\nu$; both are unbounded at the origin, and both converge to the corresponding Cauchy values ($2\pi^2/3$ and $4\pi$, respectively) as $\nu \to 1$.

2.2. The serial case

Until the early eighties, and despite some forerunning time-series applications such as Wald and Wolfowitz [33] (published as early as 1943, two years before Frank Wilcoxon's pathbreaking 1945 paper [34]!), rank-based methods had been essentially limited to statistical models involving univariate independent observations. Therefore, the traditional ARE bounds (Hodges and Lehmann [20, 21], Chernoff-Savage [1] or Gastwirth [2]), as well as the classical monographs (Hájek and Šidák [4], Randles and Wolfe [30], Puri and Sen [29], to quote only a few) mainly deal with univariate location and single-output linear (regression) models with independent observations. The situation since then has changed, and rank-based procedures nowadays have been proposed for a much broader class of statistical models, including time series problems, where serial dependencies are the main features under study.

In this section, we focus on the linear rank statistics of the serial type involving two square-integrable score functions. Those statistics enjoy optimality properties in the context of linear time series (ARMA models; see [16] for details). Once adequately standardized, those statistics yield the so-called rank-based autocorrelation coefficients. Denote by $R^{(n)}_1, \ldots, R^{(n)}_n$ the ranks in a triangular array $X^{(n)}_1, \ldots, X^{(n)}_n$ of observations. Rank autocorrelations (with lag $k$) are linear serial rank statistics of the form

$$\tilde{r}^{(n)}_{J_1 J_2; k} := \left[ (n - k)^{-1} \sum_{t = k + 1}^{n} J_1\!\left( \frac{R^{(n)}_t}{n + 1} \right) J_2\!\left( \frac{R^{(n)}_{t - k}}{n + 1} \right) - m^{(n)}_{J_1 J_2} \right] \big( s^{(n)}_{J_1 J_2} \big)^{-1},$$

where $J_1$ and $J_2$ are (square-integrable) score-generating functions, whereas $m^{(n)}_{J_1 J_2}$ and $s^{(n)}_{J_1 J_2} := s^{(n)}_{J_1 J_2; k}$ denote, respectively, the exact mean of $J_1\big(R^{(n)}_t/(n+1)\big) J_2\big(R^{(n)}_{t-k}/(n+1)\big)$ and the exact standard error of $(n - k)^{-1} \sum_{t = k + 1}^{n} J_1\big(R^{(n)}_t/(n+1)\big) J_2\big(R^{(n)}_{t-k}/(n+1)\big)$ under the assumption of i.i.d. $X^{(n)}_t$'s (more precisely, exchangeable $R^{(n)}_t$'s); we refer to pages 186 and 187 of [16] for explicit formulas. Signed-rank autocorrelation coefficients are defined similarly; see [15] or [16].

Rank and signed-rank autocorrelations are measures of serial dependence offering rank-based alternatives to the usual autocorrelation coefficients, of the form

$$r^{(n)}_k := \sum_{t = k + 1}^{n} X_t X_{t - k} \Big/ \sum_{t = 1}^{n} X_t^2.$$

Of particular interest are

(i) the van der Waerden autocorrelations [14]

$$\tilde{r}^{(n)}_{\mathrm{vdW}; k} := \left[ (n - k)^{-1} \sum_{t = k + 1}^{n} \Phi^{-1}\!\left( \frac{R^{(n)}_t}{n + 1} \right) \Phi^{-1}\!\left( \frac{R^{(n)}_{t - k}}{n + 1} \right) - m^{(n)}_{\mathrm{vdW}} \right] \big( s^{(n)}_{\mathrm{vdW}} \big)^{-1},$$

(ii) the Wald-Wolfowitz or Spearman autocorrelations [33]

$$\tilde{r}^{(n)}_{\mathrm{SWW}; k} := \left[ (n - k)^{-1} \sum_{t = k + 1}^{n} R^{(n)}_t R^{(n)}_{t - k} - m^{(n)}_{\mathrm{SWW}} \right] \big( s^{(n)}_{\mathrm{SWW}} \big)^{-1},$$

(iii) and the Kendall autocorrelations [3] (where explicit values of $m^{(n)}_K$ and $s^{(n)}_K$ are provided)

$$\tilde{r}^{(n)}_{K; k} := \left[ 1 - \frac{4 D^{(n)}_k}{(n - k)(n - k - 1)} - m^{(n)}_K \right] \big( s^{(n)}_K \big)^{-1}$$

with $D^{(n)}_k$ denoting the number of discordances at lag $k$, that is, the number of pairs $(R^{(n)}_t, R^{(n)}_{t-k})$ and $(R^{(n)}_s, R^{(n)}_{s-k})$ that satisfy either $R^{(n)}_t < R^{(n)}_s$ and $R^{(n)}_{t-k} > R^{(n)}_{s-k}$, or $R^{(n)}_t > R^{(n)}_s$ and $R^{(n)}_{t-k} < R^{(n)}_{s-k}$; more specifically,

$$D^{(n)}_k := \sum_{t = k + 1}^{n} \sum_{s = t + 1}^{n} I\big( R^{(n)}_t < R^{(n)}_s,\ R^{(n)}_{t - k} > R^{(n)}_{s - k} \big).$$

The van der Waerden autocorrelations are optimal, in the sense that they allow for locally optimal rank tests in the case of ARMA models with normal innovation densities. The Spearman and Kendall autocorrelations are serial versions of Spearman's rho and Kendall's tau, respectively, and are asymptotically equivalent under the null hypothesis of independence; although they are never optimal for any ARMA alternative, they achieve excellent overall performance.

Let $J_i$, $i = 1, \ldots, 4$, denote four square-summable score functions, and assume that they are monotone increasing, or the difference between two monotone increasing functions (that assumption tacitly will be made in the sequel each time AREs are to be computed). Recall that $\mathcal{F}_2$ denotes the subclass of densities $f \in \mathcal{F}$ having finite moments of order two. The asymptotic relative efficiency, under innovation density $f \in \mathcal{F}_2$, of the rank-based tests $\phi^r_{J_1 J_2}$ based on the autocorrelations $\tilde{r}^{(n)}_{J_1 J_2; k}$ with respect to the rank-based tests $\phi^r_{J_3 J_4}$ based on the autocorrelations $\tilde{r}^{(n)}_{J_3 J_4; k}$ is

$$\begin{aligned} \mathrm{ARE}^*_f(\phi^r_{J_1 J_2}/\phi^r_{J_3 J_4}) &= \frac{\mathcal{K}(J_3)}{\mathcal{K}(J_1)} \left( \frac{\int_0^1 J_1(v)\,\varphi_f(F^{-1}(v))\,\mathrm{d}v}{\int_0^1 J_3(v)\,\varphi_f(F^{-1}(v))\,\mathrm{d}v} \right)^2 \frac{\mathcal{K}(J_4)}{\mathcal{K}(J_2)} \left( \frac{\int_0^1 J_2(v)\,F^{-1}(v)\,\mathrm{d}v}{\int_0^1 J_4(v)\,F^{-1}(v)\,\mathrm{d}v} \right)^2 \\ &= \frac{\mathcal{K}(J_3)}{\mathcal{K}(J_1)}\, C_f^2(J_1, J_3)\, \frac{\mathcal{K}(J_4)}{\mathcal{K}(J_2)}\, D_f^2(J_2, J_4) \end{aligned} \tag{2.16}$$

with $C_f(J_1, J_3) := \mathcal{K}(J_1, f)/\mathcal{K}(J_3, f)$ and $D_f(J_2, J_4) := \mathcal{J}(J_2, f)/\mathcal{J}(J_4, f)$. The $C_f$ ratios have been studied in Section 2.1, and the same conclusions apply here; as for the $D_f$ ratios, they can be treated by similar methods.

Denote by $\phi^r_{\mathrm{vdW}}$, $\phi^r_W$, $\phi^r_{\mathrm{SWW}}$, ... the tests based on $\tilde{r}^{(n)}_{\mathrm{vdW}; k}$, $\tilde{r}^{(n)}_{W; k}$, $\tilde{r}^{(n)}_{\mathrm{SWW}; k}$, etc. The serial counterpart of $\mathrm{ARE}_f(\phi_W/\phi_J)$ is $\mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{J_1 J_2})$, for which the following result holds.
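As a brief computational aside before the proposition, the rank autocorrelations defined in this section can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, ties are assumed absent, and the statistics are left unstandardized because the exact centering and scaling constants ($m^{(n)}$ and $s^{(n)}$) are given in [16] and [3] rather than in the text. The discordance count follows the verbal definition (a pair of time points is discordant when the two rank orderings disagree in either direction).

```python
from statistics import NormalDist


def ranks(x):
    """Ranks 1..n of the observations in x (assumes no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def serial_rank_statistic(x, k, j1, j2):
    """(n-k)^{-1} * sum over t of J1(R_t/(n+1)) * J2(R_{t-k}/(n+1)).

    Unstandardized: the centering m and scale s of the text are NOT applied.
    """
    n = len(x)
    r = ranks(x)
    total = sum(j1(r[t] / (n + 1)) * j2(r[t - k] / (n + 1)) for t in range(k, n))
    return total / (n - k)


def discordances(x, k):
    """Number of discordant pairs among the (R_t, R_{t-k}), t = k+1..n."""
    r = ranks(x)
    pairs = [(r[t], r[t - k]) for t in range(k, len(x))]
    count = 0
    for a in range(len(pairs)):
        for b in range(a + 1, len(pairs)):
            # Discordant when the two coordinates are ordered in opposite ways.
            if (pairs[a][0] - pairs[b][0]) * (pairs[a][1] - pairs[b][1]) < 0:
                count += 1
    return count


def kendall_serial_tau(x, k):
    """1 - 4 D_k / ((n-k)(n-k-1)), before centering and scaling."""
    n = len(x)
    return 1.0 - 4.0 * discordances(x, k) / ((n - k) * (n - k - 1))


def vdw_serial_statistic(x, k):
    """Unstandardized van der Waerden version (normal scores on both ranks)."""
    q = NormalDist().inv_cdf
    return serial_rank_statistic(x, k, q, q)
```

For a strictly increasing series every lagged pair is concordant, so `kendall_serial_tau` returns 1; a series whose lagged pairs are all discordant gives -1. The quadratic-time pair loop is chosen for readability, not efficiency.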
Proposition 2.4.
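Let the score functions $J_1$ and $J_2$ be monotone increasing, skew-symmetric about $1/2$, and differentiable, with strictly positive $J_1'(1/2)$ and $J_2'(1/2)$. Suppose that $f \in \mathcal{F}_2 \cap \mathcal{F}_{J_1} \cap \mathcal{F}_{J_2}$ is a symmetric probability density function. Then, if $J_1$ and $J_2$ are

(i) convex on $[1/2, 1)$,

$$\mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{J_1 J_2}) = \mathrm{ARE}^*_f(\phi^r_K/\phi^r_{J_1 J_2}) \leq \frac{144\,\mathcal{K}(J_1)\,\mathcal{K}(J_2)}{\big(J_1'(1/2)\,J_2'(1/2)\big)^2};$$

(ii) concave on $[1/2, 1)$,

$$\mathrm{ARE}^*_f(\phi^r_{J_1 J_2}/\phi^r_{\mathrm{SWW}}) = \mathrm{ARE}^*_f(\phi^r_{J_1 J_2}/\phi^r_K) \leq \frac{\big(J_1'(1/2)\,J_2'(1/2)\big)^2}{144\,\mathcal{K}(J_1)\,\mathcal{K}(J_2)}.$$

Proof.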
In view of (2.1) and (2.16), we have

$$\mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{J_1 J_2}) = \mathrm{ARE}_f(\phi_W/\phi_{J_1})\, \frac{\mathcal{K}(J_2)}{\mathcal{K}(J_W)} \left( \frac{\int_0^1 (v - 1/2)\,F^{-1}(v)\,\mathrm{d}v}{\int_0^1 J_2(v)\,F^{-1}(v)\,\mathrm{d}v} \right)^2.$$

Consider part (i) of the proposition. It follows from (2.7) that

$$\mathrm{ARE}_f(\phi_W/\phi_{J_1}) \leq 12\,\mathcal{K}(J_1) / (J_1'(1/2))^2.$$

Since $J_2$ is convex over $[1/2, 1)$, $J_2(u) \geq J_2'(1/2)(u - 1/2)$ for all $u \in [1/2, 1)$, so that

$$\int_0^1 J_2(v)\,F^{-1}(v)\,\mathrm{d}v = 2 \int_{1/2}^1 J_2(v)\,F^{-1}(v)\,\mathrm{d}v \geq 2\,J_2'(1/2) \int_{1/2}^1 (v - 1/2)\,F^{-1}(v)\,\mathrm{d}v.$$

It follows that

$$\frac{\mathcal{K}(J_2)}{\mathcal{K}(J_W)} \left( \frac{\int_0^1 (v - 1/2)\,F^{-1}(v)\,\mathrm{d}v}{\int_0^1 J_2(v)\,F^{-1}(v)\,\mathrm{d}v} \right)^2 \leq \frac{12\,\mathcal{K}(J_2)}{(J_2'(1/2))^2},$$

where the assumption of finite variance is used. Part (i) of the result follows. A similar argument holds (with reversed inequalities) if $J_2$ is concave, yielding part (ii).

Applying this result to the score functions $J_1(u) = J_2(u) = \Phi^{-1}(u)$ (convex over $[1/2, 1)$), for which $J_1'(1/2) = J_2'(1/2) = \sqrt{2\pi}$ and $\mathcal{K}(J_1) = \mathcal{K}(J_2) = 1$, we readily obtain the following serial extension of Hodges and Lehmann's "6/π result":

$$\mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}}) = \mathrm{ARE}^*_f(\phi^r_K/\phi^r_{\mathrm{vdW}}) \leq (6/\pi)^2. \tag{2.17}$$

An important difference, though, is that the bound in (2.17) is unlikely to be sharp. Section 3 provides some numerical evidence of that fact, which is hardly surprising: while the ratio $C_f(J_{\mathrm{vdW}}, J_W)$ is maximized for densities putting all their weight about the origin, this no longer holds true for $D_f(J_{\mathrm{vdW}}, J_W)$. In particular, the sequences of densities considered in [21] or [2] along which $C_f(J_{\mathrm{vdW}}, J_W)$ tends to its upper bound typically are not the same as those along which $D_f(J_{\mathrm{vdW}}, J_W)$ does.
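The serial efficiency (2.16) and the bound (2.17) can be explored numerically along the same lines as in the nonserial case. The sketch below is an illustration under stated assumptions, not the computation behind the paper's tables: the names (`midpoint`, `DENSITIES`, `serial_are_sww_vs_vdw`, `SERIAL_BOUND`) are hypothetical, integrals over $(0,1)$ use a simple midpoint rule, and only two finite-variance densities with closed-form $\varphi_f \circ F^{-1}$ and $F^{-1}$ are included.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def midpoint(g, n=20_000):
    """Midpoint-rule approximation of the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))


def j_sww(u):
    """Spearman-Wald-Wolfowitz (Wilcoxon) score u - 1/2."""
    return u - 0.5


def j_vdw(u):
    """van der Waerden score, the standard normal quantile function."""
    return STD_NORMAL.inv_cdf(u)


# Each density f is described by two functions of u in (0, 1):
# the location score phi_f(F^{-1}(u)) and the quantile function F^{-1}(u).
DENSITIES = {
    "gaussian": (STD_NORMAL.inv_cdf, STD_NORMAL.inv_cdf),
    "logistic": (lambda u: 2.0 * u - 1.0, lambda u: math.log(u / (1.0 - u))),
}


def serial_are_sww_vs_vdw(score_f, quantile_f):
    """ARE*_f(phi^r_SWW / phi^r_vdW) from (2.16) with J1 = J2 = J_W, J3 = J4 = J_vdW."""
    k_ratio = midpoint(lambda u: j_vdw(u) ** 2) / midpoint(lambda u: j_sww(u) ** 2)
    c = midpoint(lambda u: j_sww(u) * score_f(u)) / midpoint(lambda u: j_vdw(u) * score_f(u))
    d = midpoint(lambda u: j_sww(u) * quantile_f(u)) / midpoint(lambda u: j_vdw(u) * quantile_f(u))
    return (k_ratio * c * c) * (k_ratio * d * d)


SERIAL_BOUND = (6.0 / math.pi) ** 2  # right-hand side of (2.17)

if __name__ == "__main__":
    for name, (score_f, quantile_f) in DENSITIES.items():
        value = serial_are_sww_vs_vdw(score_f, quantile_f)
        print(f"{name:9s} ARE* = {value:.4f}  (bound {SERIAL_BOUND:.4f})")
```

Under the Gaussian density the result should be close to $9/\pi^2$, and both values sit well below $(6/\pi)^2$, in line with the remark that (2.17) is unlikely to be sharp.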
3. Some numerical results
In this final section, we provide numerical values of $\mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ (denoted as $\mathrm{ARE}_f$ in the sequel) and $\mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ (denoted as $\mathrm{ARE}^*_f$ in the sequel) under various families of distributions.

First, let us give some ARE values under Gaussian densities: if $f = \phi$, we obtain

$$C_\phi(J_W, J_{\mathrm{vdW}}) = D_\phi(J_W, J_{\mathrm{vdW}}) = \frac{1}{2\sqrt{\pi}} \approx 0.282,$$

so that

$$\mathrm{ARE}_\phi(\phi_W/\phi_{\mathrm{vdW}}) = \frac{3}{\pi} \approx 0.955 \quad \text{and} \quad \mathrm{ARE}^*_\phi(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}}) = \frac{9}{\pi^2} \approx 0.912.$$

Tables 1-3 provide numerical values of $\mathrm{ARE}_f$ and $\mathrm{ARE}^*_f$ under

(i) (Table 1) the two-parameter family $\mathcal{F}_{HL}$ of densities $f_{a,\epsilon}$ associated with the distribution functions

$$F_{a,\epsilon}(x) = \begin{cases} \Phi(x) & \text{if } 0 \leq x \leq \epsilon, \\ \Phi(\epsilon + a(x - \epsilon)) & \text{if } \epsilon < x, \end{cases} \tag{3.1}$$

where $F_{a,\epsilon}(x)$ is defined by symmetry for $x \leq 0$ (this family of distributions, which has been used by Hodges and Lehmann [21], is such that the nonserial $6/\pi$ bound is achieved, in the limit, as both $a$ and $\epsilon$ go to zero),

(ii) (Table 2) the family $\mathcal{F}_{\mathrm{Student}}$ of Student densities with degrees of freedom $\nu > 0$, and

(iii) (Table 3) the family $\mathcal{F}_e$ of power-exponential densities, of the form

$$f_\alpha(x) \propto \exp(-|x|^\alpha/\alpha), \qquad x \in \mathbb{R},\ \alpha > 0. \tag{3.2}$$

All tables seem to confirm the same findings: both the serial and the nonserial AREs are monotone in the size of the tails, with the nonserial $\mathrm{ARE}_f$ attaining its maximal value ($6/\pi \approx 1.910$) under heavy-tailed densities $f$, while the maximal value for the serial $\mathrm{ARE}^*_f$ lies somewhere around $(6/\pi)(3/\pi) \approx 1.82$.

Table 1: Numerical values of $C_f$, $D_f$, $\mathrm{ARE}_f = \mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}^*_f = \mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ under densities $f_{a,\epsilon}$ in the Hodges-Lehmann family $\mathcal{F}_{HL}$ (see (3.1)), for various values of $\epsilon$ and $a \to 0$.

Table 2: Numerical values of $C_f$, $D_f$, $\mathrm{ARE}_f = \mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}^*_f = \mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ under Student-$t$ densities with various degrees of freedom $\nu$.

Table 3: Numerical values of $C_f$, $D_f$, $\mathrm{ARE}_f = \mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}^*_f = \mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ under power-exponential densities for various values of the shape parameter $\alpha$.

Figure 2: Nonserial $\mathrm{ARE}_f = \mathrm{ARE}_f(\phi_W/\phi_{\mathrm{vdW}})$ (left plot) and serial $\mathrm{ARE}^*_f = \mathrm{ARE}^*_f(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ (right plot) under densities $f_{a,\epsilon}$ in the Hodges-Lehmann family $\mathcal{F}_{HL}$ (see (3.1)), as a function of $\epsilon \in [0, \ldots]$, for various choices of the parameter $a$.

Figure 3: Left plot: $\mathrm{ARE}_{f_\nu}(\phi_W/\phi_{\mathrm{vdW}})$ and $\mathrm{ARE}^*_{f_\nu}(\phi^r_{\mathrm{SWW}}/\phi^r_{\mathrm{vdW}})$ for $f_\nu$ the Student distribution, as a function of the degrees of freedom $\nu \in [2, \ldots]$. Right plot: $\mathrm{ARE}_{f_\alpha}$ and $\mathrm{ARE}^*_{f_\alpha}$ for the power-exponential densities $f_\alpha$ (3.2), as a function of the shape parameter $\alpha \in [0, \ldots]$.

Inspection of Table 1 reveals that, although the limit of $C_f$ as $a \to 0$ is monotone in the parameter $\epsilon$, the ratio $D_f$ is not; from Table 3, the highest values of $D_f$ under the distribution (3.1) are attained for $a \to \infty$ and $\epsilon \approx \ldots$

Under Student densities $f = f_{t_\nu}$, the nonserial $\mathrm{ARE}_f$ is decreasing with $\nu$, taking value 1.41277 at the Cauchy ($\nu = 1$), reaching value one at a non-integer $\nu$ between 15 and 16 (a value of $\nu$ that is not shown in the figure; Wilcoxon is thus outperforming van der Waerden up to $\nu = 15$ degrees of freedom, with van der Waerden taking over from $\nu = 16$ on), and tending to the Gaussian value $3/\pi \approx 0.955$ as $\nu \to \infty$. The serial $\mathrm{ARE}^*_f$ is undefined for $\nu \leq 2$; it increases for small values of $\nu$, from an infimum of 0.878736 (obtained as $\nu \downarrow 2$) up to a maximum of 0.968852 (reached about $\nu = 4.\ldots$), then slowly decreases to the Gaussian value 0.911891 as $\nu \to \infty$. Spearman-Wald-Wolfowitz and Kendall thus never outperform van der Waerden autocorrelations under Student densities.

Under the power-exponential densities $f = f_\alpha$, the nonserial $\mathrm{ARE}_f$ is decreasing with $\alpha$, with a supremum of $6/\pi$ (the Hodges-Lehmann bound, obtained as $\alpha \downarrow 0$), and reaches value one about $\alpha = 1.\ldots$ (similar local asymptotic performances of Wilcoxon and van der Waerden, thus, occur at power-exponentials with that parameter value). The serial $\mathrm{ARE}^*_f$ is quite bad as $\alpha \downarrow 0$, then rapidly increases for small values of $\alpha$, with a maximum of 1.08552 about $\alpha = 0.\ldots$, then deteriorates again as $\alpha \to \infty$; for $\alpha$ larger than 3, the serial and nonserial AREs roughly coincide.

Acknowledgments
This note originates in a research visit by the last two authors to the Department of Operations Research and Financial Engineering (ORFE) at Princeton University in the Fall of 2012; ORFE's support and hospitality are gratefully acknowledged. Marc Hallin's research is supported by the Sonderforschungsbereich "Statistical modelling of nonlinear dynamic processes" (SFB 823) of the Deutsche Forschungsgemeinschaft, a Discovery Grant of the Australian Research Council, and the IAP research network grant P7/06 of the Belgian government (Belgian Science Policy). We gratefully acknowledge the pertinent comments by an anonymous referee on the original version of the manuscript, which led to substantial improvements.

References

[1] H. Chernoff and I.R. Savage (1958). Asymptotic normality and efficiency of certain nonparametric tests. Annals of Mathematical Statistics 29, 972–994.
[2] J.L. Gastwirth (1970). On asymptotic relative efficiencies of a class of rank tests. Journal of the Royal Statistical Society Series B 32, 227–232.
[3] T.S. Ferguson, C. Genest, and M. Hallin (2000). Kendall's tau for serial dependence. The Canadian Journal of Statistics 28, 587–604.
[4] J. Hájek and Z. Šidák (1967). Theory of Rank Tests. Academic Press, New York.
[5] M. Hallin (1994). On the Pitman non-admissibility of correlogram-based methods. Journal of Time Series Analysis 15, 607–611.
[6] M. Hallin (2012). Asymptotic Relative Efficiency. In W. Piegorsch and A. El Shaarawi, Eds, Encyclopedia of Environmetrics, 2nd edition, Wiley, 106–110.
[7] M. Hallin and D. Paindaveine (2002). Optimal tests for multivariate location based on interdirections and pseudo-Mahalanobis ranks. Annals of Statistics 30, 1103–1133.
[8] M. Hallin and D. Paindaveine (2002). Optimal procedures based on interdirections and pseudo-Mahalanobis ranks for testing multivariate elliptic white noise against ARMA dependence. Bernoulli, 787–815.
[9] M. Hallin and D. Paindaveine (2004). Rank-based optimal tests of the adequacy of an elliptic VARMA model. Annals of Statistics 32, 2642–2678.
[10] M. Hallin and D. Paindaveine (2005). Affine-invariant aligned rank tests for the multivariate general linear model with ARMA errors. Journal of Multivariate Analysis 93, 122–163.
[11] M. Hallin and D. Paindaveine (2006). Semiparametrically efficient rank-based inference for shape: I Optimal rank-based tests for sphericity. Annals of Statistics 34, 2707–2756.
[12] M. Hallin and D. Paindaveine (2008a). Chernoff-Savage and Hodges-Lehmann results for Wilks' test of independence. In N. Balakrishnan, Edsel Pena and Mervyn J. Silvapulle, Eds, Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen. I.M.S. Lecture Notes-Monograph Series, 184–196.
[13] M. Hallin and D. Paindaveine (2008b). Optimal rank-based tests for homogeneity of scatter. Annals of Statistics 36, 1261–1298.
[14] M. Hallin and M.L. Puri (1988). Optimal rank-based procedures for time-series analysis: testing an ARMA model against other ARMA models. Annals of Statistics 16, 402–432.
[15] M. Hallin and M.L. Puri (1992). Rank tests for time series analysis. In New Directions In Time Series Analysis (D. Brillinger, E. Parzen and M. Rosenblatt, eds), Springer-Verlag, New York, 111–154.
[16] M. Hallin and M.L. Puri (1994). Aligned rank tests for linear models with autocorrelated error terms. Journal of Multivariate Analysis 50, 175–237.
[17] M. Hallin, Y. Swan, T. Verdebout, and D. Veredas (2011). Rank-based testing in linear models with stable errors. Journal of Nonparametric Statistics 23, 305–320.
[18] M. Hallin, Y. Swan, T. Verdebout, and D. Veredas (2013). One-step R-estimation in linear models with stable errors. Journal of Econometrics
[19] Game Theory, Optimal Stopping, Probability, and Statistics, Papers in honor of T.S. Ferguson on the occasion of his 70th birthday, I.M.S. Lecture Notes-Monograph Series, 249–262.
[20] J.L. Hodges and E.L. Lehmann (1956). The efficiency of some nonparametric competitors of the t-test. Annals of Mathematical Statistics 27, 324–335.
[21] J.L. Hodges and E.L. Lehmann (1961). Comparison of the normal scores and Wilcoxon tests. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability Vol. 1, 307–318.
[22] H.L. Koul and A.K.Md.E. Saleh (1993). R-Estimation of the parameters of autoregressive AR(p) models. The Annals of Statistics 21, 685–701.
[23] H.L. Koul and A.K.Md.E. Saleh (1995). Autoregression quantiles and related rank-scores processes. The Annals of Statistics 25, 670–689.
[24] Y. Nikitin (1995). Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press, Cambridge.
[25] G.E. Noether (1955). On a theorem of Pitman. Annals of Mathematical Statistics 26, 64–68.
[26] D. Paindaveine (2004). A unified and elementary proof of serial and non-serial, univariate and multivariate, Chernoff–Savage results. Statistical Methodology 1, 81–91.
[27] D. Paindaveine (2006). A Chernoff–Savage result for shape: on the non-admissibility of pseudo-Gaussian methods. Journal of Multivariate Analysis 97, 2206–2220.
[28] E.J.G. Pitman (1949). Notes on Nonparametric Statistical Inference. Columbia University, mimeographed.
[29] M.L. Puri and P.K. Sen (1985). Nonparametric Methods in General Linear Models. J. Wiley, New York.
[30] R.H. Randles and D.A. Wolfe (1979). Introduction to the theory of nonparametric statistics. John Wiley & Sons, New York.
[31] R. Serfling (1980). Approximation Theorems of Mathematical Statistics. John Wiley & Sons, New York.
[32] A.W. van der Vaart (1998). Asymptotic Statistics. Cambridge University Press, Cambridge.
[33] A. Wald and J. Wolfowitz (1943). An exact test for randomness in the nonparametric case based on serial correlation. Annals of Mathematical Statistics 14, 378–388.
[34] F. Wilcoxon (1945). Individual comparisons by ranking methods. Biometrics Bulletin 1