The complex behaviour of Galton rank order statistic ∗
E. del Barrio, J.A. Cuesta-Albertos and C. Matrán
Departamento de Estadística e Investigación Operativa and IMUVA, Universidad de Valladolid
Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria
February 5, 2021
Abstract
Galton’s rank order statistic is one of the oldest statistical tools for two-sample comparisons. It is also a very natural index to measure departures from stochastic dominance. Yet, its asymptotic behaviour has been investigated only partially, under restrictive assumptions. This work provides a comprehensive study of this behaviour, based on the analysis of the so-called contact set (a modification of the set in which the quantile functions coincide). We show that a.s. convergence to the population counterpart holds if and only if the contact set has zero Lebesgue measure. When this set is finite we show that the asymptotic behaviour is determined by the local behaviour of a suitable reparameterization of the quantile functions in a neighbourhood of the contact points. Regular crossings result in standard rates and Gaussian limiting distributions, but higher order contacts (in the sense introduced in this work) or contacts at the extremes of the supports may result in different rates and non-Gaussian limits.
Keywords: Relaxed stochastic dominance, asymptotics, consistency, Galton rank order statistic, comparison of quantile functions, contact points, crossings, tangencies, contact intensity.

∗ Research partially supported by FEDER, Spanish Ministerio de Economía y Competitividad, grants MTM2014-56235-C2-1-P and MTM2017-86061-C2-1-P and Junta de Castilla y León, grants VA005P17 and VA002G18.

1 Introduction and main results
The Introductory Remarks in Darwin’s report on the benefits of cross-fertilization to the propagation of vegetal species [Darwin (1876)] include the following comment, by Galton: “The observations... have no primâ facie appearance of regularity. But as soon as we arrange them in order of their magnitudes,... We now see, with few exceptions, that... the largest plant on the crossed side... exceeds the largest plant on the self-fertilised side, that... the second exceeds the second,... and so on...”. With this argument, Galton opened a simple way of comparing distributions, just by comparing the values with the same ranks in their respective settings.

Given two samples of equal size, $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$, respectively coming from the distribution functions (d.f.’s in the sequel) $F$ and $G$, let us denote by $F_n$ and $G_n$ the corresponding sample d.f.’s. Galton’s solution consisted in reordering both data samples in increasing order, $X_{(1)}, \dots, X_{(n)}$ (coming from the control) and $Y_{(1)}, \dots, Y_{(n)}$ (from the treatment), and computing $\mathcal{G}(F_n, G_n) := \#\{i : X_{(i)} > Y_{(i)}\}$, concluding improvement under the treatment whenever $\mathcal{G}(F_n, G_n)$ is small enough. When $F = G$ is continuous, the distribution of this ‘Galton Rank Order’ statistic is uniform on $\{0, 1, \dots, n\}$ (see [Chung and Feller (1949)]; see also [Sparre-Andersen (1953)], [Hodges(1955)] or [Feller(1968)] for alternative proofs). As explained in [Hodges(1955)], in Darwin’s problem the sample sizes were 15 and $\mathcal{G}(F_{15}, G_{15}) = 2$, thus the $p$-value associated to Galton’s approach is 3/16, which is not as rare as he suspected.

Galton’s strategy was related to the assessment of stochastic dominance of $G$ over $F$, $F <_{st} G$, being the alternative to the null hypothesis $F = G$. Recall that, by definition, $F \le_{st} G$ whenever $F(x) \ge G(x)$ for every $x \in \mathbb{R}$.
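The rank comparison just described can be sketched in a few lines. This is only an illustration under the stated assumptions (equal sample sizes, no ties); the function names `galton_rank_order` and `galton_p_value` are hypothetical, and the p-value uses the uniform law on $\{0, 1, \dots, n\}$ quoted above for continuous $F = G$.

```python
import numpy as np

def galton_rank_order(x, y):
    """Number of ranks i with X_(i) > Y_(i), for two samples of equal size."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("samples must have equal size")
    return int(np.sum(x > y))

def galton_p_value(count, n):
    """P(statistic <= count) when the statistic is uniform on {0, 1, ..., n}."""
    return (count + 1) / (n + 1)

# With n = 15 and an observed count of 2 this gives 3/16, as in the text.
p_value = galton_p_value(2, 15)
```

For instance, `galton_rank_order([1, 5, 3], [2, 4, 6])` compares the sorted values (1, 3, 5) with (2, 4, 6) rank by rank and returns 0.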
As noted in [Lehmann(1955)], this relation is better understood when it is stated in terms of the quantile functions: if $F^{-1}$ is the quantile function associated to $F$, defined by
$$F^{-1}(t) := \inf\{x : t \le F(x)\}, \quad \text{for } t \in (0,1), \tag{1}$$
then $F \le_{st} G$ whenever $F^{-1}(t) \le G^{-1}(t)$ for every $t \in (0,1)$.

A useful feature of the quantile functions is that they provide a canonical representation of random variables (r.v.’s in what follows) with a given d.f.: if we consider the Lebesgue measure, $\ell$, on the unit interval $(0,1)$, then $F^{-1}$ is a r.v. with d.f. $F$. With this in mind, we set
$$\gamma(F, G) := \ell\{t : F^{-1}(t) > G^{-1}(t)\}, \tag{2}$$
so that, for samples of equal size, Galton’s statistic can be written as
$$\mathcal{G}(F_n, G_n) = n\,\gamma(F_n, G_n). \tag{3}$$
(We have tried to use standard or natural notation throughout. However, a complete enough notation guide is included at the end of this section.)

Early work on Galton’s rank statistic focused on the case $F = G$ and equal sample sizes. Special mention should be given to [Csáki and Vincze (1961)], which analyzes the joint behaviour of the Kolmogorov-Smirnov and Galton statistics (under $F = G$). Also for equal sample sizes, later, [Gross and Holland(1968)] considered the intermediate case with $F \ne G$ possibly, but $\ell\{F^{-1} = G^{-1}\} > 0$. Focusing on the dominance model $F = G$ vs $F \le_{st} G$, [Behnen and Neuhaus (1983)] addressed the local asymptotic efficiency of $\gamma(F_n, G_m)$, noting that it is just a generalization of Galton’s statistic (recall (3)) and using empirical process techniques to obtain the asymptotic distribution of $\gamma(F_n, G_m)$ under the null $F = G$ for independent samples with different sizes. Independently, looking for a feasible statistical way of relaxing the idea of “treatment improvement” underlying stochastic dominance, [Álvarez-Esteban et al.(2017)] introduced (2) as a naïve index to measure deviation from stochastic dominance, $F \le_{st} G$, and provided some asymptotic theory for the empirical index, for the case of d.f.’s with a single crossing point (the typical case in a location-scale family setting). Along the same lines, [Zhuang et al (2019)] adapted the theory to cover even a finite number of crossings between the d.f.’s, under the additional assumption of an exponential density ratio model and using semiparametric estimates of the quantile functions.

Here, in a wide setting, we provide a complete set of distributional limit results for Galton’s rank order statistic, showing the complex panorama of the asymptotic behaviour of $\gamma(F_n, G_m)$. In particular, we pursue the goal of analyzing the scarcely treated case of a finite number of contact points between $F^{-1}$ and $G^{-1}$, leading to a sound study of the local behaviour at every isolated contact point between quantile functions. This focuses on the consideration of the “contact intensity” (to be properly defined), which exceeds the merely visual scope of crossing points of smooth enough curves and presents certain similarities with concepts lying in Stochastic Geometry. That contact intensity relies on the existence of a local Lipschitzian reparameterization of one curve in terms of the other.
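For samples of different sizes, the empirical index $\gamma(F_n, G_m)$ is the Lebesgue measure of the set where one empirical quantile function exceeds the other. A minimal sketch of this computation follows, assuming the left-continuous empirical quantile function $F_n^{-1}(t) = X_{(\lceil nt \rceil)}$ given by definition (1); the name `gamma_index` is hypothetical. Both empirical quantile functions are constant between consecutive points of the merged grid $\{i/n\} \cup \{j/m\}$, so it suffices to compare them at the midpoints of those cells.

```python
import numpy as np

def gamma_index(x, y):
    """Lebesgue measure of {t in (0,1) : F_n^{-1}(t) > G_m^{-1}(t)}."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    n, m = len(x), len(y)
    # Breakpoints of both step functions, merged and de-duplicated.
    grid = np.union1d(np.arange(n + 1) / n, np.arange(m + 1) / m)
    mids = (grid[:-1] + grid[1:]) / 2
    widths = np.diff(grid)
    # F_n^{-1}(t) = X_(ceil(n t)); shift by one for 0-based indexing.
    ix = np.ceil(n * mids).astype(int) - 1
    iy = np.ceil(m * mids).astype(int) - 1
    return float(np.sum(widths[x[ix] > y[iy]]))
```

When the two samples have the same size $n$, `n * gamma_index(x, y)` counts the ranks at which the first sample exceeds the second, in agreement with relation (3).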
Next we introduce the basic concepts we handle and explain the main results, whose proofs are deferred to Sections 3 and 4.

Intuitively, the asymptotic behaviour of $\gamma(F_n, G_m)$ depends on the size of the contact set, namely, the set
$$\Gamma := \{t : F^{-1}(t) = G^{-1}(t)\}. \tag{4}$$
For equal sized samples this was already observed in [Gross and Holland(1968)]. We note that, since the index $\gamma$ is invariant with respect to strictly increasing transformations, the set $\Gamma$ could be equivalently expressed, in regular cases, as $\{t : \Lambda(F^{-1}(t)) = 0\}$, where $\Lambda(x) := G^{-1}(F(x)) - x$ is the shift function introduced in [Doksum (1974)] as a richer alternative to the difference of means for comparing two continuous d.f.’s. The analysis of the Q-Q process associated to $\Lambda$ was done in [Aly (1986)], under smoothness assumptions, through strong approximations. Yet, intuition may fail without some regularity conditions and, as we show in this work, $\Gamma$ is not really the right set to look at. In fact, the asymptotic analysis of Galton’s rank statistic is better handled in terms of the alternative shift function $h(t) := F_G(t) - t$, underlying the associated P-P process considered in [Aly et al.(1987)]. Here, and throughout this work, we denote $F_G := F \circ G^{-1}$ (similarly, $G_F = G \circ F^{-1}$) and
$$\tilde\Gamma := \{t : F_G(t) = t\}. \tag{5}$$
We observe that if $F$ and $G$ are continuous, then $\tilde\Gamma = \Gamma$. However, these sets can be quite different: for $F = G$, a Bernoulli distribution with mean $p$, we have $\Gamma = [0,1]$, while $\tilde\Gamma = \{1-p, 1\}$. By focusing on the ‘right’ choice of contact set, our results go beyond the cases that could be treated from the analyses in [Aly (1986)] and [Aly et al.(1987)]. In fact, we provide necessary and sufficient conditions for the a.s. consistency of $\gamma(F_n, G_m)$ without any smoothness assumption:

Theorem 1.1 Let $F, G$ be arbitrary d.f.’s. Then $\gamma(F_n, G_m) \xrightarrow{a.s.} \gamma(F, G)$, as $n, m \to \infty$, if and only if $\ell(\tilde\Gamma) = 0$.

A similar result holds for the one-sample statistic, $\gamma(F_n, G)$. As we see from Theorem 1.1, if $\ell(\tilde\Gamma) > 0$ then $\gamma(F_n, G_m)$ (or $\gamma(F_n, G)$) are not consistent estimators of $\gamma(F, G)$. In this case, we provide a completely general result about the asymptotic behaviour of $\gamma(F_n, G)$:

Theorem 1.2 Let $F, G$ be arbitrary d.f.’s. Then
$$\gamma(F_n, G) - \gamma(F, G) \xrightarrow{w} \ell\{t \in \tilde\Gamma : B(t) > 0\}, \quad \text{as } n \to \infty,$$
where $B$ is a standard Brownian bridge on $[0,1]$.

Still in the case $\ell(\tilde\Gamma) > 0$, we prove weak convergence of the two-sample statistic, $\gamma(F_n, G_m)$, under mild assumptions. This problem was also treated in [Gross and Holland(1968)] for equal sample sizes ($n = m$), through combinatorial arguments and the method of moments. That combinatorial approach seems to be inappropriate to handle the case of unequal sample sizes. Additionally, our version yields a simple representation of the limit law.

Theorem 1.3 Let $F, G$ be d.f.’s such that $F_G$ is Lipschitz. If $B$ is a standard Brownian bridge on $[0,1]$ and $m, n \to \infty$ satisfy $0 < \liminf \frac{n}{m+n} \le \limsup \frac{n}{m+n} < 1$, then
$$\gamma(F_n, G_m) - \gamma(F, G) \xrightarrow{w} \ell\{t \in \tilde\Gamma : B(t) > 0\}.$$
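As a rough numerical illustration of Theorem 1.3 in the simplest case, one can take $F = G$ standard normal, so that $\gamma(F, G) = 0$ and $\tilde\Gamma = (0,1)$. The occupation time of a standard Brownian bridge above zero is uniform on $[0,1]$, so the simulated values of $\gamma(F_n, G_m)$ should have mean close to $1/2$ and variance close to $1/12$. The sketch below is a Monte Carlo check under these assumptions, not a proof; the sample sizes, the seed and the helper `gamma_index` are arbitrary illustrative choices.

```python
import numpy as np

def gamma_index(x, y):
    """Lebesgue measure of {t in (0,1) : F_n^{-1}(t) > G_m^{-1}(t)}."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    n, m = len(x), len(y)
    grid = np.union1d(np.arange(n + 1) / n, np.arange(m + 1) / m)
    mids = (grid[:-1] + grid[1:]) / 2
    widths = np.diff(grid)
    ix = np.ceil(n * mids).astype(int) - 1  # F_n^{-1}(t) = X_(ceil(n t))
    iy = np.ceil(m * mids).astype(int) - 1
    return float(np.sum(widths[x[ix] > y[iy]]))

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
n, m, reps = 200, 300, 1000      # unequal sample sizes, n / (n + m) = 0.4
values = np.array([
    gamma_index(rng.normal(size=n), rng.normal(size=m)) for _ in range(reps)
])
sample_mean = values.mean()      # should be near 1/2 if the limit is uniform
sample_var = values.var()        # should be near 1/12
```

Comparing a histogram of `values` with the uniform density gives a more detailed, still informal, picture of the limit law.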
It should be noted that the limiting distribution in Theorems 1.2 or 1.3 is non-degenerate if and only if $\ell(\tilde\Gamma) > 0$. A classical result (see [Lévy(1939)] or p. 85-86 in [Billingsley (1968)]) is that if $B$ is a standard Brownian bridge on $[0,1]$, then
$$P(\ell\{t \in [0,1] : B(t) > 0\} \le x) = x, \quad \text{for every } x \in [0,1].$$
From this and Theorem 1.3 we recover, asymptotically, the classical result for the case $m = n$ and continuous $G = F$ (recall that in this case $\gamma(F_n, G_n)$ is uniformly distributed over $\{0, \frac{1}{n}, \dots, \frac{n-1}{n}, 1\}$; continuity of $F$ ensures that $F(F^{-1}(t)) = t$ for every $t \in (0,1)$ and Theorem 1.3 applies with $\tilde\Gamma = (0,1)$).

When $\ell(\tilde\Gamma) = 0$ the limiting distribution in Theorems 1.2 and 1.3 is degenerate at 0. In Section 4 we obtain non-degenerate limiting distributions, with different rates, when the contact set consists of a finite collection of contact points. The key is the local asymptotic behaviour of $F_n^{-1} - G_m^{-1}$ around these influential points. To avoid unnecessary smoothness assumptions here, we must consider contact points between nondecreasing functions in a generalized sense, including virtual contact points: those corresponding to contacts between the vertical segments joining lateral limits at discontinuity points. Since quantile functions are left continuous, the following definition includes all these contact points.

Definition 1.4 We say that $t \in (0,1)$ is a (generalized) contact point between $F^{-1}$ and $G^{-1}$ if either (i) $F^{-1}(t) = G^{-1}(t)$ or (ii) $F^{-1}(t) < G^{-1}(t) \le F^{-1}(t+)$ or (iii) $G^{-1}(t) < F^{-1}(t) \le G^{-1}(t+)$.
1. We should also note that, while (8) excludes discontinuity points for $F_G$, in particular, virtual contact points between $F_G$ and the identity, our approach allows us to handle these points in a rather straightforward way (see (42), (43) and Theorem 4.10). Finally, we note that while (8) requires the contact orders to be at least 1, lower orders can also be considered. If, for instance, ∆( h ) = sgn( h ) | h | r , with 0 < r <
1, then ∆is not Lipschitz around 0, but G F is and, under some additional assumptions, the localbehaviour can be studied through ˜ ℓ t m,n , the version of ℓ t n,m in which the roles of the X and Y samples are exchanged (see the comments before the proof of Theorem 1.5).For a compact description of the limit distribution for the terms ℓ t n,m we consider inde-pendent random elements B , B , W , W , { ξ ,n } n ≥ , { ξ ,n } n ≥ , { ξ ,n } n ≥ , { ξ ,n } n ≥ , where6 i are Brownian bridges on [0 , W i are Brownian motions on [0 , ∞ ) and { ξ i,n } n ≥ sequences of i.i.d. exponential r.v.’s with unit mean. We set S ik := ξ i, + · · · + ξ i,k , k ≥ , i = 1 , . . . ,
4. We fix λ ∈ (0 ,
1) and set B λ := √ λ B − √ − λ B .We consider r L , r R ≥ r := max( r L , r R ). Also, for real numbers a, b , wewill use the notation a sgn ( b ) for a + (the positive part of a )) either a − (the negative part)depending on whether b > b < . For t ∈ (0 ,
1) we define T r L ,r R ( t ; C L , C R ) := sgn( C L ) (cid:16) ( B λ ( t )) sgn( CL ) | C L | (cid:17) /r I ( r L = r )+ sgn( C R ) (cid:16) ( B λ ( t )) sgn( CR ) | C R | (cid:17) /r I ( r R = r ) , (9)when r > r = 1 and C R C L >
0, while T , ( t ; C L , C R ) := ( B λ ( t )) sgn( CL ) C L + ( B λ ( t )) sgn( CR ) C R + sgn( C L ) B ( t ) √ − λ , (10)when C L C R <
0. Additionally, for r > t = 0 , T r ,r ( t ; C, C ) := sgn( C ) ℓ (cid:8) y ∈ (0 , ∞ ) : sgn( C ) W t ( y ) > ( λ (1 − λ )) / | C | y r (cid:9) , (11)while in the case r = 1 we set T , (0; C, C ) := sgn( C ) λ (1 − λ ) R ∞ I (cid:8) sgn ( C ) λS ⌈ (1 − λ ) y ⌉ > sgn ( C )(1 − λ )(1+ C ) S ⌈ λy ⌉ (cid:9) dyT , (1; C, C ) := sgn( C ) λ (1 − λ ) R ∞ I (cid:8) sgn ( C ) λS ⌈ (1 − λ ) y ⌉ > sgn ( C )(1 − λ )(1+ C ) S ⌈ λy ⌉ (cid:9) dy. (12)The double subindex and the double C are redundant for these extremal contact points,but allows to keep a simple notation.We are ready to present the results describing the asymptotic behaviour of ℓ t n,m forregular contact points. Theorem 1.5 deals with innner contact points, while extremalcontact points are considered in Theorem 1.6 (in fact, Theorem 1.5 remains valid forextremal contact points, but the limit distribution is Dirac’s measure on 0 in that case). Theorem 1.5
Assume $t$ is a regular inner contact point with contact orders $r_L = r_L(t)$, $r_R = r_R(t) \ge 1$ and constants $C_L = C_L(t)$, $C_R = C_R(t)$. If $r = \max(r_L, r_R)$ and $n, m \to \infty$ with $\frac{n}{n+m} \to \lambda \in (0,1)$, then, for every small enough $\eta > 0$,
$$(n+m)^{\frac{1}{2r}}\, \ell^{t}_{n,m} \xrightarrow{w} T_{r_L, r_R}(t; C_L, C_R).$$

Theorem 1.6 Assume $t \in \{0, 1\}$ is regular with contact order $r \ge 1$ and constant $C$. If $m, n \to \infty$ with $\frac{n}{n+m} \to \lambda \in (0,1)$, then, for every small enough $\eta > 0$,
$$(n+m)^{\frac{1}{2r-1}}\, \ell^{t}_{n,m} \xrightarrow{w} T_{r,r}(t; C, C). \tag{13}$$

We see from Theorems 1.5 and 1.6 that, with the same contact intensities, $\ell^{t}_{n,m}$ vanishes faster for extremal contact points. In Subsection 4.1 we provide examples of extremal contact points for which $\ell^{t}_{n,m}$ converges at rate $(n+m)^{-c}$ for every $c \in (0,1]$ and of inner contact points for which the rate is $(n+m)^{-c}$, $c \in (0, \frac12]$. Another distinctive feature of the limiting distributions for inner contact points is that for crossing points (those with $C_L(t) C_R(t) < 0$) the limiting distribution takes positive and negative values with positive probabilities. If $t$ is a tangency point ($C_L(t) C_R(t) > 0$) then the limiting distribution is concentrated on $(0, \infty)$ or on $(-\infty, 0)$. The next result describes the asymptotic behaviour of $\gamma(F_n, G_m)$.

Theorem 1.7 Assume $\Gamma^* = \{t_1, \dots, t_k\}$ where $t_i$ is a regular contact point with intensities $r_L(t_i), r_R(t_i)$ and constants $C_L(t_i), C_R(t_i)$, $i = 1, \dots, k$. Set $r_i = \max(r_L(t_i), r_R(t_i))$ if $t_i \in (0,1)$, and $r_i = \max(r_L(t_i), r_R(t_i)) - \frac12$ if $t_i \in \{0, 1\}$. Then, if $r = \max_{1 \le i \le k} r_i$,
$$(n+m)^{\frac{1}{2r}} \left(\gamma(F_n, G_m) - \gamma(F, G)\right) \xrightarrow{w} \sum_{i=1}^{k} I(r_i = r)\, T_{r_L(t_i), r_R(t_i)}(t_i; C_L(t_i), C_R(t_i)).$$

Theorem 1.7 shows that the rate of convergence of $\gamma(F_n, G_m)$ is determined by the maximal intensity of contact, and that only points with maximal intensity contribute to the limiting distribution, with adjustments to take into account the different role of inner and extremal contact points. If there are extremal contact points then the rate of convergence can be $(n+m)^{c}$ for any $c \in (0,1]$; otherwise the rate is $(n+m)^{c}$ with $c \in (0, \frac12]$. The only case in which $\gamma(F_n, G_m)$ is asymptotically normal is when the inner contact points have intensity one, all of them with constants $C_L = -C_R$, and there is no extremal contact point or its influence vanishes faster.

The remaining sections of this work are organized as follows. Section 2 includes some key results on quantile functions and analyzes the structure of the contact sets. We will explicitly formulate several results on quantile functions. Some are classical but, in fact, it is not an easy task to find a comprehensive reference on quantile functions, with the notable exception of Appendix A in [Bobkov and Ledoux(2016)] on ‘Inverse Distribution Functions’. We observe that [Bobkov and Ledoux(2016)] is devoted to the analysis of convergence rates of Kantorovich transport distances between probability measures on the real line, which can be expressed in terms of quantile functions as $\int_0^1 |F^{-1}(t) - G^{-1}(t)|^p\, dt$, thus our problem corresponds to the limiting case $p = 0$.
Remarkably, this problem also encompasses a wide range of convergence rates.

In Section 3 we provide the proofs of Theorems 1.1, 1.2 and 1.3. Most of the limit theorems that we give for Galton’s rank statistic are based on convenient representations of empirical quantile functions, combined with some type of strong approximation. Using representation (14) below, we can derive limit theorems for Galton’s rank statistic relying on strong approximations for uniform quantile processes, rather than using strong approximations for general quantile processes (as, for instance, in Chapter 6 in [Csörgo and Horvath (1993)]). This results in a significant gain in generality, since approximations for general quantile processes typically require strong smoothness assumptions (existence of densities plus additional conditions on them) that we can circumvent with this approach.

Section 4 gives the proofs of Theorems 1.5, 1.6 and 1.7. The key ingredients for this will be, as in Section 3, a convenient representation of the quantile processes and some application of strong approximations. With some simple localization results (Lemma 4.1 and Corollary 4.2) we see that the asymptotic behaviour of $\gamma(F_n, G_n)$ can be studied through that of the localized terms $\ell^{t}_{m,n}$ with $t$ in the contact set. Some results on the asymptotic independence between lower, central and upper order statistics then allow us to complete the proof of Theorem 1.7. Subsection 4.1 provides some examples of contact points with different positions and contact intensities. This subsection also includes a simplified version of Theorem 1.7 under conditions that guarantee that $F_G$ is smooth (see Theorem 4.9), and a further limit theorem (Theorem 4.10) for the case when $F$ and $G$ have finite supports. This is an interesting example which can be handled with our approach even though the contact points here are not regular contact points.

We include an Appendix with some additional material.
The first part is devotedto some properties of the F G transform, including a technical discussion on conditionswhich guarantee that F G is Lipschitz or locally Lipschitz. Finally, we present a strongapproximation result that we have used in several proofs. We end this Introduction with some words on notation. Through the paper L ( X ) willdenote the law of the random vector or r.v. X . We will consider a generic probabilityspace (Ω , σ, P ), where the involved random objects are defined. Given the (measurable)sets A, B , by I A we will denote the indicator function of A and A \ B will denote the set { x ∈ A : x / ∈ B } . As before, ℓ will denote the Lebesgue measure on the unit interval(0 , a.s. → , p → , and w → . Given a real value, x , we will use ⌈ x ⌉ to denote thesmaller integer greater or equal than x , and x + := sup { x, } and x − = − inf { x, } . Alsowe use the notation f ( x − ) := lim y → x − f ( y ) and f ( x +) := lim y → x + f ( y ) for the laterallimits of a real function, f , whenever these limits exist, and sgn( x ) (defined, for real x , as0 if x = 0 and x/ | x | otherwise). Also recall that for real numbers a, b , a sgn ( b ) will denoteeither a + or a − depending on whether b > b < . Throughout, X , . . . , X n and Y , . . . , Y m will be independent samples of i.i.d. r.v.’s suchthat L ( X i ) and L ( Y i ) have respective d.f.’s F and G . As above, F n and G m will denotethe respective sample d.f.’s based on the X ′ s and Y ′ s samples. Occasionally, we will use9he superscript ω in functions computed from the sample values X i ( ω ) , i = 1 , . . . , n or Y j ( ω ) , j = 1 , . . . , m , (for instance, the empirical d.f. F ωn or the empirical quantile function( F ωn ) − ). Without loss of generality we can (and often do) assume that the samples havebeen obtained from independent U (0 ,
1) samples U , . . . , U n and V , . . . , V m through thetransformations X i = F − ( U i ) , Y j = G − ( V j ) . From now on, we will denote the empiricalquantile functions of these uniform samples by U n and V m . We have the obvious relations F − n = F − ( U n ) , G − m = G − ( V m ) . Writing u n and v m for the quantile processes based onthe U i ’s and the V j ’s, respectively, ( u n ( t ) = √ n ( U n ( t ) − t ) and similarly for v m ) we notethat F − n ( t ) = F − (cid:16) t + u n ( t ) √ n (cid:17) and G − m ( t ) = G − (cid:16) t + v m ( t ) √ m (cid:17) . (14)As already noted, the limiting behaviour of Galton’s rank statistic is best describedin terms of the contact set between the identity and the function F G ( t ) := F ( G − ( t )) . (15)We note that, while the role of F and G is symmetric in the definitions of Γ and Γ ∗ ,this is not true in the case of ˜Γ. For a more clear description of the relations among thesesets we sometimes write ˜Γ F = ˜Γ, ˜Γ G = { t ∈ (0 ,
1) : G F ( t ) = t } and ˜Γ ∗ F (resp. ˜Γ ∗ G ) forthe set of generalized contact points between F G (resp. G F ) and the identity (see (19)).Obviously, ˜Γ F ⊂ ˜Γ ∗ F . Quantile functions defined as in (1) provide a useful description of probabilities on thereal line in terms of nondecreasing, left-continuous functions on (0 , H , defined on (0 ,
1) is the unique quantilefunction associated to just a unique d.f.: as a dual relation to (1), such a function H isthe quantile function associated to the d.f. F ( x ) = sup { t ∈ (0 ,
1) : H ( t ) ≤ x } . (16)As already noted, it will be convenient at some points to extend F − to 0 and 1 in theobvious way (hence, F − (0) := F − (0+) and F − (1) := F − (1 − )).In this section we present some relevant facts on the relation between quantile functionsand the composite functions F G defined in (15) without any smoothness assumption onthe d.f.’s. We must begin by stressing the fact that, in general, we cannot guaranteeeven lateral continuity of F G (that would we only guaranteed for F ( G − ( t ) − ) on the leftand for F ( G − ( t +)) on the right). On the other hand, from the well known relation for t ∈ (0 , t ≤ F ( x ) ⇐⇒ F − ( t ) ≤ x , it is easy to see the relations F G ( t ) = max { s ∈ [0 ,
1] : F − ( s ) ≤ G − ( t ) } , for t ∈ (0 ,
1) (17)10 − ( t ) > G − ( s ) ⇐⇒ t > F G ( s ) for t, s ∈ (0 , . (18)We note that F G ( t ) and G F ( t ) could be different even when t ∈ Γ. This possibility isnaturally related to the behaviour of the composition F ( F − ( t )). Clearly, F ( F − ( t ) − ) ≤ t ≤ F F ( t ), thus F F ( t ) = t when F is continuous at F − ( t ), but this could fail otherwise.More precisely, for t ∈ (0 , F F ( t ) = t is equivalent to t ∈ Im( F ),where Im( F ) := { F ( x ) , x ∈ R } (see Lemma A.3 in [Bobkov and Ledoux(2016)]). Nowlet t ∈ (0 , ∩ Γ: if t ∈ Im( F ), then t = F F ( t ) = F G ( t ), while if t / ∈ Im( F ), then t < F F ( t ) = F G ( t ). We collect these facts and some easy consequences for further referencein the following lemma. Lemma 2.1
Let
F, G be arbitrary d.f.’s. For t ∈ (0 , , with the above notation, we have:a) If t ∈ Γ ∩ Im ( F ) , then F G ( t ) = t .b) If t ∈ Γ \ Im ( F ) , then F G ( t ) > t .c) If F G ( t ) = t , then either t ∈ Γ or F − ( t ) < G − ( t ) .d) If F G ( t ) = t and G F ( t ) = t , then t ∈ Γ . The conclusions in Lemma 2.1 can be rewritten with the notation of (4) and (5). Item a) , for instance, becomes Γ ∩ Im( F ) ⊂ ˜Γ. Generalized contact points in the sense ofDefinition 1.4 (that is, points in Γ ∗ ) can also be characterized in terms of the compositefunctions F G and G F .As already noted in the Introduction, the consideration of virtual contact points asso-ciated to left-jump discontinuities of F G is not necessary because they are in fact contactpoints in the strict sense or they are associated to right-jump discontinuities of G F (seeProposition 2.5). In consequence, we consider a point t ∈ (0 ,
1) as a contact point of F G and the identity whenever F G ( t ) = t , or F G ( t ) < t ≤ F G ( t +) (19)Note that the virtual contact condition F G ( t ) < t ≤ F G ( t +) is equivalent to F G ( t ) < t ≤ F G ( s ) for all s > t , hence also to G − ( t ) < F − ( t ) ≤ G − ( t +),which is condition (iii) in Definition 1.4. Therefore, we have shown that Proposition 2.2
The virtual contact points of F − and G − are exactly the virtual con-tact points of F G or G F with the identity. We note that ˜Γ ∗ F \ ˜Γ F is contained in the set of discontinuity points of the nondecreasingfunction F G , which must be at most countable. Hence, ℓ (˜Γ ∗ F \ ˜Γ F ) = 0 and ℓ (˜Γ ∗ F ) = ℓ (˜Γ F ).Proposition 2.2 means that Γ ∗ \ Γ = (˜Γ ∗ F \ ˜Γ F ) ∪ (˜Γ ∗ G \ ˜Γ G ). We explore next the situationfor contact points in the strict sense. 11 roposition 2.3 If t ∈ Γ ∩ (0 , then F G ( t ) = t , or G F ( t ) = t (that is t ∈ ˜Γ F ∪ ˜Γ G ),or the set { t ∈ (0 ,
1) : F − ( t ) = G − ( t ) = F − ( t ) } is a non-degenerate interval (hence,in the latter case, the point x = F − ( t ) is a common discontinuity point of F and G and t cannot be an isolated element of Γ ∗ ). Proof.
It is easy to see that if t ∈ (0 , \ (Im( F ) ∪ Im( G )) satisfies F − ( t ) = G − ( t ),then the point x = F − ( t ) would have positive mass under both distributions, hencethe set { t ∈ (0 ,
1) : F − ( t ) = G − ( t ) = F − ( t ) } is a non-degenerate interval. Any otherpoint in Γ ∩ (0 ,
1) must belong to Im( F ) ∪ Im( G ) and, by Lemma 2.1, must satisfy either F G ( t ) = t or G F ( t ) = t . • Proposition 2.4
Let t ∈ (0 , be such that t ∈ ˜Γ F ∪ ˜Γ G , that is F G ( t ) = t or G F ( t ) = t . Then t ∈ Γ ∗ ( t is a contact point between F − and G − ). Proof.
For any t ∈ (0 ,
1) such that F G ( t ) = t (the case G F ( t ) = t is identical), wemust have one of the following exclusive possibilities:i) G F ( t ) < t , and then F − ( t ) < G − ( t ), andii) F G ( t ) = t = G F ( t ), or F G ( t ) = t < G F ( t ), which lead to F − ( t ) = G − ( t ) . If i) holds, then we would have G − ( t ) ≤ F − ( t +) (this follows easily from the fact thatthe strict inequality G − ( t ) > F − ( t +) would imply F G ( t ) = F ( G − ( t )) > t ). Hence,i) implies F − ( t ) < G − ( t ) ≤ F − ( t +). • The next proposition shows that it is not necessary to consider contact points associ-ated to left-discontinuities.
Proposition 2.5 Let $t_0 \in (0,1)$. If $F_G(t_0-) \le t_0 < F_G(t_0)$, then $G_F(t_0) \le t_0 \le G_F(t_0+)$, or the point $x_0 = G^{-1}(t_0)$ is a common discontinuity point of $F$ and $G$ and $t_0$ cannot be an isolated element of $\Gamma^*$.

Proof. If we suppose that $G^{-1}(t) = G^{-1}(t_0)$ for some $t < t_0$, then for every sequence $\{t_n\}$ such that $t_n \to t_0-$, $F(G^{-1}(t_n)) = F(G^{-1}(t_0))$ will hold eventually, thus leading to the contradiction $F_G(t_0-) = F_G(t_0)$. Therefore it must be $x_0 := G^{-1}(t_0) > G^{-1}(t)$ for every $t < t_0$, and $F_G(t_0-) = F(G^{-1}(t_0)-)$. Moreover, the discontinuity of $F$ and its link with $F^{-1}$ easily show that $F^{-1}(t) = x_0$ if $t \in (F(x_0-), F(x_0)]$. Now, on the one hand, from the hypothesis we obtain $F^{-1}(t_0) \le x_0$ and $F^{-1}(s) = x_0$ for every $s \in (t_0, F(x_0))$, hence
$$G_F(t_0+) = G(x_0) = G(G^{-1}(t_0)) \ge t_0. \tag{20}$$
On the other hand, from the relation $t_0 < F_G(t_0)$ we obtain $F^{-1}(t_0) \le G^{-1}(t_0)$, hence $F^{-1}(t_0) = G^{-1}(t_0)$ or, alternatively, $F^{-1}(t_0) < G^{-1}(t_0)$, which gives $G_F(t_0) < t_0$. This relation and (20) imply that $G_F(t_0) < t_0 \le G_F(t_0+)$. Finally, if $F^{-1}(t_0) = G^{-1}(t_0)$ and $x_0 = F^{-1}(t_0)$ is a continuity point of $G$, from (20) we obtain that $G_F(t_0) = t_0$, which proves the result. •

We conclude this section with some easy consequences of the last results.
Corollary 2.6
Let t ∈ (0 , such that F G ( t ) = t (resp. G F ( t ) = t ) then t is acontact point (possibly virtual) between G F (resp. F G ) and the identity. Corollary 2.6 states that ˜Γ F ⊂ ˜Γ ∗ G . From the comments after Proposition 2.2 we seethat ℓ (˜Γ F ) ≤ ℓ (˜Γ ∗ G ) = ℓ (˜Γ G ). The same argument shows that ℓ (˜Γ G ) ≤ ℓ (˜Γ F ), hence ℓ (˜Γ F ) = ℓ (˜Γ G ). This means, in particular, that the roles of F and G in the condition ℓ (˜Γ) = 0 in Theorems 1.1, 1.2 and 1.3 are completely symmetric. Proposition 2.7 If Γ ∗ is finite then Γ ∗ = ˜Γ ∗ F ∪ ˜Γ ∗ G . In particular, ˜Γ ∗ F and ˜Γ ∗ G are finite. We remark that, while ˜Γ ∗ F ∪ ˜Γ ∗ G ⊂ Γ ∗ always holds (this follows from Proposition 2.4),the set Γ ∗ can be much bigger that ˜Γ ∗ F ∪ ˜Γ ∗ G (recall the comments in the Introduction; thecase G = F , with F the d.f. of the Bernoulli law with mean p gives a simple example ofthis).In Section 4 we prove distributional limit theorems for γ ( F n , G m ) under the assumptionthat Γ ∗ is finite, say, Γ ∗ = { t < · · · < t r } . The differences F − ( t ) − G − ( t ) must haveconstant sign in the open intervals ( t i , t i +1 ) (the same happens in (0 , t ) or ( t r ,
1) if 0 or1 are not contact points). The next result will enable us to focus on neighbourhoods ofisolated contact points to study γ ( F n , G m ). Lemma 2.8
Assume < a ≤ b < are such that [ a, b ] ∩ Γ ∗ = ∅ , and also thatsgn ( F − ( t ) − G − ( t )) > (resp. sgn ( F − ( t ) − G − )( t ) < ) for every t ∈ [ a, b ] . Thenthere exists δ > such that F − ( t ) − G − ( t +) > δ (resp. F − ( t ) − G − ( t +) < − δ ) forevery t ∈ [ a, b ] . Proof : Let us consider the case sgn( F − ( t ) − G − ( t )) >
0. Assume, on the contrary, thatthere exist a sequence { t k } ⊂ [ a, b ] such that F − ( t k ) − G − ( t k +) →
0. In this case it ispossible to choose { t ∗ k } ⊂ (0 ,
1) such that, t k < t ∗ k , t k − t ∗ k → F − ( t k +) − F − ( t ∗ k ) → G − ( t k +) − G − ( t ∗ k ) →
0. Since [ a, b ] is compact, we can assume that { t k } converges. Thenalso { t ∗ k } converges. We write t ∈ [ a, b ] for the common limit. By taking subsequences,if necessary, we can also assume that both sequences are monotone.Now, we only need to consider four possible cases. If, for instance, { t k } and { t ∗ k } are increasing, then we would obtain that F − ( t ) = G − ( t ) which is impossible byassumption. If the sequence { t k } is increasing and { t ∗ k } is decreasing, then F − ( t k ) → F − ( t ) and G − ( t ∗ k ) → G − ( t +), and we would have that F − ( t ) = G − ( t +) and,consequently, t would be a contact point what is not possible either, because [ a, b ] ∩ Γ ∗ = ∅ .The two remaining cases lead to similar contradictions. •
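The Bernoulli example from the Introduction and relation (18) can be checked numerically. The sketch below assumes $F = G$ Bernoulli with mean $p = 0.25$ (an arbitrary choice that keeps the floating-point comparisons exact) and evaluates $F_G = F \circ G^{-1}$ on the grid $t = k/100$; the helper names are hypothetical. The quantile functions coincide everywhere, yet $F_G(t) = t$ holds at the single grid point $t = 1 - p$.

```python
import numpy as np

def bernoulli_cdf(x, p):
    """D.f. of a Bernoulli law with mean p, evaluated at x."""
    return np.where(x < 0, 0.0, np.where(x < 1, 1.0 - p, 1.0))

def bernoulli_quantile(t, p):
    """Left-continuous quantile function inf{x : t <= F(x)} for t in (0, 1)."""
    return np.where(t <= 1.0 - p, 0.0, 1.0)

p = 0.25
t = np.arange(1, 100) / 100                        # grid inside (0, 1)
F_G = bernoulli_cdf(bernoulli_quantile(t, p), p)   # F_G = F o G^{-1} with G = F
fixed_points = t[F_G == t]                         # grid points with F_G(t) = t

# Relation (18): F^{-1}(t) > G^{-1}(s) if and only if t > F_G(s).
T, S = np.meshgrid(t, t, indexing="ij")
lhs = bernoulli_quantile(T, p) > bernoulli_quantile(S, p)
rhs = T > bernoulli_cdf(bernoulli_quantile(S, p), p)
relation_holds = bool(np.array_equal(lhs, rhs))
```

Here `fixed_points` contains only $0.75 = 1 - p$, which matches the description of $\tilde\Gamma$ restricted to $(0,1)$ in this example.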
We conclude this section with two observations. First, we note that $\operatorname{sgn}(t - F_G(t)) = \operatorname{sgn}(F^{-1}(t) - G^{-1}(t))$ for every $t \notin \Gamma^*$. To check this recall relation (18), giving that $F^{-1}(t) > G^{-1}(t)$ if and only if $t > F_G(t)$. This also implies that $F^{-1}(t) \le G^{-1}(t)$ if and only if $t \le F_G(t)$, but $t = F_G(t)$ cannot happen if $t \notin \Gamma^*$ (Proposition 2.4). On the other hand, if $t < F_G(t)$ then $F^{-1}(t) \le G^{-1}(t)$ but, again, $F^{-1}(t) = G^{-1}(t)$ is not possible if $t \notin \Gamma^*$. This means that $\operatorname{sgn}(t - F_G(t))$ is constant in the intervals $(t_i, t_{i+1})$ as above.

Our second observation arises from the fact that every nondecreasing left-continuous real function, $H$, defined on $(0,1)$ is the quantile function associated to the d.f. given by (16). We can apply Lemma 2.8 to the quantile function $H(t) = F_G(t-)$ and the identity and conclude, for instance, that in a compact interval where $t - F_G(t) > 0$ there exists $\delta > 0$ such that $t - F_G(t) \ge \delta$. We will exploit these facts in later sections.

In this section we provide proofs of Theorems 1.1, 1.2 and 1.3. These results show that Galton's rank order statistic is a consistent estimator of the index $\gamma(F,G) = \ell\{t : F^{-1}(t) > G^{-1}(t)\} = \ell\{t : t > F_G(t)\}$ if and only if the contact set $\tilde\Gamma = \{t : t = F_G(t)\}$ has zero Lebesgue measure. The key to the proof of Theorem 1.1 is the following lemma. Here we denote $\Gamma_0 := \operatorname{Im}(F) \cap \Gamma \cap (0,1)$.

Lemma 3.1 Let $F, G$ be arbitrary d.f.'s. With the notation above, we have:
$$\gamma(F_n,G_m) - \gamma(F,G) - \ell\big(\{F_n^{-1} > G_m^{-1}\} \cap \Gamma\big) \xrightarrow{a.s.} 0 \quad \text{as } n,m \to \infty, \tag{21}$$
$$\gamma(F_n,G_m) - \gamma(F,G) - \ell\big(\{F_n^{-1} > G_m^{-1}\} \cap \Gamma_0\big) \xrightarrow{a.s.} 0 \quad \text{as } n,m \to \infty, \tag{22}$$
and
$$\gamma(F_n,G_m) - \gamma(F,G) - \ell\big(\{F_n^{-1} > G_m^{-1}\} \cap \tilde\Gamma\big) \xrightarrow{a.s.} 0 \quad \text{as } n,m \to \infty. \tag{23}$$

Proof.
By right continuity, if $t \notin \operatorname{Im}(F)$, then there exists $\delta_t > 0$ such that $[t, t+\delta_t) \cap \operatorname{Im}(F) = \emptyset$ and $F_F(s) = t + \delta_t$, for every $s \in [t, t+\delta_t]$. From this, it is easy to see that there exists an at most countable family of disjoint intervals $I_k = [a_k, b_k)$, with $a_k < b_k$, which is a partition of the complement of $\operatorname{Im}(F)$ and such that $F_F(s) = b_k$, for every $s \in [a_k, b_k)$.

The Glivenko-Cantelli Theorem gives that for some $\Omega_0 \in \sigma$, with $P(\Omega_0) = 1$, if $\omega \in \Omega_0$, then
$$\sup_t |F_n^\omega(t) - F(t)| \to 0 \quad \text{and} \quad \sup_t |G_m^\omega(t) - G(t)| \to 0.$$
Now, recalling the elementary Skorohod theorem (see e.g. Lemma A.5 in [Bobkov and Ledoux(2016)]), for every $\omega \in \Omega_0$, the set
$$T^\omega := \big\{ t \in (0,1) : (F_n^\omega)^{-1}(t) \to F^{-1}(t) \text{ and } (G_m^\omega)^{-1}(t) \to G^{-1}(t) \big\} \setminus \{a_1, a_2, \ldots\}$$
has Lebesgue measure one. Hence, if $\omega \in \Omega_0$,
$$\gamma(F_n^\omega, G_m^\omega) - \gamma(F,G) - \ell\big(\{(F_n^\omega)^{-1} > (G_m^\omega)^{-1}\} \cap \Gamma\big) = \ell\big[\{(F_n^\omega)^{-1} > (G_m^\omega)^{-1},\ F^{-1} < G^{-1}\} \cap T^\omega\big] - \ell\big[\{(F_n^\omega)^{-1} \le (G_m^\omega)^{-1},\ F^{-1} > G^{-1}\} \cap T^\omega\big],$$
which converges to 0 because both sets within brackets converge to the empty set. This proves (21).

To prove (22) we show that if $\omega \in \Omega_0$, then
$$d_n := \ell\big(\{(F_n^\omega)^{-1} > (G_m^\omega)^{-1}\} \cap \Gamma\big) - \ell\big(\{(F_n^\omega)^{-1} > (G_m^\omega)^{-1}\} \cap \Gamma_0\big) \to 0. \tag{24}$$
To check this, notice that
$$d_n = \ell\Big(\{(F_n^\omega)^{-1} > (G_m^\omega)^{-1}\} \cap T^\omega \cap \Gamma \cap \big(\cup_k (a_k, b_k)\big)\Big) = \ell\Big(\{t > (F_n^\omega)_{G_m^\omega}(t)\} \cap T^\omega \cap \Gamma \cap \big(\cup_k (a_k, b_k)\big)\Big).$$
Now, Glivenko-Cantelli again, and the construction of $T^\omega$ yield that if $\omega \in \Omega_0$ and $t \in T^\omega \cap \Gamma$, then
$$0 = \lim_n \Big| F_n^\omega\big[(G_m^\omega)^{-1}(t)\big] - F\big[(G_m^\omega)^{-1}(t)\big] \Big| \quad \text{and} \quad \lim_n (G_m^\omega)^{-1}(t) = G^{-1}(t) = F^{-1}(t). \tag{25}$$
From here, if $t \in T^\omega \cap \Gamma \cap (a_k, b_k)$ for some $k$, then, eventually, $F[(G_m^\omega)^{-1}(t)] = b_k > t$ which, combined with the first statement in (25), makes it eventually impossible that $t > (F_n^\omega)_{G_m^\omega}(t)$, and shows (24).

The proof of (23) is now obvious taking into account that, from Lemma 2.1,
$$\Gamma_0 \subset \tilde\Gamma \subset \Gamma_0 \cup \{F^{-1} < G^{-1}\}. \tag{26}$$
•

Proof of Theorem 1.1.
Sufficiency is a trivial consequence of Lemma 3.1. To prove necessity, if $\gamma(F_n,G_m) \xrightarrow{a.s.} \gamma(F,G)$ as $n,m \to \infty$, then, according to Lemma 3.1, we have that
$$D_n := \ell\big(\{F_n^{-1} > G_m^{-1}\} \cap \tilde\Gamma\big) \xrightarrow{a.s.} 0. \tag{27}$$
From (58), we have
$$D_n = \ell\big(\{U_n > F_G(V_m)\} \cap \tilde\Gamma\big) \ge \ell\big(\{t : U_n(t) > t\} \cap \{t : t \ge F_G(V_m(t))\} \cap \tilde\Gamma\big).$$
Now Fubini's theorem and independence between samples yield
$$E[D_n] \ge \int_{\tilde\Gamma} P[U_n(t) > t]\, P[t \ge F_G(V_m(t))]\, dt \ge \int_{\tilde\Gamma} P[U_n(t) > t]\, P[t > V_m(t)]\, dt, \tag{28}$$
where the last inequality follows from the fact that, since $F_G$ is nondecreasing, $F_G(t) = t$ and $V_m(t) < t$ imply that $F_G(V_m(t)) \le t$. On the other hand, for every $t \in (0,1)$, both $P[U_n(t) > t]$ and $P[t > V_m(t)]$ converge to $1/2$. But, since $|D_n| \le 1$, (27) implies $E[D_n] \to 0$, which, by (28), forces $\ell(\tilde\Gamma) = 0$. •

Next, we give a proof of Theorem 1.2. We remark that our approach allows us to handle this one-sample statistic without any smoothness assumption on $F$ or $G$.

Proof of Theorem 1.2.
Assuming, w.l.o.g., the construction in Theorem B.1 we have
$$\hat\gamma_n := \gamma(F_n, G) = \ell\Big\{t : F^{-1}\Big(t + \tfrac{u_n(t)}{\sqrt n}\Big) > G^{-1}(t)\Big\} = \ell\Big\{t : t + \tfrac{u_n(t)}{\sqrt n} > F_G(t)\Big\} = \ell\big\{t : u_n(t) > \sqrt n\,(F_G(t) - t)\big\},$$
and similarly $\gamma := \gamma(F,G) = \ell\{t : F_G(t) - t < 0\}$. Therefore, we see that
$$\hat\gamma_n - \gamma = \ell\big\{t : u_n(t) > \sqrt n\,(F_G(t) - t) \ge 0\big\} - \ell\big\{t : 0 > \sqrt n\,(F_G(t) - t) \ge u_n(t)\big\}.$$
Obviously, for the Brownian bridges $B_n^F(t)$,
$$\ell\big\{t : u_n(t) > \sqrt n\,(F_G(t) - t) > 0\big\} \le \ell\Big\{t : B_n^F(t) + K\tfrac{\log n}{\sqrt n} \ge \sqrt n\,(F_G(t) - t) > 0\Big\} + \ell\Big\{t : |B_n^F(t) - u_n(t)| > K\tfrac{\log n}{\sqrt n}\Big\}.$$
By Theorem B.1, the last summand eventually vanishes. For a fixed Brownian bridge $B(t)$ and $t \in (0,1)$ such that $F_G(t) - t > 0$, we have $B(t) + K\frac{\log n}{\sqrt n} < \sqrt n\,(F_G(t) - t)$ eventually. This and the bounded convergence theorem imply that
$$\ell\Big\{t : B(t) + K\tfrac{\log n}{\sqrt n} \ge \sqrt n\,(F_G(t) - t) > 0\Big\} \xrightarrow{a.s.} 0.$$
As a result we obtain that $\ell\{t : u_n(t) > \sqrt n\,(F_G(t) - t) > 0\} \xrightarrow{p} 0$. Similarly we see that $\ell\{t : 0 > \sqrt n\,(F_G(t) - t) \ge u_n(t)\} \xrightarrow{p} 0$. Hence,
$$\hat\gamma_n - \gamma = \ell\{t : u_n(t) \ge 0,\ F_G(t) = t\} + o_P(1). \tag{29}$$
Next, we observe that, eventually,
$$\ell\Big\{t \in \tilde\Gamma : B_n^F(t) - K\tfrac{\log n}{\sqrt n} \ge 0\Big\} \le \ell\{t \in \tilde\Gamma : u_n(t) \ge 0\} \le \ell\Big\{t \in \tilde\Gamma : B_n^F(t) + K\tfrac{\log n}{\sqrt n} \ge 0\Big\}.$$
Moreover, for a fixed Brownian bridge $B$,
$$\ell\Big\{t \in \tilde\Gamma : B(t) + K\tfrac{\log n}{\sqrt n} \ge 0\Big\} \to \ell\{t \in \tilde\Gamma : B(t) \ge 0\} \quad \text{and} \quad \ell\Big\{t \in \tilde\Gamma : B(t) - K\tfrac{\log n}{\sqrt n} \ge 0\Big\} \to \ell\{t \in \tilde\Gamma : B(t) \ge 0\}.$$
This and (29) show the announced result. •

We recall that the set involved in the limit law in the last result is $\tilde\Gamma$, which generally does not coincide with $\Gamma$ (see Lemma 2.1 and (26) for more details). For a better understanding of the links between Theorems 1.1 and 1.2, we note that degeneracy in the limit law is equivalent to $\ell(\tilde\Gamma) = 0$. This is an obvious consequence of the next, simple result.

Lemma 3.2 If $B(t)$ is a standard Brownian bridge on $[0,1]$, for any Borel set $A$ in $[0,1]$, the r.v. $\ell(\{B > 0\} \cap A)$ is a.s. constant if and only if $\ell(A) = 0$.

Proof. If $\ell(A) = 0$ then, obviously, $\ell(\{B > 0\} \cap A) = 0$. Assume now that $\ell(A) > 0$. We note that, a.s., $\ell\{t \in [0,1] : B(t) = 0\} = 0$ (this follows easily from Fubini's Theorem). Moreover, if $B$ is a Brownian bridge then $B \overset{d}{=} -B$. Hence, $\ell(\{B < 0\} \cap A) \overset{d}{=} \ell(\{B > 0\} \cap A)$, while $\ell(\{B < 0\} \cap A) + \ell(\{B > 0\} \cap A) = \ell(A)$ a.s. This implies that $E(\ell(\{B > 0\} \cap A)) = \ell(A)/2$. Thus, if $\ell(\{B > 0\} \cap A)$ were a.s. constant, that constant should equal $\ell(A)/2$. However, $\ell\{B > 0\}$ stochastically dominates $\ell(\{B > 0\} \cap A)$, and degeneracy on the value $\ell(A)/2$ would imply that the $U(0,1)$ law stochastically dominates Dirac's measure on $\ell(A)/2$, which cannot hold if $\ell(A) > 0$. •

To deal with Galton's rank statistic in the two-sample case we must adapt the argument in the proof of Theorem 1.2. This is done with Lemma 3.3, which will play an important role in our development. It relies on the strong approximation given in Theorem B.1 in the Appendix. Given two real functions $f$ and $g$ and versions of independent sequences of Brownian bridges $\{B_n^F\}$, $\{B_m^G\}$ and of uniform quantile processes, $u_n$ and $v_m$, as in Theorem B.1, we set
$$f_n(t) := f\Big(t + \tfrac{u_n(t)}{\sqrt n}\Big) \ \text{ and } \ g_m(t) := g\Big(t + \tfrac{v_m(t)}{\sqrt m}\Big), \qquad \tilde f_n(t) := f\Big(t + \tfrac{B_n^F(t)}{\sqrt n}\Big) \ \text{ and } \ \tilde g_m(t) := g\Big(t + \tfrac{B_m^G(t)}{\sqrt m}\Big). \tag{30}$$

Lemma 3.3
Consider $A \subset (0,1)$ such that $\ell(A) > 0$. With the notation and construction of Theorem B.1, if we assume that $f, g$ are two real Lipschitz functions, then there exists $L > 0$ such that, if $C_{n,m} := L\big(\frac{\log n}{n} + \frac{\log m}{m}\big)$, then whenever $n, m \to \infty$, eventually,
$$\ell\big\{t \in A : \tilde f_n(t) > \tilde g_m(t) + C_{n,m}\big\} \le \ell\big\{t \in A : f_n(t) > g_m(t)\big\} \le \ell\big\{t \in A : \tilde f_n(t) > \tilde g_m(t) - C_{n,m}\big\}. \tag{31}$$

Proof: Since $f$ is Lipschitz, for $t \in A$ we have that
$$\big|f_n(t) - \tilde f_n(t)\big| = \Big|f\Big(t + \tfrac{u_n(t)}{\sqrt n}\Big) - f\Big(t + \tfrac{B_n^F(t)}{\sqrt n}\Big)\Big| \le \|f\|_{\mathrm{Lip}} \frac{\|u_n - B_n^F\|_\infty}{\sqrt n},$$
with a similar bound for $|g_m(t) - \tilde g_m(t)|$. These bounds and (62) imply that on a probability one set, eventually,
$$\sup_{t \in A} \big|(f_n - g_m) - (\tilde f_n - \tilde g_m)\big| \le L\Big(\frac{\log n}{n} + \frac{\log m}{m}\Big) = C_{n,m}$$
for some positive constant $L$ (depending only on $f$ and $g$). Observe that
$$\ell\big\{t \in A : f_n(t) > g_m(t)\big\} \le \ell\big\{t \in A : \tilde f_n(t) > \tilde g_m(t) - C_{n,m}\big\} + \ell\big\{t \in A : |(f_n(t) - \tilde f_n(t)) - (g_m(t) - \tilde g_m(t))| > C_{n,m}\big\},$$
$$\ell\big\{t \in A : \tilde f_n(t) > \tilde g_m(t) + C_{n,m}\big\} \le \ell\big\{t \in A : f_n(t) > g_m(t)\big\} + \ell\big\{t \in A : |(f_n(t) - \tilde f_n(t)) - (g_m(t) - \tilde g_m(t))| > C_{n,m}\big\}.$$
On a probability one set the second summands in the last two upper bounds eventually vanish. Hence, on that probability one set, (31) eventually holds. •

We will apply Lemma 3.3 to the cases in which $f = F^{-1}$ and $g = G^{-1}$ and when $f$ is the identity and $g = F_G$ (see Section A in the Appendix for the analysis of the Lipschitz condition on $F_G$).

We end the section with the proof of the two-sample analogue of Theorem 1.2.

Proof of Theorem 1.3.
By taking subsequences we can assume $\frac{n}{n+m} \to \lambda \in (0,1)$. In view of (23), it suffices to show that
$$\ell\{t \in \tilde\Gamma : F_n^{-1}(t) > G_m^{-1}(t)\} \xrightarrow{w} \ell\{t \in \tilde\Gamma : B(t) > 0\},$$
which, using the approximation in Theorem B.1 and Lemma 3.3, will hold if
$$\ell\Big\{t \in \tilde\Gamma : t + \tfrac{B_n^F(t)}{\sqrt n} > F_G\Big(t + \tfrac{B_m^G(t)}{\sqrt m}\Big) + C_{n,m}\Big\} \xrightarrow{w} \ell\{t \in \tilde\Gamma : B(t) > 0\} \tag{32}$$
and
$$\ell\Big\{t \in \tilde\Gamma : t + \tfrac{B_n^F(t)}{\sqrt n} > F_G\Big(t + \tfrac{B_m^G(t)}{\sqrt m}\Big) - C_{n,m}\Big\} \xrightarrow{w} \ell\{t \in \tilde\Gamma : B(t) > 0\}. \tag{33}$$
Both terms can be handled similarly, hence we will address here only (32). First, we note that $\ell\{t \in \tilde\Gamma : F_G(t) \le x\} = \ell((0,x] \cap \tilde\Gamma)$, thus it defines a measure with density function $I_{\tilde\Gamma}(t)$, and, by the Lebesgue differentiation theorem,
$$\lim_{h \to 0} \frac{F_G(t+h) - t}{h} = 1 \quad \text{for almost every } t \in \tilde\Gamma. \tag{34}$$
Now, from
$$\begin{aligned}
&\ell\Big\{t \in \tilde\Gamma : t + \tfrac{B_n^F(t)}{\sqrt n} > F_G\Big(t + \tfrac{B_m^G(t)}{\sqrt m}\Big) + C_{n,m}\Big\} \\
&\quad = \ell\Big\{t \in \tilde\Gamma : \sqrt{\tfrac{m+n}{n}}\, B_n^F(t) > \sqrt{\tfrac{m+n}{m}}\, B_m^G(t)\, \frac{F_G\big(t + B_m^G(t)/\sqrt m\big) - t}{B_m^G(t)/\sqrt m} + \sqrt{m+n}\, C_{n,m}\Big\} \\
&\quad \overset{d}{=} \ell\Big\{t \in \tilde\Gamma : \sqrt{\tfrac{m+n}{n}}\, B^F(t) > \sqrt{\tfrac{m+n}{m}}\, B^G(t)\, \frac{F_G\big(t + B^G(t)/\sqrt m\big) - t}{B^G(t)/\sqrt m} + \sqrt{m+n}\, C_{n,m}\Big\},
\end{aligned} \tag{35}$$
where $B^F$ and $B^G$ are independent standard Brownian bridges, (34), the expression of $C_{n,m}$ and dominated convergence imply convergence to
$$\ell\big\{t \in \tilde\Gamma : \lambda^{-1/2} B^F(t) - (1-\lambda)^{-1/2} B^G(t) > 0\big\}.$$
Finally, independence between $B^F$ and $B^G$ gives that $\lambda^{-1/2} B^F(t) - (1-\lambda)^{-1/2} B^G(t)$ is a scaled Brownian bridge (it can be written as $(\lambda^{-1} + (1-\lambda)^{-1})^{1/2} B(t)$, where $B(t)$ is a standard Brownian bridge). Therefore the limit law in (35) is that of $\ell\{t \in \tilde\Gamma : B(t) > 0\}$. •

Remark 3.4
It is obvious that, for any Borel set $A \subset [0,1]$ and any Brownian bridge $B$, the distribution of $\ell(\{B > 0\} \cap A)$ is supported by $[0, \ell(A)]$. One could conjecture that this distribution should also be uniform on $(0, \ell(A))$. However, a second thought shows that this distribution, in fact, depends on the set $A$ and that it could even be non-continuous. It is well known (see e.g. p. 42 in [Shorack and Wellner(1986)]) that $P(B(t) \neq 0 \text{ for } a < t < b) \neq 0$ if $0 < a < b < 1$, thus if $A$ is contained in $[a,b]$, then the probability of the event $\{\ell(\{B > 0\} \cap A) = \ell(A)\}$ is strictly positive. In fact, this distribution has two atoms: at $\ell(A)$ and at 0. •

When the set $\tilde\Gamma$ is negligible, Theorems 1.2 and 1.3 yield convergence of Galton's rank statistic to the index $\gamma(F,G)$. We investigate in this section the rate of convergence in this result when the contact set $\Gamma^*$ (recall Definition 1.4) is finite. The following simple result will be crucial in our analysis.

Lemma 4.1
Assume that $[a,b] \subset [0,1] \setminus \Gamma^*$ is such that $t - F_G(t) > \delta > 0$ for every $t \in [a,b]$. If $\frac{n}{n+m} \to \lambda \in (0,1)$ then, for every $\varepsilon > 0$ such that $a + \varepsilon < b - \varepsilon$, we have a.s. eventually
$$[a+\varepsilon, b-\varepsilon] = \big\{t \in [a+\varepsilon, b-\varepsilon] : F_n^{-1}(t) > G_m^{-1}(t)\big\} = \big\{t \in [a+\varepsilon, b-\varepsilon] : F^{-1}(t) > G^{-1}(t)\big\}.$$
The same conclusion holds if $t - F_G(t) < -\delta$ for every $t \in [a,b]$.

Proof: We have $F^{-1}(t) > G^{-1}(t)$ for every $t \in [a,b]$. Using the representation (14),
$$\big\{t \in [a+\varepsilon, b-\varepsilon] : F_n^{-1}(t) > G_m^{-1}(t)\big\} = \Big\{t \in [a+\varepsilon, b-\varepsilon] : t + \tfrac{u_n(t)}{\sqrt n} > F_G\Big(t + \tfrac{v_m(t)}{\sqrt m}\Big)\Big\}.$$
Without loss of generality we can assume that, for the chosen version of $u_n$, $\sup_{0 \le t \le 1} |u_n(t)|$ is a.s. bounded, and the same for $v_m$. Then, a.s., we have that for all $t \in [a+\varepsilon, b-\varepsilon]$, eventually $t + \frac{v_m(t)}{\sqrt m} \in [a,b]$ and therefore
$$F_G\Big(t + \tfrac{v_m(t)}{\sqrt m}\Big) < t + \tfrac{v_m(t)}{\sqrt m} - \delta < t + \tfrac{u_n(t)}{\sqrt n}$$
for large enough $n$ and $m$, and the result follows. The same argument settles the case $t - F_G(t) < -\delta$. •

Now, recalling (see (6)) the notation
$$\ell^{t_0}_{n,m} := \ell\big(\{F_n^{-1} > G_m^{-1}\} \cap (t_0 - \eta, t_0 + \eta)\big) - \ell\big(\{F^{-1} > G^{-1}\} \cap (t_0 - \eta, t_0 + \eta)\big),$$
we obtain, as an immediate consequence of Lemma 4.1 and Lemma 2.8 and the subsequent comments, the following result.

Corollary 4.2 If $\Gamma^* = \{t_1, \ldots, t_k\}$, $k > 0$, $\frac{n}{n+m} \to \lambda \in (0,1)$ and $\eta > 0$ is such that $\{t_i\} = \Gamma^* \cap (t_i - \eta, t_i + \eta)$, $i = 1, \ldots, k$, then for $s > 0$,
$$n^s\big(\gamma(F_n,G_m) - \gamma(F,G)\big) = n^s \sum_{i=1}^k \ell^{t_i}_{n,m} + o_P(1).$$

The main consequence of Lemma 4.1 and Corollary 4.2 is that when $\Gamma^*$ is finite the key to the asymptotic behaviour of $\gamma(F_n,G_m)$ is the (joint) asymptotic behaviour of the $\ell^{t_i}_{n,m}$. We address this problem in this section when $\Gamma^*$ consists of regular contact points. We note that these regular contact points (recall (8)) are elements of $\tilde\Gamma^*_F$.
This, apparently,excludes contact points in ˜Γ ∗ G but not in ˜Γ ∗ F or points which would be regular if weexchange the roles of F and G but are not with the present definition. However, thesecases can often be handled with the same approach. To see this, observe that when Γ ∗ is finite (recall the concluding remarks in Section 2) we have that ℓ ( t ∈ A : F − ( t ) ≤ G − ( t )) = ℓ ( t ∈ A : F − ( t ) < G − ( t )) for every measurable A .If we assume further that F and G have no common discontinuity point (see Propo-sition A.2 and the more general Proposition A.3, involving just local conditions, in theAppendix), then ℓ ( t ∈ A : F − n ( t ) ≤ G − m ( t )) = ℓ ( t ∈ A : F − n ( t ) < G − m ( t )) a.s. and we seethat ℓ t n,m = − (cid:0) ℓ (cid:0) { F − n < G − m } ∩ ( t − η, t + η ) (cid:1) − ℓ (cid:0) { F − < G − } ∩ ( t − η, t + η ) (cid:1) (cid:1) = : − ˜ ℓ t m,n a.s. . Observe that ˜ ℓ t m,n is the same statistic as ℓ t n,m after exchanging the roles of the X and the Y samples. Hence, we restrict our analysis to points in Γ ∗ F . Our results hold for points in˜Γ ∗ G with obvious changes. 20e note that for every regular contact point, t , there exists η ∗ > F G ( t ) − t ) is non-null and constant on each of ( t − η ∗ , t ) and ( t , t + η ∗ ). (36)We recall from the final comments in Section 2 that, by taking η ∗ small enough (toexclude other contact points from the interval), sgn( F G ( t ) − t ) = sgn( G − ( t ) − F − ( t )) forevery t ∈ ( t − η ∗ , t ) ∪ ( t , t + η ∗ ). Now, if (36) holds, the study of ℓ t n,m can be carriedout through the study, for η ∈ (0 , η ∗ ), of the pieces L >n,m := Z t t − η I { F − n ( s ) >G − m ( s ) } ds and R >n,m := Z t + ηt I { F − n ( s ) >G − m ( s ) } ds,L
We assume, for instance, that C L >
0, and r L ≥ r R , thus r = r L . The other cases can be handled similarly. We note that ℓ t n,m = L >n,m + R >n,m if C R >
0, while ℓ t n,m = L >n,m − R 0. We consider first the case r L > 1. We set d n = ( n + m ) / r and prove next that d n L >n,m w → ℓ { y < B λ ( t ) > C L | y | r } . (37)To check this we note that, using (62), (30), Lemma 3.3 and (15), it is enough to provethat d n ℓ (cid:8) t ∈ I : t + B ( t ) √ n > F G ( t + B ( t ) √ m ) − C n,m (cid:9) w → ℓ { y < B λ ( t ) > C L | y | r } (38)and similarly with d n ℓ (cid:8) t ∈ I : t + B ( t ) √ n > F G ( t + B ( t ) √ m ) + C n,m (cid:9) , where I = [ t − η, t ].The proofs are similar, hence, we only prove (38). To ease notation we write C ( t ) for C L when t < C R when t > | t | r will mean | t | r L or | t | r R , whenever t < t > 0. Then I n t ∈I : t + B t ) √ n >F G (cid:16) t + B t ) √ m (cid:17) − C n,m o = I n t ∈I : t + B t ) √ n >t + ξ m + C ( ξ m ) | ξ m | r + o ( | ξ m | r ) − C n,m o = I { t ∈I : B , n ( t ) > √ n + m ( C ( ξ m ) | ξ m | r + o ( | ξ m | r ) − C n,m ) } , (39)where α n = p ( n + m ) /n , β m = p ( n + m ) /m , B , n ( t ) = α n B ( t ) − β m B ( t ), ξ m = (cid:0) t + B ( t ) √ m − t (cid:1) and we have used that t = F G ( t ). Denoting ξ ∗ m ( y ) = yd n + B G ( t + ydn ) √ m , the21hange of variable t = t + yd n , and (39) lead to d n ℓ (cid:8) t ∈ I : t + B ( t ) √ n > F G ( t + B ( t ) √ m ) − C n,m (cid:9) (40)= Z − d n η I (cid:8) B , n ( t + ydn ) >C (cid:0) ξ ∗ m ( y ) (cid:1)(cid:12)(cid:12) ( n + m )1 / r ( n + m )1 / rR y + ( n + m )1 / r √ m B ( t + ydn ) (cid:12)(cid:12) r + √ n + m ( o ( | ξ ∗ m ( y ) | r ) − C n,m ) (cid:9) dy. Since the Brownian bridges have continuous trajectories with probability one, theyare bounded and a.s.: sup y ∈ [ − ηd n , ξ ∗ m ( y ) ≤ sup x ∈ [0 , | B ( t ) |√ m → . inf y ∈ [ − ηd n , ξ ∗ m ( y ) ≥ − η − sup x ∈ [0 , | B ( t ) |√ m → − η. Thus, eventually, for every y < ξ ∗ m ( y ) ∈ [ − η ∗ , η ∗ ] and C (cid:0) ξ ∗ m ( y ) (cid:1) ≥ min( | C L | , | C R | ) > B and B are a.s. 
bounded, and also that ( n + m ) / r ( n + m ) / rR is eitherequal to one or, else, goes to infinity, yield that, a.s., the order of √ n + m | ξ ∗ m ( y ) | r is | y | r or higher. Finally, the definition of C n,m allows us to conclude that there exists M > I (cid:26) y ∈ [ − ηd n , B , n ( t + ydn ) >C ( ξ ∗ m ( y )) (cid:12)(cid:12)(cid:12)(cid:12) y + ( n + m )1 / r √ m B ( t + ydn ) (cid:12)(cid:12)(cid:12)(cid:12) r + √ n + m ( o ( | ξ ∗ m ( y ) | r ) − C n,m ) (cid:27) ≤ I {− M ≤ y ≤ } . Now, if we fix y < B λ ( t ) = C L | y | r , then, a.s., I (cid:8) B , n ( t + ydn ) >C ( ξ ∗ m ( y )) (cid:12)(cid:12)(cid:12)(cid:12) y + ( n + m )1 / r √ m B ( t + ydn ) (cid:12)(cid:12)(cid:12)(cid:12) r + √ n + m ( o ( | ξ ∗ m ( y ) | r ) − C n,m ) (cid:9) → I (cid:8) B λ ( t ) >C L | y | r (cid:9) . From here, dominated convergence yields (38), hence, as noted above, (37). We note thatthe limit in (37) equals sgn( C L ) (cid:16) ( B λ ( t )) sgn( CL ) | C L | (cid:17) /r . (41)A completely similar analysis shows that d n R >n,m w → sgn( C R ) (cid:16) ( B λ ( t )) sgn( CR ) | C R | (cid:17) /r I ( r R = r )when C R > d n R >n,m vanishes in probability if r R < r L = r ). Furthermore, we are usingthe same strong approximation to handle d n L >n,m and d n R >n,m , which implies that thereis weak convergence of ( d n L >n,m , d n R >n,m ) and, consequently, of d n ℓ t n,m = d n ( L >n,m + R >n,m ).This completes the proof in the case r L ≥ r R , C L > , C R > 0. 
The other cases with r > r L = r R = 1 goes along the same lines, the only difference beingthat, a.s., if y < B λ ( t ) = C ∗ | y + B ( t ) √ − λ | , where ∗ = L or R wheneversgn (cid:16) y + B ( t ) √ − λ (cid:17) = − I (cid:8) B , n ( t + ydn ) >C ( ξ ∗ m ( y )) (cid:12)(cid:12)(cid:12)(cid:12) y + ( n + m )1 / √ m B ( t + ydn ) (cid:12)(cid:12)(cid:12)(cid:12) + √ n + m ( o ( | ξ ∗ m ( y ) | ) − C n,m ) (cid:9) → I (cid:8) B λ ( t ) >C ∗ | y + B t √ − λ | (cid:9) and, by dominated convergence, d n L >n,m w → ℓ (cid:8) y < B λ ( t ) > C ∗ | y + B ( t ) √ − λ | (cid:9) = ℓ (cid:8) y < B λ ( t ) > − C L (cid:0) y + B ( t ) √ − λ (cid:1) , y + B ( t ) √ − λ < (cid:9) + ℓ (cid:8) y < B λ ( t ) > C R (cid:0) y + B ( t ) √ − λ (cid:1) , y + B ( t ) √ − λ > (cid:9) . The right side of the interval is dealt with in a similar way. In the case C L > , C R > d n R >n,m w → ℓ (cid:8) y > B λ ( t ) > − C L (cid:0) y + B ( t ) √ − λ (cid:1) , y + B ( t ) √ − λ < (cid:9) + ℓ (cid:8) y > B λ ( t ) > C R (cid:0) y + B ( t ) √ − λ (cid:1) , y + B ( t ) √ − λ > (cid:9) . Hence, d n ℓ t n,m = d n ( L >n,m + R >n,m ) w → ℓ (cid:0) y : B λ ( t ) > − C L ( y + B ( t ) √ − λ ) , y + B ( t ) √ − λ < (cid:1) + ℓ (cid:0) y : B λ ( t ) > C R ( y + B ( t ) √ − λ ) , y + B ( t ) √ − λ > (cid:1) = ( B λ ( t )) − C L + ( B λ ( t ))+ C R = T , ( t ; C L , C R ) . If C L > , C R < d n R Let t ∈ Γ ∗ ∩ (0 , , such that for some η > , ( t − η , t + η ) ∩ Γ ∗ = { t } . Then, for every small enough η > , if n/ ( n + m ) → λ ∈ (0 , as n, m → ∞ , wehave that:(i) ( virtual crossing points ) If F − ( t ) < G − ( t ) ≤ G − ( t +) < F − ( t +) , then, √ n + mℓ t n,m w → B ( t ) √ λ . (42) (ii) ( virtual tangency points ) If F − ( t ) < G − ( t ) ≤ F − ( t +) < G − ( t +) , then, √ n + mℓ t n,m w → ℓ { y : − B ( t ) √ − λ > y, − B ( t ) √ λ < y } = (cid:16) B ( t ) √ λ − B ( t ) √ − λ (cid:17) + . 
(43)We can easily adapt Proposition 4.3 to the case G − ( t ) < F − ( t ) ≤ F − ( t +) Let us take t = 0 and C > 0. The cases with C < t = 1 are similar. We handle first the case r = 1. Then for small enough η we have ℓ ( { F − > G − } ∩ (0 , η )) = 0. We recall that ℓ n,m = ℓ (cid:0) { F − n > G − m } ∩ (0 , η ) (cid:1) . (47)We use the well-known fact that the joint law of S n +1 ( S , . . . , S n ) is the same as that theordered sample of size n of i.i.d. U (0 , 1) r.v.’s. Thus,( X (1) , . . . , X ( n ) ) d = (cid:16) F − (cid:0) S S n +1 (cid:1) , . . . , F − ( S n S n +1 ) (cid:17) , (48)with a similar expression for the Y -sample. From (47) and (48) we see that ℓ n,m d = Z η I ( F − (cid:0) S ⌈ nt ⌉ S n +1 (cid:1) >G − (cid:0) S ⌈ mt ⌉ S m +1 (cid:1) ) dt = 1 n + m Z ( n + m ) η I (cid:8) F − ( ξ n ( y )) >G − ( ξ m ( y )) (cid:9) dy, where ξ n ( y ) := S ⌈ nn + m y ⌉ /S n +1 , and ξ m ( y ) := S ⌈ mn + m y ⌉ /S m +1 . Now (8) yields that, if y ∈ (0 , ( n + m ) η ), then I { F − ( ξ n ( y )) >G − ( ξ m ( y )) } = I { ξ n ( y ) >F G ( ξ m ( y )) } = I { ( n + m ) ξ n ( y ) > (1+ C )( n + m ) ξ m ( y )+( n + m ) o ( ξ m ( y )) } . (49)25he SLLN implies that there exists Ω , with P (Ω ) = 1, such that for every ω ∈ Ω , S n n → S m m → . Therefore, for any δ ∗ > 0, if ω ∈ Ω , eventually0 < ξ m ( y ) ≤ S ⌈ (1 − λ + δ ∗ ) y ⌉ +1 S m +1 → . (50)This and (49) show that if ω ∈ Ω , I (cid:8) F − (cid:0) ξ n ( y ) (cid:1) >G − (cid:0) ξ m ( y ) (cid:1)(cid:9) → I (cid:8) (1 − λ ) S ⌈ λy ⌉ >λ (1+ C ) S ⌈ (1 − λ ) y ⌉ (cid:9) , for every y not belonging to the countable set { j − λ : j = 0 , , . . . } ∪ { jλ : j = 0 , , . . . } .Clearly, in Ω we have lim y →∞ (1 − λ ) S ⌈ λy ⌉ λS ⌈ (1 − λ ) y ⌉ = 1 . Hence, the fact that C > , I n (1 − λ ) S ⌈ λy ⌉ >λ (1+ C ) S ⌈ (1 − λ ) y ⌉ o = 0 for largeenough y . This shows that R ∞ I n (1 − λ ) S ⌈ λy ⌉ >λ (1+ C ) S ⌈ (1 − λ ) y ⌉ o dy is an a.s. 
finite r.v..We will conclude (13) as soon as we prove that for every ω ∈ Ω we can applydominated convergence. To check this, notice that (50) gives that, for m large enough, I { ( m + n ) ξ n ( y ) > ( m + n )(1+ C ) ξ m ( y )+( m + n ) o ( ξ m ( y )) } (51) ≤ I { ( m + n ) ξ n ( y ) > (1+ C/ m + n ) ξ m ( y ) } . Now, for every ω ∈ Ω , there exist a natural number and a positive real numberdepending on ω , N ( ω ) and Y ( ω ), such that, if n ≥ N ( ω ) then both S n +1 /n and S m +1 /m are close to one, and, if we take y ≥ Y ( ω ), then, both S ⌈ nn + m y ⌉ nn + m and S ⌈ mn + m y ⌉ mn + m are close to y .This completes the proof for the case r = 1, since (51) gives that, for all n ≥ N ( ω ), I { ξ n ( y ) > (1+ C ( ξ m ( y )))( m + n ) ξ m ( y )+( m + n ) o ( ξ m ( y )) } ≤ I [0 ,Y ( ω )] . For the case r > C > T r,r (0; C, C ) isa.s. finite (this follows, for instance, from the fact that, a.s., W ( y ) /y → y → ∞ ).Now ℓ n,m has the same expression as in (47). We will use the same notation as in Lemma3.3. First, we have that ℓ n,m = ℓ n t ∈ (0 , η ) : F − (cid:0) t + u n ( t ) √ n (cid:1) > G − (cid:0) t + v m ( t ) √ m (cid:1)o = ℓ n t ∈ (0 , η ) : t + u n ( t ) √ n > F G (cid:0) t + v m ( t ) √ m (cid:1)o . Therefore, if we take f equal to the identity and g = F G in Lemma 3.3, we only need toshow that d n ℓ (cid:16)n t + B Fn ( t ) √ n > ˜ F G ( t ) − L n o ∩ (0 , η ) (cid:17) → w ℓ (cid:8) y ∈ (0 , ∞ ) : W ( y ) > ( λ (1 − λ )) / C (0) y r (cid:9) , (52)26nd similarly for d n ℓ (cid:16)n t + B Fn ( t ) √ n > ˜ F G ( t ) + L n o ∩ (0 , η ) (cid:17) , where, now, d n = ( n + m ) r − (since t + B Gm ( t ) √ m can take negative values, we take F G ( t ) = F G (0) for t < 0; notice that t + B Gm ( t ) √ m → t > 0, hence, this assumption has no effect in the limit) . 
The proofs are thesame, thus we only consider (52).We can assume, without loss of generality that B Fn ( t ) = d − / n ( W F ( d n t ) − tW F ( d n )),and B Gm ( t ) = d − / n ( W G ( d n t ) − tW G ( d n )), 0 ≤ t ≤ W F , W G independent Brownianmotions. Thus, the change of variable t = y/d n and the fact that p ( n + m ) d n = d rn give d n ℓ n,m = d n Z η I (cid:16) α n B Fn ( t ) >β m B Gm ( t )+ √ n + mC (cid:0) t + B Gm ( t ) √ m (cid:1)(cid:12)(cid:12) t + B Gm ( t ) √ m (cid:12)(cid:12) r + √ n + m (cid:0) o (cid:0)(cid:12)(cid:12) t + B Gm ( t ) √ m (cid:12)(cid:12) rR (cid:1) − L n,m (cid:1)(cid:17) dt = Z d n η I (cid:16) α n (cid:0) W F ( y ) − y W F ( d n ) d n (cid:1) >β m (cid:0) W G ( y ) − y W G ( d n ) d n (cid:1) + C (( ξ n ( y )) d rn | ξ n ( y ) | r + d rn (cid:0) o ( | ξ n ( y ) | r ) − L n,m (cid:1)(cid:17) dy, where α n = (( m + n ) /n ) / , β m = (( m + n ) /m ) / and ξ n ( y ) = yd n + √ m B Gm ( yd n ).As it is well known, there exists Ω ∈ σ , with P (Ω ) = 1 such that, if ω ∈ Ω , then, W i is continuous, W i ( x ) /x → 0, as x → ∞ , i = F, G and the set (cid:8) y : λ − / W F ( y ) = (1 − λ ) − / W G ( y ) + C (0) y r (cid:9) has Lebesgue measure zero. If we fix ω ∈ Ω , then, we have thatsup y ∈ [0 ,d n η ] | ξ n ( y ) | ≤ η + 1 √ md n sup y ∈ [0 ,d n η ] (cid:12)(cid:12)(cid:12)(cid:12) W F ( y ) − y W F ( d n ) d n (cid:12)(cid:12)(cid:12)(cid:12) → η, and we can conclude that, eventually, { ξ n ( y ) : y ∈ [0 , d n η ] } ⊂ [0 , η ∗ ], and, consequently,from an index onward, inf y ∈ [0 ,d n η ] C ( ξ n ( y )) ≥ inf h ∈ [0 ,η ∗ ] | C ( h ) | > . On the other hand, wehave d rn L n,m → d rn | ξ n ( y ) | r = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) y + β m W G ( y ) − y W G ( d n ) d n d r − n (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) r = y (1 + o (1)) → ∞ , as y → ∞ . 
Therefore, there exists a constant M (which possibly depends on the chosen ω ) such that I (cid:18) α n (cid:0) W F ( y ) − y W F ( d n ) d n (cid:1) >β m (cid:0) W G ( y ) − y W G ( d n ) d n (cid:1) + C ( ξ n ( y )) d rn | ξ n ( y ) | r + d rn (cid:0) o ( | ξ n ( y ) | r ) − L n,m (cid:1) (cid:19) ≤ I (cid:8) ≤ y ≤ M (cid:9) , for every large enough n . Moreover, I (cid:26) α n (cid:18) W F ( y ) − y W F ( d n ) d n (cid:19) >β m (cid:18) W G ( y ) − y W G ( d n ) d n (cid:19) + C ( ξ n ( y )) d rn | ξ n ( y ) | r + d rn (cid:0) o ( | ξ n ( y ) | r ) − L n,m (cid:1) (cid:27) → I { λ − / W F ( y ) − (1 − λ ) − / W G ( y ) >C (0) y r } . ω , d n ℓ (cid:8) ( ˜ F − n > ˜ G − m − L n,m ) ∩ (0 , η ) (cid:9) → ℓ (cid:8) y ∈ (0 , ∞ ) : λ − / W F ( y ) − (1 − λ ) − / W G ( y ) > C (0) y r (cid:9) The fact that ( λ (1 − λ )) / ( λ − / W F ( y ) − (1 − λ ) − / W G ( y )) is a standard Brownianmotion yields (52). • We prove next Theorem 1.7, a global asymptotic result for Galton’s statistic underthe assumption of a finite contact set consisting of regular contact points. From a tech-nical point of view the main issue here is to prove asymptotic independence between thelocalized statistics around central and extremal contact points. Proof of Theorem 1.7: From Corollary 4.2 it is enough to prove that ( n + m ) r ( ℓ t i n,m ) ≤ i ≤ k converges weakly. This follows trivially if Γ ∗ ⊂ (0 , 1) after checking that the strong ap-proximation used in the proof of Theorem 1.5 allows to deal with all the ℓ t i n,m simulta-neously. Hence, it suffices to prove asymptotic independence among ℓ n,m , ( ℓ t i n,m ) i : t i ∈ (0 , and ℓ n,m when 0 or 1 (or both) are contact points. Let us assume, for instance, thatΓ ∗ = { < t · · · < t s < } and set A n = ( n + m ) r ℓ n,m , B n = ( n + m ) r ( ℓ t i n,m ) ≤ i ≤ s and C n = ( n + m ) r ℓ n,m . We have that there exist A, B, C such that A n w → A , B n w → B , C n w → C . 
Assume ( ˜ A, ˜ B, ˜ C ) is a random vector with ˜ A, ˜ B, ˜ C independent, ˜ A d = A, ˜ B d = B and ˜ C d = C and consider ( ˜ A n , ˜ B n , ˜ C n ), with the same properties with respect ( A n , B n , C n ). A n is a function of the smallest ⌈ ηn ⌉ elements in the X sample and the smallest ⌈ ηm ⌉ elements in the Y sample. Similarly, B n and C n are functions of the central and upperorder statistis. If d T V denotes the distance in total variation, then there exists a universalconstant H > d T V ( L ( A n , B n , C n ) , L ( ˜ A n , ˜ B n , ˜ C n )) ≤ H h η (1 − t − η ) t − η + η ( t s + η )1 − t s − η i / for small enough η (this follows from Theorem 4.2.9 and Lemma 3.3.7 in [1]). If ρ denotesthe Prokhorov metric, then the fact that ρ ( µ , µ ) ≤ d T V ( µ , µ ) implies ρ ( L ( A n , B n , C n ) , L ( ˜ A n , ˜ B n , ˜ C n )) ≤ H h η (1 − t − η ) t − η + η ( t s + η )1 − t s − η i / . We prove now that ( A n , B n , C n ) w → ( ˜ A, ˜ B, ˜ C ). Obviously ( ˜ A n , ˜ B n , ˜ C n ) w → ( ˜ A, ˜ B, ˜ C ).Having weakly convergent components, ( A n , B n , C n ) is tight. To complete the proof it suf-fices to show that for any weakly convergent subsequence ( A n ′ , B n ′ , C n ′ ) w → γ , necessarily γ = L ( ˜ A, ˜ B, ˜ C ). To check this, we observe that, since ρ metrizes the weak convergence,we have ρ ( γ, L ( ˜ A, ˜ B, ˜ C )) ≤ H h η (1 − t − η ) t − η + η ( t s + η )1 − t s − η i / . (53)28ow, using Corollary 4.2 we see that we can repeat the argument leading to (53) forevery small enough η . Hence, ρ ( γ, L ( ˜ A, ˜ B, ˜ C )) = 0. This completes the proof. • We provide here some simple examples that illustrate the different limiting distributionsfor ℓ t n,m that result from Theorems 1.5 and 1.6. Later we give simple sufficient conditionsunder which extremes have no influence on the asymptotic behaviour of γ ( F n , G m ) andgive a simplified version of Theorem 1.7 under the assumption that F and G have regulardensities (Theorem 4.9). 
Finally, we consider the case of finitely supported distributions(Theorem 4.10). Example 4.4 In this example G ( t ) = t (the uniform law on (0 , r > F − ( t ) = + sgn( t − ) | t − / | r , 0 ≤ t ≤ 1. Now we have F ( x ) = + sgn( x − ) | x − | /r , − r ≤ x ≤ + r , F G = F and F G ( ) = . Thus, is a contact point. If r < F ′ G ( t ) = r | t − | r − . In particular, F G is Lipsichitz in aneighbourhood of . We easily check that ∆( h ) = − h + sgn( h ) | h | /r = − h + o ( h ), thatis, is an isolated regular contact point (a crossing point) with intensities r L = r R = 1and constants C R = − C L = − 1. We can apply Theorem 1.5 to conclude that( n + m ) / ℓ t n,m w → B ( ) √ λ . If r > F ′ G ( ) = + ∞ and F G is not Lipschitz around the contact point. However,following the reasoning after Corollary 4.2, we have that ℓ t n,m = − ˜ ℓ t m,n and we can handlethis case exchanging the roles of the F and G samples and studying G F ( t ) = F − ( t ).Now G ′ F ( t ) = r | t − | r − and G F is Lipschitz in a neighbourhood of . Furthermore,∆( h ) = − h + sgn( h ) | h | r = − h + o ( h ). Thus we can, again, apply Theorem 1.5 to ˜ ℓ t m,n (with r L = r R = 1 , C R = − C L = − 1) and conclude that ( n + m ) / ˜ ℓ t m,n w → B ( ) √ − λ . Hence,for r > n + m ) / ℓ t n,m w → − B ( ) √ − λ . • Example 4.5 Now F denotes the d.f. of the uniform law on (0 , 1) and G − ( t ) = t +sgn( t − ) | t − / | r , 0 ≤ t ≤ 1. As before, is a contact point. For r ≥ F G = G − is differentiable, with F ′ G ( t ) = 1 + r | t − / | r − . We have ∆( h ) = sgn( h ) | h | r , that is,Theorem 1.5 can be applied here with r L = r R = r , C R = − C L = 1. Thus, for r = 1 weget ( n + m ) / ℓ t n,m w → B ( ) √ λ + B ( ) √ − λ , r > n + m ) / r ℓ t n,m w → (( B λ ( )) + ) /r − (( B λ ( )) − ) /r . The case 0 < r < • Example 4.6 Here we consider a Student’s t location model. Let F = F ν be a t -distribution with ν > G ( x ) = G ν ( x ) = F ν ( x − µ ), for some µ > 0. 
Obviously, in this case F − (0) = G − (0) = −∞ and F G (0) = 0. We write f v for the density of F ν . To ease notation, we set s = ( ν + 1) / 2, write K for a non-nullgeneric constant which can change from line to line (in particular, f ν ( t ) = K ( ν + t ) − s )and f ( x ) ≈ g ( x ) when f ( x ) g ( x ) → x → x .Using l’Hˆopital’s rule we see that F ν ( t ) ≈ Kt − s +1 as t → ∞ and, as a consequence, F − ν ( h ) ≈ Kh / (1 − s ) as h → F ′ G ( h ) = f ν ( F − ν ( h ) + µ ) f ν ( F − ν ( h )) = ( ν + ( F − ν ( h ) + µ ) ) − s ( ν + ( F − ν ( h )) ) − s → , as h → . (54)Some simple but tedious computations give that F ′′ G ( h ) ≈ K µ ( F − ν ( h )) s + O (cid:16) ( F − ν ( h )) s − (cid:17) ν + ( F − ν ( h )) ;therefore, F ′′ G ( h ) ≈ K ( F − ν ( h )) s − . Consequently, F ′′ G ( h ) ≈ Kh (2 s − / (1 − s ) . Now, ap-plying l’Hˆopital’s rule twice we get that ∆( h ) ≈ Kh s − − s +2 = Kh s s − = Kh ν +1 ν , thatis, ∆( h ) = Kh ( ν +1) /ν + o ( h ( ν +1) /ν ) for some K = 0 as h → . We see from (54) that F ′ G is bounded. Hence F G is Lipschitz and Theorem 1.6 can beapplied here with r R = ν +1 ν to obtain( n + m ) νν +2 ℓ n,m w → T ν +1 ν , ν +1 ν (0; K, K )for any ν > • Example 4.7 Let F (resp. G ) be centered (resp. with mean µ > 0) normal distributionswith common variance σ . Let f denote the density function of F . Now, F G ( t ) = F ( F − ( t ) + µ ), t ∈ [0 , 1] and F ′ G ( t ) = f ( F − ( t ) + µ ) f ( F − ( t )) = e − (2 µF − ( t )+ µ ) / σ → ∞ , as t → . F G is not Lipschitz in a neighbourhood of 0. However, we can use the factthat ℓ t n,m = − (cid:0) R η I ( F − n ( t ) ≤ G − m ( t )) dt − R η I ( F − ( t ) ≤ G − ( t )) dt (cid:17) = − (cid:0) R η I ( F − n ( t ) 1. UsingTheorem 1.6 we conclude that( n + m ) ℓ t n,m w → λ (1 − λ ) R ∞ I (cid:8) − (1 − λ ) S ⌈ λy ⌉ > (cid:9) dy = λ (1 − λ ) R ∞ I (cid:8) (1 − λ ) S ⌈ λy ⌉ < (cid:9) dy = 0 , since, a.s., S i > i ≥ 1. Thus the rate of convergence in this example is fasterthan ( n + m ) − . 
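To get a feel for Example 4.7, the following Python sketch simulates the normal location model with equal sample sizes (so that $\lambda = 1/2$, an assumption made here only for simplicity) and records Galton's count $\#\{i : X_{(i)} > Y_{(i)}\}$. Since $\gamma(F,G) = 0$ in this model, $(n+m)\,\gamma(F_n,G_n)$ equals twice that count. The function names, the seed and the default parameters are illustrative choices that do not come from the paper, and the sketch makes no claim about the exact limiting law; it only lets one inspect how the counts behave as $n$ grows.

```python
import random


def exceedance_count(xs, ys):
    """Galton's count #{i : X_(i) > Y_(i)} for two samples of equal size."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(x > y for x, y in zip(sorted(xs), sorted(ys)))


def simulate_normal_shift(n, mu, sigma=1.0, reps=200, seed=0):
    """Simulate `reps` replications of the count under N(0, s^2) vs N(mu, s^2).

    Returns the list of counts; 2 * count is (n + m) * gamma(F_n, G_n)
    when n = m (illustrative setting, not the general case of the paper).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        ys = [rng.gauss(mu, sigma) for _ in range(n)]
        counts.append(exceedance_count(xs, ys))
    return counts


if __name__ == "__main__":
    for n in (50, 500, 5000):
        counts = simulate_normal_shift(n, mu=1.0)
        share_zero = sum(c == 0 for c in counts) / len(counts)
        print(n, "share of replications with zero count:", share_zero)
```

Running the script for increasing $n$ shows the empirical share of replications in which the count is exactly zero; other sample-size ratios would require the general two-sample index rather than this equal-size shortcut.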
•

We explore now some consequences of Theorem 1.7. If the extremal contact points have a non-null contribution to the limiting distribution, then this cannot be normal. We now turn to conditions under which $\sqrt{n+m}\,\ell^{0}_{n,m}$ vanishes (and similarly for the upper extreme). The special attention to the rate $\sqrt{n+m}$ is due to the fact that it is the only one which can result in a normal limit. Of course, Theorem 1.6 provides some answer to this problem, but we will give here simpler sufficient conditions. If the supports of $F$ and $G$ are bounded and
\[ \liminf |F^{-1}(t) - G^{-1}(t)| > 0 \quad \text{as } t \to 0 \text{ and as } t \to 1-, \tag{55} \]
then $\ell^{0}_{n,m}$ and $\ell^{1}_{n,m}$ can be dealt with as in Lemma 4.1 to see that they eventually vanish. Note that, in the case of non-bounded support, (55) does not exclude that 0 or 1 could be contact points (recall Example 4.7). For this case the following criterion on the tails can be useful to guarantee asymptotic negligibility of $\ell^{0}_{n,m}$ and $\ell^{1}_{n,m}$ in presence of inner contact points:
\[ \int_{(0,\varepsilon)\cup(1-\varepsilon,1)} \Big(\frac{\sqrt{t(1-t)}}{f(F^{-1}(t))}\Big)^{p}\,dt < \infty \quad \text{and} \quad \int_{(0,\varepsilon)\cup(1-\varepsilon,1)} \Big(\frac{\sqrt{t(1-t)}}{g(G^{-1}(t))}\Big)^{p}\,dt < \infty, \tag{56} \]
for some $p > 1$ and $\varepsilon > 0$. […] $(F^{-1}(t) - G^{-1}(t)) > \delta > 0$ […] $(0,\eta) \subset (0,\varepsilon)$. We then focus on the integral $\int_0^{\eta} I\{F_n^{-1}(t) - G_m^{-1}(t) \le 0\}\,dt$, noting that
\[ (F^{-1} - G^{-1} > \delta) \cap (F_n^{-1} - G_m^{-1} \le 0) \subset (|F_n^{-1} - F^{-1}| > \delta/2) \cup (|G_m^{-1} - G^{-1}| > \delta/2). \]
Therefore,
\begin{align*}
n^{1/2}\int_0^{\eta} I\{F_n^{-1}(t) - G_m^{-1}(t) \le 0\}\,dt
&\le n^{1/2}\Big(\int_0^{\eta} I\{|F_n^{-1}(t) - F^{-1}(t)| \ge \delta/2\}\,dt + \int_0^{\eta} I\{|G_m^{-1}(t) - G^{-1}(t)| \ge \delta/2\}\,dt\Big)\\
&\le \frac{n^{-\frac{p-1}{2}}}{(\delta/2)^{p}}\Big(\int_0^{\eta} |\sqrt{n}(F_n^{-1}(t) - F^{-1}(t))|^{p}\,dt + \int_0^{\eta} |\sqrt{n}(G_m^{-1}(t) - G^{-1}(t))|^{p}\,dt\Big) \xrightarrow{p} 0,
\end{align*}
where the last convergence follows from the fact that by (56) and Theorem 5.3, p.
46 in [Bobkov and Ledoux(2016)], the integrals in parentheses are stochastically bounded.

Now, we are ready for a general result for probabilities with smooth densities, $f$ and $g$. Assuming enough differentiability, we write $h(t) = F_G(t) - t$ (the function used to obtain (44)) and, for any $k \in \mathbb{N}$, define the sets
\[ \Gamma_k := \big\{ t \in \Gamma : h^{(j)}(t) = 0,\ j = 0, \dots, k-1, \text{ and } h^{(k)}(t) \neq 0 \big\}. \]
Notice that the set $\Gamma_k$ is the set of contact points with intensity $k$, and let $k_0 := k_{F,G} = \sup\{k : \Gamma_k \neq \emptyset\}$. For points in $\Gamma_k$ the derivatives of $h$ can be easily related to the derivatives of $f$ and $g$, as follows.

Lemma 4.8 If $t \in \Gamma_k$ for some $k \ge 1$, and we denote $x = F^{-1}(t)$, then
\[ h^{(k)}(t) = \begin{cases} \dfrac{f(x)}{g(x)} - 1, & \text{if } k = 1,\\[2mm] \dfrac{f^{(k-1)}(x) - g^{(k-1)}(x)}{f^{k}(x)}, & \text{if } k > 1, \end{cases} \]
with $f(x) \neq g(x)$ in the first case and $f^{(k-1)}(x) \neq g^{(k-1)}(x)$ in the second one.

Combining (44), (45) and (46) with the above considerations we obtain the following version of Theorem 1.7.

Theorem 4.9 Assume that $F$ and $G$ have positive densities $f$ and $g$ on possibly unbounded intervals which are $k_0$ times continuously differentiable. Assume further that the set of contact points is finite with maximal intensity $k_0$ and that condition (55) holds. Suppose in addition that either the supports are bounded or that condition (56) is satisfied. Then, if $B_1$ and $B_2$ are independent Brownian bridges, and $n, m \to \infty$ with $\frac{n}{n+m} \to \lambda \in (0,1)$,

(i) if $k_0 = 1$ and $x_i = F^{-1}(t_i)$,
\[ (n+m)^{1/2}\big(\gamma(F_n, G_m) - \gamma(F,G)\big) \xrightarrow{w} \sum_{t_i \in \Gamma_1} \Big( \frac{g(x_i)}{|f(x_i) - g(x_i)|}\,\frac{B_1(t_i)}{\sqrt{\lambda}} + \frac{f(x_i)}{|f(x_i) - g(x_i)|}\,\frac{B_2(t_i)}{\sqrt{1-\lambda}} \Big), \]

(ii) if $k_0 \ge 3$ is odd,
\[ (n+m)^{\frac{1}{2k_0}}\big(\gamma(F_n, G_m) - \gamma(F,G)\big) \xrightarrow{w} \sum_{t_i \in \Gamma_{k_0}} \Big(\frac{k_0!}{|h^{(k_0)}(t_i)|}\Big)^{1/k_0} \Big( \big((B_\lambda(t_i))^{1/k_0}\big)^{+} - \big((B_\lambda(t_i))^{1/k_0}\big)^{-} \Big), \]

(iii) if $k_0$ is even,
\[ (n+m)^{\frac{1}{2k_0}}\big(\gamma(F_n, G_m) - \gamma(F,G)\big) \xrightarrow{w} \sum_{t_i \in \Gamma_{k_0}} \mathrm{sgn}\big(h^{(k_0)}(t_i)\big)\, 2 \Big(\frac{k_0!}{|h^{(k_0)}(t_i)|}\Big)^{1/k_0} \Big( (B_\lambda(t_i))^{\mathrm{sgn}(h^{(k_0)}(t_i))} \Big)^{1/k_0}. \]

We see from Theorem 4.9 that asymptotic normality (arguably, the most useful case for statistical applications) holds, with the standard $\sqrt{n+m}$ rate, only when $F$ and $G$ have a finite number of 'simple' crossings. In all the other cases we get a slower rate and a nonnormal limit.

While Theorem 1.7 (hence, also Theorem 4.9) involves only the case when $\Gamma^{*} = \Gamma^{*}_{F}$ consists of regular contact points, the comments about virtual contact points between $F_G$ and the identity that led to (43) apply to the global analysis of $\gamma(F_n, G_m)$. As an important example, we consider the case when $F$ and $G$ are finitely supported. More precisely, let us assume $F$ and $G$ have a finite support $x_1 < x_2 < \cdots < x_k$, with probabilities $p_1, p_2, \dots, p_k$ and $q_1, q_2, \dots, q_k$, respectively, with $p_i + q_i > 0$ ($p_i$ or $q_i$ could be null), $i = 1, \dots, k$. We set $P_i := \sum_{j=1}^{i} p_j$ and $Q_i := \sum_{j=1}^{i} q_j$, $i = 1, \dots, k-1$. Then, $F_G(t) = P_i$ for $t \in (Q_{i-1}, Q_i]$. Hence, the only possible inner contact points are $P_i, Q_i$, $i = 1, \dots, k-1$. These can be horizontal crossings ($P_i$ if $Q_{i-1} < P_i < Q_i$), vertical crossings ($Q_i$ if $P_i < Q_i < P_{i+1}$), upper tangency points ($Q_i$ if $Q_{i-1} < Q_i = P_i < P_{i+1}$) or lower tangency points ($P_i$ if $P_{i-1} < P_i = Q_{i-1} < Q_i$), using the same terms as in the discussion following Proposition 4.3. Combining that discussion with Corollary 4.2 we obtain the following consequence.

Theorem 4.10 With the above notation, if $F$ and $G$ are finitely supported and $\mathcal{H}$, $\mathcal{V}$, $\mathcal{U}$ and $\mathcal{L}$ denote, respectively, the sets of horizontal crossing, vertical crossing, upper tangency and lower tangency points for $F$ and $G$, then, assuming that $\frac{n}{n+m} \to \lambda \in (0,1)$,
\[ \sqrt{n+m}\,\big(\gamma(F_n, G_m) - \gamma(F,G)\big) \xrightarrow{w} \sum_{t \in \mathcal{H}} \frac{B_1(t)}{\sqrt{\lambda}} - \sum_{t \in \mathcal{V}} \frac{B_2(t)}{\sqrt{1-\lambda}} + \sum_{t \in \mathcal{U}} \Big(\frac{B_1(t)}{\sqrt{\lambda}} - \frac{B_2(t)}{\sqrt{1-\lambda}}\Big)^{+} - \sum_{t \in \mathcal{L}} \Big(\frac{B_1(t)}{\sqrt{\lambda}} - \frac{B_2(t)}{\sqrt{1-\lambda}}\Big)^{-}, \]
where $B_1$ and $B_2$ are independent Brownian bridges.
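For readers who want to experiment with the statistic appearing in Theorems 4.9 and 4.10, the following is a minimal sketch of how $\gamma(F_n, G_m)$ could be computed exactly from two samples. It assumes the definition $\gamma(F_n, G_m) = \ell\{t \in (0,1) : F_n^{-1}(t) > G_m^{-1}(t)\}$ with the left-continuous empirical quantile $F_n^{-1}(t) = X_{(\lceil nt \rceil)}$; the function name `galton_gamma` is hypothetical and not taken from the paper.

```python
from fractions import Fraction
from math import ceil


def galton_gamma(x, y):
    """Lebesgue measure of {t in (0,1): F_n^{-1}(t) > G_m^{-1}(t)}.

    Assumption: empirical quantiles are left-continuous,
    F_n^{-1}(t) = X_(ceil(n t)), and similarly for G_m^{-1}.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # Both empirical quantile functions are constant between consecutive
    # points of the merged grid {i/n} U {j/m}; exact rationals avoid
    # floating-point ties at the grid points.
    grid = sorted({Fraction(i, n) for i in range(n + 1)}
                  | {Fraction(j, m) for j in range(m + 1)})
    total = Fraction(0)
    for a, b in zip(grid, grid[1:]):
        mid = (a + b) / 2
        # Evaluate both quantile functions at the midpoint of the cell.
        if xs[ceil(n * mid) - 1] > ys[ceil(m * mid) - 1]:
            total += b - a
    return total
```

When both samples have the same size $n$, $n$ times the returned value is the number of indices $i$ with $X_{(i)} > Y_{(i)}$, that is, Galton's original count.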
Similar to Theorem 1.5, we get a Gaussian limiting distribution only when all the contact points are crossing points (which, necessarily, have orders $r_L = r_R = 1$). In the case $F = G$ we have $Q_{i-1} < Q_i = P_i < P_{i+1}$ for all $i$, that is, every $P_i$ is an upper tangency point and Theorem 4.10 yields
\[ \sqrt{n+m}\,\gamma(F_n, G_m) \xrightarrow{w} \sum_{i=1}^{k-1} \Big(\frac{B_1(P_i)}{\sqrt{\lambda}} - \frac{B_2(P_i)}{\sqrt{1-\lambda}}\Big)^{+}. \tag{57} \]
Of course, using the fact that $\sqrt{1-\lambda}\,B_1 - \sqrt{\lambda}\,B_2$ is a Brownian bridge, we can, equivalently, write (57) as $\sqrt{\frac{nm}{n+m}}\,\gamma(F_n, G_m) \xrightarrow{w} \sum_{i=1}^{k-1} (B(P_i))^{+}$.

References

[Álvarez-Esteban et al.(2017)] Álvarez-Esteban, P.C.; del Barrio, E.; Cuesta-Albertos, J.A. and Matrán, C. (2017). Models for the assessment of treatment improvement: the ideal and the feasible. Statist. Sci., 469–485.
[Aly (1986)] Aly, Emad-Eldin A.A. (1986). Strong Approximations of the Q-Q Process. J. Multiv. Analysis, 114–128.
[Aly et al.(1987)] Aly, Emad-Eldin A.A.; Csörgo, M. and Horvath, L. (1987). P-P Plots, Rank Processes and Chernoff-Savage Theorems. In New Perspectives in Theoretical and Applied Statistics (M.L. Puri, J.P. Vilaplana and W. Wertz Eds.), 135–156. Wiley, New York.
[Behnen and Neuhaus (1983)] Behnen, K. and Neuhaus, G. (1983). Galton tests as linear rank tests with estimated scores and its local asymptotic efficiency. Ann. Statist. (2), 588–599.
[Billingsley (1968)] Billingsley, P. (1968). Convergence of Probability Measures. Wiley.
[Bobkov and Ledoux(2016)] Bobkov, S. and Ledoux, M. (2016). One-dimensional empirical measures, order statistics and Kantorovich transport distances. Memoirs Am. Math. Soc. Vol.: 261, Number 1259.
[Chung and Feller (1949)] Chung, K. L., and Feller, W. (1949). On fluctuations in coin-tossing. Proc. Nat. Acad. Sci. of USA, 605–608.
[Csáki and Vincze (1961)] Csáki, E. and Vincze, I. (1961). On some problems connected with the Galton-test. Publ. Math. Inst. Hungar. Acad. Sci., 97–109.
[Csörgo and Horvath (1993)] Csörgo, M. and Horvath, L. (1993).
Weighted approximations in probability and statistics. Wiley.
[Darwin (1876)] Darwin, C. (1876). The effect of Cross- and Self-fertilization in the Vegetable Kingdom. John Murray.
[Doksum (1974)] Doksum, K. (1974). Empirical probability plots and statistical inference for nonlinear models in the two-sample case. Ann. Statist. (2), 267–277.
[Feller(1968)] Feller, W. (1968). An Introduction to Probability Theory and its Applications Vol. I (Third edition). Wiley.
[Gross and Holland(1968)] Gross, S., and Holland, P. W. (1968). The Distribution of Galton Statistic. Ann. Math. Statist. (6), 2114–2117.
[Hodges(1955)] Hodges, J.L. (1955). Galton rank-order test. Biometrika, 261–262.
[Lehmann(1955)] Lehmann, E.L. (1955). Ordered families of distributions. Ann. Math. Statist., 399–419.
[Lévy(1939)] Lévy, P. (1939). Sur certains processus stochastiques homogènes. Compositio Math., 283–339.
[Reiss (1989)] Reiss, R.D. (1989). Approximate Distributions of Order Statistics With Applications to Nonparametric Statistics. Springer.
[Shorack and Wellner(1986)] Shorack, G.R. and Wellner, J.A. (1986). Empirical Processes with Applications to Statistics. John Wiley and Sons, New York.
[Sparre-Andersen (1953)] Sparre-Andersen, E. (1953). On the fluctuations of sums of random variables. Math. Scand., 263–285.
[Zhuang et al (2019)] Zhuang, W.W., Hu, B.Y., and Chen, J. (2019). Semiparametric inference for the dominance index under the density ratio model. Biometrika, 1, 229–241.

Appendix.

A On the composite map $F_G$

We collect here some useful facts about the transform $F_G$ (and $G_F$). At some points we have used the fact that, as a consequence of (18), for every measurable $A \subset [0,1]$,
\begin{align*}
\ell\{t \in A : t > F_G(t)\} &= \ell\{t \in A : F^{-1}(t) > G^{-1}(t)\},\\
\ell\{t \in A : U_n(t) > F_G(V_m(t))\} &= \ell\{t \in A : F_n^{-1}(t) > G_m^{-1}(t)\}. \tag{58}
\end{align*}
Looking at (58), the corresponding statement for $G_F$ would be
\[ \ell\{t \in A : V_m(t) > G_F(U_n(t))\} = \ell\{t \in A : F_n^{-1}(t) < G_m^{-1}(t)\}. \]
This shows that we can base our analysis on either $G_F$ or $F_G$ and, in particular, study $\tilde\ell^{t_0}_{m,n}$ instead of $\ell^{t_0}_{n,m}$ (recall the discussion after Corollary 4.2) if

i) $\ell\{t \in (t_0-\eta, t_0+\eta) : F^{-1}(t) = G^{-1}(t)\} = 0$,
ii) $P\big(\{\ell\{t \in (t_0-\eta, t_0+\eta) : F_n^{-1}(t) = G_m^{-1}(t)\} > 0\} \text{ infinitely often}\big) = 0$,
iii) $P\big(\ell\{t \in (t_0-\eta, t_0+\eta) : V_m(t) = G_F(U_n(t))\} > 0\big) = 0$, and
iv) $P\big(\ell\{t \in (t_0-\eta, t_0+\eta) : U_n(t) = F_G(V_m(t))\} > 0\big) = 0$

hold. When $t_0$ is an isolated contact point then i) is satisfied. The other relations can be easily guaranteed taking into account the next lemma and its consequences.

Lemma A.1 Let $X, Y$ be independent r.v.'s with respective d.f.'s $F$ and $G$. Then $P(X = Y) = 0$ if and only if $F$ and $G$ have no common discontinuity point.

Therefore, if $F$ and $G$ have no common discontinuity point, the samples $\{X_1, \dots, X_n\}$ and $\{Y_1, \dots, Y_m\}$ are a.s. disjoint. Since these samples are the images of $F_n^{-1}$ and $G_m^{-1}$ respectively, the set $\{F_n^{-1} = G_m^{-1}\}$ must be a.s. empty. On the contrary, if there exists a common discontinuity point, $x_0$, for $F$ and $G$, then
\[ P\big(\ell\{F_n^{-1} = G_m^{-1}\} > 0\big) \ge P\big(\ell\{F_n^{-1} = G_m^{-1}\} = 1\big) \ge P(X = x_0)^{n}\,P(Y = x_0)^{m} > 0. \]
This proves the following proposition.

Proposition A.2 Let $F_n^{-1}$ and $G_m^{-1}$ be the sample quantile functions based on independent samples of i.i.d. r.v.'s from the d.f.'s $F$ and $G$. Then $P\big(\ell\{F_n^{-1} = G_m^{-1}\} > 0\big) > 0$ for some $n, m$ if and only if $F$ and $G$ have a common discontinuity point.

Elaborating on the same ideas, the following summarizing proposition easily follows.

Proposition A.3 Relations iii) and iv) above always hold. Moreover, if $F$ and $G$ do not have common discontinuity points on the set $[F^{-1}(t_0-\eta_0), F^{-1}(t_0+\eta_0)]$, then ii) holds for every $\eta \in (0, \eta_0)$.

Proposition A.25 in [Bobkov and Ledoux(2016)] provides simple necessary and sufficient conditions under which a quantile function is Lipschitz.
We exploit that characterization to give here necessary and sufficient conditions under which $F_G$ is Lipschitz.

Proposition A.4 The transform $F_G$ is Lipschitz if and only if $F^{-1}$ is increasing on $[F_G(0), F_G(1)]$ and
\[ \mathrm{supp}(F) \cap (G^{-1}(0), G^{-1}(1)) \subset \mathrm{supp}(G) \tag{59} \]
and there exists some $\delta > 0$ such that
\[ \limsup_{y \to x,\ y > x} \frac{G(F^{-1}(y)-) - G(F^{-1}(x)-)}{y - x} > \delta \quad \text{a.e. on } \mathrm{supp}(F). \]
Condition (59) can be equivalently stated as
\[ F \text{ is continuous and } G \text{ increasing on } \mathrm{supp}(F) \cap (G^{-1}(0), G^{-1}(1)). \tag{60} \]

Proof. We set $H^{-1}(t) = F_G(t-)$, $t \in (0,1)$, and note that $H^{-1}$ is left-continuous, hence a quantile function, and also that $H^{-1}$ is Lipschitz if and only if $F_G$ is Lipschitz. We write $H$ for the associated d.f., namely, $H(x) = \ell\{t : F_G(t-) \le x\} = \ell\{t : F_G(t) \le x\}$. By Proposition A.25 in [Bobkov and Ledoux(2016)], $H^{-1}$ is Lipschitz if and only if the associated probability is supported in a finite interval and its absolutely continuous component has a density separated from zero on that interval.

Since the support of the law $\mathcal{L}(H^{-1})$ is contained in $[0,1]$, […] $H$ is strictly increasing on $[F_G(0), F_G(1)]$ (see Proposition A.7 in [Bobkov and Ledoux(2016)]). This, in turn, is equivalent to (59) and to (60).

In fact, to check the equivalence to (59), we note that if $H$ is increasing on $[F_G(0), F_G(1)]$ then for every $a, b \in (F_G(0), F_G(1))$, $a < b$, we have $\ell\{t : F_G(t) \in (a,b)\} > 0$, which holds if and only if $\ell\{t : F_G(t) \in [a,b)\} > 0$ for every $a, b \in (F_G(0), F_G(1))$, $a < b$, hence, if and only if $\ell\{t : G^{-1}(t) \in [F^{-1}(a), F^{-1}(b))\} > 0$. This implies that $F^{-1}$ must be increasing on $(F_G(0), F_G(1))$. Moreover, if $x \in (G^{-1}(0), G^{-1}(1)) \setminus \mathrm{supp}(G)$, then $x \in (G^{-1}(t^{*}), G^{-1}(t^{*}+))$ for some $t^{*} \in (0,1)$. […] $x \in \mathrm{supp}(F)$, taking $\delta > 0$ such that $(x-\delta, x+\delta) \subset (G^{-1}(t^{*}), G^{-1}(t^{*}+))$, we would have $t^{*} = G(x-\delta) = G(x+\delta)$. Thus, $G^{-1}(t^{*}) < x-\delta$ […] 0. This completes the proof.
•

Remark A.5 Our analysis of the local behaviour of Galton's statistic around a contact point, $t_0$, required $F_G$ to be Lipschitz on a neighbourhood $(t_0-\eta, t_0+\eta)$. This can be characterized in the same way as (59) and (60). Without trying to give the best possible result, this can be guaranteed, e.g., if $G^{-1}$ is increasing and continuous on $(t_0-\eta, t_0+\eta)$ except perhaps at $t_0$, $F$ is continuous on $G^{-1}((t_0-\eta, t_0+\eta))$, and the derivatives of $F$ and $G$ (that exist almost everywhere) satisfy
\[ \operatorname{ess\,inf}\Big\{ \frac{G'(x)}{F'(x)},\ x \in G^{-1}((t_0-\eta, t_0+\eta)) \cap \mathrm{supp}(F) \Big\} > 0. \]

B Approximation of uniform quantile processes

The following result has been used extensively in this paper. It is a consequence of a refined version of the Komlos-Major-Tusnady construction for the quantile process (see, e.g., Theorem 3.2.1, p. 152 in [Csörgo and Horvath (1993)]), from which we know that there exists a sequence of Brownian bridges on $[0,1]$, $\{B_n\}$, versions of $u_n$ and positive constants, $C_1$, $C_2$ and $C_3$, such that
\[ P\Big\{ \sup_{0 \le t \le 1} |u_n(t) - B_n(t)| > \frac{x + C_1 \log n}{\sqrt{n}} \Big\} \le C_2 e^{-C_3 x}, \quad x > 0. \tag{61} \]
Making use of this construction for both quantile processes and taking $x = \frac{a}{C_3} \log n$ with $a > 1$ and $K = \frac{a}{C_3} + C_1 > 0$, we obtain useful independent sequences of Brownian bridges $\{B_n^F\}$, $\{B_m^G\}$ and versions of $u_n$ and $v_m$.

Theorem B.1 With the previous notation, in a probability one set, the sequences $\{B_n^F\}$, $\{B_m^G\}$, $\{u_n\}$ and $\{v_m\}$ eventually satisfy
\[ \sup_{0 \le t \le 1} |u_n(t) - B_n^F(t)| \le K \frac{\log n}{\sqrt{n}} \quad \text{and} \quad \sup_{0 \le t \le 1} |v_m(t) - B_m^G(t)| \le K \frac{\log m}{\sqrt{m}}. \]
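The step from (61) to Theorem B.1 is a Borel-Cantelli argument, which can be sketched as follows. The sketch assumes that the three constants in (61) are labelled $C_1, C_2, C_3$ in order of appearance, that $x = \frac{a}{C_3}\log n$ with $a > 1$, and that $K = \frac{a}{C_3} + C_1$; these labels are a reading of the construction above rather than a quotation.

\begin{align*}
P\Big\{ \sup_{0 \le t \le 1} |u_n(t) - B_n^F(t)| > K \frac{\log n}{\sqrt{n}} \Big\}
&= P\Big\{ \sup_{0 \le t \le 1} |u_n(t) - B_n^F(t)| > \frac{\frac{a}{C_3}\log n + C_1 \log n}{\sqrt{n}} \Big\}\\
&\le C_2 \exp\Big(-C_3 \cdot \frac{a}{C_3} \log n\Big) = C_2\, n^{-a},
\end{align*}
\[ \sum_{n \ge 1} C_2\, n^{-a} < \infty \quad \text{for } a > 1. \]

By the Borel-Cantelli lemma, with probability one the event inside the probability occurs for only finitely many $n$, which is the first bound in Theorem B.1; the same computation with $v_m$, $B_m^G$ and $m$ in place of $u_n$, $B_n^F$ and $n$ gives the second.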