Abstract

The two-sample Kolmogorov-Smirnov test is a widely used statistical test for detecting whether two samples are likely to come from the same distribution. Implementations typically rely on a 1957 article by Hodges. Advances in computation speed make it feasible to compute exact p-values for a much larger range of problem sizes, but these computations run into numerical stability problems from floating point operations. We provide a simple transformation of the defining recurrence for the two-sided two-sample KS test that avoids this.


Numerically more stable computation of the p-values for the two-sample Kolmogorov-Smirnov test

Thomas Viehmann ∗

MathInf Technical Report 2021-1, February 2021


The Kolmogorov-Smirnov two-sample test (KS test) is perhaps the go-to statistical test of whether two samples originate from the same distribution.

To make things precise, we consider samples $x_1, \ldots, x_m \in \mathbb{R}$ and $y_1, \ldots, y_n \in \mathbb{R}$ drawn from two continuous distribution functions $F$ and $G$, respectively. We form the empirical distribution functions $F_m(x) := |\{x_i : x_i \le x\}|/m$ and $G_n(x) := |\{y_i : y_i \le x\}|/n$. The KS test then tests the null hypothesis $F = G$. The test statistic is computed from the empirical distributions as

$$D = \sup_x |F_m(x) - G_n(x)|.$$

Following the usual notation, we write the supremum even though in the cases we consider, it is actually a maximum. Because it works with only the ranks, the test does not make assumptions on the distributions themselves.

To operationalize the test, we need to compute p-values. Smirnov famously gave the asymptotic formula that if $m, n \to \infty$ such that $n/m \to q \in \mathbb{R}$,

$$P = \mathrm{Prob}\left[\sqrt{\frac{mn}{m+n}}\, D \ge x\right] \to 1 - K(x) = 2 \sum_{k=1}^{\infty} (-1)^{k-1} \exp(-2 k^2 x^2),$$

where $K$ is the cumulative distribution function of the Kolmogorov distribution. We use Hodges' name $P$ for the two-sided problem.

However, as noted by Hodges, it is unclear how well this approximation works in practice: After studying the problem numerically for $m = 12, n = 13, \ldots$, he writes "The Smirnov approximation is seen to be highly inaccurate for values of m and n which are already large enough for direct computations to be arduous." The amount of computation that is possible with ease has dramatically increased since the 1950s, and so it is natural to revisit direct computation.

∗ MathInf GmbH, [email protected]

Direct computation of p-values
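As a baseline for the direct computation, the statistic $D$ and Smirnov's asymptotic tail from the introduction can be sketched in Python. This is a minimal illustration, not the report's implementation: the function names `ks_statistic` and `smirnov_asymptotic_pvalue` and the truncation threshold are assumptions made for this example.

```python
import math
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Return D = sup_x |F_m(x) - G_n(x)| for two samples (hypothetical helper)."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    # Both empirical distribution functions are step functions that only
    # jump at sample points, so the supremum is attained at one of them.
    return max(
        abs(bisect_right(xs, t) / m - bisect_right(ys, t) / n)
        for t in xs + ys
    )


def smirnov_asymptotic_pvalue(d, m, n):
    """Approximate P by 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 x^2),
    with x = sqrt(mn / (m + n)) * d (hypothetical helper)."""
    x = math.sqrt(m * n / (m + n)) * d
    if x <= 0.0:
        return 1.0  # D >= 0 always holds
    total, k = 0.0, 1
    while True:
        term = math.exp(-2.0 * k * k * x * x)
        total += term if k % 2 == 1 else -term
        if term < 1e-17:  # assumed cutoff: remaining terms are negligible
            break
        k += 1
    # Clip, since rounding in the alternating sum can leave [0, 1] slightly.
    return min(1.0, max(0.0, 2.0 * total))
```

For example, `ks_statistic([1, 3], [2, 4])` evaluates to `0.5`. The asymptotic value is only an approximation for finite $m$ and $n$, which is the motivation for the exact computation discussed in this section.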

The key observation for the computation of p-values is that we can compute the distribution of $D$ under the null hypothesis in purely combinatorial terms. Under the assumptions, all values are distinct with probability 1. We may order the joint sequence of the $x_i$ and $y_j$. We ignore the indices and write an $x$ in positions where an $x_i$ occurs and a $y$ where a $y_j$ occurs, obtaining a random sequence of $m$ $x$s and $n$ $y$s. Under the null hypothesis, all $\binom{m+n}{m}$ distinct sequences have equal probability.

We may map these to paths $P$ between $(0, 0)$ and $(1, 1)$ where for each $x$ we move to the right by $1/m$ and for each $y$ we move up by $1/n$. Given a path $P$, the statistic $D$ is the maximum $\sup_{(x,y) \in P} |x - y|$.
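The path picture can be checked by brute force for small $m$ and $n$: enumerate every placement of the $m$ $x$s among the $m + n$ positions, compute $D$ along each path, and count how often it reaches a given threshold. The sketch below does exactly that with exact rational arithmetic. It is only an illustration of the combinatorial definition, not the recurrence-based method the report is about, and the names `path_statistic` and `exact_pvalue_bruteforce` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def path_statistic(x_positions, m, n):
    """D for one ordering: max |i/m - j/n| over the points of the lattice path.

    x_positions are the indices (0-based) in the joint sequence holding an x.
    """
    x_set = set(x_positions)
    i = j = 0  # number of x steps and y steps taken so far
    best = 0   # running maximum of |i*n - j*m|, i.e. m*n*|i/m - j/n|
    for pos in range(m + n):
        if pos in x_set:
            i += 1  # move right by 1/m
        else:
            j += 1  # move up by 1/n
        best = max(best, abs(i * n - j * m))
    return Fraction(best, m * n)


def exact_pvalue_bruteforce(d, m, n):
    """P[D >= d] under the null hypothesis by enumerating all C(m+n, m) paths.

    Pass d as a Fraction (or int) to keep the comparison exact; a float d
    is converted to its exact binary value, which may differ from the
    intended rational.
    """
    d = Fraction(d)
    hits = sum(
        1
        for x_positions in combinations(range(m + n), m)
        if path_statistic(x_positions, m, n) >= d
    )
    return Fraction(hits, comb(m + n, m))
```

For $m = n = 2$ there are six paths, of which only $xxyy$ and $yyxx$ reach $D = 1$, so `exact_pvalue_bruteforce(1, 2, 2)` returns `Fraction(1, 3)`. The enumeration grows like $\binom{m+n}{m}$ and is therefore useful only as a correctness check for tiny inputs.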
