[PDF] Robust test statistics for data sets with missing correlation information

Abstract

Not all experiments publish their results with a description of the correlations between the data points. This makes it difficult to do hypothesis tests or model fits with that data, since just assuming no correlation can lead to an over- or underestimation of the resulting uncertainties. This work presents robust test statistics that can be used with data sets with missing correlation information. They are exact in the case of no correlation and either guaranteed to be conservative -- i.e. the uncertainty is never underestimated -- in the presence of correlations, or they are also exact in the degenerate case of perfect correlation between the data points.

Full PDF

RRobust test statistics for data sets with missing correlation information

Lukas Koch

University of Oxford

Abstract

Not all experiments publish their results with a description of the correlations between the data points. Thismakes it diﬃcult to do hypothesis tests or model ﬁts with that data, since just assuming no correlation canlead to an over- or underestimation of the resulting uncertainties. This work presents robust test statisticsthat can be used with data sets with missing correlation information. They are exact in the case of nocorrelation and either guaranteed to be conservative – i.e. the uncertainty is never underestimated – in thepresence of correlations, or they are also exact in the degenerate case of perfect correlation between the datapoints.

1. Introduction

Some data sets are published without a full co-variance matrix, describing the correlations be-tween the data points of the result. The impliedassumption in these data sets is that the correla-tion in the uncertainties is 0, i.e. the data is uncor-related. This is not always the case though , andusers of the data are put in the unenviable situationof having to use correlated results, without know-ing what the correlations actually are. Usually onewould use the fully correlated Mahalanobis distanceor its square, D = ∆ T S − ∆ , to judge how well acertain model ﬁts the data. Here ∆ is the diﬀer-ence between the data and the model, and S is thecovariance matrix describing the uncertainty of theresult. Just ignoring the correlations and apply-ing a “naive”, uncorrelated Mahalanobis distanceto compare a model to the data can lead to plainlywrong results.Consider multivariate normal distributed datawith ten dimensions. In fact, unless otherwisestated, let us assume that the examples in this pa-per are all multivariate normal distributed and weknow the correct diagonal elements of the respec-tive covariance matrices. The only problem we will Email address: [email protected] (LukasKoch) If, e.g., the data points vary smoothly within their errorbands or when they are supposed to describe a “shape-only”uncertainty, it is clear that there is a correlation there. address here is the missing of information regardingthe correlations between the variables. The “naive”test statistic would consist of just summing up thesquared z-scores, i.e. the residuals normalised bythe uncertainty:naive( x | µ , s ) = (cid:88) i ( x i − µ i ) S ii , (1)where x is the data result, µ is a prediction of theexpectation value from some model, and S ii = s i isthe variance of the data points. This test statisticwill be chi-square distributed if there are no cor-relations present in the data, but using it in thepresence of correlations leads to under- or overesti-mation of uncertainty, depending on the correlationand the actual value of the statistic. Figure 1 showsthis for 10-dimensional toy data sets thrown withdiﬀerent levels of correlation. The diagonal terms ofthe covariance matrix are kept constant at 1, whilethe oﬀ-diagonals are set to the values 0, 0.5, 0.9,and 0.99, in the diﬀerent sets.The left plot shows the cumulative probabilitydensity functions (CDFs) of the expected distribu-tion of the test statistic in the absence of corre-lations, as well as the actual CDFs of the diﬀer-ent toy data sets. The diﬀerent distributions aﬀectwhich signiﬁcance level (or p-value) a certain valueof the test statistic corresponds to. The right plotshows how an assumed signiﬁcance level (as cal-culated with the expected CDF) translates to theactual signiﬁcance level (as calculated from the ac- Preprint submitted to TBD February 12, 2021 a r X i v : . [ phy s i c s . d a t a - a n ] F e b ual CDFs). To put it another way: The x-axisshows how often one would like to make a Type-Ierror (rejecting a true hypothesis), while the y-axisshows how often one actually makes a Type-I er-ror, given the diﬀerent levels of correlation in thedata. If the actual signiﬁcance level is larger thanassumed, one rejects a true hypothesis more oftenthan intended. In terms of error bars or conﬁdenceregions, this means that the size of the uncertaintiesis eﬀectively underestimated.This behaviour is clearly undesirable. If the datais (suspected to be) strongly correlated, it would bebetter to use a diﬀerent test statistic that is able toperform consistently under diﬀerent levels of cor-relation in the data. For such a test statistic, thefollowing properties would be desirable:1. Exact in the case of no correlations.2. Guaranteed to be conservative when not exact.3. Low deviations from exactness when not exact.4. Exact in the case of 100% correlation.5. Exact at every possible level of correlation.Some of these properties are contained in one an-other. The naive use of the uncorrelated Maha-lanobis distance has property 1 but none of theothers. The following sections will describe sometest statistics that have more of these properties.Finally they will be compared in section 4.

2. Fitting the covariance to the data

The ﬁrst considered test statistic arises fromtreating the oﬀ-diagonal elements of the covari-ance matrix as nuisance parameters of the statis-tical model. For any given prediction in the N-dimensional parameter space µ and a given sam-ple (i.e. the data) x , it is possible to choosethe oﬀ-diagonal elements of S in a way to min-imise the resulting squared Mahalanobis distance D = ( x − µ ) T S − ( x − µ ). This is diﬀerent frommaximising the likelihood of the data, since theprobability density of a multivariate normal dis-tribution also depends on the determinant of thecovariance matrix: L = (2 π ) − N det( S ) − e − ( x − µ ) T S − ( x − µ ) . (2)A minimal Mahalanobis distance is only equivalentto a maximal likelihood if the determinant of thecovariance matrix is constant. This is not the casehere. Furthermore, for N ≥ ∞ , rendering it useless as a teststatistic. This will be shown below.To minimise the Mahalanobis distance, we cantransform the variable space to make the last vari-able x N independent of the others: y =  − S N S NN  x (3)=  . . . − S N S NN . . . − S N S NN ... ... . . . ... ...0 0 . . . − S ( N − N S NN . . .  x . (4)Here S NN is the variance of the Nth variable, and S N is the vector of the N − S N =  S N ... S ( N − N  . (5)The covariance and expectation values for y arethen: S y = ( ∇ y T ) T S ( ∇ y T ) (6)=  − S N S NN  S /N S N S TN S NN  − S TN S NN  (7)=  S /N − S N S TN S NN S TN S NN  − S TN S NN  (8)=  S /N − S N S TN S NN S NN  , (9) µ y =  − S N S NN  µ , (10)where S /N is the original covariance matrix for theremaining N − y N = x N to the total Ma-halanobis distance is ﬁxed, since it only dependson S NN , which has a given constant value. Thecontribution of the remaining variables y /N couldbe minimised to 0 by choosing the oﬀ-diagonal ele-ments of the covariance such that their expectationvalue µ y/N is equal to the actual value: µ y/N ! = y /N (11) µ /N − S N S NN µ N = x /N − S N S NN x N (12)2 x )0.00.20.40.60.81.0 C D F Uncorr.Corr.50Corr.90Corr.99expected × 10 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 1: CDFs (left) for the “naive” test statistic for diﬀerent levels of correlations in the data. When using the uncorrelatedCDF to calculate the assumed signiﬁcance level (or p-value) of a value of the statistic, the actual level will diﬀer from theassumption depending on the correlations (right). As the correlation increases, the distribution of the naive test statisticapproaches that of a χ distributed variable which is multiplied by the number of bins (in this case 10). S N = S NN x /N − µ /N x N − µ N (13)=  ∆ i √ S ii √ S NN ∆ N √ S ii S NN ... ∆ N − √ S ( N − N − √ S NN ∆ N (cid:112) S ( N − N − S NN  (14)Here ∆ i / √ S ii = ( x i − µ i ) / √ S ii is the (positive ornegative) z-score of the i th variable, and √ S ii S NN is the maximum allowed absolute value of the co-variance between the i th and N th variable. Thelatter arises from the fact that the correlation coef-ﬁcients S ij / √ S ii S NN must be within [ − , S N is thus the ratio of the N − − , N th is the one with the largest absolute z-score.Thus, by eliminating the contribution of the othervariables as shown above, the minimal achievableMahalanobis distance under variation of the oﬀ-diagonal elements of the covariance matrix is equalto the largest absolute z-score of the single vari-ables. This behaviour is illustrated in Figure 2 fortwo dimensions. Note that we only need N − N ( N − / S arbitrarily small, leading to the inﬁnite supremumof the Likelihood maximisation over the covarianceelements for N ≥ b the largest absolute z-score, and wecan deﬁne the “ﬁtted” test statistic as:ﬁtted( ∆ | s ) = b = max i (cid:18) ∆ i S ii (cid:19) . (15)It is straight-forward to derive the expected dis-tribution of this test statistic in the case of no corre-lations. The CDF of b , F b ( b (cid:48) ), is just the probabilityof the absolute values of all z-scores being smallerthan or equal to b (cid:48) : F b ( b (cid:48) ) = P ( b ≤ b (cid:48) ) = + b (cid:48) (cid:90) − b (cid:48) · · · + b (cid:48) (cid:90) − b (cid:48) f ( z , . . . , z N ) d z . . . d z N , (16)where f ( z ) denotes the probability density function(PDF) of the potentially negative z-scores. Withuncorrelated, standard normal distributed z-scoresthis evaluates to F b ( b (cid:48) ) = erf N (cid:18) b (cid:48) √ (cid:19) . (17)With this we can write down the CDF of b as F b ( y ) = F b ( √ y ), (18)3 .5 1.0 0.5 0.0 0.5 1.0 1.5 2.0 / S / S z z S = 0 S = S S z / z Figure 2: Minimum achievable Mahalanobis distance for two dimensions. The minimum achievable Mahalanobis distance whenvarying the oﬀ-diagonal covariance element is equal to the largest absolute z-score of the single variables. The surface wherethe Mahalanobis distance is equal to 1 is an ellipse contained within the square with its edges at ∆ i / √ S ii = ±

1. Varying theoﬀ-diagonal element of the covariance matrix does not rotate the principal axes of the ellipse, but it changes where it touchesthe edges of the square. When chosen correctly, the ellipse touches the edge of the box at the point where the data is projectedonto it. Because of the linearity of the Mahalanobis distance, this means that the total distance of the data point is thensimply the largest z-score. This two-dimensional minimisation can be done for all marginal projections of pairs of variablesin N-dimensional problems. To achieve the minimal total Mahalanobis distance, only the pairs involving the overall largestabsolute z-score need to be speciﬁed like this, but applying the scheme to all pairs ensures a valid covariance matrix. ∆ | s ) ∼ Bee N , (19)for uncorrelated normal distributed ∆ i . A Pythonimplementation of the distribution can be found inListing 1 in the appendix.Figure 3 shows how this test statistic fares for dif-ferent levels of correlation in the toy data. For nocorrelations, the distribution follows the expecta-tion. With increasing correlations, the distributiondeviates more and more, approaching a chi-squaredistribution with one degree of freedom. Com-pared to the naive approach, we can see that thedeviation from an exact statistic has been decreasedfor a wide range of signiﬁcance levels. But more im-portantly, the ﬁtted test statistic is conservative forall signiﬁcance levels and all correlation strengths.The real signiﬁcance level of a result is always equalto or lower than the assumed signiﬁcance that wasevaluated using the expected Bee-square distribu-tion.

3. Asymptotically invariant test statistics

The ﬁtted test statistic described above is “safe”to use in the sense that it is always conservative.Unfortunately it gets more and more conservativewith increasing correlations in the data. It wouldbe advantageous if the test statistic was exact at alllevels of correlations, or at least at both no corre-lations, and (in the limit of) 100% correlations. Toachieve the latter, it is useful to view the problemin the “CDF space” of the data points.Instead of the distribution of ∆ , let us considerthe CDFs of the squares of the single variables: y i = F χ (∆ i /S ii ), (20)where F χ is the CDF of ∆ i /S ii , since we assume∆ i to be normal distributed with a variance of S ii .Since y is a function of a random variable, it is As the data gets more and more correlated, the z-scoreswill approach being equal in all cases, and the maximumz-score will be distributed like a single standard normal dis-tributed variable. I.e. the probability of a result at least as extreme as theobserved is actually lower than what the assumed distribu-tion suggest; the uncertainty is overestimated. itself a random variable. Also, by deﬁnition, y i isuniformly distributed between 0 and 1: y i ∼ U(0 , y i , f y ( y ) will also be uniform within theN-cube deﬁned by the N unit vectors: f y ( y ) = (cid:40) ≤ y i ≤ ∀ i y i .If, on the other hand, the variables are perfectlycorrelated, the values of all y i will be identical ineach random sampling. This means the combined PDF must be zero wherever the y i are not identical.In this case, the combined probability density willbe a delta function that concentrates all probabilityon the main diagonal of the hypercube: f (cid:48) y ( y ) = (cid:40)(cid:81) N − i =1 δ ( y i − y i +1 ) if 0 ≤ y ≤

10 else (23)Note that this does not aﬀect the marginal distri-butions of the single variables. Marginalising outall but one variable leads again to a uniform distri-bution in that variable. If we can deﬁne a function z ( y ) that is identically distributed under both as-sumptions, we can use it to deﬁne a test statisticthat is exact in both cases.Let us demand that a low value of z indi-cates a good agreement between data and model,while high values indicate tension between the two.Within the N-dimensional hypercube, this meansthat z should be low towards the corner at y = and increase towards the corner at y = . For z tobe identically distributed with no correlations andwith 100% correlations, the surface deﬁned by theimplicit function z ( y ) = z (cid:48) = const . must enclosethe same amount of probability in both cases, asthis deﬁnes the CDF of z : F z ( z (cid:48) ) = (cid:90) z ( y ) ≤ z (cid:48) f ( (cid:48) ) y ( y )d N y (24)5 x )0.00.20.40.60.81.0 C D F Uncorr.Corr.50Corr.90Corr.99expected Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 3: CDFs (left) for the “ﬁtted” test statistic for diﬀerent levels of correlations in the data. When using the uncorrelatedCDF to calculate the assumed signiﬁcance level (or p-value) of a value of the statistic, the actual level will diﬀer from theassumption depending on the correlations (right). In the presence of correlations, the real signiﬁcance is consistently higher(the signiﬁcance level is lower) than the assumption. This means the uncertainties are overestimated and the statistic behavesconservatively. As the correlations increase, the distribution of the ﬁtted test statistic approaches the χ distribution. In the case of no correlation, this is the volume A of the part of the N-cube enclosed by the implicitfunction. In the case of perfect correlation, it isequal to the single (identical) y i coordinates wherethe surface intersects with the diagonal. Figure 4aillustrates this in the case of two dimensions.Let us call this coordinate x and let us also de-mand that the function z ( y ) = x at that point. Wethen get the following condition: A = (cid:90) z ( y ) ≤ x d N y ! = x , (25)where the integral is understood to be conﬁned tothe inside of the N-cube. The challenge is now toﬁnd functions z ( y ) which fulﬁl this condition forany number of dimensions. The simplest way to fulﬁl Equation 25 in twodimensions is to draw straight lines from the pointson the diagonal to the ”oﬀ-diagonal” corners of thesquare, as shown in Figure 4b. The function z canbe easily calculated for any given point, as eachpoint can be seen as lying on a straight diagonalline starting at the lower or left edge of the square( y s ) and ending on the right or top edge ( y e ). Thefractional distance along this line is the desired z and evaluates to z ( y ) = y min y min + (1 − y max ) , (26) where y min / max are the minimum and maximum ofthe elements of y respectively. This also directlyapplies to the N-dimensional case without change.Now, it would be possible to use z as the teststatistic directly when used on its own. When thedata is intended to be used in conjunction withother data sets though, e.g. in a global ﬁt, it isuseful to use a test statistic that is (approximately)chi-square distributed. To this end, we can simplyapply another function to z which is chosen so thatthe result is chi-square distributed if z is uniformlydistributed. This function is just the inverse of theCDF of the chi-square distribution − F χ . Finally weget the ﬁrst of the “invariant” test statistics:invariant ( ∆ | s ) = − F χ (cid:16) z (cid:16) F χ (cid:16) ∆ i / (cid:112) S ii (cid:17) , . . . (cid:17)(cid:17) ,(27)with z as deﬁned in Equation 26.Note that we could have chosen a diﬀerent num-ber of degrees of freedom for the transformationback to a chi-square distribution. This makes nodiﬀerence when using the test statistic on a singledata set alone, but it changes the relative weight adata set has in a global ﬁt with other data whencomputing the total chi-square. If one is conﬁdentthat the correlations in the data set are weak, itmight be better to use the actual number of datapoints. In the presence of medium to strong cor-relations it could be argued though that there isactually less information in the data than the num-ber of data points suggests, or rather, we are losing“degrees of freedom” by having to make up for the6 .0 0.2 0.4 0.6 0.8 1.0 y = F ( / S )0.00.20.40.60.81.0 y = F ( / S ) x A z = const . = x (a) Illustration y = F ( / S )0.00.20.40.60.81.0 y = F ( / S ) x A z = y min y min + (1 y max ) y s + z ( y e y s ) (b) Invariant 1 y = F ( / S )0.00.20.40.60.81.0 y = F ( / S ) x A z = max( g ( y max ), y min ) (c) Invariant 2 y = F ( / S )0.00.20.40.60.81.0 y = F ( / S ) x A z = max( h ( y max ), y min y min + (1 y max )) (d) Invariant 3 Figure 4: Illustration of the test statistics in CDF space. For z to be identically distributed at both no correlations (PDF of y is uniform over square) and at 100% correlations (PDF of y is uniform along diagonal), the area A enclosed by the implicitfunction z = const. must be equal to the position x where the function crosses the diagonal. Another simple solution in two dimensions is toadd identical rectangles to the two inside-facingsides of the x square, as shown in Figure 4c. Let l be the length of the added rectangles: l = 1 − x N dimensions, the shape of A be-comes a hypercube H with an edge length of x + l ,minus another hypercube H at the inside diagonalcorner with an edge length of l : A = ( x + l ) N − l N ! = x . (29)Expressed in the total width of the larger cube d = x + l we get: d N − l N ! = d − l . (30)This equation will always have a solution of l ∈ (0 , d ) for any d ∈ ( N − / ( N − , l ( d ) be that solution, so we candeﬁne the function g : [0 , → [0 , g ( d ) =  d ≤ N − / ( N − d − l ( d ) if N − / ( N − < d <

11 if d = 1, (31)which can calculate x from a given (possible) edgelength d of H .To calculate the z of any given point, we can usethat every point on the surface of ( H \ H ) is eitheron one of the the “outer” faces of H (where all co-ordinates are > H (where at least one coordinate is x ). In the formercase, the edge length of H is given by the maxi-mum of the y -coordinates, while in the latter casethe position of the “inner” diagonal corner of H is given by the minimum of the coordinates. Sincethat inner corner is at y i = x ∀ i by construction,we can write z as z ( y ) = max( g ( y max ) , y min ). (32) With this z we can then deﬁne the invariant ( ∆ | s )test statistic just like in Equation 27.Its performance can be seen in Figure 6. It isconservative for all strengths of correlation and sig-niﬁcance levels, and shows the expected limit of ex-actness at no and very strong correlations. Finally, Figure 4d shows an intermediate shapebetween “invariant 1” and “invariant 2”. It can beinterpreted as the shape of “invariant 1”, but in-stead of connecting the diagonal to the oﬀ-diagonalcorners of the square, it is connected to the respec-tive corners of a larger square with edge length 1 /α ,with the shape parameter α ∈ (0 , A is equal to x , it is cut oﬀ at theedges of a square with edge length d .In N dimensions, the volume of such a body is: A = d N − ( d − x ) N (1 − αx ) N − = x . (33)This equation has one solution of x ∈ (0 , d ) if ( N − αdN + αd ) > d − N . Let x ( d ) be that solution and,like before, we can deﬁne a function h α : (0 , → (0 , h α ( d ) =  N − αdN + αd ) ≤ d − N x ( d ) if ( N − αdN + αd ) > d − N >

11 if d = 1. (34)This determines the value of z for points on thesurface of the hypercube.The value of z for points on the “cut-oﬀ” cornercan be calculated just like in the case of “invari-ant 1”. Only the scaling of the containing cubeneeds to be taken into account. The total z func-tion is then again the maximum of the two values: z ( y ) = max (cid:18) h α ( y max ) , y min αy min + (1 − αy max ) (cid:19) .(35)The ﬁnal test statistic “invariant 3” is then builtaccording to Equation 27. A python implementa-tion of this test statistic is provided in Listing 2 inthe appendix.This test statistic has one free shape parameter α .It determines the opening angle of the iso- z surfacewhere it meets the main diagonal of the hypercube.In the 2D case, this angle is constant at 90 ◦ for α →

0, making it identical to the “invariant 2”8 x )0.20.30.40.50.60.70.80.91.0 C D F Uncorr.Corr.50Corr.90Corr.99expected 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 5: CDFs (left) for the “invariant 1” test statistic for diﬀerent levels of correlations in the data. When using theuncorrelated CDF to calculate the assumed signiﬁcance level (or p-value) of a value of the statistic, the actual level will diﬀerfrom the assumption depending on the correlations (right). The eﬀect of the correlations is weaker than for the naive teststatistic, but it is not consistently conservative as the ﬁtted one. As the correlations increase, the distribution of the teststatistic approaches the uncorrelated expectation again. x )0.20.30.40.50.60.70.80.91.0 C D F Uncorr.Corr.50

Corr.90

Corr.99expected 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 4 Uncorr.Corr.50Corr.90Corr.99

Figure 6: CDFs (left) for the “invariant 2” test statistic for diﬀerent levels of correlations in the data. When using theuncorrelated CDF to calculate the assumed signiﬁcance level (or p-value) of a value of the statistic, the actual level will diﬀerfrom the assumption depending on the correlations (right). Like the ﬁtted test statistic, this one is consistently conservative.As the correlations increase, the distribution of the test statistic approaches the uncorrelated expectation again. α >

0, the angle starts at 90 ◦ at x = 0and then opens up with increasing x . The value of α determines where the angle reaches 180 ◦ : x ◦ = 12 α . (36)E.g. for α = 1, corresponding to the “invariant 1”case, the opening angle is 180 ◦ at x = 0 . < ◦ means that the surface “protrudes” into parts ofthe CDF space that should “belong” to a highervalue of x , suggesting a conservative statistic. Con-versely, an opening angle > ◦ means that theCDF space perpendicular to the diagonal is onlycovered with higher x , suggesting a coverage thatis actually lower than the expectation. At α = 0 . < ◦ for all x (as x is always < α though. Figure 7 shows the performance of the“invariant 3” test statistic with α = 2 /

3. Out ofall considered test statistics, it shows the smallestdeviations from exactness, and it is conservative forthe considered signiﬁcance levels and correlations.

4. Comparison

It is quite clear that among the considered teststatistics, “invariant 3” performs the most consis-tent under many diﬀerent levels of correlations. Itsuﬀers from a kind of arbitrariness though whentrying to combine it in larger ﬁts with other datasets. The transformation to a chi-square distribu-tion with one degree of freedom is a conservativechoice that allows it to be combined in least-squaresor likelihood ﬁts. Aside from the case of perfect cor-relations, it will under-estimate the amount of in-formation compared to the other data-sets though.The “ﬁtted” statistic is quite a bit more conserva-tive and it gets more conservative the stronger thecorrelations are. It has the advantage though thatit corresponds to an “actual Mahalanobis distance”when considering the correlations in the data as nui-sance parameters. It should thus easily be includedin least-square ﬁts in combination with other datasets. Wilks’ theorem does not hold for it though,except for very strong correlations. Only then isit distributed like a chi-square with one degree of freedom. With no correlations present, it followsthe “Bee-square” distribution.Figure 8 shows the shape of the resulting conﬁ-dence regions for the diﬀerent test statistics in twodimensions. The “invariant” statistics all have across shape at low conﬁdence levels and then getprogressively more square. The conﬁdence regionsof the “invariant 1” statistic extend all the way to ±∞ along the variable axes. This is due to the factthat all lines of the construction in the CDF spacego to the edges of the hypercube. For a normaldistributed variable a CDF of 1 means a variablevalue of ±∞ . The coverage is still correct in thecase of no correlations, but in principle this meansthat one would have to include a point in the 1 σ re-gion that is arbitrarily far away on one axis, as longas it is close enough to 0 in the other axis. This isclearly counter-intuitive and could very well lead tostrange behaviour in ﬁts. Combined with the factthat it is not conservative for all conﬁdence levels,the “invariant 1” statistic should probably not beused.It is worth stressing that the actual coverage be-haviour of the test statistics will depend on the ac-tual correlations present in the data sets. The ex-amples shown here were very simple, with constantcorrelation coeﬃcients between all data points. Fig-ure 9 shows the performance of the “invariant 3”test statistic for a data set with a more complicatedcorrelation structure. The data consists of N = 100data points with a covariance matrix of the form S =  c c . . . c N − c c . . . c N − c c . . . c N − ... ... ... . . . ... c N − c N − c N − · · ·  . (37)Here c is the maximum correlation as speciﬁed inthe plot labels, and c N − = − c /

2. All intermedi-ate c i are linearly equidistant, so c i − c i +1 = const .The correlations are thus stronger for “neighbour-ing” data points and there is some negative cor-relation as well. Since the total covariance neverreaches the limit of “perfect correlation”, the limit-ing exactness is not as eﬃcient in this data set andthe performance is closer to that of the “ﬁtted” teststatistic as shown in Figure 10. Since the “ﬁtted”test statistic is considerably easier to calculate, itmight be worth choosing it over an “invariant” one,depending on the reasonably expected correlationsand computation requirements.10 x )0.20.30.40.50.60.70.80.91.0 C D F Uncorr.Corr.50Corr.90Corr.99expected 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 7: CDFs (left) for the “invariant 3” test statistic for diﬀerent levels of correlations in the data. When using theuncorrelated CDF to calculate the assumed signiﬁcance level (or p-value) of a value of the statistic, the actual level will diﬀerfrom the assumption depending on the correlations (right). The behaviour of this statistic depends on the parameter α . For α = 1 it is identical to the “invariant 1” statistic, while for α → α = 2 /

5. Application to real neutrino cross-sectiondata

As a test, let us apply the “invariant 3” teststatistic to some real data and compare its resultswith the “naive” approach. We will use the double-diﬀerential, charged-current quasi-elastic cross sec-tion measurement of (anti-)muon neutrinos byMiniBooNE[1, 2]. This data is presented in theform of a set of diﬀerential cross sections with a“shape error” for each bin, plus a relative “normal-isation error” common to all bins. The publicationsprovide no information about the correlations be-tween the bins, or between the shape and the nor-malisation error.Model predictions for the measurements weregenerated with NUISANCE[3]. The generators andmodels considered in these studies are:1. GENIE[4] v3.00.06 tune G18 10a 02 11a2. GENIE v3.00.06 tune G18 10b 00 0003. NEUT[5] v5.4.14. NuWro[6] v19.02.25. GENIE v3.00.06 with SuSAv2[7]To make them comparable to the data, they aresplit into a shape and a normalisation part as de-scribed in [2]. The shape part is scaled to the totalcross-section of the data, so it can be compared di-rectly to the published data points: x data/MC,norm = (cid:88) i x data/MC i w i , (38) x MC,shape i = x MC i x MC,norm x data,norm , (39) with the 2D bin area w i , which is constant for allbins in this case. Figure 11 shows the compari-son of the data and the model predictions. Despitethe lack of covariance information and the impliedclaim that the uncertainties are independent, it isclear that the “shape errors” must be correlated insome way. Not only does the cross-section vary toosmoothly from bin to bin, but any variation of onlythe shape must yield a set of points with a constantsum, meaning the uncertainties of the points can-not be uncorrelated. Furthermore, it is reasonableto assume that there could be correlations betweenthe normalisation and the shape. Such correlationswould easily arise from scaling uncertainties thataﬀect some bins more than others.Given the data and the model predictions, it iseasy to calculate p-values with the “naive” and the“invariant 3” test statistics: p naive = 1 − F χ N (cid:0) naive (cid:0) ∆ | s data (cid:1)(cid:1) , (40) p invar = 1 − F χ (cid:0) invariant (cid:0) ∆ | s data , α = 2 / (cid:1)(cid:1) ,(41)with ∆ = x data − x MC , and the number of datapoints N . Both the vectors x and N explicitly in-clude the added “normalisation bin” when compar-ing both shape and normalisation.Table 1 shows the p-values resulting from thecomparisons using the two test statistics, with andwithout the normalisation bin. In most cases, thenaive chi-square statistic suggest a much strongerdisagreement between the data and the model thanthe invariant statistic. This is consistent with the11 / S / S / S / S / S / S / S / S Figure 8: Shape of conﬁdence regions of the diﬀerent statistis in two dimensions. The conﬁdence levels correspond to a 0 . σ , 1 σ ,2 σ , and 3 σ deviation of a one-dimensional normal distributed variable. The shown test statistics are: top left, ”naive” (solid)and ”invariant 3” (dashdotdot); top right, ”naive” (solid), ”ﬁtted” (dashdot), ”invariant 1” (dashed); bottom left, ”naive”(solid), ”invariant 1” (dashed), ”invariant 2” (dotted); bottom right, ”naive” (solid), ”invariant 1” (dashed), ”invariant 3( α = 2 / x )0.20.30.40.50.60.70.80.91.0 C D F Uncorr.Corr.50Corr.90Corr.99expected 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 9: CDFs (left) for the “invariant 3” test statistic for diﬀerent levels of correlations in data with a more complicatedcorrelation structure (see text). When using the uncorrelated CDF to calculate the signiﬁcance level (or p-value) of a value ofthe statistic, the actual level depends on the correlations (right). x )0.00.20.40.60.81.0 C D F Uncorr.Corr.50Corr.90Corr.99expected Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 Uncorr.Corr.50Corr.90Corr.99

Figure 10: CDFs (left) for the “ﬁtted” test statistic for diﬀerent levels of correlations in data with a more complicated correlationstructure (see text). When using the uncorrelated CDF to calculate the signiﬁcance level (or p-value) of a value of the statistic,the actual level depends on the correlations (right).Table 1: P-values from the comparison of models and MiniBooNE data.

Genie 10a Genie 10b Neut NuWro SuSAv2 ν Shape naive 1.89e-03 2.29e-03 5.24e-10 5.14e-06 2.19e-24& norm. invariant 2.76e-04 2.94e-03 1.82e-02 1.31e-01 2.48e-03Shape naive 3.50e-02 2.37e-02 5.66e-09 2.27e-05 3.40e-24only invariant 6.89e-02 1.40e-01 1.81e-02 1.30e-01 2.46e-03¯ ν Shape naive 6.09e-26 5.58e-25 8.67e-15 1.94e-26 5.12e-01& norm. invariant 5.12e-05 7.13e-05 5.61e-04 4.27e-05 8.46e-01Shape naive 9.92e-24 3.71e-23 2.82e-14 7.32e-26 4.96e-01only invariant 5.05e-05 7.05e-05 5.54e-04 4.22e-05 8.42e-0113

20 40 60 80 100 120 140Bin mumber0.00.51.01.52.02.53.03.54.0 d d T d c o s () / [ c m G e V ] p e r n e u t r o n

1e 38

Shape & norm comparison, nu mode

MiniBooNE totalMiniBooNE shape SuSAv2 ( p naive =2.19e-24, p invar =2.48e-03)NuWro ( p naive =5.14e-06, p invar =1.31e-01) Neut ( p naive =5.24e-10, p invar =1.82e-02)Genie_10b ( p naive =2.29e-03, p invar =2.94e-03) Genie_10a ( p naive =1.89e-03, p invar =2.76e-04) t o t / [ c m ] p e r n e u t r o n

1e 380 10 20 30 40 50 60 70 80Bin mumber0.00.51.01.52.0 d d T d c o s () / [ c m G e V ] p e r p r o t o n

1e 38

Shape & norm comparison, antinu mode

MiniBooNE totalMiniBooNE shape SuSAv2 ( p naive =5.12e-01, p invar =8.46e-01)NuWro ( p naive =1.94e-26, p invar =4.27e-05) Neut ( p naive =8.67e-15, p invar =5.61e-04)Genie_10b ( p naive =5.58e-25, p invar =7.13e-05) Genie_10a ( p naive =6.09e-26, p invar =5.12e-05) t o t / [ c m ] p e r p r o t o n

1e 39

Figure 11: Comparison of shape and normalisation between MiniBooNE data and several model predictions. The cosine of themuon angle increases bin-by-bin, the muon’s kinetic energy increases block-by-block. The error bars show the “shape error” ofthe data. The data point on the right shows the common “normalisation error”. The model predictions have been scaled tothe same total cross section as the data for the shape comparison, i.e. the sum over all bins is identical for all models and thedata. The points on the right show the actual total cross section of the model predictions. The p-values were calculated usingall bins and the normalisation. See Table 1 for shape-only p-values. Q ij = (cid:88) k,l J ki M ij J lj . (42)Here M is the original covariance matrix and J is the Jacobian matrix of the combined shape andnorm vector: J ij = (cid:40) δ ij x − − x i w j x − if 0 ≤ i < Nw j if i = N , (43) with the number of original cross-section bins N .The correlation matrices before and after the de-composition are shown in Figure 13. As intended,the decomposition reduces the amount of positivecorrelation in the data, but considerable positiveand negative correlations remain.Figure 14 shows the model comparisons with dif-ferent test statistics in this decomposed data. Fol-lowing MiniBooNE’s approach, the shape errors arescaled up by the total cross section in the data forplotting purposes. Again, the “invariant 3” teststatistic is consistently conservative compared tothe full Mahalanobis distance, while the “naive”test statistic is not. Note that for the calculationof the Mahalanobis distance, the pseudo-inverse ofthe covariance matrix was used. Since shape andnorm together have one additional dimension com-pared to the original cross section, their covariancematrix is not positive deﬁnite in general.Table 2 summarises the p-values from the diﬀer-ent model comparisons to the MINERvA data. The“full” p-values using the correlated Mahalanobisdistance in the shape and norm case are diﬀer-ent from the original case, despite the fact thatthey should contain the same information. Thisis caused by the diﬀerent parameterisations of theproblem space. Since the decomposition in shapeand norm is not a linear transformation, the likeli-hood surfaces described by the covariance matricescannot be identical. It is to be expected that thiswill lead to diﬀering p-values, especially in the low-probability tails. But even for the comparativelywell-ﬁtting models with p-values in the order of afew percent, the diﬀerence is surprisingly strong.The NuWro prediction of the antineutrino data hasa p-value of ∼

13% in the original paramterisation,but only 0 .

03% in the decomposed view. This wouldmake a huge diﬀerence in the interpretation of thedata with respect to this model! Note also, thatthe invariant test statistic in the decomposed caseis much closer to the original full p-values in thosecases.Since we have the covariance matrices, we cancreate toy data that is distributed according tothem and check for the coverage properties of thediﬀerent test statistics, just like we did in the previ-ous sections. The results of these studies is shown inFigure 15. The top plot shows the performance ofthe full, naive, and invariant statistics when appliedto the data distributed according to MINERvA’soriginal covariance matrix. The middle plots showthe same for data distributed according to the de-15

20 40 60 80 100 120 140Bin mumber01234 d dp T dp || / [ c m G e V c ] p e r nu c l e o n

1e 39

As-is comparison, nu mode

MINERvASuSAv2 ( p full =7.87e-37, p naive =4.00e-18, p invar =1.94e-06) NuWro ( p full =7.82e-22, p naive =3.24e-14, p invar =1.04e-06)Neut ( p full =1.90e-17, p naive =5.06e-08, p invar =1.72e-05) Genie_10b ( p full =1.02e-43, p naive =1.87e-41, p invar =4.31e-21)Genie_10a ( p full =8.61e-41, p naive =1.84e-47, p invar =1.53e-20) d dp T dp || / [ c m G e V c ] p e r nu c l e o n

1e 39

As-is comparison, antinu mode

MINERvASuSAv2 ( p full =5.02e-03, p naive =1.73e-01, p invar =4.79e-01) NuWro ( p full =9.45e-02, p naive =3.70e-04, p invar =2.47e-01)Neut ( p full =5.95e-03, p naive =1.37e-08, p invar =5.31e-02) Genie_10b ( p full =1.49e-02, p naive =3.56e-09, p invar =6.73e-02)Genie_10a ( p full =8.10e-03, p naive =3.77e-12, p invar =2.47e-02) Figure 12: Comparison between MINERvA data and several model predictions. The muon’s parallel momentum p || increasesbin-by-bin, its transverse momentum p T block-by-block. The error bars show the diagonal elements of the full covariance ofthe data.

20 40 60 80 100 120 140020406080100120140

Full correlation, nu mode

Full correlation, antinu mode

Shape + total correlation, nu mode

Shape + total correlation, antinu mode

Figure 13: Correlation matrix of the MINERvA experiment as reported (top) and after decomposition into shape and normal-isation bin (bottom) for both neutrino (left) and antineutrino (right) modes. The decomposition reduces the strong positivecorrelations in the data, but considerable positive and negative correlations remain.

20 40 60 80 100 120 140Bin mumber01234 d dp T dp || / [ c m G e V c ] p e r nu c l e o n

1e 39

Shape & norm comparison, nu mode

MINERvA totalMINERvA shapeSuSAv2 ( p full =6.86e-27, p naive =3.15e-23, p invar =2.61e-05) NuWro ( p full =2.09e-37, p naive =1.38e-32, p invar =3.29e-10)Neut ( p full =3.87e-26, p naive =7.58e-21, p invar =1.44e-07) Genie_10b ( p full =4.99e-58, p naive =2.28e-78, p invar =1.09e-26)Genie_10a ( p full =9.66e-62, p naive =9.89e-84, p invar =4.44e-28) t o t / [ c m ] p e r nu c l e o n

1e 390 10 20 30 40 50 60Bin mumber0.00.51.01.52.02.53.03.5 d dp T dp || / [ c m G e V c ] p e r nu c l e o n

1e 39

Shape & norm comparison, antinu mode

MINERvA totalMINERvA shapeSuSAv2 ( p full =7.07e-06, p naive =1.25e-04, p invar =3.76e-02) NuWro ( p full =1.41e-04, p naive =6.36e-03, p invar =2.19e-02)Neut ( p full =2.94e-09, p naive =2.33e-05, p invar =4.70e-04) Genie_10b ( p full =8.95e-08, p naive =5.54e-07, p invar =4.21e-02)Genie_10a ( p full =1.43e-09, p naive =5.08e-07, p invar =2.68e-04) t o t / [ c m ] p e r nu c l e o n

1e 39

Figure 14: Comparison between MINERvA data and several model predictions. The muon’s parallel momentum p || increasesbin-by-bin, its transverse momentum p T block-by-block. The error bars show the “shape error” of the data. The data pointon the right shows the common “normalisation error”. The model predictions have been scaled to the same total cross sectionas the data for the shape comparison, i.e. the sum over all bins weighted by the 2D bin widths is identical for all models andthe data. The points on the right show the actual total cross section of the model predictions. The p-values were calculatedusing all shape bins and the normalisation. See Table 2 for shape-only p-values.

6. Conclusions

It was shown that the use of the “naive” uncor-related Mahalanobis distance in the presence of un-known correlations leads to wrong coverage proper-ties. Alternative test statistics were presented thatperform more robust under varying degrees of cor-relation in the data.The “ﬁtted” test statistic is motivated by treat-ing the correlations as nuisance parameters. Its dis-tribution in the absence of correlations is known(the “Bee-square distribution”) and it is conserva-tive in their presence. It could be used when itis important to have an “actual” Mahalanobis dis-tance for the combination with other data sets, The usefulness of this is probably limited though, as it and the overestimation of the error in the presenceof correlations is acceptable. It also has the advan-tage of being incredibly easy to calculate, as it isjust the maximum squared z-score among the vari-ables. Depending on the actual correlations in thedata, its performance is actually comparable to the“invariant” test statistics.The “invariant 3” test statistic performs the bestacross varying levels of correlations. Its level of con-servativeness can be tuned with the shape param-eter α . A value of 0.5 seems to be a safe choice,while the best performance in the toy data sets inthe presented studies was achieved at α = 2 /

3. Asit goes to 0, the test statistic becomes equivalentto the “invariant 2” statistic, which can be seen asthe most conservative version of “invariant 3”. The“invariant 3” statistic could be used when it is im-portant to not over-estimate the uncertainties bytoo much. If necessary, the relative weight of thestatistic can be tuned by transforming it to a chi-square distribution with

N > N is somewhat arbitrary though.The application to real data from MiniBooNEand MINERvA shows that the “invariant 3” teststatistic performs as expected. It is consistentlyconservative, while the “naive” uncorrelated Maha-lanobis distance overestimates the strength of thediscrepancy between data and model in multiple in-stances. When the MINERvA data is decomposedinto a shape and a normalisation part, the invariantstatistic even has better coverage properties than afull Mahalanobis distance that uses a linearly prop-agated covariance matrix. This is most likely dueto the non-linear nature of the decomposition intoshape and norm, which distorts the shape of thedistribution. If it was multivariate Gaussian origi-nally, the covariance in the shape and norm spacecan only ever be an approximation. Since the in-variant statistic does not use information about thecorrelations between the data points, it is not aﬀectas much by this as the fully correlated Mahalanobisdistance. Thanks

I would like to thank Callum Wilkinson andStephen Dolan for generating the NUISANCE ﬁlesthat were used in the MiniBooNE and MINERvAstudies, as well as providing a space for discussions, will not be chi-square distributed in general. Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull 10 Assumed significance level 10 R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull10 Assumed significance level R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull 10 Assumed significance level R e a l s i g i n i f i c a n c e l e v e l

1 2 3 invariant 3naivefull

Figure 15: Assumed vs actual signiﬁcances with diﬀerent test statistics in toy data distributed according to the providedMINERvA covariance (top) and the covariance of the decomposition into shape and total cross section (middle) for both theneutrino (left) and antineutrino data (right). The bottom plots show the calculated and true signiﬁcance levels when generatingtoy data according to the original covariance and applying the decomposition afterwards. The nonlinear transformation of thedecomposition causes a signiﬁcant distortion in the distribution of the Mahalanobis distance, even when using the full covariancematrix. The invariant test statistic seems to be less sensitive to these distortions than the full and naive ones. able 2: P-values from the comparison of models and MINERvA data. Genie 10a Genie 10b Neut NuWro SuSAv2 ν As is full 8.61e-41 1.02e-43 1.90e-17 7.82e-22 7.87e-37naive 1.84e-47 1.87e-41 5.06e-08 3.24e-14 4.00e-18invariant 1.53e-20 4.31e-21 1.72e-05 1.04e-06 1.94e-06Shape full 9.66e-62 4.99e-58 3.87e-26 2.09e-37 6.86e-27& norm. naive 9.89e-84 2.28e-78 7.58e-21 1.38e-32 3.15e-23invariant 4.44e-28 1.09e-26 1.44e-07 3.29e-10 2.61e-05Shape full 9.37e-62 4.32e-58 3.11e-26 1.96e-37 7.36e-27only naive 8.78e-84 1.43e-78 6.60e-21 1.63e-32 2.54e-23invariant 4.41e-28 1.08e-26 1.43e-07 3.27e-10 2.60e-05¯ ν As is full 8.10e-03 1.49e-02 5.95e-03 9.45e-02 5.02e-03naive 3.77e-12 3.56e-09 1.37e-08 3.70e-04 1.73e-01invariant 2.47e-02 6.73e-02 5.31e-02 2.47e-01 4.79e-01Shape full 1.43e-09 8.95e-08 2.94e-09 1.41e-04 7.07e-06& norm. naive 5.08e-07 5.54e-07 2.33e-05 6.36e-03 1.25e-04invariant 2.68e-04 4.21e-02 4.70e-04 2.19e-02 3.76e-02Shape full 6.04e-08 2.71e-06 1.28e-08 2.78e-04 9.13e-06only naive 3.10e-06 2.19e-06 8.74e-05 1.12e-02 1.54e-04invariant 2.64e-04 4.14e-02 4.62e-04 2.15e-02 3.69e-02feedback, and venting. Thanks also goes to LouisLyons for providing feedback and clarifying discus-sions about the concepts discussed in this paper.This work was supported by a grant from the Sci-ence and Technology Facilities Council.

References [1] A. A. Aguilar-Arevalo, et al., First measurement of themuon neutrino charged current quasielastic double dif-ferential cross section, Physical Review D 81 (9) (may2010). arXiv:1002.2680 , doi:10.1103/physrevd.81.092005 .[2] A. A. Aguilar-Arevalo, et al., First measurement of themuon antineutrino double-diﬀerential charged-currentquasielastic cross section, Physical Review D 88 (3)(aug 2013). arXiv:1301.7067 , doi:10.1103/physrevd.88.032001 .[3] P. Stowell, et al., NUISANCE: a neutrino cross-sectiongenerator tuning and comparison framework, Journalof Instrumentation 12 (01) (2017) P01016–P01016 (jan2017). doi:10.1088/1748-0221/12/01/p01016 .[4] C. Andreopoulos, et al., The GENIE neutrino montecarlo generator, Nucl. Instrum. Meth. A 614 (2010)87–104 (2010). arXiv:0905.2517 , doi:10.1016/j.nima.2009.12.009 .[5] Y. Hayato, A neutrino interaction simulation programlibrary NEUT, Acta Phys. Polon. B40 (2009) 2477–2489(2009).[6] T. Golan, J. Sobczyk, J. ˙Zmuda, NuWro: the wroc(cid:32)lawmonte carlo generator of neutrino interactions, NuclearPhysics B - Proceedings Supplements 229-232 (2012) 499(aug 2012). doi:10.1016/j.nuclphysbps.2012.09.136 .[7] R. Gonz´alez-Jim´enez, G. D. Megias, M. B. Barbaro,J. A. Caballero, T. W. Donnelly, Extensions of super- scaling from relativistic mean ﬁeld theory: The SuSAv2model, Physical Review C 90 (3) (sep 2014). doi:10.1103/physrevc.90.035501 .[8] D. Ruterbories, et al., Measurement of quasielastic-likeneutrino scattering at (cid:104) E ν (cid:105) ∼ . arXiv:1811.02774 , doi:10.1103/physrevd.99.012004 .[9] C. E. Patrick, et al., Measurement of the muon anti-neutrino double-diﬀerential cross section for quasi-elasticscattering on hydrocarbon at e ν ∼ . arXiv:1801.01197 , doi:10.1103/PhysRevD.97.052002 . isting 1: Python implementation of the Bee-square distribution. import numpy a s np from s c i p y . s t a t s import r v c o n t i n u o u s from s c i p y . s p e c i a l import e r f , e r f i n v c l a s s Bee ( r v c o n t i n u o u s ) : def c d f ( s e l f , x , d f ) : return e r f ( x/np . s q r t ( 2 ) ) ∗ ∗ d f def p d f ( s e l f , x , d f ) :r e t = d f ∗ ( e r f ( x/np . s q r t ( 2 ) ) ) ∗ ∗ ( df − return r e t ∗ np . s q r t ( 2 / np . p i ) ∗ np . exp( − x ∗ ∗ def ppf ( s e l f , x , d f ) : return e r f i n v ( ( x ) ∗ ∗ ( 1 / d f ) ) ∗ np . s q r t ( 2 ) bee = Bee ( a=0) c l a s s Bee2 ( r v c o n t i n u o u s ) : def c d f ( s e l f , x , d f ) :b = np . s q r t ( x )r e t = bee . c d f ( b , d f ) return r e t def p d f ( s e l f , x , d f ) :r e t = d f ∗ ( e r f ( np . s q r t ( x / 2 ) ) ) ∗ ∗ ( df − return r e t / np . s q r t ( 2 ∗ np . p i ∗ x ) ∗ np . exp( − x / 2 ) def p p f ( s e l f , x , d f ) :b = bee . ppf ( x , d f ) return b ∗∗ bee2 = Bee2 ( a=0) 22 isting 2: Python implementation of the “invariant 3” test statistic. Data must be provided in z-scores, i.e. normalised by thevariance. Uses caching for improved performance and works with survival functions instead of CDFs for improved numericalaccuracy. @np . v e c t o r i z e@ l r u c a c h e ( 1 0 0 0 0 ) def yfrommax ( b , d f =2, a l p h a = 0 . 5 ) : ””” (1 − d i a g o n a l c o o r d i n a t e ) from (1 − max) o f a c c e p t e d r e g i o n ””” − b ) ∗∗ d f − ( ( y − b ) ∗∗ d f )/((1 − a l p h a+a l p h a ∗ y ) ∗ ∗ ( df − − y b e t a = 1 − a l p h aq = ( 1 . 0 − b ) ∗∗ d f − − def f ( y ) : return q − ( ( y − b ) ∗∗ d f ) / ( ( b e t a + a l p h a ∗ y ) ∗∗ ( dfm ) ) + y i f b < = 0 : return i f b > = 1 : return e l s e : return r o o t s c a l a r ( f , x0=b , x1=b ∗ def yfrommax ( b , d f =2, a l p h a = 0 . 5 ) : ””” B u f f e r and i n t e r p o l a t e v a l u e s t o s p e e d t h i n g s up . ””” s t e p = 0 . 0 0 0 1b = np . f l o o r ( b / s t e p , dtype= f l o a t ) ∗ s t e pb = b + s t e pd e l t a = ( b − b ) / s t e px = yfrommax ( b , d f=df , a l p h a=a l p h a )x = yfrommax ( b , d f=df , a l p h a=a l p h a ) return x + ( x − x ) ∗ d e l t a def i n v a r i a n t 3 ( x , a l p h a =0.5 , f a s t=F a l s e ) : ””” Return t e s t s t a t i s t i c g i v e n v e c t o r o f n o r m a l i s e d v a l u e s . ””” i f f a s t :s f = 1 − c h i 2 . c d f ( x ∗∗ e l s e :s f = c h i 2 . s f ( x ∗∗ a = np . min ( s f , a x i s = − max ( s f , a x i s = − − y f c = ( a l p h a ∗ ( a − b ) + b ) / ( 1 . 0 + a l p h a ∗ ( a − b ) )y = np . minimum ( y f c , yfm )y = np . maximum( y , 0 ) returnreturn