Chi-squared Test for Binned, Gaussian Samples
Nicholas R. Hutzler
Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125
E-mail: [email protected]
June 2019
Abstract.
We examine the χ² test for binned, Gaussian samples, including effects due to the fact that the experimentally available sample standard deviation and the unavailable true standard deviation have different statistical properties. For data formed by binning Gaussian samples with bin size n, we find that the expected value and standard deviation of the reduced χ² statistic is

  \frac{n-1}{n-3} \pm \frac{n-1}{n-3}\sqrt{\frac{n-2}{n-5}}\sqrt{\frac{2}{N-1}},  (1)

where N is the total number of binned values. This is strictly larger in both mean and standard deviation than the value of 1 ± (2/(N−1))^{1/2} reported in standard treatments, which ignore the distinction between true and sample standard deviation.
1. Introduction
Precision measurements of physical quantities typically require a very large number of individual measurements of the same quantity, often taken under varying conditions, such as drifting signal-to-noise or many experimental configurations with different signal sizes. For this reason, as well as for simplification of data analysis and reduction of computational requirements, the data are typically binned together such that measurements in the same bin were taken within a time during which the conditions were similar. In order to check whether the binning is susceptible to the varying conditions, as well as to search for unknown sources of noise, a χ² test [1, 2, 3] is commonly used. Regardless of whether or not it is an ideal choice of statistic for this case, it is fairly intuitive as a measure of whether the assigned error bars are correctly capturing the statistics of the data. However, some of the simplifying assumptions used to construct the standard χ² statistic can give results with a significant bias for large data sets. We discuss why the standard treatment underestimates both the mean and variance of the χ² statistic, and then determine the appropriate correction factors.
2. Chi-squared test for binned, Gaussian samples
Consider a quantity that is measured N_x ≫ 1 times, yielding values x_i without any assigned uncertainties. Say that the measurements are normally distributed with constant, true mean µ that is not known to the experimenter. We shall not assume that the data has a constant variance. Let us gather these data sequentially into groups G_j with n consecutive points each. Now compute the usual sample mean, standard deviation, and standard error of each group of points:

  y_j = \frac{1}{n}\sum_{x_i \in G_j} x_i, \qquad s_j = \sqrt{\frac{1}{n-1}\sum_{x_i \in G_j}(x_i - y_j)^2}, \qquad s_{y,j} = \frac{s_j}{\sqrt{n}}.  (2)

We have now binned our data into a smaller set of N = N_x/n ≫ 1 points with values y_j and uncertainties s_{y,j}. As a check to see whether the assigned uncertainties are correctly capturing the statistical fluctuations of the data, we can perform a χ² test as outlined in many standard texts [1, 2, 3]. We will test the hypothesis that the y_j are normally distributed about a constant ȳ (though this approach is easily extended to models with more degrees of freedom), and that the uncertainties correctly describe the statistical fluctuations of the data about the mean. The reduced χ² value of the data set is

  \chi^2_{red} = \frac{1}{N-1}\sum_{j=1}^{N}\left(\frac{y_j - \bar{y}}{\sigma_{y,j}}\right)^2 \equiv \frac{1}{N-1}\sum_{j=1}^{N}\chi_j^2,  (3)

where ȳ = (Σ_j y_j/s_{y,j}²)/(Σ_j 1/s_{y,j}²) is the weighted mean of the y data, and σ_{y,j} is the true (unknown) standard deviation of the points {x_i ∈ G_j}, which need not be constant over different values of j. If the fluctuations in the data are Gaussian in nature, and correctly accounted for by the uncertainties, then we have the usual result

  E[\chi^2_{red}] = 1, \qquad Std[\chi^2_{red}] = \sqrt{\frac{2}{N-1}}.  (4)

However, the experimenter does not know the true standard deviation, and therefore actually computes the statistic

  \tilde{\chi}^2_{red} = \frac{1}{N-1}\sum_{j=1}^{N}\left(\frac{y_j - \bar{y}}{s_{y,j}}\right)^2 \equiv \frac{1}{N-1}\sum_{j=1}^{N}\tilde{\chi}_j^2,  (5)

using s_{y,j} as an estimator for σ_{y,j}. We wish to find the statistical properties of this quantity, which we shall find differ from χ²_red.
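As a concrete illustration, the binned statistic of Eqs. (2), (3), and (5) can be sketched in Python with NumPy (a minimal sketch, not the author's code; the function name `reduced_chi2_tilde` is our own):

```python
import numpy as np

def reduced_chi2_tilde(x, n):
    """Bin x into consecutive groups of n points and compute the reduced
    chi^2 statistic of Eq. (5), using the sample standard errors s_{y,j}
    in place of the unknown true standard deviations sigma_{y,j}."""
    N = len(x) // n                                  # number of bins
    groups = x[:N * n].reshape(N, n)
    y = groups.mean(axis=1)                          # bin means, Eq. (2)
    s_y = groups.std(axis=1, ddof=1) / np.sqrt(n)    # standard errors, Eq. (2)
    w = 1.0 / s_y**2
    ybar = np.sum(w * y) / np.sum(w)                 # weighted mean
    return np.sum(((y - ybar) / s_y) ** 2) / (N - 1)

rng = np.random.default_rng(1)
val = reduced_chi2_tilde(rng.normal(size=200_000), n=10)
```

For Gaussian input with n = 10, `val` clusters near (n−1)/(n−3) = 9/7 ≈ 1.29 rather than 1, anticipating the result derived below.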
Intuitively, the sample standard deviation is computed from a finite number of measurements and therefore has some uncertainty associated with it, and that uncertainty should be propagated through when examining the χ̃²_red statistic. This is a well-known effect when estimating parameters from finite data sets and has been previously explored in a number of contexts, for example Poisson distributions, counting experiments, weighted means, and histogram fitting [4, 5, 6, 7, 8, 9, 10].

More specifically, while χ_j ∼ N(0, 1) is normally distributed, χ̃_j is not:

  \tilde{\chi}_j \equiv \frac{y_j - \bar{y}}{s_{y,j}} \approx \frac{y_j - \mu}{s_{y,j}} \sim t(n-1),  (6)

a Student's t-distribution with n − 1 degrees of freedom, which has heavier tails than a normal distribution. Notice that we are treating ȳ = µ as a constant, which is valid in the limit N ≫ 1, though for smaller N the statistical properties of the weighted mean cannot be ignored [9, 11, 12, 13, 14, 15]. In particular, the weighted mean also has correction factors due to the difference between true and sample standard deviation, and has a non-trivial variance, both of which will impact the χ̃²_red statistic. A good discussion of these complexities can be found in reference [15].

The square of χ̃_j is therefore distributed as χ̃²_j ∼ F(1, n−1), an F-distribution with (1, n−1) degrees of freedom, which has

  E[F(1, n-1)] = \frac{n-1}{n-3}, \qquad Var[F(1, n-1)] = 2\left(\frac{n-1}{n-3}\right)^2 \frac{n-2}{n-5}.  (7)

This is as opposed to the χ²_j statistic, which has (appropriately) a χ² distribution. χ̃²_red is therefore distributed as a sum of F-distributions, which is complicated [16]. However, the expectation value and variance are straightforward to calculate,

  E[\tilde{\chi}^2_{red}] = \frac{N}{N-1} E[\tilde{\chi}_j^2] = \frac{n-1}{n-3} + O(N^{-1}),  (8)

  Var[\tilde{\chi}^2_{red}] = \frac{N}{(N-1)^2} Var[\tilde{\chi}_j^2] = \frac{2}{N-1}\left(\frac{n-1}{n-3}\right)^2 \frac{n-2}{n-5} + O(N^{-2}).  (9)

This implies that the mean and standard deviation of the χ̃²_red statistic are larger than those of the χ²_red statistic by

  \frac{E[\tilde{\chi}^2_{red}]}{E[\chi^2_{red}]} = \frac{n-1}{n-3}, \qquad \frac{Std[\tilde{\chi}^2_{red}]}{Std[\chi^2_{red}]} = \frac{n-1}{n-3}\sqrt{\frac{n-2}{n-5}},  (10)

up to further corrections of order O(N⁻¹). A plot of these correction factors is shown in Figure 1. In the limit n → ∞ we recover the usual result, but for finite n we will always expect larger values for both mean and standard deviation. We can also see that choosing n ≤ 5 is problematic: for n ≤ 5 the variance of χ̃²_j does not exist, and for n ≤ 3 neither does its mean.
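The distributional claims in Eqs. (6) and (7) are easy to verify numerically. A quick check using scipy.stats (our own verification sketch, not from the paper):

```python
import numpy as np
from scipy import stats

n = 10  # bin size

# Eq. (7): moments of F(1, n-1), cross-checked against scipy's F distribution
mean_F, var_F = stats.f.stats(dfn=1, dfd=n - 1, moments="mv")
assert np.isclose(mean_F, (n - 1) / (n - 3))
assert np.isclose(var_F, 2 * ((n - 1) / (n - 3)) ** 2 * (n - 2) / (n - 5))

# Eq. (6): (y_j - mu)/s_{y,j} for Gaussian bins follows t(n-1), not N(0,1)
rng = np.random.default_rng(0)
bins = rng.normal(size=(50_000, n))       # true mean mu = 0
t_stat = bins.mean(axis=1) / (bins.std(axis=1, ddof=1) / np.sqrt(n))
# Kolmogorov-Smirnov comparison against the t(n-1) CDF
ks_result = stats.kstest(t_stat, stats.t(df=n - 1).cdf)
```

The empirical second moment of `t_stat` lands near (n−1)/(n−3) ≈ 1.29, the bias at the heart of Eq. (8).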
Figure 1. Correction factors to the mean and standard deviation of χ̃²_red.
3. Conclusion
In summary, we find that the standard χ² statistic computed from binning finite data sets underestimates the mean and variance for binned Gaussian samples, and we derive simple, closed-form expressions for the biases. For very large data sets with finite bin sizes, such as those commonly found in precision physics measurements, these corrections can be significant and should not be neglected.

Acknowledgments.
I would like to acknowledge helpful discussions with David Watson, and many helpful discussions with the ACME Collaboration, in particular David DeMille, John M. Doyle, and Brendon O'Leary.
Appendix: A simple example
We can see how the "usual" chi-squared statistic gives an incorrect result by performing a simple numerical test on some simulated data. Generate 1,000,000 points x_i ∼ N(0, 1), bin them in groups of n = 10, and then compute the means y_j, standard errors s_{y,j}, and the reduced chi-squared statistic χ̃²_red (as described in the main text) for the resulting 100,000 binned points.

Nx = 1000000;                        % Number of x values
nbin = 10;                           % Number of points to bin
for j = 1:(Nx/nbin)                  % Step over bins
    x = randn(1,nbin);               % Generate nbin normally distributed points
    y(j) = mean(x);                  % Means
    sigmayi(j) = std(x)/sqrt(nbin);  % Standard errors
end
ybar = sum(y./sigmayi.^2)/sum(1./sigmayi.^2)  % Weighted mean
chi = (y-ybar)./sigmayi;             % chi
chi2 = sum(chi.^2);                  % chi^2
dof = length(y)-1;                   % Degrees of freedom
redchi2 = chi2/dof                   % Reduced chi^2
redchi2sigma = sqrt(2/dof)           % "Usual" uncertainty of reduced chi^2
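For readers without MATLAB/Octave, the snippet above can be sketched line-for-line in Python with NumPy (our translation, not the author's code; variable names mirror the original):

```python
import numpy as np

Nx = 1_000_000                                   # number of x values
nbin = 10                                        # number of points to bin
rng = np.random.default_rng()
x = rng.normal(size=(Nx // nbin, nbin))          # all bins at once
y = x.mean(axis=1)                               # means
sigmayi = x.std(axis=1, ddof=1) / np.sqrt(nbin)  # standard errors
ybar = np.sum(y / sigmayi**2) / np.sum(1 / sigmayi**2)  # weighted mean
chi = (y - ybar) / sigmayi                       # chi
chi2 = np.sum(chi**2)                            # chi^2
dof = len(y) - 1                                 # degrees of freedom
redchi2 = chi2 / dof                             # reduced chi^2
redchi2sigma = np.sqrt(2 / dof)                  # "usual" uncertainty
print(redchi2, redchi2sigma)
```

Note `ddof=1` in `np.std`, matching MATLAB's default sample (n−1) normalization; typical runs give `redchi2` near (n−1)/(n−3) ≈ 1.29, far outside the naive 1 ± 0.0045 band.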
If we run this piece of code, we will find redchi2 = 1.2868 and redchi2sigma = 0.0045 (though of course the former will differ slightly from run to run due to the random nature of the calculation). This value differs considerably from the naive expectation of 1 ± 0.0045, but is consistent with the corrected expectation derived in the main text, (n−1)/(n−3) ± (n−1)/(n−3)√((n−2)/(n−5))√(2/(N−1)) ≈ 1.286 ± 0.007.

References

[1] Press W H, Teukolsky S A, Vetterling W T and Flannery B P 2007 Numerical Recipes
[2] Data Reduction and Error Analysis for the Physical Sciences
[3] An Introduction to Error Analysis
[4] Nucl. Instruments Methods Phys. Res.
[5] Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip.
[6] Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip.
[7] Astrophys. J.
[8] Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip.
[9] Metrologia
[10] Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip.
[11] A New Limit on the Electron Electric Dipole Moment Ph.D. thesis Harvard University
[12] Cochran W G 1937 Suppl. to J. R. Stat. Soc.
[13] Biometrics
[14] Biometrical J.
[15] Statistical Meta-Analysis with Applications (Wiley-Interscience)
[16] Morrison D F 1971 J. Am. Stat. Assoc. 66