The Crossing Statistic: Dealing with Unknown Errors in the Dispersion of Type Ia Supernovae

Abstract

We propose a new statistic that has been designed to be used in situations where the intrinsic dispersion of a data set is not well known: The Crossing Statistic. This statistic is in general less sensitive than $\chi^2$ to the intrinsic dispersion of the data, and hence allows us to make progress in distinguishing between different models using goodness of fit to the data even when the errors involved are poorly understood. The proposed statistic makes use of the shape and trends of a model's predictions in a quantifiable manner. It is applicable to a variety of circumstances, although we consider it to be especially well suited to the task of distinguishing between different cosmological models using type Ia supernovae. We show that this statistic can easily distinguish between different models in cases where the $\chi^2$ statistic fails. We also show that the last mode of the Crossing Statistic is identical to $\chi^2$, so that it can be considered as a generalization of $\chi^2$.



Arman Shafieloo

Department of Physics, University of Oxford, 1 Keble Road, Oxford, OX1 3NP, UK, and
Institute for the Early Universe, Ewha Womans University, Seoul, 120-750, South Korea
E-mail: [email protected]

Timothy Clifton

Astrophysics Department, University of Oxford, Denys Wilkinson Building, Oxford, OX1 3RH, UK
E-mail: [email protected]

Pedro Ferreira

Astrophysics Department, University of Oxford, Denys Wilkinson Building, Oxford, OX1 3RH, UK
E-mail: [email protected]

Keywords: Supernovae, dark energy, cosmological parameter estimation.

Contents

1. Introduction
2. Method and Analysis
3. Results
4. Conclusion

1. Introduction

The intrinsic dispersion of the data plays a crucial role in comparing theoretical models to observations. If, for some reason, we do not know this dispersion, then evaluating which model best fits a given set of data points can be particularly difficult. This is the problem we face in cosmology when we attempt to make inferences about cosmological models using type Ia supernovae (SN Ia).

SN Ia act to some degree like standardized candles, and are widely used in cosmology to probe the expansion history of the Universe, and hence to investigate the properties of dark energy. Indeed, it is from observations of SN Ia that the first direct evidence for an accelerating universe was found [1], and although this result has far reaching physical consequences, a complete understanding of the physics of SN Ia is still lacking. This lack of understanding is manifest in the largely unaccounted for intrinsic dispersion of SN Ia, which affects almost any subsequent statistical analysis that one attempts to perform [2]. Given that the intrinsic dispersion of SN Ia, $\sigma_{\rm int}$, typically constitutes a large fraction of the total error on a data point, $\sigma_i$, this is a serious problem.

One procedure that is often used to find the a priori unknown intrinsic dispersion is to look for the value of $\sigma_{\rm int}$ that gives a reduced $\chi^2$ of 1 for a particular model, and then to use this value to determine the likelihood of the data given that model. Such an approach does indeed allow one to distinguish between different models using the likelihood function, but at the expense of losing much of the original concept of 'goodness of fit' (which is the essence of a $\chi^2$ analysis). Rather than directly answering the question of which model actually fits the data best, we are then left with answering the question of which model can be made to give an ideal fit to the data by adding the smallest possible error bars. This gives us no direct information about which model best fits the data, as the error bars have been adjusted by hand so that they all fit perfectly. Furthermore, by treating error bars in this way it becomes very difficult to detect any features that may be present in the data.

If we want to determine the goodness of fit of different models to the data, we must therefore take a different approach. Standard statistics, such as $\chi^2$, however, are only reliable when the assumed parameterization of the model is correct, and when the errors on the data are properly estimated. Given that the true nature of dark energy is still not known, and that we have no reliable theoretical derivation of $\sigma_{\rm int}$, the application of $\chi^2$ statistics to the SN Ia data is not at all straightforward. These problems persist even when using non-parametric or model independent approaches [3]. There have been extensive discussions in the literature on using supernovae data for the purposes of model selection [4], and a number of problems have been identified with using statistical methods in inappropriate ways [5].

To address these difficulties we propose a new statistic, which we call the Crossing Statistic. This statistic is significantly less sensitive than $\chi^2$ to uncertainties in the intrinsic dispersion, and can therefore be used more easily to check the consistency between a given model and a data set with largely unknown errors. The Crossing Statistic does not compare two models directly, but rather determines the probability of getting the observed data given a particular theoretical model. It works with the data directly, and makes use of the shape and trends in a model's predictions when comparing it with the data.

In the following we will discuss the concept of goodness of fit and show how the $\chi^2$ statistic is sensitive to the size of unknown errors, as well as how it fails to distinguish between different cosmological models when errors are not prescribed in a definite manner.
We will then introduce the Crossing Statistic and show how it can be used to distinguish between different cosmological models when the standard $\chi^2$ analysis fails to do so. For simplicity, we will restrict ourselves to four theoretical models: (i) a best fit flat ΛCDM model, (ii) a smooth Lemaître-Tolman-Bondi void model with simultaneous big bang [6], (iii) a flat ΛCDM model with $\Omega_{0m} = 0.22$, and (iv) the Milne open universe. [...] $\chi^2$ we then show that the Crossing Statistic is relatively insensitive to the unknown intrinsic error, as well as being more reliable in distinguishing between different cosmological models. In a companion paper, we will test a number of other dark energy models using this statistic.

2. Method and Analysis

First let us consider the $\chi^2$ statistic. For a given data set ($\mu^e_i$, $i = 1 \cdots N$) we have that $\chi^2$ is given by

$$\chi^2 = \sum_{i}^{N} \frac{(\mu^t_i - \mu^e_i)^2}{\sigma_i^2}, \tag{2.1}$$

where $\mu^t_i$ is the prediction of the model that we are comparing the data set to, and $\sigma_i^2$ are the corresponding variances ($\sigma$ has units of magnitudes throughout). If the data points are uncorrelated and have a Gaussian distribution around the distribution mean, then we have a $\chi^2$ distribution with $N - N_P$ degrees of freedom (where $N_P$ is the number of parameters in the theoretical model).

Now let us calculate the $\chi^2$ goodness of fit for two of our cosmological models: a flat best fit ΛCDM model, and a Milne universe. Let us also assume an additional intrinsic error, $\sigma_{\rm int}$, on top of the error prescribed in the Constitution data set, $\sigma_{i\,({\rm data})}$, so that the total error is $\sigma_i^2 = \sigma_{i\,({\rm data})}^2 + \sigma_{\rm int}^2$. This will allow us to check how sensitive our analysis is to coherent changes in the size of error bars. In Fig. 1 we plot the $\chi^2$ goodness of fit for our two theoretical models as a function of $\sigma_{\rm int}$. It can be seen that these two models cannot be easily distinguished from each other using $\chi^2$ alone, unless the additional intrinsic error is already known.
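
The dependence of Eq. (2.1) on the added intrinsic error can be illustrated with a minimal sketch. The function name `chi2`, the toy linear "distance modulus" and the 0.15 mag scatter are assumptions made here for illustration; they are not the Constitution data or the authors' code.

```python
import random


def chi2(mu_data, mu_model, sigma_data, sigma_int=0.0):
    """Eq. (2.1) with sigma_i^2 = sigma_i(data)^2 + sigma_int^2."""
    return sum(
        (m - d) ** 2 / (s ** 2 + sigma_int ** 2)
        for d, m, s in zip(mu_data, mu_model, sigma_data)
    )


# Toy inputs (hypothetical, not the Constitution sample): a linear
# "distance modulus" with Gaussian scatter of 0.15 mag per point.
rng = random.Random(0)
n_points = 200
mu_model = [40.0 + 0.01 * i for i in range(n_points)]
sigma_data = [0.15] * n_points
mu_data = [m + rng.gauss(0.0, s) for m, s in zip(mu_model, sigma_data)]

# chi^2 falls monotonically as the assumed intrinsic error grows.
sigma_int_grid = [0.0, 0.05, 0.1]
chi2_values = [chi2(mu_data, mu_model, sigma_data, s) for s in sigma_int_grid]
for s, value in zip(sigma_int_grid, chi2_values):
    print(f"sigma_int = {s:.2f}  chi2 = {value:.1f}")
```

Because every term in the sum shrinks as $\sigma_{\rm int}$ grows, any model's $\chi^2$ can be lowered by inflating the error bars, which is why the goodness of fit alone cannot separate models when $\sigma_{\rm int}$ is unknown.
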
We also note that the $\chi^2$ goodness of fit for the standard flat ΛCDM model, given the Constitution data without any additional intrinsic errors, is less than 0.6% ($\chi^2 = 465.$[...]

[Figure 1 plot: goodness of fit (probability) versus $\sigma_{\rm int}$; curves labelled "FLCDM: best fit" and "Milne Open Universe".]

Figure 1: The $\chi^2$ goodness of fit of the Constitution supernova data [7] to a flat ΛCDM model (red line) and a Milne universe (blue line), assuming additional intrinsic errors added quadratically to the errors specified in the data set. The $\chi^2$ goodness of fit for these two models can be seen to be comparable for different values of additional intrinsic error, making them difficult to distinguish without any knowledge of the value of $\sigma_{\rm int}$.

[...] $\mu_i(z_i)$ (e.g. the Constitution data set [7]). As in [12], we use the $\chi^2$ statistic to find the best fit form of the assumed model, and from this we then construct the error normalized difference of the data from the best fit distance modulus $\bar{\mu}(z)$:

$$q_i(z_i) = \frac{\mu_i(z_i) - \bar{\mu}(z_i)}{\sigma_i(z_i)}. \tag{2.2}$$

Let us now consider the one-point Crossing Statistic, which tests for a model and a data set that cross each other at only one point. We must first try to find this crossing point, which we label by $n_1^{\rm CI}$ and $z_1^{\rm CI}$. To achieve this we define

$$T(n_1) = Q_1(n_1)^2 + Q_2(n_1)^2, \tag{2.3}$$

where $Q_1(n_1)$ and $Q_2(n_1)$ are given by

$$Q_1(n_1) = \sum_{i=1}^{n_1} q_i(z_i), \qquad Q_2(n_1) = \sum_{i=n_1+1}^{N} q_i(z_i), \tag{2.4}$$

and where $N$ is the total number of data points. If $n_1$ is allowed to take any value from 1 to $N$ (when the data is sorted by redshift) then we can maximize $T(n_1)$ by varying with respect to $n_1^{\rm CI}$. We then write the maximized value of $T(n_1)$ as $T_I$. Finally, we can use Monte Carlo simulations to find how often we should expect to obtain a $T_I^{\rm MC}$ larger than or equal to the value derived from the observed data, $T_I^{\rm data}$. This information can then be used to estimate the probability that the particular data set we have in our possession should be realized from the cosmological model we have been considering.

[Figure 2 plot: two panels of $\mu(z)$ versus $z$, each comparing an "Actual Model" curve with a "Proposed Model" curve.]

Figure 2: An idealized schematic plot of one crossing (left panel) and two crossings (right panel) between a proposed theoretical model and the actual model of the Universe when comparing magnitudes as a function of redshift, $\mu(z)$. In reality the actual Universe is observed in the form of data with error bars, of course.

In our analysis, the process of estimating the distribution of $T_I^{\rm MC}$ using Monte Carlo simulations is done in a model independent way as follows. Firstly, a number of different data sets are generated from a single fiducial model, which we take to represent the 'true' model of the Universe. The residuals of the fake data are then calculated by subtracting the mean values of the same fiducial model, from which we can then determine $T_I^{\rm MC}$. As such, it follows that $T_I^{\rm MC}$ does not depend on the background model (which is subtracted away from the generated data to find the residual), but rather on the dispersion about the fiducial model that we have chosen to adopt.
This dispersion is taken to correspond to the errors on the observational data, and so is itself model independent (up to the extent that observers make assumptions about the background cosmology when specifying their value).

Now, before applying the Crossing Statistic to real data, let us first consider how it fares when applied to simulations. For this we create 1000 realizations of data similar to the Constitution supernova sample based on a fiducial flat ΛCDM model with $\Omega_{0m} = 0.27$. We then test two different models using the same fake data sets. The first of these is the fiducial model itself (the 'correct model'), and the second is a flat ΛCDM model with $\Omega_{0m} = 0.22$ (the 'incorrect model'). These two models are intentionally chosen to be similar to each other in order to explicitly show the effectiveness of the Crossing Statistic at distinguishing between different models. Next, we add an extra intrinsic dispersion of $\sigma_{\rm int} = 0.05$ to the data and test the two models again. This is done to simulate the more realistic situation in which the precise value of $\sigma$ is unknown. Using the simulated data, and applying our statistic, we then test how often the simulated data is sufficient to rule out each of the two models at the 99% confidence level (CL). These results are displayed in Table 1, along with the result of using $\chi^2$ alone.

(Footnote: We call our statistic $T_I + \chi^2$ in Table 1, as we minimize for $\chi^2$ first, by adjusting the nuisance parameters, before calculating $T_I$.)

| Model | $T_I + \chi^2$ ($\sigma_{\rm int} = 0$) | $\chi^2$ ($\sigma_{\rm int} = 0$) | $T_I + \chi^2$ ($\sigma_{\rm int} = 0.05$) | $\chi^2$ ($\sigma_{\rm int} = 0.05$) |
|---|---|---|---|---|
| Correct Model (ΛCDM with $\Omega_{0m} = 0.27$) | 1% | 1% | 0.5% | 0% |
| Incorrect Model (ΛCDM with $\Omega_{0m} = 0.22$) | 28.5% | 1.9% | 26.4% | 0% |

Table 1: A comparison of the $\chi^2$ and $T_I$ statistics using data simulated from a ΛCDM model with $\Omega_{0m} = 0.27$. Percentages show the fraction of simulations in which the model in question is ruled out at the 99% confidence level.
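
The one-point Crossing Statistic of Eqs. (2.3) and (2.4), together with a Monte Carlo estimate of its P-value, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the simulated normalized residuals are drawn directly as unit Gaussians (equivalent to generating data from a fiducial model, subtracting that same model and dividing by $\sigma_i$), and the preliminary $\chi^2$ minimization over nuisance parameters is skipped.

```python
import random
from itertools import accumulate


def one_point_crossing(q):
    """Return (T_I, n_1) maximizing Q_1(n_1)^2 + Q_2(n_1)^2 for n_1 = 1..N.

    `q` holds the normalized residuals of Eq. (2.2), sorted by redshift.
    """
    total = sum(q)
    best_t, best_n = -1.0, 0
    for n1, q1 in enumerate(accumulate(q), start=1):
        q2 = total - q1  # sum of the residuals after position n1
        t = q1 * q1 + q2 * q2
        if t > best_t:
            best_t, best_n = t, n1
    return best_t, best_n


def mc_p_value(t_data, n_points, n_sims=1000, seed=0):
    """Fraction of simulated samples with T_I^MC >= t_data."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Assumption: residuals about the fiducial model are unit Gaussians.
        q_fake = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        t_mc, _ = one_point_crossing(q_fake)
        if t_mc >= t_data:
            hits += 1
    return hits / n_sims


# Toy residuals (hypothetical): too high at low redshift, too low at high
# redshift, so the model and the data cross once in the middle.
q_demo = [0.5] * 50 + [-0.5] * 50
t_demo, n_demo = one_point_crossing(q_demo)
p_demo = mc_p_value(t_demo, len(q_demo))
print(t_demo, n_demo, p_demo)
```

In this toy sample $\sum_i q_i^2 = 25$ for 100 points, so a $\chi^2$ test would raise no objection, while the coherent sign change on either side of the crossing produces a large $T_I$.
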

It can be seen that with $\sigma_{\rm int} = 0$ the $\chi^2$ and $T_I + \chi^2$ statistics both rule out the correct model at 99% CL in 1% of the cases, as should be expected to happen from their definitions. The incorrect model, however, is ruled out by the $\chi^2$ statistic at 99% CL in less than 2% of the cases only, while the $T_I + \chi^2$ statistic rules it out at 99% CL about 28.5% of the time. This is a significant improvement in distinguishing different models by using a more sophisticated statistic that is extracting more information from the data. Also, when $\sigma_{\rm int} = 0.05$ we can see that $\chi^2 + T_I$ is still sensitive to the incorrect model, picking it up and ruling it out at 99% CL in about 26.4% of cases. This is not true of $\chi^2$, and clearly demonstrates that $T_I$ is much less sensitive to the unknown value of $\sigma_{\rm int}$ than $\chi^2$, while being better at distinguishing the correct model from the incorrect one. In fact, even if we over-estimate the size of the error-bars, $T_I$ still performs well, and frequently picks out the incorrect model with high confidence.

To elaborate further on why $\chi^2$ is often not sensitive to using the incorrect model, while $\chi^2 + T_I$ is, let us consider the distribution of residuals with redshift. This is shown in Fig. 3 for a single random realization of data generated from a flat ΛCDM model with $\Omega_{0m} = 0.27$, and using a test ΛCDM model that is also flat with $\Omega_{0m} = 0.22$. The distribution of the fake data points is similar to that of the Constitution sample and the data has no extra intrinsic dispersion. The green horizontal dashed line in Fig. 3 is the zero line about which the normalized residuals should fluctuate, when the model being tested and the actual model are the same. The blue dotted vertical line in the right-hand plot represents the redshift at which $T(n_1)$ is maximized, $z_1^{\rm CI}$. The derived values of $Q_1(n_1^{\rm CI})$ and $Q_2(n_1^{\rm CI})$ on either side of the blue line are also displayed. For this particular realization of the data the derived $\chi^2$ for the test model with $\Omega_{0m} = 0.22$ is 375.72, which represents a very good $\chi^2$ fit to the data considering the number of data points is 557. The corresponding P-value derived from Monte Carlo simulations is more than 50%. However, the derived value of $T_I$ is 3102.13, which comparing with the Monte Carlo realizations results in a P-value of less than 0.[...] = 0.[...] $\chi^2 + T_I$ statistic, at the level of $3\sigma$.

(Footnote: The P-value is defined as the probability that, given the null hypothesis, the value of the statistic is larger than the one observed. We remark that in defining this statistic one has to be cautious about a posteriori interpretations of the data. That is, a particular feature observed in the real data may be very unlikely (and lead to a low P-value), but the probability of observing some feature may be quite large; see the discussion in [14].)

[Figure 3 plot: left panel, $\mu_{\rm data}(z) - \mu_{\rm model}(z)$ versus $z$; right panel, $q(z)$ versus $z$, annotated with $Q_1(n_1^{\rm CI}) = 37.99$, $Q_2(n_1^{\rm CI}) = -40.72$ and $z_1^{\rm CI} = 0.54$.]

Figure 3: Residuals with error bars taken from the data (left panel) and the error normalized difference, $q(z)$, (right panel), for a single random realization of supernova data similar to the Constitution sample. The simulated data here is based on the fiducial model of flat ΛCDM with $\Omega_{0m} = 0.27$, and the test model is flat ΛCDM with $\Omega_{0m} = 0.22$. We assumed here that there is no extra intrinsic dispersion. The crossing point occurs at $z_1^{\rm CI}$, and is shown by the vertical blue dashed line. Derived values of $Q_1(n_1^{\rm CI})$ and $Q_2(n_1^{\rm CI})$ are also displayed. In the right-hand panel one can see the unbalanced distribution of points around the green zero line (on the right side there are more points below the line, while on the left there are more points above it).

[Figure 4 plot: histograms of counts versus $q(z)$ for the complete sample and for the sub-samples with $z < z_1^{\rm CI}$ and $z > z_1^{\rm CI}$, where $z_1^{\rm CI} = 0.54$.]

Figure 4: The distribution of error normalized difference at different redshifts, as studied in Fig. 3. While the overall distribution has a reasonable Gaussian shape, the normalized residuals at $z < 0.54$ have a clear shift to the right while those at $z > 0.54$ have a clear shift to the left.

This approach can be extended to models with more than one crossing point by the two-point Crossing Statistic. In this case we assume that the model and the data cross each other at two points and, as above, we try to find the two crossing points and their redshifts, which we now label $n_1^{\rm CII}$, $z_1^{\rm CII}$ and $n_2^{\rm CII}$, $z_2^{\rm CII}$. This is achieved by defining

$$T(n_1, n_2) = Q_1(n_1, n_2)^2 + Q_2(n_1, n_2)^2 + Q_3(n_1, n_2)^2, \tag{2.5}$$

where the $Q_i(n_1, n_2)$ are now given by

$$Q_1(n_1, n_2) = \sum_{i=1}^{n_1} q_i(z_i), \qquad Q_2(n_1, n_2) = \sum_{i=n_1+1}^{n_2} q_i(z_i), \qquad Q_3(n_1, n_2) = \sum_{i=n_2+1}^{N} q_i(z_i). \tag{2.6}$$

We can then maximize $T(n_1, n_2)$ by varying with respect to $n_1$ and $n_2$, to get $T_{II}$. Comparing $T_{II}$ with the results from Monte Carlo realizations then allows us to determine how often we should expect a two-point crossing statistic that is greater than or equal to the $T_{II}$ obtained from real data. The three-point Crossing Statistic, and higher statistics, can be defined in a similar manner. This can continue up to the $N$-point Crossing Statistic which is, in fact, identical to $\chi^2$. We also note that the zero-point Crossing Statistic, $T_0 = \left(\sum_{i}^{N} q_i\right)^2$, is very similar to the Median Statistic developed by Gott et al. [13]. The Crossing Statistic can therefore be thought of as generalizing both the $\chi^2$ and Median Statistics, which it approaches in different limits.

We can also look at the Crossing Statistic from another perspective: in terms of the Gaussianity of a sample about its mean. If an assumed model is indeed the correct one to describe Gaussian distributed data, then the histogram of the normalized residuals should also have a Gaussian distribution, with zero mean and a standard deviation of 1 [15]. To test Gaussianity in this context one can use a variety of different methods, including, for example, the Kolmogorov-Smirnov test [16]. If the histogram instead exhibits significant deviation from a Gaussian distribution, however, then this can be used to rule out the assumed model. The Crossing Statistic pushes this well known idea from statistical analysis a step further by pointing to the fact that not only should the whole sample of residuals have a Gaussian distribution around the mean, but so should any continuous sub-sample. In our case, these sub-samples should be taken to be those residuals within certain redshift ranges, as discussed above. The importance of our new statement can be realized if we look at Fig. 4 for the $T_I$ statistic. While the overall histogram of the normalized residuals may have a Gaussian distribution, this does not mean that the distributions of residuals for the data on either side of the crossing are also Gaussian distributed. It may be the case that the normalized residuals to the left of the crossing point (in redshift range) contribute more to one side of the histogram than the other, and the residuals from the other side of the crossing point do the opposite. In essence, this is what the $T_I$ statistic estimates and tests. In the case of $T_I$, in fact, we divide the sample into two sub-samples in all possible ways and we test the Gaussianity for all of them. As can be seen in Fig. 4, while the overall distribution seems to have a reasonable Gaussian shape, the histogram of the normalized residuals at $z < 0.54$ has a clear shift to the right, while those at $z > 0.54$ are shifted to the left. In our analysis, deviation from Gaussianity with zero mean is calculated by derivation of $Q_1(z)$ and $Q_2(z)$ which are, in fact, the areas under the histograms on the two sides of the zero mean. This is a simple but robust way to test the hypothesis above.
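
The two-point statistic of Eqs. (2.5) and (2.6) can be sketched with prefix sums, so that each candidate pair of crossing positions costs constant time. The function name is hypothetical, and the index range $1 \le n_1 \le n_2 \le N$ (which allows empty segments) is an assumption made here, since the text does not spell out the allowed range.

```python
from itertools import accumulate


def two_point_crossing(q):
    """Return (T_II, n_1, n_2) maximizing Q_1^2 + Q_2^2 + Q_3^2.

    `q` holds the normalized residuals of Eq. (2.2), sorted by redshift.
    Assumption: 1 <= n_1 <= n_2 <= N, so segments may be empty.
    """
    n = len(q)
    prefix = [0.0] + list(accumulate(q))  # prefix[k] = q_1 + ... + q_k
    best = (-1.0, 0, 0)
    for n1 in range(1, n + 1):
        for n2 in range(n1, n + 1):
            q1 = prefix[n1]               # residuals 1 .. n1
            q2 = prefix[n2] - prefix[n1]  # residuals n1+1 .. n2
            q3 = prefix[n] - prefix[n2]   # residuals n2+1 .. N
            t = q1 * q1 + q2 * q2 + q3 * q3
            if t > best[0]:
                best = (t, n1, n2)
    return best


# Toy residuals (hypothetical) that change sign twice.
q_demo = [1.0] * 3 + [-1.0] * 3 + [1.0] * 3
print(two_point_crossing(q_demo))
```

The search is quadratic in $N$, which is inexpensive for a few hundred supernovae. Extending the same pattern to $N$ segments of one point each reduces the statistic to $\sum_i q_i^2$, which is the $\chi^2$ limit noted above.
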

3. Results

Now let us apply our Crossing Statistics to a suite of different models. We will calculate $\chi^2$, $T_I$, $T_{II}$ and $T_{III}$ for (i) the best fit flat ΛCDM model (with $\Omega_{0m} = 0.$[...] $\sigma_{\rm int} = 0$), (ii) a best fit asymptotically flat void model with $\Omega_{0m} = 0.28$ at the centre, and with FWHM at $z = 0.66$ when $\sigma_{\rm int} = 0$, (iii) a flat ΛCDM model with $\Omega_{0m} = 0.22$, and (iv) the Milne open universe. We use the Constitution data set [7], and vary the additional intrinsic error, $\sigma_{\rm int}$, between 0 and 0.[...] $\sigma_{\rm int}$. This is done in a completely model independent manner.

(Footnote: The void model uses the Lemaître-Tolman-Bondi solution of general relativity [17] to model an under-density formed due to a Gaussian fluctuation in the spatial curvature parameter, $k$. For an observer at the centre the effect of the resulting inhomogeneity is to create a universe that looks like it is accelerating, without any actual acceleration taking place.)

It can be seen from Fig. 5 that the $\chi^2$ statistic (upper panel) cannot easily be used to distinguish between the different models with a high degree of confidence, especially if we do not know $\sigma_{\rm int}$. Indeed, if we add $\sigma_{\rm int} = 0.$[...] $\sigma_{\rm int} = 0$ all four models are outside of the 99% confidence level. This illustrates the ineffectiveness of $\chi^2$ as a statistic for determining the goodness of fit when the errors on the data are not well known.

[Figure 5 plot: four stacked panels showing $\chi^2$, $T_I$, $T_{II}$ and $T_{III}$ versus $\sigma_{\rm int}$ (0 to 0.1), with curves labelled "FLCDM: Best Fit", "Void Model", "Milne Open Universe" and "FLCDM; $\Omega_{0m} = 0.22$", and 60%, 95% and 99% CL regions.]

Figure 5: The $\chi^2$, $T_I$, $T_{II}$ and $T_{III}$ statistics for a best fit flat ΛCDM model (red lines), a void model (blue dashed lines), the Milne universe (green dashed lines) and a flat ΛCDM model with $\Omega_{0m} = 0.22$ (pink dotted lines). The analyses are performed using the Constitution supernova data [7], and by assuming various different additional intrinsic errors. The confidence limits from 1000 Monte Carlo realizations of the error-bars are derived in a completely model independent manner. It can be seen that the $\chi^2$ statistic fails to distinguish between these models with any degree of significance, and that by assuming additional intrinsic errors this statistic allows all models to be made consistent with the data. The $T_I$ crossing statistic, on the other hand, rules out the Milne universe to more than $5\sigma$, and also the flat ΛCDM model with $\Omega_{0m} = 0.22$ to nearly $3\sigma$, even when the amount of additional intrinsic error is large.

The results for the one-point Crossing Statistic are shown in the second panel from the top in Fig. 5. In terms of this statistic it can be seen that the best fit flat ΛCDM model and the best fit void model are now very much consistent with the data, even with no additional intrinsic error. At the same time, it is also clear that the Milne universe lies well outside the 99% confidence level and the flat ΛCDM model with $\Omega_{0m} = 0.22$ lies well outside the 95% confidence level, even when $\sigma_{\rm int}$ is large. In the third and fourth panels in Fig. 5 we see the results for the two-point and three-point Crossing Statistics, respectively. The Milne universe remains outside the 99% confidence level in each of these, for the range of $\sigma_{\rm int}$ considered, while the flat ΛCDM model with $\Omega_{0m} = 0.22$ now lies mostly within the 60-99% confidence region.

This difference in probability of the different Crossing Statistics for the ΛCDM model with $\Omega_{0m} = 0.22$ is due to this model having only one 'crossing' with the data. Adding extra hypothetical crossings then has little effect on $T_i$, as the extra crossing points all cluster around the same $z$. A model that fits the data better, with many crossings, however, should be expected to have $T_i$ statistics that increase with $i$. On this basis, one can then argue that for a model to be considered consistent with the data it must show consistency across all crossing modes. The point here is that if there is a significant crossing of the data and the model, then it should show up in the Crossing Statistics as a failure of $T_i$ to decrease sufficiently with decreasing $i$. A flat ΛCDM model with $\Omega_{0m} = 0.22$ is therefore considered non-viable at close to $3\sigma$ because of the discrepancy in $T_I$, even though $T_{II}$ and $T_{III}$ show some degree of consistency.

One should notice that $\Delta\chi^2$ with respect to the best fitting point in the parameter space can be used in deriving the confidence only in cases where we know the correct underlying theoretical model. If we assume an incorrect theoretical model there will still be a best fit $\chi^2$ point in the parameter space, and we can still define $1\sigma$, $2\sigma$ or $n\sigma$ confidence limits, but this then has little or nothing to do with goodness of fit or whether the assumed model is correct or not. While we do not know the size of the error bars, adjusting $\sigma_{\rm int}$ can also help an incorrect model to achieve a reduced $\chi^2$ of one (or close to one) for the best fitting point in its parameter space. On the other hand, while defining the confidence limits depends on $\Delta\chi^2$ and the degrees of freedom in the assumed parametric model, the $\Delta\chi^2$ between two models (or even two points in the parameter space of one model) changes with changing $\sigma_{\rm int}$ while the degrees of freedom of the assumed models are fixed.
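
The remark that an incorrect model can be brought to a reduced $\chi^2$ of one by adjusting $\sigma_{\rm int}$ can be made concrete with a small root-finding sketch. The function names, the bisection bracket `hi` and the toy residuals are assumptions for illustration only; residuals here mean $\mu_i - \bar{\mu}(z_i)$ in magnitudes.

```python
def reduced_chi2(residuals, sigma_data, sigma_int, n_params=0):
    """chi^2 / (N - N_P) with sigma_i^2 = sigma_i(data)^2 + sigma_int^2."""
    dof = len(residuals) - n_params
    chi2 = sum(
        r * r / (s * s + sigma_int * sigma_int)
        for r, s in zip(residuals, sigma_data)
    )
    return chi2 / dof


def sigma_int_for_unit_reduced_chi2(residuals, sigma_data, n_params=0,
                                    hi=10.0, tol=1e-10):
    """Bisect for the sigma_int that gives a reduced chi^2 of one.

    Returns 0.0 when the fit already has reduced chi^2 <= 1.
    Assumption: `hi` is large enough that reduced chi^2 < 1 there.
    """
    if reduced_chi2(residuals, sigma_data, 0.0, n_params) <= 1.0:
        return 0.0
    if reduced_chi2(residuals, sigma_data, hi, n_params) > 1.0:
        raise ValueError("increase `hi`: reduced chi^2 is still above one")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Reduced chi^2 decreases monotonically with sigma_int.
        if reduced_chi2(residuals, sigma_data, mid, n_params) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy residuals (hypothetical): random-looking scatter versus a coherent
# trend. Both can be tuned to a reduced chi^2 of one.
scatter = [0.5, -0.5, 0.5, -0.5]
trend = [0.5, 0.5, -0.5, -0.5]
sigma_data = [0.3] * 4
print(sigma_int_for_unit_reduced_chi2(scatter, sigma_data))
print(sigma_int_for_unit_reduced_chi2(trend, sigma_data))
```

Both toy residual sets return the same $\sigma_{\rm int}$, because $\chi^2$ ignores the ordering of the residuals, whereas the coherent trend is exactly the kind of feature the Crossing Statistic responds to.
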

4. Conclusion

In summary, we have presented a new statistic that can be used to distinguish between different cosmological models using their goodness of fit with the supernova data. Previous work on this subject has analyzed the residuals from supernova data, and in particular has examined pulls [18]. In these analyses, however, the correlations as functions of redshift have not been examined. Here we have included this extra information, and have shown that the different Crossing Statistics that have been derived as a result are sensitive to the shapes and trends of the data and the assumed theoretical model. These statistics are in general also less sensitive to the unknown intrinsic dispersion of the data than $\chi^2$, as exemplified by the fact that the consistency between a model and a data set does not change much even when we assume large additional intrinsic errors. The Crossing Statistic can be used in the process of parameter estimation, and for this purpose it can be put in the category of shrinkage estimators [19] (as raw estimates are improved by combining them with other information in the data set). The $\chi^2$ method, as an example of a maximum likelihood estimator, is a very good summarizer of the data, but does not extract all of the available statistical information. We have shown here that by using $T_I$, $T_{II}$ etc. we can extract more information from the data, and use this to make more precise statements about the likelihood of different parameters and models.

Let us now mention some of the important remaining issues that need to be resolved in the context of the Crossing Statistic. So far, in all our analyses, we have considered uncorrelated data. The Constitution supernova data set [7] that we have used in our analysis is, in fact, strictly uncorrelated (as all off-diagonal elements of the correlation matrix are zero). However, in reality this will only be approximately true, and the most recent methods of supernova light-curve fitting result in data sets with slight correlations between the individual data points [21]. It is an important question as to how best to modify the Crossing Statistic to take account of such correlations, as this would broaden the application of the Crossing Statistic to a much wider category of problems. Another important issue involves comparing the Crossing Statistic with Bayesian methods of model selection. The Crossing Statistic proposed in this paper is by nature a frequentist approach, and is able to deal with different models without any prior information. In contrast, Bayesian methods require priors that play an important role in model selection and parameter estimation. This will complicate comparisons, which will depend on whether we are dealing with completely unknown phenomena (for which we have no prior information), or with phenomena where we have some prior information available. These issues will be the subject of future work, and their results will be reported elsewhere.

Finally, let us briefly mention the "Three Region Test" proposed by [20] that detects and maximizes the deviation between the data and a hypothesis in three bins. This test uses normalized residuals to test the goodness of fit in a similar way to our Crossing Statistic, but is considerably less general.

The Crossing Statistic appears to us to be a promising method of confronting cosmological models with supernovae observations, and has the potential to be straightforwardly generalized to other datasets where similar problems occur.

Acknowledgments

AS thanks Subir Sarkar for his valuable suggestions, and many useful discussions on this subject over the past few years. We also thank Eric Linder, Alexei Starobinsky, Istvan Szapudi and Steffen Lauritzen for their useful comments and discussions. AS acknowledges the support of the EU FP6 Marie Curie Research and Training Network "UniverseNet" (MRTN-CT-2006-035863) and Korea World Class University grant No. R32-10130. TC acknowledges the support of Jesus College, Oxford. TC and PGF both acknowledge the BIPAC.

References

[1] A. G. Riess et al., Astron. J., 1009 (1998); S. J. Perlmutter et al., Astrophys. J., 565 (1999).
[2] A. G. Kim, E. V. Linder, R. Miquel and N. Mostek, Mon. Not. Roy. Astron. Soc., 909 (2004); E. V. Linder, Phys. Rev. D, 023509 (2009); R. P. Kirshner, arXiv:0910.0257.
[3] R. A. Daly and S. G. Djorgovski, Astrophys. J., 9 (2003); Y. Wang and P. Mukherjee, Astrophys. J., 654 (2004); V. Sahni, A. Shafieloo and A. A. Starobinsky, Phys. Rev. D, 103502 (2008); C. Zunckel and C. Clarkson, Phys. Rev. Lett., 181301 (2008); A. Shafieloo, V. Sahni and A. A. Starobinsky, Phys. Rev. D 80 R, 101301 (2009); A. Shafieloo, U. Alam, V. Sahni and A. A. Starobinsky, Mon. Not. Roy. Astron. Soc., 1081 (2006); A. Shafieloo, Mon. Not. Roy. Astron. Soc., 1573 (2007); Y. Wang and M. Tegmark, Phys. Rev. D, 103513 (2005); A. Shafieloo and C. Clarkson, Phys. Rev. D, 083537 (2010); S. Nesseris and A. Shafieloo, Mon. Not. Roy. Astron. Soc., 1879 (2010).
[4] T. M. Davis, Astrophys. J., 716 (2007); P. Serra, A. Heavens and A. Melchiorri, Mon. Not. Roy. Astron. Soc., 169 (2007); J. Sollerman, Astrophys. J. [...] et al., Mon. Not. Roy. Astron. Soc., 2381 (2010).
[5] E. Linder and R. Miquel, Int. J. Mod. Phys. D, 2315 (2008).
[6] H. Alnes, M. Amarzguioui and Ø. Grøn, Phys. Rev. D, 083519 (2006); H. Alnes and M. Amarzguioui, Phys. Rev. D, 023506 (2007); T. Clifton, P. G. Ferreira and K. Land, Phys. Rev. Lett. [...]
[7] [...] et al., Astrophys. J., 1097 (2009); M. Hicken et al., Astrophys. J., 331 (2009).
[8] P. Astier et al., Astron. Astrophys., 31 (2006).
[9] G. Miknaitis et al., Astrophys. J., 674 (2007).
[10] A. G. Riess et al., Astrophys. J., 98 (2007).
[11] J. Guy et al., Astron. Astrophys., 781 (2005).
[12] L. Perivolaropoulos and A. Shafieloo, Phys. Rev. D, 123502 (2009).
[13] J. R. Gott III et al., Astrophys. J., 1 (2001).
[14] J. Hamann, A. Shafieloo and T. Souradeep, JCAP, 1004 (2010).
[15] R. J. Barlow, Statistics, John Wiley & Sons Ltd, (1989).
[16] A. Kolmogorov, Giornale dell'Istituto Italiano degli Attuari, 83 (1939); N. Smirnov, Annals of Mathematical Statistics, 279 (1948).
[17] G. Lemaître, Ann. Soc. Sci. Brussels A53, 51 (1933) (reprinted in Gen. Rel. Grav., 641 (1997)); R. C. Tolman, Proc. Nat. Acad. Sci. USA, 169 (1934) (reprinted in Gen. Rel. Grav., 935 (1997)); H. Bondi, Mon. Not. Roy. Astron. Soc., 410 (1947).
[18] M. Kowalski et al., Astrophys. J., 749 (2008).
[19] C. Stein, Proc. Third Berkeley Symp. Math. Stat. Probab., 197 (1956).
[20] B. Aslan and G. Zech, arXiv:math/0207300.
[21] N. Suzuki et al., arXiv:1105.3470.