Optimizing future dark energy surveys for model selection goals
Catherine Watkinson, Andrew R. Liddle, Pia Mukherjee, David Parkinson
Astronomy Centre, University of Sussex, Brighton BN1 9QH, United Kingdom
School of Mathematics and Physics, University of Queensland, Brisbane, QLD 4072, Australia
ABSTRACT
We demonstrate a methodology for optimizing the ability of future dark energy surveys to answer model selection questions, such as 'Is acceleration due to a cosmological constant or a dynamical dark energy model?'. Model selection Figures of Merit are defined, exploiting the Bayes factor, and surveys are optimized over their design parameter space via a Monte Carlo method. As a specific example we apply our methods to generic multi-fibre baryon acoustic oscillation spectroscopic surveys, comparable to that proposed for SuMIRe PFS, and present implementations based on the Savage-Dickey Density Ratio that are both accurate and practical for use in optimization. It is shown that whilst the optimal surveys found using model selection agree with those found using the Dark Energy Task Force (DETF) Figure of Merit, they provide better informed flexibility of survey configuration and an absolute scale for performance; for example, we find survey configurations with close to optimal model selection performance despite their corresponding DETF Figure of Merit being at only 50% of its maximum. This Bayes factor approach allows us to identify the survey configurations that will be good enough for the task at hand, which is vital when wanting to add extra science goals and in dealing with time restrictions or multiple probes within the same project.
Key words:
Cosmology - Bayesian model comparison - Statistical methods
INTRODUCTION

Cosmology has developed dramatically in recent years; from being restricted to the realms of philosophy, our observational abilities have advanced it to the point where we may obtain precise evidence with which to shape our models and understanding. In this age of precision cosmology, the fine tuning of surveys can dramatically improve their performance. This requires us to think in terms of designer surveys rather than using a build-and-point approach.

For any given problem in cosmology (we use that of dark energy hereafter) many surveys of varying capabilities will be proposed, and a combination will make it through the conceptual stages to see the light of night. For example, there are several stages defined by the Dark Energy Task Force (DETF) to classify dark energy surveys: stage II surveys are complete, e.g. the Sloan Digital Sky Survey (SDSS); several stage III surveys are now taking data, e.g. the Baryon Oscillation Spectroscopic Survey (BOSS) and WiggleZ, with others at the manufacturing stage, e.g. the Dark Energy Survey (DES) and the Hobby-Eberly Telescope Dark Energy EXperiment (HETDEX); and stage IV surveys are still in the design phase, e.g. BigBOSS, the Square Kilometre Array (SKA), the Wide-Field Infrared Survey Telescope (WFIRST), and the recently-approved Euclid satellite mission.

When considering the large investments of time, money and expertise involved in these projects, it is imperative that designers identify the survey configuration that maximises the science return. Given the number of surveys all targeting the same goal, it is also important that they identify the appropriate niche; doing so maximises the overall science return from the combined effort of all relevant surveys. Moreover, naive optimization can be wasteful unless there is an absolute scale of performance that can be used to determine survey configurations that are good enough for a given task, especially when dealing with time or cost restrictions, or with multiple probes within a survey. The importance of optimization for dark energy surveys was first stressed by Bassett (2005) and Bassett, Parkinson & Nichol (2005b).

The concept of optimization is universal to design, regardless of the product in hand; a scalar rating or Figure of Merit (FoM) is defined, and the configuration of product variables that optimizes this number is identified. When it comes to designing a survey's observational parameters it is usual to exploit Markov Chain Monte Carlo (MCMC) methods to vary things like survey time, area, exposure time, and redshift range to identify the extremum of a FoM.

In developing their roadmap for the future, the Dark Energy Task Force defined a FoM for comparing proposed dark energy surveys. The FoM is based on parameter estimation, quantifying the errors measured on the ΛCDM values of the dark energy equation of state. (We refer here to the FoM of the original report, Albrecht et al. 2006; a subsequent report, Albrecht et al. 2009, suggested a more complicated parameter estimation FoM based on principal components of w(a), but we do not consider that here as our intention is to deploy alternative FoMs.) This has subsequently become the standard for the quantification of dark energy survey performance. However, the question we wish to answer in building these surveys asks: which of the models we have should be preferred; and if a single model cannot be selected outright, as is presently the case with the dark energy problem, which can we discount? The DETF approach skips this question and assumes that we already know the right model, the idea being that if the true parameter values lie outside of the 2-σ error contours then the survey will be well placed to identify this. Whilst this does not seem an unreasonable presumption, it has not been properly tested.

Bassett (2005) introduced the Integrated Parameter Space Optimization (IPSO) design framework to address this, proposing that the FoM be some function of the 1-σ marginalized dark energy covariance matrix. In this paper we take a similar approach, but instead adopt FoMs that rate a survey's ability to perform model selection, thereby directly optimizing for the survey's designed objective.

It is highly problematic to use frequentist statistics to deal with model selection, whereas Bayesian statistics provides the perfect platform. In particular we employ the Bayes factor, which measures the increase of belief in one model over another that new data provide. A downside of parameter estimation ratings is that their scale is relative, providing no simple interpretation of when a survey is good enough for the job in hand. This is unfortunate, as it is vital that money and effort do not get frittered away in making surveys arbitrarily more powerful whilst promising no significant advances in our knowledge. The Bayes factor addresses this issue by providing an absolute scale; this is a big motivation for considering its use within forecasting and optimization (Mukherjee et al. 2006a; Trotta 2007a; Trotta 2007b).

In this paper we define our model selection FoMs in section 2; in section 3 we use a particular ground-based dark energy survey aiming to exploit baryon acoustic oscillations to identify practical implementations for each, and also to test their performance; and in section 4 we exploit these model selection ratings to optimize the baryon acoustic oscillation survey SuMIRe PFS (Subaru Measurement of Images and Redshifts Prime Focus Spectrograph) (Takada 2010; Takada & Silverman 2010).

It is worth noting that the implementations we investigate are relatively crude and there exists much room for refinement. However, our main motivation is to compare the performance of the model selection FoMs to that of a parameter estimation FoM; refinements to improve computational efficiency have no influence on the outcome and are therefore superfluous at this point. Furthermore, as all optimizations to date exploit parameter estimation FoMs, we deem it sensible to identify ways in which these optimizations can be easily adapted to address their shortcomings.
Our principal goal is to introduce the concept of model selection optimization to astrophysics, the concept being a very general one. However, for concreteness we will focus throughout on a realistic scenario where the idea could be deployed, by considering optimization of baryon acoustic oscillation (BAO) surveys for dark energy that could be carried out by large multi-object spectrographs on eight-metre class telescopes.

This study builds on an optimization study that was carried out by Parkinson et al. (2010, henceforth P10). In brief, the survey modelled by this optimization comprised a 3000-fibre spectrograph mounted on a ground-based 8m optical-infrared telescope.
Table 1. Survey constraints used in modelling the multi-object spectrograph.

    Constraint parameter        Value
    Total observing time        1500 hours
    Field of view               1.5° diameter
    Number of fibres            3000
    Redshift bin width (dz)     ...
Figure 1. Representation of the z binning method of the optimization code used.

The specifications used to model this survey are outlined in Table 1, and a full description can be found in Bassett, Nichol & Eisenstein (2005a). Ultimately this project (WFMOS) did not move forward to construction, but the concept and design live on in the form of SuMIRe PFS, with the spectrograph to be mounted on the Subaru telescope. Other proposed spectroscopic BAO surveys include BigBOSS (Schlegel et al. 2011) and DESpec.

We model the survey to observe line emission of pre-selected active star-forming galaxies. Its wavelength coverage allows observation of the OII lines and the 4000 Å break in the redshift range 0.1 ≤ z ≤ 1.6; this overall range is divided into sub-bins as per Figure 1, and the density throughout is fixed by that of the deepest redshift bin.

Our optimization method closely follows that of P10, utilising Markov Chain Monte Carlo (MCMC) methods (Metropolis et al. 1953; Hastings 1970) to identify the survey configuration that maximises a FoM. The larger this rating, the better the survey's performance. The variables that describe each survey configuration are given in Table 2.

Table 2. Survey parameters varied by the optimization code, affecting the various FoM under consideration.

    Survey parameter            Symbol
    Time allocated              τ
    Area covered                A
    Minimum of redshift bin     z_min
    Maximum of redshift bin     z_max
    Number of pointings         n_p

Given that we already have dark energy constraints from the Sloan Digital Sky Survey (SDSS) and will soon have data from Planck, SDSS data and forecasts for the Planck data are included as prior information; again refer to P10 for details.

We wish to compare a parameter estimation FoM, one that rates
a survey's ability to measure the parameters of interest assuming the true model is known, to model selection FoMs that recognise our uncertainty surrounding the most preferable model. The Dark Energy Task Force (DETF) FoM has been widely adopted by the cosmological community as the standard for comparison of dark energy surveys and their optimization (Albrecht et al. 2006). As such, we take the DETF FoM as the parameter estimation baseline for comparison.

The DETF FoM makes use of the CPL (Chevallier & Polarski 2001; Linder 2003) parametrisation of the dark energy equation of state w, given by

    w(a) = w_0 + w_a z/(1+z),    (1)

where w_0 is a constant characterising the behaviour of w in the local universe and the constant w_a characterises its redshift dependence. The DETF FoM is the inverse of the area confined within the 95% confidence level on the w_0 and w_a measurements, assuming throughout that w_0 = -1 and w_a = 0, i.e. ΛCDM, is the true cosmological model. The smaller this area, the larger the FoM, and the more accurate the survey.

For each survey configuration, the optimization of P10 forecasts the errors on the measurable quantities d_A and H by using a 1D Fisher matrix based transfer function as derived by Seo & Eisenstein (2007). This returns the BAO distance errors as a function of survey properties and non-linearity. These errors are then translated onto w_0 and w_a, given by the inverse of the marginalised Fisher matrix, i.e. F^{-1}_{w_0 w_a}; from this the DETF FoM may be calculated:

    FoM_DETF = 1 / sqrt( σ_{w_0 w_0} σ_{w_a w_a} - σ²_{w_0 w_a} ) = 1 / sqrt( det F^{-1}_{w_0 w_a} ).

For detailed information on this optimization and Fisher matrix approach please refer to P10 and Parkinson et al. (2009).
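As a concrete illustration of this quantity, the following minimal sketch (our own, not code from P10) evaluates the FoM from a 2x2 Fisher matrix marginalised onto (w_0, w_a); the Fisher matrix entries are invented for illustration only:

    import numpy as np

    def detf_fom(fisher_2x2):
        """DETF FoM = 1/sqrt(det F^{-1}) for the (w0, wa) marginalised Fisher matrix."""
        cov = np.linalg.inv(fisher_2x2)        # marginalised covariance of (w0, wa)
        return 1.0 / np.sqrt(np.linalg.det(cov))

    # Illustrative numbers only, not a real survey forecast:
    F = np.array([[400.0, -120.0],
                  [-120.0,   60.0]])
    print(detf_fom(F))  # the tighter the (w0, wa) error ellipse, the larger the FoM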
As mentioned, we use Bayesian statistics as the foundation for our model selection FoMs. As this subject has been covered extensively in the literature, we will only provide an overview here. The basic laws of probability, such as the multiplication rule, were shown by Cox (1946) to be the mathematical framework of Boolean logic. Bayes theorem derives from direct application of this rule and provides a means to calculate the probability of a given model (M), as per equation (2), or hypothesis (θ) in light of data (D) (Jeffreys 1961; Jaynes 2003; MacKay 2003; Gregory 2005):

    p(M|D) = p(D|M) p(M) / p(D).    (2)

Bayesian statistics provide a natural framework for dealing with model selection and as such form the basis for the model selection FoMs used in this paper.

From here on we use standard notation when referring to probabilities, e.g. p(A|B,C) means the probability of A given that B and C are true. In equation (2), p(M|D) is referred to as the model posterior; p(D|M) is the model likelihood, generally referred to as the evidence; the prior p(M) characterises our state of knowledge before the data were collected; and the normalisation term is p(D), the probability of the data.

The star of the show is the Bayes factor B (Jeffreys 1961; Kass & Raftery 1995), which measures the increase of belief in one model over another given new data. Alternatively it can be considered as the change in model odds from before the data were considered to after. This scalar quantity is evaluated by taking the ratio of the evidence E of one model given data, i.e. p(D|M_0), to that of another, p(D|M_1), as given in equation (3):

    B = E(M_0) / E(M_1) = p(D|M_0) / p(D|M_1).    (3)

The combination of model selection FoMs we use for this work takes account of the uncertainty in our knowledge. We allow the assumed model to vary through its values of w_0 and w_a, rather than fixing it to one fiducial model as is the case for the DETF FoM. The allowed models are restricted to a chosen region of w_0-w_a parameter space, in which -2 ≤ w_0 ≤ -0.333 and -1.333 ≤ w_a ≤ 1.333. This restricted parameter space summarises the prior range used in our calculations.

A plethora of models exists offering explanations for dark energy; see Caldwell & Kamionkowski (2009) and references therein. Here, two overarching models are considered: ΛCDM (M_0), for which w_0 = -1 and w_a = 0, and evolving dark energy (M_1), where w_0 and w_a can have any values chosen uniformly within the confines of the above prior range. Future observational indications of a deviation from the ΛCDM case would no doubt prompt a much wider investigation of both dynamical dark energy models and modified gravity models, but for this work a two-model approach is sufficient.

We test two Bayes factor based model selection FoMs, first defined by Mukherjee et al. (2006a, M06 hereafter), for use in optimization. (Alternative model selection FoMs, also suitable for these purposes, have been given in Trotta (2007a,b) and Trotta et al. (2011); unlike those used here, these FoMs average over the present state of knowledge.) We also speak in terms of ln B for the bulk of this paper, whereby even odds translate to ln B = 0, positive values support the simpler ΛCDM model and negative values support the more complex model. (It is equally valid to assign M_0 to be the evolving dark energy model instead of ΛCDM, in which case this interpretation is inverted.) As in M06 we make use of the Jeffreys' scale (Jeffreys 1961), outlined in Table 3, to judge the significance of ln B. In constructing our FoMs we treat ln B as a function of the dark energy model we assume to be 'true', that is ln B(w_0, w_a).

Table 3. The Jeffreys' scale provides a useful guide when interpreting the Bayes factor.

    |ln B| range           Level of significance
    |ln B| < 1             Not worth mentioning
    1 < |ln B| < 2.5       Significant
    2.5 < |ln B| < 5       Strong
    5 < |ln B|             Decisive

(i) Assuming constant w

The first of these FoMs measures how strongly a survey will support ΛCDM when it is the true underlying model. This is done by setting w_0 = -1 and w_a = 0 in all calculations contributing to the Bayes factor forecast. The larger B is at this point in w_0-w_a parameter space, the stronger the survey if ΛCDM does transpire to be the true model. This FoM will be referred to as ln B(-1, 0) hereafter. One of its useful properties is that it gives an absolute scale of support for ΛCDM; i.e. if future experiments continue to increase support for this paradigm, it gives a criterion by which we can decide whether we have done enough to satisfy ourselves, and should turn to other scientific questions.
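Since both FoMs below are interpreted through Table 3, a few lines of Python (ours, simply encoding that table) suffice to grade a forecast ln B:

    def jeffreys_grade(ln_b):
        """Qualitative significance of a Bayes factor per the Jeffreys' scale (Table 3)."""
        x = abs(ln_b)
        if x < 1.0:
            return "not worth mentioning"
        elif x < 2.5:
            return "significant"
        elif x < 5.0:
            return "strong"
        return "decisive"

    for lnb in (0.3, -1.7, 3.3, 6.0):
        print(f"ln B = {lnb:+.1f}: {jeffreys_grade(lnb)}")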
ΛCDM when it is the true underlying model. This is done bysetting w = − and w a = 0 in all calculations contributing tothe Bayes factor forecast. The larger B is at this point in w – w a parameter space, the stronger the survey if ΛCDM does transpireto be the true model. This FoM will be referred to as ln B ( − , hereafter. One of its useful properties is that it gives an absolutescale of support for Λ CDM; i.e. if future experiments continue toincrease support for this paradigm, it gives a criterion by which we Alternative model selection FoMs, also suitable for these purposes, havebeen given in Trotta (2007a,b) and Trotta et al. (2011). Unlike those usedhere, these FoMs average over the present state of knowledge. It is equally valid to assign M to be the evolving dark energy modelinstead of Λ CDM, in which case this interpretation is inverted.c (cid:13) , 000–000
C. Watkinson et al.
Figure 2.
Each point in this w – w a plot is in turn assumed to be the truemodel, and the Bayes factor of Λ CDM versus dark energy is calculated atthat point. The (green) circles mark the region where
ΛCDM is preferred,i.e. ln B > . ; (blue) squares mark the region in which ΛCDM is notdiscounted, i.e. − . < ln B < . ; and the (red) triangles correspond tothe region in which evolving dark energy is correctly preferred, i.e. ln B < − . . can decide whether we have done enough to satisfy ourselves, andshould turn to other scientific questions.(ii) Assuming evolving w ( z ) The second model selection FoM measures a survey’s ability todiscount
ΛCDM when it is not the true model. To evaluate this we forecast the Bayes factor as a function of w_0 and w_a and calculate the area of w_0-w_a parameter space in which the survey will not be able to discount ΛCDM. This is done by finely gridding this parameter space and, for each point on the grid, forecasting the Bayes factor assuming the w_0 and w_a values at that point.

The FoM that we will refer to as area⁻¹ hereafter is the inverse of the area containing values of ln B > -2.5, corresponding to the green circles and blue squares of Figure 2. The larger this figure, the more powerful the survey for constraining evolving dark energy models and the greater the chance of detection of evolution if present. Unlike ln B(-1, 0), which has a direct probabilistic interpretation, the area⁻¹ does not; it instead provides a measure of a survey's predicted interpretation of dark energy model-space.

Two different approaches are utilised in calculating ln B(w_0, w_a): nested sampling, as devised by Skilling (2006), and the Savage-Dickey Density Ratio, first introduced in a cosmological context by Trotta (2007a).

Any model is defined by a set of cosmological parameters θ; for example, ΛCDM can be described by θ = [w_de, Ω_de, Ω_m, Ω_r, Ω_k, Ω_b, H_0, n_s, σ_8]. The values of each of these parameters must be estimated by means of best fit to the data. This can be done using Bayes theorem, as per equation (4),

    p(θ|D, M) = p(D|θ, M) p(θ|M) / p(D|M),    (4)

with MCMC methods identifying the point in cosmological parameter space at which the posterior p(θ|D, M) is maximised (Hobson et al. 2010).

Notice that the denominator in equation (4) is the evidence required for the Bayes factor of equation (3). By integrating over all allowed values of this parameter set θ it is possible to calculate the evidence using equation (5); the evidence is therefore the average likelihood over the prior parameter space, thus rewarding models for predictive power:

    E(M) = p(D|M) = ∫ p(D|θ, M) p(θ|M) dθ.    (5)

Nested sampling (Skilling 2006) recasts this multi-dimensional evidence integral in 1D by integrating over the prior mass X, where dX = p(θ|M) dθ and L refers to the likelihood:

    E = ∫ L(X) dX.    (6)

The nested sampling algorithm starts by sampling a large number of points from the likelihood surface simultaneously and assigns equal fractions of the total remaining prior mass to each sample. It then proceeds by adding the lowest probability point L_j (whose prior mass is X_j) to the evidence integral sum:

    E = Σ_{j=1}^{m} L_j (X_{j-1} - X_{j+1}).    (7)

The algorithm then reduces the remaining prior mass by a statistically estimated amount. The lowest likelihood sample is replaced with a sample randomly selected from the prior, with the sole selection criterion that it be of higher likelihood than the previous. The main challenge in implementing the algorithm is to find a way to carry out this sampling efficiently, a simple approach being ellipsoidal sampling (Mukherjee et al. 2006b) and a more sophisticated approach, suitable for multi-modal likelihoods, being to partition the points into clusters of ellipsoids (Feroz, Hobson & Bridges 2009).

This entire process is repeated, building up the evidence sum, until the accuracy has reached an acceptable level. At the point of termination the remaining contribution to the evidence integral is added. As this is a numerical estimation, several repeats are done, from which the mean and error are extracted.
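The following toy implementation (ours, not the Mukherjee et al. 2006b code) illustrates the bookkeeping of equations (6) and (7) on a one-dimensional problem whose evidence is known analytically; new live points are drawn by brute-force rejection from the prior, which is only viable in this toy setting:

    import numpy as np
    from scipy.special import logsumexp

    rng = np.random.default_rng(1)

    def log_like(theta):
        # Toy Gaussian likelihood of width 0.1 centred on 0, unit-normalised in theta
        return -0.5 * (theta / 0.1) ** 2 - 0.5 * np.log(2 * np.pi * 0.1 ** 2)

    def nested_evidence(n_live=200, n_iter=1000, lo=-1.0, hi=1.0):
        live = rng.uniform(lo, hi, n_live)       # live points drawn from a flat prior
        logl = log_like(live)
        log_z, log_x = -np.inf, 0.0              # running evidence, remaining prior mass
        for _ in range(n_iter):
            i = int(np.argmin(logl))             # lowest-likelihood live point L_j
            log_x_new = log_x - 1.0 / n_live     # mean log-shrinkage of prior mass
            # rectangle-rule weight X_{j-1} - X_j (eq. 7 uses a trapezoidal variant)
            log_w = log_x + np.log1p(-np.exp(log_x_new - log_x))
            log_z = np.logaddexp(log_z, log_w + logl[i])
            log_x = log_x_new
            while True:                          # replace L_j by a prior draw with L > L_j
                t = rng.uniform(lo, hi)
                if log_like(t) > logl[i]:
                    live[i], logl[i] = t, log_like(t)
                    break
        # termination: spread the remaining prior mass over the surviving live points
        return np.logaddexp(log_z, log_x - np.log(n_live) + logsumexp(logl))

    print(nested_evidence())  # ~ ln(0.5) = -0.69, since the flat prior density is
                              # 1/2 on [-1, 1] and the likelihood sits well inside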
A detailed account of the nested sampling implementation we use is given in Mukherjee et al. (2006b).

Calculations of the Bayes factor are in principle simple using nested sampling. The evidence is calculated by first assuming ΛCDM as the true model, then independently by assuming evolving dark energy, when simulating survey data. Unfortunately, due to the very large number of computations required to sample both survey and model parameter spaces, nested sampling is too inefficient to be regarded as practical in full MCMC optimizations. However, it is still possible to utilise nested sampling when investigating the ln B(-1, 0) FoM for very basic manual optimizations, e.g. manually altering only one survey parameter at a time.
As the nested sampling algorithm is too slow to be seriously considered for the full scope of model selection optimization that we wish to consider, the Savage-Dickey Density Ratio (SDDR) is investigated. The SDDR is a simplification of the Bayes factor that assumes a less complex model is nested within a more complex model and that the priors are separable. For example, ΛCDM is nested within the evolving dark energy model's parameter space where w_0 = -1 and w_a = 0, and furthermore the priors concerned with the two dark energy parameters (w) and those concerned with the nuisance parameters (N) of the models can be separated, i.e. p(w, N) = p(w) p(N). The SDDR is given by equation (8),

    B = [ p(w|D) / p(w) ]_{w = w*},    (8)

where w* represents the simpler model's nested values, being a special case of the more complex model's parameter vector w. For a derivation of this see Appendix B of Trotta (2007a).

This allows the Bayes factor to be evaluated by considering the marginalised posterior probability of the more complex model and its prior at the parameter values of the nested simpler model. This removes the need for the computationally expensive integral required to calculate the evidence via equation (5). Both assumptions made in deriving equation (8) are true for the dark energy models under consideration, and nothing has been assumed about the likelihood; therefore it is exact in this case. However, we now make a further assumption that renders this implementation approximate.

To minimise alteration to the original DETF optimization, and hence the calculation time, our SDDR calculation assumes Gaussianity of the posterior in w_0 and w_a having marginalized over all other parameters. The Bayes factor can therefore be forecast with only a few simple additions to the DETF optimization, by application of the following:

    B(w_0, w_a) = [ Δw_0 Δw_a / (2π sqrt(det F^{-1})) ] exp[ -(1/2) Σ_{µν} (w_µ - w*_µ) F_{µν} (w_ν - w*_ν) ].    (9)

In this equation ν and µ can have values of either 0 or 1; w* are the nested values of the simpler model; w_1 = w_a; Δw_0 and Δw_a are the widths of the flat prior ranges; and F_{µν} is the marginalised Fisher matrix. A numerical approach using finite-differencing was used to determine the Fisher matrix, the details of which are described in Appendix A of P10.

Recall that values larger than unity (ln B > 0) support the simpler model, and values less than unity (ln B < 0) support the more complex model. We see then that the pre-factor of equation (9) acts as an amplitude, measuring the ratio of the area of w_0-w_a parameter space allowed by the more complex model to the area of the error ellipse; this term therefore penalises the more complex model for unjustified parameter space. The exponential part measures the distance between the two models and can lend support to the more complex model by suppressing the amplitude term.

This SDDR approach allows investigation of both model selection FoMs. However, for the area⁻¹ FoM to be practical for full MCMC optimizations it needs refining; for example, the ln B(w_0, w_a) calculations could be parallelized and MCMC could be used to determine the area⁻¹.

The prior range mentioned in section 2.2 is the same as that used in M06, but we acknowledge that the choice of priors is arbitrary to a degree. Whilst changing the prior range, i.e. Δw_0 and Δw_a, will quantitatively affect the ln B calculations, it will not qualitatively change the FoMs we are considering.
Furthermore, as discussed in M06, different (sensible) prior choices will not have a serious impact on the interpretation of the resulting Bayes factors.
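Equation (9) is cheap enough to evaluate inline; the sketch below (ours, with an invented Fisher matrix, and prior widths implied by the ranges quoted in section 2.2) forecasts ln B for any assumed fiducial model:

    import numpy as np

    def ln_b_sddr(w_fid, w_star, fisher, dw0, dwa):
        """Gaussian-SDDR forecast of ln B, equation (9): flat priors of widths
        (dw0, dwa), 2x2 marginalised Fisher matrix for (w0, wa), posterior
        centred on the assumed fiducial model w_fid, evaluated at the nested
        values w_star = (-1, 0)."""
        d = np.asarray(w_fid, float) - np.asarray(w_star, float)
        ln_amp = np.log(dw0 * dwa / (2.0 * np.pi)) + 0.5 * np.log(np.linalg.det(fisher))
        return ln_amp - 0.5 * d @ fisher @ d

    # Invented Fisher matrix; widths from -2 <= w0 <= -0.333 and -1.333 <= wa <= 1.333.
    F = np.array([[400.0, -120.0],
                  [-120.0,   60.0]])
    dw0, dwa = 1.667, 2.667
    print(ln_b_sddr([-1.0, 0.0], [-1.0, 0.0], F, dw0, dwa))  # ln B(-1, 0)
    print(ln_b_sddr([-0.8, 0.3], [-1.0, 0.0], F, dw0, dwa))  # an evolving fiducial

At the ΛCDM point the exponent vanishes, so this reduces to ln(Δw_0 Δw_a / 2π) + ln FoM_DETF, which is exactly the linear relation of equation (10) below.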
Figure 3. Plot showing the relation between the DETF FoM and the nested ln B(-1, 0) FoM. The monotonic relation between the two is clear, with increased scatter seen around the maximal values.
Early in the process of adding the new FoM options to the optimization we became aware of a coding error that made the original results in P10 incorrect. We have fixed this error and present the updated results in Appendix A. The main differences are a reduction in the DETF FoM by a factor of 3, and that the optimization is qualitatively unchanged by including curvature as a nuisance parameter. The latter of these results is found to be also true of the model selection FoMs we investigate in the following; therefore we do not explicitly consider curvature in any of our presented findings.
As mentioned, both the nested ln B(-1, 0) and SDDR area⁻¹ computations are quite slow. To deal with this issue, discrete, i.e. manual, optimizations are considered.

Large-scale surveys such as this are designed with the number of fibres tuned to the required source density, so repeated observations of the same area of sky are rarely needed. Furthermore, the minimum exposure time is nearly always sufficient to achieve the required S/N on large populations of the observed galaxies.

ln B(-1, 0): an absolute scale for optimization

The findings of our nested sampling discrete optimization are summarised in Figure 3, which compares the behaviour of ln B(-1, 0) with that of the DETF FoM.

The mostly monotonic relation we see between the DETF and the nested ln B(-1, 0) FoMs means that a DETF-optimized survey will be extremely similar to one optimized with ln B(-1, 0). However, the latter FoM provides an absolute scale; in this case ln B(-1, 0) shows the survey is capable of strongly preferring ΛCDM when the DETF FoM for the equivalent survey configuration is still around 60% of its optimum. This additional information would be invaluable when deciding on a survey's operating mode.

Figure 4. Comparison of the z_max dependence of ln B(-1, 0) when calculated with nested sampling and the Gaussian SDDR. The (blue) circles mark the nested sampling calculations of ln B(-1, 0), whilst the dashed (red) line shows the respective Gaussian SDDR calculations. The Gaussian SDDR numerically underestimates ln B(-1, 0) in a nearly perfectly uniform fashion. We infer that it would be trivial to calibrate it to the more accurate nested calculations.

ln B(-1, 0): an immediately viable model selection optimization FoM

By deploying the SDDR Gaussian approximation it is possible to perform full MCMC optimizations with ln B(-1, 0). However, being an approximation, it is necessary to establish the impact of this simplification on the resulting Bayes factors. Figure 4 compares the Gaussian SDDR calculations with the more accurate nested sampling ones in the case where the upper redshift limit z_max is varied.

We can see that the SDDR ln B(-1, 0) typically underestimates the Bayes factor. While the assumption of Gaussianity has been seen to be good around the peak of the likelihood (Mukherjee et al. 2006a), it appears to be less accurate around the tails. If the information in the tails is overestimated by this assumption, i.e. if in reality the likelihood falls off more sharply than in the Gaussian approximation, then the average likelihood, and therefore the evidence, for the evolving dark energy model will be overestimated. This will result in the underestimation of the Bayes factor we see here. It would also explain the increased scatter away from the monotonic relation between the nested ln B(-1, 0) and the DETF FoM for stronger surveys, clearly seen in Figure 3: more accurate surveys will have tighter likelihood peaks, and therefore any non-Gaussianity of the tails will be more influential.

Despite this, the general trend is the same, best seen by making a logarithmic plot of the DETF FoM against nested calculations of B, and writing equation (9) for the Gaussian SDDR calculation of B in terms of the DETF FoM:
    ln B_SDDR(-1, 0) = ln( Δw_0 Δw_a / 2π ) + ln( FoM_DETF ).    (10)

Figure 5 shows how the nested calculation of ln B(-1, 0) follows this linear relation well, despite the increased deviations around the highest values. This implementation of the SDDR presents a good alternative to the nested sampling approach; furthermore, it is as quick as the DETF optimization, with only a few extra calculations required.

Figure 5. Logarithmic plot of the DETF FoM against the nested ln B(-1, 0) FoM. We see that the nested calculation follows the linear relation described by equation (10).
The SDDR's underestimation of the Bayes factor is also seen to be almost uniform across the redshift range of Figure 4, and we therefore infer that it would be simple to calibrate the SDDR FoM to gain more accurate estimates of ln B by performing a few nested sampling computations of ln B.

Figure 6. DETF FoM performance compared with that of the area⁻¹ FoM. The (blue) triangles result from a manual optimization where z_min alone is adjusted, and the (red) circles from adjusting z_max only. There is a clear monotonic relation between the two, with general agreement around the maximum. FoMs were normalised with respect to their maximum.

area⁻¹: an informative optimization FoM with potential for practical application

The Gaussian SDDR approximation also allows investigation of the area⁻¹ FoM but, as with the nested calculations of ln B(-1, 0), computational limitations mean that we are again restricted to manual optimizations.

Figure 6 shows the monotonic relation between the area⁻¹ FoM and that of the DETF. This further supports the widespread adoption of this parameter estimation FoM. The area⁻¹ FoM is seen to attain a performance only slightly weaker than its optimum while the DETF is at only 50% of its optimum. This means that the model selection ability of a survey is close to optimal for much weaker configurations than the DETF optimization would deem acceptable.

Figure 7. Full-implementation area⁻¹ FoMs plotted against the corresponding values found using the fiducial galaxy approximation. In both cases the FoMs have been normalised with respect to their maximum. Clearly the approximation is very good.
As the area⁻¹ FoM is slow to compute, we consider a further approximation. We exclude the modelling of the galaxy density, required for the Seo & Eisenstein (2007) transfer function, from the w_0-w_a gridding. That is to say, we assume that ΛCDM is the true model for all calculations of galaxy density, regardless of the assumed cosmology in calculating the Bayes factor. In doing so we reduce the number of times the galaxy density must be estimated from of order 400 per FoM to 1. There are 10,000 FoM calculations in an average optimization, so this substantially reduces the calculation time.

The results of this are summarised in Figure 7, showing this to be an extremely good approximation and that this FoM is not particularly sensitive to galaxy density. This provides a much faster alternative to the full implementation of the area⁻¹ FoM.
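Schematically, the area⁻¹ FoM reduces to counting grid cells. In this sketch (ours), forecast_ln_b is a hypothetical stand-in for the per-gridpoint Bayes factor forecast described above, and the 20 x 20 grid matches the 'order 400' evaluations quoted:

    import numpy as np

    def area_inv_fom(ln_b_grid, cell_area, threshold=-2.5):
        """area^{-1} FoM: inverse of the w0-wa prior area where ln B stays above
        `threshold`, i.e. where LCDM could not be discounted at 'strong'
        evidence on the Jeffreys' scale."""
        area = cell_area * np.count_nonzero(ln_b_grid > threshold)
        return 1.0 / area

    w0s = np.linspace(-2.0, -0.333, 20)
    was = np.linspace(-1.333, 1.333, 20)
    # grid = np.array([[forecast_ln_b(w0, wa) for wa in was] for w0 in w0s])
    # fom = area_inv_fom(grid, (w0s[1] - w0s[0]) * (was[1] - was[0]))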
Figure 8 provides a summary of this investigation, with all FoMs (excluding the nested sampling calculation of ln B(-1, 0)) plotted as a function of maximum redshift. From this plot and that of Figure 4 we see that the model selection FoMs would deem a maximum redshift of about 0.7 to be acceptable. Clearly a maximum redshift of between 1.1 and 1.6 is optimal using the DETF, but it is not clear how much this upper limit may be reduced before the survey will not fulfil its desired purpose.

Similarly, a survey that extends to higher redshifts than 1.6 will be less than optimal but will, up to a point, be deemed suitable for its purpose by both model selection FoMs. This highlights the value of this approach, as our modelled survey has the capability to push to higher redshifts; such high-redshift observations are extremely useful for ancillary cosmology and astronomy. The model selection optimizations provide well informed flexibility of this upper z limit.

The absolute scale of ln B(-1, 0) and the additional information from area⁻¹ are very useful when considering time allocations.

Figure 8. Optimization results comparing the various FoMs investigated. All FoMs were normalised with respect to their maximum values. The (blue) dashed line marks the SDDR area⁻¹ FoM, the (green) dot-dashed line plots the SDDR ln B(-1, 0) FoM, and the (red) solid line shows the DETF FoM.

Table 4. The SuMIRe specification used in our calculations.

    Parameter          SuMIRe specification
    Mirror diameter    8.2 m
    Fibre diameter     1.2 arcsec
    Aperture           0.8
    Signal/Noise       6.5
    Number of fibres   3000
    Field of view      π/ sq. deg.
For example, a dark energy survey will have limited time on a telescope; being able to provide detailed information on the required time would be vital when requesting that more time be allocated, or when sharing an overall allocation between different independent observation modes within the same project.

As the FoMs have all been seen to have a monotonic relation, information from DETF and ln B(-1, 0) optimizations can be used to perform similar discrete area⁻¹ optimizations as done here, with negligible expense of time and effort.

We now apply the model selection FoMs that have been investigated so far to a practical optimization. To make this as relevant as possible we update the original P10 survey parameters to be as close as can be to SuMIRe PFS. This involves increasing the mirror diameter from 8m to 8.2m and the fibre diameter from 1 arcsec to 1.2 arcsec, while the target signal-to-noise ratio and throughput are slightly reduced from the original. The specifications for the survey we optimize in this section are given in Table 4; otherwise the survey details remain unchanged from our original optimization, as summarised in Table 1. For a full description of SuMIRe PFS see Takada & Silverman (2010).

As the WiggleZ survey is complete and BOSS (the Baryon Oscillation Spectroscopic Survey) is well under way, the SuMIRe optimization must concentrate on filling the available observational niche, otherwise some of its allocated time will be lost on unnecessary repetition of observations. With this in mind we include the forecast data-points for BOSS and WiggleZ as prior information for this optimization.
Figure 9. Representation of the z binning method used in optimizing SuMIRe PFS.

Table 5. Survey parameters varied in the SuMIRe optimization.

    Survey parameter                  Symbol (mega-bin)
    Time split between mega-bins      τ(low), τ(high)
    Area covered                      A(low), A(high)
    Minimum of redshift mega-bin      z_min(low), z_min(high)
    Maximum of redshift mega-bin      z_max(low), z_max(high)
    Number of pointings               n_p(low), n_p(high)

Although we have so far only considered the redshift range 0.1 to 1.6, SuMIRe PFS as modelled here is capable of observing redshifts up to about 4.9. At redshifts greater than 2 the spectrograph can measure the Lyman-alpha spectral features; however, for 1.6 < z < 2 there exists an effective blind spot in which no spectral features are observable by this survey. This optimization therefore considers two independent redshift mega-bins: the low-redshift mega-bin covering 0.1 < z(low) < 1.6 and the high-redshift mega-bin covering 2 < z(high) < 4.9. Figure 9 depicts the mega-bin modelling and Table 5 details the survey variables. The modelling details of the high-redshift mega-bin can be found in P10.

Two types of optimization are performed. Firstly, a full MCMC optimization using the Savage-Dickey ln B(-1, 0) FoM is executed using 3 different optimization settings:

(i) varying all 10 survey parameters listed in Table 5, i.e. τ, A, z_min, n_p and z_max over both redshift mega-bins;
(ii) focusing all time in the low-redshift mega-bin, i.e. varying only A(low), z_min(low), n_p(low) and z_max(low);
(iii) focusing all time in the high-redshift mega-bin, i.e. varying only A(high), z_min(high), n_p(high) and z_max(high).

From optimization (i) we establish the optimum time split between the low and high mega-bins; the optimum redshift and exposure times for each are then found from (ii) and (iii). Discrete optimizations can then be performed with the Savage-Dickey area⁻¹ FoM. To do this we fix the redshift limits and exposure time according to the ln B optimization results. By manually varying the time (and therefore area) we can examine this FoM's performance with total time allocation and its split across the redshift mega-bins.

The SuMIRe project has moved on a great deal from the version modelled here; for example, the current design has no redshift blind spot and can in principle observe as deep as z = 10 if targets are available (Murayama 2011). As such, direct comparison cannot be drawn; we instead use the observational technique outlined in the 2010 SuMIRe PFS white paper (Takada & Silverman 2010) as
our reference. In this set-up the survey only covers an area of 2000 sq. deg., limited by the projected area that the Hyper Suprime-Cam (HSC), needed for pre-selecting PFS's target galaxies, will cover during its operational lifetime. Its redshift range is 0.6 to 1.6, its exposure time is 15 minutes, and therefore its total time is roughly 500 hours.

Figure 10. Plot of the time allocation in the low mega-bin versus the ln B(-1, 0) FoM; this results from varying all parameters in both redshift mega-bins, i.e. optimization setting (i). Note, any remainder of the total 1500 hours is allocated to the high-redshift mega-bin. Allocating all 1500 hours to the low mega-bin is seen to be preferable.
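The structure of such an optimization is a standard stochastic search over survey parameters. The sketch below is a minimal stand-in for a P10-style optimizer (ours, not their code), with a deliberately toy FoM over (z_min, z_max) replacing the full forecast machinery:

    import numpy as np

    rng = np.random.default_rng(0)

    def mcmc_optimize(fom, start, step, n_steps=10_000, temp=0.1):
        """Metropolis search for the survey configuration maximising a FoM,
        treating exp(FoM/temp) as the target density (a simulated-annealing-
        style stand-in, purely illustrative)."""
        x, fx = np.array(start, float), fom(start)
        best, fbest = x.copy(), fx
        for _ in range(n_steps):
            y = x + step * rng.standard_normal(len(x))
            fy = fom(y)
            if np.log(rng.uniform()) < (fy - fx) / temp:  # accept uphill, some downhill
                x, fx = y, fy
                if fx > fbest:
                    best, fbest = x.copy(), fx
        return best, fbest

    # Toy stand-in FoM over (z_min, z_max): forbids configurations outside
    # 0.1 < z_min < z_max < 1.6 and rewards redshift depth; illustrative only.
    def toy_fom(p):
        zmin, zmax = p
        if not (0.1 <= zmin < zmax <= 1.6):
            return -np.inf
        return (zmax - zmin) ** 1.5

    print(mcmc_optimize(toy_fom, start=[0.5, 1.0], step=0.05))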
Strongest support for ΛCDM achieved with all time spent observing redshifts between 0.1 and 1.6
The full MCMC optimization using ln B(-1, 0) found that an improvement in our confidence in ΛCDM (if it is the underlying model) will be gained for a wide range of time allocations to the low-redshift mega-bin, but it is clearly preferable to focus all time in this mega-bin. This is summarised in Figure 10, plotting τ(low) against ln B(-1, 0).

Figure 11 shows how the value of z_min(low) affects the survey's ability to prefer ΛCDM. We find that it is best to make use of the full redshift range in the low mega-bin, i.e. 0.1 < z < 1.6. However, as long as the lower limit is not greater than z = 1.0 there is no great loss of performance. This has a great deal to do with the fact that the data-points measured by WiggleZ and BOSS cover the redshift range between 0.1 and 1.0. It also indicates that the reference survey's choice of z_min = 0.6 is reasonable.

As with the DETF optimization there is a preference for maximising the survey volume, and we therefore find that the optimal survey minimises the exposure time and maximises the survey area.

The time versus area⁻¹ FoM plots of Figures 12 and 13 summarise the findings of our discrete area⁻¹ optimization, used to investigate the optimal time split. Their jagged nature is a result of a tipping-point style effect caused by the gridding approach we use for investigating the dark energy parameter space. This jaggedness is absent in Figures 6 and 8 because there the range of the FoM is greater by a factor of around 4. An MCMC-style approach to measuring the area⁻¹ would produce a more gradual increase of this FoM; the flat-lining should be conceptually smoothed across, joining the tips of the jags.
Figure 11. Plot of the minimum redshift of the low mega-bin versus ln B(-1, 0) when all time is focused in the low mega-bin; this results from varying all parameters in both redshift mega-bins, i.e. optimization setting (i). z_min = 0.1 is seen to be preferable, with model selection performance dropping off steeply after z_min = 1.0.

Figure 12. Plot of the total time allocation against the area⁻¹ FoM; this is achieved by fixing the redshift range at 0.1-1.6 and the exposure time to 15 minutes. We see that optimal results can be achieved for time allocations above 1100 hours.

Note that in making the total time plot of Figure 12, four different time splits are calculated per total time allocation; it was found that the maximal survey spends all the time in the low mega-bin regardless of the total time allocated.

The main thing we note is the very limited gain in carrying out this survey with 500 hours as per our reference survey: ln B(-1, 0) increases from 3.3 to only 3.4, and the prior-space area in which ΛCDM cannot be ruled out, i.e. the area in which ln B > -2.5, reduces from 30% to only 29%. This fact is already clear from the DETF FoM, with an increase of only 4. From our model selection optimization a time allocation of around 1100 hours is best, with a minimum of 500 hours spent in the low mega-bin, but even this promises only a minimal advance in our knowledge.

SuMIRe PFS as modelled here is only a part of the SuMIRe project, the other part being the HSC survey, which we have mentioned is used to pre-select galaxies for PFS.
Figure 13. Plot of the time allocation in the low mega-bin versus the area⁻¹ FoM; this is achieved by fixing the redshift range at 0.1-1.6 and the exposure time to 15 minutes. Any remainder from the total 1100 hours is allocated to the high-redshift mega-bin. We see that spending over 500 hours in the low mega-bin is preferable for optimal model selection ability.

Having the pre-selection survey as an integral part of the project, operating on the same telescope, is of great benefit in itself, but the HSC will also be used to measure gravitational lensing. Dark energy constraints from BAO and gravitational lensing are complementary, and as such their combination dramatically boosts the power of SuMIRe. This is clear from the white paper's forecasted DETF FoMs, which rise from 33 (for both the PFS and BOSS surveys) to 217 when the HSC survey is included (i.e. PFS, BOSS and HSC). Again SDSS and Planck datasets were taken as prior information in calculating these FoMs (Takada & Silverman 2010).

It is a pity our modelled survey is not as comprehensive as the full incarnation of SuMIRe, as the model selection FoMs' benefits would be more transparent; we therefore present this section as a proof of concept rather than a display of strength.
CONCLUSIONS

We have discussed the importance of survey optimization, particularly in finding the appropriate niche for an upcoming survey. The usual approach of parameter estimation, which maximises the survey's ability to accurately measure the parameters of interest, assumes that a particular model is true, thereby ignoring the model selection requirement of these surveys. A future survey's primary aim is to discount models, with the ultimate goal that one prevails. We therefore attempt to directly optimize a survey for its intended purpose, model selection.

In doing so we test Bayesian model selection in the context of optimizing a ground-based spectroscopic baryon acoustic oscillation survey. To do this we extend an optimization based on the DETF FoM, designing it to instead target two Bayesian FoMs. For the sake of efficiency we assume a Gaussian likelihood in both cases. The results of each one's optimization are compared with that of the original parameter estimation FoM.

The ΛCDM Bayes factor, ln B(-1, 0), measures the survey's ability to prefer ΛCDM if it does transpire to be the correct model. This quantifies the increase in probability of one model over another in light of fresh data, assuming w_0 = -1 and w_a = 0 for all calculations.

For a second FoM, which we call area⁻¹, the dark energy parameter space is gridded and discrete calculations of ln B(w_0, w_a) are made for each point on the grid. For each of these calculations the w_0, w_a values for that point in parameter space are assumed. The area in which ΛCDM is not discounted when presumed incorrect, i.e. all places where ln B(w_0, w_a) > -2.5, is calculated; the smaller this area, the more effective a survey is at model selection, and its inverse forms our second FoM.

The ln B(-1, 0) FoM is implemented with only minor adjustment to the original optimization, and furthermore the calculation time is unchanged. Whilst this FoM follows the same trend as the original, and therefore its optimal survey agrees, it enjoys the added merit of being an absolute scale, allowing interpretation of when the survey is 'good enough'. This added insight is invaluable for making efficient use of precious survey time or when bidding for extra time allocations. It also allows for educated flexibility, essential for adding independent science goals to a project.

The area⁻¹ FoM needs some further development to be useful in full MCMC optimizations, but even as presented here it has potential for immediate application. We again find this FoM follows the trends of the original optimization, and as such the resulting optimal surveys will be very close. However, there is more information to be had using this model selection approach; where the other FoMs increase gradually in a near linear fashion before reaching a brief peak, the area⁻¹ FoM is seen to reach values close to optimum for configurations the usual approach would deem relatively weak. This approach provides better insight into the flexibility of the survey's observational strategy.

Whilst these results do not blow the usual parameter estimation approach out of the water, they do present a powerful alternative. Considering the extreme simplicity with which the ln B(-1, 0) FoM can be implemented, it seems wasteful not to at the very least calculate it alongside the DETF FoM. As mentioned, the area⁻¹ FoM also has potential for immediate application even before improvements, such as measuring the area by MCMC or parallelising, are considered. Whilst there might not always be a strong trend, such as the need to maximise survey volume, to allow the fixing of everything but time, there will invariably be refined optimizations for which the discrete method used here is applicable.

Dark energy surveys often require that several probes be exploited, some requiring different observational strategies; for example, the Dark Energy Survey has one operational mode for taking supernovae data, and another for everything else (Annis et al. 2004). These Bayesian FoMs are perfect for identifying the best time share between such observational modes, ensuring each independent survey mode is good enough to achieve its design goals.

The Gaussian approach used here is seen to be reasonable for the ln B(-1, 0) FoM. We do not, however, investigate its impact on the area⁻¹ FoM, which requires future work as the assumption will be less appropriate away from the fiducial point in dark energy parameter space.
However, it seems unlikely that the general trend will be severely altered, and as such development to improve the speed of our implementation would be beneficial.

Despite the fact that we have focused on a particular survey, the methods discussed here are applicable to any dark energy optimization; furthermore, they can in theory be employed beyond dark energy surveys and even cosmology itself. It will therefore be interesting to see how such FoMs fare under different circumstances, especially in cases where an absolute scale is of particular use, as is the case for multi-probe surveys.
ACKNOWLEDGEMENTS
C.W. was supported by the South East Physics Network (SEPnet); A.R.L. and P.M. were supported by the Science and Technology Facilities Council [grant numbers ST/F002858/1 and ST/I000976/1]; A.R.L. was further supported by a Royal Society-Wolfson Research Merit Award; and D.P. by the Australian Research Council through a Discovery Project grant. We thank Bruce Bassett and Bob Nichol for comments.
REFERENCES
Albrecht A. et al., 2006, astro-ph/0609591
Albrecht A. et al., 2009, arXiv:0901.0721
Annis J. et al., 2004, available from: https://des.fnal.gov/main_documents/A_Proposal_to_NOAO.pdf
Bassett B. A., 2005, Phys. Rev. D, 71, 083517, astro-ph/0407201
Bassett B. A., Nichol R. C., Eisenstein D. J., 2005a, Astron. Geophys., 46, 5.26, astro-ph/0510272
Bassett B. A., Parkinson D., Nichol R. C., 2005b, ApJ, 626, L1, astro-ph/0409266
Caldwell R. R., Kamionkowski M., 2009, Ann. Rev. Nucl. Part. Sci., 59, 397, arXiv:0903.0866
Chevallier M., Polarski D., 2001, Int. J. Mod. Phys. D, 10, 213, gr-qc/0009008
Cox R. T., 1946, Am. J. Phys., 14, 1
Feroz F., Hobson M. P., Bridges M., 2009, MNRAS, 398, 1601, arXiv:0809.3437
Gregory P., 2005, Bayesian Logical Data Analysis for the Physical Sciences, Cambridge University Press
Hastings W. K., 1970, Biometrika, 57, 97
Hobson M., Jaffe A. H., Liddle A. R., Mukherjee P., Parkinson D., 2010, Bayesian Methods in Cosmology, Cambridge University Press
Jaynes E. T., 2003, Probability Theory: The Logic of Science, Cambridge University Press
Jeffreys H., 1961, Theory of Probability, 3rd edn, Oxford University Press
Kass R. E., Raftery A. E., 1995, J. Am. Stat. Assoc., 90, 773
Linder E. V., 2003, Phys. Rev. Lett., 90, 091301, astro-ph/0402503
MacKay D. J. C., 2003, Information Theory, Inference and Learning Algorithms, Cambridge University Press
APPENDIX A: REVISION OF P10 RESULTS

Figure A1. (a) Revised z_max-DETF FoM relation when the universe is assumed flat (red solid line) compared to when curvature is included (blue dashed line). (b) Original z_max-DETF FoM relation when the universe is assumed flat (black solid line) compared to when curvature is included (red dashed line). [Lower image from P10.]
The optimal survey parameter values found with the debugged version of the optimization are summarised in Figure A1 and Table A1. The most pronounced difference compared with P10 is the reduction of the performance predicted by this optimization, with the returned FoMs around a factor of 3 smaller. This has an impact on the forecasted fitness of this survey, which at the time of P10's publishing performed fairly well alongside its competitors; these results show it would not have been as strong as previously thought.

The improvement in performance achieved by carrying out optimization is around a factor of 3 in both the original and debugged results, as is seen by comparison with the unoptimized FoMs. There is also general agreement that including curvature degrades the FoM, due to the presence of an extra parameter in the Fisher matrix diluting constraining power on the dark energy parameters.

In the original results, it was inferred that a gamble exists in the assumption of flatness. P10 found that to proceed with the flat optimization settings, i.e. limiting observation to 0.1 < z < 1.6, would be damaging if curvature does in fact require constraining. However, the new results find far less of a clash of interests, with the inclusion of curvature pushing the upper redshift limit up into values of z_min and z_max in the high-redshift mega-bin. Furthermore, when this survey configuration is tested with flatness assumed, the FoM is 17, so there is still no serious impact on the power of the survey if considering curvature transpires to be unnecessary. This is a positive feature, as it is likely that non dark energy science would benefit greatly from such time allocation; its presence could have potentially increased support for this project.

Table A1. Revised optimal survey parameters obtained with the debugged version of the WFMOS optimization.

    Survey parameter               Flat        Curved
    A(low) (sq. deg.)              6300        6300
    τ(low) (hours)                 1500        1500
    z_min(low)                     ...         ...
    z_max(low)                     ...         ...
    Galaxy density (h/Mpc)³        5...×10⁻    ...×10⁻
    Number of galaxies (low)       ...×10      ...×10
    FoM                            20          11
    Unoptimized FoM                7           3

Table A2. Revised results of the degradation caused by either not accounting for curvature in the optimization when it is necessary, or allowing for it when it is not. For all configurations the area is set to 6300 sq. deg. and the time in the low mega-bin to 1500 hours.

                               Flat opt.            Curv opt.
                               (z range = 0...)     (z range = 0...)
    FoM (assuming flat)        20                   18
    FoM (curvature incl.)      10                   11

This paper has been typeset from a TeX/LaTeX file prepared by the author.
Figure A2. The z range sensitivity when 380 hours are assigned to the high-redshift mega-bin. (a) shows how the DETF FoM varies with z_min(high), and (b) shows how the DETF FoM varies with z_max(high). The lack of sensitivity indicates that time spent in the upper mega-bin is not vital, but is also not damaging providing the upper redshift limit is greater than ∼