[PDF] Steve: A hierarchical Bayesian model for Supernova Cosmology

Abstract

We present a new Bayesian hierarchical model (BHM) named Steve for performing type Ia supernova (SNIa) cosmology fits. This advances previous works by including an improved treatment of Malmquist bias, accounting for additional sources of systematic uncertainty, and increasing numerical efficiency. Given light curve fit parameters, redshifts, and host-galaxy masses, we fit Steve simultaneously for parameters describing cosmology, SNIa populations, and systematic uncertainties. Selection effects are characterised using Monte-Carlo simulations. We demonstrate its implementation by fitting realisations of SNIa datasets where the SNIa model closely follows that used in Steve. Next, we validate on more realistic SNANA simulations of SNIa samples from the Dark Energy Survey and low-redshift surveys. These simulated datasets contain more than 60000 SNeIa, which we use to evaluate biases in the recovery of cosmological parameters, specifically the equation-of-state of dark energy, w . This is the most rigorous test of a BHM method applied to SNIa cosmology fitting, and reveals small w -biases that depend on the simulated SNIa properties, in particular the intrinsic SNIa scatter model. This w -bias is less than 0.03 on average, less than half the statistical uncertainty on w .These simulation test results are a concern for BHM cosmology fitting applications on large upcoming surveys, and therefore future development will focus on minimising the sensitivity of Steve to the SNIa intrinsic scatter model.

Full PDF

DDraft version May 2, 2019

Typeset using L A TEX twocolumn style in AASTeX62

Steve : A hierarchical Bayesian model for Supernova Cosmology

S. R. Hinton, T. M. Davis, A. G. Kim, D. Brout, C. B. D’Andrea, R. Kessler,

4, 5

J. Lasker,

4, 5

C. Lidman, E. Macaulay, A. M¨oller,

8, 6

M. Sako, D. Scolnic, M. Smith, R. C. Wolf, M. Childress, E. Morganson, S. Allam, J. Annis, S. Avila, E. Bertin,

13, 14

D. Brooks, D. L. Burke,

16, 17

A. Carnero Rosell,

18, 19

M. Carrasco Kind,

20, 11

J. Carretero, C. E. Cunha, L. N. da Costa,

19, 22

C. Davis, J. De Vicente, D. L. DePoy, P. Doel, T. F. Eifler,

24, 25

B. Flaugher, P. Fosalba,

26, 27

J. Frieman,

12, 5

J. Garc´ıa-Bellido, E. Gaztanaga,

26, 27

D. W. Gerdes,

29, 30

R. A. Gruendl,

20, 11

J. Gschwend,

19, 22

G. Gutierrez, W. G. Hartley,

15, 31

D. L. Hollowood, K. Honscheid,

33, 34

E. Krause, K. Kuehn, N. Kuropatkin, O. Lahav, M. Lima,

36, 19

M. A. G. Maia,

19, 22

M. March, J. L. Marshall, F. Menanteau,

20, 11

R. Miquel,

37, 21

R. L. C. Ogando,

19, 22

A. A. Plazas, E. Sanchez, V. Scarpine, R. Schindler, M. Schubnell, S. Serrano,

26, 27

I. Sevilla-Noarbe, M. Soares-Santos, F. Sobreira,

39, 19

E. Suchyta, G. Tarle, D. Thomas, V. Vikram, and Y. Zhang School of Mathematics and Physics, University of Queensland, Brisbane, QLD 4072, Australia Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104, USA Department of Astronomy and Astrophysics, University of Chicago, Chicago, IL 60637, USA Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637, USA The Research School of Astronomy and Astrophysics, Australian National University, ACT 2601, Australia Institute of Cosmology and Gravitation, University of Portsmouth, Portsmouth, PO1 3FX, UK ARC Centre of Excellence for All-sky Astrophysics (CAASTRO) School of Physics and Astronomy, University of Southampton, Southampton, SO17 1BJ, UK Graduate School of Education, Stanford University, 160, 450 Serra Mall, Stanford, CA 94305, USA National Center for Supercomputing Applications, 1205 West Clark St., Urbana, IL 61801, USA Fermi National Accelerator Laboratory, P. O. Box 500, Batavia, IL 60510, USA CNRS, UMR 7095, Institut d’Astrophysique de Paris, F-75014, Paris, France Sorbonne Universit´es, UPMC Univ Paris 06, UMR 7095, Institut d’Astrophysique de Paris, F-75014, Paris, France Department of Physics & Astronomy, University College London, Gower Street, London, WC1E 6BT, UK Kavli Institute for Particle Astrophysics & Cosmology, P. O. Box 2450, Stanford University, Stanford, CA 94305, USA SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA Centro de Investigaciones Energ´eticas, Medioambientales y Tecnol´ogicas (CIEMAT), Madrid, Spain Laborat´orio Interinstitucional de e-Astronomia - LIneA, Rua Gal. Jos´e Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil Department of Astronomy, University of Illinois at Urbana-Champaign, 1002 W. Green Street, Urbana, IL 61801, USA Institut de F´ısica d’Altes Energies (IFAE), The Barcelona Institute of Science and Technology, Campus UAB, 08193 Bellaterra(Barcelona) Spain Observat´orio Nacional, Rua Gal. Jos´e Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil George P. and Cynthia Woods Mitchell Institute for Fundamental Physics and Astronomy, and Department of Physics and Astronomy,Texas A&M University, College Station, TX 77843, USA Department of Astronomy/Steward Observatory, 933 North Cherry Avenue, Tucson, AZ 85721-0065, USA Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Dr., Pasadena, CA 91109, USA Institut d’Estudis Espacials de Catalunya (IEEC), 08034 Barcelona, Spain Institute of Space Sciences (ICE, CSIC), Campus UAB, Carrer de Can Magrans, s/n, 08193 Barcelona, Spain Instituto de Fisica Teorica UAM/CSIC, Universidad Autonoma de Madrid, 28049 Madrid, Spain Department of Astronomy, University of Michigan, Ann Arbor, MI 48109, USA Department of Physics, University of Michigan, Ann Arbor, MI 48109, USA Department of Physics, ETH Zurich, Wolfgang-Pauli-Strasse 16, CH-8093 Zurich, Switzerland Santa Cruz Institute for Particle Physics, Santa Cruz, CA 95064, USA Center for Cosmology and Astro-Particle Physics, The Ohio State University, Columbus, OH 43210, USA Department of Physics, The Ohio State University, Columbus, OH 43210, USA Australian Astronomical Optics, Macquarie University, North Ryde, NSW 2113, Australia Departamento de F´ısica Matem´atica, Instituto de F´ısica, Universidade de S˜ao Paulo, CP 66318, S˜ao Paulo, SP, 05314-970, Brazil Instituci´o Catalana de Recerca i Estudis Avan¸cats, E-08010 Barcelona, Spain Brandeis University, Physics Department, 415 South Street, Waltham MA 02453 Instituto de F´ısica Gleb Wataghin, Universidade Estadual de Campinas, 13083-859, Campinas, SP, Brazil a r X i v : . [ a s t r o - ph . C O ] M a y Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 Argonne National Laboratory, 9700 South Cass Avenue, Lemont, IL 60439, USA

ABSTRACTWe present a new Bayesian hierarchical model (BHM) named

Steve for performing type Ia supernova(SN Ia) cosmology ﬁts. This advances previous works by including an improved treatment of Malmquistbias, accounting for additional sources of systematic uncertainty, and increasing numerical eﬃciency.Given light curve ﬁt parameters, redshifts, and host-galaxy masses, we ﬁt

Steve simultaneously forparameters describing cosmology, SN Ia populations, and systematic uncertainties. Selection eﬀects arecharacterised using Monte-Carlo simulations. We demonstrate its implementation by ﬁtting realisationsof SN Ia datasets where the SN Ia model closely follows that used in

Steve . Next, we validate on morerealistic SNANA simulations of SN Ia samples from the Dark Energy Survey and low-redshift surveys(DES Collaboration et al. 2019). These simulated datasets contain more than

60 000

SNe Ia, which weuse to evaluate biases in the recovery of cosmological parameters, speciﬁcally the equation-of-state ofdark energy, w . This is the most rigorous test of a BHM method applied to SN Ia cosmology ﬁtting,and reveals small w -biases that depend on the simulated SN Ia properties, in particular the intrinsicSN Ia scatter model. This w -bias is less than . on average, less than half the statistical uncertaintyon w . These simulation test results are a concern for BHM cosmology ﬁtting applications on largeupcoming surveys, and therefore future development will focus on minimising the sensitivity of Steve to the SN Ia intrinsic scatter model.

Keywords: cosmology: supernovae INTRODUCTIONTwo decades have passed since the discovery of the ac-celerating universe (Riess et al. 1998; Perlmutter et al.1999). Since that time, the number of observed type Iasupernovae (SN Ia) has increased by more than an or-der of magnitude, with contributions from modern sur-veys at both low redshift (Bailey et al. 2008; Freedmanet al. 2009; Hicken et al. 2009a; Contreras et al. 2010;Conley et al. 2011), and higher redshift (Astier et al.2006; Wood-Vasey et al. 2007; Frieman et al. 2008; Bal-land et al. 2009; Amanullah et al. 2010; Chambers et al.2016; Sako et al. 2018). Cosmological analyses of thesesupernova samples (Kowalski et al. 2008; Kessler et al.2009b; Conley et al. 2011; Suzuki et al. 2012; Betouleet al. 2014; Rest et al. 2014; Scolnic et al. 2017) havebeen combined with complementary probes of large scalestructure and the CMB. For a recent review, see Huterer& Shafer (2018). While these eﬀorts have reduced theuncertainty on the equation-of-state of dark energy ( w )by more than a factor of two, it is still consistent witha cosmological constant and the nature of dark energyremains an unsolved mystery.In attempts to tease out the nature of dark energy, ac-tive and planned surveys are continually growing in sizeand scale. The Dark Energy Survey (DES, Bernsteinet al. 2012; Abbott et al. 2016) has discovered thou-sands of type Ia supernovae, attaining both spectroscop-ically and photometrically identiﬁed samples. The LargeSynoptic Survey Telescope (LSST, Ivezic et al. 2008;LSST Science Collaboration et al. 2009) will discovertens of thousands of photometrically classiﬁed super-novae. Such increased statistical power demands greaterﬁdelity and ﬂexibility in modelling supernovae for cos- mological purposes, as we will require reduced system-atic uncertainties to fully utilise these increased statis-tics (Betoule et al. 2014; Scolnic et al. 2017).As such, considerable resources are aimed at develop-ing more sophisticated supernova cosmology analyses.The role of simulations mimicking survey observationshas become increasingly important in determining biasesin cosmological constraints and validating speciﬁc super-nova models. First used in SNLS (Astier et al. 2006) andESSENCE analyses (Wood-Vasey et al. 2007), and thenreﬁned and improved for SDSS (Kessler et al. 2009b),simulations are a fundamental component of modernsupernova cosmology. Betoule et al. (2014) quantiseand correct observational bias using simulations, andmore recently Scolnic & Kessler (2016) and Kessler &Scolnic (2017) explore simulations to quantify observa-tional bias in SN Ia distances as a function of multi-ple factors to improve bias correction. ApproximateBayesian computation methods also make use of simu-lations, trading traditional likelihoods and analytic ap-proximations for more robust models with the cost ofincreased computational time (Weyant et al. 2013; Jen-nings et al. 2016). Bayesian Hierarchical models abound(Mandel et al. 2009; March et al. 2011, 2014; Rubin et al.2015; Shariﬀ et al. 2016; Roberts et al. 2017), and ei-ther use simulation-determined distance-corrections tocorrect for biases, or attempt to ﬁnd analytic approxi-mations for eﬀects such as Malmquist bias to model thebiases inside the BHM itself.In this paper, we lay out a new hierarchical modelthat builds oﬀ the past work of Rubin et al. (2015). Weinclude additional sources of systematic uncertainty, in-cluding an analytic formulation of selection eﬃciencyteve 3which incorporates parametric uncertainty. We alsoimplement a diﬀerent model of intrinsic dispersion toboth incorporate redshift-dependent scatter and to in-crease numerical eﬃciency, allowing our model to per-form rapid ﬁts to supernovae datasets.Section 2 is dedicated to a quick review of super-nova cosmology analysis methods, and Section 3 outlinessome of the common challenges faced by analysis meth-ods. In Section 4 we outline our methodology. Modelveriﬁcation on simulated datasets is given in Section 5,along with details on potential areas of improvement.We summarise our methodology in Section 6. REVIEWWhilst supernova observations take the form of pho-tometric time-series brightness measurements in manybands and a redshift measurement of the supernova (orits assumed host), most analyses do not work from thesemeasurements directly. Instead, most techniques ﬁt anobserved redshift and these photometric observations toa supernova model, with the most widely used being thatof the empirical SALT2 model (Guy et al. 2007, 2010).This model is trained separately before ﬁtting the su-pernova light curves for cosmology (Guy et al. 2010; Be-toule et al. 2014). The resulting output from the modelis, for each supernova, an amplitude x (which can beconverted into apparent magnitude, m B = − . ( x ) ),a stretch term x , and color term c , along with a covari-ance matrix describing the uncertainty on these sum-mary statistics ( C η ). As all supernovae are not identical,an ensemble of supernovae form a redshift-dependent,observed population of ˆ m B , ˆ x and ˆ c , where the hat de-notes an observed variable.This ensemble of ˆ m B , ˆ x and ˆ c represents an observedpopulation, which – due to the presence of various se-lection eﬀects – may not represent the true, underly-ing supernova population. Accurately characterisingthis underlying population, its evolution over redshift,and eﬀects from host-galaxy environment, is one of thechallenges of supernova cosmology. Given some mod-elled underlying supernova population that lives in theredshift-dependent space M B (absolute magnitude of thesupernova, traditionally in the Bessell B band), x , and c , the introduction of cosmology into the model is simple– it translates the underlying population from absolutemagnitude space into the observed population in appar-ent magnitude space. Speciﬁcally, for any given super-nova our map between absolute magnitude and apparentmagnitude is given by the distance modulus: µ obs = m B + α x − β c − M B + ∆ M · m + other corrections , (1)where M B is the mean absolute magnitude for all SN Iagiven x = c = , α is the stretch correction (Phillips1993; Phillips et al. 1999), and β is the color correction(Tripp 1998) that respectively encapsulate the empiri-cal relation that broader (longer-lasting) and bluer su-pernovae are brighter. ∆ M · m refers to the host-galaxy mass correlation discussed in Section 4.4.3. The dis-tance modulus µ obs is a product of our observations,however a distance modulus µ C can be precisely cal-culated given cosmological parameters and a redshift.The ‘other corrections’ term often includes bias correc-tions for traditional χ analyses. Bias corrections cantake multiple forms, such as a redshift-dependent func-tion (Betoule et al. 2014) or a 5D function of c , x , α , β and z (Kessler & Scolnic 2017; Scolnic et al. 2017).2.1. Traditional Cosmology Analyses

Traditional χ analyses such as that found in Riesset al. (1998); Perlmutter et al. (1999); Wood-Vasey et al.(2007); Kowalski et al. (2008); Kessler et al. (2009b);Conley et al. (2011); Betoule et al. (2014), minimise thediﬀerence in distance modulus between the observed dis-tance modulus attained from equation 1, and the cosmo-logically predicted distance modulus. The χ functionminimised is χ = ( µ obs − µ C ) † C − µ ( µ obs − µ C ) , (2)where C − µ is an uncertainty matrix that combines sta-tistical and systematic uncertainties (see Brout et al.(2019) for a review of these uncertainties for the DESsupernova analysis). The predicted µ C is deﬁned as µ C = (cid:20) d L (cid:21) , (3) d L = ( + z ) cH ∫ z dz (cid:48) E ( z (cid:48) ) , (4) E ( z ) = (cid:113) Ω m ( + z (cid:48) ) + Ω k ( + z (cid:48) ) + Ω Λ ( + z (cid:48) ) ( + w ) (5)where d L is the luminosity distance for redshift z givena speciﬁc cosmology, H is the current value of Hubble’sconstant in km s − Mpc − and Ω m , Ω k , and Ω Λ representthe energy density terms for matter, curvature and darkenergy respectively.The beneﬁt this analysis methodology provides isspeed – for samples of hundreds of supernovae or less,eﬃcient matrix inversion algorithms allow the likelihoodto be evaluated quickly. The speed comes with severallimitations. Firstly, formulating a χ likelihood resultsin a loss of model ﬂexibility by assuming Gaussian un-certainty. Secondly, the method of creating a covariancematrix relies on computing partial derivatives and thusany uncertainty estimated from this method loses infor-mation about correlation between sources of uncertainty.For example, the underlying supernova color popula-tion’s mean and skewness are highly correlated, how-ever this correlation is ignored when determining popu-lation uncertainty using numerical derivatives of popula-tion permutations. Whilst correlations can be incorpo-rated into a covariance matrix, it requires human aware-ness of the correlations and thus methods that can au-tomatically capture correlated uncertainties are prefer-able. Thirdly, the computational eﬃciency is dependenton both creating the global covariance matrix, and theninverting a covariance matrix with dimensionality lin-early proportional to the number of supernovae. As thisnumber increases, the cost of inversion rises quickly, andis not viable for samples with thousands of supernovae.A recent solution to this computational cost problem isto bin the supernovae in redshift bins, which producesa matrix of manageable size and can allow ﬁtting with-out matrix inversion at every step. Whilst binning dataresults in some loss of information, recent works testedagainst simulations show that this loss does not result insigniﬁcant cosmological biases (Scolnic & Kessler 2016;Kessler & Scolnic 2017).Selection eﬃciency, such as the well known Malmquistbias (Malmquist K. G. 1922) is accounted for by correct-ing the determined µ obs from the data, or equivalently,adding a distance bias to the µ C prediction. Speciﬁcally,Malmquist bias is the result of losing the fainter tail ofthe supernova population at high redshift. An exampleof Malmquist bias is illustrated in Figure 1, which simu-lates supernovae according to equation (1). Simulationsfollowing survey observational strategies and geometryare used to calculate the expected bias in distance mod-ulus, which is then subtracted from the observationaldata. When using traditional ﬁtting methods such asthat found in Betoule et al. (2014), these eﬀects are notbuilt into the likelihood and instead are formed by cor-recting data. This means that the bias uncertainty is notcaptured fully in the χ distribution, and subtle corre-lations between cosmological or population parametersand the bias correction is lost. Recent developmentssuch as the BBC method (Kessler & Scolnic 2017) in-corporate corrections dependent on α and β , improvingtheir capture of uncertainty on bias corrections in the χ likelihood.2.2. Approximate Bayesian Computation

To avoid the limitations of the traditional approaches,several recent methods have adopted ApproximateBayesian Computation, where supernova samples areforward modelled in parameter space and compared toobserved distributions. Weyant et al. (2013) providesan introduction into ABC methods for supernova cos-mology in the context of the SDSS-II results (Sako et al.2014) and ﬂat Λ CDM cosmology, whilst Jennings et al.(2016) demonstrates their superABC method on simu-lated ﬁrst season Dark Energy Survey samples. In bothexamples, the supernova simulation package SNANA(Kessler et al. 2009a) is used to forward model the dataat each point in parameter space.Simulations provide great ﬂexibility and freedom inhow to treat the systematic uncertainties and selectioneﬀects associated with supernova surveys. By using for-ward modelling directly from these simulations, datadoes not need to be corrected, analytic approximationsdo not need to be applied, and we are free to incorpo-rate algorithms that simply cannot be expressed in a . . . . . z µ o b s Observed SNUnobserved SNSN means 0 . . . . . . . h µ o b s − µ C i Figure 1.

An example of the eﬀects of Malmquist bias. Hereare shown 1000 simulated supernovae redshifts and distancemodulus given ﬁducial cosmology. The simulated survey ismagnitude limited, and all supernovae brighter than mag-nitude 24 are successfully observed (shown as blue dots),and all dimmer than 24th magnitude are not successfullyobserved (shown as red dots). By binning the supernovaealong redshift, and taking the mean distance modulus of thesupernovae in each bin, we can see that at higher redshiftwhere Malmquist bias kicks in, the population mean dropsand becomes biased. This source of bias must either be cor-rected by adjusting the data (such as subtracting the foundbias) for by incorporating Malmquist bias explicitly in thecosmological model. tractable likelihood such as those found in traditionalanalyses from Section 2.1. This freedom comes with acost – computation. The classical χ method’s mostcomputationally expensive step in a ﬁt is matrix inver-sion. For ABC methods, we must instead simulate anentire supernova population – drawing from underlyingsupernova populations, modelling light curves, applyingselection eﬀects, ﬁtting light curves and applying datacuts. This is an intensive process.One ﬁnal beneﬁt of ABC methods is that they canmove past the traditional treatment of supernovae withsummary statistics ( m B , x , and c ). Jennings et al.(2016) presents two metrics, which are used to measurethe distance between the forward modelled populationand observed population, and are minimised in ﬁtting.The ﬁrst metric compares forward modelled summarystatistic populations (denoted the ‘Tripp’ metric) andthe second utilises the observed supernova light curvesthemselves, moving past summary statistics. However,we note that evaluation of systematic uncertainty wasonly performed using the Tripp metric.2.3. Bayesian Hierarchical Models teve 5Sitting between the traditional models simplicity andthe complexity of forward modelling lies Bayesian hi-erarchical models (BHM). Hierarchical models utilisemultiple layers of connected parameters, with the layerslinked via well deﬁned and physically motivated condi-tional probabilities. For example, an observation of aparameter from a population will be conditioned on thetrue value of the parameter, which itself will be condi-tioned on the population distribution of that parame-ter. We can thus incorporate diﬀerent population dis-tributions, and parameter inter-dependence which can-not be found in traditional analyses where uncertaintymust be encapsulated in a covariance matrix. UnlikeABC methods, which can model arbitrary probabilitydistributions, BHM methods are generally constrainedto representing probabilities using analytic forms.With the introduction of multiple layers in our model,we can add more ﬂexibility than a traditional analy-sis whilst still maintaining most of the computationalbeneﬁts that come from having a tractable likelihood.Mandel et al. (2009, 2011, 2017) construct a hierarchi-cal model that they apply to supernova light-curve ﬁt-ting. March et al. (2011) derive a hierarchical modeland simplify it by analytically marginalising over nui-sance parameters to provide increased ﬂexibility withreduced uncertainty over the traditional method, butdo not incorporate bias corrections. March et al. (2014)and Karpenka (2015) improve upon this by incorporat-ing redshift-dependent magnitude corrections from Per-rett et al. (2010) to remove bias, and validate on 100realisations of SNLS-like simulations. The recent BA-HAMAS model (Shariﬀ et al. 2016) builds on this andreanalyses the JLA dataset (using redshift dependentbias corrections from Betoule et al. 2014), whilst in-cluding extra freedom in the correction factors α and β ,ﬁnding evidence for redshift dependence on β . Ma et al.(2016) performed a reanalysis of the JLA dataset withina Bayesian formulation, ﬁnding signiﬁcant diﬀerences in α and β values from the original analysis from Betouleet al. (2014). Notably, these methods rely on data thatis bias corrected or the methods ignore biases, howeverthe UNITY framework given by Rubin et al. (2015) in-corporates selection eﬃciency analytically in the model,and is applied to the Union 2.1 dataset (Suzuki et al.2012). The assumption made by the UNITY analysis isthat the bias in real data is perfectly described by an an-alytic function. They validate their model to be free ofsigniﬁcant biases using ﬁts to thirty realisations of super-nova datasets that are constructed to mimic the UNITYframework. The well known BEAMS (Bayesian esti-mation applied to multiple species) methodology fromKunz et al. (2007) has been extended and applied in sev-eral works (Hlozek et al. 2012), most lately to includeredshift uncertainty for photometric redshift applicationas zBEAMS (Roberts et al. 2017) and to include simu-lated bias corrections in Kessler & Scolnic (2017). Forthe latter case, by inferring biases using Bayesian mod- els, sophisticated corrections can be calculated and thenapplied to more traditional χ models.Whilst there are a large number of hierarchical modelsavailable, none of them have undergone comprehensivetests using realistic simulations to verify each models’respective bias. Additionally, testing has generally beenperformed on supernovae simulations with either Λ CDMcosmology or Flat Λ CDM cosmology. However, quanti-fying the biases on w CDM cosmology simulations withrealistic simulations is becoming critically important asprecision supernovae cosmology comes into its own, andfocus shifts from determination of Ω m to both Ω m and w .The ﬂexibility aﬀorded by hierarchical models allowsfor investigations into diﬀerent treatments of underly-ing supernova magnitude, color and stretch populations,host-galaxy corrections, and redshift evolution, each ofwhich will be discussed further in the outline of ourmodel below. Our model is designed to increase thenumerical eﬃciency of prior works whilst incorporatingthe ﬂexibility of hierarchical models. We reduce our de-pendence on an assumed scatter model in simulationsby not utilising bias-corrections in an eﬀort to providea valuable cross-check on analysis methodologies whichutilise scatter-model-dependant bias corrections. CHALLENGES IN SUPERNOVA COSMOLOGYThe diverse approaches and implementations appliedto supernova cosmology are a response to the signiﬁ-cant challenges and complications faced by analyses. Inthis Section, we outline several of the most prominentchallenges.Forefront among these challenges is our ignorance ofthe underlying type Ia intrinsic dispersion. Ideally, anal-ysis of the underlying dispersion would make use of anensemble of time-series spectroscopy to characterise thediversity of type Ia supernovae. However this data isdiﬃcult to obtain, and recent eﬀorts to quantify the dis-persion draw inference from photometric measurements.The underlying dispersion model is not a solved prob-lem, and we therefore test against two dispersion mod-els in this work. The ﬁrst is based on the Guy et al.(2010, hereafter denoted G10) scatter model, the sec-ond is sourced from Chotard et al. (2011, hereafter de-noted C11). As the SALT2 model does not include fulltreatment of intrinsic dispersion, each scatter model re-sults in diﬀerent biases in m B , x , and c when ﬁttingthe SALT2 model to light curve observations, and re-sults in increased uncertainty on the summary statisticsthat is not encapsulated in the reported covariance C η .These two scatter models are currently assumed to spanthe possible range of scatter in the underlying supernovapopulation. We have insuﬃcient information to preferone model over the other, and thus we have to accountfor both possible scatter models.The underlying supernova population is further com-plicated by the presence of outliers. Non-type Ia super-novae often trigger transient follow-up in surveys andcan easily be mistaken for type Ia supernovae and repre-sent outliers from the standardised SN Ia sample. Thiscontamination is not just a result of non-SN Ia beingobserved, but can also arise from host galaxy misiden-tiﬁcation causing incorrect redshifts being assigned tosupernovae. Diﬀerent optimizations to the host-galaxyalgorithm can result in misidentiﬁcation of the host atthe 3% to 9% level (Gupta et al. 2016), resulting ina broad population of outliers. For spectroscopic sur-veys, where both supernova type and redshift can beconﬁrmed through the supernova spectra, this outlierpopulation is negligible. However, for photometric sur-veys, which do not have the spectroscopic conﬁrmation,it is one of the largest challenges; how to model, ﬁt, andcorrect for contaminants.Finally, one of the other persistent challenges facingsupernova cosmology analyses are the high number ofsystematics. Because of the rarity of SN Ia events,datasets are commonly formed from the SN Ia discov-eries of multiple surveys in order to increase the num-ber of supernovae in a dataset. However, each surveyintroduces additional sources of systematic error, fromsources within each survey such as band calibration,to systematics introduced by calibration across surveys.Peculiar velocities, diﬀerent host environments, and dustextinction represent additional sources of systematic un-certainty which must all be modelled and accounted for. OUR METHODWe construct our hierarchical Bayesian model

Steve with several goals in mind: creation of a redshift-dependent underlying supernova population, treatmentof an increased number of systematics, and analytic cor-rection of selection eﬀects, including systematic uncer-tainty on those corrections. We also desire

Steve morecomputationally eﬃcient than prior works, such thatcosmological results from thousands of supernovae areobtainable in the order of hours, rather than days. Asthis is closest to the UNITY method from Rubin et al.(2015, hereafter denoted R15), we follow a similar modelscaﬀold, and construct the model in the programminglanguage Stan (Carpenter et al. 2017; Stan DevelopmentTeam 2017). The primary challenge of ﬁtting hierarchi-cal models is their large number of ﬁt parameters, andStan, which uses automatic diﬀerentiation and the no-U-turn Sampler (NUTS, a variant of Hamiltonian MonteCarlo), allows us to eﬃciently sample high dimensionalparameter space.At the most fundamental level, a supernova cosmol-ogy analysis is a mapping from an underlying populationonto an observed population, where cosmological param-eters are encoded directly in the mapping function. Thediﬃculty arises both in adequately describing the bi-ases in the mapping function, and in adding suﬃcient,physically motivated ﬂexibility in both the observed andunderlying populations whilst not adding too much ﬂex- ibility, such that model ﬁtting becomes pathological dueto increasing parameter degeneracies within the model.In the analysis of this article, the underlying modeluniverse maps to the observed universe as sketched inthe BHM of Figure 2. The dependencies that betweenthe model and observations can be tracked following thearrows of the BHM, and a summary of all the conditionalprobabilities can be found in 4.5.In the following sections, we will describe the modelparameters, the mapping functions that connect them todata, the eﬀect of sample selection (in Equation (12)),and the pathologies that can occur when evaluating themodel. Summaries of observables and model parametersare shown in Table 1 for easy reference.4.1.

Observed Populations

Observables

Like most of the BHM methods introduced previ-ously, we work from the summary statistics, where eachobserved supernova has a brightness measurement ˆ m B (which is analogous to apparent magnitude), stretch ˆ x and color ˆ c , with uncertainty on those values encodedin the covariance matrix C η . Additionally, each super-nova has an observed redshift ˆ z and a host-galaxy stellarmass associated with it, ˆ m , where the mass measurementis converted into a probability of being above so-lar masses. Our set of observables input into the Steve is therefore given as { ˆ m B , ˆ x , ˆ c , ˆ z , ˆ m , C η } , as shown in theprobabilistic graphical model (PGM) in Figure 2.As we are focused on the spectroscopically conﬁrmedsupernovae for this iteration of the method, we assumethe observed redshift ˆ z is the true redshift z such that P ( ˆ z | z ) = δ ( ˆ z − z ) . Potential sources of redshift error (suchas peculiar velocities) are taken into account not via un-certainty on redshift (which is technically challenging toimplement as varying redshifts introduce computationalcomplexity in computing the distance modulus integralby reducing the amount of pre-computation that canbe utilised) but instead uncertainty on distance modu-lus. Similarly, we take the mass probability estimate ˆ m as correct, and do not model a latent variable to rep-resent uncertainty on the probability estimate. One ofthe strengths of Steve (and the R15 analysis) is thatfor future data sets where supernovae have been clas-siﬁed photometrically, and we expect some misclassiﬁ-cation and misidentiﬁcation of the host galaxies, thesemisclassiﬁcations can naturally be modelled and takeninto account by introducing additional populations thatsupernovae have a non-zero probability of belonging to.4.1.2.

Latent Variables for Observables

The ﬁrst layer of hierarchy is the set of true (latent)parameters that describe each supernova. In contrastto the observed parameters, the latent parameters aredenoted without a hat. For example, c is the true colorof the supernova, whilst ˆ c is the observed color, which,as it has measurement error, is diﬀerent from c .teve 7 Table 1.

Model ParametersParameter Description

Global Parameters Ω m Matter density w Dark energy equation of state α Stretch standardisation β color standardisation δ ( ) Scale of the mass-magnitude correction δ (∞)/ δ ( ) Redshift-dependence of mass-magnitude correction δ Z i Systematics scale (cid:104) M B (cid:105) Mean absolute magnitude

Survey Parameters δ S Selection eﬀect deviation (cid:104) x i (cid:105) Mean stretch nodes (cid:104) c i (cid:105) Mean color nodes α c Skewness of color population σ M B Population magnitude scatter σ x Population stretch scatter σ c Population color scatter κ Extra color dispersion κ Redshift-dependence of extra color dispersion

Supernova Parameters m B True ﬂux x True stretch c True color z True redshift M B Derived absolute magnitude µ Derived distance modulus

Input Data a ˆ m B Measured ﬂux ˆ x Measured stretch ˆ c Measured color C Covariance on ﬂux, stretch and color ˆ z Observed redshift ˆ m Observed mass probability a Not model parameters but shown for completeness.

For the moment, let us consider a single supernova andits classic summary statistics m B , x , c . For convenience,let us deﬁne η ≡ { m B , x , c } . A full treatment of the sum-mary statistics would involve determining p ( ˆ y | η ) , where ˆ y represents the observed light curves ﬂuxes and uncer-tainties. However, this is computationally prohibitiveas it would require incorporating SALT2 light curve ﬁt-ting inside our model ﬁtting. Due to this computationalexpense, we rely on initially ﬁtting the light curve ob-servations to produce a best ﬁt ˆ η along with a × covariance matrix C η describing the uncertainty on ˆ η .Using this simpliﬁcation, our latent variables are given ˆ m B , ˆ x , ˆ c , ˆ C − η ˆ z E mass m B x , cz mass µ M B h x i i , h c i i ,σ x , σ c , α c κ , κ Ω m , w h M B i ,σ M B δ Z i α, β,δ ( ) , δ ( ∞ ) ∀ SN ∀ Survey

Figure 2.

Probabilistic graphical model for our likelihoodwithout selection eﬀects. Double-lined nodes represent ob-served variables, diamond nodes represent deterministic vari-ables, single-lined ellipse nodes represent ﬁt variables. TheSN box represents observed variables and latent variablesfor each individual supernova, whilst the survey box rep-resents survey-speciﬁc variables, which in general describethe supernova population for the survey and the systemat-ics associated with it. Variables that appear outside bothboxes represent top level model parameters. We note thatwe have shown the model to have latent variables { M B , x c } ,which uniquely determines m B , given µ and other parame-ters. Thus, the two nodes M B and m B , make up a singlelayer in our hierarchy, not two layers. In the code implemen-tation, m B is more eﬃciently parametrised instead of M B ,however the mathematics remains constant regardless of ifyou parametrise M B or m B as one can determine the other.We write out mass instead of m to reduce possible confusionwith magnitude in the diagram. Finally, as we take redshiftmeasurement ˆ z and mass probability ˆmass as exact they arenot conditioned on underlying distributions and are top levelparameters. by p ( ˆ η | η ) ∼ N ( ˆ η | η, C η ) . (6)As discussed in Section 3, the SALT2 model does not in-clude full treatment of intrinsic dispersion, and thus thisapproximation does not encapsulate the full uncertaintyintroduced from this dispersion.4.2. Underlying Population

Type Ia population

Unlike many previous formalisms which utilise M B asa singular number and model magnitude scatter on theapparent magnitude m B , we incorporate this scatter intothe underlying rest-frame population by having a popu-lation in absolute magnitude space. This is mathemat-ically equivalent, however allows us to model the un-derlying population and intrinsic scatter distinctly. Todenote this diﬀerence, we refer to the mean of our abso-lute magnitude population with (cid:104) M B (cid:105) .In addition to absolute magnitude, the underlying su-pernova population is also characterised by distributionsin color and stretch. We follow the prior work of R15 andmodel the color population as an independent redshift-dependent skew normal distribution for each survey. Forthe stretch population, we adopt a redshift-dependentnormal distribution, and magnitude dispersion is mod-elled as a normal distribution. We also tested a skew-normal approach for these parameters, reverting to thenormal distributions as they are computationally easierto evaluate and we found no reduction in cosmologi-cal bias with the skew-normal distributions for stretchand magnitude. Following R15 we allow the mean colorand stretch to vary over redshift, anchoring four equallyspaced redshift nodes spanning the redshift range ofeach survey, linearly interpolating between the nodes todetermine the mean stretch and color for a given red-shift. These nodes are represented as (cid:104) x i (cid:105) and (cid:104) c i (cid:105) .Both the color and stretch means are modelled withnormal priors. Initial versions of our model adopteda fully covariant multivariate skew normal (with skew-ness set to zero only for the magnitude component) tocapture correlations between m B and c , however patho-logical ﬁtting complications required us to simplify ourtreatment. We parametrise the color skewness α c bysampling δ c = α c / (cid:112) + α c which itself is given a uni-form prior U ( , . ) that allows α c to span positive val-ues in the realm of physical plausibility as determinedfrom constraints in Scolnic & Kessler (2016). We sample δ c in log space for eﬃciency in sampling close to zero.The width of the population, represented by the vector { σ M B , σ x , σ c } is subject to Cauchy priors with mean 0and width 1, following recommendations from the Stanuser guide.The only constant between survey populations is theabsolute magnitude (cid:104) M B (cid:105) . We model the colour skew-ness and the redshift-dependent means and width of thecolour and skew distributions individually for each sur-vey. The probability for a supernova to have true values M B , x , c given the underlying population is thus givenas P ( M B , x , c , z | θ ) = N ( M B |(cid:104) M B (cid:105) , σ M B ) N ( x |(cid:104) x ( z )(cid:105) , σ x ) N skew ( c |(cid:104) c ( z )(cid:105) , σ c , α c ) , (7)where θ = {(cid:104) M B (cid:105) , (cid:104) x ( z )(cid:105) , (cid:104) c ( z )(cid:105) , σ m B , σ x , σ c , α c } for leg-ibility. 4.2.2. Outlier populations

For the spectroscopic paper we do not consider out-lier populations, however we ensure that our model hasﬂexibility for such populations for future use with pho-tometrically classiﬁed surveys. We thus include a sim-plistic outlier population model. We follow R15 (andtherefore Kunz et al. 2007) by implementing a Gaus-sian mixture model, where an additional observable ofthe SN Ia probability would be needed in order to in-form the weights of the mixture model. For surveyswith low rates of contamination, it is not possible to ﬁta contamination population, and the mean of the out-lier population has been ﬁxed to the SN Ia populationin prior works. However, with the increased number ofcontaminants expected in the DES photometric sample,we seek a more physically motivated outlier population.We ﬁnd that an acceptable parametrisation is to modelthe outlier population with a mean absolute magnitudeof (cid:104) M outl B (cid:105) = (cid:104) M B (cid:105) + δ outl M B , where δ outl M B is constrained tobe positive, or even to be greater than a small positivenumber to reduce degeneracy between the two popula-tions. We note that this represents the mean brightnessof outliers, and so outliers could both be brighter anddimmer than the mean SN Ia absolute magnitude. Weset the population width to σ outl = in M B , x and c inour tests. The probability of each supernova falling intoeither population is determined by the observed type Iaprobability ˆ p . For the spectroscopic survey, we set thisto unity, and thus it is not included in Figure 2 or Table1. For the photometric proof-of-concept we provide anaccurate probability estimate. Further investigation onthe eﬀect of inaccurate estimates will be left for futureimprovements during the analysis of the DES photomet-ric sample.4.3. Correcting biased summary statistics

With the ﬁtted summary statistics ˆ η being biased andtheir uncertainty under-reported, we face a signiﬁcantchallenge utilising these statistics naively in supernovacosmology. We must either correct the observables toremove the biases introduced by the intrinsic dispersionof the underlying population, or incorporate this disper-sion into our model. We should also avoid assuming aspeciﬁc dispersion model – either the G10 or C11 model,or utilise the results of computing the bias from bothmodels.We model the extra dispersion only in color, and doso by adding independent uncertainty on the color ob-servation ˆ c . We note that extra dispersion in magni-tude ˆ m B (from coherent scatter) is absorbed completelyby the width of the underlying magnitude population(discussed in Section 4.2.1) without introducing cosmo-logical bias, which is not true of the color term, hencethe requirement for modelling additional color disper-sion. Tests on incorporating extra dispersion on stretchas well show that stretch is less biased than color, andcauses negligible bias in cosmology.teve 9As shown in (Kessler et al. 2013b), the extra colordispersion shows heavy redshift dependence, increasingwith redshift. This is an artifact of diﬀerent ﬁlters, how-ever as we may be subject to similar eﬀects in our ob-servational data, we decide to incorporate redshift de-pendence in our extra uncertainty. We thus add κ + κ z to our observed color uncertainty (in quadrature). The κ parameters are highly degenerate with the width ofthe intrinsic color population σ c . We subject them toCauchy priors centered on zero and with width . ,where κ is bounded between and . . We pick thismaximum value to allow extra dispersion without com-pletely subsuming the intrinsic population widths dueto the severe degeneracy, where this maximum valueeasily encapsulates the determined dispersion accordingto the results of Kessler et al. (2013b). As such, ourcombined covariance on the observation ˆ η is given by C tot = C η + DiagMatrix (cid:2) , , ( κ + κ z ) (cid:3) .Fully covariant extra dispersion on { m B , x , c } (ratherthan just dispersion on c ) was also tested, by modellingthe dispersion as a multivariate Gaussian, but it showednegligible improvement in recovering unbiased cosmol-ogy over just color dispersion, and was far more com-putationally ineﬃcient. We note here that we modeldispersion in magnitude, but this is done at the level ofunderlying populations, not observed populations. Thismagnitude dispersion is modelled with redshift indepen-dence. 4.4. Mapping function

Cosmology

We formulate our model with three diﬀerent cosmo-logical parameterisations; Flat Λ CDM, Flat w CDM, andstandard Λ CDM (without a ﬂatness prior). Ω m is giventhe prior U ( . , . ) , Ω Λ was treated with U ( , . ) and the equation of state w was similarly set to a ﬂatprior U (− . , − . ) . For calculating the distance mod-ulus, we ﬁx H = − Mpc − . If the Hubble con-stant has a diﬀerent value, the absolute magnitude is M B + ( H / − Mpc − ) with the other cosmolog-ical parameters unaﬀected.4.4.2. Supernova Standardisation Parameters

With increasingly large datasets and more nuancedanalyses, the choice of how to handle α and β be-comes an important consideration when constructing amodel. R15 employs a broken linear relationship forboth color and stretch, where diﬀerent values of α and β are adopted depending on whether x and c are positiveor negative (although the cut could be placed at a loca-tion other than 0). Shariﬀ et al. (2016) instead model β as redshift-dependent, testing two phenomenologicalmodels; β ( z ) = β + β z and a second model which eﬀectsa rapid but smooth change in β at a turnover redshift z t .We tested two models with varying β against simu-lated supernova sets; β ( c ) = β + β c and β ( z ) = β + β z . See Section 5.2 for details on simulation generation. Wefound for both models that non-zero values for β arepreferred even with constant β used in simulation, dueto severe degeneracy with selection eﬀects. This degen-eracy resulted in a signiﬁcant bias in recovered cosmol-ogy. Due to the recovery of non-zero β , we continue toadopt the constant α and β found in traditional anal-yses. As such, our calculation of distance modulus µ mirrors that found in Equation (3).4.4.3. Host Galaxy Environment

There are numerous results showing statistically sig-niﬁcant correlations between host-galaxy environmentand supernova properties (Kelly et al. 2010; Lampeitlet al. 2010; Sullivan et al. 2010; D’Andrea et al. 2011;Gupta et al. 2011; Johansson et al. 2013; Rigault et al.2013). The latest sample of over 1300 spectroscopi-cally conﬁrmed type Ia supernovae show > σ evidencefor correlation between host mass and luminosity (Ud-din et al. 2017). The traditional correction, as em-ployed in analyses such as Suzuki et al. (2012) and Be-toule et al. (2014), invokes a step function such that ∆ M = γ H ( log ( M ) − )) , where H is the Heaviside stepfunction, M is the galaxy mass in solar masses and γ represents the size of the magnitude step. The scale ofthis step function varies from analysis to analysis, andis treated as a ﬁt parameter. In this work we adopt themodel used in R15, which follows the work from Rigaultet al. (2013), such that we introduce two parameters toincorporate a redshift-dependent host galaxy mass cor-rection: ∆ M = δ ( )  . (cid:16) − δ ( ) δ (∞) (cid:17) . + . z + δ ( ) δ (∞)  , (8)where δ ( ) represents the correction at redshift zero,and δ (∞) a parameter allowing the behaviour to changewith increasing redshift. We take ﬂat priors on δ ( ) and δ ( )/ δ (∞) . Finally, we assume that the observed massprobability ˆ m supplied to the model is perfectly deter-mined, and thus set P ( ˆ m | m ) = δ ( ˆ m − m ) .4.4.4. Uncertainty Propagation

The chief diﬃculty with including systematic uncer-tainties in supernova analyses is that they have diﬃcult-to-model eﬀects on the output observations. As such,the traditional treatment for systematics is to computetheir eﬀect on the supernova summary statistics – com-puting the numerical derivatives d ˆ m B d Z i , d ˆ x d Z i , d ˆ c d Z i for eachsupernova light curve ﬁt, where Z i represents the i th systematic.Assuming that the gradients can be linearly extrapo-lated – which is a reasonable approximation for modernsurveys with high quality control of systematics – wecan incorporate into our model a deviation from the ob-0served values by constructing a ( × N sys ) matrix contain-ing the numerical derivatives for the N sys systematics andmultiplying it with the row vector containing the oﬀsetfor each systematic. By scaling the gradient matrix torepresent the shift over σ of systematic uncertainty, wecan simply enforce a unit normal prior on the systematicrow vector to increase computational eﬃciency.This method of creating a secondary covariance ma-trix using partial derivatives is used throughout the tra-ditional and BHM analyses. For each survey and band,we have two systematics — the calibration uncertaintyand the ﬁlter wavelength uncertainty. We include thesein our approach, in addition to including HST Calspeccalibration uncertainty, ten SALT2 model systematicuncertainties, a dust systematic, a global redshift biassystematic, and also the systematic peculiar velocity un-certainty. A comprehensive explanation of all systemat-ics is given in Brout et al. (2019); see Table 4 for de-tails. This gives thirteen global systematics shared byall surveys that apply globally to all supernova sum-mary statistics, plus two systematics per band in eachsurvey. Systematics with known correlations are shiftedtogether to produce covariant deviations, and we thusassume that the numerical derivatives input into ourmodel represent independent systematics. Full detailscan be found in Brout et al. (2019). With η ≡ { m B , x , c } ,our initial conditional likelihood for our observed sum-mary statistics shown in Equation (6) becomes P (cid:18) ˆ η, ∂ ˆ η∂ Z i | η, δ Z i , C η (cid:19) = N (cid:18) ˆ η + δ Z i ∂ ˆ η∂ Z i | η, C η (cid:19) . (9)4.4.5. Selection Eﬀects

One large diﬀerence between traditional methods andBHM methods is that we treat selection eﬀects by in-corporating selection eﬃciency into our model, ratherthan relying on simulation-driven data corrections. Wedescribe the probability that the possible events weobserve are drawn from the distribution predicted bythe underlying theoretical model and that those events,given they happened, are observed and pass cuts. Tomake this extra conditional explicit, we can write thelikelihood of the data given an underlying model, θ , and that the data are included in our sample, denoted by S ,as L ( θ ; data ) = P ( data | θ, S ) . (10)As our model so far describes components of a basiclikelihood P ( data | θ ) , and we wish to formulate a func-tion P ( S | data , θ ) that describes the chance of an eventbeing successfully observed, we rearrange the likelihoodin terms of those functions and ﬁnd L ( θ ; data ) = P ( S | data , θ ) P ( data | θ ) ∫ P ( S | D , θ ) P ( D | θ ) dD , (11)where the denominator represents an integral over allpotential data D , and θ represents top-level parameters. In the case that our selection eﬀects are best charac-terised by latent variables instead of data, we can addthem to our formulation and our likelihood becomes L ( θ ; data ) = ∫ P ( S | L , θ ) P ( data | L ) P ( L | θ ) dL ∬ P ( S | L , θ ) P ( L | θ ) dL , (12)where L represents our latent parameters. This is de-rived in Appendix A.1. To evaluate the eﬀect of ourselection eﬀects, we need to evaluate both the selectioneﬀect terms in the numerator and the integral in thedenominator. The numerator represents the probabil-ity we caught the supernova and it was selected into thecosmology sample. The integral represents our global se-lection eﬃciency at a location in parameter space, ratherthan the probability of our data being selected into oursample. As θ represents the vector of all top-level modelparameters, and L represents a vector of all latent pa-rameters, this is not a trivial integral. Techniques toapproximate this integral, such as Monte-Carlo integra-tion or high-dimensional Gaussian processes failed togive tractable posterior surfaces that could be sampledeﬃciently by Hamiltonian Monte-Carlo, and post-ﬁttingimportance sampling failed due to high-dimensionality(a brief dismissal of many months of struggle). We there-fore simplify the integral and approximate the selectioneﬀects from their full expression in all of θ -space, toapparent magnitude and redshift space independently(not dependent on x or c ), such that the denominatorof equation (12), denoted now d for simplicity, is givenas d = ∫ (cid:20)∫ P ( S | m B ) P ( m B | z , θ ) dm B (cid:21) P ( S | z ) P ( z | θ ) dz , (13)where P ( m B | z , θ ) can be expressed by translating the un-derlying M B , x , and c population to m B given cosmo-logical parameters. A full derivation of this can be foundin Appendix A.2.We now apply two further approximations similar tothose made in R15 – that the redshift distribution ofthe observed supernovae reasonably well samples the P ( S | z ) P ( z | θ ) distribution, and that the survey color andstretch populations can be treated as Gaussian for thepurposes of evaluating P ( m B | z , θ ) . We found that dis-carding the color population skewness entirely resultedin highly biased population recovery (see Figure 12 tosee the populations), and so we instead characterise theskew normal color distribution with a Gaussian that fol-lows the mean and variance of a skew normal; with meangiven by (cid:104) c ( z )(cid:105) + (cid:113) π σ c δ c and variance σ c ( − δ c / π ) .This shifted Gaussian approximation for color com-pletely removes the unintended bias when simply dis-carding skewness. This shift was not required for thestretch population, and so was left out for the stretchpopulation for numerical reasons. The impact of thisapproximation on the calculated eﬃciency is shown inteve 11 . . . . . E ﬃ c i e n c y CorrectUnshiftedShifted

Shift . . . ∆ E ﬃ c i e n c y Figure 3.

Testing the correctness of our normal approx-imation to the skewed color distribution. The ‘correct’line (shown in black) represents the exact integral w = ∫ P ( S | x ) P ( x ) dx where P ( S | x ) is an error function (followingour high-redshift surveys) and P ( x ) = N Skew ( x , . , ) , calcu-lated numerically. The x -axis is analogous to m B is cosmo-logical context. As expected, all eﬃciencies drop towardszero as shift increases (as objects get fainter). The unshiftednormal approximation shows signiﬁcant discrepancy in thecalculated eﬃciency as it transitions from 1 to 0, whilst theshifted normal approximation shows negligible error to thecorrect solution. From these plots, further reﬁnement of thenormal approximation (such as including kurtosis or higherpowers) are unnecessary. Figure 3, and more detail on this shift and resultingpopulation recovery can be found in Appendix A.3.The population P ( m B | z , θ ) becomes N ( m B | m ∗ B ( z ) , σ ∗ m B ) ,where m ∗ B ( z ) = (cid:104) M B (cid:105) + µ ( z ) − α (cid:104) x ( z )(cid:105) + β (cid:104) c ( z )(cid:105) (14) σ ∗ m B = σ M B + ( ασ x ) + ( βσ c ) . (15)What then remains is determining the functional formof P ( S | m B ) . For the treatment of most surveys, we ﬁndthat the error function, which smoothly transitions fromsome constant eﬃciency down to zero, is suﬃcient. For-mally, this gives P ( S | m B ) = Φ c ( m B | µ CDF , σ

CDF ) , (16)where Φ c is the complementary cumulative distributionfunction and µ CDF and σ CDF specify the selection func-tion. The appropriateness of an error function has been found by many past surveys (Dilday et al. 2008; Barbaryet al. 2010; Perrett et al. 2012; Graur et al. 2013; Rod-ney et al. 2014). However, for surveys which suﬀer fromsaturation and thus rejection of low-redshift supernovae,or for groups of surveys treated together (as is commonto do with low-redshift surveys), we ﬁnd that a skewnormal is a better analytic form, taking the form P ( S | m B ) = N Skew ( m B | µ Skew , σ

Skew , α

Skew ) . (17)The selection functions are ﬁt to apparent magnitudeeﬃciency ratios calculated from SNANA simulations, bytaking the ratio of supernovae that are observed andpassed cuts over the total number of supernovae gen-erated in that apparent magnitude bin. That is, wecalculate the probability we would include a particularsupernova in our sample, divided by the number of suchsupernovae in our simulated ﬁelds. To take into accountthe uncertainty introduced by the imperfection of ouranalytic ﬁt to the eﬃciency ratio, uncertainty was uni-formly added in quadrature to the eﬃciency ratio datafrom our simulations until the reduced χ of the analyticﬁt reached one, allowing us to extract an uncertainty co-variance matrix for our analytic ﬁts to either the errorfunction or the skew normal. This is mathematicallyidentical to ﬁtting the eﬃciency ratio with a second ‘in-trinsic dispersion’ parameter which adds uncertainty tothe eﬃciency ratio data points.We thus include into our model parametrised selectioneﬀects by including the covariance matrix of selection ef-fect uncertainty. Formally, we include deviations fromthe determined mean selection function parameters withparameter vector δ S , and apply a normal prior on thisparameter as per the determined uncertainty covariancematrix. Whilst this uncertainty encapsulates the poten-tial error from the simulations not matching the analyticapproximations, it does not cover potential variations ofthe selection function at the top level — varying cos-mology or spectroscopic eﬃciency. Tests with changingthe intrinsic scatter model used in the selection eﬃciencysimulations show that the uncertainty introduced is neg-ligible.With the well-sampled redshift approximation we canremove the redshift integral in Eq (13) and replace itwith a correction for each observed supernova. For theerror function (denoted with the subscript ‘CDF’) andskew normal selection functions respectively (denotedwith a subscript ‘Skew’), the correction per SN Ia be-2comes d CDF = Φ c (cid:169)(cid:173)(cid:173)(cid:171) m ∗ B − µ CDF (cid:113) σ ∗ m B + σ (cid:170)(cid:174)(cid:174)(cid:172) (18) d Skew = N (cid:169)(cid:173)(cid:173)(cid:171) m ∗ B − µ Skew (cid:113) σ ∗ m B + σ (cid:170)(cid:174)(cid:174)(cid:172) × Φ (cid:169)(cid:173)(cid:173)(cid:173)(cid:173)(cid:171) sign ( α Skew )( m ∗ B − µ Skew ) σ ∗ mB + σ σ (cid:114) σ α + σ ∗ mB σ σ ∗ mB + σ (cid:170)(cid:174)(cid:174)(cid:174)(cid:174)(cid:172) , (19)and is incorporated into our likelihood. Note that theabove eﬃciencies utilise the common form of the normaldistribution rather than the conditional probability no-tation found previously in this work. This is illustratedin Figure 4. Our corrections for the DES spectroscopicdata utilise the CDF functional form, with the combinedlow redshift surveys being modelled with the skew nor-mal eﬃciency. Further details on this choice are givenin Section 5.2. 4.5. Model Summary

Having laid out each individual aspect of the model,the relationships between variables and our treatment ofuncertainty, here we summarise the relationships in ourmodel mathematically. In this summary of P ( data | θ ) ,we leave out sample selection for simplicity. Referringto equation (12), the relationships with P ( data | θ ) are asfollows: P ( data | θ ) ∝ ∫ d µ dm B dx dc dz , dm P ( ˆ z | z ) P ( ˆ m | m ) P ( ˆ m B , ˆ x , ˆ c , ˆ C − η | m B , x , c , δ Z i , κ , κ ) P ( x , c |(cid:104) x i (cid:105) , (cid:104) c i (cid:105) , σ x , σ c , z ) P ( m B |(cid:104) M B (cid:105) , µ, δ ( ) , δ (∞) , σ M B , σ x , σ c , α, β, z , m ) P ( µ | z , Ω m , w ) . (20)The denominator of equation (12) is then given byeither equation (18) or (19) depending on the survey,and similarly P ( S | data , θ ) is given by equation (17) or(16) respectively. Combining all of these gives us our fullmodel likelihood with selection eﬀects accounted for. MODEL VERIFICATIONIn order to verify our model we run it through strin-gent tests. First, we validate on toy models, verifyingthat we recover accurate cosmology when generating toysupernovae data constructed to satisfy the assumptionsof the BHM construction. We then validate our model . . . P r o b a b ili t y d CDF

SNIa pop: P ( m B | θ ) CDF eﬀ: P ( S | m B ) P ( S | m B ) P ( m B | θ )19 20 21 22 23 24 25 26 27 m B . . . P r o b a b ili t y d Skew

SNIa pop: P ( m B | θ ) Skew eﬀ: P ( S | m B ) P ( S | m B ) P ( m B | θ ) Figure 4.

The eﬃciency for supernova discovery at an ar-bitrary redshift. Shown in both panels in dashed blue isthe SN Ia population distribution, which takes the form of anormal distribution. The top panel shows a CDF based sur-vey eﬃciency (green dotted line), whilst the bottom panelshows a skew normal based survey eﬃciency (red dottedline), as functions of apparent magnitude. The survey ef-ﬁciency, given the SN Ia population, is shown as a solid linein both panels, and the probability of observing a SN Ia isfound by integrating over the population detection eﬃciencyas described in equation (13), and has been shown by shadingthe area integrated. This area is what is analytically givenby equations (18) and (19). on SNANA simulations based on a collection of low red-shift surveys and the DES three-year spectroscopic sam-ple, termed the DES-SN3YR sample5.1.

Applied to Toy Spectroscopic Data

We generate simple toy data to validate the basicpremise of the model. The data generation algorithmis described below:1. Draw a redshift from a power law distribution. Forthe low redshift survey this is U ( . , . ) . , andfor the DES-like survey this is U ( . , . ) . . Forthe low redshift survey, this is equivalent to sampling y = z from . to √ . , for the high redshift sur-vey, this is equivalent to sampling y = z from . to . . These distributions are arbitrary, and this testhas been performed with various ﬂat and power lawdistributions.teve 13 . . . . z − . − . − . − . − . m B − µ High-redshift .

02 0 .

04 0 .

06 0 .

08 0 . z − . − . − . − . − . m B − µ Low-redshift − . − . . . . . . C o l o u r − . − . . . . . . C o l o u r Figure 5.

Population distributions shown in redshift anduncorrected absolute magnitude m B − µ for 1000 supernovaein both high-redshift and low-redshift surveys. Selection ef-fects are visible in both samples, where red supernovae areoften cut as redshift increases, creating a skewed color pop-ulation. The color of the data points is representative of thesupernova color itself, a negative color value showing bluersupernovae, with positive color values representing reddersupernovae. Table 2.

Cosmological Parameters for toy supernova dataModel Ω m (cid:104) µ (cid:105) , (cid:104) σ (cid:105) (scatter) w (cid:104) µ (cid:105) , (cid:104) σ (cid:105) (scatter)Flat Λ CDM . , . ( . ) –Flat w CDM – − . , . ( . ) Note —Cosmological parameters determined from the sur-faces of 100 ﬁts to independent realisations of toy super-nova data. As described in the main text, each datasetcomprised 1000 low-redshift supernovae and 1000 high-redshift supernovae. For each chain, we record the meanand standard deviation, and then show the average meanand average standard deviation in the table. The scatterintroduced by simulation variance (the standard deviationof the 100 mean parameter values) is shown in brackets.Model bias would appear as shifts away from the simula-tion values of Ω m = . , w = − . . . . . Ω m . . . . Ω Λ Posterior MaximumsRepresentative Surface

Figure 6.

Maximal posterior points for 100 realisations ofsupernova data with the Flat Λ CDM model, with a repre-sentative contour from a single data realisation shown forcontext. Even a large supernova sample when treated ro-bustly is insuﬃcient to provide tight constraints on either Ω m or Ω Λ separately due to the severe degeneracy betweenthe parameters.

2. Draw a random mass probability from U ( , ) andcalculate the mass-brightness correction using δ ( ) = . , δ ( )/ δ (∞) = . , and equation (8).3. Draw an absolute magnitude, stretch and color fromthe respective distributions N (− . , . ) , N ( , ) , N ( , . ) .4. Calculate µ ( z ) given the drawn redshift and cosmo-logical parameters Ω m = . , w = − under Flat Λ CDM cosmology. Use this to determine the trueapparent magnitude of the object m B using m B = µ + M B − α x + β c .5. Determine if the SN Ia is detected using detec-tion probability P ( S | m B ) = N skew ( . , . , . ) for the low redshift survey (numeric values ob-tained by ﬁtting to existing low redshift data).For the DES-like survey, accept with probability P ( S | m B ) = Φ C ( . , . ) . Repeat from step oneuntil we have a supernova that passes. We use re-alistic values for the selection probability to ensureour model is numerically stable with highly skewedselection functions.6. Add independent, Gaussian observational error ontothe true m B , x , c using Gaussian widths of . , . ,4 . . . . . Ω m − . − . − . − . − . w Posterior MaximumsRepresentative Surface

Figure 7.

Maximal posterior points for 100 realisations ofsupernova data with the Flat w CDM model, with a repre-sentative contour from a single data realisation shown forcontext. The well known banana shaped contour is re-covered, with the marginalised distributions in Ω m and w shown to recover input cosmology. For contours that arenon-Gaussian due to the curved degeneracy between Ω m and w , the marginalised distributions can provide mislead-ing statistics where maximum marginalised distribution candisagree with the maximum likelihood in multiple dimen-sions. For our contours, the non-Gaussianity is small andthe marginalised distributions still provide a valuable met-ric. The recovered posteior maximums show the same de-generacy direction as the representative surface, and scatteraround the truth values input into the simulation, which areshown in dashed lines. . respectively (following the mean uncertainty forDES-like SNANA simulations). Add extra color un-certainty in quadrature of κ + κ z , where κ = κ = . .The selection functions parameters (a skew normalfor low-redshift and a complementary error function forhigh-redshift) are all given independent uncertainty of . (mean and width for the CDF selection function,and mean, width and skewness for the skew normal se-lection function). Draw from each survey simulation un-til we have 1000 low- z supernovae and 1000 DES-likesupernovae, representing a statistical sample of greaterpower than the estimated 350 supernovae for the DES-SN3YR sample. Sample data for 1000 high and low red-shift supernovae are shown in Figure 5, conﬁrming the presence of strong selection eﬀects in both toy surveys,as designed.We test four models: Flat Λ CDM, Flat w CDM, Λ CDM, and Flat w CDM with a prior Ω m ∼ N ( . , . ) ,with the latter included to allow sensitive tests on biasfor w . To achieve statistical precision, we ﬁt 100 realisa-tions of supernovae datasets. Cosmological parametersare recovered without signiﬁcant bias. Combined pos-terior surfaces of all 100 realisations ﬁts for Λ CDM areshown in Figure 6 and ﬁts for Flat w CDM are shown inFigure 7. By utilising the Stan framework and several ef-ﬁcient parametrisations (discussed further in AppendixB), ﬁts to these simulations of 2000 supernovae take onlyon order of a single CPU-hour to run.To investigate biases in the model in ﬁne detail, welook for systematic bias in Ω m in the Flat Λ CDM cos-mology test, and bias in w for the Flat w CDM test withstrong prior Ω m ∼ N ( . , . ) . This allows us to investi-gate biases without the investigative hindrances of non-Gaussian or truncated posterior surfaces. The strongprior on Ω m cuts a slice through the traditional ‘ba-nana’ posterior surface in the w - Ω m plane of Figure 7.Without making such a slice, the variation in w is largerdue to a shift along the degeneracy direction of the ‘ba-nana’. By focusing the slice at an almost ﬁxed Ω m , wecan see the variation in the mean value of w approxi-mately perpendicular to the lines of degeneracy, insteadof along them. The results of the analysis are detailed inTable 2, and demonstrate the performance of our modelin recovering the true cosmological parameters. As weare using 100 independent realisations, the precision ofour determination of the mean simulation result is ap-proximately a tenth of the quoted scatter (as a degreeof non-Gaussianity of our ﬁts will make this relation-ship inexact). The deviation from truth values is belowthis threshold, no signiﬁcant bias is detected in eitherthe Flat Λ CDM model or the Flat w CDM model. Withthis simple data, we also correctly recover underlyingsupernova populations, which can be seen in Figure 12.5.2.

DES SN data validation

Many BHM methods have previously been validatedon data constructed explicitly to validate the assump-tions of the model. This is a useful consistency checkthat the model implementation is correct, eﬃcient andfree of obvious pathologies. However, the real testof a model is its application to realistic datasets thatmimic expected observational data in as many possibleways. To this end, we test using simulations (using theSNANA package) that follow the observational scheduleand observing conditions for the DES and low- z sur-veys, where the low- z sample is based on observationsfrom CfA3 (Hicken et al. 2009a,b), CfA4 (Hicken et al.2012) and CSP (Contreras et al. 2010; Folatelli et al.2010; Stritzinger et al. 2011). Simulation speciﬁcs canbe found in Kessler et al. (2019). The primary diﬀer-ence between the toy data of the previous section areteve 15 Note —The SK16 low- z stretch distribution is formed as sumof two bifurcated Gaussians, with the mean and spread ofeach component given respectively. Table 3.

Supernova population distributionsModel (cid:104) x (cid:105) σ x (cid:104) c (cid:105) σ c SK16 low- z . & − . + . − . & + . − . − . + . − . SK16 DES . + . . − . + . . the diﬀerent underlying colour and stretch, inclusion ofspectroscopic data cuts and light curve cuts, and inclu-sion of intrinsic dispersion models.Prior analyses often treated intrinsic dispersion simplyas scatter in the underlying absolute magnitude of theunderlying population (Conley et al. 2011; Betoule et al.2014), but recent analyses require a more sophisticatedapproach. In our development of this model and testsof intrinsic dispersion, we analyse the eﬀects of two dif-ferent scatter models via simulations, the G10 and C11models described in Section 3. The G10 models dis-persion with 70% contribution from coherent variationand 30% from chromatic variation whilst the C11 modelhas 25% coherent scatter and 75% from chromatic varia-tion. These two broadband scatter models are convertedto spectral energy distribution models for use in simula-tions in Kessler et al. (2013a).In addition to the improvements in testing multiplescatter models, we also include peculiar velocities forthe low- z sample, and our full treatment of systematicsas detailed in Brout et al. (2019). Our simulated popula-tions are sourced from Scolnic & Kessler (2016, hereafterSK16) and shown in Table 5.2. Initial tests were alsodone with a second, Gaussian population with color andstretch populations centered on zero and with respectivewidth . and , however cosmological parameters werenot impacted by choice of the underlying population andwe continue using only the SK16 population for compu-tational eﬃciency. The selection eﬀects were quantiﬁedby comparing all the generated supernovae to those thatpass our cuts, as shown in Figure 8. It is from this sim-ulation that our analytic determination of the selectionfunctions for the low- z and DES survey are based. Werun two simulations to determine the eﬃciency using theG10 and C11 scatter models and ﬁnd no diﬀerence in thefunctional form of the Malmquist bias between the twomodels. Uncertainty on the analytic selection functionis incorporated into our ﬁts, mitigating the imperfectionof our analytic form by allowing it to vary in our ﬁts.Each realisation of simulated SN Ia light curves con-tains the SALT2 light-curve ﬁts and redshifts to 128 low- z supernovae, and 204 DES-like supernovae, such that

20 21 22 23 24 m B . . . . . . P r o b a b ili t y DES 3YR Spectroscopically ConﬁrmedFitted eﬃciencySimulation

13 14 15 16 17 m B . . . . . . . . . P r o b a b ili t y Low- z SampleFitted eﬃciencySimulation

Figure 8.

Fitting the selection function for both the DES3YR spectroscopically conﬁrmed supernova sample and thelow- z sample. Blue errorbars represent the eﬃciency cal-culated by determining the ratio of discovered to generatedsupernovae in apparent magnitude bins for SNANA simula-tions. The black line represents the best ﬁt analytic functionfor each sample, and the light grey lines surrounding the bestﬁt value represent random realisations of analytic functiontaking into account uncertainty on the best ﬁt value. the uncertainties found when combining chains is rep-resentative of the uncertainty in the DES-SN3YR sam-ple. As our primary focus is Dark Energy, we now focusspeciﬁcally on the Flat w CDM model with matter prior.Points of maximum posterior for 100 data realisationsare shown in Figure 9. The parameter bounds and biasesfor w are listed in Table 9, and secondary parameters areshown in Table 8.Table 9 shows that the G10 model is consistent with w = − , whilst the C11 model show evidence of biason w , scattering high. However, their deviation fromthe truth value represents a shift of approximately . σ when taking into account the uncertainty on ﬁts to w .6 Note —Standardisation parameters and base intrinsic scatter parameter results for the 100 ﬁts to G10 and C11 simulations. Weshow the average parameter mean and average standard deviation respectively, with the simulation scatter shown in brackets,such that each cell shows (cid:104) µ (cid:105) [(cid:104) σ (cid:105) ( scatter )] . The width of the intrinsic scatter ( σ m B ) does not have an input truth value as it isdetermined from the scatter model. Table 4.

Realistic simulation standardisation parametersModel α − α True β − β True (cid:104) M B (cid:105) − (cid:104) M B (cid:105) True σ DES m B σ low − zm B G10 Stat + Syst . [ . ( . )] . [ . ( . )] − . [ . ( . )] . [ . ( . )] . [ . ( . )] G10 Stat . [ . ( . )] . [ . ( . )] . [ . ( . )] . [ . ( . )] . [ . ( . )] C11 Stat + Syst . [ . ( . )] − . [ . ( . )] . [ . ( . )] . [ . ( . )] . [ . ( . )] C11 Stat . [ . ( . )] − . [ . ( . )] . [ . ( . )] . [ . ( . )] . [ . ( . )] Note —Investigating the combined 100 ﬁts to G10 and C11simulations, ﬁtting with both statistics only and also whenincluding systematics. The quoted value for w representsthe average mean of the ﬁts, with the average uncertaintybeing shown in square brackets and the simulation scatter(the standard deviation of the mean of 100 ﬁts) shown instandard brackets. The bias signiﬁcance represents our con-ﬁdence that the deviation in the mean w away from − is notdue to statistical ﬂuctuation. Table 5.

Realistic simulation determination of w Model w (cid:104) µ (cid:105) [(cid:104) σ w (cid:105) (scatter) ] w -BiasG10 Stat + Syst − . [ . ( . )] ( . ± . ) σ G10 Stat − . [ . ( . )] (− . ± . ) σ C11 Stat + Syst − . [ . ( . )] ( . ± . ) σ C11 Stat − . [ . ( . )] ( . ± . ) σ The bias is sub-dominant to both the size of the uncer-tainty for each ﬁt, and the scatter induced by statisticalvariance in the simulations. We also note that the sim-ulations do not vary cosmological parameters nor popu-lation. As our model does include uncertainty on thosevalues, the simulation scatter is expected to be less thanthe model uncertainty, and represents a minimum boundon permissible uncertainty values.Table 8 shows a clear diﬀerence in both β and σ m B across the G10 and C11 simulations. As expected, theC11 simulations recover a far smaller intrinsic magni-tude scatter, giving results of approximately . whencompared to the result of . for the G10 simulations.The extra smearing of the C11 model does not result ina signiﬁcantly biased β value compared to the averageuncertainty on β , with recovery of β ≈ . close to theinput truth value of . , however the β recovery for theG10 simulations is biased high, ﬁnding β ≈ . with an input of . . Interestingly, w -bias is only found forthe C11 simulations. A measure of the signiﬁcance ofthe parameter bias can be calculated by comparing thebias to a tenth of the scatter (as our Monte-Carlo esti-mate uncertainty is √ of the scatter). From this, wecan see that most biases are detected with high statisti-cal signiﬁcance due to the large number of simulationstested against.We investigate the cosmological bias and ﬁnd itssource to be a bias in the observed summary statistics(i.e. the ˆ m B , ˆ x , ˆ c output from SALT2 light curve ﬁt-ting), in addition to incorrect reported uncertainty onthe summary statistics. To conﬁrm this, we run twotests. The ﬁrst of which, we replace the SALT2-ﬁtted ˆ m B , ˆ x and ˆ c with random numbers drawn from a Gaus-sian centered on the true SALT2 m B , x and c val-ues with covariance as reported by initial light curveﬁts. With this test, both the G10 and C11 ﬁts recover w = − . exactly. Our second test aims to test ourmodel whilst allowing biases in the summary statisticsnot caused from intrinsic scatter through. That is, theﬁrst test ascertained that biases in the summary statis-tics are the cause of cosmological bias. It is thus impor-tant to determine the source of those biases; whetherthey are from intrinsic scatter model or another aspectof the simulation. To this end, we test a set of 100 simu-lations generated using an intrinsic dispersion model ofonly coherent magnitude scatter. We ﬁnd w = − . ,showing that the source of the biases in summary statis-tics is the underlying intrinsic scatter model. From this,the main challenge of improving our methodology is tohandle the fact that observational uncertainty reportedfrom ﬁtting the SALT2 model to light curves is incorrect,non-Gaussian and biased. Our current model and tech-niques can quantify the eﬀect of diﬀerent scatter mod-els on biasing the observed summary statistics, but be-ing unable to constrain the ‘correct’ (simulated) scattermodel in our model ﬁt means we cannot fully correct forthe bias introduced by an unknown scatter model.teve 17 − . − . − . − . w G10 Stat+SystG10 Stat . . . . α .

295 0 .

300 0 .

305 0 . Ω m . . . . β − . − . − . − . w .

135 0 .

150 0 .

165 0 . α − . − . − . − . w C11 Stat+SystC11 Stat . . . . α .

288 0 .

294 0 .

300 0 .

306 0 . Ω m . . . . β − . − . − . − . w .

135 0 .

150 0 .

165 0 . α Figure 9.

Maximum posterior points for 100 realisations ofsupernova data for two intrinsic dispersion models - the G10model for the top panel and the C11 model for the bottompanel. Points are shown for parameters Ω m , w , α and β , withthe other ﬁt parameters being marginalised over. As we areunable to fully correct observed summary statistics, a steprequired by the lack of intrinsic scatter in the SALT2 model,we expect to see an oﬀset in α and β . This in turn eﬀectscosmology, resulting in small biases in w . Note —Correlations determined from the combined 100 sim-ulation ﬁts. Correlations for the low- z band systematics andthe latent parameters representing selection function uncer-tainty are not shown but have negligible correlation. Zerosuperscripts indicate the Dark Energy Survey, and a super-script one represents the low- z survey. Table 6.

Reduced parameter correlations with w .Parameter G10 Stat+Syst C11 Stat+Syst Ω m − − α − − β − − (cid:104) M B (cid:105) δ ( ) δ (∞)/ δ ( ) σ B σ B σ x σ x σ c σ c α c − α c κ c − − κ c − − κ c − − κ c − (cid:104) x (cid:105) − − (cid:104) x (cid:105) − (cid:104) x (cid:105) − − (cid:104) x (cid:105) − − (cid:104) x (cid:105) − − (cid:104) x (cid:105) (cid:104) x (cid:105) (cid:104) x (cid:105) (cid:104) c (cid:105) − − (cid:104) c (cid:105) (cid:104) c (cid:105) (cid:104) c (cid:105) (cid:104) c (cid:105) − − (cid:104) c (cid:105) − − (cid:104) c (cid:105) − − (cid:104) c (cid:105) − − δ [ SALT ] δ [ SALT ] − δ [ SALT ] − − δ [ SALT ] − − δ [ SALT ] δ [ SALT ] δ [ SALT ] δ [ SALT ] − − δ [ SALT ] δ [ SALT ] δ [ MWE B − V ] δ [ HSTCalib ] − − δ [ v pec ] − δ [ δ z ] δ [ ∆ g ] δ [ ∆ r ] δ [ ∆ i ] − − δ [ ∆ z ] − − δ [ ∆ λ g ] δ [ ∆ λ r ] δ [ ∆ λ i ] − δ [ ∆ λ z ] m w M B m B m B x x c c c c c c c c ( ) () / ( ) x x x x x x x x c c c c c c c c m wM B B B x x c c c c c c c c (0)( )/ (0) x x x x x x x x c c c c c c c c Figure 10.

Parameter correlations for the combined ﬁts to the 100 G10 scatter model simulations. We see that the primarycorrelations with w enter through α , β and (cid:104) M B (cid:105) , as shown in Table 5. Whilst (cid:104) M B (cid:105) is generally thought to be a nuisanceparameter, we ﬁnd cosmological correlation. We note that, by ﬁxing H in our distance modulus calculation, (cid:104) M B (cid:105) absorbsany cosmological uncertainty on this term. Additionally (cid:104) M B (cid:105) also eﬀects the selection eﬃciency, which was computed fromsimulations with a ﬁxed M B value, introducing a second plausible source of correlation. Also visible in this ﬁgure are severalother interesting relationships. β is strongly anti-correlated with intrinsic dispersion σ m B for both surveys (DES-like and low- z ),with σ m B showing strong anti-correlation with κ c . This relationship is indeed expected — as κ c grows larger (more unexplaineddispersion on the color observation), the width of the supernova population in apparent magnitude space increases. As the ﬁtprefers it to conform to the observed width of the distribution, the extra width in color causes the inherent magnitude smearingamount to decrease. And with extra freedom on the observed color from κ c , β shifts in response. The other striking featurein the plot is the strong correlation blocks in the bottom right and the anti-correlation stripes on the edges. These too areexpected, for they show the relationship between the color distribution’s mean value, its width and its skewness. As skewness orpopulation width increases, the eﬀective mean of the population shifts (see Appendix A.3 for details), creating anti-correlationbetween skewness and the (Gaussian) mean color population. Strong anti-correlation between κ c and κ c with σ m B reveals thestrong population degeneracy, and – for the C11 simulation results – a constrained positive value shows that a ﬁnite non-zeroextra color dispersion is indeed preferred by our model. teve 19Unfortunately, adding extra ﬁt parameters to allowfor shifting observables washes out our ability to con-strain cosmology, and applying a speciﬁc bias correctionrequires running a ﬁducial simulation (assuming cosmol-ogy, population and scatter model), which presents diﬃ-culties when trying to account for correlations with pop-ulation and scatter model. This is compounded by thefact that bias corrections do not in general improve ﬁts(increase the log posterior), and so are diﬃcult to ﬁtparametrically. Works such as Kessler & Scolnic (2017)show that bias corrections can be applied to supernovaedatasets that can robustly handle multiple intrinsic scat-ter models, and future work will center on uniting thesemethodologies — incorporating better bias correctionsthat separate intrinsic scatter bias and non-Gaussiansummary statistic bias from Malmquist bias, withouthaving to precompute standardisation parameters andpopulations.Diﬃculty in providing an adequate parametrisationfor realistic intrinsic dispersion, and the simpliﬁcation ofMalmquist bias to only apparent magnitude also leadsto biases in the population parameters. To determineif the underlying population mischaracterisation was acause of cosmological bias, we ran ﬁts wherein the un-derlying population was ﬁxed to the known distributionsused for the simulation. These ﬁts did not change thebias in the C11 simulation. We conclude that the biasedpopulation recovery is not the cause of cosmological bias.As the population parameters recovered using the sim-plistic toy supernova data in the previous section do notexhibit signiﬁcant bias, future work will focus on intrin-sic dispersion and Malmquist bias rather than alternateparametrisations of the underlying supernova popula-tion.Table 5 lists the ﬁt correlations between our model ﬁtparameters (excluding the low- z band systematics, andMalmquist bias uncertainty parameters which had neg-ligible correlation), showing (in order) cosmological pa-rameters, standardisation parameters, population widthand skewness parameters, intrinsic dispersion parame-ters, mass-step parameters, population mean parame-ters, SALT2 model systematics, dust systematic, globalHST calibration systematic, peculiar velocity system-atic, global redshift systematic and DES band magni-tude and wavelength systematics. Figure 10 show thefull correlations between all non-systematic model pa-rameters. Other interesting correlations are shown anddiscussed in Figure 10. The band systematics for DESﬁlters g , r and i also show signiﬁcant correlation with w ,highlighting the importance of minimising instrumentaluncertainty.For the sample size of the DES + low- z supernovasamples (332 supernova), the bias from intrinsic scattermodels is sub-dominant to the statistical uncertainty,as shown in Figure 9. For our full systematics model,the bias represents a deviation between σ to . σ de-pending on scatter model, and given that they remain sub-dominant, we will leave more complicated treatmentof them for future work.5.3. Uncertainty Analysis

With the increased ﬂexibility of Bayesian hierarchi-cal models over traditional models, we expect to ﬁndan increased uncertainty on parameter inference. Thisincreased uncertainty is one of the strengths of hierarchi-cal models as it represents a more thorough accountingof model uncertainty. To characterise the inﬂuence ofthe extra degrees of freedom in our model, we analysethe uncertainty on w averaged across 10 nominal simu-lations of the DES-SN3YR sample with various modelparameters allowed to either vary or stay locked to aﬁxed value. By taking the diﬀerence in uncertainty inquadrature, we can infer the relative contribution foreach model feature to the uncertainty error budget.The error budget detailed in Table 5.3 shows that ouruncertainty is dominated by statistical error, as the totalstatistical uncertainty is on w is ± . . With the lownumber of supernovae in the DES-SN3YR sample, this isexpected. We note that the label ‘Systematics’ in Table5.3 represents all numerically computed systematics (asdiscussed in Section 4.4.4) and systematic uncertaintyon the selection function.5.4. Methodology Comparison

We compare the results of our model against those ofthe BBC+CosmoMC method (Kessler & Scolnic 2017).BBC+CosmoMC has been used in prior analyses, suchas the Pantheon sample analysis of Scolnic et al. (2017)and is being used in the primary analysis of the DES-SN3YR sample (DES Collaboration et al. 2019). TheBBC method is a two-part process, BBC computes biascorrections for observables, and then the corrected dis-tances are ﬁt using CosmoMC (Lewis & Bridle 2002).For shorthand, we refer to this combined process as theBBC method hereafter in this paper, as we are concernedwith the results of cosmological parameter inference. Asa leading supernova cosmology method, it provides agood consistency check as to the current levels of accu-racy in recovering cosmological parameters.To this end, we take the results of the BBC methodwhich were also run on the same set of 200 validationsimulations and compare the recovered w values to thoseof our method. The results are detailed in Table 8, anda scatter plot of the simulation results is presented inFigure 11.As shown in Brout et al. (2019), the BBC method re-covers cosmological parameters without bias so long asthe intrinsic scatter model is known. As we do not knowthe correct intrinsic scatter model, the BBC method av-erages the results when using bias corrections from G10and from C11. As such, we expect the BBC method tohave a w -bias in one direction for G10 simulations andthe other direction for C11 simulations. These resultsare consistent with those displayed in Table 8. Both0 Note —Error budget determined from analysing uncertainty on simulation data whilst progressively enabling model features.We start from the top of the table, only varying cosmological parameters Ω m and w , and then progressively unlock parametersand let them ﬁt as we progress down the table. The cumulative uncertainty shows the total uncertainty on w on the ﬁt for all,where the σ w term is derived by taking the quadrature diﬀerence in cumulative uncertainty as we progress. Table 7. w error budgetFeature Parameters σ w CumulativeCosmology Ω m , w α , β , (cid:104) M B (cid:105) , δ ( ) , δ (∞)/ δ ( ) κ , κ σ M B , σ c , σ x , α c (cid:104) c i (cid:105) , (cid:104) x , i (cid:105) δ Z i , δ S BBC method and

Steve are sensitive to the intrinsicscatter model, ﬁnding diﬀerences of ∼ . and . respectively in w when varying the scatter model. TheBBC method ﬁnds w biased low for G10 and w biasedhigh for C11 (by about ± . ), so taking the averageresult only results in a small bias of − . in w . Ourmethod shows a small improvement in the insensitivityto the intrinsic scatter model (having a decrease in dif-ference in w between the G10 and C11 models), ﬁndingno bias for G10 but a w biased high for C11. This de-crease in error is not statistically signiﬁcant as we havestatistical uncertainty of ∼ . for simulation reali-sations. The average bias over the two scatter models is + . , representing a larger bias than the BBC method.When comparing both the G10 and C11 set of sim-ulations independently, our model diﬀers from BBC inits average prediction of w by + . and + . re-spectively. For the G10 model this diﬀerence is a resultof bias in the BBC results, however for the C11 simu-lations this is a result of both bias from BBC, and alarger bias from our method. These results also allowus to state the expected values for w when run on theDES-SN3YR sample. When using Planck priors our un-certainty on w is reduced compared to using our simu-lation Gaussian prior on Ω m , shrinking the average w -diﬀerence from . to . . After factoring this into ouruncertainty, we expect our BHM method to, on average,recover w BHM = w BBC + . ± . .Having established that our method exhibits similarshifts in the recovery of w compared to BBC, futurework will focus on improving the parametrisation of in-trinsic scatter model into our framework, with the goalof minimising the eﬀect of the underlying scatter modelon the recovery of cosmology. CONCLUSIONSIn this paper we have outlined the creation of a hier-archical Bayesian model for supernova cosmology. The

Note —We characterise the bias on w using the 100 simula-tions for the G10 scatter model and 100 simulations for C11scatter model. We also show the results when combining theG10 and C11 models into a combined set of 200 simulations.The mean w value for our method and BBC are presented,along with the mean when averaging the diﬀerence betweenour method and BBC for each individual simulation. Aver-ages are computed giving each simulation sample the sameweight. In the model, ∆ represents Steve - BBC. The ﬁnalrow shows the scatter between Steve and BBC for the diﬀer-ent simulations. Table 8. w bias comparisonModel G10 C11 (G10 + C11)Steve (cid:104) w (cid:105) − . ± . − . ± . − . ± . BBC (cid:104) w (cid:105) − . ± . − . ± . − . ± . a ∆ (cid:104) w (cid:105) + . ± . + . ± . + . ± . ∆ σ w . ± .

004 0 . ± .

004 0 . ± . a This value is computed with each simulation having the sameweight. It disagrees with the value provided in Brout et al.(2019, Table 10, row 3) which uses inverse variance weightedaverages. We do not utilise this weight because the variance iscorrelated with the value of w due to the Ω m prior applied inthe ﬁtting process. We note that if inverse variance weightingis applied to both datasets, they both shift by ∆ w ≈ . , andthus the predicted diﬀerence between the BBC method and Steve remains the same. model takes into account selection eﬀects and their un-certainty, ﬁts underlying populations and standardis-ation parameters, incorporates unexplained dispersionfrom intrinsic scatter color smearing and incorporatesteve 21 w BB C w G10C11

Figure 11.

Recovered w for the 200 validation simulationswith full treatment of statistical and systematic errors. Un-certainty on the recovered w value is shown for every seconddata point for visual clarity. uncertainty from peculiar velocities, survey calibration,HST calibration, dust, a potential global redshift oﬀ-set, and SALT2 model uncertainty. Furthermore, ouruncertainties in standardisation, population, mass-stepand more, being explicitly parametrised in our model,are captured with covariance intact, an improvement onmany previous methods. The model has been optimisedto allow for hundreds of supernovae to be modelled fullywith latent parameters. It runs in under an hour of CPUtime and scales linearly with the number of supernovae,as opposed to polynomial complexity of matrix inversionof other methods.The importance of validating models using high-precision statistics gained by performing ﬁts to hun-dreds of data realisations cannot be overstated, howeverthis validation is lacking in many earlier BHM mod-els for supernova cosmology. We have validated thismodel against many realisations of simplistic simula-tions with well-known and well-deﬁned statistics, andfound no cosmological bias. When validating usingSNANA simulations, we ﬁnd evidence of cosmologicalbias which is traced back to light curve ﬁts reportingbiased observables and incorrect covariance. Allowingfully parametrised corrections on observed supernovaesummary statistics introduces too many degrees of free-dom and is found to make cosmology ﬁts too weak. Al-lowing simulation based corrections to vary in strengthis found to give minor reductions in w bias, however theuncertainty on the intrinsic scatter model itself limits the eﬃcacy of the bias corrections. For the data sizerepresented in the DES three-year spectroscopic survey,the determined biases should be sub-dominant to othersources of uncertainty, however this cannot be expectedfor future analyses with larger datasets. Stricter biascorrections calculated from simulations are required toreduce bias. Ideally, this would include further work onthe calculation of intrinsic dispersion of the type Ia su-pernova population such that we can better characterisethis bias.With our model being validated against hundreds ofsimulation realisations, representing a combined datasetover more than

60 000 simulated supernovae, we havebeen able to accurately determine biases in our modeland trace their origin. With the current biases beingsub-dominant to the total uncertainty, we now prepareto analyse the DES three-year dataset.ACKNOWLEDGEMENTSPlots of posterior surfaces and parameter summarieswere created with

ChainConsumer (Hinton 2016).Funding for the DES Projects has been provided bythe U.S. Department of Energy, the U.S. National Sci-ence Foundation, the Ministry of Science and Educationof Spain, the Science and Technology Facilities Councilof the United Kingdom, the Higher Education Fund-ing Council for England, the National Center for Su-percomputing Applications at the University of Illinoisat Urbana-Champaign, the Kavli Institute of Cosmo-logical Physics at the University of Chicago, the Cen-ter for Cosmology and Astro-Particle Physics at theOhio State University, the Mitchell Institute for Fun-damental Physics and Astronomy at Texas A&M Uni-versity, Financiadora de Estudos e Projetos, Funda¸c˜aoCarlos Chagas Filho de Amparo `a Pesquisa do Estado doRio de Janeiro, Conselho Nacional de DesenvolvimentoCient´ıﬁco e Tecnol´ogico and the Minist´erio da Ciˆencia,Tecnologia e Inova¸c˜ao, the Deutsche Forschungsgemein-schaft and the Collaborating Institutions in the DarkEnergy Survey.The Collaborating Institutions are Argonne NationalLaboratory, the University of California at Santa Cruz,the University of Cambridge, Centro de InvestigacionesEnerg´eticas, Medioambientales y Tecnol´ogicas-Madrid,the University of Chicago, University College Lon-don, the DES-Brazil Consortium, the University ofEdinburgh, the Eidgen¨ossische Technische Hochschule(ETH) Z¨urich, Fermi National Accelerator Laboratory,the University of Illinois at Urbana-Champaign, theInstitut de Ci`encies de l’Espai (IEEC/CSIC), the In-stitut de F´ısica d’Altes Energies, Lawrence BerkeleyNational Laboratory, the Ludwig-Maximilians Univer-sit¨at M¨unchen and the associated Excellence ClusterUniverse, the University of Michigan, the National Op-tical Astronomy Observatory, the University of Not-tingham, The Ohio State University, the University ofPennsylvania, the University of Portsmouth, SLAC Na-2tional Accelerator Laboratory, Stanford University, theUniversity of Sussex, Texas A&M University, and theOzDES Membership Consortium.Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical AstronomyObservatory, which is operated by the Association ofUniversities for Research in Astronomy (AURA) un-der a cooperative agreement with the National ScienceFoundation.The DES data management system is supported bythe National Science Foundation under Grant Num-bers AST-1138766 and AST-1536171. The DES par-ticipants from Spanish institutions are partially sup-ported by MINECO under grants AYA2015-71825,ESP2015-66861, FPA2015-68048, SEV-2016-0588, SEV-2016-0597, and MDM-2015-0509, some of which includeERDF funds from the European Union. IFAE is par-tially funded by the CERCA program of the Generali-tat de Catalunya. Research leading to these results has received funding from the European Research Coun-cil under the European Union’s Seventh FrameworkProgram (FP7/2007-2013) including ERC grant agree-ments 240672, 291329, and 306478. We acknowledgesupport from the Australian Research Council Centreof Excellence for All-sky Astrophysics (CAASTRO),through project number CE110001020, and the Brazil-ian Instituto Nacional de Ciˆencia e Tecnologia (INCT)e-Universe (CNPq grant 465376/2014-2).This manuscript has been authored by Fermi Re-search Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Oﬃceof Science, Oﬃce of High Energy Physics. The UnitedStates Government retains and the publisher, by ac-cepting the article for publication, acknowledges thatthe United States Government retains a non-exclusive,paid-up, irrevocable, world-wide license to publish or re-produce the published form of this manuscript, or allowothers to do so, for United States Government purposes.REFERENCES

Abbott T., et al., 2016, Monthly Notices of the RoyalAstronomical Society, 460, 1270Amanullah R., et al., 2010, The Astrophysical Journal, 716,712Astier P., et al., 2006, Astronomy and Astrophysics, 447, 31Bailey S., et al., 2008, eprint arXiv:0810.3499Balland C., et al., 2009, Astronomy and Astrophysics, 507,85Barbary K., et al., 2010, The Astrophysical Journal, 745, 27Bernstein J. P., et al., 2012, The Astrophysical Journal,753, 152Betoule M., et al., 2014, Astronomy & Astrophysics, 568, 32Brout D., et al., 2019, ApJ, 874, 150Carpenter B., et al., 2017, Journal of Statistical Software,76, 1Chambers K. C., et al., 2016, preprint,( arXiv:1612.05560 )Chotard N., et al., 2011, Astronomy & Astrophysics, 529, 6Conley A., et al., 2011, The Astrophysical JournalSupplement Series, 192, 1Contreras C., et al., 2010, The Astronomical Journal, 139,519D’Andrea C. B., et al., 2011, The Astrophysical Journal,743, 172DES Collaboration et al., 2019, ApJ, 872, L30Dilday B., et al., 2008, The Astrophysical Journal, 682, 262Folatelli G., et al., 2010, The Astronomical Journal, 139,120Freedman W. L., et al., 2009, The Astrophysical Journal,704, 1036Frieman J. A., et al., 2008, AJ, 135, 338 Graur O., et al., 2013, The Astrophysical Journal, 783, 28Gupta R. R., et al., 2011, ApJ, 740, 92Gupta R. R., et al., 2016, AJ, 152, 154Guy J., et al., 2007, Astronomy and Astrophysics, 466, 11Guy J., et al., 2010, Astronomy and Astrophysics, 523, 34Hicken M., et al., 2009a, The Astrophysical Journal, 700,331Hicken M., Wood-Vasey W. M., Blondin S., Challis P., JhaS., Kelly P. L., Rest A., Kirshner R. P., 2009b,Astrophysical Journal, 700, 1097Hicken M., et al., 2012, Astrophysical Journal, SupplementSeries, 200Hinton S., 2016, The Journal of Open Source Software, 1Hlozek R., et al., 2012, The Astrophysical Journal, 752, 79Huterer D., Shafer D. L., 2018, Dark energy two decadesafter: Observables, probes, consistency tests( arXiv:1709.01091 ), doi:10.1088/1361-6633/aa997e,http://arxiv.org/abs/1709.01091http://dx.doi.org/10.1088/1361-6633/aa997eIvezic Z., et al., 2008, eprint arXiv:0805.2366Jennings E., Wolf R., Sako M., 2016, eprintarXiv:1611.03087, pp 1–22Johansson J., et al., 2013, Monthly Notices of the RoyalAstronomical Society, 435, 1680Karpenka N. V., 2015, The supernova cosmology cookbook:Bayesian numerical recipes. ( arXiv:1503.03844 ),http://arxiv.org/abs/1503.03844Kelly P. L., Hicken M., Burke D. L., Mandel K. S., KirshnerR. P., 2010, The Astrophysical Journal, 715, 743Kessler R., Scolnic D., 2017, The Astrophysical Journal,836, 56 teve 23

Kessler R., et al., 2009a, Publications of the AstronomicalSociety of the Paciﬁc, 121, 1028Kessler R., et al., 2009b, Astrophysical Journal,Supplement Series, 185, 32Kessler R., et al., 2013a, ApJ, 764, 48Kessler R., et al., 2013b, The Astrophysical Journal, 764, 48Kessler R., et al., 2019, MNRAS, 485, 1171Kowalski M., et al., 2008, The Astrophysical Journal, 686,749Kunz M., Bassett B., Hlozek R., 2007, Physical Review D,75, 1LSST Science Collaboration et al., 2009, eprintarXiv:0912.0201Lampeitl H., et al., 2010, The Astrophysical Journal, 722,566Lewis A., Bridle S., 2002, PhRvD, 66, 103511Ma C., Corasaniti P.-S., Bassett B. A., 2016, MonthlyNotices of the Royal Astronomical Society, 463, 1651Malmquist K. G. 1922, Lund Medd. Ser. I, 100, 1Mandel K. S., Wood-Vasey W. M., Friedman A. S.,Kirshner R. P., 2009, The Astrophysical Journal, 704, 629Mandel K. S., Narayan G., Kirshner R. P., 2011, TheAstrophysical Journal, 731, 120Mandel K. S., Scolnic D., Shariﬀ H., Foley R. J., KirshnerR. P., 2017, The Astrophysical Journal, 842, 26March M. C., Trotta R., Berkes P., Starkman G. D.,Vaudrevange P. M., 2011, Monthly Notices of the RoyalAstronomical Society, 418, 2308March M. C., Karpenka N. V., Feroz F., Hobson M. P.,2014, Monthly Notices of the Royal AstronomicalSociety, 437, 3298Perlmutter S., et al., 1999, The Astrophysical Journal, 517,565Perrett K., et al., 2010, Astronomical Journal, 140, 518Perrett K., et al., 2012, The Astronomical Journal, 144, 59Phillips M. M., 1993, The Astrophysical Journal, 413, L105 Phillips M. M., Lira P., Suntzeﬀ N. B., Schommer R. A.,Hamuy M., Maza J., 1999, The Astronomical Journal,118, 1766Rest A., et al., 2014, The Astrophysical Journal, 795, 44Riess A. G., et al., 1998, The Astronomical Journal, 116,1009Rigault M., et al., 2013, Astronomy & Astrophysics, 560,A66Roberts E., Lochner M., Fonseca J., Bassett B. A.,Lablanche P.-Y., Agarwal S., 2017, eprintarXiv:1704.07830Rodney S. A., et al., 2014, The Astronomical Journal, 148,13Rubin D., et al., 2015, The Astrophysical Journal, 813, 15Sako M., et al., 2014, eprint arXiv:1401.3317Sako M., et al., 2018, PASP, 130, 064002Scolnic D., Kessler R., 2016, The Astrophysical JournalLetters, 822Scolnic D. M., et al., 2017, eprint arXiv:1710.00845Shariﬀ H., Jiao X., Trotta R., van Dyk D. A., 2016, TheAstrophysical Journal, 827, 1Stan Development Team 2017, PyStan: the interface toStan, http://mc-stan.org/Stritzinger M., et al., 2011, The Astronomical Journal, 142,14Sullivan M., et al., 2010, Monthly Notices of the RoyalAstronomical Society, 406, 782Suzuki N., et al., 2012, The Astrophysical Journal, 746, 85Tripp R., 1998, A two-parameter luminosity correction forType IA supernovae. Vol. 331, EDP Sciences [etc.], http://adsabs.harvard.edu/abs/1998A { % } A. SELECTION EFFECT DERIVATIONA.1.

General Selection Eﬀects

When formulating and ﬁtting a model using a constraining dataset, we wish to resolve the posterior surface deﬁnedby P ( θ | data ) ∝ P ( data | θ ) P ( θ ) , (A1)which gives the probability of the model parameter values ( θ ) given the data. Prior knowledge of the allowed valuesof the model parameters is encapsulated in the prior probability P ( θ ) . Of primary interest to us is the likelihood ofobserving the data given our parametrised model, L ≡ P ( data | θ ) . When dealing with experiments that have imperfectselection eﬃciency, our likelihood must take that eﬃciency into account. We need to describe the probability that theevents we observe are both drawn from the distribution predicted by the underlying theoretical model and that thoseevents, given they happened, are subsequently successfully observed. To make this extra conditional explicit, we writethe likelihood of the data given an underlying model, θ , and that the data are included in our sample, denoted by S ,as: L ( θ ; data ) = P ( data | θ, S ) . (A2)A variety of selection criteria are possible, and in our method we use our data in combination with the proposed modelto determine the probability of particular selection criteria. That is, we characterise a function P ( S | data , θ ) , whichcolloquially can be stated as the probability of a potential observation passing selection cuts, given our observationsand the underlying model . We can introduce this expression in a few lines due to symmetry of joint probabilities andutilising that P ( x , y , z ) = P ( x | y , z ) P ( y , z ) = P ( y | x , z ) P ( x , z ) : P ( data | S , θ ) P ( S , θ ) = P ( S | data , θ ) P ( data , θ ) (A3) P ( data | S , θ ) = P ( S | data , θ ) P ( data , θ ) P ( S , θ ) (A4) = P ( S | data , θ ) P ( data | θ ) P ( θ ) P ( S | θ ) P ( θ ) (A5) = P ( S | data , θ ) P ( data | θ ) P ( S | θ ) (A6)which is equal to the likelihood L . At this point, our derivation depends on whether our selection eﬀects are bestmodelled as a function of observables or as a function of latent parameters. In most cases, selection eﬀects will be bestmodelled directly as a function of observables, and we would introducing an integral over all possible events D , so wecan evaluate P ( S | θ ) , L ( θ ; data ) = P ( S | data , θ ) P ( data | θ ) ∫ P ( S , D | θ ) dD (A7) L ( θ ; data ) = P ( S | data , θ ) P ( data | θ ) ∫ P ( S | D , θ ) P ( D | θ ) dD . (A8)The second option, wherein selection eﬀects can be more computationally eﬃciently modelled with the inclusion oflatent parameters (for example, in the case where we do not have direct access to the observables upon which our dataselection is determined), we can introduce our latent parameters in addition to an integral over all possible data: L ( θ ; data ) = ∫ P ( S | data , L , θ ) P ( data | L ) P ( L | θ ) dL ∬ P ( S | D , L , θ ) P ( D | L , θ ) P ( L | θ ) dD dL , (A9)where L represents our latent parameters that model the true underlying values of our observables, such that ourdata is conditioned directly on them. In this formulation, the selection eﬀects can depend both on data and latentvariables. A.2. Supernova Selection Eﬀects

To turn the generalised equations from the previous sections into selection eﬀects relevant for this model, need tohighlight that θ represents only our top level parameters ( Ω m , w , α , β , etc), and that our parametrisation of trueteve 25underlying values (for example m B ) takes the form of latent parameters. We thus continue from equation (A9) andwrite out denominator d : d = ∬ P ( S | D , L , θ ) P ( D | L , θ ) P ( L | θ ) dD dL . (A10)In our formulation, we assume that our selection eﬀects can be suﬃciently encapsulated by latent parameters. Thatis, we simplify P ( S | D , L , θ ) → P ( S | L ) . This allows us to isolate our integral ∫ P ( D | L , θ ) in the above equation, integrateit to unity, and come to d = ∫ P ( S | L ) P ( L | θ ) dL . (A11)Here we can now move from the generic label L to something speciﬁc to our model. Speciﬁcally, we assume thatour selection eﬀects can be quantiﬁed using the true apparent magnitude m B and redshift z , and so L → { m B , z } , orformally P ( S | D , L , θ ) = P ( S | z ) P ( S | m B ) . We do not write out other latent parameters found in the model as they do notimpact selection eﬀects — we would simply ﬁnd them integrated to unity. Note that this assumption does not capturebiases engendered by Poisson noise ﬂuctuations. We advocate that future analyses with higher statistical precision douse a precisely determined P ( S | D ) . Writing out the latent variables, our denominator becomes d = ∫ P ( S | z ) P ( S | m B ) P ( z , m B | θ ) dz dm B . (A12)We can express P ( z , m B | θ ) as P ( m B | z , θ ) P ( z | θ ) , where the ﬁrst term requires us to calculate the magnitude distributionof our underlying population at a given redshift, and the second term is dependent on survey geometry and supernovaerates. We can thus state d = ∫ (cid:20)∫ P ( S | m B ) P ( m B | z , θ ) dm B (cid:21) P ( S | z ) P ( z | θ ) dz . (A13)By assuming that the distribution P ( S | z ) P ( z | θ ) is well sampled by the observed supernova redshifts, we can approximatethe integral over redshift by evaluating ∫ P ( S | m B ) P ( m B | z , θ ) dm B (A14)for each supernova in the dataset – i.e. Monte Carlo integration with assumed perfect importance sampling.As stated in Section 4.4.5, the underlying population in apparent magnitude, when we discard skewness, can berepresented as N ( m B | m ∗ B ( z ) , σ ∗ m B ) , where m ∗ B ( z ) = (cid:104) M B (cid:105) + µ ( z ) − α (cid:104) x ( z )(cid:105) + β (cid:32) (cid:104) c ( z )(cid:105) + (cid:114) π σ c δ c (cid:33) (A15) σ ∗ m B = σ M B + ( ασ x ) + (cid:169)(cid:173)(cid:171) βσ c (cid:115) − δ c π (cid:170)(cid:174)(cid:172) . (A16)Then, modelling P ( S | m B ) as either a normal or a skew normal, we can analytically perform the integral in equation(A14) and reach equations (18) and (19).A.3. Approximate Selection Eﬀects

In this section, we investigate the eﬀect of approximating the skew normal underlying color distribution as a normal.Speciﬁcally, equations (A15) and (A16) make the assumption that, for our color distribution, N Skew ( µ, σ, α ) is wellapproximated by N ( µ, σ ) . We sought to improve on this approximation by adjusting the mean and standard deviationof the approximated normal to more accurately describe the actual mean and standard deviation of a skew normal.With δ ≡ α /√ + α , the correct mean and standard deviation are µ = µ + (cid:114) π δσ (A17) σ = σ (cid:114) − δ π , (A18)where we highlight that µ here represents the mean of the distribution, not distance modulus. We can then testthe approximation N Skew ( µ , σ , α ) → N ( µ , σ ) . Unfortunately, this shift to the mean and standard deviation of the6 − . − . − . w .

075 0 .

100 0 . σ c .

075 0 .

100 0 .

125 0 . σ c α c α c − .

06 0 .

00 0 . h c i − .

05 0 .

00 0 . h c i − .

06 0 .

00 0 . h c i − .

08 0 .

00 0 .

08 0 . h c i − .

05 0 .

00 0 . h c i − .

05 0 .

00 0 . h c i − .

08 0 .

00 0 . h c i − . . . h c i Figure 12.

Marginalised probability distributions for 100 realisations of cosmology, ﬁt to Flat w CDM with prior Ω m ∼ N ( . , . ) , each containing 1000 simulated high- z and 1000 simulated low- z supernovae. The dashed green surfaces represent aﬁt to an underlying Gaussian color population with the unshifted model. The blue solid surface represents ﬁts to a skewed colorpopulation with the unshifted model, and the purple dotted surface represents a ﬁt to a skewed color population with the shiftedmodel. The superscript and denote the two diﬀerent surveys (high- z and low- z respectively), and similarly the ﬁrst four (cid:104) c i (cid:105) parameters represent the four redshift nodes in the high- z survey, and the last four represent the nodes for the low- z survey.We can see that the shifted model is far better able to recover skewed input populations than the unshifted, performing betterin terms of recovering skewness α c , mean color (cid:104) c (cid:105) and width of the color distribution σ c . The unshifted model recovers thecorrect color mean and width if you approximate a skew normal as a normal: ∆ µ = (cid:112) / πσ c δ c ≈ . , which is approximatelythe deviation found in ﬁts to the color population mean. Importantly, the unshifted model when run on skewed data (the solidblue) shows extreme bias in α c , where it ﬁts strongly around zero regardless, showing it to be a poor approximation. Based onthese results and the good performance in correctly recovering underlying populations of the shifted normal approximation, weadopt the shifted normal approximation in our model. normal approximation where we treat m B , x , and c as a multivariate skew normal did not produce stable posteriorsurfaces. Due to this, we treat the underlying m B , x , and c populations as independent.We tested a ﬁxed σ c in the shift correction, such that µ = µ + (cid:112) / πδ k , where we set k = . to mirror thewidth of the input simulation population. This resulted in stable posterior surfaces, however this introduced recoverybias in several population parameters, and so we do not ﬁx σ c . Comparing whether we shift our normal in theapproximation or simply discard skewness, Figure 3 shows that the calculated eﬃciency is signiﬁcantly discrepant tothe actual eﬃciency if the normal approximation is not shifted. The biases when using shifted or unshifted normalapproximations when we ﬁt our model on Gaussian and skewed underlying populations are shown in Figure 12, andonly the shifted normal approximation correctly recovers underlying population parameters.teve 27 B. NUMERICAL OPTIMISATIONSNot many ﬁtting methodologies and algorithms can handle the thousands of ﬁt parameters our model requires.By using Stan, we are able to take advantage automatic diﬀerentiation and the NUTS sampler, which is a class ofHamiltonian Monte Carlo samplers. Even with these advantages, early implementations of our model still had excessiveﬁt times, with our desired sub-hour running time far exceeded.The simplest and most commonly found optimisation we employed was to precompute as much as possible. Thisis in a bid to reduce the complexity of the mathematical graph our model is translated into by Stan, to simplify thecomputation of surface derivatives. For example, when computing the distance modulus, redshift is encountered tovarious powers. Instead of computing those powers in Stan, we simply pass in several arrays of redshift values alreadyraised to the correct power. Small changes like this however only give small improvements.The primary numerical improvement we made on existing frameworks was to remove costly probability evaluationsof multivariate normals. To increase eﬃciency, the optimum way to sample a multivariate normal is to reparameteriseit such that instead of sampling N ( (cid:174) x | (cid:174) µ, Σ ) , we sample N ( (cid:174) δ | , ) where (cid:174) x = (cid:174) µ + L (cid:174) δ and L is the cholesky decompositionof Σ . In this way, we can eﬃciently sample the unit normal probability distribution instead of sampling a multivariatenormal probability distribution. Switching to this parametrisation resulted in a computational increase of an order ofmagnitude, taking ﬁts for a sample of approximately 500 supernovae from roughly four hours down to thirty minutes.This parametrisation does come with one signiﬁcant downside — inﬂexibility. For each step the algorithm takes,we do not recompute the cholesky decomposition of the covariance of the summary statistics — that happens onceat the beginning of the model setup. If we had kept the full covariance matrix parametrisation we could modify thematrix easily — for example when incorporating intrinsic dispersion we could simply add on a secondary matrix tocreate an updated covariance. However as the cholesky decomposition of a sum of matrices is not equal to the sum ofthe cholesky decomposition of each individual matrix, we would need to recompute the decomposition for each step,which discards most of the computational beneﬁt just gained.Considering a × matrix with cholesky decomposition L = (cid:169)(cid:173)(cid:173)(cid:171) a b c d e f (cid:170)(cid:174)(cid:174)(cid:172) , (B19)the original covariance matrix Σ is given by Σ = (cid:169)(cid:173)(cid:173)(cid:171) a ab adab b + c bd + cead bd + ce d + e + f (cid:170)(cid:174)(cid:174)(cid:172) . (B20)Now, the primary source of extra uncertainty in the intrinsic dispersion models comes from chromatic smearing,which primarily inﬂuences the recovered color parameter, which is placed as the last element in the observables vector { m B , x , c } . We can now see that it is possible to add extra uncertainty to the color observation on the diagonalwithout having to recompute the cholesky decomposition - notice that f is unique in that it is the only element of L that appears in only one position in the covariance matrix. To take our covariance and add on the diagonal uncertaintyfor color an extra σ e term, we get C = (cid:169)(cid:173)(cid:173)(cid:171) σ m B ρ , σ m B σ x ρ , σ m B σ c ρ , σ m B σ x σ x ρ , σ x σ c ρ , σ m B σ c ρ , σ x σ c σ c + σ e (cid:170)(cid:174)(cid:174)(cid:172) . (B21)The cholesky decomposition of this is, in terms of the original cholesky decomposition, is L = (cid:169)(cid:173)(cid:173)(cid:171) a b c d e f + g (cid:170)(cid:174)(cid:174)(cid:172) , (B22)where g = (cid:112) f + σ e − f . This allows an easy update to the cholesky decomposition to add extra uncertainty to theindependent color uncertainty. For both the G10 and C11 models, we ran ﬁts without the cholesky parametrisationto allow for extra correlated dispersion (instead of just dispersion on cc