Report from the Tri-Agency Cosmological Simulation Task Force
Nick Battaglia, Andrew Benson, Tim Eifler, Andrew Hearin, Katrin Heitmann, Shirley Ho, Alina Kiessling, Zarija Lukić, Michael Schneider, Elena Sellentin, Joachim Stadel

Affiliations: Cornell University, Ithaca, NY 14853, USA; Carnegie Observatories, Pasadena, CA 91101, USA; Steward Observatory, University of Arizona, Tucson, AZ 85721, USA; Argonne National Laboratory, Lemont, IL 60439, USA; Flatiron Institute, New York, NY 10010, USA; Jet Propulsion Laboratory/California Institute of Technology, Pasadena, CA 91109, USA; Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Lawrence Livermore National Laboratory, Livermore, CA 94550, USA; Leiden Observatory, Leiden NL-2333, The Netherlands; University of Zurich, Zurich 8057, Switzerland; Princeton University, Princeton, NJ 08540, USA; Carnegie Mellon University, Pittsburgh, PA 15213, USA
Foreword
The Tri-Agency Cosmological Simulations (TACS) Task Force was formed when Program Managers from the Department of Energy (DOE), the National Aeronautics and Space Administration (NASA), and the National Science Foundation (NSF) expressed an interest in receiving input on the cosmological simulations landscape related to the upcoming DOE/NSF Vera Rubin Observatory (Rubin), NASA/ESA's Euclid, and NASA's Wide Field Infrared Survey Telescope (WFIRST). The Co-Chairs of TACS, Katrin Heitmann and Alina Kiessling, invited community scientists from the USA and Europe who are each subject matter experts and are also members of one or more of the surveys to contribute. The following report represents the input from TACS that was delivered to the Agencies in December 2018.
Executive Summary
Upcoming wide-field surveys require extensive numerical simulations for a number of interrelated tasks, including carrying out the simulations, transforming them into synthetic sky maps, validating the results, and serving the data in an easily accessible way. These are all major efforts involving large computing and storage resources as well as people with specialized expertise to develop the modeling and analysis pipelines and database approaches. Many of the tasks are common between the major cosmological surveys and it is therefore strongly advisable to evaluate common approaches and resource sharing between the surveys. Additionally, investigations of scientific gains that can be reaped from joint pixel analysis efforts have been initiated; such investigations rely on the availability of shared synthetic catalogs that can be used across the surveys and are based on the same underlying cosmological simulations.

Here we report on our findings regarding common generation, use, and curation of cosmological simulation data products for the Vera Rubin Observatory (Rubin) Dark Energy Science Collaboration (DESC), the Wide Field Infrared Survey Telescope (WFIRST), and Euclid, as well as possibilities to share in a common computational and storage infrastructure. We describe the use of extreme-scale simulations (defined as simulations that require very large computational resources) as well as simulation suites that enable covariance estimation and exploration of the cosmological and physical modeling parameter space. We then discuss different methods for generating synthetic sky maps from gravity-only simulations. While some aspects of these methods must be tailored to the survey at hand, the general concepts that are being developed are similar and sharing resources to develop those concepts would be natural. Next we describe the biggest challenge facing cosmology in the coming decade – understanding systematic effects and disentangling them from cosmological information. In this report we focus on systematic effects that can be addressed by simulation efforts and are common across the three surveys. Another area where joint efforts can be fruitful is in the exploration of statistical methods. Here we discuss a range of topics from next generation prediction tools to covariances. Finally, we describe the advantages of a common infrastructure to share simulation products.

Our report should make it clear that providing joint resources between the surveys will enable efficient development and sharing of simulations and related analysis tools. The current support for a program of this nature is not well established since often these activities are viewed as infrastructure tasks rather than as a broader research and development activity. Consequently, funding that, in particular, supports work across surveys (and therefore funding Agencies) is sparse. However, we also emphasize that survey-specific work still needs to be supported as well. In order to make this distinction between cross-survey and survey-specific work very clear, each subsection of our report ends with a summary of our findings with particular focus on how cross-survey support would strengthen the specific research and development described.

Motivation
Cosmological simulations have become increasingly sophisticated over the last several decades and their role in cosmological surveys has correspondingly experienced enormous growth. Cosmological simulations are now integral to forecasting and survey formulation, in addition to the eventual analysis of the observational data. The shift from Stage 3 to Stage 4 cosmological surveys has been underway for the last several years, and during this time the role of cosmological simulations in the surveys has undergone a shift from being a research and development (R&D) effort to being a key element of the Stage 4 survey infrastructure. Elements that are considered part of the survey infrastructure are deemed as essential to the success of the survey and have traditionally included efforts like ground operations, analysis pipelines, and data management pipeline development, but not cosmological simulations. However, although it is currently widely accepted that cosmological simulations are essential to upcoming Stage 4 surveys, the funding and support for these efforts is still largely only being covered by competitively selected R&D proposals. As a result, key work is difficult to undertake in a timely or planned manner due to the uncertainty of proposal selection. This has resulted in efforts to date being limited to the few groups that have been successful in securing short-term funding and resources for very specific tasks.

Added to this challenge is the reality that students and postdocs working on cosmological simulations and synthetic sky generation have historically had very little success in securing permanent jobs in the field. Consequently, the number of people available to contribute to these efforts is consistently low and the "next generation" are being lost to more secure jobs in data science. While we do not attempt to solve this issue directly in this document, we did want to highlight it as a pervasive problem that is deserving of more focused consideration in the future.

The purpose of this document is to clearly detail cosmological simulation efforts that are essential to the success of the upcoming Stage 4 cosmological surveys from the Vera Rubin Observatory (Rubin), the Wide Field Infrared Survey Telescope (WFIRST), and Euclid. The document also highlights work that is still required and focuses on collaborative efforts that will benefit two or more of the surveys. A focused collaboration between the surveys and Agencies will enable the most efficient use of resources and will facilitate rapid development in key areas that are currently experiencing only moderate progress due to a lack of support. It is important to stress that such a collaboration does need additional support that is currently not available.

The document begins by introducing "extreme-scale" simulations followed by large simulation campaigns, which are the two primary classes of simulation required for upcoming cosmological surveys. Next, the generation of synthetic sky maps and the challenges to this effort are discussed in detail, followed by an analysis of how simulations are essential to investigating and mitigating systematic effects. The role of simulations in developing advanced statistical techniques is then investigated, and the document concludes by presenting an argument for the development of a common infrastructure to share simulation products.
Extreme-Scale Simulations

In this section we discuss what we call "extreme-scale simulations" – very large, high-resolution N-body simulations ("grand challenge" simulations) that form the basis for synthetic sky maps, and very detailed, large hydrodynamic simulations that are important to advance our understanding of astrophysics systematics. These simulations require major computing allocations (in the U.S., for example, the DOE INCITE – Innovative and Novel Computational Impact on Theory and Experiment – Program provides opportunities to apply for such allocations at the Leadership Computing Facilities) in the tens of millions of hours (exact numbers depend on the supercomputer) and access to a supercomputer with a performance close to the top 10 supercomputers in the world. By the very definition of extreme-scale, only a handful of these simulations will be available in the world at any one time due to the high cost of running the simulations and storing the outputs. We first discuss gravity-only simulations and focus on questions concerning the construction of synthetic catalogs for cosmological surveys. The details on how the catalogs are built are given in Section 4, and some of the requirements listed here are justified in that section in more detail. Next we discuss hydrodynamic simulations for cosmological surveys. This area still requires significantly more development to ensure that the physics in the simulations is captured correctly and to enable simulations of cosmological volumes at sufficient resolution. Detailed use-cases of the hydrodynamic simulations in the context of cosmological surveys are discussed in Section 5.

Gravity-Only Simulations

We start by summarizing the requirements from the three surveys, focusing on volume and mass resolution for gravity-only simulations. Next, we list the outputs obtained after an initial analysis step to enable the investigation of the different cosmological probes targeted by the surveys. In this case, the requirements are essentially the same for all surveys – if there are specific needs for a subset of the surveys, these will be explicitly listed. Currently, two codes are actively being developed within the cosmology community that can carry out these "base" simulations at the needed resolution: PKDGRAV3, described in Potter et al. [2017], and HACC (Hardware/Hybrid Accelerated Cosmology Code), described in Habib et al. [2016]. While there are obviously many more simulation codes, most of them do not scale to the largest supercomputers currently available (scaling is a prerequisite when applying for large supercomputer allocations at programs such as the INCITE Program) or cannot take advantage of the architectures at all (e.g. only very few N-body codes can currently run on GPU-accelerated systems). In addition, in order to carry out these large simulations, the memory footprint of the codes has to be optimized to enable the simulation of trillions and more particles. The development of these kinds of scalable codes, including analysis tools that can be run on the fly to minimize storage requirements, usually takes major development efforts that continue over many years to adjust to new computing hardware. Both PKDGRAV3 and HACC can take full advantage of current accelerated supercomputing architectures. The remainder of this section describes the already available simulations and possible future simulations, including plans going beyond ΛCDM.
The exact requirements and outputs needed to generate synthetic sky maps depend strongly on the method used to populate the simulation with the objects of interest, as described in Section 4, and on the targeted wavelength. In this report we focus on optical and near-IR surveys, even though for cross-correlation tasks, other wavelengths are an important target for synthetic sky maps as well. With the type of simulations described in this section, galaxy modeling approaches focus on using the galaxy-halo connection rather than methods that identify galaxies with single particles. Given the depth of the surveys of interest and the desire to resolve halos with a minimum number of particles to host a galaxy, the simulation requirements are rather demanding.

First, we briefly describe the requirements with regard to volume and mass resolution for the simulations that underlie detailed synthetic sky catalogs as required by Rubin, Euclid, and WFIRST. All three surveys are exceptionally deep and/or wide and therefore require large volume simulations to capture the complete survey volume out to the desired redshifts. At the same time, the resolution of the dim galaxies that will be captured in the surveys requires the simulation to have a very high particle mass resolution. The volume for these simulations should not be less than 3 Gpc on a side to avoid too many replications when creating light cones at high redshifts. Optimally, the volume covered would be around 5 Gpc, which is difficult to reach with currently available supercomputers at the required mass resolution. As we discuss further below, most methods for generating synthetic catalogs rely on the identification of subhalos and on finding halos down to low masses. Therefore the (minimal) particle mass resolution requirement for an extreme-scale simulation is around 10^9 M⊙; more optimal would be 10^8 M⊙, which is, however, difficult to reach in large volume simulations. Recently developed methods to generate synthetic catalogs for Rubin-DESC (Dark Energy Science Collaboration) employ a hybrid approach that uses results from smaller volume simulations with mass resolutions around 10^8 M⊙ to populate halo catalogs with less well resolved halos.

Next, we describe the data products that are extracted from the N-body simulations to prepare the creation of synthetic catalogs. These catalogs include galaxy properties as well as weak lensing shear measurements. In order of increasing complexity, the products typically include:

1. Particle lightcone data: These are required for the generation of shear maps. In principle, they can be constructed after the simulations finish if enough particle snapshots are saved. In practice, the amount of data that would need to be stored from these large runs is in the few-PByte range, which is very challenging for most computing centers. Therefore, it is preferable to generate the lightcones on the fly (see the sketch following this list). We discuss some more technical questions about this below.

2. Dark matter halo positions and masses: These are required by all synthetic sky modeling approaches considered here (with the exception of hydrodynamic simulations). A variety of methods have been developed to identify and characterize halos in N-body simulations. Beyond differences in halo mass arising from the somewhat arbitrary definition of a halo, properties of halos (e.g. mass, position, maximum velocity of the rotation curve) are robustly determined by almost all halo finders (see Knebe et al. [2011] for an extensive halo finder comparison), except under conditions of major mergers [Behroozi et al., 2015], where careful consideration of the algorithm used is required. Given the need to push base simulations to the limits of current supercomputing facilities' abilities to achieve survey simulation goals, there remains a fundamental limit to the accuracy with which halo properties can be recovered due to finite particle sampling [Trenti et al., 2010, Benson, 2017].

3. Dark matter subhalo positions and masses: Subhalo masses are not required by classic halo occupation distribution (HOD) approaches, but are required by both subhalo abundance matching (SHAM) and semi-analytic model (SAM) approaches. As with halos, a variety of tools have been developed to identify and quantify subhalos in N-body simulations. In general, subhalo masses are determined robustly down to around 100 particles, with subhalo detection robust down to as few as 20 particles [Onions et al., 2012]. Other important quantities, such as subhalo spins [Onions et al., 2013], are determined robustly, while the spatial distribution of subhalos displays discrepancies at the 5–10% level between different finder algorithms [Pujol et al., 2014].

4. Merger trees: Providing the linkage between halos across time, merger trees are required by all SAMs and by some empirical models. Construction of merger trees is non-trivial and requires careful consideration of how to identify progenitor/descendant halos across multiple snapshots of the base simulation [Srisawat et al., 2013, Wang et al., 2016], and relies crucially on the properties of the input halo/subhalo catalogs [Avila et al., 2014].

5. Halo shapes: While halo shapes (i.e. the departure of the halo from spherical symmetry) are not directly required by any SAM or empirical model that we know of, they are often used to assign position angles to galaxies – an important consideration for weak lensing studies, which must assess the importance of intrinsic alignments [Kiessling et al., 2015]. Shape determination is known to be affected by the number of particles with which a halo is resolved in the base simulation [Schneider et al., 2012a], but the consequences of this for synthetic sky simulations have not been assessed.

(Subhalo information is required by most SAMs, although some are able to operate without it, in which case they either provide no information on the spatial distribution of galaxies within each halo, beyond simply identifying one galaxy as the central, or determine this information by integrating subhalo orbits directly.)

Alternative methods, such as using a measurement of the local tidal field smoothed on scales of 300 kpc or larger, are currently being developed and would provide an approach that relies on the simulation raw output particle data instead of halo information.

The details of the merger tree construction algorithm can have significant effects on the properties of the resulting galaxies [Lee et al., 2014]. While these systematic effects can often be "calibrated out" by adjusting the parameters of the galaxy model, calibration is expensive, and limits the applicability of the model to a single combination of base simulation, halo finder, and tree builder – this may be limiting if a model is to be applied to synthetic sky catalog generation for multiple surveys.

Finally, the finite number of discrete particle snapshots of the base simulations that are typically stored can have consequences for the resulting galaxy catalogs. Benson et al. [2012] emphasize the need for a minimum number of snapshots to ensure that galaxy properties are converged (a requirement that becomes more problematic at high redshifts, where fewer outputs prior to that redshift are available). The discreteness of base simulation snapshots also propagates into the construction of lightcone catalogs – in which galaxy properties are output at the epoch at which the galaxy crosses the past lightcone of an "observer" – in the positions, physical properties, and observable properties (e.g. SEDs) of galaxies. While these discreteness effects can be minimized through interpolation [Merson et al., 2013], they are difficult to fully mitigate. (Alternatively, more snapshots of the base simulation can simply be output, although this is limited by available data storage and can lead to challenges in building merger trees [Wang et al., 2016].) In the Euclid Flagship simulation (see Potter et al. [2017] and the description in Section 2.2.2) this problem was overcome by generating a particle light cone on the fly (while the simulation was running) and finding halos directly using the particle lightcone data. This approach also addresses some of the storage concerns (since a large number of full raw particle snapshots does not need to be stored), though it also restricts the available number of light cones (if the lightcones are generated in post-processing, the observer can be placed in many locations and therefore many light cones can be generated).
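The lightcone crossing and interpolation logic discussed above can be illustrated with a short sketch. This is a minimal toy version, not any survey's production pipeline: it assumes a flat ΛCDM background, an observer at the origin, and linear motion between two snapshots; all function names and parameter values are illustrative.

```python
# Minimal sketch: place objects on an observer's past light cone by
# interpolating between two adjacent snapshots. Flat LCDM assumed;
# all values are illustrative, not survey settings.
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458                      # speed of light [km/s]
H0, OMEGA_M, OMEGA_L = 67.0, 0.32, 0.68  # assumed background cosmology

def comoving_distance(z):
    """Comoving distance [Mpc] to redshift z for flat LCDM."""
    inv_e = lambda zp: 1.0 / np.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)
    return (C_KM_S / H0) * quad(inv_e, 0.0, z)[0]

def lightcone_crossings(pos_a, pos_b, z_a, z_b):
    """Interpolate objects onto the past light cone of an observer at the
    origin. pos_a, pos_b: (N, 3) comoving positions [Mpc] at an earlier
    snapshot (z_a) and a later one (z_b < z_a)."""
    r_a, r_b = comoving_distance(z_a), comoving_distance(z_b)
    d_a = np.linalg.norm(pos_a, axis=1)
    d_b = np.linalg.norm(pos_b, axis=1)
    # The light-cone shell shrinks from r_a to r_b as time advances; an
    # object is swept across if it is inside the shell at z_a and outside
    # it at z_b.
    crossed = (d_a <= r_a) & (d_b >= r_b)
    # Linear interpolation fraction f solving d(f) = r(f).
    denom = (d_b - d_a) - (r_b - r_a)
    safe = np.where(denom == 0.0, 1.0, denom)
    f = np.clip((r_a - d_a) / safe, 0.0, 1.0)
    return pos_a[crossed] + f[crossed][:, None] * (pos_b[crossed] - pos_a[crossed])

# Toy usage: slowly moving objects between snapshots at z = 0.55 and 0.45.
rng = np.random.default_rng(42)
pos = rng.uniform(-2500.0, 2500.0, size=(1000, 3))
print(lightcone_crossings(pos, pos + rng.normal(0.0, 1.0, pos.shape), 0.55, 0.45).shape)
```

Running this on the fly amounts to evaluating the same crossing test at every pair of adjacent timesteps, which is why the interpolation error shrinks as the number of timesteps grows.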
Currently, there are three major simulations available that cover large volumes at high mass resolution and are being used for generating synthetic maps for different surveys: (i) the Euclid Flagship Simulation, which is the base for the synthetic sky catalogs used by Euclid, (ii) the Outer Rim simulation, which is currently being used for the second Rubin-DESC Data Challenge and has been used for DESI and eBOSS catalogs, and (iii) the Dark Sky Simulation, which is used for building DESI catalogs.
The Euclid Flagship Simulation
The Euclid Flagship Simulation (see Potter et al. 2017) features a simulation box of 3780 h^-1 Mpc on a side with 12,600^3 particles, leading to a mass resolution of 2.4 × 10^9 h^-1 M⊙. An agreed upon reference cosmology, close to Planck 2015 values, was used with the following parameters: Ω_m = 0.319, Ω_b = 0.049, Ω_CDM = 0.270, Ω_Λ = 0.681, w = −1.0, h = 0.67, σ_8 = 0.83, n_s = 0.96. A contribution to the energy density from relativistic species in the background was ignored (Ω_RAD,ν = 0). Using this Euclid Reference Cosmology allows comparison to many other smaller simulations from N-body codes as well as from approximate techniques that also use these reference values within the Euclid collaboration. The initial conditions were realized at z = 49 with second order Lagrangian perturbation theory (2LPT) displacements from a uniform particle grid. The transfer function was generated at z = 0 by CAMB and the resulting power spectrum was scaled back to the starting redshift of z = 49 via the scale-independent growth factor. The main data product, produced on the fly during the simulation, is a continuous full-sky particle light cone (to z = 2). The output also includes 100 dark matter halo catalogs (identified using a friends-of-friends, FoF, algorithm) and power spectra at proper time snapshots, as well as 11 complete particle snapshots from z = 0.764 to z = 0. This simulation was performed using PKDGRAV3 on the Piz Daint supercomputer at the Swiss National Supercomputing Centre (CSCS) in 2016.
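The scale-independent growth-factor rescaling used in this initial-conditions setup can be written down compactly. The following is a hedged sketch, not the actual PKDGRAV3 initial-conditions code; it uses the standard linear-growth integral for flat ΛCDM and the reference parameters quoted above (the normalization constant cancels in the ratio).

```python
# Sketch of growth-factor rescaling for N-body initial conditions:
# P(k, z_init) = [D(z_init) / D(0)]^2 * P(k, z=0).
import numpy as np
from scipy.integrate import quad

OMEGA_M, OMEGA_L = 0.319, 0.681   # Euclid Reference Cosmology values

def E(a):
    """Dimensionless Hubble rate H(a)/H0 for flat LCDM."""
    return np.sqrt(OMEGA_M / a**3 + OMEGA_L)

def growth(a):
    """Unnormalized linear growth factor,
    D(a) proportional to E(a) * integral_0^a da' / (a' E(a'))^3."""
    integral, _ = quad(lambda ap: (ap * E(ap)) ** -3, 1e-8, a)
    return E(a) * integral

def rescale_pk(pk_z0, z_init=49.0):
    """Scale a z = 0 linear power spectrum back to the starting redshift."""
    d_ratio = growth(1.0 / (1.0 + z_init)) / growth(1.0)
    return pk_z0 * d_ratio**2

# Amplitude suppression factor applied to P(k) at z = 49:
print(rescale_pk(1.0))   # ~6.4e-4, i.e. D(z=49)/D(0) ~ 1/40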
The Outer Rim Simulation
The Outer Rim simulation covers a volume of (3 h^-1 Gpc)^3 and evolves 10,240^3 particles, leading to a mass resolution of ∼1.85 × 10^9 h^-1 M⊙. The simulation was carried out on the Argonne Leadership Computing Facility's BlueGene/Q machine, Mira, in 2013/14. Almost 100 time snapshots were saved and analyzed, yielding a data volume of more than 5 PB. The data products from the simulation include halo catalogs for different mass definitions, subhalo catalogs, detailed merger trees, two-point statistics, light cone representations of the data (halos and particles), and subsamples of raw and halo particles. The Outer Rim run continues in the tradition of the Millennium simulation by Springel et al. [2005], with a similar mass resolution but with a volume coverage increased by more than a factor of 200. This is essential for capturing galaxy clustering at large length scales and for achieving the needed statistics for cluster cosmology. The cosmology used for the simulation is close to the best-fit model determined by WMAP-7 [Komatsu et al., 2011]. The chosen cosmological parameters are: Ω_m = 0.2648, Ω_b = 0.0448, n_s = 0.963, h = 0.71, σ_8 = 0.8, w = −1.0.
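The quoted particle mass follows directly from the mean matter density, m_p = Ω_m ρ_crit (L/N)^3. A quick consistency check (using Ω_m = 0.2648 as above; the helper function is ours, not part of any simulation code):

```python
# Particle mass of a gravity-only simulation from box size and particle count.
RHO_CRIT = 2.775e11   # critical density [h^2 Msun / Mpc^3]

def particle_mass(omega_m, box_mpc_h, n_per_side):
    """Particle mass in h^-1 Msun: m_p = Omega_m * rho_crit * (L/N)^3."""
    return omega_m * RHO_CRIT * (box_mpc_h / n_per_side) ** 3

print(f"{particle_mass(0.2648, 3000.0, 10240):.3g}")   # ~1.85e9, as quoted
```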
The Dark Sky Simulation covers a much larger simulation box of 8 h^-1 Gpc on a side with 10,240^3 particles, leading to a mass resolution of 3.9 × 10^10 h^-1 M⊙ (see Skillman et al. 2014). The cosmological parameters are: Ω_b = 0.047, Ω_m = 0.295, Ω_Λ = 0.705, w = −1.0, h = 0.688, σ_8 = 0.835, n_s = 0.968. Particle light cone data (to z = 2.3) as well as Rockstar halo catalogs are available for this simulation (≈ 16 TB). While the Dark Sky Simulation evolved an impressive number of particles, its mass resolution is not sufficient for the mock galaxy catalogs needed by current and upcoming surveys. The Dark Sky Simulation was performed using the 2HOT code [Warren, 2013] on Titan at Oak Ridge National Laboratory.
Both Euclid and Rubin-DESC have used the large-scale simulations described above to generate detailed synthetic mocks. WFIRST is currently discussing with Rubin-DESC how to take advantage of some of this work but is overall not as far advanced in its effort. Both Euclid and Rubin-DESC have already started to identify future simulation needs, which we briefly describe below. We emphasize that these descriptions capture only the next steps, not the full future need for simulations.
Euclid Consortium
There is already a need for an improved version of the Flagship Simulation, which will have a mass resolution of 10^9 h^-1 M⊙ in the same volume to satisfy the wide (15,000 sq. degree) part of the survey. Such a simulation should now also include radiation and massive neutrinos (Σ m_ν = 0.06 eV) in the background, as well as a linear treatment of the evolving neutrino fluctuations (and their effect on the dark matter). Halo merger trees including so-called "orphan galaxies", which attempt to continue tracing positions of dissolved subhalos, are also needed to support more sophisticated mock galaxy catalogs using SAMs (see Section 4). Finally, there is a need for a simulation which has an order of magnitude better mass resolution (10^8 h^-1 M⊙) and greater depth in the light cone (z = 5) but covers only a much smaller area on the sky, in order to satisfy the needs of the deep survey (40 sq. degrees) of Euclid. One or more pencil beams could be generated from a smaller volume simulation to cover these needs, but such a simulation should statistically "match onto" the full-sky simulation and hence should use the same reference cosmological parameters. Mock galaxy catalogs for these two requirements should be made available as early as mid-2019. Combined, these two simulations will require about 1.2 million node-hours (with 1 GPU per node).

Rubin-DESC
Rubin-DESC is starting to prepare a third data challenge (DC3). The Outer Rim simulation is being used for the second data challenge (DC2) to generate a 5000 sq. degree synthetic sky catalog. DC3 is supposed to cover the full survey area (currently being planned at 18,000 sq. degrees). The Outer Rim simulation can only be used for this purpose if it is replicated many times, which could lead to artifacts. On the other hand, increasing the volume considerably at the same mass resolution would be very challenging. In addition to a ΛCDM universe, DESC is planning to use a non-ΛCDM simulation to investigate the sensitivity of the different probes to subtle changes in cosmology. For this, a new simulation (the Knowhere simulation) has been started on the Cori supercomputer (located at DOE's NERSC facility) with a mass resolution of approximately 5.5 × 10^8 h^-1 M⊙ and a box side length of 2 h^-1 Gpc. While the simulation volume is not very large, the mass resolution is excellent and will enable the use of a diverse set of galaxy-halo connection approaches.
There are several large cosmological simulations that include baryons (and associated astrophysical processes) which attempt to provide a partially predictive model for galaxy formation, follow the evolution of baryons inside and outside of galaxies, and produce the observable properties of galaxies across cosmic time. These simulations solve the hydrodynamic equations in addition to gravity and employ comprehensive physical "subgrid" models for processes including (but not limited to) radiative cooling, star formation, and feedback. We describe several examples of such simulations below. The simulations are extremely computationally expensive because they must resolve the necessary galactic scales while still covering a large, statistically representative volume. However, cosmological hydrodynamic simulations do not resolve the scales necessary to perform ab initio calculations of critical physical processes of galaxy evolution like star formation. Instead they include subgrid modeling schemes that attempt to capture the key features of the underlying physical mechanisms, or simply use phenomenological prescriptions. Current hydrodynamical simulations have to push the boundaries of computational power to increase the dynamic range of the simulations in order to decrease the assumptions included in the subgrid modeling. As a result, we are limited in the number of cosmological hydrodynamic simulations that can feasibly be run.

Given the computational cost of running a single cosmological volume hydrodynamic simulation (for example, Illustris used 19 million core hours on the CURIE supercomputer in France), much less a suite of simulations spanning cosmological and subgrid nuisance parameters, their important role in future wide-field surveys will be to characterize systematic uncertainties and provide critical tests for techniques to mitigate these uncertainties (see Section 5).

Although this is not the focus of this report, other surveys spanning the electromagnetic spectrum from X-rays to radio wavelengths will be carried out on similar time scales to the optical and near-IR surveys mentioned here. Combining these multi-wavelength surveys via cross-correlation measurements will provide powerful constraints on both cosmological and astrophysical parameters; for example, cross-correlation measurements of optical and cosmic microwave background data have the potential to constrain the sum of neutrino masses or feedback from Active Galactic Nuclei [e.g., Spacek et al., 2016, Battaglia et al., 2017]. For such measurements, hydrodynamic simulations are essential to provide testable predictions to check the subgrid modeling assumptions within the simulations. Additionally, these cross-correlation measurements will provide lasting constraints that will provide critical tests for and inform future subgrid models.

Table 1: Specifications of the example hydrodynamical simulations discussed in the text. Particle masses are approximate. Note this is not a complete list of all available simulations.

Simulation        Box [h^-1 Mpc]  Particles  Gas particle mass [h^-1 M⊙]  DM particle mass [h^-1 M⊙]
High-resolution hydrodynamical simulations
BlueTides (a)     400             7040^3     ~2.4 × 10^6                  ~1.2 × 10^7
EAGLE             67.77           1504^3     ~1.2 × 10^6                  ~6.6 × 10^6
MassiveBlack-II   100             1792^3     ~2.2 × 10^6                  ~1.1 × 10^7
OWLS              100             512^3      ~8.7 × 10^7                  ~4.1 × 10^8
Horizon-AGN (b)   100             1024^3     N/A                          ~8 × 10^7
Illustris         75              1820^3     ~8.9 × 10^5                  ~4.4 × 10^6
MUFASA            50              512^3      ~1.2 × 10^7                  ~6.5 × 10^7
Low-resolution hydrodynamical simulations
BAHAMAS           400             1024^3     ~7.7 × 10^8                  ~3.9 × 10^9
Magneticum        2688            4536^3     ~2.6 × 10^9                  ~1.3 × 10^10

(a) BlueTides was run to z = 8. (b) Horizon-AGN was run with an AMR code and does not use gas particles; the equivalent spatial resolution is 1 kpc (proper units).
Currently, hydrodynamic simulations do not have the combination of mass resolution and volume to meet the eventual requirements for the systematic studies discussed in Section 5. There are some simulations highlighted in this report (see Table 1) that have sufficient mass resolution but lack volume, and vice versa. However, the more immediate concern is that all of these simulations are only as good as the subgrid physics models they employ. Further development and exploration of various models and techniques are essential to capture physical processes including star formation and feedback at high accuracy. In addition to the information needed from N-body simulations, catalogs of simulated galaxies with optical, near-IR, and thermodynamic properties are critical to providing the necessary multi-wavelength, cross-correlation predictions and post-dictions with observations.
Cosmological hydrodynamic simulations can broadly be split into those which resolve galaxy properties, like morphology, and those that do not (hereafter we use the adjectives high-resolution and low-resolution, respectively, to distinguish between these simulations). In Table 1 we provide the specifications for the example simulations we discuss below.
High-resolution hydrodynamical simulations
The general goal of high-resolution cosmological hydrodynamical simulations is to provide a predictive model for galaxy formation and produce the detailed properties of the galaxies we observe. There are several ongoing efforts to this end, and the following are examples of such efforts.

The BlueTides simulation [Feng et al., 2016] aims to simulate the first galaxies and active galactic nuclei (AGN), and their contribution to reionization, using a version of the Lagrangian TreePM-SPH code Gadget-2 [Springel, 2005]. This simulation is quite large given its mass resolution; however, it has only been run to z = 8, limiting its utility to high redshifts. BlueTides was run on the Blue Waters system at the National Center for Supercomputing Applications (NCSA) using a total of 648,000 Cray XE compute cores.

The EAGLE project [Schaye et al., 2015] is a suite of hydrodynamical simulations that follow the formation of galaxies and supermassive black holes in cosmologically representative volumes using a version of Gadget-2. The EAGLE simulations include subgrid physics models that are tuned to agree with key observations of galaxy properties, at a level as close as possible to that attainable by semi-analytic models [Schaye et al., 2015]. The subgrid physics used in the EAGLE simulations is based on the OWLS project [Schaye et al., 2010], which is a large suite of simulations with varying subgrid physics to investigate the effects of altering or adding a single physical process on the total matter distribution. The MassiveBlack-II simulation [Khandai et al., 2015] is the same size as the EAGLE simulations with slightly higher resolution. A single set of subgrid physics models was used in the MassiveBlack-II simulation.

Other projects use different hydrodynamic solvers. The Horizon-AGN suite of simulations [Kaviraj et al., 2017] was carried out with the adaptive mesh refinement (AMR) code RAMSES [Teyssier, 2002]. The Horizon-AGN simulations are similar in size and resolution to the other simulations described, and include a variety of subgrid models to capture baryonic processes. The Illustris simulation [Vogelsberger et al., 2014] used the moving mesh code AREPO [Springel, 2010] and incorporates a broad range of galaxy formation physics [Vogelsberger et al., 2013] tuned on smaller volume simulations to match stellar luminosity functions and optical properties of galaxies. The MUFASA suite of simulations [Davé et al., 2016] employs the GIZMO meshless finite mass (MFM) code [Hopkins, 2015]. Despite their size, the MUFASA simulations include subgrid models that were refined on high-resolution simulations of individual galaxies from the FIRE suite of simulations [Hopkins et al., 2014].

Low-resolution hydrodynamical simulations
The general goal of low-resolution hydrodynamical simulations is to follow the evolution of baryons inside and outside of galaxies and to capture rare objects like clusters and superclusters of galaxies. The lower resolution reduces the computational cost of the simulations and therefore enables simulations in larger volumes that can capture these rare objects. There are several ongoing efforts, and we list some examples in the following.

The BAHAMAS project [McCarthy et al., 2017] is a suite of simulations run with a version of Gadget-2 that have been calibrated to reproduce the present-day galaxy stellar mass function and the hot gas mass fractions of groups and clusters, in order to ensure that the effects of feedback on the overall matter distribution are broadly correct. The Magneticum simulations [e.g., Dolag et al., 2016] are a suite of simulations run with a Gadget variant that have large simulation volumes with comparable resolution to the BAHAMAS simulations. The largest Magneticum simulation has a simulation box length that is roughly 6.5 times larger than a BAHAMAS box. Given the computational expense of generating this large volume simulation, only a single set of subgrid physics models was employed.
Summary

Gravity-only simulations at large volume and high mass resolution are extremely important for all three surveys to enable the generation of detailed synthetic sky maps. Currently, two simulations are available (the Euclid Flagship simulation and the Outer Rim simulation) that are used for this purpose and are very close to the ultimately required mass resolution and volume for generating these maps. With the advent of the next-generation supercomputers (e.g., Summit at the Oak Ridge Leadership Computing Facility), the remaining needed increase in resolution should be achievable relatively easily. Currently two codes are being actively developed (PKDGRAV3 and HACC) that can carry out these extreme-scale simulations. Sharing the results from these simulations is very desirable, as they are very computationally expensive to produce, analyze, and store, and there are very few people with the expertise to undertake these efforts. However, sharing will require an infrastructure support investment to enable sharing of the simulation data and to enable the collaborations to generate synthetic catalogs, given the different approaches used by the two codes to carry out analysis tasks. The infrastructure support would include storage space accessible across the collaborations and people support to develop an infrastructure that allows for easy data access (more details are discussed in Section 6). Additionally, if these simulations are directly shared between the surveys (rather than being made publicly available world-wide), the question arises of how the simulation groups should be acknowledged for their work. At a minimum, they should be made external collaborators to the surveys to facilitate co-authorship on papers that are enabled via their contributions.

Unlike gravity-only simulations, hydrodynamic simulations are far from the ultimate goal with respect to achieving large, cosmological volume simulations at high resolution with reliable physics implementations. Not even the next generation of supercomputers will rectify this situation, although some progress is being made to (at least) generate consistent results across codes at moderate scales. The challenges are on many fronts. First, most hydrodynamic codes do not scale efficiently to utilize the full machines available today. Given the mass resolution requirements (around 10^6 M⊙), load-balancing is a difficult task, and therefore enabling large volume simulations at very high mass resolution is currently out of reach. Beyond this problem, an even more serious one is due to the uncertainties in the current subgrid model implementations. The use of relatively crude subgrid models prevents us from achieving truly first-principles predictions and therefore makes it very difficult to use the simulations for the purpose they are primarily needed for – understanding astrophysical systematics. These systematics will ultimately be the limiting factors in improving cosmological constraints. Therefore, it is crucial to have concerted support across the surveys for improving hydrodynamic simulation capabilities. Efforts are needed to help bridge the work carried out on the smallest scales to the larger volume, cosmologically relevant simulations. Detailed studies of subgrid models must also be carried out to improve our understanding of baryonic effects. The most effective studies will come from multi-wavelength comparisons including cross-correlations with observables for which hydrodynamic simulations make testable predictions.
Sharing the results of hydrodynamic simulations is much easier than for the gravity-only simulations due to their current limitations in size. Therefore, in order to make progress in the field of hydrodynamic simulations, emphasis should be placed on supporting code development efforts, the calibration of subgrid models, and public access to the simulations to enable wide utilization and cross-comparisons.

Resource Requirements for Cross-Survey Activities

In this section we summarize the resource requirements for cross-survey activities for both gravity-only and hydrodynamic simulations. We emphasize that cross-survey work is currently not explicitly supported and usually only occurs if the contributing scientists belong to more than one project. The demands that each survey puts on members of the simulation team are already very high, and the efforts are not supported sufficiently within each survey to begin with. Therefore, additional efforts would need to be funded to enable cross-survey collaborations.

• Phase 1: Definition; ∼
During the first phase, the list of requirements and outputs that has been outlined in this report will need to be fleshed out to ensure that all the requirements are met for the different surveys. Definitions and units have to be agreed upon (or at least translations between different code outputs) so that the simulations can be seamlessly shared between the surveys. This requires close collaboration between the surveys and strong engagement from the working groups that will use the simulations for various tasks (e.g., pipeline validation, systematics studies). Each survey would need to appoint a researcher who has easy access to the working group requirements. The outcome from Phase 1 would be a comprehensive report that details the outputs from the simulation, the analysis tools (e.g., halo finding approaches, ray trace code implementations), and how these are connected to the different survey tasks.

• Phase 2: Tool development, validation, and cross-comparison; ∼
12 months
During the second phase, all tools identified in Phase 1 have to be implemented and validated. Cross-code comparisons would be extremely useful. In addition, conversion schemes and readers for the different codes would be developed to enable sharing of the different data products in a straightforward way. Mao et al. [2018] demonstrate how this can be achieved across different synthetic catalogs – the same approach, a reader that takes any input and converts it into a common exchange format, would be applied to enable sharing of data products between the surveys (a minimal sketch of this pattern follows this list). Given that the tools need to be developed to run at scale, and some of them to run on the fly within the simulation codes, the second phase will require considerable effort. We emphasize, however, that most cosmology codes already have at least a subset of the tools available.

• Phase 3: Implementation; ∼
During the third phase, new simulations would be carried out that can easily be shared between the surveys, given the preparations during the first two phases. The computing resources needed for this phase are considerable and would likely need to be obtained via competitive processes, such as INCITE in the US or PRACE in Europe. The effort required to run the simulations and enable the sharing of the associated data products strongly depends on the number of simulations to be carried out.
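To illustrate the "common reader" idea from Phase 2 – in the spirit of Mao et al. [2018], though not their actual implementation – the sketch below translates survey-specific halo-catalog columns into one agreed-upon schema. All class names, column names, and units are hypothetical.

```python
# Illustrative sketch of a common exchange format: survey-specific readers
# map native catalog columns and units onto one shared schema.
import numpy as np

COMMON_SCHEMA = ("halo_mass", "x", "y", "z")     # agreed exchange quantities

class CatalogReader:
    """Base class: subclasses map native column names/units to the schema."""
    quantity_map = {}                            # native name -> common name

    def __init__(self, native_data):
        self.native = native_data                # dict of numpy arrays

    def get(self, quantity):
        native_name = {v: k for k, v in self.quantity_map.items()}[quantity]
        return self.convert(quantity, self.native[native_name])

    def convert(self, quantity, values):
        return values                            # unit conversions live here

class CodeAReader(CatalogReader):
    quantity_map = {"fof_mass_msun_h": "halo_mass",
                    "px": "x", "py": "y", "pz": "z"}

class CodeBReader(CatalogReader):
    quantity_map = {"m200c_1e10": "halo_mass",
                    "pos_x": "x", "pos_y": "y", "pos_z": "z"}
    def convert(self, quantity, values):
        # This (hypothetical) code stores masses in units of 1e10 h^-1 Msun.
        return values * 1e10 if quantity == "halo_mass" else values

# Downstream analysis code only ever sees the common schema:
cat = CodeBReader({"m200c_1e10": np.array([5.0]), "pos_x": np.zeros(1),
                   "pos_y": np.zeros(1), "pos_z": np.zeros(1)})
print(cat.get("halo_mass"))                      # [5.e+10]
```

The design choice here is that each survey maintains only its own thin reader, while all cross-survey tools are written once against the common schema.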
The simulations and tasks mentioned below in Phases 1 and 2 fall under the Agencies' pre-existing research and development models for numerical projects. In addition, the Agencies' existing grant and award solicitations are sufficient to support the efforts highlighted. However, we recommend that the Agencies emphasize such proposals in grant programs including, but not limited to, NSF-AST, NASA-TCAN, NASA-ATP, and DOE and NSF Career awards. We also encourage the Agencies to fund multiple proposals in these solicitations to diversify the code development, subgrid modeling, and comparison efforts. The initial funding selection for such efforts is critical to begin as soon as possible so that new subgrid models can be tested and implemented. These hydrodynamic simulations are essential for the systematic mitigation and cross-correlation measurements for Rubin and Euclid, and thus need to be completed by the time of first light for these surveys. A second round of funding will be necessary to further develop subgrid models for WFIRST and to update them with the new observations and tests provided by Rubin and Euclid.

• Phase 1: Calibration of subgrid models; ∼
12 months
As stressed in the report, a major challenge for hydrodynamic simulations in the cosmological context is the calibration of subgrid models. The work required in this area involves major R&D efforts that are, at this point, not necessarily tailored to the specific surveys but rather still generic, due to the large uncertainties in the modeling. Nevertheless, lessons learned should be shared between different groups, and a concerted effort that enables easy sharing of results would be extremely beneficial. In addition, a comprehensive list of validation data sets, tests, and criteria relevant for the three surveys should be compiled. This list would be shared between the surveys and be used as a benchmark for the subgrid model implementations. In addition, it would have to be ensured that the major R&D efforts are sufficiently funded to continue developing and improving the current subgrid models in different codes.

• Phase 2: Initial Model Implementation; ∼
36 months
During the second phase, a range of simulations would be carried out that would be shared between the surveys. An important aspect here is that different approaches and subgrid model implementations would be automatically compared – a major advantage of a cross-survey effort given the resources these simulations take. The simulation runs would bracket the remaining uncertainties in the subgrid models. This conservative approach would capture the systematics associated with baryons on the various cosmological estimators pertinent for the surveys, as discussed later in the report. The coordination between the surveys and the comparison effort would constitute a multi-year program. An optimistic estimate would be 3 years of effort to make sufficient progress with this task. General support for this effort will enable different groups to scale up their codes to take full advantage of the planned exascale machines that are expected to arrive in the early 2020s.

• Phase 3: Large Hydrodynamic Simulations; ∼
During the third phase, some groups might be in the position to carry out one or several large hydrodynamical simulations in cosmological volumes with sufficient mass resolution to undertake detailed studies of, e.g., intrinsic alignment effects and cluster physics. Such simulations would require major computational resources, which could become available in the U.S. in the early 2020s.

Large Simulation Campaigns

Linking measurements of upcoming surveys to physical model parameters requires very demanding forward simulations, which evolve the universe from early times to the present day. Extracting precision cosmological information from surveys depends upon extending existing modeling capabilities further into the small-scale nonlinear regime as well as rigorous marginalization over currently unknown physics. In practical terms, this means that no single simulation can be sufficient for inferring new cosmological insights from observations; instead, large simulation campaigns producing ensemble runs, varying cosmological and other parameters, are needed. While no simulation in the ensemble will be at the level of the extreme-scale simulations discussed in the previous section, they are still computationally costly and require significant allocations on modern supercomputers. Data produced by those ensemble runs can be many petabytes in size, matching or even surpassing the data volume produced by the extreme-scale numerical simulations. In addition, the analysis of the suites of cosmological simulations is complex if the aim is to directly compare or apply them to the analysis of the observational data.

Two concrete examples of cosmological probes that will crucially depend on accurate predictions in the nonlinear regime are weak lensing and cluster cosmology. To exploit the potential of the next generation of weak lensing surveys, producing accurate predictions of the matter power spectrum is critical. The signal-to-noise ratio of the cosmic shear signal is highest on angular scales of 5-10 arcminutes,
which corresponds to physical scales of a few Mpc, requiring predictions of the matter power spectrum at the ∼1% level. This requirement goes beyond the accuracy of popular fitting functions; emulators have already demonstrated ∼1% accuracy on scales out to k ∼ 1 h Mpc^-1 for gravity-only simulations, using Gaussian process modeling and sampling a five-dimensional parameter space at only 36 points. It is important to distinguish between gravity-only simulations, which are used to make the forecasts for cosmological surveys, and hydrodynamical simulations, which are able to describe modifications to the gravity-only matter power spectrum arising due to baryonic physics. Although baryons represent only about one-sixth of the total matter content of the universe, their effects modify the matter power spectrum by considerably more than ∼1% on small scales.

The primary summary statistic of cluster cosmology is the redshift-dependent mass function, i.e., the number density of clusters as a function of mass and redshift. Recent work by McClintock et al. [2018] emulates the dark matter halo mass function in a 7-dimensional parameter space (Ω_m, Ω_b, h, n_s, σ_8, w, and N_eff), sampling only 40 points in a 4σ range around the current best guess "concordance" cosmology. McClintock et al. [2018] find that their emulation of the dark matter halo mass function is sub-percent accurate and already sufficient to serve the needs of the first Rubin data release. Going forward, beyond year 1 of Rubin, this halo mass function emulator will need to be rebuilt with more accurate dark-matter-only simulations – and likely more evaluated points in parameter space – but it certainly appears that our ability to quantitatively describe the number density of halos as a function of mass and redshift is unlikely to be a bottleneck in future data analysis.

However, accurate cluster cosmology also critically depends on knowledge of the mass-observable relation and its scatter, which must be extracted from simulations for a wide range of cosmologies. In an actual survey, clusters are binned by an observable that is correlated with cluster mass, for example redMaPPer richness, X-ray luminosity or temperature, or the Sunyaev-Zel'dovich signal. Presently, this cluster mass calibration error dominates the (theoretical) error budget, and is likely to be the main roadblock for cluster cosmology in the future, possibly requiring the use of hydrodynamical simulations to ultimately resolve.

Beyond obtaining accurate predictions for a range of cosmological probes, measuring constraints on cosmological parameters relies on sampling schemes, such as Markov Chain Monte Carlo (MCMC), in order to explore likelihoods in parameter space. Future surveys require tens of cosmological and nuisance parameters, and one needs to sample millions of different points in parameter space to reach convergence. Running a full ab initio cosmological simulation at each point in parameter space is thus not a practical solution, even if highly (physically) approximate methods would suffice. Investigation of advanced alternatives is a very active field of research, and includes algorithms for optimal sampling of parameter space and for interpolation of the target summary statistics given some sparse sample of evaluated points in the parameter space. This approach (colloquially called "emulation") was first introduced for the matter density power spectrum in Heitmann et al. [2006], and was more recently followed by work on the halo mass function (Heitmann et al. [2016], McClintock et al. [2018]), galaxy clustering and galaxy-galaxy lensing by Wibking et al. [2017], and the galaxy power spectrum and correlation function (Kwan et al. [2015], Zhai et al. [2018]).

The examples listed above – accurate predictions across cosmologies for a range of cosmological probes and investigation of different baryonic feedback models – together with the need for covariance estimates, all showcase the need for generating ensembles of simulations. As for the extreme-scale simulations, results from such efforts can and should be easily shared between the different surveys. In particular, no survey-specific modeling is required when building, e.g., emulators, and therefore the results are easily usable by a range of surveys. There are several challenges connected to generating large simulation ensembles.
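To make the emulation strategy concrete before turning to those challenges, the following is a minimal sketch of a Gaussian-process emulator over a sparse design. The two-parameter design, kernel settings, and stand-in "simulation" function are our illustrative assumptions, not any survey's pipeline; a real campaign would emulate the full statistic (e.g. P(k) in many k bins) over five or more parameters.

```python
# Minimal emulator sketch: Gaussian-process interpolation of a toy summary
# statistic across a sparse cosmological design.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

# Sparse design over a 2D slice of parameter space (Omega_m, sigma_8);
# real campaigns use space-filling designs in higher dimensions.
design = rng.uniform([0.25, 0.70], [0.35, 0.90], size=(36, 2))

def fake_simulation(theta):
    """Stand-in for 'run an N-body simulation and measure log P(k)'."""
    om, s8 = theta
    return np.log(s8 ** 2 * (om / 0.3) ** 0.5)

y = np.array([fake_simulation(t) for t in design])

kernel = ConstantKernel(1.0) * RBF(length_scale=[0.05, 0.10])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(design, y)

# The trained emulator returns a prediction plus an uncertainty estimate
# anywhere in the design range, at negligible cost compared to a new run.
mean, std = gp.predict(np.array([[0.30, 0.80]]), return_std=True)
print(mean, std)
```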
Some of the challenges in generating large simulation ensembles are similar to those for the extreme-scale simulations, but additional challenges arise from the complexity of handling and organizing a large number of simulations. As for the extreme-scale simulations, securing computational resources – allocations as well as storage – to enable the runs themselves is difficult. However, the advantage is that each individual simulation is relatively small, so many more supercomputing facilities can be engaged to carry out such simulations. At the same time, if one wants to take full advantage of a range of computing resources, there are major challenges for the simulator related to running and monitoring the simulations across multiple facilities.

The major challenges for carrying out large ensemble runs are:

• Securing computational resources (allocations, storage) to enable the runs themselves.

• Developing analysis tools to efficiently extract a range of measurements from the simulations to enable the construction of emulators.

• Building workflows that enable management for running and analyzing very large numbers of simulations (potentially across multiple facilities with varying architectures and requirements).
Producing ensembles of simulations that span cosmological and nuisance parameters is essential to fully exploit the information available from future cosmological surveys. At this point, only a few such emulation projects have been carried out, mostly focusing on statistics that are easily extracted from N-body simulations, such as the matter density power spectrum and, more recently, galaxy-related statistics. In the future, emulators closer to direct observable statistics will become crucial, including galaxy (photometric)-shear and galaxy-galaxy (photometric) correlations, shear-CMB lensing cross-correlations, galaxy-CMB cross-correlations, and others. The main difficulty in going from matter to galaxy statistics is the increased number of parameters, although nuisance parameters describing galaxy-halo relations need not be sampled with expensive ab initio simulations, but can be included in post-processing. The key to a large simulation campaign successfully addressing the needs of multiple surveys is therefore separating parameters into computationally "expensive" and "cheap" ones. Cosmological parameters belong to the first group; changing any of them requires running a new simulation starting from linear-theory initial conditions. Cheap parameters, on the other hand, are straightforward to vary directly on the outputs, which can be done in post-processing (a minimal example is sketched below).

The demand on the numerical codes is less severe here than for the extreme-scale simulations; as ensembles consist of medium- to low-resolution simulations, code scalability is not as much of an issue, nor is I/O efficiency, as each file is moderate in size. However, we emphasize that future supercomputing architectures (beyond 2020) are anticipated to be more complex; thus current "workhorse" codes like Gadget-2 will not suffice unless properly modified.
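As a concrete example of a "cheap" parameter set, the sketch below applies a five-parameter halo occupation distribution (the functional form popularized by Zheng et al. 2005, which we cite here only as our illustrative choice) to a stored catalog of halo masses. Re-running it with different parameter values requires no new N-body simulation; the parameter values and inputs are illustrative.

```python
# Vary "cheap" galaxy-halo parameters in post-processing: a five-parameter
# HOD applied to a stored halo catalog.
import numpy as np
from scipy.special import erf

def mean_ncen(m, log_mmin=12.0, sigma_logm=0.2):
    """Mean central occupation <N_cen>(M)."""
    return 0.5 * (1.0 + erf((np.log10(m) - log_mmin) / sigma_logm))

def mean_nsat(m, log_m0=12.0, log_m1=13.3, alpha=1.0):
    """Mean satellite occupation <N_sat>(M)."""
    frac = np.clip(m - 10.0 ** log_m0, 0.0, None) / 10.0 ** log_m1
    return mean_ncen(m) * frac ** alpha

def populate(halo_masses, rng):
    """Draw galaxy counts per halo: Bernoulli centrals, Poisson satellites
    (satellites only placed in halos that host a central)."""
    has_cen = rng.random(halo_masses.size) < mean_ncen(halo_masses)
    n_sat = rng.poisson(mean_nsat(halo_masses)) * has_cen
    return has_cen.astype(int) + n_sat

rng = np.random.default_rng(1)
halos = 10.0 ** rng.uniform(11.0, 15.0, 100_000)   # toy halo masses [Msun/h]
print(populate(halos, rng).sum(), "galaxies drawn from one stored catalog")
```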
We conclude the section by presenting common actions in support of large simulation campaigns useful for the surveys considered here.

• Phase 1: Designing the common ensemble of simulations; ∼
12 months
The goal of this phase is to obtain a comprehensive understanding of the needs of all science working groups in the major surveys, and to design a minimal common grid of simulations. This necessitates interactive collaboration involving scientists with strong expertise in the different probes which rely on simulations for producing the theoretical backdrop against which the observations are interpreted. The first challenge is to understand the minimal simulation requirements needed to emulate different statistics at the desired level of precision. This commonly requires a lot of domain knowledge, as the only way to access this information on parameter sensitivity without running the full simulation ensemble itself is via approximate models. In some cases, like the halo mass function, an analytic fitting function can be used as the approximate model. In other cases one must rely on extrapolating the accuracy of emulators built from coarser simulations.

This phase requires approximately one calendar year with researchers whose expertise spans the range of cosmological probes covered in this section. This phase will also involve running some "cheaper" ensembles of simulations, but it would not be very computationally expensive – roughly millions of CPU hours.

• Phase 2: Ensemble production; ∼
24 months
The goal of this phase is to produce the simulation ensemble, and it can start only after the successful completion of Phase 1. The work involves proposing for computational resources, managing the simulations on HPC platforms, assessing the emulation accuracy, and adding new simulation points as mandated by the target accuracy. This phase requires approximately two calendar years. Unlike Phase 1, most of the work will be very computational in nature. It is important that the scientists involved have some level of expertise in the field in order to tackle issues related to the final accuracy. The computing resources required for this phase are high and would likely have to be secured via a competitive allocation process such as ALCC, INCITE, or PRACE.

• Phase 3: Updates and support with designs involving non-cosmological parameters; ∼
24 months
The goal of this phase is to provide support for the individual surveys that would use the common ensemble of simulations to add nuisance/post-processing parameters, like those needed to populate dark matter halos with simulated galaxies. We stress that the labor involved in this phase is not only data curation, but iterative interaction with the relevant science working groups in the surveys. As different statistics may need higher accuracy at certain points of the N-dimensional parameter grid, new point evaluations (i.e., full simulations) would be needed. Once those are produced, they could be propagated to other surveys and working groups, as this would represent an overall increase in the modeling accuracy of the emulator. This phase would continue for roughly two years.
There are a wide variety of methods for producing synthetic sky maps, and there are many parallel efforts currently underway (in many cases using the same base simulation). We report on the approaches taken by the different surveys for generating synthetic sky maps, including the modeling of different galaxy types, generation of shear maps, validation approaches, etc. Common modeling and validation challenges have been identified, and possible joint solutions and pipelines will be outlined.
Methods for generating synthetic skies for cosmological surveys can be loosely stratified according to modeling choices driven by the tradeoff between complexity and computational efficiency. We begin by briefly summarizing the broad categories into which contemporary methods fall, listed in descending order of the computational expense to generate a single synthetic sky:
1. Hydrodynamical simulations of cosmological volumes [for a recent review article, see Somerville and Davé, 2015, and references therein] directly track the evolution of gravity-only particles such as dark matter, simultaneously with the physics of baryons, including fine-grained "sub-grid" prescriptions for processes such as radiative cooling, star formation and associated feedback, black hole activity, etc.
2. Semi-analytic models (SAMs) [for a recent review article, see Somerville and Davé, 2015, and references therein] are grafted onto gravity-only N-body simulations. As a prerequisite to generating a synthetic sky, all SAMs require a significant post-processing phase of such N-body simulations, in which dark matter halos are identified at each output simulation timestep; halos across timesteps are subsequently linked together into a "merger tree" that stores the evolutionary history of each identified halo. The SAM approach is to parameterize baryon-specific processes as functions of the halos and their evolution; on a halo-by-halo basis, SAMs seek to directly model how baryons would have evolved had they been included in the N-body simulation.

3. Empirical models are also grafted onto N-body simulations. All empirical models require the identification of dark matter halos, though the level of detail of the post-processing halo-identification phase varies considerably from method to method. These models are statistical in nature, as they are formulated in terms of stochastic mappings between ensembles of halos and ensembles of galaxies [for a recent review article, see Wechsler and Tinker, 2018, and references therein].
4. Approximate N-body methods employ various analytical techniques to circumvent the need for a full simulation. Some methods approximately solve for the evolution of the density field, and then identify halos in a post-processing phase; other methods approximately solve for the halo distribution more directly, without appeal to a halo finder. All approaches require supplementation by simplified empirical models for the galaxy-halo connection to compute cosmological observables of galaxies from the approximated halo distribution.

Current and planned large-scale structure surveys most commonly employ empirical modeling and SAMs in the generation of synthetic skies supporting the survey. While hydrodynamical simulations are used extensively to study the impact of a variety of systematic effects that are relevant to large-scale structure cosmology, these simulations are seldom used to directly produce mock catalogs for collaboration-wide analysis. Catalogs based on approximate N-body methods are typically used in applications where a single tracer population is distributed across a large cosmological volume; thus synthetic galaxies in mocks generated with these methods have essentially no attributes (beyond being brighter than a single color-magnitude threshold). However, many scientific analyses involve making a range of cuts in multiple wavebands, which requires mock galaxy catalogs to have more complexity than has yet been achieved via approximate N-body methodology.
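As a concrete illustration of the empirical-model category, the sketch below populates a mock halo catalog using a mean occupation of the standard Zheng et al. (2005) form: a softened step function for centrals and a truncated power law for satellites. The parameter values are illustrative and not calibrated to any survey.

```python
# Minimal empirical-model sketch: mean halo occupation of the Zheng et al.
# (2005) form, realized stochastically. Parameter values are illustrative.
import numpy as np
from scipy.special import erf

def mean_ncen(log_m, log_mmin=12.1, sigma_logm=0.4):
    """Mean central occupation: softened step function in halo mass."""
    return 0.5 * (1.0 + erf((log_m - log_mmin) / sigma_logm))

def mean_nsat(log_m, log_m0=11.8, log_m1=13.3, alpha=1.1):
    """Mean satellite occupation: truncated power law in halo mass."""
    excess = np.clip(10.0 ** log_m - 10.0 ** log_m0, 0.0, None)
    return (excess / 10.0 ** log_m1) ** alpha

rng = np.random.default_rng(0)
log_mass = rng.uniform(11.0, 15.0, size=100_000)  # mock halo masses [Msun/h]

# Stochastic realization: Bernoulli centrals, Poisson satellites
# (satellites conditioned on the presence of a central galaxy).
has_cen = rng.random(log_mass.size) < mean_ncen(log_mass)
n_sat = rng.poisson(mean_nsat(log_mass)) * has_cen

print("mean galaxies per halo:", (has_cen + n_sat).mean())
```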
In this section we enumerate the properties typically required for simulated (imaging) surveys and assess the ability of each modeling approach to provide these properties.
The broadband flux of a galaxy is one of the most important quantities required by any simulated sky program. Current and planned imaging surveys employ five or more filters, and the distribution of observed galaxies in this multidimensional space exhibits a rich spectrum of correlations across redshift and environment. Cosmological surveys have diverse needs for mock catalogs with accurate conditional one-point functions between most or all bands of the survey. Many cosmological analyses additionally require or benefit from mock catalogs with high-fidelity two-point functions, including correlations with color, brightness, and redshift. This is especially challenging because of the sensitivity of two-point clustering to environmental correlations, and because in all present-day methods these correlations are not parameterized directly, but emergent.

Generating large-volume synthetic galaxy catalogs that meet these specifications is a difficult challenge for all approaches to the problem. Traditional empirical models only produce mock galaxies with stellar mass or absolute restframe magnitude in a single band. By itself, such a model is insufficient to generate the required properties, and so empirical methods tend to be used primarily as "baseline" or "tuning" mocks, on top of which additional modeling is carried out. The one- and two-point fidelity of mocks produced in this fashion is the highest of any available alternative, but with restricted applicability to the particular bands used in the tuning. Variations on this multi-step empirical approach are widely used to generate present-day catalogs; scientists within each survey typically develop one or more such methods themselves, and these methods commonly remain proprietary.

SAMs use stellar population synthesis (SPS) models to produce a stellar continuum SED for each galaxy; from this, absolute magnitudes in multiple bands are found by integrating the SED under the appropriate filter. This modeling formulation naturally lends itself to a broad range of surveys, since in principle mock observations with any set of filters can be made on galaxies in a SAM-generated mock. In practice, it is computationally expensive to train SAMs, and difficult to ensure high-fidelity reproduction of the observed one- and two-point functions. The computational expense of SAMs is especially challenging because, in practice, the validation criteria of large imaging surveys are continually evolving, so that expensive one-time-only model calibrations quickly become obsolete.

In hydrodynamical simulations, just as in SAMs, the star-formation and assembly history of each galaxy is used together with an SPS model to produce an SED, and with it the flux observed through any desired broadband filter. Hydrodynamic simulations offer the highest level of complexity in the fluxes of synthetic galaxies, and considerable progress has been made in realistic forward-modeling of galaxy colors using hydro simulations. However, as described in Section 2.4.2 above, the computational expense of this approach makes it difficult to attain the necessary level of accuracy in the multi-Gpc volumes required by present and planned surveys.

The observable luminosities and colors are affected by internal dust extinction in each galaxy. Modeling the effects of dust ranges in complexity and realism from simple dust screen models, through idealized geometry models in SAMs [Lacey et al., 2016], to full ray-tracing calculations applied to some hydrodynamical models [e.g., Jonsson, 2006]. Full ray-tracing is likely to be computationally too expensive for simulation of large surveys and, in any case, may not give accurate results if the input galaxies are poorly resolved. Screen and idealized geometry models are clearly oversimplified, but may at least give plausible scalings of dust extinction with galaxy properties (e.g., metallicity, surface density), and can be calibrated to match observational constraints.

For spectroscopic surveys it may be necessary to construct complete spectral energy distributions for each galaxy or, at the least, to model key emission lines which will be used to select galaxy samples. As described in Section 4.1.1, both SAMs and hydrodynamical simulations ubiquitously produce stellar continuum SEDs as a necessary step toward the production of broadband luminosities, although the resolution achieved by the underlying SPS model may be insufficient to meet the resolution requirements of some surveys. Furthermore, as noted above, calculation of the full SED may in some cases be significantly more expensive than computing a small number of broadband luminosities.

Incorporation of emission lines into the spectrum has typically been achieved by modeling HII regions (e.g., using Cloudy [Ferland et al., 2017], which can also produce HII region continua if those are required), with the physical conditions (e.g., metallicity, ionizing spectrum) taken from the underlying model (e.g., SAM or hydrodynamical simulation). Applications of this approach have been successful in reproducing luminosity functions and redshift distributions of Hα-emitting galaxies [Orsi et al., 2008, 2010, 2014, Merson et al., 2018], although achieving plausible line ratios has proven to be more challenging [Merson et al., 2018]. All of the caveats regarding the effects of dust extinction on stellar continuum light also apply to emission lines, but are likely to be even more important as the emission lines arise preferentially from dense, dusty regions of galaxies.

Incorporation of the AGN component into spectra is much less developed. Both SAMs and hydrodynamical models usually predict the masses and accretion rates of the central supermassive black hole in each model galaxy. These can be coupled with empirical or theoretical models of accretion disk spectra to compute the AGN contribution to the spectrum [Fanidakis et al., 2011, 2012].

Morphology, by which we mean both classical morphological features (e.g., spiral vs. elliptical) and also size and shape, is crucial for constructing realistic simulated images, and for assessing the viability of weak lensing science. The greatest demands on morphology come from weak lensing working groups, which typically require mock galaxies to have both size and internal structure such as ellipticity and surface density profiles, including reasonably realistic correlations with broadband flux, morphological type, and redshift.

In semi-analytic modeling, the typical approach is to model morphology as a two-component disk/spheroid system, based upon the formation and merging history of each galaxy. Sizes are usually determined from the angular momentum of halos for disks [Fall and Efstathiou, 1980, Mo et al., 1998, but see Jiang et al. 2018, who argue that sizes are uncorrelated with halo spin and determined instead by halo concentration], and from energy conservation arguments for spheroids [e.g., Cole et al., 2000]. Within the class of SAMs there is a broad range of complexity in size modeling (e.g., inclusion of the self-gravity of baryons, adiabatic contraction, energy dissipation during mergers).
No extant model provides information about the ellipticity of the spheroid component. For disks, the normal vector of the disk plane is usually unspecified [Cole et al., 2000], or related to the angular momentum vector of the host halo [Stevens et al., 2016].

Empirical models of galaxy morphology are at a less mature stage relative to models of SED-derived properties. There have been a comparatively small number of such models [e.g., Ross and Brunner, 2009, Skibba et al., 2009, Desmond and Wechsler, 2017]; no published empirical model has attempted joint predictions for morphology together with SED-derived properties such as broadband color.

Modeling the intrinsic orientation of galaxy morphology is worthy of special mention in the context of weak lensing science. Although intrinsic galaxy alignments are one of the leading systematics in lensing-based cosmological inference (see Section 5.2), models of this effect in synthetic catalog generation are immature. This situation is partly due to the limited availability of clean observational measurements of intrinsic galaxy alignments, but is primarily driven by the demands on modeling complexity created by the need for realistic covariance between orientation, morphology, and SED-derived properties.

The most detailed forward modeling of galaxy morphology in the published literature has been carried out using hydrodynamical simulations, which are coming to play a central role in studying intrinsic alignment systematics. However, in many respects the status of morphology modeling mirrors the situation reviewed in the previous section on SED-derived properties. While hydro simulations achieve greater complexity than models based on gravity-only simulations, their computational expense has thus far limited their direct use in synthetic catalog generation.

While information on the spatial distribution and clustering of galaxies is provided by the underlying dark matter distribution (provided by the base simulation), and is not a property of the galaxies themselves, it warrants mention here. Any measurement of clustering will involve some observational selection (e.g., on absolute magnitude), and so correlations between galaxy properties and their spatial distribution must be correctly produced by models. Such correlations could easily be lost if galaxy models do not capture the details of assembly bias from the base simulation, or if the pre-processing of the base simulation does not capture sufficient detail (e.g., if it misses populations of subhalos, or fails to construct sufficiently accurate merger trees; Benson et al. 2012, Srisawat et al. 2013, Avila et al. 2014, Wang et al. 2016).

Furthermore, clustering predictions will depend to some degree on choices made in the treatment of sub-resolution effects by any given model. For example, Knebe et al. [2018] explore how the choice of model for "orphan" galaxies (galaxies whose host subhalo is no longer detected in the base simulation, possibly for purely numerical/resolution reasons) affects predictions for two-point correlation functions, showing that these choices can have a significant effect on the amplitude of the two-point correlation function (and, therefore, on determinations of galaxy bias) due to the dependence of the frequency of orphan galaxy occurrence on the mass of the host halo. Validation of models in this respect must consider measures of clustering conditioned on a variety of observational selections.
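To make the SPS post-processing step described above concrete, the following minimal sketch integrates a galaxy SED under a filter transmission curve to obtain an AB magnitude. Both the SED and the filter here are synthetic toys; a production pipeline would instead draw them from SPS libraries and survey filter files.

```python
# Sketch of the SPS post-processing step: integrate a galaxy SED under a
# filter transmission curve to obtain a broadband AB magnitude.
import numpy as np

def ab_magnitude(wave_aa, f_lambda, filt_wave_aa, filt_trans):
    """AB magnitude of f_lambda [erg/s/cm^2/AA] through a photon-counting
    filter, using the standard <f_nu> bandpass average."""
    t = np.interp(wave_aa, filt_wave_aa, filt_trans, left=0.0, right=0.0)
    c_aa = 2.998e18                       # speed of light in AA/s
    f_nu = f_lambda * wave_aa**2 / c_aa   # convert f_lambda to f_nu
    num = np.trapz(f_nu * t / wave_aa, wave_aa)
    den = np.trapz(t / wave_aa, wave_aa)
    return -2.5 * np.log10(num / den) - 48.60

wave = np.linspace(3000.0, 11000.0, 2000)          # wavelength grid [AA]
sed = 1e-17 * (wave / 5500.0) ** -1.5              # toy continuum SED
filt_w = np.linspace(5000.0, 7000.0, 200)          # toy "r-like" band
filt_t = np.exp(-0.5 * ((filt_w - 6000.0) / 400.0) ** 2)
print(f"toy AB magnitude: {ab_magnitude(wave, sed, filt_w, filt_t):.2f}")
```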
While techniques for survey simulations have advanced significantly over the past decade, there remain several challenges which must be overcome before any of these methods can meet the scientific requirements of forthcoming surveys.

Primary among these challenges is that of calibration and validation. For SAMs (and hydrodynamical simulations), calibration is crucial to ensure that the models accurately match the target data. For empirical Monte Carlo methods, validation is key to demonstrate that the methodology is robust. In both cases, these requirements are strongly limited by the computational challenge. For SAMs, this challenge may be tractable [Henriques et al., 2009, Lu et al., 2011, 2012, Bower et al., 2010] using MCMC techniques, depending on the diversity of calibration datasets and the accuracy to which they must be matched, but will likely have a computational expense of comparable order to that used to carry out the base simulations. Furthermore, while MCMC is efficient at searching the model parameter space, there is no guarantee that a viable model (one which matches the target data to within the required tolerance) exists within that parameter space. Such approaches also require careful consideration of the errors (both systematic and random) of each target dataset [Benson, 2014], including covariances—something which does not exist for the majority of datasets. The feasibility of calibration may also be limited by the validity of input physics modules. For example, it remains unclear whether extant SPS libraries produce colors to the required accuracy [Conroy and Gunn, 2010]. For hydrodynamical simulations, precision calibration is likely impossible on timescales of interest.

Differences in validation criteria between projects may pose a challenge for synthetic sky models. All current models are imperfect, and are typically able to match only a subset of observational constraints simultaneously. If different projects have different validation requirements, this may necessitate the construction of models tuned separately to each project—possibly invalidating any potential efficiency that could be obtained by utilizing the same model for multiple projects. Coordination on validation criteria between projects—with the goal of finding mutually compatible criteria—should therefore be a priority. Additionally, observational constraints themselves are often inconsistent with other, similar constraints (e.g., two measurements of the galaxy stellar mass function which are formally different given their errors). Methods to allow for covariances between datasets, and for systematic uncertainties in those data (as well as in the models themselves), have been explored [Bernal and Peacock, 2018], but need further development to be applicable to the wide range of constraints and validation criteria that are expected.

The evolving nature of a survey's calibration requirements plays a critical and largely overlooked role in this challenge. A number of factors contribute to the evolution of these requirements as a survey progresses: additional scientists join the collaboration and bring new expertise that informs the criteria; contemporaneous surveys release new data or measurements; and alternative analyses that complement the initially planned pipelines warrant new calibrations, and commonly require entirely new features of the model to be introduced.
Our assessment is that this evolving nature is rather fundamental to the operating mode of all the large collaborations relevant to this report, and that this is unlikely to change for the indefinite future. The reason this aspect of the workflow is central to any discussion of the computational challenges involved in generating simulation-based synthetic skies is simple: the evolving nature of a survey's calibration requirements precludes the possibility of a one-time-only "hero" calibration. This sharply contrasts with the challenges associated with running the N-body simulations.

While the base simulations described in Section 2 are likely to dominate the computational cost of synthetic sky map production, the production and validation of galaxy populations will be a non-negligible computational cost in its own right. The exact cost will depend on the strictness of the validation criteria for each specific survey, and on precisely which quantities are required (e.g., calculation of full SEDs is computationally much more demanding than producing just one or two broadband luminosities). As noted above, calibration and validation of models will almost certainly require performing multiple (likely ≫ 1) runs of each model. Because of this, while calibration and validation can often be performed on a subset of the complete simulation volume, the total computational cost for calibration and validation is still expected to be of the same order as processing of the full simulation. Given current computational resources, calibration to the levels required is possible (though costly) for SAMs and empirical models, but impractical for hydro simulations.

Given these considerations, successful production of synthetic sky maps across surveys will require a significant investment of both computational and human resources. Due to the broad sweep of expertise across galaxy formation physics that is required to build a sufficiently complex and accurate model, and due to the evolving nature of the calibration requirements, the associated labor will need to be carried out in close collaboration with each of the surveys' analysis working groups.

Synthetic sky map models are already able to meet the goal of being applied to the current generation of base gravity-only simulations, although in some cases this requires considerable computational resources. As the gravity-only simulations increase in volume and resolution, as the sizes of the required synthetic sky maps grow, and as the demands for modeling of additional quantities increase (particularly for multi-survey modeling), the computational demand of synthetic sky map production will rise significantly. While these demands will be met in part by the next generation of supercomputers, significant investment of effort in code optimization and in the development of statistical techniques to reduce computational demand will be crucial.

Another significant challenge is to produce synthetic sky maps which meet the requirements of science working groups—in terms of the diversity of galaxy properties which are modeled, the accuracy to which those properties match reality, and the extent to which key physical correlations between properties are captured by the model. Substantial efforts are needed to define validation criteria for models (particularly if they are to be used for multiple surveys where those criteria may be very different), to develop improved or extended modeling techniques where needed, and to develop more efficient methods to calibrate models. This labor cannot be effectively conducted by an individual or an isolated research group, but instead requires close collaboration with the survey(s) whose analysis working groups have needs for the synthetic catalogs.

The funding structure of large cosmological surveys provides insufficient professional incentive to carry out this work. Currently, individual groups within a survey compete with each other to provide the synthetic mock that is singled out as the "flagship" or "standard" catalog of the collaboration; as generating mocks is a fairly specialized scientific activity, it is common for the graduate students and postdoctoral researchers involved to struggle to advance to the next career stage within the field.
This competition-based funding model has thus far resulted in closed-source software packages with only modest applicability beyond the specific survey for which each package was tailored.

Our assessment is that meeting the cross-survey goals outlined here requires a sustained effort to develop a scalable modeling platform with natural extensibility to multi-wavelength cosmological data. This platform would need to be developed in close contact with each survey's scientific working groups, and the code base would need to be open-source and adaptable to suit the needs of the specialized analyses within each survey. We consider it unlikely that any such framework will emerge in the absence of a new channel of stable, long-term funding dedicated to supporting the effort.
We conclude this section by scoping the actions required to meet the challenge of generating synthetic galaxy catalogs that would be useful across surveys such as Rubin, WFIRST, and Euclid.

• Phase 1: Comprehensive Assessment; ∼12 months
Conduct a comprehensive assessment of the needs of all major surveys for which the mock data are intended. This necessitates working closely with the relevant analysis working groups of each survey to build and achieve consensus on quantitative validation criteria that will be used to evaluate the mock. Special care must be taken to ensure that the validation data are self-consistent; each criterion should be associated with one or more specific science aims of the surveys. This phase could be accomplished in roughly one calendar year by scientists whose expertise spans the range of topics covered in this section.

• Phase 2: Model Development; ∼

Develop models and scalable software tools to generate galaxy catalogs with properties that are currently not available in mocks built for a particular survey. The end result of this phase is the formulation of a comprehensive model with sufficient complexity to meet each of the surveys' needs, and a scalable implementation that can efficiently leverage the architectures of leadership-class computing facilities. The labor for Phase 2 could in principle commence ∼ ∼

• Phase 3: Model Calibration; ∼

Having built the form of the model and established quantitative optimization criteria, Phase 3 will result in the delivery of a synthetic catalog that simultaneously meets the needs of all surveys participating in Phase 1. We stress that the labor involved in Phase 3 is not merely managing a large computation, but will in fact be iterative with both of the previous phases: as the calibration effort proceeds, the scientists will discover new features required of the model, and the validation criteria will undoubtedly evolve as new data become available in the time spanned by this effort. Phase 3 necessarily follows Phase 2, and could be conducted over ∼

This section reports on systematic effects that can be investigated by the surveys via the use of large-scale simulations. It is important to be clear about the scope of this section: we are not exploring the use of simulations for systematics modeling in a general sense; in particular, we are not looking at observational systematics such as photo-z and shear calibration, extinction, and sky brightness. We stress that these effects also require the use of simulations; however, they are more survey-specific than synergistic and hence beyond the scope of this report. We do explore the use of simulations for modeling systematic effects that are not survey-specific and where a joint simulation campaign would benefit each of WFIRST, Euclid, and Rubin.

The systematics considered here include intrinsic alignments, baryonic effects, galaxy bias, the non-linear evolution of structure formation, and projection effects. These systematics are not independent of one another, which is a strong incentive to investigate their cross-correlations. This is particularly important given the fact that none of the astrophysical systematics are first-principle calculations (with the exception of perturbative galaxy bias expansions) but rather phenomenological descriptions that are based on observations and analytical approximations, implemented through subgrid physics models or via semi-analytic models, the latter of which are added to gravity-only simulations in post-processing (also see Sect. 4). In this context it is vital to ensure an information exchange between ongoing observational campaigns (i.e.,
the Dark Energy Survey, the Kilo-Degree Survey, the Hyper Suprime-Cam Survey, the Baryon Oscillation Spectroscopic Survey, and the Dark Energy Spectroscopic Instrument), which enhance our understanding of astrophysical models, and the simulation campaigns of WFIRST, Rubin, and Euclid, which need to implement the improved understanding derived from those surveys into increasingly refined simulations. This is an iterative process that requires close interaction of observers, theorists, and simulators, and it requires an equally close interaction of the research frontier and large infrastructure efforts.

5.1 Accounting for Baryonic Effects

As optical and near-IR imaging surveys push the measurements of galaxy clustering and weak lensing into the non-linear regime, it is important to understand effects at smaller scales. In particular for weak lensing, the signal is mostly concentrated on smaller scales, and thus accounting for baryonic effects on the matter power spectrum becomes critically important for unbiased cosmological parameter inference [e.g., Semboloni et al., 2013, Zentner et al., 2013, Eifler et al., 2015]. This is also true for cluster science, where the need to characterize galaxy clusters with baryonic physics is becoming critical if one wants to use clusters to provide unbiased cosmological constraints (see, e.g., Bocquet et al. [2016] for a study of the impacts of baryons on the halo mass function). There are currently multiple efforts to understand and simulate detailed baryonic physics within sizable (≈100 h⁻¹ Mpc on a side) cosmological volumes. A list of hydrodynamic simulations and their properties is given in Section 2.3, Table 1. A major focus of these studies in the cosmological context is trying to understand at which length scales (in Fourier-space and real-space analyses) baryonic physics becomes so important that predictions from gravity-only simulations can no longer be used for cosmological analyses. For a very recent comparison of different hydrodynamical simulations, including EAGLE, Illustris, and IllustrisTNG100 and TNG300, see Springel et al. [2018]; a recent study of their impact on weak lensing with future surveys can be found in Huang et al. [2018].

At this point, more studies are needed to enable robust predictions for the matter density power spectrum on small scales and for the effects of baryons on cluster mass measurements. Initiating a joint program across the surveys to tackle this question would enable detailed comparisons and studies of the (very different) subgrid physics models that are employed in these simulation efforts and of how they affect the cosmological observables.

Figure 1: The power spectrum ratio of different hydrodynamical simulations with respect to their counterpart dark-matter-only (DMO) simulations at z = 0. The thick lines represent the EAGLE, MassiveBlack-II (MB2), Illustris, and Horizon-AGN simulations, while the thin lines indicate the 9 different baryonic scenarios in the OWLS simulation suite. The dashed vertical line divides the power spectrum ratios into regions where data points come from direct measurements (k ≤ 30 h/Mpc) or from extrapolation with a quadratic spline fit (k ≥ 30 h/Mpc). Figure taken from Huang et al. [2018].
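As a minimal illustration of how such results are used downstream, the sketch below applies a tabulated hydro-to-DMO power ratio of the kind shown in Fig. 1 to a gravity-only power spectrum. The tabulated numbers are placeholders, not measurements from any of the simulations named above.

```python
# Sketch: apply a tabulated baryonic suppression ratio P_hydro/P_DMO (the
# quantity shown in Fig. 1) to a gravity-only power spectrum. The table
# below is a made-up placeholder, not taken from any actual simulation.
import numpy as np
from scipy.interpolate import interp1d

k_tab = np.array([0.1, 0.5, 1.0, 3.0, 10.0, 30.0])          # k [h/Mpc]
ratio_tab = np.array([1.00, 0.99, 0.97, 0.90, 0.85, 1.05])  # placeholder ratio

# Interpolate the ratio in log(k); hold the endpoint values outside the table.
ratio_of_logk = interp1d(np.log(k_tab), ratio_tab, kind="cubic",
                         bounds_error=False,
                         fill_value=(ratio_tab[0], ratio_tab[-1]))

def apply_baryons(k, p_dmo):
    """Multiply a gravity-only P(k) by the interpolated hydro/DMO ratio."""
    return p_dmo * ratio_of_logk(np.log(k))

k = np.logspace(-1, 1.3, 50)
p_dmo = k ** -1.5                  # stand-in gravity-only spectrum
p_baryon = apply_baryons(k, p_dmo)
```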
5.2 Intrinsic Alignments

Cosmic shear is typically measured through two-point correlations of observed galaxy ellipticities. In the weak lensing regime, the observed ellipticity of a galaxy is the sum of its intrinsic ellipticity, ε_I, and the gravitational shear, γ: ε_obs ≈ ε_I + γ. If the intrinsic shapes of galaxies are not random, but spatially correlated, these intrinsic alignment correlations can contaminate the gravitational shear signal and lead to biased measurements if not properly removed or modeled. Since early work establishing the potential effects [Heavens et al., 2000, Catelan et al., 2001, Crittenden et al., 2001], intrinsic alignments (IA) have been examined through observations [e.g., Hirata et al., 2007, Joachimi et al., 2011, Blazek et al., 2012, Singh et al., 2015], analytic modeling, and simulations [e.g., Schneider et al., 2012b, Tenneti et al., 2015, 2014]; see Troxel and Ishak [2015] and Joachimi et al. [2015], and references therein, for recent reviews. A fully predictive model of IA would include the complex processes involved in the formation and evolution of galaxies and their dark matter halos, as well as how these processes couple to the large-scale environment. In the absence of such knowledge, analytic modeling of IA on large scales relates observed galaxy shapes to the gravitational tidal field and typically considers either tidal (linear) alignments or tidal torquing models.

The shapes of elliptical, pressure-supported galaxies are often assumed to align with the surrounding dark matter halos, which are themselves aligned with the stretching axis of the large-scale tidal field [Catelan et al., 2001, Hirata and Seljak, 2004]. This tidal alignment model leads to shape alignments that scale linearly with fluctuations in the tidal field, and it is thus sometimes referred to as "linear alignment," although nonlinear contributions may still be included [Bridle and King, 2007, Blazek et al., 2011, 2015]. For spiral galaxies, where angular momentum is thought to be the primary factor in determining galaxy orientation, IA modeling is typically based on tidal torquing theory, leading to a quadratic dependence on tidal field fluctuations [Catelan et al., 2001, Lee and Pen, 2008]. However, on sufficiently large scales, a contribution that is linear in the tidal field may dominate. Due to this qualitative difference in assumed alignment mechanisms, source galaxies are often split by color into "red" and "blue" samples, as a proxy for elliptical and spiral types. Indeed, blue samples consistently exhibit weaker IA on large scales, supporting the theory that tidal alignment effects are less prominent in spirals [Faltenbacher et al., 2009, Hirata et al., 2007, Mandelbaum et al., 2011]. On smaller scales, IA modeling must include a one-halo component to describe how central and satellite galaxies align with each other and with respect to the distribution of dark matter [Schneider and Bridle, 2010]. Krause et al. [2016] have conducted an exhaustive analysis of the impact of IA on Rubin weak lensing analyses (see Fig. 2 for some representative results), varying luminosity functions, IA models, mitigation schemes, and contamination fractions of blue and red galaxies. Numerical simulations, especially those including hydrodynamical physics, have recently become powerful tools for constructing these models [Schneider et al., 2012b, Joachimi et al., 2013, Tenneti et al., 2015, 2014, Chisari et al., 2017]. It will be critical in the future to refine these simulations with the latest observations, to forecast the impact on Rubin, WFIRST, and Euclid analyses, and to further refine this iterative approach to improve IA modeling. In this context it is of particular interest to study the correlations between IA uncertainties and galaxy-halo and baryonic modeling uncertainties, and to develop a joint description of these intertwined astrophysical phenomena.
Figure 2: The impact of IA on WL constraints (68 per cent confidence region) from Rubin assuming the nonlinear alignment (NLA) scenario. We consider different luminosity functions, i.e., GAMA (red/dashed) and DEEP2 (green/long-dashed); for the GAMA luminosity function we also consider the case in which blue galaxies have a mild NLA IA contribution (blue/dot-dashed). The Rubin statistical errors are shown in black/solid. Orange/dot-long-dashed contours show results when using the most extreme of these cases, i.e., the data vector corresponding to the blue contours, as input and including a standard IA mitigation scheme in the analysis. The marginalized likelihood is obtained by integrating over an 11-dimensional nuisance parameter space (see text for details). Figure taken from Krause et al. [2016].
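For concreteness, the following sketch implements the tidal-alignment scalings discussed above in their nonlinear-linear alignment (NLA) form, in which the II and GI spectra are rescaled versions of the matter power spectrum. The normalization constant follows the common C1·ρ_crit convention; the amplitude, Ω_m, growth factor, and input power spectrum below are placeholder values, not survey-calibrated quantities.

```python
# Sketch of the NLA scalings discussed above: the intrinsic-intrinsic (II)
# and shear-intrinsic (GI) spectra follow from the matter power spectrum
# via a redshift-dependent amplitude F(z). All inputs are placeholders.
import numpy as np

C1_RHOC = 0.0134   # C1 * rho_crit, the conventional IA normalization

def nla_amplitude(a_ia, omega_m, growth_z):
    """F(z) = -A_IA * C1*rho_crit * Omega_m / D(z)."""
    return -a_ia * C1_RHOC * omega_m / growth_z

def ia_spectra(p_delta, a_ia=1.0, omega_m=0.3, growth_z=0.78):
    """Return (P_II, P_GI) given the matter power spectrum P_delta."""
    f = nla_amplitude(a_ia, omega_m, growth_z)
    return f ** 2 * p_delta, f * p_delta

k = np.logspace(-2, 1, 100)
p_delta = k ** -1.5               # stand-in matter power spectrum
p_ii, p_gi = ia_spectra(p_delta)  # IA contamination terms
```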
5.3 The Nonlinear Regime of Structure Formation

The nonlinear regime of structure formation holds a wealth of cosmological information. For Rubin this has been demonstrated in Krause and Eifler [2017], Fig. 4 (reproduced here in Fig. 3). The figure shows the information content as a function of the minimum scale included in an analysis. For the black, red, and blue contours a standard linear galaxy bias model is assumed, whereas for the green contours, which include information from scales down to 0.1 Mpc/h, the analysis assumes a 6-parameter Halo Occupation Density model [see Krause and Eifler, 2017, for details].

Figure 3: This figure shows the gain in information on the dark energy parameters wa and wp (where the latter corresponds to the commonly known w parameter, but computed at a pivot redshift p) as a function of the minimum scale included in galaxy clustering and galaxy-galaxy lensing measurements. We show results for a Rubin joint clustering and weak lensing analysis (so-called 3x2pt), which assumes R_min = 10 Mpc/h (black/solid), and the corresponding constraints when using R_min = 20 Mpc/h (red/dashed), R_min = 50 Mpc/h (blue/dot-dashed), or R_min = 0.1 Mpc/h (green/long-dashed) instead. For the latter, the modeling switches from linear galaxy bias to a 6-parameter Halo Occupation Density (HOD) implementation. Figure taken from Krause and Eifler [2017].

This analysis is performed in a 50+ dimensional parameter space, and we note that such a significant gain in information, given the high dimensionality of the parameter space, is extremely rare. Rubin, WFIRST, and Euclid have great potential to exploit this information if scientists can provide accurate predictions well into the nonlinear regime. This task is difficult: not only does baryonic physics alter predictions on small scales, but even generating high-accuracy, gravity-only results across cosmologies is challenging. To this end, the nonlinear evolution of dark matter on large scales can be treated in different ways. One is perturbation theory, which has been the default method when interpreting galaxy clustering in redshift surveys; it allows a somewhat more controlled understanding on semi-nonlinear scales. Another method employs phenomenological fits to N-body simulations based on the halo model, like Halofit (for the most recent incarnation see Takahashi et al. 2012), or emulators of the actual N-body power spectrum measurements [Lawrence et al., 2017]. A third approach is full forward modeling, in which simulations are rapidly produced (using fast approximate codes) and the outputs are compared with observational datasets directly [see, e.g., Agrawal et al., 2017]. A fourth approach uses machine learning directly to predict cosmological parameters from the large-scale structure down to very small scales [Ravanbakhsh et al., 2017]. All of these methods need to be refined to reach the accuracy required for upcoming surveys. A joint effort to investigate the validity of these approaches, their most efficient implementation, and their spatial reach at a given accuracy would be extremely valuable across the three surveys targeted in this report.
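As an illustration of the second approach above (halo-model-based fits to N-body simulations), the sketch below evaluates the linear and Halofit nonlinear matter power spectra with the publicly available camb Python package (assumed installed); the cosmological parameter values are illustrative placeholders.

```python
# Sketch: linear vs. Halofit nonlinear matter power spectrum via the camb
# Python package. Parameter values are illustrative only.
import camb

pars = camb.CAMBparams()
pars.set_cosmology(H0=67.5, ombh2=0.022, omch2=0.122)
pars.InitPower.set_params(As=2.1e-9, ns=0.965)
pars.set_matter_power(redshifts=[0.0], kmax=30.0)
pars.NonLinearModel.set_params(halofit_version="takahashi")  # Takahashi et al. 2012

# Linear prediction
pars.NonLinear = camb.model.NonLinear_none
kh, z, pk_lin = camb.get_results(pars).get_matter_power_spectrum(
    minkh=1e-3, maxkh=30.0, npoints=200)

# Halofit nonlinear prediction
pars.NonLinear = camb.model.NonLinear_both
kh, z, pk_nl = camb.get_results(pars).get_matter_power_spectrum(
    minkh=1e-3, maxkh=30.0, npoints=200)

boost = pk_nl[0] / pk_lin[0]   # nonlinear enhancement at z = 0
```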
5.4 TACS Findings for Systematic Effects

Astrophysical systematics are common across all surveys, and developing the required systematics mitigation strategies to optimize the science return of Rubin, WFIRST, and Euclid requires an integrated effort that includes simulations, observations, and analytical descriptions. Observations from precursor surveys such as DES, KiDS, and HSC (in combination with CMB and spectroscopic surveys like BOSS and later DESI) provide information on, e.g., galaxy bias, velocity bias, baryonic scenarios, and intrinsic galaxy alignments. Some of these systematics can be modeled through analytical expressions, which are then incorporated into numerical simulations in two ways: 1) via a post-processing step of N-body simulations, or 2) through fine-tuning the sub-grid physics in hydrodynamical simulations. The increased precision of these simulations will in turn enable an improved interpretation of Rubin, WFIRST, and Euclid data. This will be an iterative process, but a necessary one to avoid being dominated by astrophysical systematics in future surveys. It is important in this iterative process not to double-count information, i.e., to develop a thorough procedure such that the data used to improve the simulations are not also the data then analyzed with said improved simulations. Ensuring that the information used in the systematics simulations remains independent of the Rubin, WFIRST, and Euclid data is critical.
TACS finds that teams should be assembled through competitively selected grants and should include experts from observations, simulations, and analytical modeling. These teams should be cross-institutional and cross-survey; they should include experts from precursor surveys (e.g., BOSS, DES, eBOSS, HSC, KiDS, SDSS, VIPERS, etc.) and from external data sets (e.g., CMB, X-ray, SZ), as well as experts on numerical simulations and analytical modeling. In this context it is important to note that astrophysical systematics are correlated with one another and with the cosmological observables, and that developing strategies for each of the systematics independently will have very limited success.

• Phase 1: Joint-probe, joint-survey assessment and forecasting; ∼12 months
Systematics modeling and mitigation experts, together with numerical simulation experts from Rubin, WFIRST, and Euclid, should share information on modeling strategies, existing/planned simulations, and anticipated access to external data sets. The impact of the systematics (all of them together) on a joint-probe and joint-survey analysis needs to be assessed properly (cosmological forecasts).

• Phase 2: Calibration, Validation, and Verification; ∼12 months
Continuation of the forecasting effort, but informed by early results from the simulations. The simulations should be calibrated with target observations and should be made as realistic as possible. At this point in time it should be clear which parameters in the simulations (e.g., subgrid physics) have the largest impact on the observables. This phase should also include developing a strategy to fine-tune the simulations via observables that are weakly dependent on, or if possible fully independent of, cosmology.

• Phase 3: Systematics mitigation implementation; ∼12 months
Implementation of the systematics mitigation strategies into cosmological modeling frameworks. Simulations at this point should span a large range of realistic cosmological and systematics models, and they should not violate any cosmology-independent observables within reasonable error bars. Mitigation strategies should be tested using precursor data and independent simulations.
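As one concrete example of a mitigation strategy that could be implemented in this phase, the sketch below performs a simplified version of the PCA-based marginalization of baryonic effects in the spirit of Eifler et al. [2015]: difference vectors between baryonic and gravity-only data vectors are decomposed into principal components, whose amplitudes can then be marginalized over in a likelihood analysis. The data vectors here are random stand-ins, and the full method additionally whitens with the data covariance, which is omitted here.

```python
# Simplified PCA marginalization of baryonic effects (in the spirit of
# Eifler et al. 2015): difference vectors between data vectors computed in
# several baryonic scenarios and a gravity-only fiducial define nuisance
# modes in data space. All data vectors below are random stand-ins.
import numpy as np

rng = np.random.default_rng(1)
n_scenarios, n_data = 8, 500       # e.g., 8 hydro scenarios, 500 data points
d_fid = rng.normal(size=n_data)                         # gravity-only vector
d_hydro = d_fid + 0.05 * rng.normal(size=(n_scenarios, n_data))

# Difference matrix: one row per baryonic scenario.
delta = d_hydro - d_fid

# SVD of the difference matrix; right-singular vectors are the PC modes.
_, svals, vt = np.linalg.svd(delta, full_matrices=False)
n_modes = 3
pc_modes = vt[:n_modes]            # dominant nuisance directions

# In a likelihood analysis, the amplitudes of these modes would be
# marginalized over alongside the cosmological parameters.
amplitudes = pc_modes @ (d_hydro[0] - d_fid)   # example projection
```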
The analysis of cosmological data and simulations relies on using the most sophisticated statistical methods available today. The input to many of these methods is large numbers of dark matter-only simulations, as discussed in Section 3. Due to the high cost of these simulations, it is very important to study statistical methods that help to reduce the number of required simulations. Examples include the development of emulators (prediction tools) from a limited set of high-quality simulations spanning a range of cosmological parameters, or new modeling techniques for covariance estimates that reduce the number of realizations needed. We have identified common statistical challenges that rely on expensive simulations and discuss possible alternative methods that should be evaluated further.

The creation of each virtual universe — for a given set of cosmological and marginalization parameters, as well as the particular random realization of the initial density fluctuations — requires an extremely computationally expensive simulation on High Performance Computing resources. In order to make cosmological inverse problems practically solvable, constructing a computationally cheap surrogate model, or emulator, is imperative. Current approaches to emulators require the use of a summary statistic which is to be emulated, and use simulations of the same fidelity for each "design" point in the N-dimensional parameter space.

To meet future survey requirements, we expect the next generation of emulators to make progress in the following ways: (1) to have an iterative instead of a fixed design; (2) to be multi-fidelity capable, meaning to combine simulations done at different fidelities; and (3) to use multi-level emulation by separating the design into "expensive" (e.g., cosmology) parameters and "cheap" parameters, like those appearing in post-processing runs, responsible for predicting different luminosities or galaxy types from the density field.
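For reference, the sketch below shows a fixed-design emulator of the current type: a Gaussian-process regressor (here via scikit-learn) mapping cosmological parameters to a scalar summary statistic. In an iterative design of the kind advocated above, the predictive uncertainty would drive the selection of new simulation points. Training data and parameter ranges are synthetic stand-ins, not simulation outputs.

```python
# Sketch of a fixed-design emulator: a Gaussian-process regressor mapping
# (Omega_m, sigma_8) to a scalar summary statistic. The training set below
# stands in for measurements from 40 simulations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(3)
X = rng.uniform([0.24, 0.70], [0.40, 0.90], size=(40, 2))
y = X[:, 1] ** 2 * np.sqrt(X[:, 0]) + 1e-3 * rng.normal(size=40)  # toy statistic

gp = GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=[0.1, 0.1]),
    normalize_y=True)
gp.fit(X, y)

# Predictive mean and uncertainty at a trial cosmology. In an iterative
# design, points with the largest predicted uncertainty would be the next
# simulations to run.
mean, std = gp.predict(np.array([[0.31, 0.81]]), return_std=True)
```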
Methods to obtain covariances can be broadly structured into 3 different categories: 1) analytic covariances, 2) covariances estimated from numerical simulations, and 3) covariances estimated from the data directly. These methods have different advantages and disadvantages; precursor surveys of Rubin, WFIRST, and Euclid have mostly focused on analytic covariances [Krause et al., 2017, Abbott et al., 2018, Hildebrandt et al., 2017, Hikage et al., 2018] and only rarely on simulation-based covariances. Covariances directly estimated from the data have not been used recently due to known biases when estimating the variance of a large survey from smaller subsets (e.g., Friedrich et al., 2016). All analyses that used analytic covariances have had some validation scheme that involved numerical simulations.

Analytic covariance matrices have 3 main advantages: 1) computational feasibility for large data vectors, especially in multi-probe analyses [see, e.g., Krause and Eifler, 2017, for a 7+ million entry joint covariance of weak lensing, galaxy-galaxy lensing, galaxy clustering, cluster number counts, and cluster weak lensing], 2) simple inversion procedures, and 3) flexibility in terms of the scales, redshifts, and galaxy samples that are considered. Whereas the second-order and supersample variance terms can be computed sufficiently precisely (and much faster) using analytic covariances, the question remains whether higher-order moments of the density field are captured sufficiently precisely by analytic descriptions, primarily via the halo model. In the case of weak lensing, Barreira et al. [2018] have recently shown that the Gaussian and supersample covariance terms are dominant, such that the higher-order (connected trispectrum) terms can be neglected without significantly biasing Rubin and Euclid likelihood constraints. This result needs to be explored in the context of clustering, galaxy clusters, and other probes, but for weak lensing it has become clear that analytic covariances are a viable solution.

When moving to simulation- or data-based covariance matrices, the choice of the best estimator becomes an important scientific question. Extensive research has been conducted on covariances obtained via the standard sample variance estimator and on the impact of imperfectly estimated covariances on cosmological parameter constraints. At the heart of the problem is the simple fact that the Gaussian likelihood, which is commonly assumed in cosmological analyses, requires an inverse covariance. Unfortunately, the inverse of an estimated covariance is not an unbiased estimate of the inverse covariance, and even minute residual noise in the covariance estimator can severely bias the inverse. Hartlap et al. [2007] described a way to correct for this when assuming that the covariance estimate follows a Wishart distribution [see also Kaufman, 1967, Anderson, 2003]. The noise properties of this corrected precision matrix estimator and its impact on the constraints derived on cosmological parameters were investigated by, e.g., Taylor et al. [2013], Dodelson and Schneider [2013], and Taylor and Joachimi [2014], where the authors pointed out the enormous number of realizations required (of order 10⁴ or even 10⁶) to achieve an inverse covariance with acceptable precision.

Recently, new hybrid estimators (combining analytics, simulations, and data) have emerged [Friedrich and Eifler, 2018], and linear and nonlinear shrinkage estimators are being explored [Joachimi, 2017, Pope and Szapudi, 2008], which have substantially reduced these requirements; further reductions are possible via data compression.

The functional form of the likelihood being a multivariate Gaussian has been questioned in the literature, mostly in the context of weak lensing [Hartlap et al., 2009, Wilking and Schneider, 2013, Sellentin et al., 2018], but the same argument holds for galaxy clustering, galaxy-galaxy lensing, and other large-scale structure probes. The core argument is that summary statistics derived from a non-Gaussian field have no first-principles reason to follow a multivariate Gaussian likelihood. In the context of the CMB, a corresponding approximation for temperature and polarization second-order statistics has been shown not to bias results; however, the CMB field is substantially closer to Gaussian than the late-Universe large-scale structure observables.

Alternative approaches, such as estimating the likelihood from simulations directly, or utilizing likelihood-free analysis techniques such as Approximate Bayesian Computation, are still in their early phase of exploration and require targeted research funding to mature fully as alternatives. The necessity of abandoning the multivariate Gaussian likelihood function as an assumption needs to be established first. Currently the literature does not conclusively state whether this approximation fails at the level of precision of Rubin, WFIRST, and Euclid. For this exploration we recommend a staged process of analytic exploration, inexpensive simulations, e.g., FLASK [Xavier et al., 2016], and subsequently ∼ high-precision simulations (the necessary number will be more precisely determined during the first two steps).

Developing meaningful discrepancy metrics is a core element of interpreting cosmological data. The most prominent questions are: Is model A preferred over model B (LCDM vs. wCDM in the simplest case)? Is dataset A in tension with dataset B (Euclid vs. WFIRST vs. Rubin)? Before combining datasets, scientists must assess whether the data to be combined are in tension with one another in the context of a given cosmological and systematics model.

Discrepancy metrics are also important for a joint simulation effort of Rubin, WFIRST, and Euclid, namely in determining whether the simulations are sufficiently precise given the constraining power of the surveys individually and then jointly. This is not a trivial task since, in principle, such an assessment requires an even more precise simulation of the survey(s) in the first place.
Even in the presence of such a fiducial high-precision simulation (e.g., the Euclid Flagship simulation), the questions arise: what precision do the emulator simulations need, what precision do the covariance/likelihood simulations need, and what precision do the systematics simulations need?

The most common ways to quantify discrepancies are through either the biases in the w0-wa parameter space that arise when piping an imperfect simulation through a survey simulation pipeline, or through the increase in the error bars, usually also in the w0-wa parameter space, when accounting for an imperfect simulation by adding nuisance parameters to the survey simulation or by adding the uncertainty quadratically to the covariance. But even here the analysis choices can determine the outcome of the discrepancy quantification, and hence whether a simulation is deemed sufficiently precise or not. Common analysis choices include: 1) how the covariances are computed, 2) which probes are included in the analysis, 3) which scales, redshifts, and galaxy samples are selected, 4) what priors are assumed from external data, 5) which systematics are included in the survey simulation, how they are parameterized, and what their priors are, and, most importantly, 6) what physics is included in the parameter space (e.g., neutrinos, curvature, dark matter models).

In order to assess whether a simulation (campaign) is sufficiently accurate for the individual surveys Rubin, WFIRST, and Euclid, and additionally for their joint analysis, it is critical to unify the analysis choices for the survey simulations across the Rubin, Euclid, and WFIRST communities. This will give the surveys a meaningful framework to assess whether a given simulation (campaign) is suitable for their needs. It is furthermore important to quantify the precision of the simulations not only for time-dependent dark energy, i.e., the w0-wa plane, but also for more general dark energy models and modified gravity scenarios (e.g., the alpha parameterization).

Most importantly, the assessment of whether a simulation (campaign) is sufficiently precise should happen early, i.e., during the planning phase of said effort.

Statistical methods are a critical element of a coordinated simulation effort across surveys. Corresponding research is indispensable in order to efficiently use the existing computational resources (examples are emulators and covariance estimators) and in order to ensure that simulations generated within one of the surveys are meaningful for another survey (discrepancy metrics).
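To make the covariance discussion above concrete, the following sketch shows the standard Hartlap et al. [2007] debiasing of a precision matrix estimated from a finite number of simulated realizations; the realization count and data-vector dimension below are arbitrary illustrative choices.

```python
# Sketch of the precision-matrix debiasing discussed above: with a sample
# covariance estimated from n_s realizations of a p-dimensional data vector,
# the inverse is debiased by the Hartlap et al. (2007) factor
# (n_s - p - 2) / (n_s - 1), valid under the Wishart assumption.
import numpy as np

rng = np.random.default_rng(7)
n_s, p = 400, 50                       # realizations, data-vector dimension
sims = rng.normal(size=(n_s, p))       # stand-in survey realizations

cov_hat = np.cov(sims, rowvar=False)   # sample covariance estimate
hartlap = (n_s - p - 2) / (n_s - 1)
precision = hartlap * np.linalg.inv(cov_hat)

# Without the factor, chi^2 values built from inv(cov_hat) are biased high;
# the correction rescales the precision matrix accordingly.
```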
The pilot studies suggested below are best implemented through competitive research grants from the DOE, NSF, and NASA, or small "Tiger Teams" that combine expertise in statistical methods and numerical simulations across the surveys. First and foremost, the surveys should share expertise and code on the topics below and develop a coordinated testing scheme of the code implementations. The simulation resources required to implement some of the solutions on emulators, covariances/likelihoods, and discrepancy metrics should be shared, and the solutions should be tested on these shared resources in Phases 2 and 3. For example, covariance estimators using simulations that are developed by each survey should be tested against one another in simulated likelihood analyses.
• Emulators: Emulation of the computationally expensive aspects of a survey analysis is an indispensable concept in survey cosmology. Research on improved emulators, especially in the context of new machine learning concepts, that interfaces statistical expertise with the expertise of numerical simulators, should be a priority in competitively selected grants. A minimal illustration of the emulator concept follows.
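The sketch below is an illustrative toy, not any survey's production emulator: it fits a Gaussian-process interpolator to a handful of "simulation outputs" over a single parameter, where production emulators (e.g., the Coyote Universe or Mira-Titan efforts cited below) work on multi-dimensional cosmological parameter spaces and full summary statistics. It assumes scikit-learn is available; the training function is a made-up stand-in for expensive simulation output.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy "design": a handful of expensive simulation runs at chosen
# parameter values (here a single parameter, e.g., sigma_8).
theta_train = np.linspace(0.7, 0.9, 8).reshape(-1, 1)
# Hypothetical scalar simulation output at each design point.
y_train = np.sin(10.0 * theta_train).ravel() + 0.1 * theta_train.ravel()

# Fit a Gaussian-process interpolator to the design points.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1),
                              normalize_y=True)
gp.fit(theta_train, y_train)

# The emulator now predicts the simulation output, with an attached
# uncertainty, at parameter values where no simulation was run.
theta_new = np.array([[0.815]])
mean, std = gp.predict(theta_new, return_std=True)
print(f"emulated output: {mean[0]:.3f} +/- {std[0]:.3f}")
```

The attached predictive uncertainty is the feature that makes Gaussian processes a common choice here: it indicates where additional simulation runs would most improve the emulator.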
• Covariances/Likelihoods:
1. The current state-of-the-art for covariance generation, given the latest results in analytical computation, hybrid estimators, and non-linear shrinkage estimators, requires of order 10^3 simulated survey realizations (previously 10^4 or even 10^6).
2. Even 10^3 simulation realizations is a pessimistic scenario. With continued investment, the required number of simulations could plausibly decrease even further (possibly to of order 10^2) through a combination of data compression ideas and through combinations of the aforementioned estimators.
3. The tolerable error in the precision matrix is also dependent on the systematics budget and the overall dimensionality of the likelihood analysis. In high-dimensional parameter spaces, errors in the precision matrix translate into sub-dominant uncertainties.
4. Covariance matrices are only required when assuming that the likelihood of the considered summary statistic is a multivariate Gaussian. This assumption breaks down, strictly speaking, in the context of estimated covariances (from simulations and/or the data directly), where it has been shown that the likelihood follows a t-distribution (see the sketch after this list). Even in the case of an analytic covariance matrix, the commonly considered summary statistics (two-point statistics) are not, strictly speaking, distributed as multivariate Gaussians. Initial results differ on the importance of this effect, i.e., the non-Gaussianity of the likelihood, at least when considering two-point statistics.
5. Alternative inference techniques, e.g., Approximate Bayesian Computation or Bayesian Hierarchical Modeling, do not require assumptions on modeling a likelihood function and/or covariances. The community is increasingly exploring these avenues and, although major obstacles remain, they should be on the survey community's radar.
6. Findings:
– The impact of uncertainties in precision matrices should be examined by a joint task force of experts across the surveys. This includes implementing and testing the new estimators in a realistic, survey-specific context. Data compression should be included in this effort. The goal of this effort should be to determine the required simulation effort for covariances across all surveys.
– Non-Gaussian functional forms of the likelihood and alternative inference techniques are largely uncharted territory in terms of simulation needs for Rubin, Euclid, and WFIRST. Active research on these topics through competitively selected grants should be prioritized.
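To make finding 4 above concrete, the minimal sketch below (an illustration, not part of the report) contrasts the standard multivariate-Gaussian log-likelihood with the heavier-tailed multivariate-t form that has been derived for the case where the covariance is itself estimated from simulations. Normalization constants are dropped, and the residual and precision matrix are placeholder toy values.

```python
import numpy as np

def chi2(residual, precision):
    """Quadratic form r^T Psi r entering both likelihoods."""
    return float(residual @ precision @ residual)

def gaussian_loglike(residual, precision):
    # Standard multivariate-Gaussian log-likelihood (up to a constant).
    return -0.5 * chi2(residual, precision)

def t_loglike(residual, precision, n_sims):
    # Heavier-tailed multivariate-t form that arises when the covariance
    # is estimated from n_sims realizations (again up to a constant).
    return -0.5 * n_sims * np.log1p(chi2(residual, precision) / (n_sims - 1))

rng = np.random.default_rng(1)
p, n_sims = 20, 100
residual = rng.normal(size=p)   # data minus model; toy numbers
precision = np.eye(p)           # toy precision matrix

print(gaussian_loglike(residual, precision))
print(t_loglike(residual, precision, n_sims))
# For n_sims much larger than chi2 the two forms agree; for modest
# n_sims the t-form has heavier tails, which changes the inferred
# parameter confidence regions.
```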
• Discrepancy metrics: Discrepancy metrics are critical to assess whether the quality of simulations is acceptable in the context of individual Rubin, Euclid, and WFIRST analyses and their joint effort. TACS finds that a cross-survey collaboration effort should be created to develop these metrics in the context of realistic analysis choices for the individual surveys. A simple example of such a metric is sketched below.
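As one widely used first-pass example (not the metric the report prescribes), the sketch below computes a Gaussian parameter-shift tension between two independent experiments: the chi-square of the difference of the parameter means relative to the sum of the parameter covariances, converted to a probability-to-exceed and an equivalent number of sigmas. The (w0, wa) numbers are hypothetical, and the Gaussian-posterior assumption is exactly the kind of analysis choice the text above cautions about.

```python
import numpy as np
from scipy import stats

def gaussian_tension(theta_a, cov_a, theta_b, cov_b):
    """Parameter-shift tension between two independent experiments,
    assuming Gaussian posteriors (a common first-pass metric only)."""
    delta = np.asarray(theta_a) - np.asarray(theta_b)
    joint_cov = np.asarray(cov_a) + np.asarray(cov_b)
    chi2_val = float(delta @ np.linalg.solve(joint_cov, delta))
    # Probability-to-exceed for len(delta) degrees of freedom,
    # expressed as an equivalent number of Gaussian sigmas.
    pte = stats.chi2.sf(chi2_val, df=len(delta))
    n_sigma = stats.norm.isf(pte / 2.0)
    return chi2_val, pte, n_sigma

# Hypothetical (w0, wa) constraints from two surveys.
chi2_val, pte, n_sigma = gaussian_tension(
    theta_a=[-1.02, 0.10], cov_a=np.diag([0.03**2, 0.10**2]),
    theta_b=[-0.95, -0.05], cov_b=np.diag([0.04**2, 0.12**2]),
)
print(f"chi2 = {chi2_val:.2f}, PTE = {pte:.3f}, ~{n_sigma:.1f} sigma")
```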
Common Infrastructure to Share Simulation Products

Motivation

Rubin, WFIRST, and Euclid are all looking at the same sky in a similar time frame, and they all have similar requirements for cosmological simulations. At the simplest level, it is a poor use of resources for the three surveys to produce largely redundant simulation suites individually. In addition, there are only a limited number of people in the world with the expertise to produce extreme-scale cosmological simulations and synthetic sky catalogs, and also only a limited number of supercomputing facilities with the resources available to produce extreme-scale simulations or large suites of cosmological simulations. Given these limitations, it is challenging for the surveys to realize their cosmological simulation needs individually. In practice, it is the same simulators being approached by the different surveys with slightly varying requests for cosmological simulations and their respective data products. A common infrastructure for sharing cosmological simulations will reduce the overall number of simulations that need to be produced, reducing the pressure on both the supercomputing facilities and the simulators. It will also precipitate coordination and agreements over who is producing what simulation products and how those products will be utilized and acknowledged within each survey. If the infrastructure also includes a common approach for curating the data and some facilities for analysis, the ability of users to interact with the simulation data directly (rather than through the simulator) will be greatly increased.

In order to realize any of the common approaches outlined in this report, and to ensure the scientific success of Rubin, WFIRST, and Euclid, it is clear that a common infrastructure needs to be available. This includes hardware (e.g., storage space, data servers, fast connection and transfer links), as well as a common approach for data curation to make data products easily accessible to the community. It also includes expert support personnel (both for the simulations and the data hosting) who are actively engaged in developing and maintaining the infrastructure, in addition to supporting the users.

Examples of Existing Infrastructures

There are many solutions to hosting and sharing big datasets. The simplest solution is a basic repository that stores and hosts the simulations and associated data products for download by a user. More sophisticated solutions involve utilizing a common approach for data curation and also providing some on-site computing resources to undertake increasingly sophisticated analyses on the data.

This section provides two examples of existing data sharing and analysis infrastructures that have been used for cosmological simulations. These examples are intended to give some insight into the different solutions available for storing and hosting cosmological simulation data and do not necessarily represent the best solutions for a common Rubin, WFIRST, and Euclid infrastructure. A more detailed and thorough investigation is required to flesh out infrastructure solutions that are optimized for Rubin, WFIRST, and Euclid.
The big data system at PIC (Port d'Informació Científica, Barcelona) was used to generate the Euclid Flagship mock galaxy catalog and to allow collaborative access to this dataset. This full-sky mock, which extends to redshift 2.3, is a dataset of tens of terabytes. It is made available within the Euclid collaboration via the COSMOHUB web portal (https://cosmohub.pic.es/), which allows users to make plots and extract subsets of the data without any prior SQL knowledge. The data processing at PIC is based on the Hadoop Distributed File System (HDFS), where data is distributed on the local disks of the processing nodes of a compute cluster. This gives very high data rates as long as the I/O processing is always performed on the node that actually holds the relevant data on its local hard disk. By using Apache Spark, a map-reduce data processing framework with a Python interface, a very high degree of parallel I/O is sustained across these local hard disks. Generation of the full-sky mock galaxy catalog from a catalog of 40 billion dark matter halos can be achieved in under 24 hours. The software pipeline that produces this mock is called SciPIC; it is written in Python using the Apache Spark framework. One key component of this pipeline is an implementation of the TreeCorr [Jarvis et al., 2004] algorithm for estimating galaxy pair correlation functions. This allows the clustering of galaxies in the mock to be calibrated against observational data as a function of luminosity and color. While the system is currently able to handle the galaxy and dark matter halo catalogs and the intensity of queries currently coming from within the Euclid consortium, it does not seem suited to handling the raw simulation data. For example, the task of producing the input dark matter halo catalog is done as a separate step at the University of Zurich, using a pipeline of specialized parallel codes that can deal with 250 TB of raw particle data. Rewriting halo finders (and other analysis tasks) using Apache Spark would in principle allow raw data to be handled by a larger system of this type. The current splitting of tasks at the dark matter halo catalog level is very efficient, but it incurs a lack of transparency and redundant data exchange within the overall process of mock generation.

In 2010 the NSF awarded a Major Research Instrumentation (MRI) grant to PI Alex Szalay at Johns Hopkins University (JHU) for a project to develop a multi-petabyte generic data analysis environment. This system is called the Data-Scope (http://idies.jhu.edu/resources/datascope/) and it was designed to enable analysis of petabyte-scale datasets. The system has a 5 petabyte storage capacity and a sequential I/O bandwidth of 500 gigabytes/second. (Sequential I/O means that the data must be accessed in order, from the start of the file to the end, while random I/O allows reading or writing any part of the file at any time.) Each individual project is provided with its own node, enabling the data to be stored in a way that is optimized for that project. One example is storing the data in an SQL server, which is a database solution that enables sequential I/O. The Data-Scope system can reach 600 teraflops with its GPUs, a key component of this enabling technology, although it requires new software to be written to undertake the more traditional CPU analyses. For example, undertaking a standard correlation function calculation on galaxy pairs can become prohibitive on a CPU when the number of galaxy pairs becomes very large.
A massive 400-trillion galaxy pair correlation function calculation was undertaken on Sloan Digital Sky Survey data that was hosted on a system with very similar facilities to the Data-Scope [Tian et al., 2011]. The authors reported that the calculation was hundreds of times faster than the same calculation on traditional CPUs. The Data-Scope is an example of how very large datasets, like cosmological simulations, can be curated for intensive analyses, and of the type of hardware, software, and expertise required to undertake these efforts. However, the system is primarily focused on big data analysis and does not address curating the data for long-term hosting and widespread community access. The Data-Scope is still operational and accepts proposals from the community to undertake computationally intensive projects on large datasets.

There are a number of challenges to developing a common infrastructure. There needs to be a plan for where the simulations are being run, with some guarantee that those resources will be available for these efforts. Once the simulations have been completed, an initial analysis may be completed at a different facility, so rapid transfer capabilities for very large datasets need to be in place. Decisions need to be made about which data products are being stored and hosted and how those products are being curated to enable widespread use (i.e., does the data need to be stored sequentially or in a format that enables rapid ingestion by a database?). There are a range of solutions, from simply storing and hosting flat files for direct download by scientists, who then analyze them on a system that they identify themselves, to more sophisticated database solutions that include access to increasingly powerful analysis hardware at the data center. A minimal sketch of the declarative, data-local style of analysis described above for the PIC system follows.
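The sketch below illustrates the kind of column extraction COSMOHUB performs on a mock catalog, expressed with PySpark (the Python interface to Apache Spark mentioned above). It assumes PySpark is installed; the HDFS path, column names, and magnitude cut are hypothetical placeholders, not the actual Flagship schema.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal Spark session; on a Hadoop/HDFS cluster the data would be
# read from the distributed file system rather than a local path.
spark = SparkSession.builder.appName("mock-catalog-subset").getOrCreate()

# Hypothetical mock galaxy catalog with columns ra, dec, z, mag_i.
catalog = spark.read.parquet("hdfs:///mocks/flagship_galaxies.parquet")

# Filtering and column selection are expressed declaratively; Spark
# pushes the work to the nodes that hold each block of the data,
# which is the data-locality property described above.
subset = (catalog
          .filter((F.col("z") < 2.3) & (F.col("mag_i") < 24.5))
          .select("ra", "dec", "z", "mag_i"))

subset.write.mode("overwrite").parquet("hdfs:///mocks/bright_subset.parquet")
spark.stop()
```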
Every section of this report required either the generation or utilization of cosmological simulations to ensure the scientific success of Rubin, WFIRST, and Euclid. With limited resources and expertise available for each of the surveys, coordination between the surveys on which cosmological simulations to produce, together with a common infrastructure to share the data, will clearly contribute to the scientific success of each of the surveys. This approach will also save money in the long term by reducing the overall number of required simulations and by facilitating a common data curation approach that will increase user efficiency in accessing and utilizing the simulations. In order to facilitate effective sharing and utilization of the data products, a central, common data sharing infrastructure is required. In the absence of such coordination and infrastructure, the onus returns to the individual surveys to produce, analyze, store, and host all of their required simulations, resulting in a much higher demand on already limited computational resources and similar simulations being produced up to three times.

The work to flesh out the range of solutions for a Rubin, WFIRST, and Euclid cosmological simulation data sharing infrastructure requires additional effort. This effort includes scoping and costing the hardware requirements, coordinating with the scientists to identify which data products should be stored and the best methods for curating the data, exploring the methods for accessing the data and the options for interfacing with the data, scoping a range of support levels that a host data center could provide and costing those options, and providing detailed proposals that show what capabilities and scientific return can be expected with specific levels of investment. TACS finds that a study should be undertaken in collaboration with data centers to investigate and test solutions for a long-term archival infrastructure for simulated cosmological data products.

• Phase 1: Scoping of Requirements and Architecture; ∼
– Conduct an assessment of possible shared infrastructure and data curation solutions and identify the best choice.
– Develop requirements for an infrastructure with increasing capabilities (cost points), clearly identifying the increased capabilities at each point and highlighting the optimal choice for the surveys, Agencies, and broader community.
– Present a detailed proposal for a test-bed infrastructure that includes requirements for hardware, data curation, and personnel. Outline the tests that will be undertaken and the metrics that will be used to determine the overall success of the test-bed infrastructure. Provide a rough roadmap for moving from the test-bed to a fully realized infrastructure.

• Phase 2: Building and Exercising a Test-Bed Infrastructure; ∼
– Acquire and install required new hardware (including computing, storage, and interconnect capabilities).
– Implement and test the chosen data curation solution.
– Test the data sharing solution.
– If implemented, test the on-site analysis capabilities.
– Present an analysis of the success of the test-bed infrastructure and provide a detailed proposal for a fully realized infrastructure.

• Phase 3: Realizing the Full Data Sharing Infrastructure; ∼
During Phase 3, a fully realized archival infrastructure would be deployed. The details of the hardware and personnel required to implement this phase will depend critically on what is learned during Phases 1 and 2 of this program.
Acknowledgements
AK was supported by JPL, which is run under contract by the California Institute of Technology for NASA. AK was also supported in part by NASA ROSES grant 12-EUCLID12-0004 and NASA grant 15-WFIRST15-0008. Argonne National Laboratory's work by KH and AH was supported under the U.S. Department of Energy contract DE-AC02-06CH11357.
References
T. M. C. Abbott, F. B. Abdalla, A. Alarcon, S. Allam, J. Annis, S. Avila, K. Aylor, M. Banerji, N. Banik, E. J. Baxter, K. Bechtol, M. R. Becker, B. A. Benson, G. M. Bernstein, E. Bertin, F. Bianchini, J. Blazek, L. Bleem, L. E. Bleem, S. L. Bridle, D. Brooks, E. Buckley-Geer, D. L. Burke, J. E. Carlstrom, A. Carnero Rosell, M. Carrasco Kind, J. Carretero, F. J. Castander, R. Cawthon, C. Chang, C. L. Chang, H. Cho, A. Choi, R. Chown, T. M. Crawford, A. T. Crites, M. Crocce, C. E. Cunha, C. B. D'Andrea, L. N. da Costa, C. Davis, T. de Haan, J. DeRose, S. Desai, J. De Vicente, H. T. Diehl, J. P. Dietrich, M. A. Dobbs, S. Dodelson, P. Doel, A. Drlica-Wagner, T. F. Eifler, J. Elvin-Poole, W. B. Everett, B. Flaugher, P. Fosalba, O. Friedrich, J. Frieman, J. García-Bellido, M. Gatti, E. Gaztanaga, E. M. George, D. W. Gerdes, T. Giannantonio, D. Gruen, R. A. Gruendl, J. Gschwend, G. Gutierrez, N. W. Halverson, N. L. Harrington, W. G. Hartley, G. P. Holder, D. L. Hollowood, W. L. Holzapfel, K. Honscheid, Z. Hou, B. Hoyle, J. D. Hrubes, D. Huterer, B. Jain, D. J. James, M. Jarvis, T. Jeltema, M. W. G. Johnson, M. D. Johnson, S. Kent, D. Kirk, L. Knox, N. Kokron, E. Krause, K. Kuehn, O. Lahav, A. T. Lee, E. M. Leitch, T. S. Li, M. Lima, H. Lin, D. Luong-Van, N. MacCrann, M. A. G. Maia, A. Manzotti, D. P. Marrone, J. L. Marshall, P. Martini, J. J. McMahon, F. Menanteau, S. S. Meyer, R. Miquel, L. M. Mocanu, J. J. Mohr, J. Muir, T. Natoli, A. Nicola, B. Nord, Y. Omori, S. Padin, S. Pandey, A. A. Plazas, A. Porredon, J. Prat, C. Pryke, M. M. Rau, C. L. Reichardt, R. P. Rollins, A. K. Romer, A. Roodman, A. J. Ross, E. Rozo, J. E. Ruhl, E. S. Rykoff, S. Samuroff, C. Sánchez, E. Sanchez, J. T. Sayre, V. Scarpine, K. K. Schaffer, L. F. Secco, S. Serrano, I. Sevilla-Noarbe, E. Sheldon, E. Shirokoff, G. Simard, M. Smith, M. Soares-Santos, F. Sobreira, Z. Staniszewski, A. A. Stark, K. T. Story, E. Suchyta, M. E. C. Swanson, G. Tarle, D. Thomas, M. A. Troxel, D. L. Tucker, K. Vanderlinde, J. D. Vieira, P. Vielzeuf, V. Vikram, A. R. Walker, R. H. Wechsler, J. Weller, R. Williamson, W. L. K. Wu, B. Yanny, O. Zahn, Y. Zhang, and J. Zuntz. Dark Energy Survey Year 1 Results: Joint Analysis of Galaxy Clustering, Galaxy Lensing, and CMB Lensing Two-point Functions. ArXiv e-prints, October 2018.
Aniket Agrawal, Ryu Makiya, Chi-Ting Chiang, Donghui Jeong, Shun Saito, and Eiichiro Komatsu. Generating log-normal mock catalog of galaxies in redshift space. JCAP, 2017(10):003, 2017. URL http://stacks.iop.org/1475-7516/2017/i=10/a=003.
T. W. Anderson. An Introduction to Multivariate Statistical Analysis, pages 623–624. Wiley-Interscience, 2003.
S. Avila, A. Knebe, F. R. Pearce, A. Schneider, C. Srisawat, P. A. Thomas, P. Behroozi, P. J. Elahi, J. Han, Y.-Y. Mao, J. Onions, V. Rodriguez-Gomez, and D. Tweed. SUSSING MERGER TREES: the influence of the halo finder. MNRAS, 441:3488–3501, July 2014. doi: 10.1093/mnras/stu799.
A. Barreira, E. Krause, and F. Schmidt. Accurate cosmic shear errors: do we need ensembles of simulations? ArXiv e-prints, July 2018.
N. Battaglia, S. Ferraro, E. Schaan, and D. N. Spergel. Future constraints on halo thermodynamics from combined Sunyaev-Zel'dovich measurements. JCAP, 11:040, November 2017. doi: 10.1088/1475-7516/2017/11/040.
P. Behroozi, A. Knebe, F. R. Pearce, P. Elahi, J. Han, H. Lux, Y.-Y. Mao, S. I. Muldrew, D. Potter, and C. Srisawat. Major mergers going Notts: challenges for modern halo finders. MNRAS, 454:3020–3029, December 2015. doi: 10.1093/mnras/stv2046.
A. J. Benson. The mass function of unprocessed dark matter haloes and merger tree branching rates. MNRAS, 467:3454–3466, May 2017. doi: 10.1093/mnras/stx343.
Andrew J. Benson. Building a predictive model of galaxy formation - I. Phenomenological model constrained to the z = 0 stellar mass function. MNRAS, 444:2599–2636, November 2014. doi: 10.1093/mnras/stu1630. URL http://adsabs.harvard.edu/abs/2014MNRAS.444.2599B.
Andrew J. Benson, Stefano Borgani, Gabriella De Lucia, Michael Boylan-Kolchin, and Pierluigi Monaco. Convergence of galaxy properties with merger tree temporal resolution. MNRAS, 419:3590–3603, February 2012. doi: 10.1111/j.1365-2966.2011.20002.x. URL http://adsabs.harvard.edu/abs/2012MNRAS.419.3590B.
J. L. Bernal and J. A. Peacock. Conservative cosmology: combining data with allowance for unknown systematics. ArXiv e-prints, March 2018.
J. Blazek, M. McQuinn, and U. Seljak. Testing the tidal alignment model of galaxy intrinsic alignment. JCAP, 5:010, May 2011. doi: 10.1088/1475-7516/2011/05/010.
J. Blazek, R. Mandelbaum, U. Seljak, and R. Nakajima. Separating intrinsic alignment and galaxy-galaxy lensing. JCAP, 5:041, May 2012. doi: 10.1088/1475-7516/2012/05/041.
J. Blazek, Z. Vlah, and U. Seljak. Tidal alignment of galaxies. JCAP, 8:015, August 2015. doi: 10.1088/1475-7516/2015/08/015.
S. Bocquet, A. Saro, K. Dolag, and J. J. Mohr. Halo mass function: baryon impact, fitting formulae, and implications for cluster cosmology. MNRAS, 456:2361–2373, March 2016. doi: 10.1093/mnras/stv2657.
R. G. Bower, I. Vernon, M. Goldstein, A. J. Benson, C. G. Lacey, C. M. Baugh, S. Cole, and C. S. Frenk. The parameter space of galaxy formation. MNRAS, 407:2017–2045, October 2010. doi: 10.1111/j.1365-2966.2010.16991.x. URL http://adsabs.harvard.edu/abs/2010MNRAS.407.2017B.
S. Bridle and L. King. Dark energy constraints from cosmic shear power spectra: impact of intrinsic alignments on photometric redshift requirements. New Journal of Physics, 9:444, December 2007. doi: 10.1088/1367-2630/9/12/444.
P. Catelan, M. Kamionkowski, and R. D. Blandford. Intrinsic and extrinsic galaxy alignment. MNRAS, 320:L7–L13, January 2001. doi: 10.1046/j.1365-8711.2001.04105.x.
N. E. Chisari, N. Koukoufilippas, A. Jindal, S. Peirani, R. S. Beckmann, S. Codis, J. Devriendt, L. Miller, Y. Dubois, C. Laigle, A. Slyz, and C. Pichon. Galaxy-halo alignments in the Horizon-AGN cosmological hydrodynamical simulation. MNRAS, 472:1163–1181, November 2017. doi: 10.1093/mnras/stx1998.
Shaun Cole, Cedric G. Lacey, Carlton M. Baugh, and Carlos S. Frenk. Hierarchical galaxy formation. MNRAS, 319:168–204, November 2000. URL http://adsabs.harvard.edu/abs/2000MNRAS.319..168C.
Charlie Conroy and James E. Gunn. The Propagation of Uncertainties in Stellar Population Synthesis Modeling. III. Model Calibration, Comparison, and Evaluation. Astrophys. J., 712:833–857, April 2010. URL http://adsabs.harvard.edu/abs/2010ApJ...712..833C.
R. G. Crittenden, P. Natarajan, U.-L. Pen, and T. Theuns. Spin-induced Galaxy Alignments and Their Implications for Weak-Lensing Measurements. Astrophys. J., 559:552–571, October 2001. doi: 10.1086/322370.
R. Davé, R. Thompson, and P. F. Hopkins. MUFASA: galaxy formation simulations with meshless hydrodynamics. MNRAS, 462:3265–3284, November 2016. doi: 10.1093/mnras/stw1862.
H. Desmond and R. H. Wechsler. The Faber-Jackson relation and Fundamental Plane from halo abundance matching. MNRAS, 465:820–833, February 2017. doi: 10.1093/mnras/stw2804.
S. Dodelson and M. D. Schneider. The effect of covariance estimator error on cosmological parameter constraints. Phys. Rev. D, 88(6):063537, September 2013. doi: 10.1103/PhysRevD.88.063537.
K. Dolag, E. Komatsu, and R. Sunyaev. SZ effects in the Magneticum Pathfinder simulation: comparison with the Planck, SPT, and ACT results. MNRAS, 463:1797–1811, December 2016. doi: 10.1093/mnras/stw2035.
T. Eifler, E. Krause, S. Dodelson, A. R. Zentner, A. P. Hearin, and N. Y. Gnedin. Accounting for baryonic effects in cosmic shear tomography: determining a minimal set of nuisance parameters using PCA. MNRAS, 454:2451–2471, December 2015. doi: 10.1093/mnras/stv2000.
S. M. Fall and G. Efstathiou. Formation and rotation of disc galaxies with haloes. MNRAS, 193:189–206, October 1980. doi: 10.1093/mnras/193.2.189.
A. Faltenbacher, C. Li, S. D. M. White, Y.-P. Jing, Shu-De Mao, and J. Wang. Alignment between galaxies and large-scale structure. Research in Astronomy and Astrophysics, 9:41–58, January 2009. doi: 10.1088/1674-4527/9/1/004.
N. Fanidakis, C. M. Baugh, A. J. Benson, R. G. Bower, S. Cole, C. Done, and C. S. Frenk. Grand unification of AGN activity in the ΛCDM cosmology. MNRAS, 410:53–74, January 2011. doi: 10.1111/j.1365-2966.2010.17427.x.
N. Fanidakis, C. M. Baugh, A. J. Benson, R. G. Bower, S. Cole, C. Done, C. S. Frenk, R. C. Hickox, C. Lacey, and C. Del P. Lagos. The evolution of active galactic nuclei across cosmic time: what is downsizing? MNRAS, 419:2797–2820, February 2012. doi: 10.1111/j.1365-2966.2011.19931.x.
Y. Feng, T. Di Matteo, R. A. Croft, S. Bird, N. Battaglia, and S. Wilkins. The BlueTides simulation: first galaxies and reionization. MNRAS, 455:2778–2791, January 2016. doi: 10.1093/mnras/stv2484.
G. J. Ferland, M. Chatzikos, F. Guzmán, M. L. Lykins, P. A. M. van Hoof, R. J. R. Williams, N. P. Abel, N. R. Badnell, F. P. Keenan, R. L. Porter, and P. C. Stancil. The 2017 Release of Cloudy. RMxAA, 53:385–438, October 2017.
O. Friedrich and T. Eifler. Precision matrix expansion - efficient use of numerical simulations in estimating errors on cosmological parameters. MNRAS, 473:4150–4163, January 2018. doi: 10.1093/mnras/stx2566.
O. Friedrich, S. Seitz, T. F. Eifler, and D. Gruen. Performance of internal covariance estimators for cosmic shear correlation functions. MNRAS, 456:2662–2680, March 2016. doi: 10.1093/mnras/stv2833.
S. Habib, A. Pope, H. Finkel, N. Frontiere, K. Heitmann, D. Daniel, P. Fasel, V. Morozov, G. Zagaris, T. Peterka, V. Vishwanath, Z. Lukić, S. Sehrish, and W.-k. Liao. HACC: Simulating sky surveys on state-of-the-art supercomputing architectures. New Astronomy, 42:49–65, January 2016. doi: 10.1016/j.newast.2015.06.003.
J. Hartlap, P. Simon, and P. Schneider. Why your model parameter confidences might be too optimistic. Unbiased estimation of the inverse covariance matrix. A&A, 464:399–404, March 2007. doi: 10.1051/0004-6361:20066170.
J. Hartlap, T. Schrabback, P. Simon, and P. Schneider. The non-Gaussianity of the cosmic shear likelihood or how odd is the Chandra Deep Field South? A&A, 504:689–703, September 2009. doi: 10.1051/0004-6361/200911697.
A. Heavens, A. Refregier, and C. Heymans. Intrinsic correlation of galaxy shapes: implications for weak lensing measurements. MNRAS, 319:649–656, December 2000. doi: 10.1046/j.1365-8711.2000.03907.x.
K. Heitmann, D. Higdon, C. Nakhleh, and S. Habib. Cosmic Calibration. Astrophys. J. Lett., 646:L1–L4, July 2006. doi: 10.1086/506448.
K. Heitmann, D. Higdon, M. White, S. Habib, B. J. Williams, E. Lawrence, and C. Wagner. The Coyote Universe. II. Cosmological Models and Precision Emulation of the Nonlinear Matter Power Spectrum. Astrophys. J., 705:156–174, November 2009. doi: 10.1088/0004-637X/705/1/156.
K. Heitmann, D. Bingham, E. Lawrence, S. Bergner, S. Habib, D. Higdon, A. Pope, R. Biswas, H. Finkel, N. Frontiere, and S. Bhattacharya. The Mira-Titan Universe: Precision Predictions for Dark Energy Surveys. Astrophys. J., 820:108, April 2016. doi: 10.3847/0004-637X/820/2/108.
Bruno M. B. Henriques, Peter A. Thomas, Seb Oliver, and Isaac Roseboom. Monte Carlo Markov Chain parameter estimation in semi-analytic models of galaxy formation. MNRAS, 396:535–547, June 2009. URL http://adsabs.harvard.edu/abs/2009MNRAS.396..535H.
C. Hikage, M. Oguri, T. Hamana, S. More, R. Mandelbaum, M. Takada, F. Köhlinger, H. Miyatake, A. J. Nishizawa, H. Aihara, R. Armstrong, J. Bosch, J. Coupon, A. Ducout, B.-C. Hsieh, Y. Komiyama, F. Lanusse, A. Leauthaud, E. Medezinski, S. Mineo, S. Miyazaki, R. Murata, H. Murayama, M. Shirasaki, C. Sifón, M. Simet, J. Speagle, D. N. Spergel, M. A. Strauss, N. Sugiyama, M. Tanaka, and S.-Y. Wang. Cosmology from cosmic shear power spectra with Subaru Hyper Suprime-Cam first-year data. ArXiv e-prints, September 2018.
H. Hildebrandt, M. Viola, C. Heymans, S. Joudaki, K. Kuijken, C. Blake, T. Erben, B. Joachimi, D. Klaes, L. Miller, C. B. Morrison, R. Nakajima, G. Verdoes Kleijn, A. Amon, A. Choi, G. Covone, J. T. A. de Jong, A. Dvornik, I. Fenech Conti, A. Grado, J. Harnois-Déraps, R. Herbonnet, H. Hoekstra, F. Köhlinger, J. McFarland, A. Mead, J. Merten, N. Napolitano, J. A. Peacock, M. Radovich, P. Schneider, P. Simon, E. A. Valentijn, J. L. van den Busch, E. van Uitert, and L. Van Waerbeke. KiDS-450: cosmological parameter constraints from tomographic weak gravitational lensing. MNRAS, 465:1454–1498, February 2017. doi: 10.1093/mnras/stw2805.
C. M. Hirata and U. Seljak. Intrinsic alignment-lensing interference as a contaminant of cosmic shear. Phys. Rev. D, 70(6):063526, September 2004. doi: 10.1103/PhysRevD.70.063526.
C. M. Hirata, R. Mandelbaum, M. Ishak, U. Seljak, R. Nichol, K. A. Pimbblet, N. P. Ross, and D. Wake. Intrinsic galaxy alignments from the 2SLAQ and SDSS surveys: luminosity and redshift scalings and implications for weak lensing surveys. MNRAS, 381:1197–1218, November 2007. doi: 10.1111/j.1365-2966.2007.12312.x.
P. F. Hopkins. A new class of accurate, mesh-free hydrodynamic simulation methods. MNRAS, 450:53–110, June 2015. doi: 10.1093/mnras/stv195.
P. F. Hopkins, D. Kereš, J. Oñorbe, C.-A. Faucher-Giguère, E. Quataert, N. Murray, and J. S. Bullock. Galaxies on FIRE (Feedback In Realistic Environments): stellar feedback explains cosmologically inefficient star formation. MNRAS, 445:581–603, November 2014. doi: 10.1093/mnras/stu1738.
H.-J. Huang, T. Eifler, R. Mandelbaum, and S. Dodelson. Modeling baryonic physics in future weak lensing surveys. ArXiv e-prints, September 2018.
M. Jarvis, G. Bernstein, and B. Jain. The skewness of the aperture mass statistic. MNRAS, 352:338–352, July 2004. doi: 10.1111/j.1365-2966.2004.07926.x.
F. Jiang, A. Dekel, O. Kneller, S. Lapiner, D. Ceverino, J. R. Primack, S. M. Faber, A. V. Macciò, A. Dutton, S. Genel, and R. S. Somerville. Is the dark-matter halo spin a predictor of galaxy spin and size? ArXiv e-prints, April 2018.
B. Joachimi. Non-linear shrinkage estimation of large-scale structure covariance. MNRAS, 466:L83–L87, March 2017. doi: 10.1093/mnrasl/slw240.
B. Joachimi, R. Mandelbaum, F. B. Abdalla, and S. L. Bridle. Constraints on intrinsic alignment contamination of weak lensing surveys using the MegaZ-LRG sample. A&A, 527:A26, March 2011. doi: 10.1051/0004-6361/201015621.
B. Joachimi, E. Semboloni, S. Hilbert, P. E. Bett, J. Hartlap, H. Hoekstra, and P. Schneider. Intrinsic galaxy shapes and alignments - II. Modelling the intrinsic alignment contamination of weak lensing surveys. MNRAS, 436:819–838, November 2013. doi: 10.1093/mnras/stt1618.
B. Joachimi, M. Cacciato, T. D. Kitching, A. Leonard, R. Mandelbaum, B. M. Schäfer, C. Sifón, H. Hoekstra, A. Kiessling, D. Kirk, and A. Rassat. Galaxy Alignments: An Overview. Space Sci. Rev., 193:1–65, November 2015. doi: 10.1007/s11214-015-0177-4.
P. Jonsson. SUNRISE: polychromatic dust radiative transfer in arbitrary geometries. MNRAS, 372:2–20, October 2006. doi: 10.1111/j.1365-2966.2006.10884.x.
G. M. Kaufman. Some Bayesian moment formulae. Report No. 6710, Center for Operations Research and Econometrics, Catholic University of Louvain, Heverlee, Belgium, 1967.
S. Kaviraj, C. Laigle, T. Kimm, J. E. G. Devriendt, Y. Dubois, C. Pichon, A. Slyz, E. Chisari, and S. Peirani. The Horizon-AGN simulation: evolution of galaxy properties over cosmic time. MNRAS, 467:4739–4752, June 2017. doi: 10.1093/mnras/stx126.
N. Khandai, T. Di Matteo, R. Croft, S. Wilkins, Y. Feng, E. Tucker, C. DeGraf, and M.-S. Liu. The MassiveBlack-II simulation: the evolution of haloes and galaxies to z = 0. MNRAS, 450:1349–1374, June 2015. doi: 10.1093/mnras/stv627.
A. Kiessling, M. Cacciato, B. Joachimi, D. Kirk, T. D. Kitching, A. Leonard, R. Mandelbaum, B. M. Schäfer, C. Sifón, M. L. Brown, and A. Rassat. Galaxy Alignments: Theory, Modelling & Simulations. Space Sci. Rev., 193:67–136, November 2015. doi: 10.1007/s11214-015-0203-6.
A. Knebe, S. R. Knollmann, S. I. Muldrew, F. R. Pearce, M. A. Aragon-Calvo, Y. Ascasibar, P. S. Behroozi, D. Ceverino, S. Colombi, J. Diemand, K. Dolag, B. L. Falck, P. Fasel, J. Gardner, S. Gottlöber, C.-H. Hsu, F. Iannuzzi, A. Klypin, Z. Lukić, M. Maciejewski, C. McBride, M. C. Neyrinck, S. Planelles, D. Potter, V. Quilis, Y. Rasera, J. I. Read, P. M. Ricker, F. Roy, V. Springel, J. Stadel, G. Stinson, P. M. Sutter, V. Turchaninov, D. Tweed, G. Yepes, and M. Zemp. Haloes gone MAD: The Halo-Finder Comparison Project. MNRAS, 415:2293–2318, August 2011. doi: 10.1111/j.1365-2966.2011.18858.x.
A. Knebe, D. Stoppacher, F. Prada, C. Behrens, A. Benson, S. A. Cora, D. J. Croton, N. D. Padilla, A. N. Ruiz, M. Sinha, A. R. H. Stevens, C. A. Vega-Martínez, P. Behroozi, V. Gonzalez-Perez, S. Gottlöber, A. A. Klypin, G. Yepes, H. Enke, N. I. Libeskind, K. Riebe, and M. Steinmetz. MULTIDARK-GALAXIES: data release and first results. MNRAS, 474:5206–5231, March 2018. doi: 10.1093/mnras/stx2662.
E. Komatsu, K. M. Smith, J. Dunkley, C. L. Bennett, B. Gold, G. Hinshaw, N. Jarosik, D. Larson, M. R. Nolta, L. Page, D. N. Spergel, M. Halpern, R. S. Hill, A. Kogut, M. Limon, S. S. Meyer, N. Odegard, G. S. Tucker, J. L. Weiland, E. Wollack, and E. L. Wright. Seven-year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Cosmological Interpretation. Astrophys. J. Supp., 192:18, February 2011. doi: 10.1088/0067-0049/192/2/18.
E. Krause and T. Eifler. cosmolike - cosmological likelihood analyses for photometric galaxy surveys. MNRAS, 470:2100–2112, September 2017. doi: 10.1093/mnras/stx1261.
E. Krause, T. Eifler, and J. Blazek. The impact of intrinsic alignment on current and future cosmic shear surveys. MNRAS, 456:207–222, February 2016. doi: 10.1093/mnras/stv2615.
E. Krause, T. F. Eifler, J. Zuntz, O. Friedrich, M. A. Troxel, S. Dodelson, J. Blazek, L. F. Secco, N. MacCrann, E. Baxter, C. Chang, N. Chen, M. Crocce, J. DeRose, A. Ferte, N. Kokron, F. Lacasa, V. Miranda, Y. Omori, A. Porredon, R. Rosenfeld, S. Samuroff, M. Wang, R. H. Wechsler, T. M. C. Abbott, F. B. Abdalla, S. Allam, J. Annis, K. Bechtol, A. Benoit-Levy, G. M. Bernstein, D. Brooks, D. L. Burke, D. Capozzi, M. Carrasco Kind, J. Carretero, C. B. D'Andrea, L. N. da Costa, C. Davis, D. L. DePoy, S. Desai, H. T. Diehl, J. P. Dietrich, A. E. Evrard, B. Flaugher, P. Fosalba, J. Frieman, J. Garcia-Bellido, E. Gaztanaga, T. Giannantonio, D. Gruen, R. A. Gruendl, J. Gschwend, G. Gutierrez, K. Honscheid, D. J. James, T. Jeltema, K. Kuehn, S. Kuhlmann, O. Lahav, M. Lima, M. A. G. Maia, M. March, J. L. Marshall, P. Martini, F. Menanteau, R. Miquel, R. C. Nichol, A. A. Plazas, A. K. Romer, E. S. Rykoff, E. Sanchez, V. Scarpine, R. Schindler, M. Schubnell, I. Sevilla-Noarbe, M. Smith, M. Soares-Santos, F. Sobreira, E. Suchyta, M. E. C. Swanson, G. Tarle, D. L. Tucker, V. Vikram, A. R. Walker, and J. Weller. Dark Energy Survey Year 1 Results: Multi-Probe Methodology and Simulated Likelihood Analyses. ArXiv e-prints, June 2017.
J. Kwan, K. Heitmann, S. Habib, N. Padmanabhan, E. Lawrence, H. Finkel, N. Frontiere, and A. Pope. Cosmic Emulation: Fast Predictions for the Galaxy Power Spectrum. Astrophys. J., 810:35, September 2015. doi: 10.1088/0004-637X/810/1/35.
C. G. Lacey, C. M. Baugh, C. S. Frenk, A. J. Benson, R. G. Bower, S. Cole, V. Gonzalez-Perez, J. C. Helly, C. D. P. Lagos, and P. D. Mitchell. A unified multiwavelength model of galaxy formation. MNRAS, 462:3854–3911, November 2016. doi: 10.1093/mnras/stw1888.
E. Lawrence, K. Heitmann, J. Kwan, A. Upadhye, D. Bingham, S. Habib, D. Higdon, A. Pope, H. Finkel, and N. Frontiere. The Mira-Titan Universe. II. Matter Power Spectrum Emulation. Astrophys. J., 847:50, September 2017. doi: 10.3847/1538-4357/aa86a9.
J. Lee and U.-L. Pen. The Nonlinear Evolution of Galaxy Intrinsic Alignments. Astrophys. J., 681:798–805, July 2008. doi: 10.1086/588646.
J. Lee, S. K. Yi, P. J. Elahi, P. A. Thomas, F. R. Pearce, P. Behroozi, J. Han, J. Helly, I. Jung, A. Knebe, Y.-Y. Mao, J. Onions, V. Rodriguez-Gomez, A. Schneider, C. Srisawat, and D. Tweed. Sussing merger trees: the impact of halo merger trees on galaxy properties in a semi-analytic model. MNRAS, 445:4197–4210, December 2014. doi: 10.1093/mnras/stu2039.
Yu Lu, H. J. Mo, Martin D. Weinberg, and Neal Katz. A Bayesian approach to the semi-analytic model of galaxy formation: methodology. MNRAS, 416:1949–1964, September 2011. URL http://adsabs.harvard.edu/abs/2011MNRAS.416.1949L.
Yu Lu, H. J. Mo, Neal Katz, and Martin D. Weinberg. Bayesian inference of galaxy formation from the K-band luminosity function of galaxies: tensions between theory and observation. MNRAS, 421:1779–1796, April 2012. URL http://adsabs.harvard.edu/abs/2012MNRAS.421.1779L.
R. Mandelbaum, C. Blake, S. Bridle, F. B. Abdalla, S. Brough, M. Colless, W. Couch, S. Croom, T. Davis, M. J. Drinkwater, K. Forster, K. Glazebrook, et al. The WiggleZ Dark Energy Survey: direct constraints on blue galaxy intrinsic alignments at intermediate redshifts. MNRAS, 410:844–859, January 2011. doi: 10.1111/j.1365-2966.2010.17485.x.
Y.-Y. Mao, E. Kovacs, K. Heitmann, T. D. Uram, A. J. Benson, D. Campbell, S. A. Cora, J. DeRose, T. Di Matteo, S. Habib, A. P. Hearin, J. Bryce Kalmbach, K. S. Krughoff, F. Lanusse, Z. Lukić, R. Mandelbaum, J. A. Newman, N. Padilla, E. Paillas, A. Pope, P. M. Ricker, A. N. Ruiz, A. Tenneti, C. A. Vega-Martínez, R. H. Wechsler, R. Zhou, Y. Zu, and LSST Dark Energy Science Collaboration. DESCQA: An Automated Validation Framework for Synthetic Sky Catalogs. Astrophys. J. Supp., 234:36, February 2018. doi: 10.3847/1538-4365/aaa6c3.
I. G. McCarthy, J. Schaye, S. Bird, and A. M. C. Le Brun. The BAHAMAS project: calibrated hydrodynamical simulations for large-scale structure cosmology. MNRAS, 465:2936–2965, March 2017. doi: 10.1093/mnras/stw2792.
T. McClintock, E. Rozo, M. R. Becker, J. DeRose, Y.-Y. Mao, S. McLaughlin, J. L. Tinker, R. H. Wechsler, and Z. Zhai. The Aemulus Project II: Emulating the Halo Mass Function. ArXiv e-prints, April 2018.
A. Merson, Y. Wang, A. Benson, A. Faisst, D. Masters, A. Kiessling, and J. Rhodes. Predicting Hα emission-line galaxy counts for future galaxy redshift surveys. MNRAS, 474:177–196, February 2018. doi: 10.1093/mnras/stx2649.
A. I. Merson, C. M. Baugh, J. C. Helly, V. Gonzalez-Perez, S. Cole, R. Bielby, P. Norberg, C. S. Frenk, A. J. Benson, R. G. Bower, C. G. Lacey, and C. d. P. Lagos. Lightcone mock catalogues from semi-analytic models of galaxy formation - I. Construction and application to the BzK colour selection. MNRAS, 429:556–578, February 2013. doi: 10.1093/mnras/sts355.
H. J. Mo, Shude Mao, and Simon D. M. White. The formation of galactic discs. MNRAS, 295:319–336, April 1998. doi: 10.1046/j.1365-8711.1998.01227.x. URL http://adsabs.harvard.edu/abs/1998MNRAS.295..319M.
J. Onions, A. Knebe, F. R. Pearce, S. I. Muldrew, H. Lux, S. R. Knollmann, Y. Ascasibar, P. Behroozi, P. Elahi, J. Han, M. Maciejewski, M. E. Merchán, M. Neyrinck, A. N. Ruiz, M. A. Sgró, V. Springel, and D. Tweed. Subhaloes going Notts: the subhalo-finder comparison project. MNRAS, 423:1200–1214, June 2012. doi: 10.1111/j.1365-2966.2012.20947.x.
J. Onions, Y. Ascasibar, P. Behroozi, J. Casado, P. Elahi, J. Han, A. Knebe, H. Lux, M. E. Merchán, S. I. Muldrew, M. Neyrinck, L. Old, F. R. Pearce, D. Potter, A. N. Ruiz, M. A. Sgró, D. Tweed, and T. Yue. Subhaloes gone Notts: spin across subhaloes and finders. MNRAS, 429:2739–2747, March 2013. doi: 10.1093/mnras/sts549.
A. Orsi, C. G. Lacey, C. M. Baugh, and L. Infante. The clustering of Lyα emitters in a ΛCDM Universe. MNRAS, 391:1589–1604, December 2008. doi: 10.1111/j.1365-2966.2008.14010.x.
A. Orsi, C. M. Baugh, C. G. Lacey, A. Cimatti, Y. Wang, and G. Zamorani. Probing dark energy with future redshift surveys: a comparison of emission line and broad-band selection in the near-infrared. MNRAS, 405:1006–1024, June 2010. doi: 10.1111/j.1365-2966.2010.16585.x.
Á. Orsi, N. Padilla, B. Groves, S. Cora, T. Tecce, I. Gargiulo, and A. Ruiz. The nebular emission of star-forming galaxies in a hierarchical universe. MNRAS, 443:799–814, September 2014. doi: 10.1093/mnras/stu1203.
A. C. Pope and I. Szapudi. Shrinkage estimation of the power spectrum covariance matrix. MNRAS, 389:766–774, September 2008. doi: 10.1111/j.1365-2966.2008.13561.x.
D. Potter, J. Stadel, and R. Teyssier. PKDGRAV3: beyond trillion particle cosmological simulations for the next era of galaxy surveys. Computational Astrophysics and Cosmology, 4:2, May 2017. doi: 10.1186/s40668-017-0021-1.
A. Pujol, E. Gaztañaga, C. Giocoli, A. Knebe, F. R. Pearce, R. A. Skibba, Y. Ascasibar, P. Behroozi, P. Elahi, J. Han, H. Lux, S. I. Muldrew, M. Neyrinck, J. Onions, D. Potter, and D. Tweed. Subhaloes gone Notts: the clustering properties of subhaloes. MNRAS, 438:3205–3221, March 2014. doi: 10.1093/mnras/stt2446.
S. Ravanbakhsh, J. Oliva, S. Fromenteau, L. C. Price, S. Ho, J. Schneider, and B. Poczos. Estimating Cosmological Parameters from the Dark Matter Distribution. ArXiv e-prints, November 2017.
A. J. Ross and R. J. Brunner. Halo-model analysis of the clustering of photometrically selected galaxies from SDSS. MNRAS, 399:878–887, October 2009. doi: 10.1111/j.1365-2966.2009.15318.x.
J. Schaye, C. Dalla Vecchia, C. M. Booth, R. P. C. Wiersma, T. Theuns, M. R. Haas, S. Bertone, A. R. Duffy, I. G. McCarthy, and F. van de Voort. The physics driving the cosmic star formation history. MNRAS, 402:1536–1560, March 2010. doi: 10.1111/j.1365-2966.2009.16029.x.
J. Schaye, R. A. Crain, R. G. Bower, M. Furlong, M. Schaller, T. Theuns, C. Dalla Vecchia, C. S. Frenk, I. G. McCarthy, J. C. Helly, A. Jenkins, Y. M. Rosas-Guevara, S. D. M. White, M. Baes, C. M. Booth, P. Camps, J. F. Navarro, Y. Qu, A. Rahmati, T. Sawala, P. A. Thomas, and J. Trayford. The EAGLE project: simulating the evolution and assembly of galaxies and their environments. MNRAS, 446:521–554, January 2015. doi: 10.1093/mnras/stu2058.
M. D. Schneider and S. Bridle. A halo model for intrinsic alignments of galaxy ellipticities. MNRAS, 402:2127–2139, March 2010. doi: 10.1111/j.1365-2966.2009.15956.x.
M. D. Schneider, C. S. Frenk, and S. Cole. The shapes and alignments of dark matter halos. JCAP, 5:030, May 2012. doi: 10.1088/1475-7516/2012/05/030.
E. Sellentin, C. Heymans, and J. Harnois-Déraps. The skewed weak lensing likelihood: why biases arise, despite data and theory being sound. MNRAS, 477:4879–4895, July 2018. doi: 10.1093/mnras/sty988.
E. Semboloni, H. Hoekstra, and J. Schaye. Effect of baryonic feedback on two- and three-point shear statistics: prospects for detection and improved modelling. MNRAS, 434:148–162, September 2013. doi: 10.1093/mnras/stt1013.
S. Singh, R. Mandelbaum, and S. More. Intrinsic alignments of SDSS-III BOSS LOWZ sample galaxies. MNRAS, 450:2195–2216, June 2015. doi: 10.1093/mnras/stv778.
R. A. Skibba, S. P. Bamford, R. C. Nichol, C. J. Lintott, D. Andreescu, E. M. Edmondson, P. Murray, M. J. Raddick, K. Schawinski, A. Slosar, A. S. Szalay, D. Thomas, and J. Vandenberg. Galaxy Zoo: disentangling the environmental dependence of morphology and colour. MNRAS, 399:966–982, October 2009. doi: 10.1111/j.1365-2966.2009.15334.x.
S. W. Skillman, M. S. Warren, M. J. Turk, R. H. Wechsler, D. E. Holz, and P. M. Sutter. Dark Sky Simulations: Early Data Release. ArXiv e-prints, July 2014.
R. E. Smith, J. A. Peacock, A. Jenkins, S. D. M. White, C. S. Frenk, F. R. Pearce, P. A. Thomas, G. Efstathiou, and H. M. P. Couchman. Stable clustering, the halo model and non-linear cosmological power spectra. MNRAS, 341:1311–1332, June 2003. doi: 10.1046/j.1365-8711.2003.06503.x.
Rachel S. Somerville and Romeel Davé. Physical Models of Galaxy Formation in a Cosmological Framework. Annual Review of Astronomy and Astrophysics, 53:51–113, August 2015. doi: 10.1146/annurev-astro-082812-140951.
A. Spacek, E. Scannapieco, S. Cohen, B. Joshi, and P. Mauskopf. Constraining AGN Feedback in Massive Ellipticals with South Pole Telescope Measurements of the Thermal Sunyaev-Zel'dovich Effect. Astrophys. J., 819:128, March 2016. doi: 10.3847/0004-637X/819/2/128.
V. Springel. The cosmological simulation code GADGET-2. MNRAS, 364:1105–1134, December 2005. doi: 10.1111/j.1365-2966.2005.09655.x.
V. Springel. E pur si muove: Galilean-invariant cosmological hydrodynamical simulations on a moving mesh. MNRAS, 401:791–851, January 2010. doi: 10.1111/j.1365-2966.2009.15715.x.
V. Springel, S. D. M. White, A. Jenkins, C. S. Frenk, N. Yoshida, L. Gao, J. Navarro, R. Thacker, D. Croton, J. Helly, J. A. Peacock, S. Cole, P. Thomas, H. Couchman, A. Evrard, J. Colberg, and F. Pearce. Simulations of the formation, evolution and clustering of galaxies and quasars. Nature, 435:629–636, June 2005. doi: 10.1038/nature03597.
V. Springel, R. Pakmor, A. Pillepich, R. Weinberger, D. Nelson, L. Hernquist, M. Vogelsberger, S. Genel, P. Torrey, F. Marinacci, and J. Naiman. First results from the IllustrisTNG simulations: matter and galaxy clustering. MNRAS, 475:676–698, March 2018. doi: 10.1093/mnras/stx3304.
C. Srisawat, A. Knebe, F. R. Pearce, A. Schneider, P. A. Thomas, P. Behroozi, K. Dolag, P. J. Elahi, J. Han, J. Helly, Y. Jing, I. Jung, J. Lee, Y.-Y. Mao, J. Onions, V. Rodriguez-Gomez, D. Tweed, and S. K. Yi. Sussing Merger Trees: The Merger Trees Comparison Project. MNRAS, 436:150–162, November 2013. doi: 10.1093/mnras/stt1545.
Adam R. H. Stevens, Darren J. Croton, and Simon J. Mutch. Building disc structure and galaxy properties through angular momentum: the DARK SAGE semi-analytic model. MNRAS, 461:859–876, September 2016. doi: 10.1093/mnras/stw1332. URL http://adsabs.harvard.edu/abs/2016MNRAS.461..859S.
R. Takahashi, M. Sato, T. Nishimichi, A. Taruya, and M. Oguri. Revising the Halofit Model for the Nonlinear Matter Power Spectrum. Astrophys. J., 761:152, December 2012. doi: 10.1088/0004-637X/761/2/152.
A. Taylor and B. Joachimi. Estimating cosmological parameter covariance. MNRAS, 442:2728–2738, August 2014. doi: 10.1093/mnras/stu996.
A. Taylor, B. Joachimi, and T. Kitching. Putting the precision in precision cosmology: How accurate should your data covariance matrix be? MNRAS, 432:1928–1946, July 2013. doi: 10.1093/mnras/stt270.
A. Tenneti, R. Mandelbaum, T. Di Matteo, Y. Feng, and N. Khandai. Galaxy shapes and intrinsic alignments in the MassiveBlack-II simulation. MNRAS, 441:470–485, June 2014. doi: 10.1093/mnras/stu586.
A. Tenneti, S. Singh, R. Mandelbaum, T. di Matteo, Y. Feng, and N. Khandai. Intrinsic alignments of galaxies in the MassiveBlack-II simulation: analysis of two-point statistics. MNRAS, 448:3522–3544, April 2015. doi: 10.1093/mnras/stv272.
R. Teyssier. Cosmological hydrodynamics with adaptive mesh refinement. A new high resolution code called RAMSES. A&A, 385:337–364, April 2002. doi: 10.1051/0004-6361:20011817.
H. J. Tian, M. C. Neyrinck, T. Budavári, and A. S. Szalay. Redshift-space Enhancement of Line-of-sight Baryon Acoustic Oscillations in the Sloan Digital Sky Survey Main-galaxy Sample. Astrophys. J., 728:34, February 2011. doi: 10.1088/0004-637X/728/1/34.
Michele Trenti, Britton D. Smith, Eric J. Hallman, Samuel W. Skillman, and J. Michael Shull. How Well do Cosmological Simulations Reproduce Individual Halo Properties? Astrophys. J., 711:1198–1207, March 2010. doi: 10.1088/0004-637X/711/2/1198. URL http://adsabs.harvard.edu/abs/2010ApJ...711.1198T.
M. A. Troxel and M. Ishak. The intrinsic alignment of galaxies and its impact on weak gravitational lensing in an era of precision cosmology. Phys. Rep., 558:1–59, February 2015. doi: 10.1016/j.physrep.2014.11.001.
M. Vogelsberger, S. Genel, D. Sijacki, P. Torrey, V. Springel, and L. Hernquist. A model for cosmological simulations of galaxy formation physics. MNRAS, 436:3031–3067, December 2013. doi: 10.1093/mnras/stt1789.
M. Vogelsberger, S. Genel, V. Springel, P. Torrey, D. Sijacki, D. Xu, G. Snyder, S. Bird, D. Nelson, and L. Hernquist. Properties of galaxies reproduced by a hydrodynamic simulation. Nature, 509:177–182, May 2014. doi: 10.1038/nature13316.
Y. Wang, F. R. Pearce, A. Knebe, A. Schneider, C. Srisawat, D. Tweed, I. Jung, J. Han, J. Helly, J. Onions, P. J. Elahi, P. A. Thomas, P. Behroozi, S. K. Yi, V. Rodriguez-Gomez, Y.-Y. Mao, Y. Jing, and W. Lin. Sussing merger trees: stability and convergence. MNRAS, 459:1554–1568, June 2016. doi: 10.1093/mnras/stw726.
M. S. Warren. 2HOT: An Improved Parallel Hashed Oct-Tree N-Body Algorithm for Cosmological Simulation. ArXiv e-prints, October 2013.
R. H. Wechsler and J. L. Tinker. The Connection between Galaxies and their Dark Matter Halos. arXiv:1804.03097, April 2018.
B. D. Wibking, A. N. Salcedo, D. H. Weinberg, L. H. Garrison, D. Ferrer, J. Tinker, D. Eisenstein, M. Metchnik, and P. Pinto. Emulating galaxy clustering and galaxy-galaxy lensing into the deeply nonlinear regime: methodology, information, and forecasts. ArXiv e-prints, September 2017.
P. Wilking and P. Schneider. A quasi-Gaussian approximation for the probability distribution of correlation functions. A&A, 556:A70, August 2013. doi: 10.1051/0004-6361/201321718.
H. S. Xavier, F. B. Abdalla, and B. Joachimi. Improving lognormal models for cosmological fields. MNRAS, 459:3693–3710, July 2016. doi: 10.1093/mnras/stw874.
A. R. Zentner, E. Semboloni, S. Dodelson, T. Eifler, E. Krause, and A. P. Hearin. Accounting for baryons in cosmological constraints from cosmic shear. Phys. Rev. D, 87(4):043509, February 2013. doi: 10.1103/PhysRevD.87.043509.
Z. Zhai, J. L. Tinker, M. R. Becker, J. DeRose, Y.-Y. Mao, T. McClintock, S. McLaughlin, E. Rozo, and R. H. Wechsler. The Aemulus Project III: Emulation of the Galaxy Correlation Function. ArXiv e-prints, April 2018.