Multi-wavelength mock galaxy catalogs of the low-redshift Universe
MMNRAS , 1–17 (0000) Preprint 11 January 2021 Compiled using MNRAS L A TEX style file v3.0
Multi-wavelength mock galaxy catalogs of the low-redshiftUniverse
Aseem Paranjape (cid:63) , Tirthankar Roy Choudhury † & Ravi K. Sheth , ‡ Inter-University Centre for Astronomy & Astrophysics, Ganeshkhind, Post Bag 4, Pune 411007, India National Centre for Radio Astrophysics, TIFR, Post Bag 3, Ganeshkhind, Pune 411007, India Center for Particle Cosmology, University of Pennsylvania, 209 S. 33rd St., Philadelphia, PA 19104, USA The Abdus Salam International Center for Theoretical Physics, Strada Costiera, 11, Trieste 34151, Italy draft
ABSTRACT
We present a new suite of mock galaxy catalogs mimicking the low-redshift Universe,based on an updated halo occupation distribution (HOD) model and a scaling relationbetween optical properties and the neutral hydrogen ( Hi ) content of galaxies. Ouralgorithm is constrained by observations of the luminosity function and luminosity-and colour-dependent clustering of SDSS galaxies, as well as the Hi mass function and Hi -dependent clustering of massive Hi -selected galaxies in the ALFALFA survey. Mockcentral and satellite galaxies with realistic values of r -band luminosity, g − r and u − r colour, stellar mass and Hi mass are populated in an N -body simulation, inheritinga number of properties of the density and tidal environment of their host halos. Thehost halo of each central galaxy is also ‘baryonified’ with realistic spatial distributionsof stars as well as hot and cold gas, along with the corresponding rotation curve. Ourdefault HOD assumes that galaxy properties are a function of group halo mass alone,and can optionally include effects such as galactic conformity and colour-dependentgalaxy assembly bias. The mocks predict the relation between the stellar mass and Hi mass of massive Hi galaxies, as well as the 2-point cross-correlation function ofspatially co-located optical and Hi -selected samples. These mocks form the basis ofa number of applications we will pursue in other work, such as novel null tests forgalaxy assembly bias, predictions for the Hi velocity width function and testing theuniversality of the radial acceleration relation. Key words: galaxies: formation - cosmology: theory, dark matter, large-scale structureof Universe - methods: numerical
Contemporary studies of galaxy evolution and cosmologymust explore a multitude of physical and statistical propertiesof the observed large-scale structure of the Universe. Thecomputational and technical challenges involved in any suchanalysis, whether observational or theoretical, mean thatmock galaxy catalogs are now a staple tool of cosmologicalanalyses.Current and upcoming large-volume surveys of the Uni-verse, aiming to extract cosmological information using ob-servables including the redshift space clustering of galaxiesat large and small scales, weak lensing, the abundances ofclusters and voids, etc., increasingly rely on the use of mockgalaxy catalogs for a variety of applications. Apart fromcalibrating expected measurement covariances and end-to- (cid:63)
E-mail: [email protected] † E-mail: [email protected] ‡ E-mail: [email protected] end pipeline testing (Mao et al. 2018), such catalogs alsoserve as excellent test-beds for exploring ideas related to thegalaxy-dark matter connection, the nature of dark matter orthe predictions of alternative gravity theories, and the effectsof dark energy. As such, it is critical to develop and calibratemock-making algorithms, constrained by existing observa-tions, that can accurately account for the multi-scale, multi-probe connection between baryonic matter in and aroundgalaxies and the dark cosmic web in which these galaxiesreside. This is the primary motivation behind the presentwork.Computationally speaking, the most efficient algorithmsare those which model the baryon-dark matter connection us-ing empirical, statistical tools that are motivated by the HaloModel (see Cooray & Sheth 2002, for a review). These includethe halo occupation distribution (HOD; Zehavi et al. 2011),conditional luminosity function (CLF; Yang et al. 2018, andreferences therein) or subhalo abundance matching (SHAM;Behroozi et al. 2019, and references therein) prescriptions © a r X i v : . [ a s t r o - ph . GA ] J a n Paranjape et al. that are constrained by observed galaxy abundances andclustering. At the other end of the spectrum, lie full-fledgedcosmological hydrodynamical simulations of galaxy forma-tion (e.g., Vogelsberger et al. 2014; Dubois et al. 2014; Schayeet al. 2015; Springel et al. 2018), arguably the most realisticand most expensive tool in computational cosmology. Semi-analytical models (SAMs), which evolve simplified physicaldescriptions of galaxy formation and evolution within thecosmic web of gravity-only simulations, lie somewhere be-tween full simulations and empirical models, in terms of bothcomputational complexity as well as fidelity to observationalconstraints (for a recent review, see Somerville & Dav´e 2015).The present work focuses on HOD models.Mock-making algorithms based on the HOD or SHAMframeworks have been frequently used in the literature inconjunction with large-volume surveys at low and interme-diate redshifts (Manera et al. 2013; de la Torre & Peacock2013; de la Torre et al. 2013; Kitaura et al. 2016; Mao et al.2018; Alam et al. 2020; Zhao et al. 2020; Sugiyama et al.2020). These algorithms typically segregate into those de-scribing stellar populations and overall star formation ac-tivity (constrained by a multitude of galaxy surveys at low-and high-redshift spanning wavelengths from the infraredto optical to ultraviolet), and others focused on the distri-bution of gas, primarily in the form of neutral hydrogen( Hi , constrained by radio wavelength observations, typicallyat low-redshift). We broadly refer to the former categoryusing the label ‘optical’ and the latter as ‘ Hi ’ algorithms.All such algorithms typically rely on dark halos identified ingravity-only cosmological simulations, along with statisticalprescriptions to paint galaxies into these halos.Algorithms for assigning single band optical luminosities(or stellar masses) to mock galaxies using HOD, CLF orSHAM prescriptions have existed for about two decades(Cooray & Sheth 2002; Vale & Ostriker 2004; Reddick et al.2013), with relatively recent extensions to include colours(or star formation rates) (Skibba & Sheth 2009; Hearin &Watson 2013; Contreras et al. 2020). While the simplestoccupation models single out halo mass as the primary driverof observed correlations between galaxy properties and theirenvironments (e.g., Abbas & Sheth 2007; Zu & Mandelbaum2015; Paranjape et al. 2018b; Alam et al. 2019), recent workhas argued for the importance of modelling beyond-masseffects such as assembly bias and galactic conformity (Zentneret al. 2014), leading to tune-able prescriptions for theseeffects (Masaki et al. 2013; Paranjape et al. 2015; Hearinet al. 2016a; Yuan et al. 2018; Xu et al. 2020; Contreraset al. 2020). State-of-the-art implementations such as the UniverseMachine prescription of Behroozi et al. (2019)employ SHAM on entire merger trees in high-resolutiongravity-only simulations, calibrated to reproduce stellar massfunctions and star formation rates over a wide range ofredshifts.On the Hi side, mock catalogs have been created using acombination of galaxy formation SAMs and a prescription fordistributing the Hi in disks (see, e.g., Obreschkow et al. 2009),which have been useful for planning upcoming surveys withtelescopes such as the Square Kilometre Array (SKA). If oneis only interested in the very large-scale correlations of the Hi distribution (e.g., for intensity mapping experiments), orin interpreting Hi detections based on stacking experimentsat high redshift, the algorithms for creating the catalogs areconsiderably simpler, assigning an Hi mass directly to darkhalos using a physically motivated prescription (Bagla et al. 2010; Guha Sarkar et al. 2012; Villaescusa-Navarro et al.2014; Castorina & Villaescusa-Navarro 2017; Padmanabhanet al. 2017).In this paper, we aim to combine low-redshift ( z (cid:46) . Hi -selected galaxies,to generate mock galaxies that are simultaneously assignedmulti-band optical information as well as Hi masses. To thisend, we consolidate recent work in these areas and intro-duce mock galaxy catalogs constructed using updated halooccupation models and calibration of multi-wavelength low-redshift galaxy optical and Hi properties. Each mock galaxyin our catalogs is assigned values of the r -band absolute mag-nitude M r (detailed definition below), colour indices g − r and u − r , Hi mass m Hi and stellar mass m ∗ , along witha range of environmental properties derived from the darkmatter environment of the galaxy’s host halo. Relying onthese properties, the mocks reproduce the luminosity func-tion and the luminosity and colour dependence of projected2-point clustering of SDSS galaxies, the Hi mass functionand m Hi -dependent clustering of massive ALFALFA galaxies,and predict the cross-correlations between these galaxies.The mocks also have tunable implementations of galacticconformity and colour-dependent galaxy assembly bias.Additionally, the host halo of each central galaxy inour mocks is ‘baryonified’, i.e., assigned a realistic spatialdistribution of stars and gas and, consequently, a realisticrotation curve for the galactic disk. This is a novel featureof our mocks which, as we discuss later, potentially allowsus to explore a number of interesting questions that havenot been adequately addressed in the theoretical literature.Among others, these include modelling the observed 21cmline profiles of Hi -selected galaxies, along with the associatedvelocity width distribution, and the nature and universalityof the radial acceleration relation in the baryons+cold darkmatter (CDM) paradigm.The rest of the paper is organised as follows. We de-scribe the ingredients of our mock algorithm in section 2,followed by a detailed description of the algorithm itself insection 3. In section 4, we demonstrate the performance ofour mocks in reproducing a number of 1-point and 2-pointstatistical observables. In section 5, we discuss observableswhose behaviour is predicted by our mocks, along with possi-ble extensions of our technique that are interesting for futureanalyses. We conclude in section 6 with a brief discussion ofpotential applications of our mocks.Throughout, we consider a flat ΛCDM cosmology withparameters { Ω m , Ω b , h, n s , σ } given by { } compatible with the 7-year results of the Wilkin-son Microwave Anisotropy Probe experiment (WMAP7, Ko-matsu et al. 2011), with a linear theory transfer functiongenerated by the code camb (Lewis et al. 2000). Our con-vention will be to quote halo masses ( m ) in h − M (cid:12) andgalaxy stellar masses ( m ∗ ) and Hi masses ( m Hi ) in h − M (cid:12) units. The notation m for halo mass will refer to m , themass enclosed in the radius R where the enclosed densityfalls to 200 times the background density. Similarly m vir willrefer to m , the mass enclosed in the radius R wherethe enclosed density falls to 200 times the critical density. http://camb.info MNRAS , 1–17 (0000) olourful mocks We start by describing the ingredients used in constructingour mocks. These include the gravity-only simulations thatwe populate with galaxies, the observed galaxy sample whoseoptical properties form the basis of the halo occupationdistribution (HOD) we use to assign luminosities, coloursand stellar masses to the mock galaxies, and the scalingrelation between optical properties and neutral hydrogen( Hi ) using which we assign Hi masses. The N -body simulations we rely on are listed in Table 1 ofParanjape & Alam (2020), of which we focus on the WMAP7configurations. Specifically, we have 2, 10 and 3 realisationseach of the L150 N1024, L300 N1024 and L600 N1024 boxes,respectively, corresponding to particle masses m p = 2 . × , . × , . × h − M (cid:12) , respectively. The notationL150 N1024, for example, indicates a cubic, periodic box oflength L box = 150 h − Mpc containing 1024 particles. Thesimulations were performed using the tree-PM code gadget-2 (Springel 2005) with a PM grid of a factor 2 finer than theinitial particle count along each axis, and a comoving forcesoftening length of 1 /
30 of the mean interparticle spacing.Initial conditions were generated using 2 nd order Lagrangianperturbation theory (Scoccimarro 1998) with the code music (Hahn & Abel 2011). Halos were identified using the code rockstar (Behroozi et al. 2013a) which performs a Friends-of-Friends (FoF) algorithm in 6-dimensional phase space. Wediscard all sub-halos and further only consider objects whose‘virial’ energy ratio η = 2 T / | U | satisfies 0 . ≤ η ≤ . We rely on optical properties of galaxies in the local Uni-verse as provided by Data Release 7 (DR7, Abazajian et al.2009) of the Sloan Digital Sky Survey (SDSS, York et al.2000). From the SDSS DR7 Catalog Archive Server (CAS), we obtained galaxy properties including Galactic extinction-corrected apparent magnitudes (luptitudes) in the u , g and r bands for all galaxies with spectroscopic redshifts in therange 0 . ≤ z ≤ . r -bandapparent magnitude threshold m r ≤ .
7. Both Petrosianand Model magnitudes were obtained from the database.Absolute magnitudes M . u , M . g , M . r were estimated byK-correcting to rest frame bands at z = 0 . K-correct (Blanton & Roweis 2007) (we used a modified version of thePython wrapper provided by N. Raseliarison ) and evolutioncorrecting as described by Blanton et al. (2003). We did notcorrect for dust extinction in the host, which makes edge-onspirals appear redder. This makes our colour-dependent anal-ysis consistent with that of Zehavi et al. (2011) who reportedmeasurements of colour-dependent clustering using similarly https://bitbucket.org/gfcstanford/rockstar http://hpc.iucaa.in https://github.com/nirinA/kcorrect_python uncorrected colours. Correcting for inclination can, in princi-ple, affect inferences regarding the physics of quenching insatellites, as well as the physics governing the Hi content ofoptically red galaxies, which we will explore in future work.(Consistency with Zehavi et al. 2011 is also why we do notwork with the improved SDSS photometry discussed in Meertet al. 2015.) Flux measurement errors were accounted forwhen using K-correct , but not explicitly in the Gaussianmixture fitting below.This analysis yielded values of M r ≡ M . r − ( h ),and similarly M g and M u , for each galaxy. The latter wereconverted to the colour indices g − r = M g − M r and u − r = M u − M r , which are therefore rest frame colours, K-correctedand evolution corrected to z = 0 .
1. Below, we use Petrosianabsolute magnitudes M r and Model colours g − r and u − r . Here we discuss the complete optical HOD we use for assign-ing luminosities and colours to mock galaxies. This HODis constrained by measurements of luminosity- and colour-dependent clustering in SDSS DR7 and uses a Gaussianmixture description of bi-variate colour distributions, as de-scribed below.
We use the standard 5-parameter mass-only HOD calibratedby Paul et al. (2019) for the WMAP7 cosmology, which de-scribes SDSS luminosity-dependent clustering measurementsfrom Zehavi et al. (2011). The HOD was calibrated usingsimulation-based tables of 2-point halo correlation functionsand halo profiles from the z = 0 outputs of the simulationsdescribed above, following the technique of Zheng & Guo(2016). Satellites were assumed to be distributed according tothe spherically averaged dark matter distribution in parenthalos, without assuming the (Navarro et al. 1996, NFW)profile (although the latter is an excellent approximationover the spatial and mass dynamic range of interest). For the2-halo terms, hard-sphere halo exclusion was implemented,but all other correlations were directly measured from thesimulations in narrow mass bins, so that non-linear, scale-dependent halo bias was accurately modelled. See Paul et al.(2019) for further discussion. Paul et al. (2019) also modelled colour-dependent cluster-ing from Zehavi et al. (2011) by treating the ‘red fraction’ f r | s ( M r ) of satellites of luminosity M r as a free parameter.Here f r | s ( M r ) is the fraction of satellites in a luminosity bin(labelled M r ) whose g − r colours satisfy g − r > ( g − r ) cut ( M r ) ≡ . − . M r , (1)(equation 13 of Zehavi et al. 2011). Paul et al. (2019) prop-agated this cut into the halo model formalism to observa-tionally constrain f r | s ( M r ) in the four wide luminosity binsfor which clustering measurements and jack-knife covariancematrices were available (see their Table 1).The final product provided by Paul et al. (2019) com-prises simple fitting functions for the M r -dependence of thefive parameters defining the thresholded HOD and the pa-rameter f r | s in the range − < M r ≤ −
19 (see their Figures
MNRAS000
MNRAS000 , 1–17 (0000)
Paranjape et al.
Figure 1.
Bivariate distribution of Model ( g − r, u − r ) coloursin bins of Petrosian absolute magnitude M r (coloured histograms)measured in SDSS, compared with the best fit Gaussian mixturein each bin (contours).
12 and 13 and Table 3; they denote f r | s as p r | s ). We use thesebelow to assign luminosities and colours to mock central andsatellite galaxies. In the following, we require the joint distribution p ( u − r, g − r | M r ) of g − r and u − r at fixed M r in SDSS, whichwe model as a 2-component bivariate Gaussian mixture. Inprinciple, one can logically extend this analysis to multiplebands using a higher-dimensional Gaussian mixture. We usemeasured values of g − r and u − r in narrow bins of M r without accounting for measurement errors. For each bin of M r , we construct a volume-limited subsample by choosinggalaxies in the redshift range 0 . ≤ z < z max ( M r, max ),where M r, max is the faint edge of the bin and z max ( M r, max )is the redshift at which a galaxy of this absolute magnitudewould fall below the flux limit m r = 17 . sklearn.mixture (Pedregosa et al.2011) to implement an iterative Expectation-Maximisationalgorithm (Dempster et al. 1977) with 12 initialisations. Werepeat the exercise in each luminosity bin for 150 bootstrapresamplings of the respective subsample and use the averagevalues of the 11 parameters defining the Gaussian mixture https://scikit-learn.org/ as the ‘best fit’, with the corresponding standard deviationsacross the bootstrap samples as errors.Figure 1 compares the measured bivariate distri-butions in a few bins of M r (coloured histograms)with the best fit Gaussian mixture (contours). Fig-ure 2 summarises the 11 parameters of the mixturefor all the luminosity bins (points with errors). Theparameters are: (i) the probability p (red | M r ) that thegalaxy belongs to the ‘red’ mode of the mixture, (ii)the mean vectors ( (cid:104) g − r | M r (cid:105) red , (cid:104) u − r | M r (cid:105) red ) and( (cid:104) g − r | M r (cid:105) blue , (cid:104) u − r | M r (cid:105) blue ) of the red and blue modes,respectively, (iii) the diagonal elements of the covariancematrices ( σ ( g − r | M r ) , σ ( u − r | M r )) and ( σ ( g − r | M r ) , σ ( u − r | M r )) of the red and blue modes, respec-tively and (iv) the correlation coefficients between g − r and u − r , for the red and blue modes respectively. The smoothcurves in Figure 2 show piece-wise continuous polynomialfits to each parameter. These are used similarly to the HODfitting functions provided by Paul et al. (2019), to assigncolours to mock galaxies (see below).For assigning colours to centrals and satellites separately,we require the probability p (red | sat , M r ) that the g − r and u − r colours of a satellite of luminosity M r are drawn fromthe ‘red mode’ of the bivariate double-Gaussian distributiondescribed above. This can be easily obtained by combiningthe constraints on f r | s described in section 2.3.2 with (a subsetof) the parameters of the Gaussian mixture p ( g − r, u − r | M r ),and is given by p (red | sat , M r ) = 2 f r | s − I (blue) ( M r ) I (red) ( M r ) − I (blue) ( M r ) (2)where (suppressing the M r -dependence of the arguments forbrevity) I (red / blue) ( M r ) = erfc (cid:32) ( g − r ) cut − (cid:104) g − r (cid:105) red / blue √ σ red / blue ( g − r ) (cid:33) , (3)and where ( g − r ) cut ( M r ) was given in equation (1). Stellar masses are calculated using a mass-to-light ratiocalibrated to SDSS DR7 measurements similarly to Paranjapeet al. (2015, see their sections 2.3 and 4.2). The measurementsare shown as the coloured histogram in Figure 3. We fit amean relation to these measurements in bins of x ≡ ( g − r ),of the form (cid:104) M/L (cid:105) ( x ) = a + b erf (( x − c ) d ) + e tanh(( x − f ) /g ) , (4)finding best fitting values a = 1 . b = 0 .
735 ; c = 0 . ,d = 3 .
38 ; e = 0 .
187 ; f = 0 . g = 0 . . (5)We also fit the scatter around the mean in the same bins, ofthe form σ ( M/L ) ( x ) = (cid:26) k + k ( x − x ) ; x < x k + k ( x − x ) + k ( x − x ) ; x ≥ x , (6)finding best-fitting values given by x = 1 .
019 ; k = 0 . k = − . ,k = 1 .
16 ; k = − . . (7) MNRAS , 1–17 (0000) olourful mocks Figure 2.
Summary of Gaussian mixture fits for p ( g − r, u − r | M r ) (points with errors). Smooth curves show piece-wise continuouspolynomial fits to each mixture parameter, which we use in assigning colours to mock galaxies. Our calibration is shown in Figure 3 and is better behavedthan that of Paranjape et al. (2015) for very blue objects; wehave checked, however, that both calibrations lead to nearlyidentical results for the final mocks. For each mock galaxywith colour g − r , we calculate a Gaussian random numberfor the mass-to-light ratio with mean and standard deviationcalculated using equations (4) and (6), respectively, setting x = ( g − r ). This is then combined with the correspondingvalue of M r of the galaxy to assign a stellar mass m ∗ . Paul et al. (2018, hereafter, PCP18) calibrated a lognormalscaling relation (with constant scatter in log-mass) betweenthe neutral hydrogen mass m Hi of a galaxy and its opti-cal properties M r and g − r , using the optical HOD fromGuo et al. (2015) and clustering measurements from Guoet al. (2017) of Hi -selected galaxies in the ALFALFA sur-vey (Giovanelli et al. 2005). The scaling relation was sep-arately calibrated for central and satellite galaxies using ahalo model, along with an overall parameter f Hi which givesthe fraction of optically selected galaxies that contain Hi .Their default model also assumed that satellites with m Hi > . × (0 . /h ) M (cid:12) do not exist. The model was con-strained using measurements of number counts and projectedclustering w p ( r p ) of ALFALFA Hi -selected galaxies fromGuo et al. (2017) for the thresholds log ( m Hi /h − M (cid:12) ) ≥ . (0 . . (0 . w p ( r p )for galaxies with log ( m Hi /h − M (cid:12) ) ≥ . (0 . m > m sat , max , Figure 3.
Calibration of mass-to-light ratio for the SDSS DR7sample, using the code
K-correct . Histogram shows the measureddistribution of Model g − r colours against Petrosian mass-to-light ratio in the r -band. Yellow points with errors show binnedmeasurements of the same, and the yellow solid curve shows ourbest fit relation (4) to the points. Dashed red curve shows thefit reported by Wang & White (2012) for reference. See text fordiscussion. while keeping the scaling relations for centrals and satel-lites intact otherwise, leads to acceptable descriptions of themeasurements for all the thresholds mentioned above, aswell as of the Hi mass function. A simple χ minimisationexercise, varying m sat , max and predicting w p ( r p ) for galaxieswith log ( m Hi /h − M (cid:12) ) ≥ . (0 . MNRAS000
K-correct . Histogram shows the measureddistribution of Model g − r colours against Petrosian mass-to-light ratio in the r -band. Yellow points with errors show binnedmeasurements of the same, and the yellow solid curve shows ourbest fit relation (4) to the points. Dashed red curve shows thefit reported by Wang & White (2012) for reference. See text fordiscussion. while keeping the scaling relations for centrals and satel-lites intact otherwise, leads to acceptable descriptions of themeasurements for all the thresholds mentioned above, aswell as of the Hi mass function. A simple χ minimisationexercise, varying m sat , max and predicting w p ( r p ) for galaxieswith log ( m Hi /h − M (cid:12) ) ≥ . (0 . MNRAS000 , 1–17 (0000)
Paranjape et al. covariance matrix kindly provided by Hong Guo, leads to m sat , max (cid:39) . h − M (cid:12) .Being tied to the optical HOD, this Hi scaling relationsuffers from a natural incompleteness in producing Hi masses,determined by the optical completeness limit on M r for thegalaxy population. For a sample limited by M r < − Hi mass function is complete only above m Hi (cid:38) . h − M (cid:12) . As such, all our analysis of Hi -selectedgalaxies is restricted to the massive end. We now describe our main algorithm for generating mockgalaxies in a gravity-only simulation, along with a ‘baryoni-fication’ scheme for assigning spatial distributions of starsand gas to each galaxy.
Our basic algorithm to assign galaxy luminosities and coloursto halos in N -body simulations is essentially the same as thatdescribed by Skibba & Sheth (2009), with a few technicalimprovements. We also include a number of additional galaxyproperties. The algorithm can be summarised as follows: • Central occupation and luminosity:
A thresholdluminosity L min (or absolute magnitude M r, max ) is dy-namically determined by requiring that the HOD of cen-tral galaxies f cen ( > L min | m ) be sampled down to a value1 . × − for all halo masses m ≥ m part , where m part isthe particle mass of the simulation. For example, for theL300 N1024 configuration, this gives us M r, max (cid:39) − m are then occupied by a central galaxywith probability f cen ( > L min | m ). Central galaxy luminosi-ties are sampled from the conditional luminosity function f cen ( > L | m ) /f cen ( > L min | m ). • Satellite occupation and luminosity:
The numberof satellites in an occupied halo is drawn from a Poissondistribution with mean ¯ N sat ( > L min | m ), and the luminositiesof these satellites are sampled from the conditional luminosityfunction ¯ N sat ( > L | m ) / ¯ N sat ( > L min | m ). We do not enforcethat satellites be less luminous than their host central. Fora luminosity-complete sample of galaxies with M r ≤ − .
5% of groups containing atleast one satellite, the brightest satellite is brighter than thecentral. • Galaxy positions:
The central of a halo is placed atthe halo center-of-mass. Satellite positions are distributed asan NFW profile around the central, truncated at R . Haloconcentrations c are drawn from a Lognormal distributionwith median and scatter at fixed halo mass as calibratedby Diemer & Kravtsov (2015) using very high-resolutionsimulations, which avoids contamination due to numericalfitting errors in relatively low-resolution boxes. The fittingfunctions from Diemer & Kravtsov (2015, who used the m definition of halo mass) are converted to the m definitionappropriate for the HOD fits using the prescription of Hu &Kravtsov (2003).To preserve correlations between halo concentration andlarge-scale environment, the Lognormal concentrations ofhalos in narrow mass bins are rank ordered by the actualestimated halo concentrations in each bin (the assumption being that this ranking would be approximately preservedeven in relatively coarsely sampled halos, although see Ra-makrishnan et al. 2020). The scale radius r s = R /c inferred for each halo from this exercise is stored for lateruse (see section 3.2). • Galaxy velocities:
The central of a halo is assigned thebulk velocity of the halo. The satellites are assigned randomvelocities drawn from a 3-dimensional isotropic Gaussiandistribution with mean equal to the central velocity and 1-dvelocity dispersion appropriate for the NFW profile at thelocation of the satellite.Additionally, we implement the following modifications: • Galaxy colours:
To assign g − r and u − r colours togalaxies, we extend the algorithm proposed by Skibba &Sheth (2009). As in their case, the first step is to determinewhether a galaxy is ‘red’ or ‘blue’, which is done separatelyfor satellites and centrals, using the probability p (red | sat , M r )for satellites (see sections 2.3.3) and the corresponding prob-ability for centrals which can be derived using p (red | sat , M r ), p (red | M r ) and the HOD (this also requires an integral overthe halo mass function, for which we use the fitting functionfrom Tinker et al. 2008). In the next step, we assign g − r and u − r colours by sampling the appropriate (bivariate) modeof the double Gaussian p ( g − r, u − r | M r ) calibrated in sec-tion 2.3.3. Operationally, we first sample the univariate mode p red / blue ( g − r | M r ) (obtained by marginalising the respectivebivariate distribution over u − r ) and then sample the corre-sponding conditional distribution p red / blue ( u − r | g − r, M r ),for each red/blue galaxy. Optionally, we also include galacticconformity by correlating g − r with halo concentration (oran unspecified Gaussian-distributed halo property) at fixedhalo mass, using the tunable prescription of Paranjape et al.(2015). • Galaxy stellar masses:
As discussed by Paranjapeet al. (2015) and described in detail in section 2.3.4, wecalculate stellar masses m ∗ using a ( g − r )-dependent mass-to-light ratio calibrated for SDSS DR7 galaxies. • Galaxy neutral hydrogen masses:
We assign Hi masses to a uniformly sampled fraction f Hi of galaxies, with alognormal distribution at fixed M r and g − r , using the modelfrom PCP18. This model, which we refer to as the ‘mini-mal PCP18’ model below, additionally discards Hi -satelliteswhich have log ( m Hi /h − M (cid:12) ) > . (0 . Hi -satellites in parent halos having m ≥ m sat , max . This is done after the uniform downsampling for Hi assign-ment described above. For our default model, which we call‘PCP18 mod-sat’, we set m sat , max = 10 . h − M (cid:12) (see be-low for a comparison between the models). As discussed byPCP18, for optically selected galaxies with M r ≤ −
18, thisoptical- Hi scaling relation places the majority of Hi mass infaint blue galaxies (see their Figure 9). • Neutral hydrogen disks:
For each galaxy containing Hi , we assign a comoving disk scale length h Hi for an assumedthin disk with Hi surface density Σ Hi ( r ⊥ ) ∝ e − r ⊥ /h Hi (here r ⊥ is the radial distance in the disk plane), using the scalingrelation h Hi = 7 . h − kpc (cid:0) m Hi / h − M (cid:12) (cid:1) . , (8) All such ‘discarded’ satellites continue to have their assignedoptical properties, only their Hi mass is set to zero.MNRAS , 1–17 (0000) olourful mocks with a scatter of 0 .
06 dex, consistent with the measurementsreported by Wang et al. (2016). These are useful in mod-elling ‘baryonified’ rotation curves, as we describe later. • Galaxy environment:
We assign to each galaxy thevalue of its host-centric dark matter overdensity δ and tidalanisotropy α , each Gaussian smoothed at an adaptive scale4 R , host / √ h − Mpc, as well as thebias b of the host (see Paranjape et al. 2018a and Paranjape& Alam 2020 for how these variables are calculated in the N -body simulation). The galaxy properties described above, namely, luminosity,colour, stellar mass and Hi mass, are all assigned by our al-gorithm by treating each galaxy as a point object. Combinedwith the information on the host halo mass and concentra-tion, however, these properties can also be used to model the spatial distribution of stars and of hot and cold gas in thegalaxy and its halo. This in turn can be used to construct arotation curve for the galaxy, which has several interestingapplications as we discuss later.Since the circular velocity v rot ( r ) at a halo-centric dis-tance r depends on the total mass m tot ( < r ) enclosed inthis radius, we must model the spatial distribution of all matter components inside the host halo. This is particularlyrelevant for the inner parts of the halo which are typicallybaryon-dominated. We restrict this analysis to central galax-ies and will return in future work to satellite galaxies, whichrequire additional modelling of processes such as tidal andram pressure stripping, strangulation, etc. (see, e.g., van denBosch et al. 2008; Behroozi et al. 2019) that are beyond thescope of the present work.We follow the prescription of Schneider & Teyssier (2015,henceforth, ST15) to ‘baryonify’ each host halo. This method,and extensions thereof, have been shown to successfully ac-count for baryonic effects in the matter power spectrum(Chisari et al. 2018; Schneider et al. 2019; Aric`o et al. 2020b)and bispectrum (Aric`o et al. 2020a) at relatively small scalesover a range of redshifts. We have modifed the ST15 pre-scription to include the Hi disk and have simplified it bytruncating all profiles at the halo radius (see, e.g., Aric`oet al. 2020b). Following the general practice for this method,we use the mass m vir ≡ m and radius R vir ≡ R forall baryonification scaling relations below. We are primarilyinterested here in low redshifts and length scales (cid:46) R vir . Webriefly summarise the method next. • Before baryonification, each halo starts with its totalmatter as a single component distributed according to anNFW profile consistent with the gravity-only simulation inwhich the mock catalog is being generated, using the mass m vir and concentration c vir = R vir /r s (see section 3.1 fordetails of determining the scale radius r s for each halo). • We divide the total mass of each baryonified halo into 5components: bound gas (‘bgas’), stars in the central galaxy Wang et al. (2016) provide a scaling relation for the quantity D Hi defined as the diameter of the contour corresponding to asurface density of 1 M (cid:12) pc − , which we relate to h Hi using theprovided scaling relation itself (which implies a constant averagesurface density, independent of m Hi ) along with an integral overthe exponential disk profile Σ Hi ( r ⊥ ). (‘cgal’), neutral hydrogen in its disk (‘ Hi ’), gas expelleddue to feedback (‘egas’) and the dark matter which quasi-adiabatically relaxes in the presence of the baryons (‘rdm’). • Each baryonic component, denoted by index α ∈{ bgas , cgal , Hi , egas } , is assigned a mass fraction f α subjectto the constraint f bary ≡ (cid:80) α f α = Ω b / Ω m due to conserva-tion of baryonic mass. In practice, we set f cgal = m ∗ /m vir , f Hi = m Hi /m vir and f bgas using f bgas = (Ω b / Ω m ) × (cid:104) M c /m vir ) β (cid:105) − , (9)with M c = 1 . × h − M (cid:12) and β = 0 . f egas is then set by the baryonic mass conservationconstraint. • Each baryonic component is given its own mass pro-file ρ α ( r ). The choices below for ρ bgas , ρ cgal and ρ egas areidentical to those in ST15. We briefly describe these belowand refer the reader to ST15 for more details and originalreferences.– The bound gas profile ρ bgas has the form ρ bgas ∝ [ln(1 + r/r s ) / ( r/r s )] / (Γ − , set assuming hydrostatic equi-librium and a polytropic equation of state in the inner haloand matched to the original NFW profile in the outer halo.Here r s and Γ are the scale radius of the NFW profile andthe polytropic index of the gas, respectively. The matchingis performed at a radius r match = √ r s and fixes the valueof Γ. For hosts of centrals with M r ≤ −
19, we find typicalvalues of Γ (cid:39) .
19 with a dispersion of (cid:39) . ρ cgal ∝ r − e − r / R with half-light radius R hl = 0 . R vir (Kravtsov 2013). In principle, this can be extended toinclude a scatter and/or accommodate a dependence onhalo angular momentum as predicted by disk formationmodels (Mo et al. 1998, see the discussion in Kravtsov2013); we ignore this here for simplicity. Strictly speaking,we should treat the stellar profile as a combination of acentral bulge and a 2-dimensional disk (with a relativecontribution that correlates with galaxy colour), ratherthan the purely spherically symmetric form assumed here.In this work, we will follow the previous literature on thesubject and assume the form given above, leaving a moreself-consistent description of the stellar disk to future work.In this sense, the stellar profile we model is better thoughtof as a pure bulge.– The ejected gas profile is taken to be ρ egas ( r ) ∝ e − r / r with r ej = 0 . √ η ej R vir , setting η ej = 0 . r ej (cid:39) . R vir . As discussed by ST15, the modellingof ρ egas ( r ) in the halo outskirts is rather uncertain and ob-servationally ill-constrained. However, at the scales of ourinterest ( r (cid:46) R vir ), ρ egas ( r ) ≈ constant and its contribu-tion to the total mass is therefore completely determinedby baryonic mass conservation inside the halo. Our resultsare therefore expected to be very robust to any minorvariations in the shape of ρ egas ( r ) at scales (cid:38) R vir .– The sphericalised profile of Hi is obtained by integrat-ing a thin exponential disk of surface density Σ Hi ( r ⊥ ) ∝ e − r ⊥ /h Hi (with r ⊥ being the radial distance in the diskplane) to get ρ Hi ( r ) ∝ r − e − r/h Hi . The disk scale length We assume that the stellar and Hi disks are decoupled and donot model time-dependent warps, etc. in the Hi disk.MNRAS000
19 with a dispersion of (cid:39) . ρ cgal ∝ r − e − r / R with half-light radius R hl = 0 . R vir (Kravtsov 2013). In principle, this can be extended toinclude a scatter and/or accommodate a dependence onhalo angular momentum as predicted by disk formationmodels (Mo et al. 1998, see the discussion in Kravtsov2013); we ignore this here for simplicity. Strictly speaking,we should treat the stellar profile as a combination of acentral bulge and a 2-dimensional disk (with a relativecontribution that correlates with galaxy colour), ratherthan the purely spherically symmetric form assumed here.In this work, we will follow the previous literature on thesubject and assume the form given above, leaving a moreself-consistent description of the stellar disk to future work.In this sense, the stellar profile we model is better thoughtof as a pure bulge.– The ejected gas profile is taken to be ρ egas ( r ) ∝ e − r / r with r ej = 0 . √ η ej R vir , setting η ej = 0 . r ej (cid:39) . R vir . As discussed by ST15, the modellingof ρ egas ( r ) in the halo outskirts is rather uncertain and ob-servationally ill-constrained. However, at the scales of ourinterest ( r (cid:46) R vir ), ρ egas ( r ) ≈ constant and its contribu-tion to the total mass is therefore completely determinedby baryonic mass conservation inside the halo. Our resultsare therefore expected to be very robust to any minorvariations in the shape of ρ egas ( r ) at scales (cid:38) R vir .– The sphericalised profile of Hi is obtained by integrat-ing a thin exponential disk of surface density Σ Hi ( r ⊥ ) ∝ e − r ⊥ /h Hi (with r ⊥ being the radial distance in the diskplane) to get ρ Hi ( r ) ∝ r − e − r/h Hi . The disk scale length We assume that the stellar and Hi disks are decoupled and donot model time-dependent warps, etc. in the Hi disk.MNRAS000 , 1–17 (0000) Paranjape et al.
Figure 4.
Baryonification of halos from gravity-only simulations. We show two examples of hypothetical halos with masses m vir =10 h − M (cid:12) (left panels) and m vir = 10 . h − M (cid:12) (right panels) , with halo concentrations and baryonic mass fractions as indicatedin respective labels. The baryonic fractions sum up to Ω b / Ω m (cid:39) . Upper panels:
Density profiles normalised by the halo density ρ ( < R vir ) ≡ ρ crit . The original NFW profile in each case is shown by the thin dotted black curve. Thicker curves with different coloursand line styles show the 5 individual components as indicated, with the thick dark red curve showing the final total profile. Note especiallythat the ejected gas profile (dotted magenta) is nearly a constant in each case, and that the relaxed dark matter profile (thick dashedblack) is substantially different from the original NFW in its shape due to quasi-adiabatic contraction and expansion. Lower panels:
Rotation velocity profile v rot ( r ) normalised by the virial velocity V vir = (cid:112) Gm vir /R vir . Note that the contribution of the Hi disk to therotation curve is treated separately from that of the spherical components, as described in the text. h Hi is assigned as described in section 3.1. Note:
Thissphericalised contribution only affects the calculation ofthe relaxed dark matter component. The contribution ofthe Hi disk to the rotation curve itself is treated separatelyas described below.Each profile function ρ α ( r ) is normalised so as to enclose the total mass m vir inside R vir . The total baryonic profile is then f bary ρ bary ( r ) = (cid:80) α f α ρ α ( r ), with an enclosed baryonicmass m bary ( < r ) = 4 π (cid:82) r d r (cid:48) r (cid:48) f bary ρ bary ( r (cid:48) ). • The dark matter component is assumed to quasi-adiabatically respond to the presence of baryonic mass andrelax to a new shape while approximately conserving angularmomentum. The details of the iterative procedure used tocalculate the resulting relaxed dark matter profile ρ rdm ( r )(normalised similarly to the baryonic profile functions) arein Appendix A, which is based on section 2.3 of ST15. Themass fraction f rdm is set simply by mass conservation to be f rdm = 1 − Ω b / Ω m . The dark matter mass enclosed in radius r is then m rdm ( < r ) = 4 π (cid:82) r d r (cid:48) r (cid:48) f rdm ρ rdm ( r (cid:48) ). • The thin exponential Hi disk leads to a mid-plane circu-lar velocity contribution v Hi ( r ) satisfying (see section 2.6 ofBinney & Tremaine 1987) v Hi ( r ) = 2 f Hi V ( h Hi /R vir ) y [ I ( y ) K ( y ) − I ( y ) K ( y )] , (10)where y ≡ r/ (2 h Hi ), V vir = (cid:112) Gm vir /R vir is the virial velocity and I n ( y ) and K n ( y ) are modified Bessel functions of thefirst and second kind, respectively. • The total mass of dark matter and all baryonic compo-nents except the Hi disk enclosed in radius r is m tot − Hi ( Sanity check on HOD. (Left panel:) Mocks versus input functions. For each threshold on M r , we separately show the contributionof central and satellite galaxies in the mock (histograms) and in the analytical HOD (smooth curves). (Right panel:) Thresholdedluminosity function in the mock (separately showing the contribution of central, satellite and all galaxies) versus SDSS data from Zehaviet al. (2011). Figure 6. Sanity check on clustering. Projected 2-point correlationfunction (2pcf) w p ( r p ) for red/blue/all galaxies in luminosity binsin the mock (solid lines with error bands) compared with data fromZehavi et al. (2011, points with errors). Mock measurements used6 realisations of the L300 N1024 box for the three fainter bins and3 realisations of the L600 N1024 box for the brightest bin. Linesshow the mean and error bands reflect the respective standarddeviations over all available realisations. Similarly to the data,mock galaxies were classified as red and blue based on their g − r values in comparison to equation (1). The 2pcf for mock galaxieswas calculated using equation (12) with π max = 60 h − Mpc tomatch the Zehavi et al. (2011) measurements. We now report the results of generating mock catalogs usingthe algorithm of section 3 on the simulations described insection 2.1. As a sanity check, the left panel of Figure 5 compares theinput fitting functions for the HOD from Paul et al. (2019)with the output of the mock algorithm applied to a singleL300 N1024 box. The right panel of the Figure compares thethresholded luminosity function averaged over 6 realisationsof the L300 N1024 box with the measurements from Zehaviet al. (2011) that were used as constraints by Paul et al.(2019).Figure 6 similarly compares the projected clustering ofred/blue/all galaxies in mock catalogs with the measurementsfrom Zehavi et al. (2011). Mock galaxies were classified asred and blue based on their g − r values in comparison toequation (1) to ensure a fair comparison with the data. Theprojected 2-point correlation function (2pcf) w p ( r p ) for mockgalaxies was calculated by integrating the real space 2pcf ξ ( r ) using w p ( r p ) = 2 (cid:90) √ r + π r p d r r ξ ( r ) (cid:112) r − r , (12)where we set π max = 60 h − Mpc to match the Zehavi et al.(2011) measurements. We see generally good agreement inall cases, although the clustering of red galaxies tends to belower in the mock than in the data. This is very likely due tothe limited volume of our 300 h − Mpc boxes, which do notinclude the effects of faint (predominantly red) satellites invery massive halos.Figure 7 shows the joint distributions of M r , g − r and u − r in one mock using the L300 N1024 box. We clearly see thewell-known colour-magnitude and colour-colour bimodality,another sanity check on the mock algorithm. MNRAS000 As a sanity check, the left panel of Figure 5 compares theinput fitting functions for the HOD from Paul et al. (2019)with the output of the mock algorithm applied to a singleL300 N1024 box. The right panel of the Figure compares thethresholded luminosity function averaged over 6 realisationsof the L300 N1024 box with the measurements from Zehaviet al. (2011) that were used as constraints by Paul et al.(2019).Figure 6 similarly compares the projected clustering ofred/blue/all galaxies in mock catalogs with the measurementsfrom Zehavi et al. (2011). Mock galaxies were classified asred and blue based on their g − r values in comparison toequation (1) to ensure a fair comparison with the data. Theprojected 2-point correlation function (2pcf) w p ( r p ) for mockgalaxies was calculated by integrating the real space 2pcf ξ ( r ) using w p ( r p ) = 2 (cid:90) √ r + π r p d r r ξ ( r ) (cid:112) r − r , (12)where we set π max = 60 h − Mpc to match the Zehavi et al.(2011) measurements. We see generally good agreement inall cases, although the clustering of red galaxies tends to belower in the mock than in the data. This is very likely due tothe limited volume of our 300 h − Mpc boxes, which do notinclude the effects of faint (predominantly red) satellites invery massive halos.Figure 7 shows the joint distributions of M r , g − r and u − r in one mock using the L300 N1024 box. We clearly see thewell-known colour-magnitude and colour-colour bimodality,another sanity check on the mock algorithm. MNRAS000 , 1–17 (0000) Paranjape et al. Figure 7. Colour-magnitude and colour-colour bimodality in the mock. Histograms show distributions of M r against g − r (left panel) and u − r (middle panel) , and g − r against u − r (right panel) in one mock using the L300 N1024 box. Figure 8. Differential luminosity functions (left panel) and stellar mass functions (right panel) of red/blue/all central/satellite/all galaxiesaveraged over 6 mocks using the L300 N1024 configuration. Colour segregation in the left panel was based on equation (1) applied to g − r and M r for each galaxy, while for the right panel we used ( g − r ) cut = 0 . 76 + 0 . ( m ∗ /h − M (cid:12) ) − 10] (Paranjape et al. 2015).For comparison, the solid green curves in the left (right) panel show the corresponding SDSS fits from Blanton et al. (2003) (Peng et al.2012). In the case of the stellar mass function, these fits are also separately available for centrals and satellites and are shown as thedashed and dotted green curves, respectively Turning to somewhat more detailed tests, Figure 8 com-pares the differential luminosity and stellar mass functionsaveraged over 6 realisations of the L300 N1024 box withfitting functions to SDSS measurements from the literature.In each case, for the mock measurements we show resultsseparately for red/blue/all central/satellite/all galaxies. Thetotal luminosity function is in reasonable agreement with theSchechter function fit from Blanton et al. (2003), which is notvery surprising since the thresholded luminosity function wasused as a constraint in the HOD calibration. The total stellarmass function of the mocks also agrees reasonably well withthe corresponding fit from Peng et al. (2012). In this case,we also have individual fits for centrals and satellites, whichwere produced by Peng et al. (2012) using the SDSS groupcatalog of Yang et al. (2007), which similarly agree well withthe stellar mass functions of mock centrals and satellites,respectively. Considering the substantial amount of system- atic uncertainty involved in extracting stellar mass functionsfrom data, as well as inherent systematics in the galaxy clas-sification algorithm used to produce the SDSS group catalog,we conclude that the mocks are in good agreement with thedata here as well. Figure 9 compares the Hi mass function and projected 2pcfin the ‘minimal PCP18’ model (see section 3.1) with measure-ments in the ALFALFA survey. The left panel shows the dif-ferential Hi mass function of red/blue/all central/satellite/allgalaxies averaged over 6 realisations of the L300 N1024box, compared with the fit to Hi -selected galaxies in theALFALFA survey by Martin et al. (2010). We see reason-able agreement above the completeness limit of the cat-alog (set by the luminosity completeness threshold; see MNRAS , 1–17 (0000) olourful mocks Figure 9. (Left panel:) Differential Hi mass function of red/blue/all central/satellite/all galaxies in averaged over 6 ‘minimal PCP18’model mocks using the L300 N1024 configuration. Colour segregation was based on equation (1) applied to g − r and M r for each galaxy.For comparison, the solid green curves show the corresponding fit from Martin et al. (2010) to Hi -selected galaxies in the ALFALFAsurvey. (Right panel:) Projected 2pcf in the ‘minimal PCP18’ model (averaged over the same mocks as in the left panel) for three m Hi thresholds, compared with corresponding ALFALFA measurements from Guo et al. (2017). The labels indicate the values of χ whencomparing each mock result with the corresponding 12 data points from the ALFALFA measurements using the covariance matriceskindly provided by Hong Guo. Figure 10. Same as figure 9, showing results for the ‘PCP18 mod-sat’ model described in section 3.1, which discards Hi -selected satellitesin parent halos with m ≥ m sat , max . The value of m sat , max was set to 10 . h − M (cid:12) by minimising the χ between the mock 2pcf resultsand corresponding ALFALFA measurements and covariance for the lowest mass threshold shown. We adopt this as our default model forassigning Hi mass to mock galaxies. section 2.4 and the discussion in Paul et al. 2018). The right panel shows that the 2pcf of mock galaxies withlog ( m Hi /h − M (cid:12) ) > . (0 . m Hi thresholds, whichperform well. The 2pcf for Hi -selected mock galaxies wascalculated using equation (12) with π max = 20 h − Mpc tomatch the Guo et al. (2017) measurements.As mentioned earlier, this slight disagreement is likely due to our use of an updated and improved optical HOD.We therefore explore the modification of the PCP18 modeldescribed in section 3.1 and discard Hi -selected satellitesin parent halos with m ≥ m sat , max . To set the value ofthe threshold, we attempted to minimise the χ betweenmocks and data for the threshold log ( m Hi /h − M (cid:12) ) ≥ . (0 . χ has a very broadminimum in the vicinity of m sat , max = 10 . h − M (cid:12) , whichwe use as our default value. The resulting Hi mass function MNRAS000 Same as figure 9, showing results for the ‘PCP18 mod-sat’ model described in section 3.1, which discards Hi -selected satellitesin parent halos with m ≥ m sat , max . The value of m sat , max was set to 10 . h − M (cid:12) by minimising the χ between the mock 2pcf resultsand corresponding ALFALFA measurements and covariance for the lowest mass threshold shown. We adopt this as our default model forassigning Hi mass to mock galaxies. section 2.4 and the discussion in Paul et al. 2018). The right panel shows that the 2pcf of mock galaxies withlog ( m Hi /h − M (cid:12) ) > . (0 . m Hi thresholds, whichperform well. The 2pcf for Hi -selected mock galaxies wascalculated using equation (12) with π max = 20 h − Mpc tomatch the Guo et al. (2017) measurements.As mentioned earlier, this slight disagreement is likely due to our use of an updated and improved optical HOD.We therefore explore the modification of the PCP18 modeldescribed in section 3.1 and discard Hi -selected satellitesin parent halos with m ≥ m sat , max . To set the value ofthe threshold, we attempted to minimise the χ betweenmocks and data for the threshold log ( m Hi /h − M (cid:12) ) ≥ . (0 . χ has a very broadminimum in the vicinity of m sat , max = 10 . h − M (cid:12) , whichwe use as our default value. The resulting Hi mass function MNRAS000 , 1–17 (0000) Paranjape et al. and projected 2pcf are shown in Figure 10; we see a mild im-provement in the 2pcf of the lowest threshold, and also someimprovement in the higher thresholds. Since this halo thresh-olded model, which we refer to as ‘PCP18 mod-sat’, allowsfor the existence of massive satellites while still agreeing withobservations, we choose to adopt it as our default model. Forcomparison, the number densities in units of ( h − Mpc) − in this model for each thresholded sample (in order of in-creasing threshold Hi mass) in the mocks are, respectively, { . ± . , . ± . , . ± . } × − (witherrors estimated using the scatter across 6 realisations), whilethe corresponding values from Table 1 of Guo et al. (2017)are { . , . , . } × − (no errors are provided on thesevalues).We have also explored several ‘beyond halo mass’ mod-ifications of the PCP18 model by changing the criterionused for discarding Hi -selected satellites. In particular, weconsidered thresholds on (i) halo mass and concentrationjointly (i.e., only allowing satellites in low mass and highconcentration halos), (ii) halo concentration alone (only highconcentration halos allowed), (iii) large-scale linear halo bias(low bias halos allowed) and (iv) discarding all ‘red mode’satellites. These are generally inspired by the results of Guoet al. (2017), who found that abundance matching prefer-entially younger (sub)halos with Hi -selected galaxies led togood descriptions of ALFALFA clustering. Of these, a jointthreshold on halo mass and concentration performs the best,but leads to minimum χ values nearly identical to those forthe ‘PCP18 mod-sat’ model above, at the cost of one addi-tional parameter. We therefore conclude that the ALFALFAdata for massive Hi galaxies do not require ‘beyond halomass’ effects in modelling the mass function and projected2pcf, provided that Hi mass is assigned through an opticalscaling relation. Having demonstrated that our mock catalogs reproduce thebasic 1-point and 2-point observables associated with galaxysamples selected by optical or Hi properties, in this sectionwe discuss certain predictions of our mocks, along with a fewpossible extensions. Figure 11 shows the rotation curves of 150 randomly chosenmock central galaxies containing Hi disks, with the curves ineach panel being coloured by one of m ∗ , M r , m Hi or g − r . Asnoted earlier, these are generally flat but show considerablediversity, in qualitative agreement with observed rotationcurves (Persic et al. 1996; McGaugh et al. 2001). We defera more quantitative comparison with observations to futurework.Our mocks also predict the relations between group halomass and stellar mass ( m ∗ − m ) as well as Hi mass ( m Hi − m ).Figure 12 shows the m ∗ − m relation (top panel) and the m Hi − m relation (bottom panel) for all central galaxies with M r ≤− 19 in one mock using the L300 N1024 configuration. The top panel is essentially the same as Figure A4 of Paranjapeet al. (2015), except that we have used an updated HOD.Similarly, the bottom panel can be compared with Figure B1of PCP18, who showed the median m Hi − m relation using their analytical halo model. Both sets of results are consistentwith these earlier works, showing a steep m ∗ − m relation buta much shallower m Hi − m relation. The latter feature alsoemphasizes the need for caution when painting Hi directlyinto halos: the weak correlation between Hi mass and halomass can amplify systematic errors in any calibration.For comparison, the purple curves in the top panel showthe median m ∗ − m relations calibrated using SHAM byBehroozi et al. (2013b, solid) and Kravtsov et al. (2018,dash-dotted), converted in each case to the m mass def-inition appropriate for this work. We see that, for stellarmasses above the completeness limit of our mocks (horizon-tal dotted line), the mock result is closer to the Behrooziet al. (2013b) relation for m (cid:46) . h − M (cid:12) and lies betweenthe two SHAM calibrations at larger halo masses. A similarcomparison with the literature for the m Hi − m relation iscomplicated by the fact that different authors have useddifferent conventions for defining this relation (e.g., Padman-abhan et al. 2017; Guo et al. 2017). We have checked that ourresults are qualitatively similar to these calibrations, leavinga more detailed analysis to future work. A primary strength of our mock algorithm is its ability topaint realistic optical and Hi properties in the same galaxies .This means that we can go beyond previous studies andpredict or forecast expectations for the joint distribution of,say, stellar and Hi mass in low-redshift galaxies, along withthe corresponding spatial correlations.Figure 13 shows the m Hi − m ∗ relation (left panel) andthe spatial cross-correlation function between galaxy samplesselected by luminosity and Hi mass (right panel) in our mocks.These are genuine predictions of our algorithm; comparingthese with corresponding measurements forms a test of thevarious underlying assumptions. Indeed, as already noted byPCP18, we see in the left panel that the predicted m Hi − m ∗ relation is in reasonable agreement with the results of Maddoxet al. (2015, purple points with errors), although the mediantrend in the mocks is slightly lower than in the data.To date, the only measurements of cross-correlationssimilar to those in the right panel are by Papastergis et al.(2013), who studied the projected cross-2pcf between Hi -selected galaxies in ALFALFA and colour-selected galaxiesin SDSS (see their Figures 17 and 18). As pointed out byGuo et al. (2017), however, the weights used by Papastergiset al. (2013) in their 2pcf measurements did not accuratelyaccount for sample variance effects, which are substantial inthe small volume ( z (cid:46) . 05) probed by the ALFALFA survey.E.g., Guo et al. (2017) reported a significant m Hi -dependenceof clustering, which was not detected by Papastergis et al.(2013).It will therefore be very interesting to confront our mockcatalogs with more robust cross-2pcf measurements. Forexample, the choices controlling Hi -satellites in our model(namely, the value of the threshold halo mass m sat , max ) af-fect the shape of the cross-correlation at small separationsbetween bright optical galaxies and Hi -selected galaxies andcan therefore be tested by such observations. MNRAS , 1–17 (0000) olourful mocks Figure 11. Rotation curves of 150 central galaxies containing Hi disks, chosen at random from one mock using the L300 N1024configuration. Each curve is coloured by the galaxy’s stellar mass m ∗ (top left) , luminosity M r (bottom left) , Hi mass m Hi (top right) andcolour g − r (bottom right) . Since each rotation curve is only generated for r ≤ R vir , the curves truncate at the dotted line which shows v ( r ) = ( V vir /R vir ) r in each panel. Figure 12. Correlation with halo mass. Histograms show the m ∗ − m (top panel) and m Hi − m (bottom panel) relation for allcentral galaxies with M r ≤ − 19 in one mock using the L300 N1024configuration. Solid yellow lines in each panel show the medianrelation in bins of halo mass, while dashed yellow lines show thecorresponding 16 th and 84 th percentiles. For calculating thesecurves, in the top panel , we ignore halos which do not contain acentral galaxy and in the bottom panel , we further ignore haloswhose central does not contain any Hi mass. The solid purplecurve in the top panel shows the SHAM calibration from Behrooziet al. (2013b), while the dash-dotted purple curve shows thesame relation with parameters taken from Kravtsov et al. (2018),converted to the m mass definition in each case. The horizontaldotted lines in each panel indicate the approximate completenessthresholds for m ∗ and m Hi in the mock. Although the mocks we have presented here provide fairlyrealistic descriptions of the distribution of optical and Hi properties of local Universe galaxies, they contain severalingredients which can be potentially improved upon or ex-tended. We list some of these here. We noted in section 3.1 that our assignment of halo concen-trations c preserves spatial correlations of c at fixedhalo mass (also called assembly bias) under the assumptionthat these correlations are accurately tracked even by poorlyresolved halos. Recently, Ramakrishnan et al. (2020) haveshown that this assumption fails at worse than ∼ 20% for ha-los resolved with (cid:46) 150 particles. Instead, they showed thatthe spatial correlations of the local tidal anisotropy α are ac-curately preserved even for halos with as few as 30 particles.Using their technique of sampling a conditional distribution p ( c | m, α ), therefore, will be a promising extension of ouralgorithm that would endow mocks built on low-resolution N -body simulations with accurate representations of haloassembly bias.Using the fitting functions provided by Ramakrishnanet al. (2020), the same technique can be trivially extended,over a similar dynamic range in halo mass, to halo propertiessuch as spin and asphericity, with the secondary bias of thesevariables also accurately accounted for. Halo spin, for exam-ple, can then be used to inform the assignment of the stellarspatial distribution (section 3.2), while halo asphericity couldconceivably be used to model aspherical spatial distributionsof satellites.Another interesting extension, also easy to include inour mocks, would be an environment-dependent modulationof the HOD itself, as discussed by Xu et al. (2020). Combinedwith the conditional sampling of halo properties, this wouldlead to full flexibility in modelling galaxy assembly bias. MNRAS000 150 particles. Instead, they showed thatthe spatial correlations of the local tidal anisotropy α are ac-curately preserved even for halos with as few as 30 particles.Using their technique of sampling a conditional distribution p ( c | m, α ), therefore, will be a promising extension of ouralgorithm that would endow mocks built on low-resolution N -body simulations with accurate representations of haloassembly bias.Using the fitting functions provided by Ramakrishnanet al. (2020), the same technique can be trivially extended,over a similar dynamic range in halo mass, to halo propertiessuch as spin and asphericity, with the secondary bias of thesevariables also accurately accounted for. Halo spin, for exam-ple, can then be used to inform the assignment of the stellarspatial distribution (section 3.2), while halo asphericity couldconceivably be used to model aspherical spatial distributionsof satellites.Another interesting extension, also easy to include inour mocks, would be an environment-dependent modulationof the HOD itself, as discussed by Xu et al. (2020). Combinedwith the conditional sampling of halo properties, this wouldlead to full flexibility in modelling galaxy assembly bias. MNRAS000 , 1–17 (0000) Paranjape et al. Figure 13. Predictions of optical- Hi correlations not used in constraining the galaxy-dark matter connection in our mocks. (Left panel:) Joint distribution (coloured histogram) of m Hi and m ∗ for galaxies containing Hi in one mock using the L300 N1024 box. Solid yellow lineindicates the median m Hi in bins of m ∗ , while dashed yellow lines indicate the 16 th and 84 th percentiles. Vertical and horizontal dottedlines indicate the completeness limits of the mock in m ∗ and m Hi , respectively (see Figures 8 and 10). For comparison, the purple symbolswith error bars show the relation calibrated by Maddox et al. (2015, see their Table 1) using a cross-matched sample of Hi -selectedgalaxies from the ALFALFA and SDSS surveys. (Right panel:) Hi mass threshold (indicated in the label) using3 realisations of the L300 N1024 configuration. The samples lie in the same volume and therefore overlap in membership, but are notexplicitly cross-matched during their selection. We also noted above that our treatment of the stellar spatialprofile is, strictly speaking, inconsistent because it assumes aspherically symmetric distribution of stars rather than, say,an axially symmetric disk. This has not been a concern in theliterature on baryonic effects in the matter power spectrum(see section 3.2 for references), presumably because of theangular averaging inherent in calculating the latter. Sinceour primary intention is to produce a rotation curve foreach galaxy, however, the difference between a sphericalbulge and an axial disk can potentially be a large effect (see,e.g., Figure 2-17 of Binney & Tremaine 1987). We intend toexplore this further in a forthcoming work, by simultaneouslymodelling a stellar disk and bulge using realistic bulge-to-disk mass ratios (e.g. Bernardi et al. 2014), along with theircorrelations with galaxy colours and the presence of an Hi disk. In this work, we used the expression (9) for the bound gasfraction f bgas , with parameters adopted from ST15, for mod-elling all central galaxies with M r ≤ − 19. Strictly speaking,this relation holds for the central galaxies of halos with m vir (cid:38) h − M (cid:12) , since it is calibrated using X-ray obser-vations of galaxy clusters. Our choice therefore correspondsto an extrapolation of this relation into an unobserved regimeof halo mass.Since the resulting value of f bgas for each central istypically substantially smaller than its f egas (which is set bybaryonic mass conservation in this work), we do not expectthis extrapolation to lead to any significant systematic errorfor any of the statistics explored in this paper. One possibilityfor improving this calibration would be to focus on modelling f egas first, using observations of the circum-galactic medium,and then fix f bgas using baryonic mass conservation. All of our results have been restricted to the local Universe,a consequence of using clustering constraints from the low-redshift ( z (cid:46) . 1) surveys SDSS and ALFALFA. There area number of reasons to motivate extending our results toredshifts 0 . (cid:46) z (cid:46) z (cid:38) . z = 0 that a simple extension ofour algorithm, ignoring the effects of halo and galaxy merg-ers, might be sufficient to robustly predict (central) galaxyproperties. This would involve a ‘backwards’ evolution ofmulti-band galaxy luminosities and colours using tools suchas K-correct , along with some assumptions regarding theevolution of the optical HOD and the optical- Hi scalingrelation we use, possibly informed by simplified galaxy evolu-tion models (e.g., Lilly et al. 2013). This approach would be MNRAS , 1–17 (0000) olourful mocks complementary to SHAM prescriptions such as UniverseMa-chine (Behroozi et al. 2019) which rely on high-resolutionmerger trees that our approach, in principle, can do without.If successful, such an exercise would allow us to addressa range of issues. For optically selected samples, this includesthe expected distribution of galaxies observed by surveys suchas DES and Euclid, the interplay between scale-dependentgalaxy bias and redshift space distortions for these galaxies,etc. For neutral hydrogen studies, such mocks could informexpectations from stacking analyses, predictions for inten-sity mapping experiments, etc. We leave a more detailedassessment of the feasibility of this technique to future work. The ability to realistically reproduce, in a simulated universe,the properties and spatial distribution of galaxies observed inthe actual Universe, opens the door to addressing a numberof interesting astrophysical and cosmological questions. Wehave presented an updated algorithm that produces catalogsof mock galaxies in simulated halos at z ≈ 0, realisticallyendowed with a variety of properties including r -band lumi-nosities, g − r and u − r colours, stellar masses m ∗ , neutralhydrogen ( Hi ) masses m Hi , as well as (for central galaxies)the spatial distribution of gas and stars, leading to realisticrotation curves. Our mock galaxies additionally inherit anumber of environmental properties from their host darkmatter halos, including the halo-centric overdensity, tidalanisotropy and large-scale halo bias.Our algorithm, which relies on an HOD which assignsgalaxy properties based on halo mass m alone, can option-ally include effects such as galactic conformity and colour-dependent galaxy assembly bias, and is easily extendable toinclude effects such as environment-dependent modulations ofthe HOD. By construction, the basic mocks we presented herereproduce the luminosity function, colour-luminosity relationand the luminosity- and colour-dependent 2-point clusteringof optically selected SDSS galaxies with M r ≤ − 19, as wellas the Hi mass function and Hi -dependent 2-point clusteringof Hi -selected ALFALFA galaxies with m Hi (cid:38) . h − M (cid:12) .The mocks then reproduce the SDSS stellar mass functionand the SDSS-ALFALFA m ∗ − m Hi relation reasonably well(these were not used when constraining the parameters ofthe algorithm), while predicting the spatial 2-point cross-correlation function of low-redshift optical and Hi galaxies(which has not yet been robustly measured; Figure 13), andthe m ∗ − m and m Hi − m relations (Figure 12). The cal-ibrations we used lead to volume-completeness thresholdsof M r ≤ − m ∗ (cid:38) . h − M (cid:12) and m Hi (cid:38) . h − M (cid:12) ,thus representing the population of massive galaxies in thelow-redshift Universe. Our algorithm represents a consoli-dation of the results of Paranjape et al. (2015), Paul et al.(2018) and Paul et al. (2019).Our mocks are potentially useful for a number of appli-cations, some of which we list here. • In their study of environment-dependent clustering inSDSS, Paranjape et al. (2018b) noted some small ( ∼ • As we noted earlier, our algorithm successfully describes Hi -dependent clustering at the massive end in ALFALFAwithout the need of assembly bias, unlike earlier studies (e.g.Guo et al. 2017). Our mocks can therefore serve as usefulnull tests for galaxy assembly bias using interesting newcombinations of observables, such as the large-scale bias ofgalaxies split by Hi mass in bins of stellar mass. • Galactic conformity, a putative non-local connectionbetween the satellites and central galaxy of the same halo,continues to pose a puzzle for galaxy formation models (Wein-mann et al. 2006; Hearin et al. 2016b). Our mocks can beused to explore new tests of this phenomenon, e.g., the poten-tial dependence of galactic conformity on tidal environment(which is otherwise an excellent indicator of halo assemblybias, see Ramakrishnan et al. 2019). • Our mocks can predict the Hi mass function of (massive)galaxies selected by optical luminosity or colour. The corre-sponding measurements have only recently become available(Dutta et al. 2020; Dutta & Khandai 2021), and will bevery useful for testing our basic assumptions regarding theconnection between optical properties and Hi . • The rotation curves of our mock central galaxies, alongwith their Hi disks when present, can be used to model theobserved 21cm velocity profiles of Hi -selected galaxies. Thedistribution of the widths of these profiles has been measuredin the ALFALFA survey (Papastergis et al. 2011; Moormanet al. 2014) and constitutes an exciting and hitherto unex-plored new probe of the small-scale distribution of baryonicmatter. • The ‘radial acceleration relation’ (RAR) between theacceleration profiles due to dark matter and baryons in diskgalaxies (McGaugh et al. 2016; Lelli et al. 2017) has emergedas an intriguing new probe of gravitational theories at galac-tic length scales. The rotation curves and mass profiles ofbaryonic and dark matter in our mock central galaxies willallow us to explore the nature of the RAR for large samplesof galaxies in the CDM+baryons framework.We intend to pursue these applications in the near futureand will report on the results in forthcoming publications. ACKNOWLEDGMENTS We thank R. Srianand for his collaboration and for manyvaluable discussions and suggestions which improved the pre-sentation of this paper. AP also thanks Nishikanta Khandaiand Kandaswamy Subramanian for insightful discussions.The research of AP is supported by the Associateship Schemeof ICTP, Trieste and the Ramanujan Fellowship awardedby the Department of Science and Technology, Governmentof India. TRC acknowledges support of the Department ofAtomic Energy, Government of India, under project no. 12-R&D-TFR-5.02-0700 and the Associateship Scheme of ICTP.This work made extensive use of the open source computingpackages NumPy (Van Der Walt et al. 2011), SciPy (Vir- MNRAS000 We thank R. Srianand for his collaboration and for manyvaluable discussions and suggestions which improved the pre-sentation of this paper. AP also thanks Nishikanta Khandaiand Kandaswamy Subramanian for insightful discussions.The research of AP is supported by the Associateship Schemeof ICTP, Trieste and the Ramanujan Fellowship awardedby the Department of Science and Technology, Governmentof India. TRC acknowledges support of the Department ofAtomic Energy, Government of India, under project no. 12-R&D-TFR-5.02-0700 and the Associateship Scheme of ICTP.This work made extensive use of the open source computingpackages NumPy (Van Der Walt et al. 2011), SciPy (Vir- MNRAS000 , 1–17 (0000) Paranjape et al. tanen et al. 2020), Matplotlib (Hunter 2007), JupyterNotebook and the plotting software Veusz. We gratefullyacknowledge the use of high performance computing facilitiesat IUCAA, Pune. DATA AVAILABILITY The mock catalogs generated by our algorithm will be sharedupon reasonable request to the authors. REFERENCES Abadi M. G., Navarro J. F., Fardal M., Babul A., Steinmetz M.,2010, MNRAS, 407, 435Abazajian K. N., et al., 2009, ApJS, 182, 543Abbas U., Sheth R. K., 2007, MNRAS, 378, 641Aihara H., et al., 2018, PASJ, 70, S4Alam S., Zu Y., Peacock J. A., Mandelbaum R., 2019, MNRAS,483, 4501Alam S., et al., 2020, arXiv e-prints, p. arXiv:2007.09004Aric`o G., Angulo R. E., Hern´andez-Monteagudo C., Contreras S.,Zennaro M., 2020a, arXiv e-prints, p. arXiv:2009.14225Aric`o G., Angulo R. E., Hern´andez-Monteagudo C., Contreras S.,Zennaro M., Pellejero-Iba˜nez M., Rosas-Guevara Y., 2020b,MNRAS, 495, 4800Bagla J. S., Khandai N., Datta K. K., 2010, MNRAS, 407, 567Barnes J., White S. D. M., 1984, MNRAS, 211, 753Behroozi P. S., Wechsler R. H., Wu H.-Y., 2013a, ApJ, 762, 109Behroozi P. S., Wechsler R. H., Conroy C., 2013b, ApJ, 770, 57Behroozi P., Wechsler R. H., Hearin A. P., Conroy C., 2019,MNRAS, 488, 3143Ben´ıtez N., et al., 2015, in Highlights of Spanish AstrophysicsVIII. pp 148–153Bernardi M., Meert A., Vikram V., Huertas-Company M., Mei S.,Shankar F., Sheth R. K., 2014, MNRAS, 443, 874Bett P., Eke V., Frenk C. S., Jenkins A., Helly J., Navarro J.,2007, MNRAS, 376, 215Binney J., Tremaine S., 1987, Galactic dynamics. Princeton Uni-versity Press, Princeton, NJBlanton M. R., Roweis S., 2007, AJ, 133, 734Blanton M. R., et al., 2003, ApJ, 592, 819Blumenthal G. R., Faber S. M., Flores R., Primack J. R., 1986,ApJ, 301, 27Blyth S., et al., 2015, in Advancing Astrophysics with the SquareKilometre Array (AASKA14). p. 128 ( arXiv:1501.01295 )Castorina E., Villaescusa-Navarro F., 2017, MNRAS, 471, 1788Chisari N. E., et al., 2018, MNRAS, 480, 3962Contreras S., Angulo R., Zennaro M., 2020, arXiv e-prints, p.arXiv:2012.06596Cooray A., Sheth R., 2002, Phys. Rep., 372, 1DESI Collaboration et al., 2016, arXiv e-prints, p.arXiv:1611.00036Dempster A. P., Laird N. M., Rubin D. B., 1977, Journal of theRoyal Statistical Society. Series B (Methodological), 39, 1Diemer B., Kravtsov A. V., 2015, ApJ, 799, 108Driver S. P., et al., 2009, Astronomy and Geophysics, 50, 5.12Dubois Y., et al., 2014, MNRAS, 444, 1453Dutta S., Khandai N., 2021, MNRAS, 500, L37Dutta S., Khandai N., Dey B., 2020, MNRAS, 494, 2664Flaugher B., 2005, International Journal of Modern Physics A, 20,3121 https://matplotlib.org/ https://jupyter.org https://veusz.github.io/ http://hpc.iucaa.in Giovanelli R., et al., 2005, AJ, 130, 2598Gnedin O. Y., Kravtsov A. V., Klypin A. A., Nagai D., 2004, ApJ,616, 16Guha Sarkar T., Mitra S., Majumdar S., Choudhury T. R., 2012,MNRAS, 421, 3570Guo H., et al., 2015, MNRAS, 453, 4368Guo H., Li C., Zheng Z., Mo H. J., Jing Y. P., Zu Y., Lim S. H.,Xu H., 2017, ApJ, 846, 61Guzzo L., et al., 2014, A&A, 566, A108Hahn O., Abel T., 2011, MNRAS, 415, 2101Hearin A. P., Watson D. F., 2013, MNRAS, 435, 1313Hearin A. P., Zentner A. R., van den Bosch F. C., Campbell D.,Tollerud E., 2016a, MNRAS, 460, 2552Hearin A. P., Behroozi P. S., van den Bosch F. C., 2016b, MNRAS,461, 2135Hu W., Kravtsov A. V., 2003, ApJ, 584, 702Hunter J. D., 2007, Computing In Science & Engineering, 9, 90Kitaura F.-S., et al., 2016, MNRAS, 456, 4156Komatsu E., et al., 2011, ApJS, 192, 18Kravtsov A. V., 2013, ApJ, 764, L31Kravtsov A. V., Vikhlinin A. A., Meshcheryakov A. V., 2018,Astronomy Letters, 44, 8LSST Science Collaboration et al., 2009, arXiv e-prints, p.arXiv:0912.0201Laureijs R., et al., 2011, arXiv e-prints, p. arXiv:1110.3193Lelli F., McGaugh S. S., Schombert J. M., Pawlowski M. S., 2017,ApJ, 836, 152Lewis A., Challinor A., Lasenby A., 2000, ApJ, 538, 473Lilly S. J., Carollo C. M., Pipino A., Renzini A., Peng Y., 2013,ApJ, 772, 119Madau P., Dickinson M., 2014, ARA&A, 52, 415Maddox N., Hess K. M., Obreschkow D., Jarvis M. J., Blyth S. L.,2015, MNRAS, 447, 1610Manera M., et al., 2013, MNRAS, 428, 1036Mao Y.-Y., et al., 2018, ApJS, 234, 36Martin A. M., Papastergis E., Giovanelli R., Haynes M. P.,Springob C. M., Stierwalt S., 2010, ApJ, 723, 1359Masaki S., Lin Y.-T., Yoshida N., 2013, MNRAS, 436, 2286McGaugh S. S., Rubin V. C., de Blok W. J. G., 2001, AJ, 122,2381McGaugh S. S., Lelli F., Schombert J. M., 2016, Phys. Rev. Lett.,117, 201101Meert A., Vikram V., Bernardi M., 2015, MNRAS, 446, 3943Mo H. J., Mao S., White S. D. M., 1998, MNRAS, 295, 319Moorman C. M., Vogeley M. S., Hoyle F., Pan D. C., HaynesM. P., Giovanelli R., 2014, MNRAS, 444, 3559Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ, 462, 563Obreschkow D., Croton D., De Lucia G., Khochfar S., RawlingsS., 2009, ApJ, 698, 1467Padmanabhan H., Refregier A., Amara A., 2017, MNRAS, 469,2323Papastergis E., Martin A. M., Giovanelli R., Haynes M. P., 2011,ApJ, 739, 38Papastergis E., Giovanelli R., Haynes M. P., Rodr´ıguez-Puebla A.,Jones M. G., 2013, ApJ, 776, 43Paranjape A., Alam S., 2020, MNRAS, 495, 3233Paranjape A., Kovaˇc K., Hartley W. G., Pahwa I., 2015, MNRAS,454, 3030Paranjape A., Hahn O., Sheth R. K., 2018a, MNRAS, 476, 3631Paranjape A., Hahn O., Sheth R. K., 2018b, MNRAS, 476, 5442Paul N., Choudhury T. R., Paranjape A., 2018, MNRAS, 479,1627Paul N., Pahwa I., Paranjape A., 2019, MNRAS, 488, 1220Pedregosa F., et al., 2011, Journal of Machine Learning Research,12, 2825Peng Y.-j., Lilly S. J., Renzini A., Carollo M., 2012, ApJ, 757, 4Persic M., Salucci P., Stel F., 1996, MNRAS, 281, 27Ramakrishnan S., Paranjape A., Hahn O., Sheth R. K., 2019,MNRAS, 489, 2977 MNRAS , 1–17 (0000) olourful mocks Ramakrishnan S., Paranjape A., Sheth R. K., 2020, arXiv e-prints,p. arXiv:2012.10170Reddick R. M., Wechsler R. H., Tinker J. L., Behroozi P. S., 2013,ApJ, 771, 30Schaye J., et al., 2015, MNRAS, 446, 521Schneider A., Teyssier R., 2015, J. Cosmology Astropart. Phys.,2015, 049Schneider A., Teyssier R., Stadel J., Chisari N. E., Le Brun A.M. C., Amara A., Refregier A., 2019, J. Cosmology Astropart.Phys., 2019, 020Scoccimarro R., 1998, MNRAS, 299, 1097Skibba R. A., Sheth R. K., 2009, MNRAS, 392, 1080Somerville R. S., Dav´e R., 2015, ARA&A, 53, 51Springel V., 2005, MNRAS, 364, 1105Springel V., et al., 2018, MNRAS, 475, 676Sugiyama S., Takada M., Kobayashi Y., Miyatake H., ShirasakiM., Nishimichi T., Park Y., 2020, Phys. Rev. D, 102, 083520Teyssier R., Moore B., Martizzi D., Dubois Y., Mayer L., 2011,MNRAS, 414, 195Tinker J., Kravtsov A. V., Klypin A., Abazajian K., Warren M.,Yepes G., Gottl¨ober S., Holz D. E., 2008, ApJ, 688, 709Vale A., Ostriker J. P., 2004, MNRAS, 353, 189Van Der Walt S., Colbert S. C., Varoquaux G., 2011, preprint,( arXiv:1102.1523 )Villaescusa-Navarro F., Viel M., Datta K. K., Choudhury T. R.,2014, J. Cosmology Astropart. Phys., 2014, 050Virtanen P., et al., 2020, Nature Methods, 17, 261Vogelsberger M., et al., 2014, MNRAS, 444, 1518Wang W., White S. D. M., 2012, MNRAS, 424, 2574Wang J., Koribalski B. S., Serra P., van der Hulst T., Roychowd-hury S., Kamphuis P., Chengalur J. N., 2016, MNRAS, 460,2143Weinmann S. M., van den Bosch F. C., Yang X., Mo H. J., 2006,MNRAS, 366, 2Xu X., Zehavi I., Contreras S., 2020, arXiv e-prints, p.arXiv:2007.05545Yang X., Mo H. J., van den Bosch F. C., Pasquali A., Li C.,Barden M., 2007, ApJ, 671, 153Yang X., et al., 2018, ApJ, 860, 30York D. G., et al., 2000, AJ, 120, 1579Yuan S., Eisenstein D. J., Garrison L. H., 2018, MNRAS, 478,2019Zehavi I., et al., 2011, ApJ, 736, 59Zentner A. R., Hearin A. P., van den Bosch F. C., 2014, MNRAS,443, 3044Zhao C., et al., 2020, arXiv e-prints, p. arXiv:2007.08997Zheng Z., Guo H., 2016, MNRAS, 458, 4015Zu Y., Mandelbaum R., 2015, MNRAS, 454, 1161de la Torre S., Peacock J. A., 2013, MNRAS, 435, 743de la Torre S., et al., 2013, A&A, 557, A54van den Bosch F. C., Aquino D., Yang X., Mo H. J., Pasquali A.,McIntosh D. H., Weinmann S. M., Kang X., 2008, MNRAS,387, 79 APPENDIX A: QUASI-ADIABATICRELAXATION Here we describe the technique for calculating the responseof the dark matter profile to presence of baryonic matterthrough an approximate conservation of angular momentum(Barnes & White 1984; Blumenthal et al. 1986; Gnedin et al.2004; Abadi et al. 2010; Teyssier et al. 2011; ST15). The dis-cussion below follows section 2.3 of ST15 (see their equations2.15-2.17).The basic equation describing this quasi-adiabatic re-laxation gives the final radius r of a spherical dark matter element in terms of its initial radius r in , rr in = 1 + q rdm (cid:18) m nfw ( < r in ) m tot ( < r ) − (cid:19) . (A1)Here, m tot ( < r ) = m bary ( < r ) + m rdm ( < r ) is the totalmass contained inside the final radius r , with m bary ( < r ) =4 π (cid:82) r d r (cid:48) r (cid:48) f bary ρ bary ( r (cid:48) ) being the baryonic componentand m rdm ( < r ) being the final, relaxed dark matter massprofile which satisfies m rdm ( < r ) = f rdm m nfw ( < r in ) , (A2)and m nfw ( < r in ) = 4 π (cid:82) r in d r (cid:48) r (cid:48) ρ nfw ( r (cid:48) ) is the dark mattermass inside the initial radius as per the original, normalisedNFW profile. We remind the reader that all the densityprofiles are normalised so as to enclose the entire mass m vir inside r = R vir .The quantity q rdm is a parameter controlling the levelof angular momentum conservation. From equation (A1), wesee that q rdm = 1 corresponds to perfect conservation, since r m ( < r ) is an adiabatic invariant in this case. On the otherhand, q rdm = 0 corresponds to no baryonic backreaction. Inthis work, we follow ST15 and set q rdm = 0 . 68, which hasbeen found to accurately describe the cumulative effects ofbaryonic backreaction effects both in the inner and outerregions of simulated halos, accounting for the fact that theformation of the central galaxy is not instantaneous. Theeffect of varying q rdm on rotation curves and related statisticswill be the focus of a future study.Defining the ratio ξ ≡ r/r in , (A3)equation (A1) can be re-written as L ( ξ | r ) ≡ ξ − q rdm − q rdm f rdm (cid:20) m bary ( < r ) f rdm m nfw ( < r/ξ ) (cid:21) − = 0 , (A4)which is conducive to an iterative solution. We employ New-ton’s method using an analytical expression for the derivative L (cid:48) ( ξ | r ) = ∂ L ( ξ | r ) /∂ξ at fixed r and writing the estimate atthe n th iteration as ξ ( n ) = ξ ( n − − L ( ξ ( n − | r ) L (cid:48) ( ξ ( n − | r ) . (A5)For practically all baryonic configurations and values of r ∈ (10 − , × R vir , and for all values 0 ≤ q rdm ≤ 1, convergenceis achieved with a relative tolerance of 10 − in (cid:46) q rdm (cid:46) . 8, while the inner regionsof the halo do not converge for q rdm (cid:38) . ξ at any r then gives the mass ofrelaxed dark matter enclosed in radius r using equation (A2)setting r in = r/ξ on the right hand side. By construction, ξ = 1 at r = R vir , so that m rdm ( < R vir ) = f rdm m vir , as itshould be. The value of m rdm ( < r ) at any r is sufficient forcalculating the rotation curve v rot ( r ) using equation (11). Forthe differential profile shown in Figure 4, we must differentiateequation (A2) with respect to r . The (somewhat cumbersome)result can be written analytically entirely in terms of ξ and r ; we omit it for brevity. MNRAS000