An Improved and Physically-Motivated Scheme for Matching Galaxies with Dark Matter Halos
Draft version March 1, 2021
Typeset using LaTeX twocolumn style in AASTeX63
Stephanie Tonnesen and Jeremiah P. Ostriker
Flatiron Institute, CCA, 162 5th Avenue, New York, NY 10010 USA
Princeton University Observatory, Ivy Lane, Princeton, NJ 08544 USA
Columbia Astrophysics Laboratory, 538 W 120th St, New York, NY 10027 USA
(Dated: January 2021)
ABSTRACT

The simplest scheme for predicting real galaxy properties after performing a dark matter simulation is to rank order the real systems by stellar mass and the simulated systems by halo mass and then simply assume monotonicity: that the more massive halos host the more massive galaxies. This has had some success, but we study here whether a better motivated and more accurate matching scheme is easily constructed by looking carefully at how well one could predict the simulated IllustrisTNG galaxy sample from its dark matter computations. We find that using the dark matter rotation curve peak velocity, $v_{\rm max}$, for normal galaxies reduces the error of the prediction by 30% (18% for central galaxies and 60% for satellite systems), following expectations from Faber-Jackson and the physics of monolithic collapse. For massive systems with halo mass $> 10^{\ldots}\,M_\odot$, hierarchical merger driven formation is the better model and dark matter halo mass remains the best single metric. Using a new single variable that combines these effects, $\phi = v_{\rm max}/v_{\rm max,\ldots} + M_{\rm peak}/(10^{\ldots}\,M_\odot)$, allows further improvement and reduces the error, as compared to ranking by dark matter mass at $z = 0$ … $v_{\rm max}$ ranking. Two parameter fits, including environmental effects, produce only minimal further impact.

1. INTRODUCTION

There has been a great deal of progress in recent years in our understanding and simulations of galaxy formation. The initial conditions seem to be well specified by LCDM cosmological models and their variants, and hydrodynamic codes now have the capacity and resolution to include many of the physical processes relevant to galaxy formation and evolution.
These include dark matter gravitational collapse, gas cooling, star formation and the mechanical and radiative feedback from stars and central, massive black holes (BHs). There is no widely accepted mechanism for the formation of these massive BHs, but, assuming their formation as seed BHs, their evolution and effects on the surrounding galaxies are now reasonably well modeled. Recent summaries of the status of existing work on galaxy formation and evolution are presented in Somerville and Davé (2015) and Naab and Ostriker (2017). For more detailed treatments one can consult the EAGLE (Schaye et al. 2015; Crain et al. 2015), FIRE (Hopkins et al. 2018a,b), IllustrisTNG (Vogelsberger et al. 2014; Genel et al. 2014; Weinberger et al. 2017; Pillepich et al. 2018), MUFASA (Davé et al. 2016), NIHAO (Wang et al. 2015; Blank et al. 2019) and other simulations.

But there are virtues to considering simpler treatments that do not need to rely on sub-grid modeling and which are easily adaptable to analyzing large data sets. Of these, the simplest, perhaps, is the abundance matching scheme, which is based on the fact that, in all variants of the LCDM modeling, galaxies live in more massive dark matter (DM) halos, quasi-spherical lumps of dark matter, which grow via gravitational instabilities from very low amplitude ($10^{-\ldots}$), gaussian perturbations imparted at very early times in a roughly power law distribution by unknown processes (thought to be related to inflation) (e.g. Vale and Ostriker 2004, 2006, 2008). There is good agreement on how to compute the formation of these dark matter halos, with several computational codes now able to make moderately high-resolution cosmic scale volumes containing accurate distributions of DM halos having well defined properties.
Early analyses by Navarro et al. (1997) showed that these could be represented to a reasonable approximation by three numbers, a mass, a virial radius and a core radius, with the ratio of the latter two numbers represented as the concentration. To zeroth order observed galaxies can be represented by their stellar masses, or alternatively by their stellar luminosities.

Thus, the simplest possible scheme for populating a volume in the universe with galaxies would be to populate it first with DM halos, and then make a rank ordered list of these with the most massive first. Next, one could take the same volume from the real universe and rank order the observed galaxies by mass (or luminosity), and then simply assume that the more massive halos hosted more massive galaxies, putting in each halo the corresponding galaxy.

This almost ridiculously simple scheme was pursued by Vale & Ostriker in three papers (Vale and Ostriker 2004, 2006, 2008; see also Kravtsov et al. 2004; Tasitsiomi et al. 2004). Using this scheme (or a variation thereof), one can take different variants of the LCDM model, compute the halo distribution, populate the halos with galaxies using the simple abundance matching scheme and then compare to observations. By construction, the luminosity functions must come out to be correct, but correlation functions (Conroy et al. 2006; Marín et al. 2008; Guo et al. 2010; Trujillo-Gomez et al. 2011), pair counts (Berrier et al. 2006), magnitude gap statistics (Hearin et al. 2013; Ostriker et al. 2019), and galaxy-galaxy lensing (Hearin and Watson 2013) can be usefully compared to observations.

Two immediate questions arise. First, is there a better zeroth order scheme than ranking by mass and matching? Several authors have considered this question.
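The basic rank-order matching described above (most massive halo receives the most massive galaxy, and so on down both lists) can be sketched in a few lines. This is a minimal illustration only: the function name `abundance_match` and the toy masses are hypothetical and are not taken from any catalogue or from the authors' code.

```python
def abundance_match(halo_masses, galaxy_masses):
    """Assign galaxy masses to halos by rank.

    The most massive halo receives the most massive galaxy, the second
    most massive halo the second most massive galaxy, and so on.
    """
    if len(halo_masses) != len(galaxy_masses):
        raise ValueError("need equal-length samples from the same volume")
    # Indices of halos, sorted from most to least massive.
    halo_order = sorted(range(len(halo_masses)),
                        key=lambda i: halo_masses[i], reverse=True)
    # Galaxy masses, sorted from most to least massive.
    ranked_galaxies = sorted(galaxy_masses, reverse=True)
    assigned = [0.0] * len(halo_masses)
    for rank, halo_index in enumerate(halo_order):
        assigned[halo_index] = ranked_galaxies[rank]
    return assigned


# Toy inputs in solar masses (illustrative numbers only).
halos = [2.0e12, 5.0e13, 8.0e11, 1.0e12]
galaxies = [3.0e10, 1.0e10, 4.0e11, 6.0e10]
predicted = abundance_match(halos, galaxies)
```

Any other halo property can be substituted for `halo_masses` as the ranking variable without changing the procedure, which is why the choice of ranking variable is the only free decision in the zeroth order scheme.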
For example, while the DM in subhalos is quickly stripped once they are accreted onto halos, stripping of the more centralized galaxy starts later (Nagai and Kravtsov 2005), and in fact their optical sizes are observed to grow with cosmic time (cf. van Dokkum et al. 2010), presumably due to the accretion of smaller satellite systems. Thus, Conroy et al. (2006) improved subhalo abundance matching by matching galaxies to halos at the time at which they are accreted onto a central halo. Even earlier, Kravtsov et al. (2004) proposed using the maximum circular velocity of (sub)halos, $v_{\rm max}$, which is more stable than halo mass to stripping. Reddick et al. (2013) tested abundance matching models using several halo properties, and found that only $v_{\rm peak}$ (defined as the peak value of $v_{\rm max}$ over the history of the halo) or a combination of $v_{\rm max}$ for central galaxies and $v_{\rm peak}$ for satellite galaxies is able to reproduce observations of galaxy clustering. Indeed, Zentner et al. (2014) argue that abundance matching using $v_{\rm max}$ matches several observed galaxy statistics (Conroy et al. 2006; Hearin et al. 2013; Hearin and Watson 2013; Reddick et al. 2013) because halo mass alone does not determine the halo velocity profile.

Xu and Zheng (2018) confirmed that for the central galaxies in the original Illustris simulations, $M_*$ is more tightly correlated with $v_{\rm peak}$ than with halo mass. They also find that at fixed $v_{\rm peak}$, the correlation between $M_*$ and other halo properties is removed. He (2020) uses subhalo abundance matching and finds that $v_{\rm peak}$ correlates best with the stellar mass at the epoch of $v_{\rm peak}$ in both central and satellite galaxies in EAGLE, Illustris, and IllustrisTNG. Stellar mass stripping of satellite galaxies results in increased scatter in the $z = 0$ $M_*$ to $v_{\rm peak}$ relation. Chaves-Montero et al.
(2016) find that $v_{\rm relax}$, defined as the maximum of the circular velocity of a dark matter structure while it fulfils a relaxation criterion, as evaluated along its entire history, correlates most strongly with $M_*$.

Second, do there exist any first order refinements of the mass-matching scheme that could be implemented, which would be easy to apply and would significantly increase its accuracy? In fact, Lehmann et al. (2017) point out that while ranking by $v_{\rm max}$ is similar to ranking by halo mass, at fixed halo mass more concentrated halos have higher $v_{\rm max}$ (Klypin et al. 2011). They therefore use a parameterization from Mao et al. (2015) that includes both halo mass and concentration:

$$V_\alpha \equiv V_{\rm vir} \left( \frac{V_{\rm max}}{V_{\rm vir}} \right)^{\alpha} \tag{1}$$

where $V_{\rm max}$ is the maximal circular velocity of the halo and

$$V_{\rm vir} \equiv \left( \frac{G M_{\rm vir}}{R_{\rm vir}} \right)^{1/2} \tag{2}$$

They find that an $\alpha \sim$ … different environmental densities for SDSS observations and a subhalo abundance matching model applied to the Bolshoi-Planck simulation and find that the model predictions agree well with observations.

In this paper we will try to both physically motivate and improve the simplest matching scheme. We will use IllustrisTNG to determine whether complicating the simplest form of subhalo abundance matching, i.e. matching stellar mass to a single halo property, reduces scatter in the assignment of galaxies to halos. We first verify the halo property that produces the least scatter in the relation, $v_{\rm max}$, considering the central and satellite populations separately and selecting different mass ranges even in this initial step. We then provide a physical motivation for the parameters used in the optimal matching scheme. Then, similarly to Martizzi et al. (2020), we calculate how the scatter is reduced when we fold in a second halo feature.
However, unlike these recent works, we test a wide range of possible parameters and possible combinations thereof.

In Section 2 we describe our sample of galaxies and our methodology for testing and evaluating procedures for using matching techniques to predict galaxy properties given dark matter simulations. In Section 3 we try out the simplest one parameter schemes and show that the physically motivated focus on peak velocity dispersion is best for normal galaxies but that total halo mass remains best for first brightest, massive, central systems, as again would be expected from physical arguments. Section 4 broadens the treatment to include multiple variables, including environment, and then in Sections 5 & 6 we present an overall discussion of the results and our conclusions.

2. METHODS

2.1. IllustrisTNG
IllustrisTNG100 (public data release: Nelson et al. 2019) is part of a suite of cosmological simulations run using the AREPO moving mesh code (Springel 2010). TNG100 has a volume of $110.7^3\,{\rm Mpc}^3$ and a mass resolution of $7.\ldots \times 10^{\ldots}\,M_\odot$ and $1.\ldots \times 10^{\ldots}\,M_\odot$ for dark matter and baryons, respectively. The TNG suite implements upgraded subgrid models compared to the Illustris simulation (Vogelsberger et al. 2014; Genel et al. 2014); specifically, a modified black hole accretion and feedback model (Weinberger et al. 2017) and updated galactic winds (Pillepich et al. 2018). TNG also includes magnetohydrodynamics (Pakmor et al. 2011).

2.2. Galaxy Selection
We use galaxy populations from the IllustrisTNG100 simulation described above. We consider galaxies at the $z = 0$ … $\geq 10^{\ldots}\,M_\odot/h$ in the dark-matter only run (DMO) that are matched in the full hydrodynamical simulation with galaxies whose stellar mass is greater than $10^{\ldots}\,M_\odot/h$.

In detail, we first selected all galaxies in the DMO simulation with a dark matter mass greater than $10^{\ldots}\,M_\odot/h$. We then used the publicly available matching data to find the corresponding galaxy in the full hydro simulation. We use all galaxies identified with masses above $5 \times 10^{\ldots}\,M_\odot/h$, which clearly includes galaxies that are underresolved in the simulation. However, at our minimum dark matter halo mass, the lowest stellar mass of any galaxy in our sample is $7 \times 10^{\ldots}\,M_\odot/h$. In order to include only well-resolved galaxies that are more likely to be in an observational sample, our final analysis only includes galaxies with stellar masses above $10^{\ldots}\,M_\odot/h$.

2.3. Galaxy Environmental Measures
We use nearby galaxies to measure the local environment. We include all galaxies with dark matter masses above $10^{\ldots}\,M_\odot/h$ within 1 Mpc, 2 Mpc, 5 Mpc, 8 Mpc or 15 Mpc of each galaxy. In order to have a more physical measure of the local mass density, we summed the total mass of all these galaxies.

We were also able to separate galaxies into satellites or centrals using the GroupFirstSub identifier in the IllustrisTNG DMO simulation. This allowed us to perform our fits for the entire sample and for satellites and centrals separately, and, as we shall see, the two categories are significantly different in their properties. This gives us three samples: "all", "centrals" and "satellites". The fourth sample is labeled as "mix", which is the combined sample in which satellites and centrals are fit separately.

2.4. Concentration
We use three measures of the concentration. First, from Bose et al. (2019) we use:

$$c_v \equiv \frac{V_{\rm max}}{H_0 R_{\rm max}} \tag{3}$$

where $V_{\rm max}$ is the maximum velocity of the simulated rotation curve and $R_{\rm max}$ is the radius at which $V_c$ is maximal. Bose et al. (2019) show that this is equivalent to the concentration calculated using all the particles in a halo and assuming an NFW profile (see also Moliné et al. 2017).

Second, we use the ratio $c_h \equiv v_{\rm max}/V_{\rm halfmass}$, where $V_{\rm halfmass}$ is the circular velocity at the half mass radius of the dark matter halo, calculated as $\sqrt{G M_{\rm halfmass}/R_{\rm halfmass}}$. Finally, we use the ratio $c_R \equiv R_{\rm max}/R_{\rm halfmass}$.

2.5. Percent Error
We define the error as:

$$\mathrm{Error} \equiv \frac{\sum_N \left| \log_{10}\left( M_{\rm true}/M_{\rm prediction} \right) \right|}{N} \tag{4}$$

so that for small errors our definition is equivalent to 0.43 times the average fractional error.

3. RANK ORDERING

In this section we discuss using rank ordering to match halos and galaxies. Specifically, we first present a straightforward theoretical scaffold for selecting the halo property best suited for rank ordering. Then, using IllustrisTNG we confirm our derivation.

3.1. A Simple Theoretical Basis for Selecting the Ordering Halo Property
We can begin with an assumption that stellar mass is related to the baryonic mass scaled to the dark matter mass, corrected by the fraction of matter that cools and forms stars in the center of the halo:

$$M_* \propto M_{\rm peak} \frac{\Omega_b}{\Omega_d} \frac{t_{\rm form}}{t_{\rm cool,form}} \tag{5}$$

Here $t_{\rm form}$ is the formation time of the halo and $t_{\rm cool,form}$ is the time required for the baryons to cool and condense into a galaxy. This simply puts in the form of an equation the classical idea of "monolithic collapse" first proposed by Eggen et al. (1962).

We can relate the mean density of the galaxy to its mass and radius:

$$\rho_{\rm max} \equiv \frac{3 M_{\rm max}}{4 \pi r_{\rm max}^3} \tag{6}$$

in which $\rho_{\rm max}$, $M_{\rm max}$, and $r_{\rm max}$ are the density, mass, and radius at which the circular velocity reaches $v_{\rm max}$, where $v_{\rm max}^2 = G M_{\rm max}/r_{\rm max}$.

We can relate $t_{\rm form}$ to the halo density assuming standard gravitational collapse (Gunn & Gott 1972):

$$G \langle \rho \rangle \equiv t_{\rm form}^{-2} \tag{7}$$

We can also relate $t_{\rm cool,form}$ to density using energy conservation and the standard cooling equations:

$$\frac{k T_{\rm max}}{m} \equiv \frac{G M_{\rm max}}{r_{\rm max}} \tag{8}$$

The above equation defines $T_{\rm max}$. We subsequently can define $t_{\rm cool,form}$ as:

$$\Lambda(T_{\rm max})\, \rho_{\rm max}^2 \equiv \frac{\rho_{\rm max} k T_{\rm max}}{t_{\rm cool,form}} \tag{9}$$

Thus, $t_{\rm cool,form} \propto \rho_{\rm max}^{-1} f^{-1}$ where $f \equiv \Lambda(T_{\rm max})/T_{\rm max}$, and $t_{\rm form} \propto \rho_{\rm max}^{-1/2}$. If we use these relations in Equation 5, we find that

$$M_* \propto M_{\rm peak}\, \rho_{\rm max}^{1/2} f \propto \left( \frac{M_{\rm max}}{r_{\rm max}} \right)^{3/2} f \propto v_{\rm max}^3 f \tag{10}$$

Because for Bremsstrahlung cooling $f \propto T_{\rm max}^{1/2} \propto v_{\rm max}$, we complete the simplified derivation of the well-established Faber-Jackson relation (Faber and Jackson 1976):

$$M_* \propto v_{\rm max}^4 \tag{11}$$

We highlight that this derivation is based on the assumption of spherical collapse of the halo and pure radiative cooling. We do not consider any complicating processes that we know affect galaxies in the universe, such as mergers or feedback from star formation or AGN.
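The exponent bookkeeping behind this scaling can be written out step by step. The sketch below drops all numerical constants, assumes $M_{\rm peak} \propto M_{\rm max}$ (an assumption the argument needs but does not state explicitly), and takes the text's stated scaling $f \propto v_{\rm max}$ as given rather than deriving it:

```latex
\begin{align*}
G\rho_{\rm max} \sim t_{\rm form}^{-2}
  &\;\Rightarrow\; t_{\rm form} \propto \rho_{\rm max}^{-1/2}, \\
t_{\rm cool,form} = \frac{k T_{\rm max}}{\Lambda(T_{\rm max})\,\rho_{\rm max}}
  &\;\Rightarrow\; t_{\rm cool,form} \propto \rho_{\rm max}^{-1} f^{-1}, \\
\frac{t_{\rm form}}{t_{\rm cool,form}}
  &\propto \rho_{\rm max}^{1/2} f, \\
M_{\rm max}\,\rho_{\rm max}^{1/2}
  \propto M_{\rm max} \left( \frac{M_{\rm max}}{r_{\rm max}^{3}} \right)^{1/2}
  &= \left( \frac{M_{\rm max}}{r_{\rm max}} \right)^{3/2}
  \propto v_{\rm max}^{3}, \\
M_* \propto v_{\rm max}^{3} f
  &\propto v_{\rm max}^{4} \quad \text{for } f \propto v_{\rm max}.
\end{align*}
```

The last line shows that the final power of $v_{\rm max}$ is three plus whatever power of $v_{\rm max}$ the cooling factor $f$ contributes, so the slope is sensitive to the assumed cooling regime.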
In fact, we might expect this relation between $M_*$ and $v_{\rm max}$ to break down more often for higher mass galaxies, as they have been found to have later growth times where these assumptions clearly break down (Behroozi et al. 2013). For first brightest systems, sitting in massive halos from which they can accrete satellites, one would expect $M_{\rm peak}$ to be more relevant, and, as we have noted, both observations (e.g. van Dokkum et al. 2010) and LCDM theory argue that hierarchical accretion is the dominant process for first brightest galaxies.

Indeed, we can go a step farther and ask the basis for and the value of the transition mass above which "normal" growth of the stellar component from a cooling collapse becomes difficult. This was addressed in a paper by Rees and Ostriker (1977) (eqn 20) in an elementary discussion of the maximum mass of cosmic gas that can cool and collapse in a dynamical time. They did not include the important effects of dark matter in their treatment and obtained a mass of $\left[ \left( \frac{\hbar c}{G m_p^2} \right)^{\ldots} \left( \frac{e^2}{\hbar c} \right)^{\ldots} \left( \frac{m_p}{m_e} \right)^{\ldots} \right] m_p \sim 10^{\ldots}\,M_\odot$ in baryons. Had the effects of dark matter been included, the value of the baryonic transition mass would have been reduced somewhat, but the corresponding dark matter mass would have approximated $10^{\ldots}\,M_\odot$. In fact, the mass function of galaxies in the standard Press-Schechter parameterization declines exponentially above a certain critical mass, the stellar mass being roughly $10^{\ldots}\,M_\odot$ and the corresponding halo mass being roughly $10^{\ldots}\,M_\odot$. Consequently, we have both an observational and a physical basis for expecting that galaxies above some critical mass will grow primarily by accreting satellites and cannot be formed easily by a monolithic collapse. Thus, while matching based on $v_{\rm max}$ will be best for normal systems, we can expect that, for first brightest galaxies in massive clusters, $M_{\rm peak}$ should be the best metric.

3.2. Rank Ordering in IllustrisTNG
Here, we test these theoretical predictions using the IllustrisTNG simulation. As described in Section 2.2, we use a sample of galaxies with dark matter mass greater than $10^{\ldots}\,M_\odot/h$ in the DMO simulation and stellar masses above $10^{\ldots}\,M_\odot/h$ in the full hydrodynamical simulation.

We rank-ordered our selected galaxies by total mass and the stellar mass separately for each of our samples: "all" (11927), "satellites" (2337), and "centrals" (9590). We have used three simple proxies for dark matter halo mass in our ranking schemes: the current dark matter mass, $M_{\rm DM}$, the peak dark matter mass, $M_{\rm peak}$, and the current $v_{\rm max}$. These are shown in order from the top to bottom panels in Figure 1. We show the total mass proxy and stellar mass for each of our galaxies as orange "o". The blue lines show the predicted stellar mass using the rank-ordering method for each of our samples.

We see that the scatter decreases as we move from using $M_{\rm DM}$ to $M_{\rm peak}$, although the $M_* \propto M_{\rm halo}^{3/\ldots}$ relation for higher masses holds for both variables. We also highlight that the rank-order line for satellite galaxies is much closer to that for centrals when we use $M_{\rm peak}$ than when we use $M_{\rm DM}$. The scatter continues to decrease when we use $v_{\rm max}$ as our dark matter halo mass proxy, particularly at lower masses (lower $v_{\rm max}$). To guide the eye we have overplotted simple power-law relations between $M_*$ and $v_{\rm max}$.

The results shown in Figure 1 are quantified using the percent error as described in Section 2.5 (eqn 4), with the results shown in Table 1. We see that while using the current $M_{\rm DM}$ in the matching scheme is reasonably accurate for central galaxies, it is much less accurate for satellite systems. Therefore, we also consider the peak mass of the halo, $M_{\rm peak}$. Using $M_{\rm peak}$ should correct for mass loss from satellite galaxies due to tidal stripping.
Because dark matter is distributed to a larger radius than the stars in a galaxy, it will be more strongly stripped.

Therefore, while we expect $M_{\rm peak}$ to be very similar to $M_{\rm DM}$ for central galaxies, it can vary by a considerable amount for satellites. Indeed, we see in Table 1 that the improvement for centrals is very small when using $M_{\rm peak}$ rather than $M_{\rm DM}$, but it is dramatic for satellite galaxies. We also consider $v_{\rm max}$, as tidal stripping is found to have little effect on this property, likely because the maximum rotational velocity is reached at relatively low radii. Using $v_{\rm max}$ for our variable we find that the error for central galaxies has improved by more than 15%, although the correlation between $v_{\rm max}$ and stellar mass for satellites is somewhat weaker than the correlation between $M_{\rm peak}$ and stellar mass. However, because most of our galaxies are centrals, $v_{\rm max}$ remains the best single variable for rank-ordering our galaxy sample.

We stress that, because of the shape of the mass function, any relation between stellar mass and halo mass will be dominated by the lowest-mass galaxies. Therefore, we also consider separately only galaxies whose mass in the DMO simulation is greater than $10^{\ldots}\,M_\odot/h$ in order to remove the bulk of low mass galaxies while still retaining a sample with $\sim$ … $M_{\rm DM}$ is reasonably accurate for central galaxies, but much less so for satellite systems. Again, we find a large improvement in the ranking scheme for satellite galaxies using $M_{\rm peak}$.

However, unlike in the full sample, $v_{\rm max}$ is the worst ranking variable for central galaxies with halo masses above $10^{\ldots}\,M_\odot/h$. This agrees well with our theoretical argument that at large masses $M_{\rm peak}$ will be the best ranking variable due to merging.

Figure 1. The stellar mass of galaxies versus possible variables to use for ranking. Blue lines show the predicted stellar mass using the rank-ordering method for each of our samples. Top panel: Ranking using the current dark matter halo mass has the most scatter. Middle panel: Using $M_{\rm peak}$ for ranking reduces scatter, and the rank ordering predictions for satellite and central galaxies are much closer. Bottom panel: Ranking using $v_{\rm max}$ reduces scatter dramatically, particularly for lower mass (lower $v_{\rm max}$) halos, as quantified in Table 1.

With this empirical support for the trends predicted in our model, we also develop a straightforward variable, using the physical intuition from above, that $v_{\rm max}$ will be the best ranking variable for low mass galaxies and $M_{\rm peak}$ will be the best ranking variable for high mass galaxies (Section 3.1). For this variable we normalize both $v_{\rm max}$ and $M_{\rm peak}$ to their values at a "pivot mass" of $M_{\rm peak} = 10^{\ldots}\,M_\odot$. We call these variables $v_{\rm norm} \equiv v_{\rm max}/v_{\rm max,\ldots}$ and $m_{\rm norm} \equiv M_{\rm peak}/10^{\ldots}$. We then rank order our galaxies using the parameter based on these normalized values:

$$\phi \equiv v_{\rm norm} + m_{\rm norm} \tag{12}$$

Using this parameter, low mass galaxies depend more strongly on $v_{\rm max}$, while high mass galaxies depend on $M_{\rm peak}$. Both the exact value of the pivot mass and the powers of $v_{\rm norm}$ and $m_{\rm norm}$ were selected to minimize error while fleshing out our theoretical scaffold.

As shown in Table 1, using this parameter $\phi$ gives some improvement on the fit to the central galaxies in our sample, and dramatically reduces the error for the satellite galaxies. Using this variable for the mix of all galaxies reduces the error by a substantial 33% when compared with rank ordering by $M_{\rm DM}$.

4. USING SECONDARY VARIABLES TO IMPROVE RANK ORDERING

We now attempt to minimize the scatter in the $\phi$-$M_*$ relation using other features of dark matter halos. These features are listed in Table 2. We have roughly grouped the halo properties into those related to the halo mass ($M_{\rm DM}$, $M_{\rm peak}$, $v_{\rm disp} \equiv$ dark matter velocity dispersion, and $v_{\rm max}$), size ($r_{\rm max} \equiv$ $v_{\rm max}$ radius and $r_{\rm DM} \equiv$ dark matter half mass radius), shape (concentration using the three methods described in Section 2.4), formation time (the lookback time to $M_{\rm peak}$, to when the halo reaches 50% of its $z =$ …

4.1. Method of Correction
We first plot our feature as a function of our best single variable $\phi$, and find the running median of the feature using a window size of 50 galaxies. We have tested using other window sizes (25 and 100 galaxies) and find similar results. The top panel of Figure 2 shows this plot using the environmental density $M_{{\rm DM},\,r<2\,{\rm Mpc}}$ variable. Clearly there is a trend of increasing environmental density as a function of $\phi$, and it differs for satellites and centrals.

Because we use the rolling median, we need to remove the first and last 25 values, so we are left with an "all" sample of 11877, a "centrals" sample of 9540, and a "satellites" sample of 2287 galaxies. Removing these galaxies has little impact on the percent errors using the rank ordering method for each sample (a change of less than 1%).

We then plot $M_{\rm true}/M_{\rm rank}$ as a function of $\Delta\log({\rm feature})$, which is the difference between the $\log({\rm feature})$ for each dark matter halo and the $\log({\rm feature}_{\rm rolling\,median})$ found at each $\phi$. In the bottom panel of Figure 2 we show how $M_{\rm true}/M_{\rm rank}$ is related to the scatter in $M_{{\rm DM},\,r<2\,{\rm Mpc}}$. This relation is fit using first, second and third order polynomial fits. Finally, we correct our prediction for the stellar mass using our chosen fit as below (the second order polynomial fit shown tends to give the best results):

$$\log(M_{*,{\rm pred}}) = \log(M_{*,{\rm rank}}) + \alpha\,[\Delta\log({\rm feature})]^2 + \beta\,\Delta\log({\rm feature}) + \gamma \tag{13}$$

Finally, we calculate the percent error of the new prediction. This value for each feature and galaxy population is shown in Table 2.

Figure 2. The top and bottom panels show the first and second steps used to include a secondary halo feature to reduce the scatter in rank ordering halos (Section 4.1). Here we use the $M_{{\rm DM},\,r<2\,{\rm Mpc}}$ environmental measure, written as M2Mpc. Top: First we plot this variable as a function of $\phi$, our rank-ordering variable. Here we show the total sample ("all") as well as the centrals and satellites ("sats") separately. The points are color-coded as centrals and satellites. Bottom: Using the scatter from a rolling median, we find that the ratio of the true stellar mass of the galaxy to the rank-ordered assigned mass has a dependence on $M_{{\rm DM},\,r<2\,{\rm Mpc}}$. The points are not color-coded for satellites and centrals, as for the "all" fit we use all of the galaxies in the sample. We can then correct our stellar mass using this dependence. Notice that the rolling median for the total sample is similar to that for the separated satellite and central samples.

Table 1. … $M_{\rm DM}$ (at $z = 0$), $M_{\rm peak}$, $v_{\rm max}$ (at $z = 0$), $\phi \equiv v_{\rm norm} + m_{\rm norm}$ (eqn 12) for the dark matter mass for the different samples (a fit to all galaxies, only centrals, only satellites, and mix of all galaxies fitting the centrals and satellites separately). We use a galaxy sample with dark matter mass in the DMO simulation greater than $10^{\ldots}\,M_\odot/h$ that is matched to any galaxy in the hydrodynamical run with stellar mass greater than $10^{\ldots}\,M_\odot/h$. The final column shows the percent improvement of ranking by the selected variable compared to $M_{\rm DM}$ (at $z = 0$) on the "mix" sample. We see that ranking using the single variable $\phi$ reduces the error 34% compared to matching by $M_{\rm DM}$ (at $z = 0$).
Columns (Galaxy Sample): All, Centrals, Satellites, Mix, % Improvement
Number of galaxies ($M_{\rm DM} > 10^{\ldots}\,M_\odot$): 11927, 9590, 2337, 11927, 11927
Rows (Rank Ordering using): $M_{\rm DM}$; $M_{\rm peak}$; $v_{\rm max}$; $\phi \equiv v_{\rm norm} + m_{\rm norm}$

Table 2. … $\phi$ ranking method plus the listed corrections. We use the galaxy sample with dark matter mass greater than $10^{\ldots}\,M_\odot/h$ that is matched to any galaxy in the hydro run with stellar mass greater than $10^{\ldots}\,M_\odot/h$. Note that $M_{{\rm DM},\,r<X\,{\rm Mpc}}$ is the mass of all halos within that radius from the DMO simulation with $M_{\rm DM} > 10^{\ldots}\,M_\odot$, including the mass of the halo from which the measurement originates. All halo properties are measured using the DMO simulation. Here the final column shows the percent improvement on the "mix" sample of using the correction variable in addition to rank-ordering by $\phi$ in comparison to only rank-ordering by $\phi$.
Columns (Galaxy Sample): All, Centrals, Satellites, Mix, % Improvement
Number of galaxies ($M_{\rm DM} > 10^{\ldots}\,M_\odot$): 11927, 9590, 2337, 11927, 11927
Rows: $\phi + v_{\rm disp}$; $\phi + v_{\rm max}$; $\phi + M_{\rm DM}$; $\phi + M_{\rm peak}$; $\phi + r_{\rm max}$; $\phi + r_{\rm DM}$; $\phi + c_v$; $\phi + c_h$; $\phi + c_R$; $\phi + t_{\rm peak}$; $\phi + t_{\ldots}$; $\phi + t_{\ldots}$; $\phi + M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}$ (five rows); $\phi + M_{{\rm DM},\,r<\ldots\,{\rm Mpc}} + t_{\ldots}$

4.2. Results
All of our quantitative results are shown in Table 2. The most glaring result is that most corrections do not result in a large improvement of the percent error from ranking using $\phi$. Using random resampling of 70% of our data sets ("all", "central", and "satellites") 60 times, we find a distribution of errors with means matching the values listed for $\phi$ of the complete sample in Table 1, and standard deviations of 0.001, 0.001, 0.002, and 0.0009 for the "all", "centrals", "satellites" and "mix" samples, respectively. With this in mind we can look more closely at the improvement when adding a second feature to our matching scheme.

In more detail, it is not surprising that none of the halo features describing halo mass improves the fit to central galaxies at all. These are well fit by our $\phi$ variable. However, interestingly, the error is reduced for the satellite sample when we include an $M_{\rm DM}$ correction. This may be because we have largely ignored satellite galaxy evolution by choosing $v_{\rm max}$ and $M_{\rm peak}$ as the components of $\phi$. Including $M_{\rm DM}$ may start to include the later evolution of these galaxies.

We find universally small improvement when considering our variables describing halo size ($r_{\rm max}$ and $r_{\rm DM}$) and shape (concentration).

Interestingly, there is some improvement in the error when folding formation time into the stellar mass estimate. For example, $t_{\ldots}$ is the halo feature that results in the smallest percent errors across all of our samples: "all" galaxies, "centrals", "satellites", and the "mix" sample.

Finally, using environment to correct for the stellar mass also has a small impact on the overall error. Despite this, we note that including the mass from galaxies within 2-5 Mpc seems to produce a slightly better correction than smaller or larger environment windows.

4.2.1. Correcting Using A Combination of Environment and Formation Time
Finally, we use our fits for each of our strongest individual corrections, $t_{\ldots}$ and $M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}$, to create a combined correction on the rank-ordering technique:

$$\log(M_{*,{\rm pred}}) = \log(M_{*,{\rm rank}}) + \left( \alpha_M\,[\Delta\log(M_{{\rm DM},\,r<\ldots\,{\rm Mpc}})]^2 + \beta_M\,\Delta\log(M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}) \right) + \left( \alpha_t\,[\Delta\log(t_{\ldots})]^2 + \beta_t\,\Delta\log(t_{\ldots}) \right) + \gamma \tag{14}$$

We use the curve_fit routine in scipy.optimize to perform a least-squares fit to the above equation, and find that we can reduce the error using both $M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}$ and $t_{\ldots}$, as shown in the final line of Table 2.

4.3. Verifying our Results
Here we use two methods to verify our results on the improvement using multiple halo features to determine stellar mass.

4.3.1. Random Forest Regression
Now that we have gained insight into the level of improvement that can be gained by using more than one feature of dark matter halos in the abundance matching technique, we turn to machine learning to provide an independent check of our modeling and ranking scheme.

Using Random Forest Regression (RFR) allows us to rank the features according to their effect on the model output, and has the additional benefit of expanding the space of available models beyond polynomial fitting. For this work we use scikit-learn (Pedregosa et al. 2011).

First, we are able to reproduce the percent error on the entire sample using only our defined $\phi$ feature (0.111), and in a two-feature setting where we add a central/satellite galaxy label (0.105). We check the rest of our ranking parameters from Table 1 and verify that $\phi$ produces the best ranking variable to match DM halos to galaxies. Also, we confirm that our $\phi$ variable produces lower error values than the combination of $v_{\rm max}$ and $M_{\rm peak}$.

We also use our four selected halo features that we found produced the best match between the DMO and hydrodynamical simulations: $\phi$, $M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}$, $t_{\ldots}$, and the central/satellite label. Using an optimized RF regressor, the expected test set error is 0.098 with a standard deviation of 0.0013, quite similar to the percent error we find ranking the satellites and centrals separately using $\phi$ and applying our analytic correction using $M_{{\rm DM},\,r<\ldots\,{\rm Mpc}}$ and $t_{\ldots}$. This is reassuring because it shows that our results are only very mildly dependent on the modeling assumptions.

We can also include all the features and use a parameter optimization technique to find the minimum possible error of a Random Forest Regression. We find a minimum error of 0.092 using eight randomly selected features, creating 100 trees (n_estimators) with a maximum depth of 14 branches (max_depth). However, there are more than 30 combinations within one standard deviation (0.0015), including one using only 4 features.
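A hyperparameter search of this kind can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins rather than IllustrisTNG halos, the three feature columns and their coefficients are hypothetical, and the small grid is chosen only to keep the example fast. Scoring with the mean absolute error on $\log_{10} M_*$ corresponds to the error measure of Equation 4.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
n_halos = 400

# Synthetic stand-ins for halo features (hypothetical, not TNG data).
phi = rng.uniform(0.5, 3.0, n_halos)         # ranking variable
delta_env = rng.normal(0.0, 1.0, n_halos)    # environment offset
delta_tform = rng.normal(0.0, 1.0, n_halos)  # formation-time offset

# Toy target: log10 stellar mass dominated by phi, plus small
# secondary dependences and scatter.
log_mstar = (10.0 + 1.5 * np.log10(phi) + 0.05 * delta_env
             + 0.05 * delta_tform + rng.normal(0.0, 0.1, n_halos))

X = np.column_stack([phi, delta_env, delta_tform])
X_train, X_test, y_train, y_test = train_test_split(
    X, log_mstar, test_size=0.2, random_state=0)

# Grid search over forest size, tree depth, and features per split.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100],
                "max_depth": [4, 8],
                "max_features": [1, 3]},
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X_train, y_train)

# Mean |log10(M_true / M_pred)| on the held-out set (cf. Equation 4).
test_error = float(np.mean(np.abs(search.predict(X_test) - y_test)))
# Relative importance of each feature in the best forest.
importances = search.best_estimator_.feature_importances_
```

In this toy setup `importances` ranks the ranking variable far above the two secondary features, which mirrors the kind of feature ranking the text describes; `search.cv_results_` can be inspected to see how many parameter combinations perform within the scatter of the best one.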
We can conclude that there may be many similarly relevant predictors in our feature list. This supports our analytic reasoning that several of our halo features are reasonable proxies for halo mass, and we have already noted that our other halo features can be separated into only a few types of variables (halo size, concentration, formation time, and environment).

Indeed, if we optimize the RFR including one feature of each type we can reach an error of 0.095 (φ, M_DM,r<Mpc, t, r_DM, and c_v). This is within two standard deviations of the error obtained with the four halo parameters we use in our analytic model, and so does not indicate a dramatic improvement.

Comparing our results to the errors found using the RFR machine learning technique gives us assurance that our analytic method for including extra halo features is reasonable, and that our conclusions are not strongly model-dependent. While continuing to add features can reduce the error on the matching scheme, we do not find other clear DM halo features that dramatically improve upon our analytic method.

4.3.2. Cross-Validation
In order to obtain another view on whether increasing the number of halo features improves our estimate of stellar mass we can use cross-validation. This can be used to determine how meaningful our derived improvements are when used to predict the stellar mass of galaxies. Cross-validation is specifically designed to trade off over- and under-fitting to give the highest prediction accuracy. For this, we randomly select 80% of our sample as our training set, on which we perform the fitting processes as described. We use the remaining 20% as our test set to determine if the percent error on the stellar mass prediction improves when including more features. Specifically, we select 80% of our total sample for the "all" fits, and then 80% of the central and satellite samples, in order to determine the "central", "satellite", and "mix" fits.

We performed this cross-validation routine ten times using ten different random subsets of the data, and in every case find improvement in both the training and test sets when using M_DM,r<Mpc, t, or their combination. In Table 3, we list the median percent error values for the ten sets of training and test samples. We can conclude that we have not yet overfit using these halo features, and our improvement in predicting stellar masses from halo masses is real, albeit small.

[Table 3. Median percent error for the training and test samples. Columns: All, Centrals, Satellites, Mix. Rows: φ; φ + M_DM,r<Mpc; φ + t; φ + M_DM,r<Mpc + t.]

DISCUSSION

What have we learned from this exercise? The zeroth order conclusion is that a matching scheme based on the maximum velocity in a dark matter halo is a good single predictor of the final stellar mass for normal galaxies, whether they are central galaxies or satellites. The typical error in the prediction (in the IllustrisTNG100 simulations) is 11.6 percent in log(M∗) compared to 19.8 percent using M_DM, and the dependence of stellar mass on v_max is unsurprisingly log (M∗) ∼ (3.8 ± v max ) (using bootstrap resampling with 70% of the dataset).
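A slope estimate of this kind can be sketched with a simple bootstrap. The example below is illustrative only: the data are synthetic (generated with the slope of 3.8 quoted in the text), the 70% subsamples are assumed to be drawn with replacement, and the number of draws is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the galaxy sample (NOT IllustrisTNG data).
n_gal = 2000
log_vmax = rng.uniform(1.8, 2.6, n_gal)
true_slope = 3.8  # slope quoted in the text, used only to build the toy data
log_mstar = 1.0 + true_slope * log_vmax + rng.normal(0.0, 0.15, n_gal)


def bootstrap_slope(x, y, frac=0.7, n_draws=200, rng=rng):
    """Fit y = a + b*x on repeated random subsamples; return mean and std of b.

    Each draw uses frac * len(x) points sampled with replacement
    (an assumption; the resampling details are not given in the text).
    """
    n_sub = int(frac * len(x))
    slopes = np.empty(n_draws)
    for i in range(n_draws):
        idx = rng.choice(len(x), size=n_sub, replace=True)
        slopes[i] = np.polyfit(x[idx], y[idx], 1)[0]  # [0] is the slope
    return slopes.mean(), slopes.std(ddof=1)


slope_mean, slope_std = bootstrap_slope(log_vmax, log_mstar)
print(f"slope = {slope_mean:.2f} +/- {slope_std:.3f}")
```

The scatter of the fitted slope across draws serves as the uncertainty on the power-law index.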
This result is just what one would have expected from the simplest physical argument that estimates the amount of gas that can be turned into stars in the standard Gunn and Gott (1972) collapse of a dark matter halo.

But, for high mass systems comparable to the first brightest galaxies in clusters living in halos more massive than 10 . M⊙, the accretion of satellite systems will significantly increase the stellar mass and the most relevant halo parameter is simply the peak dark matter mass, M_peak. Using a single variable, φ (eqn 12), which incorporates both features reduces the error to 10.5% when satellites and centrals are ranked separately.

These prescriptions should be easy to implement and can replace the simplest, halo mass based initial matching schemes when estimating the expected galaxy stellar masses given a dark matter simulation.

If one wants to go farther and improve the best zeroth order scheme by first order corrections then we have found that a roughly 6% improvement is possible. Interestingly, the environmental measures that we considered did not lead to significant improvement even in satellite galaxies, and the best single variable for improvement was t, the time at which a halo reached 85% of its peak mass.

However, an almost mindless combination of the two variables (v_max, M_peak) worked best. We further found that a simple linear combination based on these two variables enables predictions to a typical accuracy of 10.5 percent error in log(M∗).

TESTS

All of these results are based on simulated data and it is important to test them in the real world. We have been able to think of two tests that might be applied to help determine whether the proposed matching scheme provides a significant improvement over the simplest matching scheme.
First one constructs a standard LCDM, dark matter only simulation and, using a standard halo finding algorithm, makes a catalog of dark matter halos, labeling each of them with the final dark matter mass, M_DM, the peak dark matter mass, M_peak, over the history of the halo, and the current halo maximum circular velocity, v_max. Then, to test the classic matching scheme (as has been done before; Conroy et al. 2006), one takes a representative volume and rank orders the halos by M_DM, takes catalog values for a comparable volume (from, say, the Sloan Digital Sky Survey) and rank orders the observed galaxies by (for example) g or r magnitudes, and then identifies each DM halo with the real galaxy of matching rank. This gives one an artificial catalog of galaxies each tagged with a position, a velocity and a g or r optical magnitude.

Then one would "observe" this synthesized catalog and construct two spatial distribution functions, a galaxy-galaxy spatial correlation function (Conroy et al. 2006; Hearin et al. 2013; Hearin and Watson 2013; Reddick et al. 2013) and a void distribution function (e.g. Walsh and Tinker 2019). These could then be compared to the known galaxy-galaxy spatial correlation functions and the known void distribution functions, with both one-parameter functions specified as a function of magnitude. The magnitude distribution itself is of course correct by construction. Then comparing, say, the autocorrelation length as a function of galaxy magnitude between the real and synthesized data sets allows one to determine the fractional error as a function of galaxy magnitude.

Then one would go back to the original DM halo catalog and, using (M_peak, v_max), construct for each halo the value of φ = (v_max / v) + (M_peak / M) (equation 13), where (v, M) are the values of (v_max, M_peak) for the average halo of mass 10 . M⊙.
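The rank-order matching used in both passes of this test can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `rank_match` and `phi_metric` are ours, the three-halo catalog is invented, and the reference values `v_ref` and `m_ref` are hypothetical placeholders rather than the values adopted in the paper.

```python
import numpy as np


def rank_match(halo_metric, galaxy_magnitudes):
    """Give the k-th brightest galaxy magnitude to the k-th ranked halo.

    Halos are ranked by descending metric; galaxies by ascending
    magnitude (more negative magnitude = brighter).
    """
    halo_metric = np.asarray(halo_metric, dtype=float)
    mags = np.asarray(galaxy_magnitudes, dtype=float)
    if halo_metric.shape != mags.shape:
        raise ValueError("need one galaxy per halo")
    halo_order = np.argsort(-halo_metric)  # indices of halos, best rank first
    assigned = np.empty_like(mags)
    assigned[halo_order] = np.sort(mags)   # brightest magnitude first
    return assigned


def phi_metric(v_max, m_peak, v_ref, m_ref):
    """Combined ranking variable: phi = v_max / v_ref + m_peak / m_ref."""
    return np.asarray(v_max) / v_ref + np.asarray(m_peak) / m_ref


# Invented toy catalog: three halos and three observed magnitudes.
m_dm = np.array([2.0e12, 1.5e12, 1.0e11])    # final DM mass [Msun]
m_peak = np.array([2.0e12, 1.5e12, 1.0e11])  # peak DM mass [Msun]
v_max = np.array([200.0, 280.0, 90.0])       # max circular velocity [km/s]
observed_mags = np.array([-21.0, -22.0, -18.5])

# Pass 1: classic matching on final dark matter mass.
mags_by_mass = rank_match(m_dm, observed_mags)

# Pass 2: matching on phi, with hypothetical reference values.
phi = phi_metric(v_max, m_peak, v_ref=250.0, m_ref=5.0e12)
mags_by_phi = rank_match(phi, observed_mags)

print(mags_by_mass)  # halo 0 is the most massive, so it gets the brightest galaxy
print(mags_by_phi)   # halo 1 has the highest phi, so the top two swap
```

Because both passes reuse the same observed magnitudes, the magnitude distribution is identical by construction and only the assignment of galaxies to positions changes.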
Now, with each halo tagged with its value of φ, one can rank order the synthetic sample by φ and attach visual magnitudes to each galaxy by the same method as was done using M_DM. Now one has a new catalog to observe with respect to spatial distribution metrics and can again find the fractional error in, for example, the spatial autocorrelation length as a function of visual brightness, and compute the error by comparing to real observed data.

This procedure would give us a quantitative estimate as to how well the matching scheme was working compared to both reality and the previous simpler matching scheme, which has had considerable success. And, unlike the exercises in this paper, the tests would not be dependent on the accuracy of our current galaxy formation algorithms, which, while well tested, suffer from the "confirmation bias" inevitable when uncertain modelling parameters are adjusted to fit observations. We look forward to pursuing these independent tests in future work.

CONCLUSIONS

In this paper we have examined schemes to populate a synthesized dark matter only set of cosmological simulations with galaxies to see if we could devise a simple and accurate scheme. We took as our starting point a matching scheme (Vale and Ostriker 2004, 2006, 2008) which, while almost naively simple, has had some success. In that scheme, one rank orders DM halos by final mass and rank orders real galaxies in a similar cosmic volume by luminosity and attaches to the kth ranked halo the kth ranked galaxy. Table 1 presents the results of our one-parameter efforts, which we compared to the computed luminosities in the IllustrisTNG simulated galaxy catalog. Table 2 summarizes our results with two-parameter fits where we used combinations of velocity dispersion, mass and environmental density.
We did not find that adding an environmental variable produced a significant improvement over simpler schemes, nor did we find that any of the two-parameter fits that we investigated were statistically significantly superior to the one-parameter fits. What we did discover was that a new single variable, φ (cf. equation 12), which combines information from both mass and velocity variables, provides a quite significant improvement over the basic ranking scheme using final dark matter mass, the error being reduced by about 33 percent. In our examination of the physical basis for the success of this new variable we examined simple arguments starting with the nearly half-century-old paper by Rees and Ostriker (1977).

There is a critical mass for galaxies: the mass above which gas cannot cool by normal radiative processes in roughly a free fall time. That mass corresponds roughly to 10 . M⊙, which we designate as M1. Below this mass there is a simple analytic argument that asks if a gaseous object can cool in its own free fall time and (assuming bremsstrahlung cooling) we noted that this condition is equivalent to the Faber-Jackson relation, i.e. that M∗ ∼ v_max^4. For masses above M1 growth only occurs by accretion of satellites, and that is proportional to M. So, we designed a metric, φ, which is dominated by velocity for low mass objects and dominated by mass for objects more massive than M1. This single variable, based on the physical motivation given above, seems to provide a matching scheme superior to the others which we have tested. We did try other combinations of (M_peak, v_max) and found none superior to the simple variable, φ, that we had tested.
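The change of regime in φ can be made explicit with a short worked scaling. This is a sketch under an assumption that is not taken from the text: halos follow a scatter-free virial scaling M_peak ∝ v_max^3, and v_1 is our label for the v_max of a halo with M_peak = M1.

```latex
% Assumption (ours, for illustration): M_peak / M_1 = (v_max / v_1)^3 with no scatter.
\begin{align}
  x &\equiv \frac{v_{\max}}{v_1}, \qquad
  \phi = \frac{v_{\max}}{v_1} + \frac{M_{\rm peak}}{M_1} = x + x^{3}, \\
  x \ll 1 &: \quad \phi \simeq x = \frac{v_{\max}}{v_1}
      \quad \text{(velocity term dominates)}, \\
  x \gg 1 &: \quad \phi \simeq x^{3} = \frac{M_{\rm peak}}{M_1}
      \quad \text{(mass term dominates)}, \\
  x = 1 &: \quad \phi = 2
      \quad \text{(the two terms contribute equally at } M_{\rm peak} = M_1\text{)}.
\end{align}
```

In this idealized case φ is monotonic in x, so it would rank halos exactly as v_max or M_peak alone would. The ranking by φ differs from either only because real halos scatter about the mean relation, and φ then weights whichever quantity is the more relevant one on each side of M1.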
So our bottom line is that the variable, φ (eq 12), is the best single variable to use in predicting the stellar mass of galaxies, given their halo properties.

ACKNOWLEDGMENTS

ST would like to thank Claire Kopenhafer and Tjitske Starkenburg for their help and scripts in reading in and analyzing TNG outputs, Viviana Aquaviva for her Machine Learning class and comments on the draft, and Dan Foreman-Mackey for discussions and comments on cross-validation. ST gratefully acknowledges support from the Center for Computational Astrophysics at the Flatiron Institute, which is supported by the Simons Foundation. The data used in this work were hosted on facilities supported by the Scientific Computing Core at the Flatiron Institute, a division of the Simons Foundation.

REFERENCES

Behroozi, P. S., Wechsler, R. H., and Conroy, C. (2013). On the Lack of Evolution in Galaxy Star Formation Efficiency. ApJL, 762(2):L31.
Berrier, J. C., Bullock, J. S., Barton, E. J., Guenther, H. D., Zentner, A. R., and Wechsler, R. H. (2006). Close Galaxy Counts as a Probe of Hierarchical Structure Formation. ApJ, 652(1):56–70.
Blank, M., Macciò, A. V., Dutton, A. A., and Obreja, A. (2019). NIHAO - XXII. Introducing black hole formation, accretion, and feedback into the NIHAO simulation suite. MNRAS, 487(4):5476–5489.
Bose, S., Eisenstein, D. J., Hernquist, L., Pillepich, A., Nelson, D., Marinacci, F., Springel, V., and Vogelsberger, M. (2019). Revealing the galaxy-halo connection in IllustrisTNG. MNRAS, 490(4):5693–5711.
Chaves-Montero, J., Angulo, R. E., Schaye, J., Schaller, M., Crain, R. A., Furlong, M., and Theuns, T. (2016). Subhalo abundance matching and assembly bias in the EAGLE simulation. MNRAS, 460(3):3100–3118.
Conroy, C., Wechsler, R. H., and Kravtsov, A. V. (2006). Modeling Luminosity-dependent Galaxy Clustering through Cosmic Time. ApJ, 647(1):201–214.
Crain, R. A., Schaye, J., Bower, R. G., Furlong, M., Schaller, M., Theuns, T., Dalla Vecchia, C., Frenk, C. S., McCarthy, I. G., Helly, J. C., Jenkins, A., Rosas-Guevara, Y. M., White, S. D. M., and Trayford, J. W. (2015). The EAGLE simulations of galaxy formation: calibration of subgrid physics and model variations. MNRAS, 450(2):1937–1961.
Davé, R., Thompson, R., and Hopkins, P. F. (2016). MUFASA: galaxy formation simulations with meshless hydrodynamics. MNRAS, 462(3):3265–3284.
Dragomir, R., Rodríguez-Puebla, A., Primack, J. R., and Lee, C. T. (2018). Does the galaxy-halo connection vary with environment? MNRAS, 476(1):741–758.
Eggen, O. J., Lynden-Bell, D., and Sandage, A. R. (1962). Evidence from the motions of old stars that the Galaxy collapsed. ApJ, 136:748.
Faber, S. M. and Jackson, R. E. (1976). Velocity dispersions and mass-to-light ratios for elliptical galaxies. ApJ, 204:668–683.
Genel, S., Vogelsberger, M., Springel, V., Sijacki, D., Nelson, D., Snyder, G., Rodriguez-Gomez, V., Torrey, P., and Hernquist, L. (2014). Introducing the Illustris project: the evolution of galaxy populations across cosmic time. MNRAS, 445(1):175–200.
Gunn, J. E. and Gott, J. R., III (1972). On the Infall of Matter Into Clusters of Galaxies and Some Effects on Their Evolution. ApJ, 176:1.
Guo, Q., White, S., Li, C., and Boylan-Kolchin, M. (2010). How do galaxies populate dark matter haloes? MNRAS, 404(3):1111–1120.
He, J.-h. (2020). Modelling the tightest relation between galaxy properties and dark matter halo properties from hydrodynamical simulations of galaxy formation. MNRAS, 493(3):4453–4462.
Hearin, A. P. and Watson, D. F. (2013). The dark side of galaxy colour. MNRAS, 435(2):1313–1324.
Hearin, A. P., Zentner, A. R., Berlind, A. A., and Newman, J. A. (2013). SHAM beyond clustering: new tests of galaxy-halo abundance matching with galaxy groups. MNRAS, 433(1):659–680.
Hopkins, P. F., Wetzel, A., Kereš, D., Faucher-Giguère, C.-A., Quataert, E., Boylan-Kolchin, M., Murray, N., Hayward, C. C., and El-Badry, K. (2018a). How to model supernovae in simulations of star and galaxy formation. MNRAS, 477(2):1578–1603.
Hopkins, P. F., Wetzel, A., Kereš, D., Faucher-Giguère, C.-A., Quataert, E., Boylan-Kolchin, M., Murray, N., Hayward, C. C., Garrison-Kimmel, S., Hummels, C., Feldmann, R., Torrey, P., Ma, X., Anglés-Alcázar, D., Su, K.-Y., Orr, M., Schmitz, D., Escala, I., Sanderson, R., Grudić, M. Y., Hafen, Z., Kim, J.-H., Fitts, A., Bullock, J. S., Wheeler, C., Chan, T. K., Elbert, O. D., and Narayanan, D. (2018b). FIRE-2 simulations: physics versus numerics in galaxy formation. MNRAS, 480(1):800–863.
Klypin, A. A., Trujillo-Gomez, S., and Primack, J. (2011). Dark Matter Halos in the Standard Cosmological Model: Results from the Bolshoi Simulation. ApJ, 740(2):102.
Kravtsov, A. V., Berlind, A. A., Wechsler, R. H., Klypin, A. A., Gottlöber, S., Allgood, B. o., and Primack, J. R. (2004). The Dark Side of the Halo Occupation Distribution. ApJ, 609(1):35–49.
Lehmann, B. V., Mao, Y.-Y., Becker, M. R., Skillman, S. W., and Wechsler, R. H. (2017). The Concentration Dependence of the Galaxy-Halo Connection: Modeling Assembly Bias with Abundance Matching. ApJ, 834(1):37.
Mao, Y.-Y., Williamson, M., and Wechsler, R. H. (2015). The Dependence of Subhalo Abundance on Halo Concentration. ApJ, 810(1):21.
Marín, F. A., Wechsler, R. H., Frieman, J. A., and Nichol, R. C. (2008). Modeling the Galaxy Three-Point Correlation Function. ApJ, 672(2):849–860.
Martizzi, D., Vogelsberger, M., Torrey, P., Pillepich, A., Hansen, S. H., Marinacci, F., and Hernquist, L. (2020). Baryons in the Cosmic Web of IllustrisTNG - II. The connection among galaxies, haloes, their formation time, and their location in the Cosmic Web. MNRAS, 491(4):5747–5758.
Matthee, J., Schaye, J., Crain, R. A., Schaller, M., Bower, R., and Theuns, T. (2017). The origin of scatter in the stellar mass-halo mass relation of central galaxies in the EAGLE simulation. MNRAS, 465(2):2381–2396.
Moliné, Á., Sánchez-Conde, M. A., Palomares-Ruiz, S., and Prada, F. (2017). Characterization of subhalo structural properties and implications for dark matter annihilation signals. MNRAS, 466(4):4974–4990.
Naab, T. and Ostriker, J. P. (2017). Theoretical Challenges in Galaxy Formation. ARA&A, 55(1):59–109.
Nagai, D. and Kravtsov, A. V. (2005). The Radial Distribution of Galaxies in Λ Cold Dark Matter Clusters. ApJ, 618(2):557–568.
Navarro, J. F., Frenk, C. S., and White, S. D. M. (1997). A Universal Density Profile from Hierarchical Clustering. ApJ, 490(2):493–508.
Nelson, D., Springel, V., Pillepich, A., Rodriguez-Gomez, V., Torrey, P., Genel, S., Vogelsberger, M., Pakmor, R., Marinacci, F., Weinberger, R., Kelley, L., Lovell, M., Diemer, B., and Hernquist, L. (2019). The IllustrisTNG simulations: public data release. Computational Astrophysics and Cosmology, 6(1):2.
Ostriker, J. P., Choi, E., Chow, A., and Guha, K. (2019). Mind the Gap: Is the Too Big to Fail Problem Resolved? ApJ, 885(1):97.
Pakmor, R., Bauer, A., and Springel, V. (2011). Magnetohydrodynamics on an unstructured moving grid. MNRAS, 418(2):1392–1401.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(85):2825–2830.
Pillepich, A., Springel, V., Nelson, D., Genel, S., Naiman, J., Pakmor, R., Hernquist, L., Torrey, P., Vogelsberger, M., Weinberger, R., and Marinacci, F. (2018). Simulating galaxy formation with the IllustrisTNG model. MNRAS, 473(3):4077–4106.
Reddick, R. M., Wechsler, R. H., Tinker, J. L., and Behroozi, P. S. (2013). The Connection between Galaxies and Dark Matter Structures in the Local Universe. ApJ, 771(1):30.
Rees, M. J. and Ostriker, J. P. (1977). Cooling, dynamics and fragmentation of massive gas clouds: clues to the masses and radii of galaxies and clusters. MNRAS, 179:541–559.
Schaye, J., Crain, R. A., Bower, R. G., Furlong, M., Schaller, M., Theuns, T., Dalla Vecchia, C., Frenk, C. S., McCarthy, I. G., Helly, J. C., Jenkins, A., Rosas-Guevara, Y. M., White, S. D. M., Baes, M., Booth, C. M., Camps, P., Navarro, J. F., Qu, Y., Rahmati, A., Sawala, T., Thomas, P. A., and Trayford, J. (2015). The EAGLE project: simulating the evolution and assembly of galaxies and their environments. MNRAS, 446(1):521–554.
Somerville, R. S. and Davé, R. (2015). Physical Models of Galaxy Formation in a Cosmological Framework. ARA&A, 53:51–113.
Springel, V. (2010). E pur si muove: Galilean-invariant cosmological hydrodynamical simulations on a moving mesh. MNRAS, 401(2):791–851.
Tasitsiomi, A., Kravtsov, A. V., Wechsler, R. H., and Primack, J. R. (2004). Modeling Galaxy-Mass Correlations in Dissipationless Simulations. ApJ, 614(2):533–546.
Trujillo-Gomez, S., Klypin, A., Primack, J., and Romanowsky, A. J. (2011). Galaxies in ΛCDM with Halo Abundance Matching: Luminosity-Velocity Relation, Baryonic Mass-Velocity Relation, Velocity Function, and Clustering. ApJ, 742(1):16.
Vale, A. and Ostriker, J. P. (2004). Linking halo mass to galaxy luminosity. MNRAS, 353(1):189–200.
Vale, A. and Ostriker, J. P. (2006). The non-parametric model for linking galaxy luminosity with halo/subhalo mass. MNRAS, 371(3):1173–1187.
Vale, A. and Ostriker, J. P. (2008). A non-parametric model for linking galaxy luminosity with halo/subhalo mass: are brightest cluster galaxies special? MNRAS, 383(1):355–368.
van Dokkum, P. G., Whitaker, K. E., Brammer, G., Franx, M., Kriek, M., Labbé, I., Marchesini, D., Quadri, R., Bezanson, R., Illingworth, G. D., Muzzin, A., Rudnick, G., Tal, T., and Wake, D. (2010). The Growth of Massive Galaxies Since z = . ApJ, 709(2):1018–1041.
Vogelsberger, M., Genel, S., Springel, V., Torrey, P., Sijacki, D., Xu, D., Snyder, G., Nelson, D., and Hernquist, L. (2014). Introducing the Illustris Project: simulating the coevolution of dark and visible matter in the Universe. MNRAS, 444(2):1518–1547.
Walsh, K. and Tinker, J. (2019). Probing Galaxy assembly bias in BOSS galaxies using void probabilities. MNRAS, 488(1):470–479.
Wang, L., Dutton, A. A., Stinson, G. S., Macciò, A. V., Penzo, C., Kang, X., Keller, B. W., and Wadsley, J. (2015). NIHAO project - I. Reproducing the inefficiency of galaxy formation across cosmic time with a large sample of cosmological hydrodynamical simulations. MNRAS, 454(1):83–94.
Weinberger, R., Springel, V., Hernquist, L., Pillepich, A., Marinacci, F., Pakmor, R., Nelson, D., Genel, S., Vogelsberger, M., Naiman, J., and Torrey, P. (2017). Simulating galaxy formation with black hole driven thermal and kinetic feedback. MNRAS, 465(3):3291–3308.
Xu, X. and Zheng, Z. (2018). Dependence of halo bias and kinematics on assembly variables. MNRAS, 479(2):1579–1594.
Zentner, A. R., Hearin, A. P., and van den Bosch, F. C. (2014). Galaxy assembly bias: a significant source of systematic error in the galaxy-halo relationship.