[PDF] Advances in Constraining Intrinsic Alignment Models with Hydrodynamic Simulations

Abstract

We use galaxies from the IllustrisTNG, MassiveBlack-II and Illustris hydrodynamic simulations to investigate the behaviour of large scale galaxy intrinsic alignments. Our analysis spans four redshift slices over the approximate range of contemporary lensing surveys z=0−1 . We construct comparable weighted samples from the three simulations, which we then analyse using an alignment model that includes both linear and quadratic alignment contributions. Our data vector includes galaxy-galaxy, galaxy-shape and shape-shape projected correlations, with the joint covariance matrix estimated analytically. In all of the simulations, we report non-zero IAs at the level of several σ . For a fixed lower mass threshold, we find a relatively strong redshift dependence in all three simulations, with the linear IA amplitude increasing by a factor of ∼2 between redshifts z=0 and z=1 . We report no significant evidence for non-zero values of the tidal torquing amplitude, A 2 , in TNG, above statistical uncertainties, although MassiveBlack-II favours a moderately negative A 2 ∼−2 . Examining the properties of the TATT model as a function of colour, luminosity and galaxy type (satellite or central), our findings are consistent with the most recent measurements on real data. We also outline a novel method for constraining the TATT model parameters directly from the pixelised tidal field, alongside a proof of concept exercise using TNG. This technique is shown to be promising, although the comparison with previous results obtained via other methods is non-trivial.

Full PDF

MMNRAS , 000–000 (0000) Preprint 24 September 2020 Compiled using MNRAS L A TEX style ﬁle v3.0

Advances in Constraining Intrinsic Alignment Models withHydrodynamic Simulations

S. Samuroff ⋆ , R. Mandelbaum and J. Blazek , McWilliams Center for Cosmology, Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213, USA Department of Physics, Northeastern University, Boston, MA, 02115, USA Laboratory of Astrophysics, ´Ecole Polytechnique F´ed´erale de Lausanne, CH-1290 Versoix, Switzerland

24 September 2020

ABSTRACT

We use galaxies from the I

LLUSTRIS

TNG, M

ASSIVE B LACK -II and I

LLUSTRIS -1 hydrody-namic simulations to investigate the behaviour of large scale galaxy intrinsic alignments. Ouranalysis spans four redshift slices over the approximate range of contemporary lensing surveys z = − . We construct comparable weighted samples from the three simulations, which wethen analyse using an alignment model that includes both linear and quadratic alignment con-tributions. Our data vector includes galaxy-galaxy, galaxy-shape and shape-shape projectedcorrelations, with the joint covariance matrix estimated analytically. In all of the simulations,we report non-zero IAs at the level of several σ . For a ﬁxed lower mass threshold, we ﬁnda relatively strong redshift dependence in all three simulations, with the linear IA amplitudeincreasing by a factor of ∼ between redshifts z = and z = . We report no signiﬁcantevidence for non-zero values of the tidal torquing amplitude, A , in TNG, above statisticaluncertainties, although M ASSIVE B LACK -II favours a moderately negative A ∼ − . Exam-ining the properties of the TATT model as a function of colour, luminosity and galaxy type(satellite or central), our ﬁndings are consistent with the most recent measurements on realdata. We also outline a novel method for constraining the TATT model parameters directlyfrom the pixelised tidal ﬁeld, alongside a proof of concept exercise using TNG. This tech-nique is shown to be promising, although the comparison with previous results obtained viaother methods is non-trivial. Key words: cosmology: theory gravitational lensing: weak large-scale structure of Universemethods: numerical

It is now well established that the weak lensing of distant galax-ies by foreground mass provides a relatively clear window onto thelarge scale structure of the Universe. This is true whether that fore-ground mass is in the form of discrete matter concentrations, astraced by galaxies (i.e. galaxy-galaxy lensing; Mandelbaum et al.2013; Leauthaud et al. 2017; Prat & S´anchez et al., 2018; Joudakiet al. 2018; Blake et al. 2020), massive dark matter halos (clus-ter lensing; Melchior et al. 2017; Dark Energy Survey Collabora-tion 2020), or the continuous large scale matter distribution (cos-mic shear; Heymans et al. 2013; Dark Energy Survey Collabora-tion 2016; Troxel et al. 2018; Hildebrandt et al. 2020; Chang et al.2019; Hamana et al. 2020; Asgari et al. 2020; see also the forth-coming DES Y3 analyses Amon et al. 2020 and Secco et al. 2020).Though the measurement method and the exact form of the the-ory predictions differ slightly in the three cases, they are all fun-damentally probes of the growth of structure at low redshift. Sim- ⋆ [email protected] ilarly cross correlations between galaxy lensing and other observ-ables can be powerful probes in their own right; recent examplesinclude galaxy lensing × CMB lensing (Schaan et al. 2017), voidscorrelated with CMB lensing (Vielzeuf et al. 2019) and galaxyweak lensing crossed with gamma ray emission (Ammazzalorso& Gruen et al., 2020), each of which provide probes of dark matterwith slightly different sensitivities. A measurement of cosmologi-cal weak lensing, however, is subject to a range of systematic ef-fects; that is, observational effects that mimic a cosmological lens-ing signal, and so bias cosmological inference if one neglects them.Depending on the systematic in question, the most effective miti-gation strategy may be quite different. In broad terms, however,the standard approach is to either (a) mitigate systematics wherepossible, either by applying a calibration to the data, or discardingthe data points most strongly affected or (b) marginalise over themwith a parametric model. Often a combination of the two is appro-priate, and the prior used in (b) is informed by additional data orsimulations, and detailed testing of the calibration step in (a).This work focuses on one particular source of systematicbias, which enters all of the weak lensing measurements described © 0000 The Authors a r X i v : . [ a s t r o - ph . C O ] S e p Samuroff et al above: galaxy intrinsic alignments (IAs). The fact that the projectedshapes of galaxies residing in the same local region of the cosmicweb are correlated has been known for many years now (Catelanet al. 2001; Heymans & Heavens 2003). For pairs of galaxies at thesame redshift, the physically localised intrinsic shape-shape cor-relations can persist even on relatively large angular scales. Fortu-nately, in practice this signal, commonly referred to as the II contri-bution, is typically weak; it is also absent, by construction, from ameasurement of galaxy-galaxy lensing, which reduces the sensitiv-ity to II further in the context of a multiprobe analysis. Often moredangerous are what are known as GI correlations, which arise dueto the fact that foreground mass causes both local gravitational in-teractions in foreground galaxies and lensing in background objects(Hirata & Seljak 2004).Unfortunately, many of the avenues available for understand-ing other lensing systematics are not feasible in the case of intrinsicalignments. For example image simulations, which have becomean invaluable tool for quantifying shear calibration errors (Zuntz& Sheldon et al., 2018; Mandelbaum et al. 2018; Kannawadi et al.2019; S´anchez et al. 2020) cannot be used for understanding IAsdue to their fundamentally astrophysical nature. For quite differentreasons, the various sophisticated methods that the lensing com-munity has developed for calibrating photometric redshift errorsin recent years (e.g. Choi et al. 2016; Gruen & Brimioulle 2017;Gatti & Vielzeuf et al., 2018; Prat & Baxter et al., 2019; Alarconet al. 2019, Myles & Alarcon et al., 2020; Giannini & Gatti et al.,2020) have limited potential for cross-use as IA mitigation tools.Although direct mitigation methods have been proposed in the lit-erature (Heymans & Heavens 2003; Joachimi & Schneider 2010),to date these have been limited in their applicability, in large partbecause they tend to rely on having good single-galaxy redshift in-formation. They also often focus on the (typically subdominant) IIcontribution (although the Joachimi & Schneider 2010 method herecan include both). The standard approach in cosmological lensingstudies is to model IAs using a (semi-) physically motivated para-metric model, and marginalise over its (typically − ) parameterswith wide ﬂat priors. Given this background, hydrodynamic simu-lations are one of a small number of possible routes to understand-ing intrinsic alignments in cosmological lensing surveys, either formodel building, or deriving informative priors for the existing mod-els. Although analytic models are relatively well motivated on verylarge physical scales, this is much less true on small to intermediatescales. Extending beyond this regime, then, either requires simula-tions or the addition of extra terms to the model, controlled by newparameters (Schneider & Bridle 2010; Blazek et al. 2015, 2019;Fortuna et al. 2020). Although not the focus of this paper, anotherroute is to use real galaxies to make a direct IA measurement (seee.g. Hirata et al. 2007; Mandelbaum et al. 2011; Joachimi et al.2011; Blazek et al. 2012; Singh et al. 2015; Johnston et al. 2019).This approach avoids questions about the realism of simulations. Itdoes, however, have its own challenges, not least the need for ac-curate per-galaxy redshift information, and the typically fairly re-stricted galaxy selections (often bright, red, low redshift samples)Although a substantial amount of literature exists on the sub-ject of IAs in hydrodynamic simulations, it is fair to say that thereis signiﬁcant variation in focus and methodology. For example, aseries of studies by a group working on the H ORIZON -AGN sim-ulation have looked in detail at the alignment of two- and three-dimensional subhalo shapes with their local large scale structureand the cosmic web (e.g. Dubois et al. 2014; Codis et al. 2015a;Soussana et al. 2020). Intriguingly, Codis et al. (2015a) found hintsthat blue galaxy IAs could survive in projection at a level detectable by future surveys. A number of papers based on M

ASSIVE B LACK -II (e.g. Tenneti et al. 2014, 2015; Bhowmick et al. 2020) andI

LLUSTRIS -1 (Hilbert et al. 2017) have explored similar themes.Minor discrepancies in the details of the IA signal, and its depen-dence on galaxy properties, have been uncovered; thus far, how-ever, the interpretation of these differences has been complicated byboth the relatively low signal-to-noise on large scales, and method-ological differences.This work is intended as a step towards a more complete un-derstanding of intrinsic alignments in hydrodynamic simulations,building on these earlier studies. We present a uniﬁed analysis ofsamples from various recent simulations, with measurement meth-ods and selection functions matched in order to make a meaning-ful quantitative comparison. Unlike many previous studies, we fo-cus on two-point intrinsic-galaxy and intrinsic-intrinsic alignmentstatistics w g + and w ++ , which are commonly used in observationalstudies; this is primarily because one can derive well deﬁned an-alytic predictions for them, which directly correspond to the IAmodelling used in cosmological lensing analyses. This is less trueof statistics like the 3D EE and ED correlations (e.g. Tenneti et al.2015; Chisari et al. 2015), and halo misalignment statistics (Ten-neti et al. 2014; Codis et al. 2015b), all of which have been used inmany simulation-based studies. In this work we perform a simul-taneous analysis of these w g + and w ++ , alongside the equivalentgalaxy-galaxy correlations, in order to fully exploit the large scaleIA information in these simulated data sets.The paper is structured as follows. In Section 2 we outlinethe properties of the three simulated datasets used in this work,I LLUSTRIS

TNG, M

ASSIVE B LACK -II and I

LLUSTRIS -1, and de-scribe the selection used to construct comparable object catalogues.Section 3 then sets out the pipeline taking us from public (stellarand dark matter) particle data and S UB F IND group tables to shapecatalogues, and eventually to two-point measurements. The theorycalculations, which we use to connect these measurements to IAmodels, are described in Section 4. In Section 5 we present the re-sults of our baseline likelihood analyses using the two-point align-ment data, and then in Section 6 we discuss a series of extensions.We ﬁt one of the more sophisticated alignment models in the lit-erature, and consider the dependence of its parameters on variousgalaxy properties. In addition to the two-point constraints, Section7 presents a novel method for extracting alignment information di-rectly from the simulated matter ﬁeld. We develop the basic prin-ciples, and present an example using I

LLUSTRIS

TNG. Finally, weconclude and brieﬂy discuss our results in the context of the ﬁeldin Section 8.

We consider three discrete cosmological simulation volumes in thisstudy. Of these, I

LLUSTRIS

TNG is chronologically the most re-cent, and so beneﬁts from the improvements derived from the anal-ysis of earlier simulation efforts. The simulation runs are evolvedaccording to Newtonian dynamics and assume similar but non-identical cosmologies, which are set out in Table 1, with particlesevolved from a set of initial conditions at high redshift. In each red-shift snapshot, groups are identiﬁed using the S UB F IND friends-of-friends (FoF) group ﬁnding algorithm (Springel et al. 2001).

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations Simulation Volume Cosmology Mean Gas Particle/ h − Mpc Mass / M ⊙ I LLUSTRIS

TNG A s = . × − Ω m = . b = . n s = . h = . ( σ = . ) MBII A s = . × − Ω m = . b = . n s = . h = . ( σ = . ) I LLUSTRIS -1 A s = . × − Ω m = . b = . n s = . h = . ( σ = . ) Table 1.

Properties of the simulation volumes used in this work. The par-ticle mass quoted in the right-most column is the mean of gas particles.Note that in M

ASSIVE B LACK -II all particles are equally weighted, whileI

LLUSTRIS

TNG they cover a range (see Nelson et al. 2019). σ is shownin parentheses as it is a derived parameter. M ASSIVE B LACK -II has been used in various previous studies, andis described in a number of existing publications; details aboutthe approximations and modelling can be found in Khandai et al.(2015) and Di Matteo et al. (2012). The simulation has a comovingvolume of ( h − Mpc ) , and was generated using P-GADGET,which is a version of GADGET3 (Springel 2005). Initial condi-tions were generated with a transfer function generated by CMB-FAST at z = . Star formation is modelled as a binary phaseprocess, triggered when a region of gas reaches some thresholddensity. Stellar particles are generated randomly from gas particleswith a probability determined by their star formation rate. Stel-lar winds are modelled using the parametrisation of Hernquist &Springel (2003). AGN feedback, which is particularly relevant inhigh mass galaxy populations, where IAs are also strong, is also in-cluded; details of the black hole growth and AGN feedback modelssee Khandai et al. (2015)’s Sec 2.3. I LLUSTRIS -1 is another hydrodynamic simulation whose data arenow public . The smallest of the three considered in this work, thebox has a total comoving volume of V = ( h − Mpc ) , whichwas evolved using the moving mesh grid code, A REPO (Wein-berger et al. 2020). The various physical processes approximatedin I

LLUSTRIS -1, in brief, include radiative cooling (both primor-dial and due to heavy elements) with self-shielding corrections;star-formation in dense regions of gas; stellar evolution with as-sociated metal enrichment; supernova feedback and quasar-mode,radio-mode, and radiative mode AGN feedback. The above pre-scriptions have ∼ tunable parameters, which were ﬁxed to val-ues obtained using a signiﬁcantly smaller volume, higher resolu- tion, set of simulations. Details of these models can be found inVogelsberger et al. (2014). I LLUSTRIS

TNG is the most recent hydrodynamic simulation in-cluded here. The particle and group data are described in the releasepapers (Springel et al. 2018; Nelson et al. 2019), and are avail-able for download . The I LLUSTRIS

TNG data are generated usingA

REPO . A Monte Carlo tracer particle scheme is used to to followthe Lagrangian evolution of baryonic matter. The hydrodynamicelement comprises prescriptions for a handful of different physi-cal processes, including emission line radiative cooling; stochasticstar formation; supernova feedback and AGN feedback. The latterhas two modes (referred to as “quasar” and “kinetic wind” modes),depending on the accretion rate. Details of these prescriptions canbe found in Pillepich et al. (2018). It is worth remarking that I L - LUSTRIS

TNG is tuned explicitly to match observations at z = using a number of statistics; speciﬁcally the galaxy stellar massfunction, the total gas mass content within the virial radius of mas-sive groups, the stellar mass-stellar size and the black hole - galaxymass relations, and the overall shape of the cosmic star formationrate density at high redshift. To obtain a galaxy sample from which we can draw useful conclu-sions for each of the simulated datasets, we impose additional qual-ity cuts. Although our measurements are not subject to the usualobservational biases (due, for example, to ﬁtting ellipticities in thepresence of pixel noise, or imperfect PSF modelling), they are af-fected by convergence bias (e.g. Chisari et al. 2015). That is, subha-los with an insufﬁcient number of particles to provide a meaningfulshape measurement alter the ensemble ellipticity distribution of thesample. To avoid such effects, we impose a selection based on thenumber of particles in a galaxy (dark matter and stellar). This trans-lates into a slightly different mass cut for each simulation due to therespective mass resolutions of the three datasets. We thus addition-ally impose a direct cut on stellar mass, such that the samples allhave the same lower bound on M ∗ . The ﬁnal selection is then: n DM > n ∗ > (1) M ∗ > . × h − M ⊙ . This leaves a total of ∼ , , , and , us-able galaxies in I LLUSTRIS -1, M

ASSIVE B LACK -II and I

LLUS - TRIS

TNG samples respectively. Note that the cut in Eq. (1) is im-posed on each snapshot independently, resulting in the per-redshiftnumbers shown in Table 2.In Figure 1 we show the ellipticity distribution and stellar massfunction for each of the samples. “Ellipticity” in this context is de-ﬁned as the magnitude of the spin-2 complex ellipticity deﬁned inSection 3.1. As discussed there, the exact value for a given galaxyis dependent on the details of the measurement method (i.e. the rel-ative weighting of stellar matter at different radii). Given that the MNRAS , 000–000 (0000)

Samuroff et al measurement pipeline is applied consistently to the different sim-ulations, however, Figure 1 does allow a meaningful comparison.The striking discrepancy in the upper panel has been noted else-where (see, for example, Tenneti et al. 2016’s Figure 2); galaxies inI

LLUSTRIS -1 are signiﬁcantly rounder than both comparable sim-ulations and real data. The differences in the mass function (lowerpanel) mean that, even with a common lower bound, the mean stel-lar mass of the samples differs slightly (see the right-hand columnin Table 2). At given redshift, the mean masses are ordered (de-scending) I

LLUSTRIS

TNG, I

LLUSTRIS -1, M

ASSIVE B LACK -II.I

LLUSTRIS

TNG is unusual amongst hydrodynamic simula-tions, in the sense that it has realistic galaxy magnitudes, integratedover a number of different pass bands. We include the SDSS griz band magnitudes in our processed catalogues, and will use themin the following sections. Brieﬂy, these are evaluated by summingthe luminosity of star particles in a particular subhalo, and the ap-propriate ﬁlter band-pass is applied. More detail on this calculationcan be found in Nelson et al. (2018)’s Sec. 3. The distribution ofapparent r − band magnitudes in three I LLUSTRIS

TNG snapshotsis shown in Figure 2. For reference, the observed magnitude dis-tribution from the DES Y1 M

ETACALIBRATION catalogue is alsoincluded (dashed purple). It is worth remembering here that, un-like the simulated data, DES is a ﬂux-limited imaging survey, withgalaxies distributed across a range of redshifts (ensemble medianredshift z ∼ . ; Zuntz et al. 2018), and so direct comparisonis not useful; they are shown here to illustrate that the simulatedgalaxy samples here not representative of those in a typical lensingsurvey, but are a brighter subset. Key to halo model-based descriptions of galaxy alignments is theability to split galaxies cleanly into satellites and centrals (see For-tuna et al. 2020 for a recent example). Galaxies residing at thecentres of their halo tend to be older and more massive than thesatellites in the same halo; in the halo model picture, the clusteringand shape properties of these two sets of galaxies is fundamentallydifferent. For this reason it is, then, interesting to explore the be-haviour of satellites and centrals separately. For this work we sim-ply designate the most massive galaxy in each FoF group as thecentral . Although noisy, this deﬁnition is less prone to misclas-siﬁcation than one based on geometry, particularly in high massgroups in which the region around the bottom of the potential wellis relatively crowded. We show the distribution of galaxy-halo sep-arations for centrals and satellites at z = in Figure 3. Althoughnot shown here, a similar pattern is seen in the higher redshift snap-shots. That the mass based classiﬁer is a strong indicator of galaxyposition in the halo offers some reassurance that the central ﬂaggingis, in fact, literally selecting central galaxies. This is a relatively oldproblem, and various previous studies have explored different waysto ﬂag central galaxies (see, for example, Rykoff et al. 2016). There is much evidence in the literature to indicate that IAs arestrongly dependent on galaxy colour (Joachimi et al. 2011; Hey-mans et al. 2013; Singh et al. 2015; Samuroff et al. 2019; John- In the nomenclature of the TNG data release, , the centralin each group is identiﬁed using the “GroupFirstSub” ﬂag. . . . . . . . Ellipticity e p ( e ) Stellar Mass log M ∗ / h − M (cid:12) − − − − M a ss F un c t i o n Φ ( M ∗ ) / h M p c − d e x − MBIIIllustris-1TNG

Figure 1.

Upper:

Normalised distributions of projected ellipticity for the z = samples used in this work, with the cuts described in Section 2.4.Shown are I LLUSTRIS

TNG (purple), M

ASSIVE B LACK -II (dark blue) andI

LLUSTRIS -1 (green).

Lower: the stellar mass functions for the same sam-ples. ston et al. 2019). Clearly photometric colour is a proxy for ahost of other physical properties, which ultimately determine howstrongly the galaxy sample is aligned, and one could equivalentlyuse other properties such as morphology and bulge/disc ratio. Al-though crude, a binary type split is often useful, given that mixedgalaxy samples commonly exhibit a clear bimodality in colour (orcolour-magnitude) space (e.g. Baldry et al. 2004; Valentini et al.2018), and that this maps roughly onto differences in IA properties.That said, the IA signal in the simulations (or indeed any galaxysample) is a complex function of many correlated quantities (e.g.colour, morphology, dynamical properties). Although it is useful tostudy IAs in subpopulations deﬁned using proxies, it is worth pro-ceeding with care, and bearing in mind that the full picture is morecomplicated.Whereas quantities like stellar mass and subhalo shapes arerelatively simple to obtain from hydrodynamic simulations, map-ping them onto observable quantities like ﬂuxes and colours isnon-trivial. This has historically been a challenging problem, andthere are documented deﬁciencies in the galaxy photometry forM

ASSIVE B LACK -II and I

LLUSTRIS -1; the equivalent quantitiesfor I

LLUSTRIS

TNG are, however, thought to be fairly realistic (see

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations Simulation Redshift Number of galaxies n c / h Mpc − Red Fraction Satellite Fraction Mean Stellar Mass / M ⊙ I LLUSTRIS

TNG 0.00 171,684 0.020 0.34 0.33 . I LLUSTRIS

TNG 0.30 168,399 0.020 0.22 0.32 . I LLUSTRIS

TNG 0.62 159,925 0.019 0.18 0.30 . I LLUSTRIS

TNG 1.00 145,394 0.017 0.12 0.27 . M ASSIVE B LACK -II 0.00 33,578 0.033 N/A 0.45 . M ASSIVE B LACK -II 0.30 34,646 0.035 N/A 0.46 . M ASSIVE B LACK -II 0.62 35,523 0.036 N/A 0.48 . M ASSIVE B LACK -II 1.00 35,482 0.036 N/A 0.49 . I LLUSTRIS -1 0.00 18,489 0.044 N/A 0.32 . I LLUSTRIS -1 0.30 17,203 0.041 N/A 0.31 . I LLUSTRIS -1 0.62 15,181 0.036 N/A 0.29 . I LLUSTRIS -1 1.00 12,881 0.031 N/A 0.27 . Table 2.

Physical properties of the galaxy samples considered in this work. The object selection is as set out in Section 2.4, and is applied independentlyat each redshift. Here n c (fourth column) is the comoving galaxy number density of the sample. The methods used to separate red/blue and satellite/centralgalaxies are described in Sections 6.2.4 and 6.2.1 respectively.

18 20 22 24 r − band Magnitude . . . . . . p ( r ) DES Y1 (metacal) z = 0 . z = 0 . z = 1 . Figure 2.

Normalised distributions of r − band apparent magnitude for ourI LLUSTRIS

TNG sample. We convert the absolute magnitude in the cata-logues to apparent magnitude at each snapshot, assuming the correct cos-mology of the simulation. A detailed description of how the simulated ab-solute magnitudes are computed can be found in Nelson et al. (2018). Apower-law approximation for the SED is used to compute the k − correctionsfor the apparent magnitudes; although this is not rigorously correct, it issufﬁcient for our purposes, given that the k − corrections are comfortablysubdominant to the distance modulus, and that we are only attempting aqualitative comparison here. For reference, the unﬁlled curve shows theequivalent distribution for the ﬁducial Dark Energy Survey Year 1 shapecatalogue, after quality cuts (c.f. Zuntz et al. 2018’s Fig. 3). e.g. Nelson et al. 2018). In brief, in I LLUSTRIS

TNG a stellar syn-thesis model is used to predict the stellar population of each particlein a subhalo as a function of metallicity and age. This process in-cludes basic models for dust emission and nebular line emission.The predicted stellar spectrum is multiplied by the SDSS opti-cal/near IR ugriz band-passes (airmass 1.3), producing a magni-tude in each ﬁlter. The per-particle magnitudes are then summedover the ensemble bound to the subhalo. This process is explainedin more detail in Nelson et al. 2018’s Section 3 (see their “Model(A)”). Oﬀset from halo centre / h − kpc0 . . . . . . . . . p ( R ) CentralsSatellites

Figure 3.

Normalised distributions of galaxy offsets from the centre of massof their host halos, in the I

LLUSTRIS

TNG simulation at z = . The cen-tral ﬂag used is deﬁned in Section 2.4.2. Note that the distribution labelled“satellites” is boosted by a factor of 15 for visibility. We inspect the colour-magnitude diagrams and make a lineardivision in g − i colour space ( g − i ) = m gi × r + c gi , (2)to roughly mimic the green valley division. The colour mag-nitude diagram evolves with redshift, and so we carry outthis process independently in each snapshot, giving m gi =( . , . , . , . ) , c gi = ( . , . , . , . ) . Thered fraction resulting from this split at each redshift is shown inTable 2. The numbers here are roughly consistent with those seenin real data, and change with redshift in an intuitively correct way(i.e., the low redshift Universe has a larger abundence of massivered elliptical galaxies compared with z = ). The r − i colour mag-nitude diagram for our split sample is shown in Figure 4. Given thatthe split is imposed in g − i , it is somewhat reassuring that we seeclearly deﬁned well separated samples in this space. MNRAS , 000–000 (0000)

Samuroff et al

Figure 4.

Colour magnitude diagrams for our I

LLUSTRIS

TNG sample atfour redshifts (labelled upper left). The two sets of contours show the dis-tributions of the red and blue samples, as outlined in the text. Note that thesplit is imposed in g − i versus r space, which is why the division is notsharp. The ﬁne points show a randomly downsampled selection of galaxiesfrom each population. Note that unlike in Figure 2, the magnitudes usedhere (including for estimating colours) are absolute, not apparent, ones. In a three dimensional cosmological volume, the most natural wayto quantify a galaxy’s shape is via its intertia tensor. Analogous toprojected ellipticities, which are constructed from the moments ofa galaxy light proﬁle, the most general form for the inertia tensoris: I ij = W N p ∑ k = w k x i,k x j,k , (3)where the indices ij indicate one of the three spatial coordinateaxes i, j ∈ ( x, y, z ) , and the sum runs over the number of parti-cles within the subhalo. For our purposes, this means star particles,but one could equivalently estimate the shape of the dark mattersubhalo using the same equation. The prefactor w k is the weightallocated to particle k , and W is the sum of the weights; in the caseof M ASSIVE B LACK -II, all of the star particles have the same mass,and so the weights are ﬂat. In I

LLUSTRIS

TNG andI

LLUSTRIS -1,this is not the case, and each particle is weighted by its mass. Analternative, known as the reduced inertia tensor (see Chisari et al.2015, Tenneti et al. 2016), weights particles by their inverse squaredistance from the subhalo centroid. This process is known to biasthe measured ellipticities low, necessitating a further iterative cor-rection procedure. Although we mention this here for context, sinceit has been used a handful of times in the literature, it is not usedin this work. Note also that we have reason to think the IA signalis, in reality, dependent on the radial weighting of the shape mea-surements, an effect that has been observed in real data (Singh &Mandelbaum 2016).By performing an eigenvalue decomposition on I , one canobtain three dimensional axis vectors and lengths, which in turncan be projected into the 2D second moments Q xx , Q yy , Q xy . Therecipe is set out by Piras et al. (2018) (see their Eq 13-15), and we refer the reader to that paper for the mathematical detail. Al-though for technical reasons our pipeline goes via three dimen-sional shapes, it is also worth noting that one could also simplymeasure the projected two dimensional moments of a subhalo di-rectly. With the projected moments, one can then construct the spin-2 ellipticity of a galaxy as ( e , e ) = ( Q xx − Q yy , Q xy ) Q + Q + √∣ Q ∣ . (4)It is worth bearing in mind that there are in fact two common el-lipticity deﬁnitions used for weak lensing. The one deﬁned aboveis equivalent to an ellipticity magnitude, written in terms of (pro-jected) axis ratios, e = ( a − b )/( a + b ) ; for detailed discussion ofboth this and the alternative ellipticity deﬁnition, and their respec-tive advantages, see Melchior & Viola (2012). Note that this is aCartesian projection along one axis of the simulation box, not alightcone projection with conversion to angular coordinates. Thepositive and negative e direction, then, is deﬁned by the x, y co-ordinate directions of the square simulation volume. Although thismeasurement does not correspond directly to what one could do inreality, the difference is not thought to be signiﬁcant, given the sta-tistical size and other limitations of the samples considered in thiswork. All correlation functions used in this paper are computed usingthe public

HALOTOOLS package (Hearin et al. 2017). The moststraightforward (and highest signal-to-noise) two-point measure-ment one could make is that of galaxy clustering in three dimen-sions. We adopt a common estimator of the form (Landy & Szalay1993): ξ ijgg ( r p , Π ) = D i D j − D i R j − D j R i + R i R j R i R j , (5)where DD , RR and DR are weighted counts of galaxy-galaxy,random-random and galaxy-random pairs, binned in perpendicularand line-of-sight separation, r p and Π . The indices i, j denote a pairof catalogues (either galaxy positions, or random points), which arecorrelated together. In both cases above, R represents the positionsof a set of random points drawn from a ﬂat distribution within thesimulation volume.The cross correlation of galaxy positions and intrinsic ellip-ticities, ξ g + ( r p , Π ) , can similarly be estimated, as a function of r p and Π . We use a modiﬁed Landy-Szalay estimator of the form: ξ ijg + ( r p , Π ) = S i + D j − S i + R j R i R j (6)(see Mandelbaum et al. 2011). One can similarly measure theshape-shape correlation: ξ ij ++ ( r p , Π ) = S i + S j + R i R j . (7) https://github.com/duncandc/halotools_ia https://halotools.readthedocs.io;v0.7 MNRAS000

A Constraints from Hydrodynamic Simulations The terms in the numerator represent shape correlations and aredeﬁned as S i + D j ≡ ∑ α ≠ β w α w β e + ( β ∣ α ) , (8) S i + S j + ≡ ∑ α ≠ β w α w β e + ( α ∣ β ) e + ( β ∣ α ) , (9)where the indices α, β run over galaxies and e + ( β ∣ α ) is the tan-gential ellipticity of galaxy β , rotated into the coordinate systemdeﬁned by the separation vector with galaxy α . For the ﬁducialcatalogue, I LLUSTRIS

TNG, the weights are equal and normalisedto the number of galaxies. In order to make a direct comparisonof the different samples, galaxies in the M

ASSIVE B LACK -II andI

LLUSTRIS -1 catalogues are assigned weights, such that the hosthalo mass distributions of the three match. For detail about theweighting scheme, which we refer to as halo-mass reweighting, anddiscussion about the impact on our results, we refer the reader toAppendix C. IAs are known to be dependent on cosmology and thehost halo mass distribution, and this process should remove differ-ences due to discrepancies in these factors. It is also true, however,that other properties such as the details of the galaxy-halo connec-tion and the properties of the galaxies themselves also potentiallyhave an impact. Such differences represent a systematic uncertainty(since we cannot say with certainty which of the simulations, if any,represents reality, nor straightforwardly homogenise them), and soany resulting differences should be treated as such.In lensing studies it is also common to assign galaxies some-thing approximating inverse variance (shape noise + measurementuncertainty) weights (see, for example, Zuntz et al. 2018). Sincethis weighting tends to upweight bright, high S / N galaxies, itseems likely it would also boost the IA signal. That said, in practicelensing weights tend to be shape noise dominated, and so relativelyuniform across the sample, meaning the magnitude of this effect isexpected to be small.From these three dimensional measurements, obtaining thetwo dimensional projected correlations is a case of integratingalong the line of sight. One has, w ab ( r p ) = ∫ Π max − Π max dΠ ξ ab ( r p , Π ) , (10)Here the lower indices ab denote a type of two-point correlation, a, b ∈ ( g, +) . Π max is an integration limit, which is set by thesimulation volume. For this study we adopt a value equal to athird of the box size, For this study we adopt a value equal to athird of the box size, or Π max = h − Mpc for I

LLUSTRIS

TNG, Π max = h − Mpc for M

ASSIVE B LACK -II, and Π max = h − Mpc for I

LLUSTRIS -1. In practice, for our purposes we wish tomaximise Π max ; although it is true that very long baselines willeventually harm the signal-to-noise by including uncorrelated pairs,on scales of a few tens of Mpc we are well within the regime whereextending Π max helps to access additional large scale signal modes(see Joachimi et al. 2011’s App. A2 for further discussion). In addition to the two-point measurements described above, we alsoimplement a new method to derive IA constraints at the ﬁeld level.We refer the reader to Section 7 for details, but the method involves deriving constraints on IA parameters via a comparison of the (pix-elized) three dimensional tidal ﬁeld and the intrinsic galaxy shapeﬁeld (see also Hilbert et al. 2017, who also use the tidal ﬁeld di-rectly to measure IAs, albeit via two-point functions). To this end,we need an estimate of that tidal tensor as a function of position;we obtain this from the gridded particle data as follows.Starting with the table of particle positions at ﬁxed redshift,we divide the simulation box into 3D cubic pixels. The pixel size L is an unconstrained analysis variable, and affects the physi-cal interpretation of the eventual results. We choose to performour measurements using three different scales, pixels across( L ∼ . h − Mpc ), pixels across ( L ∼ . h − Mpc ) and pixels ( L ∼ . h − Mpc ). Within each pixel p in the grid, centeredat position x p , we measure the overdensity of matter and stars, δ ( x p ) = N p ⟨ N p ⟩ p − . (11)That is, the total number of dark matter particles in pixel p , dividedby the mean occupation across all pixels. In the case of dark matter,all particles in I LLUSTRIS

TNG are weighted equally, and the val-ues in the equation above are raw number counts, rather than sumsof masses. Using the Fourier space version of the Poisson equation,one can show that the traceless tidal tensor can be obtained fromthe overdensity ﬁeld as: s ij ( k ) = ( k i k j k − δ ij ) δ ( k ) , (12)where k = k + k + k . For more details about the mathematics seeCatelan & Porciani (2001); Alonso et al. (2016). The two indiceshere ij denote a single element of the × tensor matrix withinpixel p .We also obtain a noisy estimate for the intrinsic shear in pixel p , γ Iij ( x p ) , by averaging the trace-free inertia tensors of galaxieswithin it. That is, γ Iij ( x p ) = ⟨ I ij,k − δ ij Tr [ I k ]⟩ k , (13)where the subscript k denotes a particular galaxy from pixel p , andthe angle brackets ⟨⟩ k indicate averaging over those galaxies. Weestimate the per-element variance of the × matrix γ I directlyby computing the RMS over all galaxies; that is, we assume thatshape noise dominates, such that the covariance matrix is diago-nal, and can be written as C − ij,p = δ ij σ − µ , or the inverse squareshape variance for component µ ∈ ( , ) . Note that this is a globalquantity, computed across pixels and applied to each of them. Weconﬁrm that the covariance scales with pixel size roughly as onemight expect from geometric arguments as σ SN µ ∝ L − / . A 1Dslice of the three ﬁelds described here, as measured in the z = I LLUSTRIS

TNG snapshot, can be seen in Figure 6. Shown are (leftto right): dark matter overdensity, the upper diagonal element of thedark matter tidal tensor and the smoothed galaxy shape ﬁeld. It isapparent from Figure 6 that there is an obvious qualitative corre-spondance between the raw matter ﬁeld and the tidal tensor (com-pare the left-most and middle panels). The sampling of galaxies ismuch sparser, which is evidenced by the amount of white space inthe right-most panel. Depending on the pixel scale, the fraction ofunoccupied pixels is between and . Although striking inthis Figure, and worth noting, the impact of this sampling is ex-plicitly incorporated into our IA modelling, as described in Section4.

MNRAS , 000–000 (0000)

Samuroff et al − r p × w gg z = 0 . z = 0 . z = 0 . z = 1 . . . . . . r p × w g + TNGMBIIIllustris − r p [ h − Mpc] − . . . . . r p × w ++ r p [ h − Mpc] 10 r p [ h − Mpc] 10 r p [ h − Mpc]

Figure 5.

The ﬁducial data vectors used in this work. Shown from top are galaxy-galaxy, galaxy-shape and shape-shape two-point correlations, at four discreteredshifts (left to right, as indicated). The point styles indicate measurements made on M

ASSIVE B LACK -II (dark blue stars), I

LLUSTRIS -1 (pink open circles)and I

LLUSTRIS

TNG (purple ﬁlled circles). The solid lines are the theory predictions, evaluated at the best ﬁtting point in the TATT parameter space for eachdata set. Scales within the shaded regions ( r p < h − Mpc ) are excluded from the ﬁts, using both the TATT and NLA models. y/h − Mpc04080120160200 z / h − M p c δ m y/h − Mpc s y/h − Mpc γ I Figure 6.

The z = dark matter overdensity ﬁeld and associated quantities from I LLUSTRIS

TNG. Here we show (left to right) the matter overdensity δ m , the , component of the × dark matter tidal tensor and the same component of the × galaxy shape tensor γ I . The pixel resolution is 128/side, resulting in aphysical pixel scale of . h − Mpc, which is at the ﬁner end of the range of pixel scales presented in this work. While the full simulation box is clearly threedimensional, for illustrative purposes we choose here to show a 2D slice through the centre. In the right-hand panel, white pixels indicate those containing nogalaxies that pass cuts into our ﬁnal I

LLUSTRIS

TNG shape catalogue. MNRAS000

TNG shape catalogue. MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations Figure 7.

The correlation matrix for our ﬁducial I

LLUSTRIS

TNG two-pointmeasurements, as estimated using jackknife resampling (upper left trian-gle), and an analytic Gaussian approximation (lower right). Note that thetwo covariance matrices are symmetric about the diagonal; the triangle con-ﬁguration is shown here for illustrative purposes only.

In order to derive robust parameter constraints from our measure-ments we need a representative, numerically stable, estimate forthe covariance matrix of those measurements. The full data vectorconsists of three two-point measurements for each of four snap-shots; this gives us N = × × N rp data points for each simu-lated galaxy sample (96, 144 and 192 in the case of I LLUSTRIS -1,M

ASSIVE B LACK -II, and I

LLUSTRIS

TNG respectively). Our ﬁdu-cial covariance estimate is calculated analytically, a detail of thisanalysis that differs from many previous studies, most of whichhave opted for an internal covariance estimator such as jackkniferesampling. The analytic approach has a number of advantages, notleast the ability to extend to large scales where jackknife estimatesbreak down. We show a comparison of our ﬁducial correlation ma-trix, calculated using the method described below, and a jackknifeestimate in Figure 7.Although in principle the covariance has higher order contri-butions resulting from mode mixing (e.g. Krause et al. 2016), giventhe limited statistical power of the simulations, and the fact that shotand shape noise tend to dominate on the scales we ﬁt, the dominantGaussian contribution is considered sufﬁcient for our purposes. Inthe Gaussian approximation, a given element is the sum of a noiseterm and a cosmic variance contribution:

Cov [ w z s αβ ( r p ,j ) , w z s δγ ( r p ,k )] = C SN ,z s z s ,kjαβδγ + C CV ,z s z s ,kjαβδγ , (14)where the Greek indices denote correlation types i.e. α, β, δ, γ ∈( g, +) ; z s identiﬁes a particular redshift slice and j, k are comovingscale bins. The cross correlations between snapshots is potentiallycomplicated, given that the galaxy properties are strongly (but notfully) correlated. However, since we will not attempt a fully simul-taneous analysis across redshifts, but rather restrict our inference toone snapshot at a time, we will neglect these additional covarianceterms. One can write each element as: Cov [ w z s αβ ( r p ,i ) , w z s δγ ( r p ,j )] = δ ij π A p r p ,i ∆ r p ∫ k d k Θ αβ ( kr p ,i ) Θ δγ ( kr p ,i )[ ˜ P αδ ( k, z s ) ˜ P βγ ( k, z s ) + ˜ P αγ ( k, z s ) ˜ P βδ ( k, z s )] , (15)with the kernels Θ µν ( x ) = ⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩ J ( x ) µν = g + J ( x ) µν = ggJ ( x ) + J ( x ) µν = ++ , (16)where J ν is a Bessel function of the ﬁrst kind of order ν . In IA mea-surements on real data A p is a function of redshift, and accounts forthe survey mask; in our case it is simply the cross sectional area ofthe simulation box in h − Mpc − . One should also note that thepower spectra here are subject to a noise contribution, ˜ P αβ ( k, z s ) = P αβ ( k, z s ) + N z s αβ , (17)where N z s αβ = / n z s c for αβ = gg , N αβ = ( σ z s e ) / n z s c for αβ = ++ ,and N αβ = for αβ = g + . The denominator n c is the comovingvolume number density of the sample at z = z s in h Mpc − , and σ z s e is the projected ellipticity dispersion.As is apparent from the above, the analytic covariance matrixis sensitive to some extent on the input parameter values (cosmol-ogy, galaxy bias, and IAs). As stated before, cosmological parame-ters are ﬁxed to those appropriate for the simulation in question, asper Table 1. For the other (IA and bias) parameters, we generate theﬁducial matrix for each sample using an iterative procedure similarto that of Krause & Eiﬂer et al., (2017). That is, we repeatedly ﬁt thedata to obtain IA and galaxy bias parameter constrains, update thecovariance matrix and ﬁt again. Our convergence criteria are that(a) the marginalised 1D parameter posteriors are not systematicallydifferent between iterations, and (b) the χ and evidence values arestable to within a few percent. In all samples, the covariance matrixconverges within − iterations.We also test our ﬁducial analytic covariance matrix against aversion computed using jackknife resampling. In brief, the jack-knife method involves dividing the data into N spatial subregions,and repeating the measurement N times, each time removing oneof them. The validity of this approach relies on various (poten-tially strong) assumptions; not least it assumes the subregions arestatistically independent (see Anderson 2003; Hartlap et al. 2007for discussion), and that the scales of interest are much smallerthan scale of the subregions. These factors, combined with therelatively small number of subregions allowed by even I LLUS - TRIS

TNG (the largest of the simulations considered here), are theprimary reason we consider jackknife as an approximate test of,and not a viable alternative to, our analytic predictions. In the ﬁdu-cial case (I

LLUSTRIS

TNG), we divide the three dimensional boxinto N jk = = cubic subvolumes. A visual comparison of thecorrelation matrices can be found in Figure 7. We also compare theroot diagonals of the two covariance matrices (see Figure A2). Al-though there is approximate agreement between the two, the jack-knife method tends to underestimate the variance on virtually allscales in the three correlations. On the relevant scales for our ﬁts( > h − Mpc ), the differences are at the level of up to ∼ − in w g + . MNRAS , 000–000 (0000) Samuroff et al

Our analysis pipeline is built within C

OSMO

SIS (Zuntz et al.2015). The new modules introduced in this paper has been vali-dated against older free-standing code. Although we will not dis-cuss this process in detail here, a longer discussion can be foundin Appendix A. Sampling is performed using M ULTI N EST (Ferozet al. 2019), and in the subset of chains where the Bayesian evi-dence is needed, we also run using

POLYCHORD (Handley et al.2015), with more stringent accuracy settings . In all cases, we ﬁxthe cosmology to the input for the relevant simulation, with the pa-rameters given in Table 1 and zero neutrino mass. The matter powerspectrum is generated using CAMB with nonlinear modiﬁcationsfrom HALOFIT (Takahashi et al. 2012). A simulation of ﬁnite boxsize (i.e. any simulation) has an effective k limit, at which the powerspectra are truncated (see Power & Knebe 2006 and Bagla et al.2009 for discussion and quantiﬁcation), an effect that primarilyimpacts large physical scales, but potentially has ramiﬁcations atsmaller separations too. In order to avoid biasing our results, weexplicitly include this truncation in our modelling. Given the boxsizes, the actual effective small − k cutoff is at k min = π / L , or ∼ . h − Mpc , ∼ . h − Mpc and ∼ . h − Mpc in the cases ofI

LLUSTRIS

TNG, M

ASSIVE B LACK -II and I

LLUSTRIS -1 respec-tively. We assess the impact of this detail by repeating our ﬁts withﬁxed k min = . h − Mpc for the three simulations. The resultingbiases, arising from ignoring the small- k cut off, is potentially quitesigniﬁcant ( ∼ ) in both the galaxy bias and IA parameters.Our ﬁducial analysis includes physical scales in the range < r p < L / h − Mpc , where L is the length of the simulationbox. Unlike in real survey data, an upper cut is necessary to avoidedge effects due to the ﬁnite simulation size. The lower cut fol-lows several other studies (Joachimi et al. 2011; Singh et al. 2015;Johnston et al. 2019), and is intended to be conservative in remov-ing data affected by nonlinear bias. We explicitly test this choice inSection 6.1. We consider two different IA scenarios in our ﬁts, discussed inmore detail below. While it is useful to think of these as entirelyseparate models, and indeed we will refer to them as such, it isworth bearing in mind that they are nested. That is, the more com-plex model reverts to the simpler one when a subset of its param-eters are zero. For reference, the free parameters in each of thesemodels and the associated priors in each case are shown in Table 3.

One common predictive IA model is the Nonlinear Alignment(NLA) model; in essence, it is an empirically motivated modiﬁ-cation (see Bridle & King 2007, Hirata et al. 2007) to a physicallymotivated (at least partially, in certain regimes) prescription knownas the Linear Alignment (LA) model (Catelan et al. 2001; Hirata &Seljak 2004; Hirata & Seljak 2010; Blazek et al. 2011). Under theassumption of linear alignments, one can write the intrinsic shape https://bitbucket.org/joezuntz/cosmosis , v1.6; masterbranch live points = , tolerance = . , num repeats = Model Parameter PriorNLA A U [− , ] b g U [ . , ] TATT A U [− , ] A U [− , ] b TA U [− , ] b g U [ . , ] Table 3.

Free parameters for the IA model ﬁts implemented in this work.All ﬁts are performed on a single snapshot, with the same priors appliedirrespective of redshift. Note that the linear galaxy bias b g is not an IAparameter (i.e. it does not enter either the GI or II power spectra), but it isincluded in the modelling and so is shown here. The choice of priors hereis designed to be conservative, and well clear of the posterior edges. Wediscuss the possible impact of this choice, and demonstrate robustness toprojection effects in Section 5. of a galaxy in terms of the background gravitational potential at thetime of galaxy formation as: ( e I + , e I × ) = − ¯ C πG ( ∂ ∂x − ∂ ∂y , ∂ ∂x∂y ) φ ( χ ∗ ) , (18)where ¯ C is a normalisation constant, typically ﬁxed at a value of × − M − ⊙ h − Mpc (Brown et al. 2002). Following Hirata &Seljak (2004), the GI and II power spectra have the form: P GI ( k ) = − ¯ C ¯ ρ ( z ) D ( z ) a ( z ) P lin δ ( k ) (19)and P II ( k ) = ( ¯ C ¯ ρ ( z ) D ( z ) ) a ( z ) P lin δ ( k ) . (20)Here ¯ ρ is the (spatially averaged) mean matter density of the Uni-verse and D is the linear growth function. The model also predictshigher order contributions, as well as non-zero B modes arisingfrom galaxy clustering, though these are typically neglected in im-plementations of the NLA model (Hirata & Seljak 2004, Blazeket al. 2015; see the next section for further discussion). We fol-low many previous analyses in ﬁxing ¯ C to Brown et al. (2002)’svalue, and parameterising deviations in strength of alignment fromthis baseline with a free amplitude, such that P GI → A P GI and P II → A P II .The feature that deﬁnes the NLA is the substitution of the lin-ear power spectrum in Eq. (19) and (20) for the nonlinear version.The rationale for this change is as an attempt to capture the non-linear tidal ﬁeld, and indeed it does appears to improve the perfor-mance on small to intermediate scales (see, for example Bridle &King 2007; Blazek et al. 2015; Singh et al. 2015), even if it is notnecessarily internally consistent. Our second IA model, referred to as the Tidal Alignment + TidalTorque (TATT) model, was ﬁrst proposed by Blazek et al. (2019)and has been employed a number of times in the context of cosmicshear analyses in the recent past (see Troxel et al. 2018; Samuroffet al. 2019). We will provide a brief overview of the theory, andrefer the reader to those papers a more detailed description.

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations In this framework, a galaxy’s intrinsic shape is written as anexpansion in the trace-free tidal ﬁeld tensor s ij : γ Iij = C s ij † Tidal Alignment + C δ ( δ × s ij )·„„„„„„„„„„„„„„„„„„„„„„„„„„„„‚„„„„„„„„„„„„„„„„„„„„„„„„„„„¶ Density Weighting + C [ ∑ k = s ik s kj − δ ij s ]·„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„‚„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„„¶ Tidal Torquing + . . . , (21)with both sides of the equation evaluated at a position x , which maybe either a Lagrangian or an Eulerian position. The two amplitudes C and C describe the magnitude of alignment due to tidal align-ment and tidal torquing respectively. It is worth bearing in mind,however, that these terms can absorb IAs due to other mechanismswhen ﬁt to real data; for example, an effective non-zero C canin principle arise in a pure tidal torquing IA scenario, when com-bined with nonlinear growth (Hui & Zhang 2002). The term, withthe coefﬁcient C δ , is a so-called density weighting contribution,and arises from the fact that one can only measure galaxy shapes ina position where there is actually a galaxy (see e.g. Hirata & Seljak2004 and Blazek et al. 2015 for further discussion). Also note thatthe product of the matter overdensity and tidal ﬁelds δs ij implicitlyassumes a smoothing scale, a detail we will return to in Section 7.The real-space dark matter tidal tensor is a × matrix, deﬁned inrelation to the overdensity ﬁeld in Equation (12). If the tidal ten-sor is computed using the nonlinear matter ﬁeld, then the leadingterm in Eq. (21) is equivalent to the NLA prediction. If the TATTmodel parameters are varied together, however, they can enter thedata in potentially degenerate ways, meaning that the A part of thefull TATT space will not necessarily match the NLA ﬁt to the samedata, if A ≠ is preferred. One then has: C = − A ¯ C Ω m ρ crit D ( z ) , (22) C = A ¯ C Ω m ρ crit D ( z ) . (23)and C δ = − A δ ¯ C Ω m ρ crit D ( z ) , (24)The constant ¯ C is the same as the one discussed in the previoussection. The IA power spectra (GI and II) are derived from pertur-bation theory and are given by integrals over the matter power spec-trum; for details see Sections A-C of Blazek et al. (2019). Our ver-sion of the TATT model is identical to that of Troxel et al. (2018),Blazek et al. (2019) and Samuroff et al. (2019). It makes use ofthe FAST-PT code (McEwen et al. 2016; Fang et al. 2017), and isimplemented within C OSMO

SIS.Following Blazek et al. (2015), we do not vary A δ directly,but rather assume the density weighting term is related to the tidalalignment amplitude via a coefﬁcient (i.e. C δ = b TA C ). The orig-inal motivation for this parameterization was that IA correlationsscaling with δs ij were generated by the density weighting of theIA ﬁeld, which can only be observed where galaxies are located The intrinsic shape here is deﬁned in an analogous way to the projectedellipticity; it is the trace-free component of the moment matrix in three di-mensions, or equivalently, the eigenvector matrix of the 3D inertia tensor.As noted in Blazek et al. (2019), it is not a uniquely deﬁned quantity, anddepends on the radial weighting of the measurement algorithm. (see Blazek et al. 2015 for a more detailed discussion). As with theother terms, C δ can be thought of more generally as describingany alignment physics with large-scale correlations that depend on δs ij , and so does not necessarily correspond directly to the galaxybias constrained by w gg , as per the simple density weighting pic-ture. Indeed, in a linear and “local Lagrangian” picture of IA for-mation, in which intrinsic galaxy shapes are a linear function of thelocal tidal ﬁeld initially present where the halo (and galaxy) form,a C δ ∼ C term will be generated by the advection of galaxiesbetween the Lagrangian and Eulerian frames Schmitz et al. (2018).Given the potential for other physical effects to be captured by thesame term, it is safest to allow it to vary as a free parameter overa similar range to the other IA parameters (see Table 3). Previousstudies have chosen to ﬁx it to unity (Troxel et al. 2018, Samuroffet al. 2019, Blazek et al. 2019), based on physical arguments. Inthese cases, however, the density weighting term has been subdom-inant, allowing only very broad constraints on b TA ; Samuroff et al.(2019), show that the decision to ﬁx it was not a signiﬁcant sourceof uncertainty in the context of DES Y1 × pt cosmology. This islikely to be less true for our direct IA measurements.Finally, we note that the TATT model predicts a non-zero IA-induced B-mode term, which enters the II power spectrum, andis sensitive to C δ and C (see Blazek et al. 2019, eq. 37-39).These contributions are included in our modelling of w ++ . Again,we demonstrated in Samuroff et al. (2019) (Appendix C) that thischoice has negligible impact on parameter constraints in the con-text of a DES Y1 × pt analysis. This is not trivially true for thetype of measurement considered in this work, and so we includethe extra B-mode terms when ﬁtting the TATT model here. Given an IA power spectrum from either of the models described,one can predict the projected correlation functions at ﬁxed redshiftvia Hankel transforms. Under the Limber approximation one has: w z s g + ( r p ) = − b z s g ∫ d kk π J ( kr p ) P GI ( k, z = z s ) , (25)with the z s indicating a particular redshift (snapshot), and J beinga second order Bessel function of the ﬁrst kind. We assume lineargalaxy bias, b g ≡ δ g / δ , which is marginalised with a wide prior(Table 3). The range b g = [ . , ] is intended to be conservative,and the bias is always well constrained within these bounds. Animportant thing to note here, however, is that in a high dimensionalparameter space typical of cosmological analyses such wide priorscan cause shifts in the 2D constraints via projection effects (see e.g.Joachimi et al. 2020, Secco et al. 2020 for discussion); in our rela-tively simple setup we do not expect this to be an issue. We verifythis in our ﬁducial I LLUSTRIS

TNG TATT analysis by reducing the b g prior width to [ . , ] , and conﬁrm it does not alter our results.A similar exercise, halving the volume of the prior on the less wellconstrained b TA again has no signiﬁcant impact.In real data one would also need to evaluate an integral over aredshift kernel, deﬁned by the sample’s redshift distribution (Man-delbaum et al. 2011’s Appendix A); in our case this reduces to eval-uating P GI ( k ) at a particular redshift z s . The other two-point cor-relations follow by analogy as: w z s gg ( r p ) = b z s g b z s g ∫ d kk π J ( kr p ) P δ ( k, z = z s ) , (26)and MNRAS , 000–000 (0000) Samuroff et al w z s ++ ( r p ) = ∫ d kk π [ J ( kr p ) + J ( kr p )] P II ( k, z = z s ) . (27)In the case where we are including an IA induced B-mode contri-bution, the above becomes a sum of two integrals (see e.g. Blazeket al. 2015, equation 2.8): w z s ++ ( r p ) = w z s , EE ++ ( r p ) + w z s , BB ++ ( r p )= ∫ d kk π [ J ( kr p ) + J ( kr p )] P EEII ( k, z = z s ) +∫ d kk π [ J ( kr p ) − J ( kr p )] P BBII ( k, z = z s ) . (28)The TATT E and B mode power spectra here are given by Blazeket al. (2019)’s equations (38) and (39). In the NLA case P BBII = ,and Eq. (28) reduces to Eq. (27).Although we do not compute the 3D correlations ξ ab ( r p , Π ) ,we do factor in the fact that the line of sight integral in the mea-surement has a ﬁnite limit Π max (e.g. Eq. (10)). The effect of thisis to suppress the signal slightly, as correlated pairs are cut off.We can test the magnitude of this by comparing our observablesat a ﬁducial point in parameter space with an external modellingcode, which explicitly includes Π max . Since the impact is foundto be independent of r p on large scales, to the level of ∼ . ,we incorporate it into our modelling as a single multiplicative fac-tor µ , which we compute for each correlation function, at eachredshift (i.e. 12 numbers per simulation). In the case of I LLUS - TRIS

TNG, Π max is relatively large ( h − Mpc ), and so the signaldamping is only − (which is comfortably subdominant to un-certainty). For M ASSIVE B LACK -II and I

LLUSTRIS -1 ( Π max = and Π max = h − Mpc respectively), however, µ is somewhatlarger, which shifts the IA parameters upwards slightly. Althoughour qualitative conclusions are robust even without this correction,omitting it is seen to bias the A and A towards low values by ∼ − . As discussed, our baseline methodology is to ﬁt the joint data vec-tor of w gg , w g + and w ++ simultaneously for a given simulation andat a given redshift. In this section we present the results of theselikelihood analyses. This approach is analogous to cosmologicalinference using × data, with the signiﬁcant difference that ourparameter space is several times smaller (and does not include cos-mological parameters). It carries a number of advantages, not leastbeneﬁtting from some level of complementarity in the degeneraciesof the different data vector elements.We perform our IA model ﬁts to each of the four redshiftsnapshots independently, a choice primarily driven by the covari-ance matrix; unlike in real data, where each galaxy can be assigned(albeit not necessarily correctly) to a single tomographic bin, herewe effectively have one realisation of the galaxy ﬁeld, which isevolved with redshift. The galaxy population, the shape noise andthe cosmic variance are, then, potentially heavily correlated be-tween redshifts, which makes a fully simultaneous analysis com-plicated. Modelling such correlations is non trivial, and not consid-ered a valuable exercise within the scope of this paper.Despite their potential, concerns persist around the accuracyof hydrodynamic simulations as an effective model for intrinsic alignments; systematic uncertainties arise largely from the under-lying physics models, and are evidenced by longstanding disagree-ments between different simulations. Discussion of such differ-ences in the literature have focused on the impact of baryons onthe matter power spectrum (see van Daalen et al. 2011, Chisari et al.2018, Huang et al. 2019); discrepancies in the magnitude (and sign)of alignments have been noted (Chisari et al. 2015; Codis et al.2015b; Tenneti et al. 2016; Chisari et al. 2016), but these have per-haps received less attention due to the fact that, unlike the baryoniceffects, IA measurements in these simulations do not feed directlyinto cosmological analyses (although they could do, potentially, infuture). To properly diagnose this systematic uncertainty it is use-ful to compare the results from multiple simulations using a uni-ﬁed analysis framework, and appropriately weighted samples, aswe seek to do in this section. As discussed above, the reweightingis designed only to match the halo mass distributions, and not to ﬁxother differences in, for example, the galaxy formation properties;we consider these more complex differences as sources of system-atic uncertainty. Indeed, it is interesting to try to disentangle themfrom discrepancies due to differences in the analysis details (e.g.the galaxy selection method) of previous studies. The posteriors from NLA model ﬁts to the various simulations at z = are shown in the upper panel of Figure 8. As described inSection 3.2, the M ASSIVE B LACK -II and I

LLUSTRIS -1 samplesare reweighted, such that the halo mass distributions match (seealso the discussion in Appendix C, where we demonstrate the im-portance of this reweighting). This process is designed to allowmeaningful comparison between simulations by ensuring that dif-ferences in halo mass distribution are not driving the offset in theIA-bias parameter space. As noted above, the halo mass weightingis not guaranteed to eliminate all differences due to sample compo-sition arising from how the galaxy halo connection is implementedin the simulations For clarity, we do not show the three other snap-shots at z > , but note that very similar qualitative trends are seenout to z = .Noticeably, the galaxy bias (horizontal axis, upper panel)agrees well between the different simulations; given the relativelytight relation between halo mass and large scale bias (modulo cos-mological parameter-dependence), this is perhaps unsurprising. Al-though not shown for TATT, the marginalised posterior on bias isclose to independent of the choice IA parameterisation, primarilybecause w gg dominates the constraint. The relative agreement be-tween the detected NLA signal in the different simulations hereis interesting, in the context of existing literature. It has been ob-served anecdotally (Tenneti et al. 2016, Chisari et al. 2016) thatM ASSIVE B LACK -II tends to prefer a slightly stronger IA ampli-tude than I

LLUSTRIS -1. This conclusion is supported at some levelhere; in the NLA case M

ASSIVE B LACK -II favours slightly larger A values than either I LLUSTRIS

TNG, although the difference isless than σ at any given redshift. The difference is more pro-nounced in the TATT scenario (lower panel Figure 8 and also Fig-ure 9 below), although still only at the level of − σ . It is alsoworth remarking that this is the ﬁrst time a robust comparison hasbeen attempted with a homogenised sample, using w g + and w ++ simultaneously, and with an analytic covariance matrix that is nu-merically stable on large scales.The joint posteriors on the TATT model amplitudes A , A are shown in the lower panel of Figure 8 (see also Appendix B forthe full TATT posteriors from the three simulations). These am- MNRAS000

A Constraints from Hydrodynamic Simulations . . . . . . . b g . . . . . . A NLA

IllustrisTNGMassiveBlack-IIIllustris − − − A − − A TATT

IllustrisTNGMassiveBlack-IIIllustris

Figure 8. σ and σ conﬁdence contours from our NLA (top) and TATT(bottom) model ﬁts to various hydrodynamic simulations at z = . Shownare I LLUSTRIS

TNG (purple, solid), I

LLUSTRIS -1 (green, dotted) andM

ASSIVE B LACK -II (blue, dashed). Note that the M

ASSIVE B LACK -II andI

LLUSTRIS -1 samples here are weighted, such that the distributions of hosthalo masses match between the simulations, in order to allow meaning-ful comparison with I

LLUSTRIS

TNG (see Section 3.2). The three horizon-tal lines in the NLA panel show the A values of the respective posteriorpeaks; these best-ﬁtting values are A TNG1 = . , A MBII1 = . and A Ill1 = . . The three simulations are consistent in the NLA space to ∼ σ ,although some mild disagreement is seen in the case of the more complexmodel. plitudes can be thought of as controlling the strength of differentIA contributions, which are linear and quadratic in the tidal ﬁeldrespectively. Note that the TATT ﬁts also include additional pa-rameters ( b TA and linear galaxy bias b g ), which are marginalisedin this 2D representation (see Sec 4.1.2 and Table 3). In this lim-ited parameter space we do not believe our marginalised resultsto be signiﬁcantly affected by prior volume effects (e.g. the dis-cussion in Joachimi et al. 2020). We conﬁrm that rerunning theTATT chains with a reduced b TA prior U [ , ] does not qualita-tively change the TATT posteriors. In the case of M ASSIVE B LACK -II and I

LLUSTRIS -1, the constraint is degraded relative to I

LLUS - TRIS

TNG, to the extent that quite different TATT IA scenarios areallowed within σ . That I LLUSTRIS -1 offers little-to-no constrainton the extended model is unsurprising; indeed we are ﬁtting a smallhandful of relatively noisy points in the > h − Mpc range, which provide no real information on the shape of the correlation func-tion. Unlike in the NLA case, we now see some level of disagree-ment between the different simulations; that is, whereas I

LLUS - TRIS

TNG favours a region of parameter space that resembles NLA(i.e. A ∼ ), M ASSIVE B LACK -II prefers A < at σ . Whilethis could be a sign of a real alignment signal, generated by thephysics models of M ASSIVE B LACK -II, it is worth being cautioushere; the TATT model will respond to any structure in the data,regardless of physical origin, and M

ASSIVE B LACK -II has knownlimitations . Inspecting the data vector (Figure 5) more closely, itseems that the A < is driven by the gradual rise in power be-tween − h − Mpc . This feature is seen in both w g + and w ++ , andit does indeed seem to be relatively well ﬁt by the quadratic align-ment contribution. It is also notable that there is no correspondingfeature at around the same scale in w gg , which is somewhat reas-suring that this is a real signal, and not an artifact of the simulations.In all cases we note that the data favour low values of b TA ,albeit with relatively large uncertainties. The region of parameterspace where one could reasonably interpret the TATT tidal align-ment bias as a pure physical galaxy bias are disfavoured at ∼ σ ,with b TNGTA = . ± . , b MBIITA = . ± . . Interestingly, in theupper redshift bins M ASSIVE B LACK -II prefers a weakly negative b TA (Table B1), the physical interpretation of which is not imme-diately clear. Given the sample selection, and the limitations of thesimulations, it is not obvious that the low b TA values transfer toreal lensing data, but it is interesting, in the sense that the data are(mostly) showing a preference for the simpler IA scenario.From the I LLUSTRIS

TNG ﬁts, the ﬁnal posterior mean TATTparameter values at z = are: A TNG1 = . ± . , A TNG2 = . ± . , b TNGTA = . ± . . (29)The A constraint here is consistent with the equivalent NLA am-plitude from the two-parameter ﬁts ( A = . ± . ), a conclusionthat largely holds across the three simulations. That is, switching toTATT leads to a degradation in the uncertainty on A (by roughly for I LLUSTRIS

TNG at z = ), but no signiﬁcant shift in thefavoured value. Remarkably, although M ASSIVE B LACK -II favoursnegative A = − . ± . at the level of ∼ − σ , I LLUSTRIS -1 and I

LLUSTRIS

TNG are consistent with zero across the redshiftrange. The small A values differ slightly from recent studies onDES data (Troxel et al. 2018, Samuroff et al. 2019), which report apreference for A < (although our I LLUSTRIS

TNG constraint isstill at most ∼ σ from the DES Y1 mixed sample; Samuroff et al.2019 Figure 12). Note however that in such analyses on photomet-ric data like the studies cited above, where the two-point functionsare measured in broad redshift bins as a function of angular scale,a signiﬁcant amount of mode-mixing can occur. That is, one can-not cleanly separate physical scales. In addition to this, it is worthbearing in mind that no analysis on real data can ever be perfect;despite various robustness tests and validation carried out for DESY1, we cannot altogether rule out the leakage of other modellingerrors (e.g. in the photometric redshift distributions) into the IA In particular, there is a lack of realistic spiral type galaxies, and a rela-tive over-abundance of diffuse elliptical objects compared with data. Dueto relatively weak AGN feedback, M

ASSIVE B LACK -II produces an over-predicts the number of massive galaxies at low redshift (Khandai et al.2015). Another manifestation of this is seen in the ipact of baryons on thenonlinear matter power spectrum, which is signiﬁcantly different from thatin any other hydrodynamic simulation (Huang et al. 2019’s Figure 1).MNRAS , 000–000 (0000) Samuroff et al . . . . . . Redshift − − A i TNG ( A )MBII ( A )Illustris ( A ) TNG ( A )MBII ( A )Illustris ( A ) TNG ( A , NLA)MBII ( A , NLA)Illustris ( A , NLA)

Figure 9.

The redshift evolution of the two TATT model amplitudes in ourthree simulated datasets. The ﬁts were performed on each redshift slice in-dependently. The simpler NLA ﬁts are also shown for reference (stars). Theresults from M

ASSIVE B LACK -II (dark blue), I

LLUSTRIS -1 (green) and I L - LUSTRIS

TNG (purple) are shown. Note that the galaxy samples for a givensimulation at different redshifts strongly overlap, and so the errors are po-tentially highly correlated, to an extent not reﬂected in the σ error bars inthis ﬁgure. constraints. For these, amongst other, reasons that it is not trivialto extrapolate from our results to comment on the detectability ofhigher order IA contributions in real data.The lack of a clear detection of higher order alignment termsis not altogether surprising, given the relatively conservative scalecuts implemented here ( r p > h − Mpc ; see also Section 6.1).Given the difference in physical scaling, naturally the alignmentof galaxies on very large scales should resemble the tidal align-ment scenario ( A > , A = ). Although we do not have a strongﬁrst-principles prediction of the scales on which the quadratic termsshould become signiﬁcant, we can make a rough estimate. Basedon theory predictions, in scenarios that are consistent with previ-ous observations (Samuroff et al. 2019), the regime where the tidaltorquing terms are not totally subdominant to tidal alignment issomewhere on the scale of a few h − Mpc (see Blazek et al. 2015,Blazek et al. 2019). This places our ﬁts in the marginal regime,where it is possible, but not certain that we might detect a non-NLA-like alignment signal.

To illustrate the redshift evolution of the various IA parameters, weshow the marginalised best-ﬁts and σ uncertainties in Figure 9.As before, we show all three simulations in purple/blue/green. Itis worth keeping in mind here that there is signiﬁcant overlap be-tween samples at different redshifts, meaning the shape noise is po-tentially quite strongly correlated. The interpretation of the varioustrends shown in this ﬁgure, then, are not trivial. That said, the basicpatterns noted above are seen to hold across the redshift range. Thatis, with the partial, weak, exception of M ASSIVE B LACK -II, the A values obtained in the NLA and the TATT analyses are consistentwith each other for a given simulation (compare the stars with thetriangles in Figure 9). The TA alignment amplitude rises more orless monotonically in I LLUSTRIS

TNG and M

ASSIVE B LACK -II, and the two simulations agree well in the NLA case at all redshifts.With the extra freedom of the TATT model, however, we see somelevel of divergence, with M

ASSIVE B LACK -II favouring a higher A by a factor of ∼ . − . (although the upwards trend withredshift persists). This seems to ﬁt with an underlying assumptionof the linear alignment model: that IAs are frozen into a popula-tion of galaxies at early times (see Kiessling et al. 2015, Schmitzet al. 2018 and Kirk et al. 2012, particularly their App. A and refer-ences therein). As the underlying large scale structure evolves andhalos grow, the subhalo mass distribution shifts upwards. In ourcase, then, the ﬁxed stellar mass cut is more stringent, and removesa larger fraction of weakly aligned objects at high redshift than atlow redshift. The net effect of this is an increase in the measuredIA signal with increasing z . Though physically interesting, we re-iterate that the ﬂat lower mass cut at each redshift is not represen-tative of the selection function in a real lensing catalogue. In realdata with realistic ﬂux- and shape-based quality cuts, the changesin composition with redshift will have a signiﬁcant bearing on howthe effective IA amplitude evolves. A step in this direction (albeitstill not capturing the full complexity of a redshift dependent se-lection in real data) would be to use the simulation merger tree topropagate through a mass cut at given redshift. Bhowmick et al.(2020) attempt such an exercise for M ASSIVE B LACK -II, with re-sults that are qualitatively consistent with the present study. Al-though in the consistently traced sample (SAMPLE-TREE in theirterminology) increasing halo-satellite misalignment tends to washout alignments at high z , the impact of the changing populationopposes, and largely outweighs this trend.In contrast, A is more or less constant with redshift inall simulations (the downward triangles in Figure 9). NotablyM ASSIVE B LACK -II’s preference for A < is not seen to persistacross snapshots, although the interpretation of this is non-trivial.Particularly in the higher redshift slices, the M ASSIVE B LACK -II posteriors exhibit signiﬁcant bimodality, which appears to arisefrom a degeneracy between A and b TA . Although positive andnegative A result in quite different w g + predictions, all other pa-rameters held ﬁxed, the combination b TA ∼ , A ∼ − . and b TA ∼ − , A ∼ . both produce theory curves that ﬁt the z = data adequately on scales r p > (see Figure B2). The theory pre-dictions differ somewhat on smaller scales, suggesting that push-ing below our ﬁducial scale cut could potentially help to break thisdegeneracy. This distorts the 1D point representation in Figure 9,shifting the mean towards zero, and also broadening the σ stan-dard deviation signiﬁcantly. Beyond simple posterior constraints, one can also gauge the abil-ity of the data to support the extended modelling in a quantitativeway. A number of goodness of ﬁt metrics exist in the literature,and we consider a subset of those here. Since I

LLUSTRIS -1 is rel-atively unconstraining, and is known to have ﬂaws (in the sensethat it over-predicts the strength of baryonic feedback, which isknown to interact with IAs; Soussana et al. 2020), we comparethe results using I

LLUSTRIS

TNG and M

ASSIVE B LACK -II only.The simplest metric is the raw shift in χ when switching be-tween models (see, for example, Krause et al. 2016); in the I L - LUSTRIS

TNG case, that is ∆ χ = − . , marginally favouringthe extended model, with similar values obtained at higher red-shift. A somewhat stronger preference is seen in M ASSIVE B LACK -II, which gives ∆ χ = − . . One slightly more sophisti-cated indicator of model ﬁt is the Bayesian Information Criterion MNRAS000

A Constraints from Hydrodynamic Simulations (BIC; Arevalo et al. 2017), which effectively balances reducing thetheory-data residuals against the extra complexity of the model. ForI LLUSTRIS

TNG, ∆BIC = . , which translates into a “positive”preference for NLA. That is, by this indicator, the data do not war-rent the additional parameters. In contrast, the M ASSIVE B LACK -II data, which we recall showed a preference for non-zero A ,gives ∆BIC = − . , this time in favour of the TATT model. Con-sidering ﬁnally the Bayes factor (Marshall et al. 2006), we see asimilar picture: B = Z TATT / Z NLA = . for I LLUSTRIS

TNG,which indicates that the data favour the simpler model (or rather,the extra TATT parameters do not provide a sufﬁciently better ﬁtto outweigh the added model complexity). Again, in the case ofM

ASSIVE B LACK -II, the results are slightly clearer, with B = . ,which (just) falls into the category of “substantial” evidence on theJeffreys Scale. In summary, these numerical exercises bear out thequalitative picture we saw earlier; while I LLUSTRIS

TNG, on therelatively large scales considered, shows no evidence that the NLAmodel is insufﬁcient, M

ASSIVE B LACK -II does show hints.A different, but related, question one could ask is: given ourresults, and assuming a particular underlying model, to what ex-tent can we say that there is disagreement between the simulations?Do the hints at non-zero TATT parameters in M

ASSIVE B LACK -II point to systematic tension between the underlying physicalalignment models, or are they in fact consistent with realisa-tions of the same model? We reiterate here that the samples areweighted, such that differences in the underlying halo mass distri-bution should not be responsible for any differences between thesimulations. Again, there are a number of metrics available, suitedto different scenarios with different caveats (see Campos & Lemoset al., 2020 for discussion), and we will not attempt a comprehen-sive comparison. For our purposes, we adopt a slightly differentform of the Bayes ratio (see Eq. V.3, Dark Energy Survey Collabo-ration 2018), R = p ( D TNG , D MBII ∣ M IA ) p ( D TNG ∣ M IA ) p ( D MBII ∣ M IA ) . (30)The numerator here is the Bayesian evidence obtained from jointlyanalysing the two-point data from the two simulations. The lowerterms are those from the separate analyses of I LLUSTRIS

TNG andM

ASSIVE B LACK -II in isolation. Note that in the joint analysis, weassume the two data sets are independent, with no cross covariance.In the TATT case, we ﬁnd R < . , which constitutes strong evi-dence for tension on the Jeffreys Scale. Again, this is implied bythe differences we saw in the marginalised credibility contours, butit is interesting that it is borne out by the numerical metric. In this section we discuss a series of modiﬁcations to our base-line analysis, with the aim of exploring the basic results above inmore depth. This includes a series of analyses with less stringentcuts, probing scales down to h − Mpc . We also examine the de-pendence of the signal on various physical properties, includingcolour, type (central or satellite) and luminosity.

As we have seen in Section 5, our ﬁts to the large scale I

LLUS - TRIS

TNG correlation functions are consistent with the NLA sce-nario (i.e. pure tidal alignment). While there is a detectable IA sig- nal, the parameters controlling deviations from NLA are consis-tent with zero. At least in principle, however, there exists a regimewhere the higher-order corrections are signiﬁcant (and thus neces-sary to model the data adequately), but one halo contributions arestill subdominant (see Blazek et al. 2019’s Fig. 1). It is this that mo-tivates us to extend our ﬁts below the ﬁducial cut off at h − Mpc .The ﬁducial cut follows Joachimi et al. (2011) and, as dis-cussed there, is conservative by design, intended to be well clearof the scales on which nonlinear bias enters the data. The precisescales on which the linear approximation breaks down is, however,somewhat dependent on the galaxy selection, as well as the sta-tistical precision of the measurement. One beneﬁt of using sim-ulated data, however, is that we have access to the dark matterﬁeld directly; it is, then, possible to check where exactly nonlin-ear galaxy bias begins to manifest in our particular measurements.A longer discussion can be found in Appendix D, but in brief weestimate the effective scale-dependent bias as a function of r p asthe ratio b g = ( w gg / w δδ ) . Based on this exercise, within I LLUS - TRIS

TNG’s statistical uncertainties, we see that the linear bias as-sumption holds well down to ∼ h − Mpc . Motivated by this ﬁnd-ing, we repeat our ﬁducial analysis, sequentially relaxing the lowerscale cut down to r p > h − Mpc . The results can be found in Fig-ure 10 (see also Table 4).As we can see in Figure 10, all the way down to h − Mpc ,the higher-order TATT model parameters favoured by the I

LLUS - TRIS

TNG data are consistent with zero. This includes the densityweighting term b TA (not shown), as well as the quadratic ampli-tude A . Although there appears to be information on the smallerscales, evidenced by the reduction in the size of the posteriors andthe slight change in the degeneracy direction, there is no clear signof deviations from NLA. The added constraining power is partic-ularly clear in the case of the A amplitude, although we also seea modest tightening of the uncertainties on A and b TA about theircentral values. It seems reasonable to draw from this that althoughwe have physical reason to think that the additional TATT contri-butions exist in the Universe, they are small enough on the scaleswe use to be undetectable, given the statistical precision of I LLUS - TRIS

TNG. The higher order terms scale rapidly with r p , and soit is quite possible that they dominate in a similar regime to non-linear galaxy bias. This is also consistent with the conclusions onemight draw from naively looking at the data vectors in Figure 5;the purple points are reasonably ﬁtted by the purple lines (the bestﬁtting NLA model), even down to scales ∼ h − Mpc . This is trueof both w g + and w ++ and, while deviations do exist, they are atslightly smaller scales. Fitting IAs on even smaller scales, wherenonlinear bias becomes non-negligible, is possible, given that per-turbation theory predicts higher order bias contributions in muchthe same way as the higher order IA contributions in TATT. It is,however, complicated by the presence of nonlinear bias - nonlin-ear IA cross terms, which we cannot safely assume are negligible.Although we do not attempt such an analysis here, implementinga consistent perturbative model, including the cross terms, is thefocus of ongoing work.We perform a similar exercise with M ASSIVE B LACK -II, ﬁt-ting the z = correlation functions down to h − Mpc . Again,the constraints tighten signiﬁcantly; now, however, the contoursshift in the negative A , positive A direction ( A = . ± . , A = − . ± . , b TA = − . ± . ). MNRAS , 000–000 (0000) Samuroff et al

Cut / h − Mpc

Model N pts A A b TA b g r p > NLA 14 . ± .

17 0 . . . ± . r p > TATT 14 . ± .

49 0 . ± .

65 0 . ± .

86 1 . ± . r p > TATT 17 . ± .

49 0 . ± .

47 0 . ± .

86 1 . ± . r p > TATT 20 . ± .

44 0 . ± .

36 0 . ± .

68 1 . ± . r p > TATT 23 . ± . − . ± .

28 0 . ± .

38 0 . ± . Table 4.

Quality metrics for TATT model ﬁts to I

LLUSTRIS

TNG at z = . The second column, labelled N pts indicates the total number of points included inthe joint ﬁt to w gg , w g + and w ++ , after scale cuts. r p > h − Mpc r p > h − Mpc r p > h − Mpc − A . . . . . A − b T A − A − b TA Figure 10.

TATT parameter constraints from our z = I LLUS - TRIS

TNG sample with a selection of lower scale cuts (as labelled). Thethree analyses favour approximately the same A , with slightly varying pre-cision. Even in the case of the least stringent cuts, the results are consistentwith A = . In this section we impose a series of catalogue level splits, withthe aim of understanding how our results depend on galaxy prop-erties. For two main reasons, we only consider the ﬁducial I

LLUS - TRIS

TNG catalogues in this section. First, the larger volume allowssome leeway, such that sub-divisions can be made without degrad-ing the constraining power beyond the point of usefulness. Second,and more importantly, only in I

LLUSTRIS

TNG do we have sufﬁ-ciently realistic galaxy photometry (see Section 2.4.3). Althoughsome of the properties considered here are correlated, we seek todisentangle the impact of each insofar as we can. For each of thecases discussed below, the new data vectors are recomputed usingthe same pipeline as before. For each subsample, we also repeatthe iterative covariance matrix calculation discussed in Section 3.4with the appropriate galaxy densities and ellipticity dispersions.

The ﬁrst split we examine is in colour-magnitude space. The abilityto perform a colour cut, and retain a signiﬁcant number of red andblue objects, is a marked difference between this work and previousdirect IA measurements on real data, which have focused on bright − − A i Red Galaxies .

00 0 .

25 0 .

50 0 .

75 1 . Redshift z − − A i Blue Galaxies

NLATATT ( A )TATT ( A ) Figure 11.

Best ﬁtting IA model parameters as a function of redshift forour colour split I

LLUSTRIS

TNG samples. Note that two TATT model am-plitudes are ﬁt simultaneously for each of the two samples. The pink pointsin the upper panel are measurements of the NLA model amplitude in redgalaxies from the literature. Speciﬁcally, we show SDSS Main Sample( z = . ; Johnston et al. 2019), BOSS LOWZ ( z = . ; Singh et al.2015), GAMA red sample ( z = . and z = . ; Johnston et al. 2019)and MegaZ ( z = . ; Joachimi et al. 2011). Similarly, the light blue pointsin the lower panel represent published blue galaxy constraints: SDSS MainSample (at z = . ; Johnston et al. 2019), GAMA Z2B ( z = . ; John-ston et al. 2019), and WiggleZ ( z = . ; Mandelbaum et al. 2011). red samples at low redshift. We recompute the correlation functionsand covariance matrices for the red and blue subsamples describedin Section 2.4.3. As in all of our large scale ﬁts, the full unsplitcatalogue is used for the density part of the correlations. This givesus an analogous two new data vectors, D red = ( w RR ++ , w Rg + , w gg ) and D blue = ( w BB ++ , w Bg + , w gg ) , with the superscripts R and B denoting the red and the blue samples. Note that the density tracersample is not split, and so w gg here is the same in the two datavectors (and the same as that analysed in Section 5). We ﬁt both IAmodels using each data vector, with the results shown in Figure 11.For the sake of clarity and to aid comparison, rather show the fullparameter contours, we have condensed the IA amplitude parame-ters into 1D posterior means and 68% error bounds. While this isuseful for illustrative purposes, it can be reductive in cases wherethe posterior is non Gaussian, as we will discuss below. MNRAS000

A Constraints from Hydrodynamic Simulations As before, the single NLA amplitude approximately agreeswith the A amplitude from the TATT ﬁts in almost all cases; theexception to this is the high z red sample, which favours a combi-nation with nonzero TT contribution and a correspondingly lowerTA amplitude, although the signiﬁcance of A ≠ is still only ∼ − σ . Although the details of the redshift distribution and thesample selection make direct comparison non-trivial, it is interest-ing to note that this disagrees mildly with the ﬁndings of Samuroffet al. (2019), which are based on ﬁts to real cosmological lens-ing measurements from DES Y1, where positive values of A in ared source sample were disfavoured at the level of ∼ σ (see theirFigure 16). We also plot a number of previous direct IA measure-ments in Figure 11 (the pastel coloured points in both panels), fromBOSS LOWZ (Singh et al. 2015), KiDS, GAMA and SDSS (John-ston et al. 2019), MegaZ (Joachimi et al. 2011) and WiggleZ (Man-delbaum et al. 2011). Although red galaxy measurements are morenumerous, there are a handful of comparable studies on blue galax-ies. As one can see from Figure 11, our ﬁts on I LLUSTRIS

TNG arelargely consistent with the measurements on data. The only slightdeviation from this is WiggleZ, which is lower than our results atequivalent redshift (albeit only by ∼ σ ). It is, however, worth bear-ing in mind that WiggleZ is atypical in terms of sample, comprisinga bright starburst population, rather than a simple colour-selectedblue sample.A notable, and perhaps worrying, feature of Figure 11 isthe relatively strong IA signature in blue galaxies. The amplitudeof w Bg + , while signiﬁcantly lower w Rg + , is persistently non-zeroat z > . . To aid in understanding this observation, we repeatthe two-point measurements and NLA ﬁts on the upper redshiftsnapshot, with an additional mass cut, considering only galaxiesin the lower , M ∗ < . × h − M ⊙ (mean stellar mass M ∗ = . × h − M ⊙ ). Even here, we see non-zero alignmentsat several σ , A = . ± . . Although lower than both the blueand unsplit samples at z = , it is still a relatively strong signal.Remarkably, we ﬁnd that the high redshift blue IA feature persistsunder further mass splitting, down to M ∗ < . × h − M ⊙ ; at thispoint, there are only ∼ blue galaxies in the shape sample, suchthat although the measurement is consistent with null, the errorbarsstill encompass signiﬁcant non-zero values. Although interesting, itis not clear whether this is a function of the relatively stringent con-vergence cut ( M ∗ > . × h − M ⊙ ) , and if so how far down inmass the alignment signal continues. It is also not obvious whetherthis transfers to a signiﬁcant IA lensing contaminant +in a morerealistic setup; implementing a redshift-dependent selection func-tion, typical of real cosmic shear is a topic we will explore in futurework.Given that the role of a galaxy within its halo is a signiﬁ-cant factor in determining its alignment, we also repeat the high z blue measurements with an additional satellite/central split. Theresults here are less ambiguous: the residual blue galaxy signalis generated almost entirely by central galaxies. That is w g + , asmeasured using satellite galaxy shapes is consistent with zero onscales r p > h − Mpc . This result seems to support, at least in ourcase (which is simpliﬁed relative to real data in a number of ways),the ﬁndings of Johnston et al. (2019), which suggest colour aloneis an imperfect determinant of IA properties. Singh et al. (2015)also note similar, although consider only LRGs (that is, their re-sults were a statement on the relative homogeneity of IAs in redsequence galaxies of given luminosity, rather than on the efﬁcacyof colour based splits). In the absence of blue high z alignmentmeasurements in real data, it is difﬁcult to say whether this is a fault in the simulations, generating an artiﬁcially strong IA signalin blue centrals, or a real feature of the Universe. Although stellar mass is not, in general, an observable quantity it isan important one; this is true both in that IAs (and other galaxyproperties) signiﬁcantly depend on it, and that it is a proxy foractual observables. Indeed, this link is key to approaches such asHalo Occupation Distribution (HOD) modelling. We compute eachgalaxy’s stellar mass as an unweighted sum over the stellar particlesassigned to its subhalo. Unlike with colour and centrals/satellitesthere is no natural dividing line for this split, and so we choose todivide galaxies into equal number mass bins. For the moment wewill consider a simple upper/lower mass division, but will considera more complex binning in what follows. Again, the split is appliedto the shape sample only, leaving the density tracer intact (and so w gg unchanged).Though the satellite fraction is not systematically changed bythe division in any of the snapshots, we do see a shift in the abun-dence of red galaxies. That is, the red fraction of the high mass sam-ple is boosted relative to the full sample, from ∼ to ∼ at z = and from ∼ to at z = . This qualitative trend, thatthe red fraction increases with mass, and declines with redshift, isconsistent with the patterns seen in real data (see e.g. Prescott et al.2011). In the case of the NLA constraints we have a relatively sim-ple picture from the mass-split reanalysis; at a given redshift, highmass galaxies are both more biased, and more strongly aligned, asillustrated in Figure 12; the direction of the shift in bias-IA am-plitude parameter space when going from the high to low mass isroughly the same, irrespective of redshift. The redshift trend canperhaps be understood as follows: if we are to believe the basicLA model premise, then intrinsic alignments are imprinted at earlytimes, and persist into the low redshift universe. In this picture, atleast, high mass red galaxies at z = are strongly aligned, and sowe can extrapolate from this that the objects that become brightred high mass galaxies are also strongly aligned. In other words, asredshift increases, even if the mean subhalo mass declines (whichit does, in Table 2) the more massive, redder section of the galaxypopulation will be strongly aligned.What does stay ﬁxed, however, is the lower mass thresholdwe impose on our catalogues. As the whole mass distribution shiftsdownwards, then, we are preferentially cutting more of the lowerpart of the mass distribution, and so discarding a larger fractionof weakly aligned objects. The gradual evolution in IA and biasparameters covers the range from virtually unaligned low massgalaxies at z = to A ∼ . in the high mass high redshift bin.It is worth bearing in mind that, although physically interesting,this pattern does not trivially carry over into real data, because insuch cases other observational effects become relevant. The factthat we typically use ﬂux-limited galaxy samples for lensing mea-surements, for example, means that the mean stellar mass tends to increase with redshift, not decline as in our case. Fully separatingout these effects, of sample composition and evolution of intrinsicalignments, would require a more careful exploration using mergertrees of the sort presented by Bhowmick et al. (2020).We also rerun our TATT analysis on the mass-split data vec-tors, giving the marginalised parameter constraints shown in Figure12. For clarity, we show only z = here, but ﬁnd similar patternsin all snapshots. As before, A gradually increases over the range z = − , and the σ contour encompasses A = in all cases. Thatsaid, there is a relatively strong anti-correlation between the two IA MNRAS , 000–000 (0000) Samuroff et al − A − − A Low Mass, z = 0 . z = 0 . Figure 12.

TATT model posterior constraints from I

LLUSTRIS

TNG, un-der a binary high/low mass split. The sample is divided about the medianstellar mass, M ∗ = . × h − M ⊙ , and the two subsamples are ﬁt inde-pendently. − . − . . . . h L i /L ) A LOWZ (Singh et al 2015)MegaZ LRG + SDSS (Joachimi et al 2011)GAMA + SDSS (Johnston et al 2019)TNG Red GalaxiesTNG Blue Galaxies

Figure 13.

Luminosity dependence of the measured NLA intrinsic align-ment amplitude. The red/blue diamonds show the two colour subsamplesof I

LLUSTRIS

TNG at z = . , and the dotted lines of the same colourshow power law ﬁts to these data. For reference, we also show comparablemeasurements from MegaZ + SDSS LRG + L4 + L3 (Joachimi et al. 2011),BOSS LOWZ (Singh et al. 2015) and KiDS × GAMA+SDSS (Johnston et al.2019) in purple, pink and green. The shaded bands represent their ﬁts andthe corresponding uncertainties. amplitudes, such that a range of scenarios with A > , combinedwith slightly reduced A , are also equally favoured. In addition to the binary mass cut above, we also consider di-rectly the luminosity dependence of the measured IA signal, deﬁn-ing four equal-number bins in r − band luminosity. To separateactual luminosity dependence and changes in the red fractionbetween bins, we impose the red/blue colour split described inSection 6.2.1. The signal-to-noise in the upper red bin is par- − . − . . . h L i /L ) − A i Red Galaxies ( A )Red Galaxies ( A )Blue Galaxies ( A )Blue Galaxies ( A ) Figure 14.

Luminosity dependence of the measured TATT model intrin-sic alignment parameters. As in Figure 13 the shaded bands show ﬁts toMegaZ, LOWZ and SDSS+GAMA (points now omitted). The diamondsshow the tidal alignment amplitude A , while the stars show the tidaltorquing contribution A . We show the best ﬁtting power laws, parame-terised A i ( L ) = A i,z ( L / L ) β i , for A (dotted) and A (dashed). Thenumerical values of the power law slopes are quoted in Section 6.2.3. ticularly high, which motivates a further equal number subdivi-sion, slightly extending our luminosity coverage. We then havethe luminosity bins L r, red / L = [( . − . ) , ( . − . ) , ( . − . ) , ( . − . )] and L r, blue / L =[( . − . ) , ( . − . ) , ( . − . ) , ( . − . )] , which roughly, but not exactly, correspond to mass bins.In each one we recompute the correlation functions and the covari-ance matrix, then ﬁt using both IA models. As before we imposethe split only on the shape sample, which is correlated with the fulldensity sample. The results, as a function of r − band luminosity, areshown for NLA and TATT respectively in Figures 13 and 14. Forthe purposes of comparison with the literature, we consider the sec-ond snapshot, z = . only here. This choice does not signiﬁcantlychange the conclusions of this section.Figure 13 shows the the NLA case, with open points indi-cating previous constraints on the IA luminosity relation usingdata (Joachimi et al. 2011, Johnston et al. 2019; see also For-tuna et al. 2020’s Figure 5 and Singh et al. 2015’s Figure 10).All of these represent direct IA measurements at low redshift, us-ing relatively bright red samples. The MegaZ + SDSS LRG +L4 + L3 ﬁt (Joachimi et al. 2011) in particular has been widelyused in the literature to extrapolate the luminosity dependence tofainter samples (see e.g. Krause et al. 2016). In addition, we showour new results from I LLUSTRIS

TNG, both red and blue sam-ples. The blue simulated subsample lies towards the fainter endof this plot, going fainter than any of the data measurements. Thered, on the other hand, covers a wider luminosity range, spanningboth KiDS × GAMA+SDSS (Johnston et al. 2019) and the MegaZ+ SDSS LRG + L4 + L3 ﬁt (Joachimi et al. 2011) points. It isworth remarking here that unlike Figure 11, the points each repre-sent a different set of galaxies. Whereas there the different snap-shots strongly overlap, and so are subject to highly correlated er-rors, the noise realisations should now be independent, making ﬁt-

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations ting a trend relatively simple. The luminosity dependence is param-eterised as A ( L, z ) = A z ( LL ) β , (31)where L is a pivot luminosity, corresponding to an absolute mag-nitude M r = − . The amplitude A z and power law index β areleft as free parameters in our ﬁts. Doing a simple least-squaresﬁt to Equation (31), we obtain β , red = . ± . , A z, red = . ± . for the I LLUSTRIS

TNG red sample. Notably, this issomewhat shallower than both MegaZ + SDSS LRG + L4 + L3( β , MegaZ = . + . − . ; shaded purple in Figure 13) and LOWZ( β , LOWZ = . ± . ; pink shaded); it is slightly steeper than,but consistent to ∼ σ with, the KiDS × GAMA+SDSS red sample( β , GAMA = . + . − . ; green shaded). For the most part, this ﬁtswith the broken power law picture painted by the existing datasets(i.e. a relatively steep slope at high L , turning into a much ﬂatterfunction below L / L ∼ . ). Our uppermost L bin, however, in-dicates something slightly different; the measured IA amplitude inthis bin is both relatively well constrained, and below the extrap-olated MegaZ power law prediction by several σ . Taken togetherwith the third point A ( L / L = . ) = . ± . , which is slightlyabove the data, this seems to hint at some level of disagreement be-tween simulations and data.One caveat here is that the x axis positions are point esti-mates from luminosity distributions, which have ﬁnite width. In thecase of the blue sample, the distributions are relatively compact andGaussian; in the case of the higher L red sample luminosity bin, thisis not the case, and the p ( L ) distribution is broad, with a trailingupper tail, reaching a maximum luminosity of log ( L / L ) = . .Using the modal luminosity as our point estimate, the rightmostpoint in Figure 13 shifts slightly to the left, thereby reducing theapparent tension with the earlier results.Another complicating factor here is the evolving satellite frac-tion in both our, and the published, samples. In our case, the I L - LUSTRIS

TNG red sample satellite fraction changes signiﬁcantlyfrom f s = . in the lowest luminosity bin ( log ⟨ L ⟩/ L = − . )to f s = . in the brightest bin ( log ⟨ L ⟩/ L = . ). In contrast,the satellite fraction of the I LLUSTRIS

TNG blue sample is quitestable at f s ∼ . across the L range. We explore the impact of thisdirectly by repeating the measurements using red centrals only; asone might expect from the numbers above, the amplitude in thelower bins shifts upwards slightly (to A = . ± . ), to a valuewhich is consistent with the central only GAMA measurement (theyellow point in Fortuna et al. 2020’s Fig. 14). The upper luminositybins are almost completely unchanged (since the IA signal in thosebins is heavily dominated by centrals anyway). That is, the cen-tral/satellite trend does not seem to be sufﬁcient on its own to ex-plain the discrepancy between our results on I LLUSTRIS

TNG andthe steeper slope seen in LOWZ and MegaZ.The trend in blue galaxies, β , blue = . ± . , A z, blue = . ± . . is also interesting, particularly given the lack of existingblue sample measurements. Our results are consistent with no lu-minosity evolution in blue galaxies, at least at the faint end of theluminosity function. More troubling, perhaps, is the persistence ofa relatively strong blue IA signal. This is in accordance with ourprevious ﬁndings, but it is particularly striking here that even in thefaintest blue galaxies at z = . , there is a non-negligible IA signal, A = − . Although the topic clearly warrants caution, and furtherinvestigation in real data, if it bears out this could have signiﬁcantconsequences for future cosmic shear anlayses. An important point to bear in mind here is the choice of pivotluminosity. By convention, and to facilitate comparison with previ-ous results on real data, we choose a pivot L corresponding to an r − band absolute magnitude of M r = − (see e.g. Joachimi et al.2011; Singh et al. 2015; Johnston et al. 2019). This is appropriatefor those studies, and for our red galaxy sample, in that L is moreor less in the centre of the luminosity range. In the case of the blueI LLUSTRIS

TNG sample, however, the bulk of the sample is below L / L = . Although this is a valid analysis choice, and indeed use-ful for comparison with the literature, it does mean that A z and β are likely non-trivially correlated in this case.Finally, we repeat this exercise using the TATT model, againat z = . , with the results shown in Figure 14. Again, we showthe best ﬁts to MegaZ, LOWZ and KiDS × GAMA+SDSS, but nowfor clarity we omit the corresponding data points. Although weﬁt power law slopes as before, the constraints are degraded rela-tive to the NLA case. Fitting to the red sample, we ﬁnd β , red = . ± . , β , red = . ± . . As before, the red galaxy TAamplitude A increases from faint to bright galaxies ( β , red > ),as one would naively expect. In all but the brightest two bins in thered galaxy sample, the TT amplitude A remains consistent withzero to ∼ σ . Although we report weak positive β here, the ﬁtsare extremely noisly, such that very different scenarios are allowedwithin the uncertainties. At the current precision, then, there is lit-tle hope of distinguishing between power law and non power lawforms of luminosity evolution, at least for A . In the blue sample,we ﬁnd β , blue = . ± . , β , blue = . ± . , consistent withno coherent variation with L across the range. As well as luminosity and colour, the IA signal is known to de-pend on galaxy type, and so we next consider a satellite/centralspit. We divide the I

LLUSTRIS

TNG shape catalogues into cen-trals and satellites using the method described in Section 2.4.2,before remeasuring the two-point functions and repeat the co-variance matrix calculation. This gives us two new data vectors D c = ( w cc ++ , w cg + , w gg ) , D s = ( w ss ++ , w sg + , w gg ) . In all cases theshape part of the correlation is either central or satellite, and thedensity part uses the unsplit catalogue. Note that we repeat this ex-ercise with split density samples, and conﬁrm that we return consis-tent (albeit slightly degraded) IA constraints when ﬁtting on largescales.The NLA analysis on these new satellite/central disaggregateddata vectors are shown in Figure 15. As expected, the galaxy biasis consistent between the two, and constrained primarily by w gg ,which is the same in the two data vectors. By construction the largescale ﬁts to these data are each sensitive to a particular combina-tion of two halo IA power spectra. Speciﬁcally D c is sensitive to( P h,s GI , P h,ss II ), and D s probes ( P h,c GI , P h,cc II ). Again, we assumethat on two halo scales, the satellite/central composition of the den-sity tracer is not relevant. Notably, the amplitude of large scale cen-tral alignments in Figure 15 is stronger than that of satellites by afactor of ∼ , at the level of a few σ . The subject of satellite align-ments has been discussed quite extensively in the literature, andthe overall picture ﬁts with our results here. A number of theoret-ical studies point to satellite IAs being dominated by tidal torqueinduced radial alignments within their halos (Knebe et al. 2008;Faltenbacher et al. 2007; Pereira et al. 2008), which scale rapidlywith separation, and tend to wash out on very large scales. Thereis also now evidence from various observations on both cluster andgalaxy scales supporting the same picture Sif´on et al. 2015; Singh MNRAS , 000–000 (0000) Samuroff et al . . . . . . . . b g − . . . . . . . A z = 0 . satellitescentrals Figure 15.

NLA model constraints on our I

LLUSTRIS

TNG sample, splitinto satellites and centrals. In both cases the split is imposed on the shapesample; the density tracer sample used in w g + and w gg is the full z = I LLUSTRIS

TNG catalogue. As shown, the large intrinsic alignment ofsatellites is weaker than that of centrals by a factor of ∼ . − . − . − . . . . . . . . A − − − A z = 0 . satellitescentrals Figure 16.

TATT model constraints from I

LLUSTRIS

TNG at z = , afterdecomposing the catalogue into central and satellite galaxies. The metricused to deﬁne the two classes is outlined in Section 6.2.4. As in Figure 15,the density sample used here is the full, unsplit catalogue, and the satel-lite/central split is imposed only on the shape sample. et al. 2015; Huang et al. 2018). This, again, is consistent with John-ston et al. (2019) and Fortuna et al. (2020), who suggest satelliteshapes are effectively random on sufﬁciently large scales. Centrals,on the other hand, tend to align with the host halo, and so tracethe large scale correlations in the background large scale structure(Catelan et al. 2001; Kiessling et al. 2015). Although not shownin Figure 15, it is also worth noting that the central galaxies showa clear monotonic increase in IA amplitude with redshift, a trendwhich is not replicated in satellite galaxies.We next repeat our analysis of the 4 split data vectors, but nowusing the three parameter TATT model instead of NLA. Figure 16shows the marginalised parameter constraints at four redshifts. Asin the simpler ﬁts above, the central IA signal is stronger than thatin satellite galaxies by a factor of a few, although the constraintsare degraded to the extent that it is difﬁcult to draw meaningfulpredictions from this. Again, there is no clear evidence of non-zero favoured values of either the tidal torquing amplitude A , or thedensity term A δ = b TA A , in either the satellite or the central pop-ulation.We also present the correlation functions of centrals and satel-lites in Figure 17. This is a worthwhile exercise for a variety ofreasons, not least that there is information on the small scale IAsignal missed in the large scale ﬁts. While we cannot, at the presenttime, ﬁt the IA signal on scales <∼ h − Mpc , the qualitative com-parison can be instructive. Given that there is some evidence thatthey behave differently, we consider blue and red satellites/centralsseparately here. We also drop the full sample density tracer, andinstead use one of the four (red/blue, satellite/central) subsamples.The motivation here is that, while on large scales, the density traceris effectively just that: a probe of the large scale matter distributionmultiplied by a linear galaxy bias, on scales approaching the onehalo regime this no longer holds. Figure 17 shows these new datavectors. As shown we measure both w g + and w ++ , and recomputethe covariance matrices with the appropriate densities. For refer-ence, the dark red crosses also show the equivalent satellite/centralred galaxy w g + corrlations from KiDS × GAMA here (c.f. John-ston et al. 2019 Fig. 7, red points/band). On large scales at least,our I

LLUSTRIS

TNG red sample is consistent with their measure-ments. There are a few interesting features here to note, however.Firstly, we see a relatively strong red galaxy 1h contribution onscales < h − Mpc . Although the general trends match the realdata, with ss and to a lesser extent cs exhibiting strong scale de-pendent IAs in this regime, the magnitude is somewhat higher inour sample. This is particularly interesting, given that our samplecharacteristics are similar ( ⟨ L ⟩/ L = . and . for our red andblue samples respectively, compared with their ∼ . and . ).As discussed brieﬂy in Section 6.2.1, we observe a persistent non-zero IA signal in blue galaxies on large scales; here we can seeit is dominated by the cc correlation, with a smaller contributionfrom sc . Also notable is that, contrary to what has often been as-sumed, the large scale satellite correlations do not appear to van-ish on large scales. Focusing on the right hand panels, the purpleand pink points are consistently positive and non zero. While smallcompared with the red central terms, and consistent with the darkred points from GAMA, there appears to be a detectable signal atthe precision allowed by I LLUSTRIS

TNG.Now considering the lower panel, we see shape-shape corre-lations involving satellites do indeed appear to be zero on largescales, irrespective of colour. Indeed, the large scale w ++ is drivenprimarily by the cc component, with all other subsets of the dataapparently consistent with null signal at > h − Mpc . As before,we see no signiﬁcant 1h cc term, down to ∼ . h − Mpc (a resultwhich should be true by construction, since each halo contains onlyone central galaxy).

In earlier sections, we set out an analysis based on measuring andmodelling the two-point functions of intrinsic galaxy shapes. Thisis the most common method for deriving information about intrin-sic alignments from data, be it simulated or real (see e.g. Hirataet al. 2007; Singh et al. 2015; Chisari et al. 2015). This sectionoutlines an alternative approach, which exploits the fact that cos-mological simulations allow direct access to the underlying matterﬁeld. The basic idea is that, with a suitable choice of smoothingscale, one can measure the components of Eq. (21) directly andperform a linear ﬁt to obtain constraints on the various amplitudes.

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations r p w g + / h − M p c cc Red galaxiesBlue galaxiesNo colour split cs − r p /h − Mpc02468 r p w g + / h − M p c sc r p /h − Mpc ss − . . . . . r p w ++ / h − M p c cc Red galaxiesBlue galaxiesNo colour split r p /h − Mpc ss − r p /h − Mpc0 . . . r p w ++ / h − M p c sc Figure 17.

Upper : Projected galaxy-shape correlation functions w g + , for various subsamples of the z = I LLUSTRIS

TNG sample. Shown clockwise from thetop right are the central autocorrelation; the satellite shape - central position correlation; the satellite autocorrelation; and the central shape - satellite positioncross correlation. In addition to the satellite/central split, galaxies are split by colour (shown by the different colour points). For reference, the shaded regionshows scales excluded in the ﬁducial analysis. The red crosses are analogous split measurements on real KiDS+GAMA data (Johnston et al. 2019; see theirFig. 7).

Lower : The same, but for shape-shape correlations w ++ . Bypassing two-point correlations in this way has several advan-tages, not least that it is potentially less susceptible to noise.The method for obtaining the tidal tensor and the intrinsicshape ﬁeld is described in Section 3.3. In brief, the process involvespixelising the simulation volume at given redshift, and so building smoothed 3D shape and density ﬁelds. We can then compute thetidal ﬁeld by Fourier transforming the density ﬁeld (see Eq. (12)).One important thing to bear in mind is that we are free to choosethe pixel scale, a choice which has some bearing on the physicalinterpretation of the result.

MNRAS , 000–000 (0000) Samuroff et al

With these ingredients in hand, we can proceed to ﬁt for theamplitudes in Eq. (21). By varying C , C and b TA , we seek tominimise χ ( p IA ) = ∑ i,j,p [ γ Iij,p − γ I,modelij,p ( p IA )] C − ij,p [ γ Iij,p − γ I,modelij,p ( p IA )] . (32)Here the IA model parameters are p IA = ( C , C , b TA ) . Theindices i, j indicate an element of the × shape tensor,and p identiﬁes a pixel, within which galaxy shapes are av-eraged. The theory prediction γ I,modelij,p is obtained by evalu-ating Eq. (21). We will refer to this technique for constrain-ing IA parameters, in contrast to the earlier two-point method-ology, as Direct Alignment Field Fitting (DAFF). As in thetwo-point analysis, the likelihood sampling is performed usingC

OSMO

SIS, using M

ULTINEST ; one can ﬁnd the modules forthis here: https://github.com/ssamuroff/direct_ia_theory/tree/master/likelihood/field_fit .Our DAFF TATT constraints, using a range of pixel scales,are presented in Figure 18. For reference, the light purple contoursalso show the equivalent z = TATT model posterior from theI

LLUSTRIS

TNG two-point analysis.Notably, on smaller smoothing scales particularly, there is asigniﬁcant gain in signal-to-noise. Although, perhaps unsurpris-ingly, L > h − Mpc offers little information on the higher orderIA contributions, with a suitable choice of scale, the TATT poste-rior volume is reduced quite considerably. Although this is highlypromising in terms of the DAFF method’s future utility, we shouldpoint out a few caveats in the comparison.It is perhaps worth remembering here that these are not inde-pendent datasets. The underlying galaxy ﬁeld, and the shape noiseare the same in each, albeit smoothed on different scales. This isalso true of the matter tidal ﬁeld. For such comparisons, it is dif-ﬁcult to gauge the signiﬁcance of parameter shifts, given that theconﬁdence contours do not account for these correlations.The second consideration, which muddies the comparison, isthat of the scales probed. The earlier analyses of w g + and w ++ have an explicit window of sensitivity determined by our choiceof scale cuts, < r p < h − Mpc . Within that window, however,all scales are ﬁt simultaneously (albeit with unequal weight). TheTATT implementation used in the two-point ﬁts does not set an ex-plicit smoothing scale, which some previous incarnations of NLAhave, to suppress galaxy-scale ﬂuctuations; rather the ﬁlter is in-cluded as an implicit element of the model. The various IA am-plitudes are effectively renormalised to account for the impact ofsmall scale processes on mid-to-large scale modes (see Blazek et al.2019, Section F for discussion). The DAFF approach, in contrast, does include a smoothing scale, in a way that is inherent and un-avoidable. The various ﬁelds are explicitly pixelised, and averagedon a ﬁxed scale with cubic pixels . While this allows some level ofcontrol over the physical scales probed, it makes direct comparisonwith two-point results difﬁcult.One can, and people historically have, adopt a method similarto DAFF in the case of galaxy bias, as discussed in some detail byDesjacques et al. (2018) (see Section 4.2, pages 85-93). In orderto correctly interpret the results there are corrections of the order The use of cubic pixels is an explicit modelling choice in our DAFFpipeline. One could conceivably apply e.g. Gaussian smoothing on top ofthe pixelisation. > /h MpcDAFF, L = 12 . /h MpcDAFF, L = 6 . /h MpcDAFF, L = 3 . /h Mpc − . . . . A . . . A . . . . . b T A − . . . . A . . . . . b TA Figure 18.

Posterior IA parameter constraints obtained via the DAFF ﬁtsdescribed in Section 7. Shown are results using three different pixel scales(indicated in the legend); for reference, we also include the equivalent 1Dconstraints from w gg + w g + + w ++ , with a scale cut-off at r p > h − Mpc. of σ L , (i.e. the variance of linear density ﬁeld on scale L ), whichconvert between an N-point bias, and that of the moments/scatter.These corrections are complicated to compute, and are the focus ofongoing work. While it is necessary to have a robust estimate ofthese terms in order to use the DAFF method to make constraintsto a precision of better than a factor of a few, we set out here only topresent a proof of concept, and so defer calculation of these (orderof unity on scales ∼ h − Mpc ) additional terms to a future work.That said, these corrections should alter both the centering of theIA posteriors and the width by roughly the same factor, such thatthe signal to noise is approximately conserved. Based on this rea-soning, we expect an improvement in the signal-to-noise (posteriormean divided by the σ marginalised error) on A of a factor of ∼ relative to the two-point constraints.With these caveats ﬁrmly in mind, it is apparent that when onegoes to smaller smoothing scales, the favoured A starts to deviatefrom zero. While the level of signiﬁcance is still only ∼ − σ atmost, this pattern makes sense. We have presented a detailed study of galaxy intrinsic alignmentsusing two-point measurements from three of the most recent pub-lic hydrodynamic simulations. After halo mass reweighting of thesamples to simplify the comparison, we ﬁnd I

LLUSTRIS

TNG andI

LLUSTRIS -1 agree well within their respective uncertainties, al-though M

ASSIVE B LACK -II favours a somewhat stronger IA sig-nal. Our key results are summarised below. ● We analysed each sample using the NLA model, which as-sumes linearity in the tidal ﬁeld. All three of the simulationsconsider show strong evidence for a non-zero NLA amplitude.The results from the three are consistent within ∼ σ , although MNRAS000

A Constraints from Hydrodynamic Simulations M ASSIVE B LACK -II consistently displays a slightly stronger IAsignal relative to the other two across all snapshots. ● For the ﬁrst time, we ﬁt the more complex TATT model tothese various simulated data sets. On scales r p > h − Mpc weﬁnd no clear indication of A , b TA ≠ in I LLUSTRIS

TNG ( A = . ± . ), and so no strong evidence for deviation from the NLAscenario, at least within the (relatively large) statistical uncertaintyof the measurement. M ASSIVE B LACK -II, on the other hand, showsa mild preference for negative values ( A = − . ± . ). In allcases, the best ﬁtting A from the TATT ﬁts is consistent with theamplitude from the NLA only ﬁts on the same data, albeit withgreater uncertainty. There is also some level of degeneracy betweenthe two TATT amplitudes, such that combinations with non-zero A , combined with a slightly reduced but still non-zero A are alsoallowed. ● We discussed a series of ﬁts extending to smaller scales. Wejustiﬁed this by comparing galaxy-galaxy and matter-matter corre-lations, ﬁnding the linear bias assumption to hold in our case downto ∼ h − Mpc . Even with these relaxed cuts, we do not report astatistically signiﬁcant detection of A or b TA in I LLUSTRIS

TNG. ● We presented a colour split IA analysis on I

LLUSTRIS

TNG,an exercise enabled by the relatively realistic bimodal colour dis-tribution it exhibits. As expected, the red sample displays a strongalignment signal across redshifts. Our results on blue galaxies areconsistent with observations at low redshifts; at higher redshifts, z > . , where direct constraints on real data are lacking, we detecta non-zero IA signal A ∼ . We explored the origins of this blueIA feature, reporting that it persists even in relatively faint blue sub-samples (down to M ∗ < . × h − M ⊙ ), and is generated almostentirely by blue centrals. ● We examined the luminosity dependence in red galaxies, re-porting results consistent with the those of Johnston et al. (2019);the red I

LLUSTRIS

TNG sample favours a marginally shallowerslope than Joachimi et al. (2011) and Singh et al. (2015), whoseﬁts are dominated by brighter galaxies. We have also reported theconstraints on the luminosity dependence of the quadratic tidaltorquing amplitude A . In the TATT ﬁts, A and A exhibit consis-tent luminosity evolution, which is signiﬁcant at the level of − σ . ● We presented a similar exercise for blue galaxies, which ex-tend into the faint regime where there is a relative paucity of con-straints from real data. Our ﬁts prefer a power law index β ∼ . ,which is consistent with the equivalent ﬁt to the red sample. In theTATT parameter space the constraining power is degraded, yielding β and β values consistent with no luminosity dependence. ● We also ﬁtted disaggregated central and satellite correlations.On large scales, our ﬁts favour a weak but non-vanishing, satelliteshape alignment signal. Centrals show a stronger signal, and a sim-ilar trend with redshift to the mixed sample. The TT and densityweighting components of the TATT model do not differ systemati-cally between satellites and centrals, with both A and b TA consis-tent with zero. At the correlation function level, we see the satellitealignment signal is dominated by the red cs correlation; it persistsin the mixed sample, on r p > h − Mpc , albeit subdominant by afactor of several to the red central terms. ● We have outlined a new method for recovering intrinsic align-ment information from hydrodynamic simulations, which we referto as DAFF. Although the results depend signiﬁcantly on the choiceof smoothing scale (in part because we have omitted small correc-tions, the computation of which is left for future work), the ap-proach potentially offers a signiﬁcant boost in constraining powerrelative to an equivalent two-point analysis, and greater control overthe physical scales probed. This work is one of a relatively small number that focus on derivingparameteric IA constraints from hydrodynamic simulations (Codiset al. 2015a; Chisari et al. 2015; Tenneti et al. 2015; Hilbert et al.2017; Bhowmick et al. 2020); it is the ﬁrst to attempt a comprehen-sive analysis of directly comparable samples from multiple sim-ulations, including the current state of the art (I

LLUSTRIS

TNG).Unlike most previous studies, we perform a simultaneous analysis,modelling a joint w gg + w g + + w ++ data vector. Our analysis alsoincludes an analytic covariance matrix, which is both numericallystable and avoids the (potentially limiting) assumptions of internalestimators such as jackknife.Although our ﬁndings are a building block in our understand-ing of the behaviour of large scale galaxy alignments, we urgecaution in applying our ﬁndings directly to cosmological mea-surements. That is, our samples are comparable with each other,but are not tailored to match the more complex selection redshift-dependent function of a typical lensing shape sample used in cos-mic shear and galaxy-galaxy lensing measurements. Another use-ful exercise would be to use the observed trends with luminosity,colour and galaxy type to extrapolate out a mock IA signal, morerepresentative of the contamination in real lensing data; in turn, thiscan be used to test our IA models in a cosmological context. This isa relatively straightforward extension of the results presented here,and is the focus of future work.The novel DAFF method, which does not involve measuringtwo-point correlations, is to our knowledge the ﬁrst implementationin the literature. Although an analogous idea exists in the literaturefor galaxy bias (see Desjacques et al. 2018, Sec 4.2 and the ref-erences therein), it has never been discussed in the context of IAsbefore. The analysis on the z = I LLUSTRIS

TNG snapshot shouldbe seen as a proof-of-concept exercise; while promising, there arestill gaps in our interpretation (see Section 7), which are the subjectof ongoing work, but beyond the scope of the current paper.It is now well established that intrinsic alignments exist inthe Universe, and must be accounted for at some level to avoidbiasing cosmological analyses based on cosmic shear and galaxy-galaxy lensing. IAs have been included in cosmic shear analysesfor as long as shear has been a competative cosmological probe(Heymans et al. 2013; Dark Energy Survey Collaboration 2016;Jee et al. 2016; Troxel et al. 2018; Hildebrandt et al. 2020; Hik-age et al. 2019; Hamana et al. 2020; Chang et al. 2019; Asgariet al. 2020). Only recently, however, have the lensing data been ofsufﬁcient volume to potentially incur biases due to model insufﬁ-ciency (see Krause et al. 2016, and the Stage IV forecasts of For-tuna et al. 2020 and the upcoming tests in the context of DES Y3Secco et al. 2020; see also Joachimi et al. 2020 for an interestingcounter discussion). Developing a fuller understanding of intrinsicalignments, then, will be crucial for, arguably, the current genera-tion of cosmological surveys, and certainly the next. The currentpaper is one small step in this direction, providing the ﬁrst detailedanalysis at the level of model constraints on the best available cos-mological hydrodynamic simulations. Our results, of course, comewith a number of caveats. Most notably, our selection function isnot intended to accurately match current or future lensing surveys.This is in part because recreating the complex redshift-dependentselection function in a real lensing sample, which would typicallybe based on a number of correlated observables, is a difﬁcult task;it is also, however, a function of our aim in this study. We wishto understand the behaviour of IAs at a physical level, in orderto feed into understanding IAs and model building efforts, ratherthan make a detailed prediction or robustness test for a particu-lar survey. The behaviour of intrinsic alignments on small physical

MNRAS , 000–000 (0000) Samuroff et al scales is an important topic for future investigation, and one thatcould conceivably be addressed using hydrodynamic simulations;indeed, due to the larger number of measureable modes, the signalto noise on small scales is relatively high. The TATT approach al-lows some hope of pushing to smaller scales (though not into theregime of <∼ h − Mpc , where one would need an explicit modelfor 1h alignment contributions). Unfortunately, a number of otherpoorly-understood effects enter on small scales, particularly non-linear galaxy bias and baryonic physics. In order to pursue IA con-straints on such scales, it is likely that one would need to considerboth higher-order bias terms and the interplay with the higher-orderIA terms.

We are grateful to Duncan Campbell, Maria Cristina Fortuna,Franc¸ois Lanusse, Ami Choi, Scott Dodelson and the DES WLWGfor valuable discussions at various points in the writing of this pa-per. We would also like to thank Sukhdeep Singh for lending hiscode to help validate our pipeline and Harry Johnston for kindlyallowing us access to his KiDS+GAMA measurements. This workmade use of the Coma cluster, which is hosted by the McWilliamsCenter, at CMU. We also used resources of the National Energy Re-search Scientiﬁc Computing Center (NERSC), a U.S. Departmentof Energy Ofﬁce of Science User Facility operated under ContractNo. DE-AC02-05CH11231. SS and RM are supported by the USNational Science Foundation (NSF) under Grant No. 1716131.

REFERENCES

Alarcon A., et al. 2019, arXiv e-prints, p. arXiv:1910.07127Alonso D., Hadzhiyska B., Strauss M. A., 2016, MNRAS, 460, 256Ammazzalorso S., et al. 2020, Phys. Rev. Lett., 124, 101102Amon A., et al. 2020, in preparation

Anderson T. W., 2003, An introduction to multivariate statistical analysis, 3edn. Wiley-InterscienceArevalo F., Cid A., Moya J., 2017, European Physical Journal C, 77, 565Asgari M., et al. 2020, arXiv e-prints, p. arXiv:2007.15633Bagla J. S., Prasad J., Khandai N., 2009, MNRAS, 395, 918Baldry I. K., et al. 2004, ApJ, 600, 681Bhowmick A. K., et al. 2020, MNRAS, 491, 4116Blake C., et al. 2020, arXiv e-prints, p. arXiv:2005.14351Blazek J., McQuinn M., Seljak U., 2011, J. Cosmology Astropart. Phys.,2011, 010Blazek J., et al. 2012, J. Cosmology Astropart. Phys., 2012, 041Blazek J., Vlah Z., Seljak U., 2015, J. Cosmology Astropart. Phys., 8, 015Blazek J., et al. 2019, Phys. Rev. D, 100, 103506Bridle S., King L., 2007, New Journal of Physics, 9, 444Brown M. L., et al. 2002, MNRAS, 333, 501Campos A., et al. 2020, in preparation

Catelan P., Porciani C., 2001, MNRAS, 323, 713Catelan P., Kamionkowski M., Blandford R. D., 2001, MNRAS, 320, L7Chang C., et al. 2019, MNRAS, 482, 3696Chisari N., et al. 2015, MNRAS, 454, 2736Chisari N., et al. 2016, MNRAS, 461, 2702Chisari N. E., et al. 2018, MNRAS, 480, 3962Choi A., et al. 2016, MNRAS, 463, 3737Codis S., et al. 2015a, MNRAS, 448, 3391Codis S., Pichon C., Pogosyan D., 2015b, MNRAS, 452, 3369van Daalen M. P., et al. 2011, MNRAS, 415, 3649Dark Energy Survey Collaboration 2016, Phys. Rev. D, 94, 022001Dark Energy Survey Collaboration 2018, Phys. Rev. D, 98, 043526Dark Energy Survey Collaboration 2020, Phys. Rev. D, 102, 023509 Desjacques V., Jeong D., Schmidt F., 2018, Phys. Rep., 733, 1Di Matteo T., et al. 2012, ApJ, 745, L29Dubois Y., et al. 2014, MNRAS, 444, 1453Faltenbacher A., et al. 2007, The Astrophysical Journal Letters, 662, L71Fang X., et al. 2017, J. Cosmology Astropart. Phys., 2017, 030Feroz F., et al. 2019, The Open Journal of Astrophysics, 2, 10Fortuna M. C., et al. 2020, arXiv e-prints, p. arXiv:2003.02700Gatti M., et al. 2018, MNRAS, 477, 1664Giannini G., et al. 2020, in preparation

Gruen D., Brimioulle F., 2017, MNRAS, 468, 769Hamana T., et al. 2020, PASJ, 72, 16Handley W. J., Hobson M. P., Lasenby A. N., 2015, MNRAS, 453, 4384Hartlap J., Simon P., Schneider P., 2007, A&A, 464, 399Hearin A. P., et al. 2017, doi.org/10.5281/zenodo.835898 AJ, 154, 190Hernquist L., Springel V., 2003, MNRAS, 341, 1253Heymans C., Heavens A., 2003, MNRAS, 339, 711Heymans C., et al. 2013, MNRAS, 432, 2433Hikage C., et al. 2019, PASJ, 71, 43Hilbert S., et al. 2017, MNRAS, 468, 790Hildebrandt H., et al. 2020, A&A, 633, A69Hirata C. M., Seljak U., 2004, Phys. Rev. D, 70, 063526Hirata C. M., Seljak U. c. v., 2010, Phys. Rev. D, 82, 049901Hirata C. M., et al. 2007, MNRAS, 381, 1197Huang H.-J., et al. 2018, MNRAS, 474, 4772Huang H.-J., et al. 2019, MNRAS, 488, 1652Hui L., Zhang J., 2002, arXiv e-prints, pp astro–ph/0205512Jee M. J., et al. 2016, ApJ, 824, 77Joachimi B., Schneider P., 2010, arXiv e-prints, p. arXiv:1009.2024Joachimi B., et al. 2011, A&A, 527, A26Joachimi B., et al. 2020, arXiv e-prints, p. arXiv:2007.01844Johnston H., et al. 2019, A&A, 624, A30Joudaki S., et al. 2018, MNRAS, 474, 4894Kannawadi A., et al. 2019, A&A, 624, A92Khandai N., et al. 2015, MNRAS, 450, 1349Kiessling A., et al. 2015, Space Sci. Rev., 193, 67Kirk D., et al. 2012, MNRAS, 424, 1647Knebe A., et al. 2008, MNRAS, 386, L52Krause E., Eiﬂer T., Blazek J., 2016, MNRAS, 456, 207Krause E., et al. 2017, arXiv e-prints, p. arXiv:1706.09359Landy S. D., Szalay A. S., 1993, ApJ, 412, 64Leauthaud A., et al. 2017, MNRAS, 467, 3024Mandelbaum R., et al. 2011, MNRAS, 410, 844Mandelbaum R., et al. 2013, MNRAS, 432, 1544Mandelbaum R., et al. 2018, MNRAS, 481, 3170Marshall P., Rajguru N., Slosar A., 2006, Phys. Rev. D, 73, 067302McEwen J. E., et al. 2016, J. Cosmology Astropart. Phys., 2016, 015Melchior P., Viola M., 2012, MNRAS, 424, 2757Melchior P., et al. 2017, MNRAS, 469, 4899Myles J., et al. 2020, in preparation

Nelson D., et al. 2018, MNRAS, 475, 624Nelson D., et al. 2019, Computational Astrophysics and Cosmology, 6, 2Pereira M. J., Bryan G. L., Gill S. P. D., 2008, ApJ, 672, 825Pillepich A., et al. 2018, MNRAS, 473, 4077Piras D., et al. 2018, MNRAS, 474, 1165Power C., Knebe A., 2006, MNRAS, 370, 691Prat J., et al. 2018, Phys. Rev. D, 98, 042005Prat J., et al. 2019, MNRAS, 487, 1363Prescott M., et al. 2011, MNRAS, 417, 1374Rykoff E. S., et al. 2016, ApJS, 224, 1Samuroff S., et al. 2019, MNRAS, 489, 5453S´anchez J., et al. 2020, MNRAS, 497, 210Schaan E., et al. 2017, Phys. Rev. D, 95, 123512Schmitz D. M., et al. 2018, J. Cosmology Astropart. Phys., 2018, 030Schneider M. D., Bridle S., 2010, MNRAS, 402, 2127Secco L., et al. 2020, in preparation

Sif´on C., et al. 2015, A&A, 575, A48Singh S., Mandelbaum R., 2016, MNRAS, 457, 2301Singh S., Mandelbaum R., More S., 2015, MNRAS, 450, 2195MNRAS , 000–000 (0000)

A Constraints from Hydrodynamic Simulations Soussana A., et al. 2020, MNRAS, 492, 4268Springel V., 2005, MNRAS, 364, 1105Springel V., et al. 2001, MNRAS, 328, 726Springel V., et al. 2018, MNRAS, 475, 676Takahashi R., et al. 2012, ApJ, 761, 152Tenneti A., et al. 2014, MNRAS, 441, 470Tenneti A., et al. 2015, MNRAS, 448, 3522Tenneti A., Mandelbaum R., Di Matteo T., 2016, MNRAS, 462, 2668Troxel M. A., et al. 2018, Phys. Rev. D, 98, 043528Valentini M., et al. 2018, MNRAS, 480, 722Vielzeuf P., et al. 2019, arXiv e-prints, p. arXiv:1911.02951Vogelsberger M., et al. 2014, MNRAS, 444, 1518Weinberger R., Springel V., Pakmor R., 2020, ApJS, 248, 32Zuntz J., et al. 2015, Astronomy and Computing, 12, 45Zuntz J., et al. 2018, MNRAS, 481, 1149

APPENDIX A: PIPELINE AND COVARIANCE MATRIXVALIDATION

The likelihood pipeline used in this work is built from public code,developed within the C

OSMO

SIS framework. During the processof developing this code base, we implemented a series of valida-tion exercises, intended to ensure our results are both accurate andrepeatable.The ﬁrst step in this process is a data vector-level compari-son between different theory codes. We generate a nonlinear mat-ter power spectrum using C

OSMO

SIS, which is then fed into (a)our theory pipeline, which is used for inference in this work, and(b) an external code developed by an independent group, and usedin Singh et al. (2015). The two codes produce projected correla-tions w gg ( r p ) , w g + ( r p ) and w ++ ( r p ) . The r p sampling is slightlydifferent, and so we interpolate to a comparable set of values. Theresult is shown in Figure A1; we can see here that the two agreerelatively well. Though the residuals in the two IA correlations arenon-zero and roughly scale-independent, the difference is comfort-ably within ∼ . .Though it is reassuring that the two codes are consistentwith each other at some (relatively sensible) set of input param-eter values, this is not in iteslf a rigorous demonstration that ourpipeline is unbiased. Using the Singh et al. (2015) code we thengenerate a ﬁducial data vector, w gg , w g + , w ++ at four redshifts z = ( . , . , . , . ) . Using these mock data, we run our in-ference pipeline with the ﬁducial (analytic) covariance matrices ob-tained through the process described in Section 3.4. We report thatwe can recover the input parameters to comfortably within . σ .We also compare our analytic covariance matrix with an alter-native, obtained by jackknife resampling. The jackknife covarianceis generated by dividing the I LLUSTRIS

TNG box into = sub-volumes, and iteratively remeasuring our data vector. Although thecomparison is useful as a cross test, it is worth bearing in mindthat the jackknife estimator relies on a number of assumptions thatdo not strictly hold in our case (Hartlap et al. 2007). That is, al-though order-of-magnitude differences are not expected, we haveﬁrst principles reasons to trust our ﬁducial covariance.The numerical comparison of the diagonals can be seen in Fig-ure A2. As can be seen, the differences are signiﬁcant, on all scalesconsidered. In most cases (all but w gg on small scales), jackknifetends to underestimate the uncertainties at the level of or more. r p ⇥ w gg Singh et al (2016)CosmoSIS r p / h Mpc ⇥ w gg / w gg w g + r p / h Mpc ⇥ w g + / w g + w ++ r p / h Mpc ⇥ w ++ / w ++ Figure A1.

A comparison of theory data vectors produced by two inde-pendent codes. The dashed purple and black (solid) lines show the outputsof the C

OSMO

SIS module produced for this work, and an external theorycode used in Singh et al. (2015). In the lower panel we show the fractionalresidual between the two.

APPENDIX B: POSTERIOR CONSTRAINTS FROMM

ASSIVE B LACK -II & I

LLUSTRIS -1 In this appendix we present the full posterior constraints on ourthree simulated samples. In the main body of this work we pre-sented only a selection of these to emphasise our most interestingﬁndings. For completeness, they are shown in Figures B1 and B2.This represents the baseline TATT analysis on our three simula-tions at z = and z = . As described, the ﬁducial analysis hasfour free parameters ( A , A , b TA , b g ), includes the joint data vec- MNRAS , 000–000 (0000) Samuroff et al − . − . . . . σ ﬁ d / σ j k − ( w gg ) z = 0 . z = 0 . z = 0 . z = 1 . − . − . . . . . σ ﬁ d / σ j k − ( w g + ) r p / h − Mpc − . . . . . σ ﬁ d / σ j k − ( w ++ ) r p / h − Mpc 10 r p / h − Mpc 10 r p / h − Mpc

Figure A2.

A comparison of jackknife and analytic covariance matrices for I

LLUSTRIS

TNG. Here we show the square root of the diagonal of the twocovariances, for four snapshots and three two-point functions, as labelled.

TNGMBIIIllustris-1 − . . . . . b T A − . − . . . . A . . . . . A . . . b g − . . . . . b TA − . − . . . . A . . . b g z = 0 Figure B1.

Posterior TATT model constraints from the lowest redshift snap-shot of I

LLUSTRIS

TNG, M

ASSIVE B LACK -II and I

LLUSTRIS -1. tor w gg + w g + + w ++ , and scale cuts r p > h − Mpc for all correla-tion functions.

TNGMBIIIllustris-1 − . . . . . b T A − . − . . . . A . . . . . A . . . . . b g − . . . . . b TA − . − . . . . A . . . . . b g z = 1 Figure B2.

The same as B1, but at z = . APPENDIX C: IMPACT OF GALAXY WEIGHTS

To allow a meaningful comparison of galaxy samples from thethree simulations included in this paper, we derive a set of galaxyweights for our M

ASSIVE B LACK -II and I

LLUSTRIS -1 catalogues.The idea here is to weight the galaxy samples such that the distri-

MNRAS000

MNRAS000 , 000–000 (0000)

A Constraints from Hydrodynamic Simulations Simulation Redshift A A b TA TNG . . ± .

49 0 . ± .

65 0 . ± . TNG . . ± .

51 0 . ± .

76 0 . ± . TNG .

62 1 . ± .

50 0 . ± .

79 0 . ± . TNG . . ± .

53 0 . ± .

80 0 . ± . MBII . . ± . − . ± .

14 0 . ± . MBII . . ± .

27 0 . ± . − . ± . MBII .

62 4 . ± .

15 1 . ± . − . ± . MBII . . ± .

15 2 . ± . − . ± . Illustris . . ± .

47 1 . ± . − . ± . Illustris . . ± . − . ± . − . ± . Illustris .

62 1 . ± .

11 2 . ± . − . ± . Illustris . . ± .

90 1 . ± .

61 1 . ± . Table B1.

The best-ﬁtting TATT parameters and σ posterior uncertaintiesfrom all samples/redshifts considered in this work. bution of host halo masses match exactly. The weight assigned togalaxy i from simulation X is then: w Xi = p TNG ( M jh )/ p X ( M jh ) , (C1)where p X ( M h ) is the normalised histogram of host halo masses insimulation X , and j is a mass bin to which galaxy i belongs.The impact of this weighting on our IA constraints is shownin Figure C1. Although for clarity we show only the contours fromthe lowest redshift, we ﬁnd very similar behaviour in the other threesnapshots. It has been established elsewhere that there is a rela-tively tight relation between host halo mass and bias, at least onlarge physical scales; it is, then, intuitively correct that the weight-ing should bring the galaxy bias constraints (upper panel) from thetwo simulations into relatively close agreement. APPENDIX D: VALIDITY OF THE LINEAR GALAXYBIAS APPROXIMATION

Since our perturbative TATT model includes higher-order terms,there is some value in seeking to push to slightly smaller scales. Itis also true, however, that as one does so, one eventually enters theregime in which nonlinear galaxy bias also starts to become rele-vant. Such higher order bias contributions, and the cross-IA terms,are complex to model and not fully implemented in our analysis;if it exists, then, we would ideally like to identify a range of scalesbelow our ﬁducal cut at r p = h − Mpc , on which the linear biasapproximation is valid (or, at least, deviations from it are subdomi-nant to other uncertainties).We can obtain an estimate for the effective scale-dependentgalaxy bias in I

LLUSTRIS

TNG as the ratio of the galaxy-galaxyprojected correlation, and the matter-matter equivalent: b ′ g ( r p ∣ z ) = √ w gg w δδ . (D1)We refer to this as an “effective” bias because, strictly speaking,the galaxy bias is deﬁned in terms of the 3D density ﬁeld b g = δ g / δ (or equivalently in terms of 3D power spectra). Converting from3D power spectra to projected correlations w gg and w g + involvesan integral over k (e.g. Equations (25) and (26)), and if b g is scaledependent, it no longer separates cleanly from that integral. Whatwe measure, then, is an effective bias b ′ g , which is not quite thesame as the true 3D galaxy bias b g ( k ) .The discrete snapshots in the simulation allow a relativelyclean measurement of b ′ g at a given redshift. For this exercise, . . . . . b g A I A z = 0 . MassiveBlack-II, weightedMassiveBlack-II, unweightedIllustris, weightedIllustris, unweighted − A − − − − − A MassiveBlack-II, weightedMassiveBlack-II, unweightedIllustris, weightedIllustris, unweighted

Figure C1.

Demonstration of the impact of galaxy reweighting on the pos-terior IA constraints.

Upper : The z = constraints on the NLA model am-plitude and linear galaxy bias from M ASSIVE B LACK -II and I

LLUSTRIS -1,with and without weighting. By construction I

LLUSTRIS

TNG is unaffectedby weighting.

Lower : The same, but using the TATT instead of NLA model. we use the measured w gg correlation in a particular snapshot. Al-though, of course, this includes some level of statistical noise,the signal-to-noise is relatively high. For the matter-matter partit is sufﬁcient to use the theory prediction at the input I LLUS - TRIS

TNG cosmology. While

HALOFIT is subject to its own un-certainties on small scales ( ∼ at k < h − Mpc ; Takahashi et al.2012), we do not expect them to affect alter the conclusions of ourapproximate calculations.The resulting effective scale-dependent bias estimates areshown in Figure D1. While the ﬁducial cut at 6 h − Mpc (the pinkshaded region) does indeed effectively exclude scales on which thebias cannot be captured by a single coefﬁcient, we can also seethat it is relatively conservative. That is, there is a region from r p ∼ h − Mpc upwards, in which (within uncertainties) the biasis linear, but which are excluded by the ﬁducial cut. On the basis ofthese results, we carry out ﬁts (see Section 6.1) with lower cutoffsas low as h − Mpc .At z = , we see that the linear bias assumption begins to breakdown in our I LLUSTRIS

TNG sample at h − Mpc . It is worth bear-ing in mind that in three dimensions, since one is measuring the ac-tual galaxy bias, rather than b ′ g , the approximation likely becomes MNRAS , 000–000 (0000) Samuroff et al − r p /h − Mpc . . . . b g ( r p ) = p w gg / w δδ z = 0 . z = 0 . z = 0 . z = 1 . Figure D1.

Galaxy bias as a function of physical scale. The bias is estimatedas the ratio of the matter-matter and galaxy-galaxy projected correlations.The horizontal shaded bands show the best ﬁtting linear bias values and the σ uncertainties, as obtained from ﬁts to the large scale w gg correlations. invalid at some larger scale. By nature, the projected correlationsat given r p mix contributions from larger 3D separations, whichbring them closer to linear bias. The breaking scale seems to shiftdownwards at high z , which is perhaps as a result of the growth ofstructure (i.e. bias is closer to linear at high redshift at a given r p ). MNRAS000