Synthetic Galaxy Clusters and Observations Based on Dark Energy Survey Year 3 Data

Abstract

We develop a novel data-driven method for generating synthetic optical observations of galaxy clusters. In cluster weak lensing, the interplay between analysis choices and systematic effects related to source galaxy selection, shape measurement and photometric redshift estimation can be best characterized in end-to-end tests going from mock observations to recovered cluster masses. To create such test scenarios, we measure and model the photometric properties of galaxy clusters and their sky environments from the Dark Energy Survey Year 3 (DES Y3) data in two bins of cluster richness, $\lambda \in [30; 45)$ and $\lambda \in [45; 60)$, and three bins in cluster redshift, $z \in [0.3; 0.35)$, $z \in [0.45; 0.5)$ and $z \in [0.6; 0.65)$. Using deep-field imaging data we extrapolate galaxy populations beyond the limiting magnitude of DES Y3 and calculate the properties of cluster member galaxies via statistical background subtraction. We construct mock galaxy clusters as random draws from a distribution function, and render mock clusters and line-of-sight catalogs into synthetic images in the same format as actual survey observations. Synthetic galaxy clusters are generated from real observational data, and thus are independent from the assumptions inherent to cosmological simulations. The recipe can be straightforwardly modified to incorporate extra information, and correct for survey incompleteness. New realizations of synthetic clusters can be created at minimal cost, which will allow future analyses to generate the large number of images needed to characterize systematic uncertainties in cluster mass measurements.


DDES-2020-0629, FERMILAB-PUB-21-049-AE

Mon. Not. R. Astron. Soc., 1–21 (2021). Printed 23 February 2021 (MN LaTeX style file v2.2)


T. N. Varga,* D. Gruen, S. Seitz, N. MacCrann, E. Sheldon, W. G. Hartley, A. Amon, A. Choi, A. Palmese, Y. Zhang, M. R. Becker, J. McCullough, E. Rozo, E. S. Rykoff, C. To, S. Grandis, G. M. Bernstein, S. Dodelson, K. Eckert, S. Everett, R. A. Gruendl, I. Harrison, K. Herner, R. P. Rollins, I. Sevilla-Noarbe, M. A. Troxel, B. Yanny, J. Zuntz, H. T. Diehl, M. Jarvis, M. Aguena, S. Allam, J. Annis, E. Bertin, S. Bhargava, D. Brooks, A. Carnero Rosell, M. Carrasco Kind, J. Carretero, M. Costanzi, L. N. da Costa, M. E. S. Pereira, J. De Vicente, S. Desai, J. P. Dietrich, I. Ferrero, B. Flaugher, J. García-Bellido, E. Gaztanaga, D. W. Gerdes, J. Gschwend, G. Gutierrez, S. R. Hinton, K. Honscheid, T. Jeltema, K. Kuehn, N. Kuropatkin, M. A. G. Maia, M. March, P. Melchior, F. Menanteau, R. Miquel, R. Morgan, J. Myles, F. Paz-Chinchón, A. A. Plazas, A. K. Romer, E. Sanchez, V. Scarpine, M. Schubnell, S. Serrano, M. Smith, M. Soares-Santos, E. Suchyta, M. E. C. Swanson, G. Tarle, D. Thomas, and J. Weller (DES Collaboration)

Author affiliations are listed at the end of this paper.


Key words: cosmology: observations, gravitational lensing: weak, galaxies: clusters: general

* Corresponding author: [email protected]

The study of galaxy clusters has in recent years become a prominent pathway towards understanding the nonlinear growth of cosmic structure, and towards constraining the cosmological parameters of the universe (Allen et al. 2011; Kravtsov & Borgani 2012; Weinberg et al. 2013). Weak gravitational lensing provides a practical method to study the mass properties of clusters. It relies on estimating the gravitational shear imprinted onto the shapes of background source galaxies. The lensing effect is directly connected to the gravitational potential of the lens, and its measurement is readily scalable to an ensemble of targets in wide-field surveys (Bartelmann & Schneider 2001). For this reason the lensing based mass calibration of galaxy clusters has become a standard practice for galaxy cluster based cosmological analyses (Rozo et al. 2010; Mantz et al. 2015; Planck Collaboration 2016; Costanzi et al. 2019; Bocquet et al. 2019; DES Collaboration 2020).

Methods for estimating the shapes of galaxies include model fitting and measurements of second moments, with several innovative approaches developed in recent literature (Zuntz et al. 2013; Refregier & Amara 2014; Miller et al. 2013; Bernstein & Armstrong 2014; Huff & Mandelbaum 2017; Sheldon & Huff 2017; Sheldon et al. 2020). Irrespective of the chosen family of algorithms, the performance of the shear estimates cannot be guaranteed a priori, and needs to be validated in a series of tests (Jarvis et al. 2016, Fenech Conti et al. 2017, Zuntz & Sheldon et al., 2018, Samuroff et al. 2018, Mandelbaum et al. 2018, Kannawadi et al. 2019). These rely on synthetic observations: image simulations which are then used to estimate the bias and uncertainty of the different methods in a controlled environment (Massey et al. 2007; Bridle et al. 2009; Mandelbaum et al. 2015; Samuroff et al. 2018; Kannawadi et al. 2019; Pujol et al. 2019; MacCrann et al. 2020).

Galaxy clusters present a unique challenge for validating weak lensing measurements for a multitude of reasons: they deviate from the cosmic median line-of-sight in terms of the abundance and properties of cluster member galaxies (Hansen et al. 2009; To et al. 2019), resulting in increased blending among light sources (Simet & Mandelbaum 2015; Euclid Collaboration 2019; Eckert et al. 2020, Everett & Yanny et al., 2020); they host a diffuse intra-cluster light (ICL) component (Zhang et al. 2019; Gruen et al. 2019; Sampaio-Santos et al. 2020; Kluge et al. 2020) influencing photometry; and they induce characteristically stronger shears at small scales (McClintock & Varga et al., 2019).

In this study we create synthetic galaxy clusters, and optical observations of these synthetic galaxy clusters, in an unsupervised way from a combination of observational datasets. To achieve this, we measure and model the average galaxy content of redMaPPer selected galaxy clusters in Dark Energy Survey Year 3 (DES Y3) data, along with a measurement and model for galaxies in the foreground and background. During this procedure the DES Y3 wide-field survey (Sevilla-Noarbe et al. 2020) is augmented with information from deep-field imaging data (Hartley & Choi et al., 2020), resulting in enhanced synthetic catalog depth and better resolved galaxy features. Each synthetic cluster and its line-of-sight is generated as a random draw from a model distribution, which enables creating the large numbers of mock cluster realizations required for benchmarking precision measurements. This approach circumvents the computational cost and limited representation of reality of numerical simulations.
The synthetic catalogs of cluster member galaxies and foreground and background galaxies, along with the small-scale model for light around the cluster centers, are then rendered into images in the same format as actual survey observations and can be further processed with the standard data reduction and analysis pipelines of the survey.

The synthetic cluster images are controlled environments, where all light can be traced back to a source specified in the underlying model. A mass model calibrated by McClintock et al. 2019 is used to imprint a realistic lensing signal on background galaxies, which will enable future studies to perform end-to-end tests for recovering cluster masses from a weak lensing analysis of synthetic images, incorporating photometric processing, shear and photometric redshift measurement and systematic calibration for lensing profiles and maps in a fully controlled environment. This is different from insertion based methods (Suchyta et al. 2016, Everett & Yanny et al., 2020), where synthetic galaxies are added onto real observations: our method involves a generalization step which avoids re-using identical clusters multiple times, and the full control of synthetic data allows quantifying the specific impact of the different cluster properties on the lensing measurement.

The primary focus of this work is to present the algorithm and a pilot implementation for generating synthetic cluster observations for the DES Y3 observational scenario, mimicking the stacked lensing strategy of McClintock & Varga et al., (2019) and DES Collaboration (2020). Due to the transparent nature of the framework, changes and improvements aiming for increased realism (e.g. corrections for input photometry incompleteness, or high resolution, deep cluster imaging) can be directly added to the model in future studies.
For this reason, the presented algorithm is expected to be easily generalized and expanded to other ongoing (HSC: Hyper Suprime-Cam, KiDS: Kilo-Degree Survey) and upcoming (LSST: Legacy Survey of Space and Time, Euclid, Nancy Grace Roman Space Telescope) weak lensing surveys as well.

The structure of this paper is the following: in Section 2 we introduce the DES Year 3 (Y3) dataset, in Section 3 we outline the statistical approach used in modeling the synthetic lines-of-sight, in Section 4 we describe the concrete results of the galaxy distribution models derived from the DES Y3 dataset, and finally in Section 5 we outline the method for generating mock observations for DES Y3. In the following we assume a flat $\Lambda$CDM cosmology with $\Omega_m = 0.3$ and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, with distances defined in physical coordinates, rather than comoving.

The first three years of DES observations were made between August 15, 2013 and February 12, 2016 (DES Collaboration 2016; Sevilla-Noarbe et al. 2020). This Y3 wide-field dataset has achieved nearly full footprint coverage albeit at shallower depth, with on average 4 tilings in each band ($g$, $r$, $i$, $z$) out of the eventually planned 10 tilings. From the full 5000 deg$^2$, the effective survey area is reduced to approximately 4400 deg$^2$ due to the masking of the Large Magellanic Cloud and bright stars. In parallel to the wide-field survey a smaller, deep-field survey is also conducted, covering a total unmasked area of 5.9 deg$^2$ in 4 patches (Hartley & Choi et al., 2020). These consist of un-dithered pointings of the Dark Energy Camera (DECam, Flaugher et al. 2015) repeated on a weekly cadence, resulting in data 1.5 - 2 mag deeper than the wide-field survey. The DES Y3 footprint is shown on Figure 1. We use three of the four DES Y3 Deep Fields, denoted as SN-C, SN-E and SN-X. These consist of 8 partially overlapping tilings: three tilings each for SN-C and SN-X, and two for SN-E. Their location is also shown on Figure 1.
HSC: http://hsc.mtk.nao.ac.jp/ssp/

KiDS: http://kids.strw.leidenuniv.nl/index.php

Euclid: http://sci.esa.int/euclid/

Nancy Grace Roman Space Telescope: https://wfirst.gsfc.nasa.gov/

Figure 1. [Sky map with axes RA and DEC; Deep Field regions SN-X, SN-C and SN-E marked; color scale: $n_c$.] Footprint of targeted clusters in DES Y3. Blue markers: location of Deep Field regions SN-C, SN-E, SN-X (marker size not to scale). The color scale indicates the number density of galaxy clusters ($n_c$) identified by the redMaPPer algorithm.

The primary photometric catalog of DES Y3 is the Y3A2 GOLD dataset (Sevilla-Noarbe et al. 2020). This includes catalogs of photometric detections and parameters from the wide-field survey as well as the corresponding maps of the characteristics of the observations, foreground masks, and star-galaxy classification.

Data processing starts with single epoch images for which detrending and photometric corrections are applied. They are subsequently co-added to facilitate the detection of fainter objects. The base set of photometric detections is obtained via SExtractor (Bertin & Arnouts 1996) from $r + i + z$ coadds. The fiducial photometric properties for these detections are derived using the single-object-fitting (SOF) algorithm based on the ngmix (Sheldon 2015) software, which performs a simultaneous fit of a bulge + disk composite model (CModel, cm) to all available exposures of a given object while modelling the point spread function (PSF) as a Gaussian mixture for each exposure. An expansion of this model is the multi-object-fitting (MOF) (Sevilla-Noarbe et al. 2020) approach where, in addition to the above first step, friends-of-friends (FoF) groups of galaxies are identified based on their fiducial models, and in a subsequent step the galaxy models are corrected for all members of a FoF group in a combined fit. While for the Y3A2 GOLD dataset the SOF and MOF photometry were found to yield similar solutions, it is expected that in crowded environments the MOF photometry would perform better, due to its more advanced treatment of blending.

The $10\sigma$ detection limit for galaxies using SOF photometry in the Y3A2 catalog is $g = 23.78$, $r = 23.56$, $i = 23.04$, $z = 22.39$, defined in the AB system (Sevilla-Noarbe et al. 2020).
This corresponds to a 99 per cent completeness for galaxies with $i < 22.5$. Star-galaxy separation is performed based on the morphology derived from SOF and MOF quantities, which for the $i < 22.5$ sample has 98.5 per cent efficiency and 99 per cent purity, yielding approximately 226 million extended objects out of a base sample of 390 million detections. SOF and MOF derived magnitudes are corrected for atmospheric and instrumental effects and for interstellar extinction to obtain the final corrected magnitudes.

We consider an optically selected sample of galaxy clusters identified by the redMaPPer algorithm in the DES Y3 data (Rykoff et al. 2014). The base input for this cluster finding is the Y3A2 SOF photometry catalog described above, from which redMaPPer identifies galaxy clusters as overdensities of red-sequence galaxies. This analysis uses redMaPPer version v6.4.22+2. An optical mass proxy, the richness $\lambda$, is assigned to each cluster, defined by the effective number of red-sequence member galaxies brighter than $0.2\,L_*$. Cluster redshifts are estimated based on the photometric redshifts of likely cluster members, yielding a nearly unbiased estimate with a scatter of σ_z/(1 + z) ≈ z ≈ i ≈ λ > 5 and more than 21,000 above $\lambda > 20$. The spatial distribution of the latter higher richness sample is shown on Figure 1, and the richness and redshift distribution is shown on Figure 2. In addition to the cluster catalog, a catalog of reference random points is also provided, which are drawn from the part of the footprint where survey conditions permit the detection of a cluster of given richness and redshift.

Finally we note that redMaPPer uses SOF derived photometric catalogs instead of MOF; however, this is expected to have no impact on the result of this work as we only utilize the positions, richnesses and redshifts of the clusters.

Figure 2. Distribution of redMaPPer clusters in the DES Y3 dataset in the volume-limited sample. Solid black rectangles: narrow redshift selection. Blue dotted rectangles: DES Y1 cluster cosmology selection.

The DES supernova and deep field survey is organized into four distinct fields: SN-S, SN-X, SN-C and SN-E (Kessler et al. 2015; Abbott et al. 2019, Hartley & Choi et al., 2020). In this work we only consider the SN-X, SN-C, SN-E fields covering a total unmasked area of 4.64 deg$^2$, which overlap with the VISTA Deep Extragalactic Observations (VIDEO) survey (Jarvis et al. 2013), providing $J$, $H$, $K$ band coverage.

In the present study we consider only the detections derived from the COADD_TRUTH stacking strategy, which aims to optimize for reaching approximately $10\times$ the wide-field survey depth while requiring that the deep field resolution (FWHM) be no worse than the median FWHM in the wide-field data (Hartley & Choi et al., 2020).

A difference compared to Y3A2 GOLD is that the MOF algorithm is run with "forced photometry", where astrometry and deblending are done using DECam data, and infrared bands are incorporated only for the photometry measurement. This approach results in a coadded consistent photometric depth of $i = 25$ mag. The photometric performance of these solutions was compared between the DES wide and deep field datasets using a joint set of photometric sources, finding very good agreement on the derived colors (see Fig. 12 of Hartley & Choi et al., 2020). Additionally, for the deep field photometry the ngmix algorithm is run using the bulge + disk composite model with fixed size ratio between the bulge and disk components (in the following denoted as bdf to distinguish from the wide-field processing).

Figure 3. [Four image panels with axes $x$ [pix] and $y$ [pix]. Panel titles: "RA: 3.3305 deg, DEC: -41.2431 deg, $\lambda = 46.70$, $z = 0.33$, DES Y3" and "Synthetic DES, $\lambda \in [45; 60)$, $z \in [0.3; 0.35)$". Legends: cluster $i = 17.4$, cluster $i = 24.5$, field $i = 17.4$, field $i = 24.5$; cluster $z = 0.325$, foreground $z < 0.325$, background $z > 0.9$, background $z = 0.4$.] Real and synthetic galaxy cluster side by side. Top: gri color composite image of a real redMaPPer galaxy cluster in the DES Y3 footprint. Second row: gri color composite image of a synthetic galaxy cluster representative of $\lambda \in [45; 60)$, $z \in [0.3; 0.35)$. Third row: brightness distribution of the synthetic light sources for cluster members (red/brown) and foreground and background objects (blue). Darker shades and larger symbols correspond to brighter objects. Bottom row: exaggerated shear map of background sources (red ellipses) with the shade representing redshift, cluster members (black) and foreground sources (green).

A photometric redshift estimate is derived by Hartley & Choi et al., (2020) for the deep-field galaxies via the EAzY algorithm (Brammer et al. 2008). These photometric redshift estimates are obtained by fitting a mixture of stellar population templates to the ugrizJHK band fluxes of the deep field galaxies. The possible galaxy redshifts and stellar template parameters are varied jointly to obtain a redshift probability density function. The redshift estimates are validated using a reference set of spectroscopic galaxy redshifts over the same footprint, and Hartley & Choi et al., (2020) find overall good performance for bright and intermediate depths, which however deteriorates into a very large outlier fraction for the faintest galaxies ($i > 24$).
In light of this we note that our algorithm for modeling the properties of cluster member galaxies presented in this analysis does not rely on redshifts, and we consider photometric redshifts only for describing the line-of-sight distribution of foreground and background galaxies. Due to the substantially shallower limiting depth of the DES Y3 wide-field survey, the impact of the increased fraction of very faint ($i > 24$) redshift outliers is expected to be negligible.

The focus of this study is to measure and model the galaxy content of redMaPPer selected galaxy clusters within a bin of cluster properties, and to use this measurement to create mock galaxy clusters. The cluster member model is complemented by a measurement and model for the properties of foreground and background galaxies. Each mock cluster is constructed to be representative, in terms of its member galaxies, of the whole bin of cluster properties, and does not aim to capture cluster-to-cluster or line-of-sight to line-of-sight variations.

By construction, the clusters identified by redMaPPer are always centered on a bright central galaxy (BCG). Central galaxies form a unique and small subset of all galaxies, and therefore we treat them separately from non-central galaxies. In our synthetic observations we consider for each cluster bin a mock central galaxy which has the mean properties of the observed redMaPPer BCGs within that bin. In this study, we only consider clusters selected on richness and redshift (mimicking DES Collaboration 2020), and do not aim to incorporate correlated scatter between additional observables and mass properties at fixed selection. Thus the task for the rest of this section is to model the properties and distribution of non-central, foreground and background galaxies, in the following simply denoted as galaxies.
Faint stars are treated in the same framework as foreground galaxies, while bright stars, transients, streaks, and other imperfections which are masked during data processing are not incorporated in this model (these can nevertheless be added after the synthetic images are generated).

Throughout this analysis we assume that galaxies are to first order sufficiently described by a set of observable features, primarily provided by the DES photometric processing pipeline. The key features are: $i$-band magnitude $m_i$ with de-reddening and other relevant photometric corrections applied, colors $c = (g - r, r - i, i - z)$, galaxy redshift $z_g$, and morphology parameters $s$ describing the scale radius, ellipticity and flux ratio of the two components of the ngmix SOF/MOF bulge + disk galaxy model. The full list of features and their relation to the DES Y3 data products is listed in Table A1.

Our aim is to model the distribution of cluster member galaxies, and foreground and background galaxies, in the space of the above features as a function of projected separation $R$ from galaxy clusters of richness $\lambda$ and redshift $z$. These distributions cannot be directly measured from the DES wide-field survey as individual cluster member galaxies cannot be identified with sufficient completeness from photometric data alone, and the bulk of the galaxy populations lie beyond the completeness threshold magnitude of i ≈ c_ref; R | λ, z) is measured in the wide-field survey (Section 3.2 and Section 3.3). In the second step the wide-field target distribution is used as a prior for resampling the galaxy features measured in the DES Deep Fields (Section 3.5). Comparing the target distribution around clusters and around a set of reference random points enables us to isolate the feature distribution of cluster members (Section 3.6).
Thus the resampling transforms the deep-field feature distribution into an estimate of the full feature distribution of cluster member galaxies, while keeping additional features measured accurately only in the deep-field data, and extrapolates the cluster population to fainter magnitudes.

Figure 3 shows an illustration of a mock cluster generated as a result of this analysis at the level of a galaxy catalog and also as a fully rendered DES Y3-like coadd image, along with an actual redMaPPer cluster taken from the DES Y3 footprint with similar richness and redshift.

We group galaxy clusters into two bins of richness $\lambda \in [30; 45)$ and $[45; 60)$, and three bins of redshift $z \in [0.3; 0.35)$, $[0.45; 0.5)$ and $[0.6; 0.65)$, where each sample is processed separately. Our binning scheme is motivated by the selections of McClintock & Varga et al., (2019) and DES Collaboration (2020), shown on Figure 2. In this pathfinder study, however, we only cover their central richness bins, and enforce a narrower redshift selection to reduce the smearing of observed photometric features (e.g. red sequence) due to mixing of cluster members at different redshifts. While this smearing is not a limitation for the presented model, reduced smearing and redshift mixing will enable useful sanity checks in evaluating performance.

The base dataset for this study is a subset of the Y3A2 GOLD photometric catalog selected via the flags listed in Table A2, queried from the DES Data Management system (DESDM, Mohr et al. 2008). The flags are chosen to yield a high-completeness galaxy sample while excluding photometry failures. For each cluster in a given cluster selection we select all entries from this base catalog which are within a pre-defined search radius θ_query ≈ –3 ; 0.1) arcmin, and in 50 consecutive logarithmically-spaced radial bins between 0.1 arcmin and 100 arcmin.
Then, from each radial range we draw $N_\mathrm{draw} = \min(N_\mathrm{bin}; N_\mathrm{th})$ galaxies, where $N_\mathrm{bin}$ is the number of galaxies in the radial bin, and $N_\mathrm{th} = 10000$ is a threshold number.

The random draws are equally partitioned across the $N_\mathrm{clust}$ clusters (that is, from the vicinity of each cluster approximately $N_\mathrm{draw} / N_\mathrm{clust}$ galaxies are drawn without replacement from each radial bin). To account for the number threshold $N_\mathrm{th}$, for each drawn galaxy a weight

$$w_\mathrm{bin} = N_\mathrm{bin} / N_\mathrm{draw} \qquad (1)$$

is assigned. Therefore the number of tracers representing the galaxy distribution is reduced in an adaptive way. For each selected galaxy the full catalog row is transferred from the GOLD catalog, and through the random draws the same galaxy can enter multiple times, but at different radii.

The outcome of the above is a galaxy photometry catalog containing the projected radius $R$ of each entry measured from the targeted cluster sample, with a weight for each entry. The measurement is repeated for a sample of reference random points selected in the same richness and redshift range as the cluster sample. This second dataset is representative of the field galaxy distributions; however, through the spatial and redshift distribution of the reference random points it also incorporates the impact of survey inhomogeneities and masking.

Foreground stars appear in the projected vicinity of each galaxy cluster on the sky and also within the deep-field areas, and enter into the photometry dataset. The model presented in this study is not dependent on separation between stars and galaxies, as stars are automatically removed during statistical background subtraction. Nevertheless, the photometric properties of stars differ from those of galaxies, which increases the computational cost, as the difference between the proposal and target distribution increases when a large number of stars are included.
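As an illustration of the adaptive subsampling and the weights of Equation 1, the following minimal Python sketch bins a toy set of projected radii into 50 logarithmically spaced bins between 0.1 and 100 arcmin, draws at most a threshold number of entries per bin, and assigns the weight $N_\mathrm{bin} / N_\mathrm{draw}$ to each draw. The function names, the log-uniform toy radii and the reduced threshold of 100 (instead of $N_\mathrm{th} = 10000$) are assumptions made for this example, and the equal partitioning of the draws across clusters is not modeled.

```python
import bisect
import random


def log_bin_edges(r_min, r_max, n_bins):
    """Edges of n_bins logarithmically spaced bins between r_min and r_max."""
    ratio = (r_max / r_min) ** (1.0 / n_bins)
    return [r_min * ratio**k for k in range(n_bins + 1)]


def subsample_with_weights(radii, edges, n_th, rng):
    """Draw at most n_th entries per radial bin, each weighted by N_bin / N_draw."""
    bins = [[] for _ in range(len(edges) - 1)]
    for idx, r in enumerate(radii):
        k = bisect.bisect_right(edges, r) - 1  # half-open bins [edge_k, edge_k+1)
        if 0 <= k < len(bins):
            bins[k].append(idx)

    drawn = []  # list of (galaxy index, weight)
    for members in bins:
        n_bin = len(members)
        if n_bin == 0:
            continue
        n_draw = min(n_bin, n_th)
        w_bin = n_bin / n_draw  # Equation 1
        for idx in rng.sample(members, n_draw):  # without replacement
            drawn.append((idx, w_bin))
    return drawn


rng = random.Random(42)
edges = log_bin_edges(0.1, 100.0, 50)
# Hypothetical toy catalog: radii distributed log-uniformly in [0.1, 100] arcmin.
radii = [10 ** rng.uniform(-1.0, 2.0) for _ in range(20000)]
drawn = subsample_with_weights(radii, edges, n_th=100, rng=rng)
total_weight = sum(w for _, w in drawn)
```

By construction the weights within each bin sum to $N_\mathrm{bin}$, so weighted counts of the reduced tracer set recover the radial profile of the full catalog.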
To limit the added cost from stars we employ a size–luminosity cut $i < -50 \times \log(1 + T) + 22$ to remove the bulk of the stellar population, where $T$ is the effective size of a detection defined as listed in Table A1. These objects will be re-added at a later stage to produce survey-like observations.

Our aim is to generalize the features of a finite set of observed galaxies into an estimate of their multivariate feature probability density function (PDF). We achieve this task via kernel density estimation (KDE), which is a type of unsupervised learning algorithm (Parzen 1962; Hastie et al. 2001). In brief, the finite set of data points is convolved with a kernel function $K(r, h)$, where $h$ is the bandwidth which sets the smoothing scale during the PDF reconstruction. We adopt a multivariate Gaussian kernel function $K(r, h)$ formulated for $d$ dimensional data with a single bandwidth $h$ equal to the standard deviation. This way gaps and undersampled regions are modeled to have non-zero probability. For the practical calculation of KDEs we make use of the scikit-learn implementation of the above algorithm (https://scikit-learn.org/stable/modules/density.html).

The photometry catalog has features with very disparate scales (e.g., the value range and distribution of galaxy magnitudes and galaxy colors is markedly different). This means that any single bandwidth $h$ (smoothing scale) is not equally applicable for all dimensions. To address this we standardize and transform the input features before the KDE step into a set of new features which are better described by a single bandwidth parameter. First we subtract the mean of each feature, then perform a principal component analysis (PCA) to find the eigendirections of the input features (Hastie et al. 2001) via the scikit-learn implementation (https://scikit-learn.org/stable/modules/decomposition.html) and map the features of each galaxy into a set of eigenfeatures. Finally, these are standardized by dividing each eigenfeature by its estimated standard deviation among the sample.

In order to find the optimal bandwidth $h$ for each KDE, we perform k-fold cross-validation, leaving one fold out in turn (Hastie et al. 2001). Here the same base data is split into $k$ equal parts, and from these each part is once considered as the test data, and the remainder is used as the training data. In this approach the score $S = \frac{1}{N} \sum_{j}^{N} \ln p_n(x_j, h)$ is calculated $k = 5$ times on different training and test combinations, and from this a joint cross-validation score is estimated.
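The standardization and bandwidth selection described above can be sketched as follows. This is a minimal NumPy-only illustration and not the scikit-learn based implementation used in this work: the Gaussian KDE is written out by hand, and the function names, the two-dimensional toy features and the 15-point logarithmic bandwidth grid are assumptions made for the example. The score is taken to be the mean log-density of the held-out points, averaged over $k = 5$ folds.

```python
import numpy as np


def pca_standardize(x):
    """Subtract the mean, rotate onto the PCA eigendirections, scale to unit std."""
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    eigenfeatures = centered @ eigvecs
    return eigenfeatures / eigenfeatures.std(axis=0, ddof=1)


def mean_log_density(train, test, h):
    """Mean ln p(x_j) of the test points under a Gaussian KDE built on train."""
    d = train.shape[1]
    sq_dist = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=-1)
    log_k = -0.5 * sq_dist / h**2 - 0.5 * d * np.log(2.0 * np.pi * h**2)
    # Log-sum-exp over the training points, normalized by their number.
    m = log_k.max(axis=1, keepdims=True)
    log_p = m[:, 0] + np.log(np.exp(log_k - m).sum(axis=1)) - np.log(len(train))
    return log_p.mean()


def cv_score(x, h, k=5, seed=0):
    """k-fold cross-validation score; the same folds are reused for every h."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    scores = []
    for i in range(k):
        test = x[folds[i]]
        train = x[np.concatenate([folds[j] for j in range(k) if j != i])]
        scores.append(mean_log_density(train, test, h))
    return float(np.mean(scores))


rng = np.random.default_rng(1)
# Hypothetical toy features with disparate scales (e.g. a magnitude and a color).
raw = rng.multivariate_normal([20.0, 0.5], [[4.0, 0.3], [0.3, 0.05]], size=500)
feats = pca_standardize(raw)

grid = np.logspace(np.log10(0.01), np.log10(1.2), 15)  # in units of sigma = 1
scores = [cv_score(feats, h) for h in grid]
best_h = float(grid[int(np.argmax(scores))])
```

Because the eigenfeatures have unit standard deviation, the bandwidth grid can be read directly in units of $\sigma$. The preferred bandwidth for such a small toy sample need not match the value adopted for the much larger survey catalogs.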
The final KDE is then constructed from the full dataset using the bandwidth maximizing the cross-validation score.

Using PCA standardization, bandwidths can be expressed relative to the standard deviation $\sigma = 1$ of the various standardized eigenfeatures. Based on this we evaluate the cross-validation score on a logarithmically-spaced bandwidth grid from $0.01\sigma$ to $1.2\sigma$ for each KDE constructed. We find that $h = 0.1\sigma$ simultaneously provides a good bandwidth estimate for the deep-field and the wide-field KDEs; for this reason we adopt it as a global bandwidth for further calculations.

Our aim is to model the radial feature distribution of cluster member galaxies for different samples of galaxy clusters. These must be separated from the distribution of foreground and background galaxies, which we expect to be similar to the galaxies of the mean survey line-of-sight. The input data product for the following calculations is the feature PDF estimated for each of the various deep-field and wide-field galaxy catalogs using the KDE approach in Section 3.3. The full list of feature definitions is shown in Table A1.

Photometric redshift estimates available for the DES wide-field (Hoyle & Gruen et al., 2018; Myles & Alarcon et al., 2020) are not precise enough to isolate a sufficiently pure and complete sample of cluster member galaxies across the full range of galaxy populations (e.g. not only the red sequence). To avoid this limitation, we perform a statistical background subtraction (Hansen et al. 2009) to estimate the feature distribution of pure cluster member galaxies. In this framework we describe the line-of-sight galaxy distribution around galaxy clusters $p_\mathrm{clust}$ as a two-component system of a cluster member population $p_\mathrm{memb}$, and a field population which is approximated by the distribution around reference random points $p_\mathrm{rand}$.
This yields

$$ p_{\rm memb}(\theta, R) = \frac{\hat{n}_r}{\hat{n}_c - \hat{n}_r} \left[ \frac{\hat{n}_c}{\hat{n}_r}\, p_{\rm clust}(\theta, R) - p_{\rm rand}(\theta, R) \right], \tag{2} $$

where in practice both PDFs on the right-hand side are KDEs constructed from the wide-field dataset, $\theta$ is the list of features considered, and $R$ is the projected separation from the targeted positions on the sky. $\hat{n}_c$ and $\hat{n}_r$ refer to the mean number of galaxies detected within $R_{\rm max}$ around clusters and random points.

The above approach is only applicable for those features $\theta$ and their respective value ranges which are covered by the wide-field dataset. Furthermore, the formalism implicitly assumes that the PDFs are dominated by the intrinsic distribution of properties, and not by measurement errors. To fulfill this requirement the wide-field data must be restricted to a parameter range where photometry errors play a subdominant role, and the completeness of the survey is high. This necessitates excluding the bulk of the galaxy population from the naive background subtraction scheme.

Especially important in relation to this study are galaxies whose flux is great enough to meaningfully contribute to the total light in a part of the sky, yet are not fully resolved or cannot be detected with confidence using standard survey photometry pipelines (Suchyta et al. 2016; Everett & Yanny et al. 2020). Nevertheless, these partial or non-detections have a significant impact on the photometric performance of survey data products (Hoekstra et al. 2017; Euclid Collaboration 2019; Eckert et al. 2020). Therefore they must be modeled and included in the statistical description of a line-of-sight. A distinct undetected population of galaxies is associated with galaxy clusters: the faint end of the cluster member galaxy population. The feature distribution of these galaxies is markedly different from the distribution of faint galaxies in the field (cosmic mean) line-of-sight.

[Figure 4: color-magnitude panels, $g - r$ versus $i$ mag, labeled $p_W$, $p_D$ and $p_{D;\rm prop}$.]

Figure 4. Illustration of the re-weighting approach according to Equation 5 and the various ingredients for the radial range $R \in [10^{-0.5}; 1)$ arcmin around redMaPPer galaxy clusters with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$. Left: color PDF estimates for the wide-field shown in magenta, and the depth-restricted Deep Field shown in green. Center left: color-magnitude diagram of galaxies in the DES wide-field survey (not directly used in the transformation). This is the target which the transformation aims to reproduce for $i < 22.5$. Center right: transformed deep-field distribution according to Equation 5. Right: color-magnitude diagram of galaxies measured in the DES Deep Fields. Dashed vertical lines: wide-field completeness magnitude $i \approx$ […] $i < 22.5$ magnitude range, the color-based re-weighting shown on the center-right panel is in very good agreement with the color-magnitude distribution of the cluster line-of-sight shown on the center-left panel. The color scale is capped to the same level on the three right panels to allow direct comparison of the distributions.

To characterize the properties of galaxies too faint to have complete detections in the DES wide-field survey, we make use of the DES Deep Fields. Owing to significantly greater exposure time over many epochs, the completeness depth of the Deep Fields in the COADD_TRUTH mode is $\sim$ […] $i < 22.5$ there are features measured more robustly for the Deep Fields, such as the ngmix SOF/MOF morphology model parameters. However, the colors of photometric sources detected in both datasets are found to be largely robust against the differences in the photometry analysis choices (see Section 2.3 of Everett & Yanny et al. 2020). Therefore we aim to combine the galaxy distributions of the Deep Fields and the wide-field using colors to inform the extrapolation of the various feature distributions to fainter magnitudes.

First, we denote our target distribution $p_D(\theta, R \,|\, \lambda, z)$, where the subscript $D$ indicates that the distribution is estimated from the Deep Fields down to a completeness limit of $i \approx$ […] $W$, and denote restricting a deep-field derived quantity to the shallower wide-field depth with $|_W$. In the following we decompose $\theta$ into two sets of features: $\theta_{\rm wide}$, which can be measured from the wide-field dataset, and $\theta_{\rm deep}$, which can only be reliably measured from the Deep Fields:

$$ p_D(\theta, R \,|\, \lambda, z) \equiv p_D(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z). \tag{3} $$

Here we note that $R$, $\lambda$, and $z$ are features and quantities which also only originate from the wide-field dataset.
We note that all features in $\theta_{\rm wide}$ can also be measured with confidence in the Deep Fields, but the reverse is not necessarily true.

We formulate Equation 3 as a transformation of a naive proposal distribution, expressed by the factorization

$$ p_D(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) = p_{D;\rm prop}(\theta_{\rm deep}, \theta_{\rm wide}) \times F(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z). \tag{4} $$

Here we separate the task into two parts, where the proposal distribution $p_{D;\rm prop}$ carries information measured from the Deep Fields, and the multiplicative term $F$ represents the required transformation of the PDF.

The simplest such transformation is derived in Appendix B, and the corresponding extrapolated distribution is given as

$$ \tilde{p}_D(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) \approx \frac{p_D(\theta_{\rm deep}, \theta_{\rm wide})\; p_W(\theta_{\rm wide}, R \,|\, \lambda, z)}{\hat{V} \cdot p_D(\theta_{\rm wide})\big|_W}, \tag{5} $$

where $\tilde{p}$ indicates a survey-depth extrapolated PDF. In simple terms, $p_D(\theta_{\rm deep}, \theta_{\rm wide})$ describes the correlation between features seen only in the Deep Fields and features seen also in the wide-field survey, while $p_W(\theta_{\rm wide}, R \,|\, \lambda, z) / p_D(\theta_{\rm wide})\big|_W$ captures the imprint of the cluster on the feature distributions. This framework conserves the color-dependent luminosity function, and obeys

$$ \tilde{p}_D(\theta_{\rm deep} \,|\, \theta_{\rm wide}, R, \lambda, z) \equiv p_D(\theta_{\rm deep} \,|\, \theta_{\rm wide}). \tag{6} $$

Since magnitudes are part of $\theta_{\rm deep}$, this means that the final PDF estimate inherits the luminosity function of the Deep Fields, along with all additional features which are measured in the Deep Fields.

An illustration of the outcome and the ingredients of this approach is shown on Figure 4. There, the center left panel shows the target distribution: the color-magnitude diagram of galaxies measured in projection with $R \in [10^{-0.5}; 1)$ arcmin around redMaPPer galaxy clusters with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$ in the DES wide-field survey. The leftmost panel shows a wide-field and the restricted deep-field feature (color) distribution.
The rightmost panel shows the proposal distribution of galaxies measured in the DES Deep Fields, with the wide-field completeness magnitude shown as the vertical dashed line. The center right panel shows the transformed deep-field distribution according to Equation 5, where the radial color distribution around the cluster sample was used as the target PDF. The color scale is identical in the three panels, with iso-probability contours overlaid. For simplicity we take $\theta_{\rm wide} = c_{\rm wide}$ as a set of colors measured in both the wide-field survey and the deep-field survey, and $\theta_{\rm deep} = (m, s, c_{\rm deep}, z_g)$ is a vector composed of magnitudes, colors, morphology parameters and redshifts measured in the deep-field survey according to Table A1.

In the KDE framework, evaluating the PDF is computationally much more expensive than drawing random samples from it. Therefore, instead of directly performing the background subtraction, we generate random samples from the target distribution $\tilde{p}_{D;\rm memb}$. For this we make use of rejection sampling (MacKay 2002). In short, this generates random variables distributed according to a target distribution $p_{\rm targ}$ by performing random draws from a proposal distribution $p_{\rm prop}$, which are then accepted or rejected according to a decision criterion.

Appendix C derives the decision criterion for the combined statistical background subtraction and extrapolation.
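As an aside on Equation 5, the color-based re-weighting can be sketched with binned color PDFs. This is only a toy illustration under stated assumptions: histograms stand in for the KDEs, a single color stands in for $c_{\rm wide}$, the radial dependence is dropped, the factor $\hat{V}$ is absorbed into the normalization, and the function name `color_reweight` and all catalog numbers are hypothetical.

```python
import numpy as np

def color_reweight(c_deep, i_deep, c_wide, bins, i_lim=22.5):
    """Weights p_W(c) / p_D(c)|_W for every deep-field galaxy (binned stand-in for Eq. 5)."""
    bright = i_deep < i_lim                                   # the "|_W" depth restriction
    p_w, _ = np.histogram(c_wide, bins=bins, density=True)    # wide-field target color PDF
    p_d, _ = np.histogram(c_deep[bright], bins=bins, density=True)
    ratio = np.divide(p_w, p_d, out=np.zeros_like(p_w), where=p_d > 0)
    idx = np.clip(np.digitize(c_deep, bins) - 1, 0, len(ratio) - 1)
    inside = (c_deep >= bins[0]) & (c_deep < bins[-1])
    return np.where(inside, ratio[idx], 0.0)

# Toy catalogs; every number here is made up for illustration
rng = np.random.default_rng(1)
n_deep = 200_000
i_deep = rng.uniform(19.0, 24.5, n_deep)                  # deep-field magnitudes
c_deep = rng.normal(0.8, 0.4, n_deep)                     # field-like colors
c_wide = np.concatenate([rng.normal(1.4, 0.1, 20_000),    # red-sequence-like members
                         rng.normal(0.8, 0.4, 20_000)])   # field galaxies
bins = np.linspace(0.0, 2.0, 21)

w = color_reweight(c_deep, i_deep, c_wide, bins)
bright = i_deep < 22.5
p_target, _ = np.histogram(c_wide, bins=bins, density=True)
p_bright, _ = np.histogram(c_deep[bright], bins=bins, weights=w[bright], density=True)
p_faint, _ = np.histogram(c_deep[~bright], bins=bins, weights=w[~bright], density=True)
```

In this toy setup the weighted bright deep-field colors (`p_bright`) reproduce the wide-field target, while galaxies fainter than the wide-field limit receive their weights purely through their colors (`p_faint`), which is the sense in which the cluster imprint is carried to fainter magnitudes.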
Using the decision criterion of Appendix C, we can generate random samples from the extrapolated $\tilde{p}_{\rm memb}$ by drawing samples $\{m_i, c_i, s_i, z_{g;i}, R_i\}$ from

$$ p_{\rm prop}(m, c, s, z_g, R \,|\, \lambda, z) = p_D(m, c, s, z_g) \cdot p_{W;\rm rand}(R \,|\, \lambda, z) \tag{7} $$

and considering the subset which fulfills the extrapolated membership criteria

$$ \frac{\hat{n}_r}{\hat{n}_c}\, \frac{p_{W;\rm rand}(c^{\rm ref}_{{\rm wide};i}, R_i \,|\, \lambda, z)}{M \cdot p_D(c^{\rm ref}_{{\rm wide};i}) \cdot p_{W;\rm rand}(R_i \,|\, \lambda, z)} < u_i \tag{8} $$

and

$$ u_i < \frac{p_{W;\rm clust}(c^{\rm ref}_{{\rm wide};i}, R_i \,|\, \lambda, z)}{M \cdot p_D(c^{\rm ref}_{{\rm wide};i}) \cdot p_{W;\rm rand}(R_i \,|\, \lambda, z)}, \tag{9} $$

where $u_i$ is drawn from a uniform random distribution $U[0; 1)$. $c^{\rm ref}_{\rm wide}$ denotes a set of reference colors selected from $c_{\rm wide}$: $\{g - r;\, r - i\}$, $\{g - r;\, r - i\}$ and $\{r - i;\, i - z\}$ for the three cluster redshift bins respectively. These colors are chosen to bracket the red sequence at the respective redshift ranges in a manner similar to Rykoff et al. (2014). Note that these criteria already implicitly contain the evaluation of Equation 5 yielding an estimate of $\tilde{p}_{\rm memb}$, and are composed entirely of factors which can be directly estimated from either the wide-field or the deep-field galaxy datasets.

As a null-test, we can also perform the same resampling for the galaxies around random points, which, using the same proposal distribution as above, is defined by the criterion

$$ u_i < \frac{\hat{n}_r}{\hat{n}_c}\, \frac{p_{W;\rm rand}(c^{\rm ref}_{{\rm wide};i}, R_i \,|\, \lambda, z)}{M \cdot p_D(c^{\rm ref}_{{\rm wide};i}) \cdot p_{W;\rm rand}(R_i \,|\, \lambda, z)}, \tag{10} $$

which generates samples from the extrapolated field galaxy distribution $\tilde{p}_{\rm rand}$.

In the above formulas the factor $M$ must be chosen appropriately to ensure that the ratios are always less than or equal to unity. In practice there is no recipe for $M$, and a suitable value must be found for the actual samples proposed. Furthermore, measurement noise leads to small fluctuations in the KDEs which, especially in the wings of the distributions, manifest as $p_{\rm targ} / p_{\rm prop}$ being very poorly constrained.
To regularize this behaviour we relax the requirement on $M$ and in practice only require the criterion to be fulfilled for 99 per cent of the proposed points. We explore the $M$ range in an iterative fashion up to 500, and find no significant change in the distribution of the samples for $M > 40$, thus we adopt $M = 100$ throughout this study.

The random draws can be repeated until a sufficiently large sample is accepted for the cluster member and the field object dataset. Accepted draws can either be used directly to construct mock observations, or alternatively a KDE can be constructed to estimate the PDF of the cluster members and extrapolated field galaxies separately.

A practical limitation of this sampling method is that since the proposal $R_i$ values are drawn from the full considered radial range around clusters and reference random points, the larger radii will be much better sampled than the smaller radii because of the increase in surface area. In our implementation we counteract this by simultaneously considering multiple nested shells of overlapping radial intervals to ensure the efficient covering of the full radial range. While each of these PDFs is individually normalized to unity, we express the relative probability $p_l$ of a member galaxy residing in a given radial interval $r_l$ around a cluster as

$$ p_l \approx \frac{\hat{n}_{c;l} - \hat{n}_{r;l}}{p_l(i < 22.5)} \Bigg/ \sum_l \frac{\hat{n}_{c;l} - \hat{n}_{r;l}}{p_l(i < 22.5)}, \tag{11} $$

where $\hat{n}_{c;l}$ and $\hat{n}_{r;l}$ are the average numbers of galaxies around clusters and random points residing in the radial bin in the wide-field dataset, and $p_l(i < 22.5)$ is the probability that, based on the KDE in radial bin $l$, a galaxy is bright enough to be in the wide-field selection. While this formalism is similar to the direct background subtraction scheme defined in Section 3.4, it is only used to approximate the relative weight of different radial ranges, and does not influence the estimation of the feature PDFs within the radial ranges.
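The two-sided acceptance rule of Equations 8 to 10 can be illustrated with a minimal one-dimensional sketch. It assumes analytic Gaussian stand-ins for the KDEs, a single color and no radial term; the helper names (`gauss`, `p_deep`, `p_rand`, `p_clust`, `resample`) and all numerical values, including the toy choice $M = 10$, are hypothetical and not taken from the analysis.

```python
import numpy as np

def gauss(x, mu, sig):
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

# Toy 1D stand-ins for the PDFs in Equations 8-10 (color only, no radius term)
def p_deep(c):                 # proposal, plays the role of p_D
    return gauss(c, 0.8, 0.5)

def p_rand(c):                 # field PDF around random points, p_W;rand
    return gauss(c, 0.8, 0.45)

def p_clust(c, n_c, n_r):      # cluster line-of-sight PDF, p_W;clust
    f = n_r / n_c
    return (1.0 - f) * gauss(c, 1.4, 0.08) + f * p_rand(c)

def resample(n_prop, n_c, n_r, M, rng):
    c = rng.normal(0.8, 0.5, n_prop)           # draws from the proposal
    u = rng.uniform(0.0, 1.0, n_prop)
    lower = (n_r / n_c) * p_rand(c) / (M * p_deep(c))
    upper = p_clust(c, n_c, n_r) / (M * p_deep(c))
    members = c[(lower < u) & (u < upper)]     # Equations 8 and 9
    field = c[u < lower]                       # Equation 10 (null test)
    frac_valid = np.mean(upper <= 1.0)         # share of proposals with ratio <= 1
    return members, field, frac_valid

rng = np.random.default_rng(0)
members, field, frac_valid = resample(400_000, n_c=30.0, n_r=15.0, M=10.0, rng=rng)
```

In this toy the accepted `members` follow the narrow red-sequence-like component and the accepted `field` draws follow the random-point distribution, while `frac_valid` mirrors the 99 per cent requirement placed on $M$.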
For each sample of galaxy clusters we present the measurements and the corresponding KDE estimates for the two primary input distributions: the distribution of features around clusters in the wide-field data, and the distribution of features in the deep-field dataset. We note that each KDE is constructed globally for all features and the full value range, and not only for the shown conditional distributions.

4.1.1 Distributions of wide-field galaxies around clusters

Figure 5 shows the measured feature distribution of galaxies around a selection of redMaPPer galaxy clusters with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$. The features of this distribution are the reference colors $c_{\rm ref} = (g - r, r - i)$ and the projected radial separation $R$ measured from the target galaxy cluster centers. Using these sets of features a KDE is constructed according to Section 3.3, whose model for the PDF is shown as the continuous curves and contours on Figure 5, while the 1D and 2D histograms represent the measured data.

[Figure 5: top panels show PDFs of $g - r$ and $r - i$ in the radial bins $\log R < -0.5$, $-0.5 < \log R < 0$, $0 < \log R < 0.5$ and $0.5 < \log R < 1$ ($R$ in arcmin), and $\Sigma_{\rm gal}$ versus $\log R$ [arcmin]; bottom panels show $r - i$ versus $g - r$ in the same four radial bins, with color scale $p / p_{\rm peak}$.]

Figure 5. Distribution of galaxy features with $i < 22.5$ around redMaPPer galaxy clusters ($\lambda \in [45; 60)$, $z \in [0.3; 0.35)$) in the DES wide-field dataset. Top left and center: $g - r$ and $r - i$ color histograms of galaxies in bins of projected radius. Histogram: DES data. Contours: KDE reconstruction. The radial bins correspond to the radial shells used in the calculation. Top right: surface density profile of galaxies around the targeted cluster sample. Black: measured profile. Color: KDE reconstruction of the surface density profile, color coded to the radial bins of the top left and center panels. Bottom: $g - r$ versus $r - i$ color distribution of galaxies in the four radial shells. Each panel is normalized to the same color and contour levels such that the broadening of the color distribution of galaxies and the reduction in the prominence of the red sequence with increasing radius is clearly visible in the data and is well reproduced by the KDE. Histogram: DES data. Contours: KDE reconstruction. We note that the KDE is constructed globally for the full magnitude and feature ranges, and not only for the shown 2D marginal distribution.

The top left two panels of Figure 5 show galaxy colors at different projected radii from the cluster center for all galaxies with $i < 22.5$, while the bottom panels show the $g - r$ versus $r - i$ color-color diagram of galaxies with $i < 22.5$ in different radial bins. The histograms correspond to the measured distributions, while the contours represent the appropriate slice of the global KDE model. A prominent radial dependence is visible as the red sequence becomes increasingly dominant at small radii. The KDE model provides a good overall description of these galaxy distributions, capturing the two-component nature of the galaxy population. It recovers the position and the approximate relative weight of the red sequence population. We note that since the targeted galaxy clusters span a redshift range $\Delta z = 0.05$, the observed red sequence population is broadened by this dispersion relative to its intrinsic width.

The top right panel of Figure 5 shows the surface number density profile $\Sigma_{\rm gal}(R) = N(R) / 2\pi R$ of galaxies with $i < 22.5$ around the selected cluster sample in the wide-field survey as the solid black curve. Colored curves show the corresponding KDE models for the four nested shells. In addition to the target range of the KDEs, which are shown as the full lines, as a consistency test the interior continuation of the KDE model for the outermost nested spherical bin is shown as the dotted line.
This only shows mild deviation from the respective profile of the data, and the measured radial surface density profile and the KDE models show very good agreement. This means that the difference between the measured and modeled absolute density is very small over a range of two orders of magnitude, as set by the change in area element.

[Figure 6: color-color panels in $g - r$, $r - i$ and in $r - i$, $i - z$, in bins of $i$-band magnitude (including $21 < i < 22.5$ and $22.5 < i < 24$), with color scale $p / p_{\rm peak}$.]

Figure 6. Distribution of $g - r$, $r - i$, $i - z$ galaxy colors in the DES Deep Fields in bins of $i$-band magnitude. Histogram: DES data. Contours: KDE reconstruction. We note that the KDE is constructed globally for the full magnitude and feature ranges, and not only for the shown 2D marginal distribution.

[Figure 7: left panel, PDF of bulge / disk fraction; right panel, PDF of redshift; legend: deep $i < 24.5$, deep $i < 22.5$, KDE $i < 24.5$, KDE $i < 22.5$.]

Figure 7. Distribution of galaxy morphology parameters in the DES Deep Fields, as listed in Table A1. Histogram: DES data. Contours / curves: KDE reconstruction. We note that the KDE is constructed globally for the full magnitude and feature ranges, and not only for the shown marginal distributions.

Figure 6 shows the $g - r$ versus $r - i$ and the $r - i$ versus $i - z$ color-color diagrams of the deep-field galaxies in three different magnitude ranges. The measured distributions are shown as 2D histograms, and the corresponding KDE model is represented by contours. This KDE model is constructed simultaneously for all features listed in Table A1, and it provides an excellent description of the color-color-magnitude distribution of galaxies.

Figure 7 shows the same KDE model projected into the space of bulge / disk flux fraction (a morphology parameter) and redshift estimate. The left panel of Figure 7 shows the histograms of the measured bulge / disk flux fraction of the ngmix bdf galaxy model for two magnitude bins $19.5 < i < 21$ and $21 < i < 22.5$, along with the corresponding KDE model. Brighter galaxies are more likely to be bulge dominated (e.g. described by a de Vaucouleurs light profile) compared to fainter galaxies, which is in accordance with expectations from galaxy evolution (Gavazzi et al. 2010). The peak appearing at 0.5 is an imprint of the morphology prior of the deep-field photometry pipeline, and it becomes prominent for the fainter galaxy selection as there the available information to constrain morphology from survey observations diminishes. KDE estimates cannot reproduce the hard cutoff edges $[0; 1]$ of the bulge / disk flux fraction value, and for this reason we cap the distributions around 0 and 1 to restrict the PDF model to the appropriate interval, so that values greater than 1 or lower than 0 receive a value of 1 or 0 respectively. The right panel of Figure 7 shows the estimated redshift distribution of the deep-field galaxies, as predicted by the EAzY algorithm (Brammer et al. 2008, see Section 2.3), along with the KDE reconstruction for two different magnitude ranges. For both the bulge/disk ratio and the redshift parameters the KDE model provides a very good description of the measured data. We emphasize that these are different projections of the same model shown on Figure 6.
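The capping of the bulge / disk flux fraction can be illustrated with a small sketch of sampling from a Gaussian KDE and clipping the draws to $[0; 1]$. It assumes a one-dimensional KDE in raw (not PCA-standardized) units; the function name `sample_gaussian_kde`, the toy input distribution and the bandwidth value are hypothetical.

```python
import numpy as np

def sample_gaussian_kde(data, h, n, rng):
    """Draw n samples from a 1D Gaussian KDE with bandwidth h built on `data`."""
    centers = rng.choice(data, size=n, replace=True)   # pick a kernel at random
    return centers + rng.normal(0.0, h, size=n)        # scatter within that kernel

rng = np.random.default_rng(2)
# Toy bulge / disk flux fractions piling up near the edges of [0, 1]
frac = np.clip(rng.beta(0.5, 0.5, 5_000), 0.0, 1.0)
raw = sample_gaussian_kde(frac, h=0.05, n=20_000, rng=rng)
capped = np.clip(raw, 0.0, 1.0)    # values below 0 become 0, values above 1 become 1
```

Sampling only requires picking a data point and adding kernel noise, which is why drawing from a KDE is cheap compared to evaluating it; the smooth kernels leak past the hard edges, and the final clip restores the allowed interval.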

The result of the statistical model is a set of random samples drawn from the feature PDF of the extrapolated cluster member galaxies, and a set of random samples which are drawn from the extrapolated field galaxy population. For both of these samples a KDE is constructed according to Section 3.3, whose purpose is to provide a computationally efficient way of generating further samples. This model covers the full set of features listed in Table A1 to a deeper limiting magnitude of $i = 24$ and is shown on Figure 8 for a single cluster bin with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$. In the following we overview the noteworthy features reproduced by this model and present the line-of-sight structure and galaxy surface density distribution of our synthetic clusters.

Our galaxy redshift distribution model used for creating synthetic cluster line-of-sights is illustrated on Figure 9 for a cluster sample with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$, where the emulated redshift PDF of galaxies with $i < 22.5$ and within the radial range $R \in [1; 3.16)$ arcmin is shown as the magenta histogram. This is a combination of a cluster member term located at the mean cluster redshift $z = 0.325$, and a field term. As a comparison the redshift PDF of deep-field galaxies is shown in blue for the same magnitude range. Owing to the extrapolation part of the analysis, the reconstructed line-of-sight is modeled down to the deep-field limiting magnitude of $i < 24.5$. It contains a faint cluster member population in addition to the faint end of the field galaxy population shown as the orange histogram, with the comparison redshift distribution of the deep-field galaxies shown as the green histogram.

This line-of-sight model incorporates galaxy redshifts derived from the Deep Fields using ugrizJHK bands. In turn the reduced redshift uncertainty for deep-field galaxies allows us to take the lens geometry correctly into account to apply the lensing effect for each galaxy.
Figure 9 also shows that the redshift distribution of galaxies near a cluster in projection is significantly different from the one in the Deep Fields. This aspect of the line-of-sight model enables us to construct mock observations where we can test the response of photometric redshift estimates to the presence of the galaxy cluster. This manifests itself as the problem of boost factors or cluster member contamination (Sheldon et al. 2004; Melchior et al. 2017; Varga et al. 2019), as well as propagating blending-related photometry effects onto the performance estimates of photometric redshifts.

The models for the galaxy surface density profiles are shown on Figure 10. The magnitude range is restricted to $i < 22.5$. In addition, the measured galaxy surface density profile is indicated by the orange shaded area, and the surface density profile around the corresponding sample of reference random points as the gray shaded area. The width of these areas indicates the Poisson uncertainty of the number of galaxies.

The model for the field population is shown as the green lines on Figure 10. This distribution corresponds to the background model during the statistical background subtraction, but it is constructed by re-weighting and resampling deep-field galaxies. The excellent agreement between this and the profile measured around random points in the DES wide-field data is a strong consistency test of the statistical model, and is an indication that the statistical background subtraction works as intended.

The model for the pure cluster member distribution is shown as the magenta curves on Figure 10, and it captures the radial variations in surface density, approaching zero at large radii, consistent with the finite extent of the cluster galaxy populations.
The model for the full surface density profile is then obtained as the sum of the cluster member (magenta) and the field (green) population estimates, and this surface density profile is shown as the black dashed lines, which can then be directly compared with the galaxy profiles measured in the DES data around clusters (orange lines). The two show excellent agreement. The downturn of the surface density profiles at $R < 0.1$ arcmin is due to detection incompleteness caused by the central galaxy. In our model this regime is however described by the BCG + ICL component (see Section 5.3, compare with Figure 13). The light profiles of cluster centrals do show considerable variability on such small scales (see Fig. 18 of Kluge et al. 2020); this is however not incorporated in the smooth ICL model of Gruen et al. (2019) adopted in this study.

[Figure 8: corner plot of the features $i$-mag, $g - r$, $r - i$, $i - z$, $\log R$, $|e|$, $\log(1 + T)$, bulge/disk fraction and $z$, with 1D PDFs on the diagonal.]

Figure 8. Joint galaxy feature model in the radial range $R \in [10^{-0.5}; 1]$ arcmin, for the cluster sample with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$. The parameters shown are summarized in Table A1. Lower left panels, magenta: cluster member galaxies with $i < 22.5$. Lower left panels, black: field galaxies with $i < 22.5$. Upper right panels, green: extrapolated cluster member galaxies with $22.5 < i < 24$. Upper right panels, gray: extrapolated foreground and background galaxies with $22.5 < i < 24$. The bump visible in the redshift PDF near the cluster redshift range (magenta dashed lines) is coincidental; it is a property of the DES deep-field galaxy distribution, also visible on Figure 7.

Galaxy clusters host a characteristic population of quiescent red galaxies distributed along the red sequence, and also a non-red cluster member component. In projection, these cluster members are mixed together with foreground and background galaxies.

Figure 11 shows the model and measurements for the $g - r$ color distribution of galaxies as an illustration of the statistical learning model for the cluster sample with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$. The columns correspond to different bins of projected radius, and the rows to different magnitude ranges. The first two rows, $[19; 21)$ and $[21; 22.5)$, show the model fitted to the DES wide-field data, while the third, $[23; 24)$, is a pure extrapolation based on the algorithm. The measured color distributions from the DES wide-field data are shown as the orange histograms, with the colored area representing the Poisson uncertainty of the measurement. As a comparison, for each cell the respective conditional color distribution measured in the DES Deep Fields is shown (blue histogram). This population naturally has no radial dependence, and is thus identical in the different columns.

Figure 9. Line-of-sight model for the redshift distribution of galaxies near clusters with $\lambda \in [45; 60)$ and $z \in [0.3; 0.35)$ within the projected radial range $R \in [1; 3.16)$ arcmin. Magenta, orange: redshift distribution model around clusters in different magnitude bins. Blue, green: photometric redshift distribution measured in the DES Deep Fields in different magnitude bins. Grey dashed: limits of the cluster redshift range. The cluster line-of-sight models show a significant deviation from the field line-of-sight, concentrated in a narrow redshift peak at $z_{\rm clust}$.

Out of the above two populations, only the deep-field one is measured down to the third magnitude bin $i \in [23; 24)$, therefore the cluster measurement (orange) is not shown there. The color distribution around clusters shows a strong radial trend, with the orange histogram approaching the blue with increasing radius. A dominant driver of this trend is the increasing prominence of the red sequence at low radii, which manifests as a peak in the color distribution. The relative weight of the red sequence is greater for brighter galaxies, and the difference between cluster and field line-of-sights is also greater for brighter galaxies. As a reference, the location of the redMaPPer red-sequence model is indicated by the vertical gray dotted lines. These lines correspond to the $1\sigma$ range of the membership-probability-weighted color distribution of redMaPPer cluster members for that cluster richness and redshift range. Both the location and the width of the peak of the cluster member histogram (shown in orange) are consistent with the properties of redMaPPer cluster members, indicating that it is indeed an imprint of the red sequence.
We note that only galaxies with $L > 0.2\, L_\star$ are considered by redMaPPer as potential member galaxies and this does not fully cover the faintest magnitude bin of this analysis.

Figure 11 shows the model for the projected galaxy distributions around galaxy clusters as the black dashed lines, which can be directly compared with the orange histogram. This model is derived without direct information about the wide-field galaxy luminosity function around clusters, and only using information from the deep-field data. Nevertheless, as visible on the upper two rows of Figure 11, the line-of-sight model can describe the magnitude-dependent color variations of the galaxy distributions, and approximates well the relative weight of the red-sequence peak, albeit slightly over-estimating its width. The bottom row shows the model for galaxies in the line-of-sight with $i \in [23; 24)$. Due to the extrapolation part of the approach, the model extends to these fainter magnitudes, even though they are not directly measured in cluster line-of-sights.

The feature distributions of foreground and background galaxies are independent of the cluster galaxy population. Thus it is expected that the residual field model is independent of radius. While the bright tip of the DES Deep Fields is not fully representative of the actual median DES wide-field survey due to sample variance, it still provides a reasonable reference distribution. Comparing the residual field model (green curve) with the deep-field distribution (blue histogram) on Figure 11 shows no strong radial variations. The residual field indeed approximates the deep-field distribution, with only minor deviations visible at the faint end.

The radial color evolution of the cluster member galaxy population can be described by the approximate red fraction, whose radial profile for the three high richness bins is shown on Figure 12, along with the color cuts used in the definition.
These regions are chosento bracket the position of the red sequence which is dominant atlow radii. Two magnitude ranges are shown: a brighter bin cover-ing i ∈ [19; 22.5) coincides with the DES wide-ﬁeld depth, and afainter bin covering i ∈ [22.4; 24.5), which is derived from a purelyextrapolated color-color distributions. While the ﬁgure shows onlythe higher richness samples, there appears to be no signiﬁcant dif-ference between the richness bins.The bright galaxy sample shows a clear monotonic trend inall redshift and richness samples, where the red-fraction decreasesfrom approximately unity at very low projected radii to approxi-mately 30 - 40 per cent at large radii approaching 10 arcmin. Thisbehaviour is consistent with previous measurements (Butcher &Oemler 1978; Hansen et al. 2009; Hennig et al. 2017). It is also inagreement with existing DES-like synthetic clusters derived fromdecorated gravity-only numerical simulations presented in DeRoseet al. (2019); Varga et al. (2019). The same behaviour is not uni-formly true for the fainter, extrapolated red-fraction proﬁles. Somecluster bins show a prominent red galaxy population at the cen-ter, the decline is much faster for these fainter populations than thebrighter counterparts for the same clusters. At large radii the galaxypopulation appears to show a constant mix of red and blue mem-bers, and approach the preferentially bluer cosmic mean galaxypopulations. The model for non-central galaxies is composed of two main com-ponents: the distribution of cluster member galaxies (satellites) andthe distribution of foreground and background galaxies. A syntheticcluster line-of-sight is created by random draws from the PDF ofthe different components. Here each draw corresponds to addinga new galaxy to a mock catalog with an angular and redshift po-sition, and the photometric and morphological features containedwithin the model.A PDF carries no information about the absolute number of © 2021 RAS, MNRAS000

Surface density of galaxies around galaxy clusters with different richness and redshift.

Orange:

Surface density proﬁle measured around redMaPPerclusters. The width of the shaded area represents the Poisson uncertainty propagated into surface density.

Gray vertical area: effective size of the cluster BCG( √ T ). The drop of the cluster LOS proﬁle within this range represents a detection incompleteness due to the light of the central galaxy. In our model thisregime is instead described by the BCG + ICL component (see Section 5.3, compare with Figure 13). Gray:

Surface density of galaxies measured aroundreference random points.

Green: model for the surface density proﬁle of ﬁeld galaxies within the cluster line-of-sight.

Magenta: model for the surface densityproﬁle of cluster member galaxies in the cluster line-of-sight.

Black dashed: model for the total galaxy surface density proﬁle in the cluster line-of-sight (thesum of the green and magenta curves). P D F R [0.03; 0.32) [arcmin] [45; 60) z [0.3; 0.35) i mag [19; 21] R [0.32; 1.00) [arcmin] R [1.00; 3.16) [arcmin] R [3.16; 10.00) [arcmin]02 P D F i mag [21; 22.5] P D F i mag [23; 24]Depth Extrapolation ProposalCluster LOS

Residual FieldCluster MemberLOS model

Figure 11. Conditional color distribution of galaxies across four projected radial regimes (shown in the different columns) around galaxy clusters with λ ∈ [45; 60) and z ∈ [0.3; 0.35). The distributions of galaxies are shown in g − r, g − r and r − i colors respectively. There are three magnitude ranges shown (rows): the first two, [19; 21) and [21; 22.5), are fitted to the DES wide-field data, while the third, [23; 24), is a pure extrapolation based on the algorithm. Orange: color PDF measured as a histogram around galaxy clusters in DES data. The height of the shaded area indicates the Poisson uncertainty propagated into the normalized histogram. Blue: color distribution measured within the corresponding magnitude range in the DES Deep Fields. This distribution is identical for each column and for all cluster samples. Green: model for the color distribution of foreground and background galaxies in the line-of-sight. Magenta: model for the color distribution of cluster member galaxies. Black dashed: model for the full line-of-sight, which can be directly compared with the orange histogram.

Gray dotted: 1σ location of the redMaPPer red-sequence cluster member galaxies.

Figure 12. Red fraction of cluster members as a function of projected radius for three different cluster redshift samples with λ ∈ [45; 60). [Panels: red fraction versus log R [arcmin] for 19 < i < 22.5 and 22.5 < i < 24.5. Panel labels: z ∈ [0.3; 0.35): g − r > 1.2, r − i > 0.4; z ∈ [0.45; 0.5): g − r > 1.3, r − i > 0.5; z ∈ [0.6; 0.65): r − i > 0.7, i − z > 0.25.]

objects, therefore this needs to be set based on the observed number of galaxies. In real observations only the bright end of the luminosity function is observed in the survey (i.e. i < 22.5), therefore the number of fainter galaxies must be defined according to their relative probability in the model.

A single mock galaxy cluster is constructed in the following way:

(i) For each radial range $l$, calculate $\hat{N}_{C;l}$ and $\hat{N}_{R;l}$, the mean number of galaxies with i < 22.5 around clusters and random points respectively in radial range $l$.
(ii) For each radial range $l$, take a Poisson random number of galaxies based on the mean number as

$$N_{M;l} = \mathrm{Poisson}\left(\frac{\hat{N}_{C;l} - \hat{N}_{R;l}}{p_{\mathrm{memb};l}(i < 22.5)}\right), \qquad (12)$$

and

$$N_{R;l} = \mathrm{Poisson}\left(\frac{\hat{N}_{R;l}}{p_{\mathrm{rand};l}(i < 22.5)}\right). \qquad (13)$$

(iii) Draw cluster members $N_{M;l}$ times from $p_{\mathrm{memb};l}$ and foreground and background galaxies $N_{R;l}$ times from $p_{\mathrm{rand};l}$.
(iv) For cluster members set the redshift to $z_{\mathrm{clust}}$.
(v) Convert the projected radius feature $R_i$ into a 2D position assuming circular symmetry in a flat-sky approximation.

The outcome of the above recipe is a galaxy catalog which contains cluster members and foreground and background galaxies, each distributed according to their respective statistical models derived from the survey data but extrapolated to a fainter limiting magnitude, with the surface density of galaxies set to the mean surface density measured around galaxy clusters.

In practice we update step (i) by only measuring $\hat{N}_{C;l}$ from data, and expressing $\hat{N}_{R;l}$ as a function of $\hat{N}_{C;l}$ using the statistical model. This is achieved by taking the ratio of the accepted events during the rejection sampling (see Section 3.6) which only fulfil Equation 10 to the number of events which fulfil both Equation 10 and Equation 9. This formulation avoids scenarios where, due to measurement noise, by chance $\hat{N}_{R;l} > \hat{N}_{C;l}$.

Synthetic weak lensing measurements require a mass model for the galaxy cluster to apply gravitational shear to the background galaxies. For this we make use of the mass models and mass constraints found in McClintock & Varga et al. (2019).
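The catalog recipe of Equations 12 and 13 and step (v) can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the paper: the function names, the example numbers and the clipping of negative expectation values are assumptions made here, and the draws from the member and field PDFs (step iii) are left out.

```python
import numpy as np


def draw_counts(n_clust, n_rand, p_memb_bright, p_rand_bright, rng):
    """Poisson numbers of member and field galaxies per radial bin (Eqs. 12-13).

    n_clust, n_rand: mean counts of i < 22.5 galaxies around clusters and
    random points; p_*_bright: model probability of i < 22.5 in each bin.
    """
    n_clust = np.asarray(n_clust, dtype=float)
    n_rand = np.asarray(n_rand, dtype=float)
    mean_memb = (n_clust - n_rand) / np.asarray(p_memb_bright, dtype=float)
    mean_rand = n_rand / np.asarray(p_rand_bright, dtype=float)
    # Assumption of this sketch: clip negative expectations from noisy counts.
    mean_memb = np.clip(mean_memb, 0.0, None)
    return rng.poisson(mean_memb), rng.poisson(mean_rand)


def radius_to_xy(radius, rng):
    """Step (v): uniform random azimuth, flat-sky Cartesian offsets."""
    radius = np.asarray(radius, dtype=float)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=radius.shape)
    return radius * np.cos(phi), radius * np.sin(phi)


# Hypothetical inputs for two radial bins, for illustration only.
rng = np.random.default_rng(42)
n_memb, n_field = draw_counts([12.0, 30.0], [2.0, 25.0], [0.4, 0.4], [0.3, 0.3], rng)
x, y = radius_to_xy(np.array([0.1, 0.5, 2.0]), rng)
```

Dividing the bright counts by the probability of i < 22.5 rescales them to the full, extrapolated magnitude range, which is why the drawn totals exceed the observed bright counts.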
As the analysis of McClintock & Varga et al. (2019) did not find a significant redshift evolution in the richness-mass scaling, we can approximate the relevant mean cluster masses for the present mocks, that is M ≈ M⊙ for the λ ∈ [30; 45) bin and M ≈ M⊙ for the λ ∈ [45; 60) bin across the three different redshift bins.

In the following pathfinder study, we only consider the mass model for the 1-halo term, which is dominant on the small scales explored in this study, and consists of a spherically symmetric mass distribution with a Navarro-Frenk-White (NFW) mass profile (Navarro et al. 1996). This lens mass distribution is placed at the cluster redshift z_clust, and subsequently gravitational shear and magnification are applied to line-of-sight galaxies based on their true redshifts assigned by the model. The lensing effect induced by an NFW halo is expressed analytically following Oaxaca Wright & Brainerd (1999). Reduced gravitational shear g is directly applied to each galaxy through the ngmix bdf galaxy model. The magnification (µ) is however only applied as a simple approximation, by modulating the total flux of the galaxy light models, F_lensed;i = µ_i F_i, in an achromatic way. This correctly captures the change in the total observed flux of each galaxy, but does not reproduce the increase in observed size. The impact of this approximation is expected to be minor given the very small apparent size of the high-redshift galaxies which experience the greatest magnification effect.

Figure 13. Synthetic center of a mock galaxy cluster without (left) and with the intra-cluster light model applied (right). Real galaxy clusters host a large fraction of their stellar light in the form of ICL, which the simple BCG only light model cannot reproduce. This is seen on Figure 3 and Figure 14. [Panels: "without ICL" and "with ICL", axes X [pix] and Y [pix], scale bar 0.2 arcmin.]

A prominent feature of galaxy clusters is the presence of a bright central galaxy (BCG) and a surrounding distribution of intra-cluster light (ICL) emitted by a diffuse stellar component bound to the cluster halo. These components contain a significant fraction of the total optical light emitted by the cluster (Zhang et al. 2019; Sampaio-Santos et al. 2020; Kluge et al. 2020), therefore accounting for them is essential in a dedicated simulation of synthetic galaxy cluster observations.

By construction galaxy clusters identified by redMaPPer are always centered on a bright red-sequence galaxy.
This is a simplified view of reality, as in recent mergers or in non-equilibrium systems the central galaxy might not be red or the brightest, or there might be multiple similarly bright BCGs (Rykoff et al. 2014). Owing to the special location they inhabit, the central galaxies of massive halos follow a different evolutionary track compared to satellite galaxies. It is observed that their properties are closely tied to the mass and properties of their cluster (Postman & Lauer 1995), and their luminosity function is approximately Gaussian at fixed cluster mass proxy and redshift (Hansen et al. 2009). Based on these observations we model the synthetic central galaxy in the mocks as having the mean properties of the redMaPPer central galaxies in the cluster sample. The relevant mean central galaxy features are listed in Table A3 for the different cluster redshift and richness samples. The central galaxies are assumed to have a de Vaucouleurs light profile, and the only stochastic element in the model is their random orientation in the plane of the sky with fixed ellipticity |g|.

The total light in the central region of a cluster is, however, not fully described by the above model, as there is a continuous transition between the light usually associated with the central galaxy and the intra-cluster light (Kluge et al. 2020). Zhang et al. (2019) investigated the properties of the ICL for redMaPPer selected galaxy clusters with z_clust ∈ [0.2; 0.3) within the DES Y1 dataset. In a stacked analysis they measured the diffuse light of the ICL down to a surface brightness of 30 mag arcsec⁻². They also investigated the richness (mass) dependence of the ICL, finding a self-similarity of the light profile when expressed in units of R_m. The ICL-mass relation was further established by Sampaio-Santos et al. (2020) in an expanded re-analysis of the DES Y1 redMaPPer cluster sample. Using the measurements of Zhang et al. (2019), Gruen et al. (2019) constructed a simple model for the ICL observed around redMaPPer clusters in DES. This model extrapolates from the measurement of Zhang et al. (2019) in terms of cluster mass using the self-similarity of the profiles, and also in terms of cluster redshift by assuming a simple passively evolving stellar population within the ICL. We note that this latter assumption is closely related to the formation history and age of the ICL, which is poorly constrained by current observational studies due to the difficulty of high redshift observations. Thus in the case of a late-forming ICL the above extrapolation overestimates the total light contained in it at early times. Furthermore, the model neglects the mild radius dependent color gradient in the ICL, where the outer ranges are slightly bluer.

In the following we adopt the ICL model of Gruen et al. (2019). As a simplification we assume that the colors of the ICL are identical to the mean colors of BCGs at that redshift and cluster richness sample. The ICL component extends to large radii as an approximate power law surface density light profile, while the ngmix BCG light model is dominant in the inner regions. Because of their overlap, these components cannot be directly added to each other. Therefore we define a tapered ICL model where the tapering scale is set by the size of the BCG component θ_S = √T_BCG, where T_BCG is taken from the DES Y3 MOF photometry catalog and is defined the same way as the size parameter listed in Table A1. To ensure the smooth joining of the BCG and ICL components we define the total light profile model as

µ(θ) = µ_BCG(θ) + ( e θ – θ_S ) ) µ_ICL(θ). (14)

An illustration of this joint BCG + ICL light profile in the mock cluster images is shown on Figure 13.
The two panels of Figure 13 show an identical set of mock galaxies for a synthetic cluster corresponding to the cluster bin with λ ∈ [45; 60) and z ∈ [0.3; 0.35); however, the left panel shows only the ngmix galaxy models, while the right panel also shows the ICL component added.

Simulated galaxy images are the bedrock of estimating the performance of weak lensing methods, and therefore they have been the topic of extensive study in the literature (Massey et al. 2007; Bridle et al. 2009; Mandelbaum et al. 2015; Jarvis et al. 2016; Zuntz & Sheldon et al. 2018; Samuroff et al. 2018). In the following we make use of a simplified version of the image simulation pipeline developed for the Y3 analysis of DES (MacCrann et al. 2020).

The construction starts with a catalog of photometric objects which will inhabit the mock image. For this study this catalog contains the parameters of the ngmix bdf light distribution model for each entry, which are the pixel position in the image, shape (g_1; g_2), size T, bulge / disk flux fraction, and fluxes in the g, r, i, z bands. This catalog corresponds to a random realization of a mock line-of-sight constructed according to Section 5.1 and Section 5.2. Finally the central galaxy is added as defined in Section 5.3. At this stage stars and foreground objects can be added according to their density at the targeted galactic latitude. In the present pathfinder study these are drawn from the population of stars excluded in Section 3.2. Furthermore, we only consider a simplified scenario and add a stellar sample drawn from the deep-field catalog according to their relative density in the deep-field footprints.

Synthetic images are created via a customized version of the DES Y3 image simulation pipeline (MacCrann et al. 2020), which renders images based on the galaxy image simulation package GalSim (Rowe et al. 2015), while using an extension package for the ngmix bdf light profile model used in the actual DES Y3 deep-field analysis.
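The input catalog described above can be pictured as a table with one row per rendered object. The sketch below uses a NumPy structured array for this; the field names and the helper `empty_catalog` are illustrative assumptions and not the column names of the actual DES pipeline.

```python
import numpy as np

BANDS = ("g", "r", "i", "z")

# Hypothetical layout: one row per object of the mock line-of-sight.
MOCK_DTYPE = np.dtype([
    ("x_pix", "f8"),              # pixel position on the canvas
    ("y_pix", "f8"),
    ("g1", "f8"),                 # shape components
    ("g2", "f8"),
    ("T", "f8"),                  # size parameter
    ("fracdev", "f8"),            # bulge / disk flux fraction
    ("flux", "f8", (len(BANDS),)),  # fluxes in g, r, i, z
])


def empty_catalog(n_objects):
    """Allocate a zero-filled catalog with the layout sketched above."""
    return np.zeros(n_objects, dtype=MOCK_DTYPE)


catalog = empty_catalog(3)
catalog["flux"][:, BANDS.index("i")] = [120.0, 45.0, 8.0]
```

Keeping the four band fluxes in one sub-array makes an achromatic operation, such as the flux-only magnification described earlier, a single multiplication per row.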
The bdf model describes the galaxies as a combination of two terms: an exponential light profile (disk) and a de Vaucouleurs (bulge) light profile. Given that most galaxies in a DES-like survey are poorly resolved, an additional constraint is enforced by setting the effective radius of both light profile components to be identical. (The model is available at https://github.com/esheldon/ngmix, the ngmix.gmix.GmixBDF model.)

In the following, we consider a simplified setup of the observational scenario of DES where we directly simulate the so-called co-added survey images. Under real circumstances, variations in observing conditions and in the point spread function (PSF) between exposures make the net PSF in co-added images difficult to model, thus the DES shape estimation pipeline itself takes single exposure images as input. In a simulation such variations can be factored out, which allows us to simplify the simulation setup into deeper mock co-added images with well behaved PSFs.

The synthetic co-added images are constructed in the following way:

(i) The image canvas is defined with its desired dimensions and pixel scale, in the case of DES 0.27 arcsec / pixel. The canvas is defined as a 10k × 10k pixel rectangle.
(ii) For each object a small cutout image (postage stamp) is constructed. The light model is defined using ngmix, convolved with a representation of the mock PSF, then rendered into a postage stamp. We model the PSF as a Gaussian with a full-width half-maximum (FWHM) of 0.9 arcsec, which is roughly equal to the median DES observing condition (Sevilla-Noarbe et al. 2020).
(iii) After the creation of all postage stamps, they are added onto the main canvas at their intended pixel positions.
(iv) A noise map is applied to the image. In this study we take the noise properties of a randomly selected DES tile (DES2122+0209) and apply Gaussian noise matched to reproduce the median flux of the unmasked regions of the reference tile in the chosen observational band. Choosing the noise level for synthetic images is not straightforward, as a substantial amount of light which is traditionally attributed to noise in fact originates from undetected faint stars and galaxies (Hoekstra et al. 2017; Euclid Collaboration 2019; Eckert et al. 2020). In the framework of the present analysis many of these undetected sources are explicitly part of the rendered objects, therefore as a rough approximation we reduce the background noise variance by half for illustration purposes.
(v) Finally the tapered ICL model defined according to Section 5.3 is evaluated for the pixel positions of the mock image and the additional light component is added onto the synthetic observation.
We assume that the ICL has the same ellipticity and major axis direction alignment as the central galaxy.

The result of this recipe is illustrated on Figure 3, where a gri-band color composite image is shown for synthetic clusters side by side with redMaPPer clusters with similar observable parameters. While the synthetic images do contain an approximate stellar population based on faint stars observed in the Deep Fields, very bright stars which need to be masked are not currently reproduced in the mock observations. Furthermore, low redshift foreground objects such as galaxies with visible disc and spiral arm features are not contained in the scope of the present analysis. In addition to the color composite images, Figure 3 also illustrates the composition of the lines-of-sight. The third row of each figure shows the brightness distribution of the cluster component with brown/red symbols, and the foreground and background component with blue symbols. The shade and size of the symbols indicate the brightness, with fainter objects shown as smaller markers. Many of the faint objects are barely or not at all discernible on the composite images. Yet these unresolved sources influence the performance of photometric methods (Hoekstra et al. 2017; Euclid Collaboration 2019; Everett & Yanny et al. 2020). The bottom row of each figure shows the exaggerated gravitational shear imprinted on background sources (the ellipticities are increased by a factor of 20). The background sources are shown in darker colors for low redshifts and lighter colors for high redshifts. Cluster members are shown with black symbols, while foreground objects are shown in green. The different brightness values are indicated by the different marker sizes.

While the galaxy populations of the λ ∈ [30; 45) and λ ∈ [45; 60) bins are found to be close in terms of their galaxy surface density profiles, clusters show greater differences between the different redshift ranges.
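Steps (i) to (iv) of the co-add construction above can be sketched with NumPy alone. This is a deliberately simplified stand-in and not the DES pipeline: each object is rendered as a round Gaussian instead of an ngmix bdf model drawn with GalSim, the size convention T = 2σ² for a round Gaussian is an assumption of this sketch, the ICL of step (v) is omitted, and all function names are hypothetical.

```python
import numpy as np

PIXEL_SCALE = 0.27   # arcsec per pixel (DES value quoted in step i)
PSF_FWHM = 0.9       # arcsec, Gaussian PSF of step (ii)
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gaussian_stamp(flux, sigma_pix):
    """Postage stamp of a round Gaussian normalised to the total flux."""
    half = int(np.ceil(5.0 * sigma_pix))
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    stamp = np.exp(-0.5 * (xx**2 + yy**2) / sigma_pix**2)
    return flux * stamp / stamp.sum()


def render_coadd(x_pix, y_pix, flux, size_t, shape, noise_sigma, rng):
    """Add PSF-convolved stamps onto a canvas, then add Gaussian noise."""
    canvas = np.zeros(shape)
    psf_sigma = PSF_FWHM * FWHM_TO_SIGMA / PIXEL_SCALE
    for x, y, f, t in zip(x_pix, y_pix, flux, size_t):
        # Assumed convention: T = 2 sigma^2 (arcsec^2) for a round Gaussian.
        obj_sigma = np.sqrt(t / 2.0) / PIXEL_SCALE
        # Convolving two Gaussians adds their variances.
        stamp = gaussian_stamp(f, np.hypot(obj_sigma, psf_sigma))
        half = stamp.shape[0] // 2
        y0, x0 = int(round(y)) - half, int(round(x)) - half
        y1, x1 = y0 + stamp.shape[0], x0 + stamp.shape[1]
        # Clip the stamp where it overlaps the canvas edge.
        sy0, sx0 = max(0, -y0), max(0, -x0)
        sy1 = stamp.shape[0] - max(0, y1 - shape[0])
        sx1 = stamp.shape[1] - max(0, x1 - shape[1])
        if sy0 >= sy1 or sx0 >= sx1:
            continue  # object lies entirely off the canvas
        canvas[max(y0, 0):min(y1, shape[0]),
               max(x0, 0):min(x1, shape[1])] += stamp[sy0:sy1, sx0:sx1]
    canvas += rng.normal(0.0, noise_sigma, size=shape)
    return canvas


rng = np.random.default_rng(1)
image = render_coadd([32.0], [32.0], [100.0], [0.5], (64, 64), 0.0, rng)
```

In this picture, halving the background noise variance as in step (iv) corresponds to passing `noise_sigma = reference_sigma / np.sqrt(2)`, where `reference_sigma` would be measured from the reference tile.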
This redshift dependence is illustrated by Figure 14, which shows synthetic galaxy clusters with λ ∈ [45; 60) in the z ∈ [0.3; 0.35), z ∈ [0.45; 0.5) and z ∈ [0.6; 0.65) cluster samples. These color composite images strikingly illustrate the changes in the visible properties of galaxy clusters across cosmic time.

Figure 14. Synthetic galaxy clusters corresponding to redMaPPer clusters with λ ∈ [45; 60) across the different redshift ranges.

We present a pathfinder study to generate synthetic galaxy clusters and cluster observations in an unsupervised way from a combination of observational data taken by the Dark Energy Survey up to its third year of observations (DES Y3). Example realisations of synthetic galaxy cluster observations are shown on Figure 3 and Figure 14. Galaxy clusters present a unique challenge for validating weak lensing measurements due to the increased blending among light sources, the presence of the intra-cluster light (ICL), and the characteristically stronger shears imprinted on source galaxies. The aim of these synthetic observations is to enable future studies to address the above factors by calibrating and validating the performance of galaxy cluster weak lensing in an end-to-end fashion, from photometry, through shear and photometric redshift measurement and calibration, to mass recovery from lensing profiles or lensing maps, in a fully controlled environment. The focus of this paper was to introduce the statistical learning algorithm itself and to demonstrate a pilot implementation for DES Y3 data. This consisted of the following steps:

• We measured the galaxy content of redMaPPer galaxy clusters and their sky environments in projection, as a function of cluster richness and redshift (Section 3.2).
• We developed and validated a KDE framework for representing galaxy distributions as high-dimensional probability density functions of photometric and morphological features (Section 3.3). This KDE generalizes the finite set of galaxy and cluster observations into a continuous model.
• We derived a mathematical formalism to combine wide-field and deep-field survey data, augmenting and extrapolating our model beyond the depth of the wide-field data (Section 3.5).
• We created a model for the cluster member galaxy content of redMaPPer clusters via statistical background subtraction in a multi-dimensional feature space (Section 3.6).
• Through a series of comparisons between the properties of observed and modeled galaxies drawn from the KDE, we demonstrated an excellent agreement between real and synthetic galaxy catalogs of cluster lines-of-sight (Section 4). We note that this reflects primarily on the performance of the input catalogs used in creating the synthetic observations. A detailed analysis of the agreement between real data and the photometry derived from the synthetic images is left for future work. Corrections for the potential incompleteness of synthetic images can be addressed as a prior for Equation 5.
• We combined the above steps into an algorithm constructing and rendering new realizations of mock galaxy clusters into synthetic images (Section 5).

The presented method addresses four distinct problems arising with simulated data:

A. The method does not rely on numerical simulations of baryonic structure formation and galaxy evolution to construct galaxy clusters, and thus it is independent from assumptions and approximations inherent in cosmological simulations.
B. Synthetic galaxy clusters are generated to match their observed galaxy content in DES Y3. Extrapolations of the galaxy populations are performed where necessary, based on observational data.
C. The algorithm is formulated as a transparent, explicit recipe. Therefore the different components can be readily modified where necessary and external information (e.g. survey incompleteness corrections, priors on cluster galaxy properties) can be added in a principled way.
D. Via the statistical learning approach, new, statistically independent realizations of synthetic galaxy cluster observations can be created at minimal computational cost.

The primary use-case of this type of synthetic data is to reproduce the measurement of an analysis conducted with real lensing data. On each image, the lensing signal of the cluster is imprinted in a realistic way, calibrated from previous weak lensing analysis of the same cluster population. Therefore the masses can be recovered from the stacked weak lensing measurement of a large number of independent images, and systematic uncertainties can be propagated directly to the recovered masses. Furthermore, the cluster member galaxy model, which encapsulates the properties of actual cluster member galaxies in real observations, can be used as an augmentation or reference comparison dataset for synthetic surveys built from numerical simulations.

Given their great statistical power, current (DES, KiDS, HSC) and upcoming (LSST, Euclid, WFIRST) weak lensing surveys are increasingly dominated by systematic uncertainties. For this reason, calibration and validation tools such as the one presented in this study will be indispensable in exploiting the cosmological and astrophysical information made accessible by large area sky surveys. While this work was done in preparation of a cluster weak lensing analysis using the DES Y3 data, owing to the transparent and modular nature of the presented recipe it is expected that the algorithm can be fitted to other similar weak lensing surveys with minimal effort.

DATA AVAILABILITY

The data underlying this article will be made available according to the data release schedule of the Dark Energy Survey.

ACKNOWLEDGMENTS

This research was supported by the Excellence Cluster ORIGINS which is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC-2094-390783311. The calculations have been in part carried out on the computing facilities of the Computational Center for Particle and Astrophysics (C2PAP). This work was supported by the Department of Energy, Laboratory Directed Research and Development program at SLAC National Accelerator Laboratory, under contract DE-AC02-76SF00515 and as part of the Panofsky Fellowship awarded to DG.

Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology Facilities Council of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, the Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Científico e Tecnológico and the Ministério da Ciência, Tecnologia e Inovação, the Deutsche Forschungsgemeinschaft and the Collaborating Institutions in the Dark Energy Survey.

The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l’Espai (IEEC/CSIC), the Institut de Física d’Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, the National Optical Astronomy Observatory, the University of Nottingham, The Ohio State University, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, Texas A&M University, and the OzDES Membership Consortium.

Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.

The DES data management system is supported by the National Science Foundation under Grant Numbers AST-1138766 and AST-1536171. The DES participants from Spanish institutions are partially supported by MINECO under grants AYA2015-71825, ESP2015-66861, FPA2015-68048, SEV-2016-0588, SEV-2016-0597, and MDM-2015-0509, some of which include ERDF funds from the European Union. IFAE is partially funded by the CERCA program of the Generalitat de Catalunya. Research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Program (FP7/2007-2013) including ERC grant agreements 240672, 291329, and 306478. We acknowledge support from the Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), through project number CE110001020.

This manuscript has been authored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

REFERENCES

Abbott T. M. C., et al. 2019, ApJ, 872, L30
Allen S. W., Evrard A. E., Mantz A. B., 2011, ARA&A, 49, 409
Bartelmann M., Schneider P., 2001, Phys. Rep., 340, 291
Bernstein G. M., Armstrong R., 2014, MNRAS, 438, 1880
Bertin E., Arnouts S., 1996, A&AS, 117, 393
Bocquet S., et al. 2019, ApJ, 878, 55
Brammer G. B., van Dokkum P. G., Coppi P., 2008, ApJ, 686, 1503
Bridle S., et al. 2009, Annals of Applied Statistics, 3, 6
Butcher H., Oemler Jr. A., 1978, ApJ, 219, 18
Costanzi M., et al. 2019, MNRAS, 488, 4779
DES Collaboration 2016, MNRAS, 460, 1270
DES Collaboration 2020, arXiv e-prints, p. arXiv:2002.11124
DeRose J., et al. 2019, arXiv e-prints, p. arXiv:1901.02401
Eckert K., et al. 2020, arXiv e-prints, p. arXiv:2004.05618
Euclid Collaboration 2019, A&A, 627, A59
Everett S., et al. 2020, arXiv e-prints, p. arXiv:2012.12825
Fenech Conti I., et al. 2017, MNRAS, 467, 1627
Flaugher B., et al. 2015, AJ, 150, 150
Gavazzi G., et al. 2010, A&A, 517, A73
Górski K. M., et al. 2005, ApJ, 622, 759
Gruen D., et al. 2019, MNRAS, 488, 4389
Hansen S. M., et al. 2009, ApJ, 699, 1333
Hartley W. G., et al. 2020, arXiv e-prints, p. arXiv:2012.12824
Hastie T., Tibshirani R., Friedman J., 2001, The Elements of Statistical Learning. Springer Series in Statistics, Springer New York Inc., New York, NY, USA
Hennig C., et al. 2017, MNRAS, 467, 4015
Hoekstra H., Viola M., Herbonnet R., 2017, MNRAS, 468, 3295
Hoyle B., et al. 2018, MNRAS, 478, 592
Huff E., Mandelbaum R., 2017, preprint (arXiv:1702.02600)
Jarvis M. J., et al. 2013, MNRAS, 428, 1281
Jarvis M., et al. 2016, MNRAS, 460, 2245
Kannawadi A., et al. 2019, A&A, 624, A92
Kessler R., et al. 2015, AJ, 150, 172
Kluge M., et al. 2020, ApJS, 247, 43
Kravtsov A. V., Borgani S., 2012, ARA&A, 50, 353
MacCrann N., et al. 2020, arXiv e-prints, p. arXiv:2012.08567
MacKay D. J. C., 2002, Information Theory, Inference & Learning Algorithms. Cambridge University Press, USA
Mandelbaum R., et al. 2015, MNRAS, 450, 2963
Mandelbaum R., et al. 2018, MNRAS, 481, 3170
Mantz A. B., et al. 2015, MNRAS, 446, 2205
Massey R., et al. 2007, MNRAS, 376, 13
McClintock T., et al. 2019, MNRAS, 482, 1352
Melchior P., et al. 2017, MNRAS, 469, 4899
Miller L., et al. 2013, MNRAS, 429, 2858
Mohr J. J., et al. 2008, in Proc. SPIE. p. 70160L (arXiv:0807.2515), doi:10.1117/12.789550
Myles J., et al. 2020, arXiv e-prints, p. arXiv:2012.08566
Navarro J. F., Frenk C. S., White S. D. M., 1996, ApJ, 462, 563
Oaxaca Wright C., Brainerd T. G., 1999, arXiv e-prints, pp astro-ph/9908213
Parzen E., 1962, Ann. Math. Statist., 33, 1065
Planck Collaboration 2016, A&A, 594, A24
Postman M., Lauer T. R., 1995, ApJ, 440, 28
Pujol A., et al. 2019, A&A, 621, A2
Refregier A., Amara A., 2014, Physics of the Dark Universe, 3, 1
Rowe B. T. P., et al. 2015, Astronomy and Computing, 10, 121

AFFILIATIONS

Max Planck Institute for Extraterrestrial Physics, Giessenbachstrasse, 85748 Garching, Germany
Universitäts-Sternwarte, Fakultät für Physik, Ludwig-Maximilians Universität München, Scheinerstr. 1, 81679 München, Germany
Department of Physics, Stanford University, 382 Via Pueblo Mall, Stanford, CA 94305, USA
Kavli Institute for Particle Astrophysics & Cosmology, P. O. Box 2450, Stanford University, Stanford, CA 94305, USA
SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK
Brookhaven National Laboratory, Bldg 510, Upton, NY 11973, USA
Département de Physique Théorique and Center for Astroparticle Physics, Université de Genéve, 24 quai Ernest Ansermet, CH-1211 Geneva, Switzerland
Center for Cosmology and Astro-Particle Physics, The Ohio State University, Columbus, OH 43210, USA
Fermi National Accelerator Laboratory, P. O. Box 500, Batavia, IL 60510, USA
Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637, USA
Argonne National Laboratory, 9700 South Cass Avenue, Lemont, IL 60439, USA
Department of Physics, University of Arizona, Tucson, AZ 85721, USA
Faculty of Physics, Ludwig-Maximilians-Universität, Scheinerstr. 1, 81679 Munich, Germany
Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104, USA
Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15312, USA
Santa Cruz Institute for Particle Physics, Santa Cruz, CA 95064, USA
Center for Astrophysical Surveys, National Center for Supercomputing Applications, 1205 West Clark St., Urbana, IL 61801, USA
Department of Astronomy, University of Illinois at Urbana-Champaign, 1002 W. Green Street, Urbana, IL 61801, USA
Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
Jodrell Bank Center for Astrophysics, School of Physics and Astronomy, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
Department of Physics, Duke University Durham, NC 27708, USA
Institute for Astronomy, University of Edinburgh, Edinburgh EH9 3HJ, UK
Departamento de Física Matemática, Instituto de Física, Universidade de São Paulo, CP 66318, São Paulo, SP, 05314-970, Brazil
Laboratório Interinstitucional de e-Astronomia - LIneA, Rua Gal. José Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil
CNRS, UMR 7095, Institut d’Astrophysique de Paris, F-75014, Paris, France
Sorbonne Universités, UPMC Univ Paris 06, UMR 7095, Institut d’Astrophysique de Paris, F-75014, Paris, France
Department of Physics and Astronomy, Pevensey Building, University of Sussex, Brighton, BN1 9QH, UK
Department of Physics & Astronomy, University College London, Gower Street, London, WC1E 6BT, UK
Instituto de Astrofisica de Canarias, E-38205 La Laguna, Tenerife, Spain
Universidad de La Laguna, Dpto. Astrofísica, E-38206 La Laguna, Tenerife, Spain
Institut de Física d’Altes Energies (IFAE), The Barcelona Institute of Science and Technology, Campus UAB, 08193 Bellaterra (Barcelona) Spain
Astronomy Unit, Department of Physics, University of Trieste, via Tiepolo 11, I-34131 Trieste, Italy
INAF-Osservatorio Astronomico di Trieste, via G. B. Tiepolo 11, I-34143 Trieste, Italy
Institute for Fundamental Physics of the Universe, Via Beirut 2, 34014 Trieste, Italy
Observatório Nacional, Rua Gal. José Cristino 77, Rio de Janeiro, RJ - 20921-400, Brazil
Department of Physics, University of Michigan, Ann Arbor, MI 48109, USA
Department of Physics, IIT Hyderabad, Kandi, Telangana 502285, India
Institute of Theoretical Astrophysics, University of Oslo. P.O. Box 1029 Blindern, NO-0315 Oslo, Norway
Instituto de Fisica Teorica UAM/CSIC, Universidad Autonoma de Madrid, 28049 Madrid, Spain
Institut d’Estudis Espacials de Catalunya (IEEC), 08034 Barcelona, Spain
Institute of Space Sciences (ICE, CSIC), Campus UAB, Carrer de Can Magrans, s/n, 08193 Barcelona, Spain
Department of Astronomy, University of Michigan, Ann Arbor, MI 48109, USA
School of Mathematics and Physics, University of Queensland, Brisbane, QLD 4072, Australia
Department of Physics, The Ohio State University, Columbus, OH 43210, USA
Australian Astronomical Optics, Macquarie University, North Ryde, NSW 2113, Australia
Lowell Observatory, 1400 Mars Hill Rd, Flagstaff, AZ 86001, USA
Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton, NJ 08544, USA
Institució Catalana de Recerca i Estudis Avançats, E-08010 Barcelona, Spain
Physics Department, 2320 Chamberlin Hall, University of Wisconsin-Madison, 1150 University Avenue Madison, WI 53706-1390
Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge CB3 0HA, UK
School of Physics and Astronomy, University of Southampton, Southampton, SO17 1BJ, UK
Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Institute of Cosmology and Gravitation, University of Portsmouth, Portsmouth, PO1 3FX, UK

APPENDIX A: DATA SELECTION

The wide-field galaxy sample used in this study for the statistical modeling (Section 3) is obtained from the DES Y3 GOLD galaxy catalog (Sevilla-Noarbe et al. 2020) using the criteria listed in Table A2. The full list of galaxy features used in this study is given in Table A1, along with their relation to the DES Y3 data products produced by Sevilla-Noarbe et al. (2020) and Hartley & Choi et al. (2020), corresponding to the wide-field and deep-field features respectively. The mean photometric and morphological parameters of redMaPPer BCGs are listed in Table A3. These are obtained by matching the galaxy properties of the Y3 GOLD catalog with the catalog of redMaPPer central galaxies based on the COADD_OBJECT_ID.

APPENDIX B: PDF TRANSFORMATION

Let us formulate Equation 3 as a transformation of a naive proposal distribution:

$$p_D(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) = p_{D;{\rm prop}}(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) \times F(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z). \tag{B1}$$

As there is no cluster information from the deep-field survey, the proposal PDF cannot depend on $\lambda$ and $z$:

$$p_{D;{\rm prop}}(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) = p_{D;{\rm prop}}(\theta_{\rm deep}, \theta_{\rm wide}, R), \tag{B2}$$

and for the same reason, in the proposal distribution $\theta_{\rm deep}$ and $\theta_{\rm wide}$ cannot be correlated with $R$:

$$p_{D;{\rm prop}}(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) = p_D(\theta_{\rm deep}, \theta_{\rm wide}) \cdot p_{D;{\rm prop}}(R). \tag{B3}$$

Here $p_D(\theta_{\rm deep}, \theta_{\rm wide})$ can be directly measured from the deep-field survey, and $p_{D;{\rm prop}}(R)$ is chosen to capture the approximately uniform surface density of galaxies, e.g. $p_{D;{\rm prop}}(R) \propto R$.

The remaining task is to find an appropriate multiplicative term $F(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z)$ which transforms the proposal distribution $p_{D;{\rm prop}}$ into the target distribution $\tilde{p}_D$. In the following we denote with a tilde distributions or estimates which cover the full feature space, but are constrained by approximations due to information not accessible to us.

Since $\tilde{p}_D$ depends on $\lambda$, $z$ and $R$, and $p_{D;{\rm prop}}$ is independent of these, the $F$ term must contain all such information. Furthermore, the correlation between $\theta_{\rm deep}$ and $R$ cannot be measured from wide-field data, therefore we approximate $F$ as

$$\tilde{F}(\theta_{\rm wide}, R \,|\, \lambda, z) \approx F(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z). \tag{B4}$$

A necessary consistency constraint placed on $\tilde{F}$ is expressed as

$$\tilde{p}_D(\theta_{\rm wide}, R \,|\, \lambda, z)\big|_W = p_{D;{\rm prop}}(\theta_{\rm wide}, R)\big|_W \times \tilde{F}(\theta_{\rm wide}, R \,|\, \lambda, z) \tag{B5}$$

$$= p_W(\theta_{\rm wide}, R \,|\, \lambda, z), \tag{B6}$$

where the $W$ subscript indicates a PDF estimated from wide-field data, and the $|_W$ subscript denotes that the otherwise greater magnitude range is restricted to the wide-field completeness magnitude of $i \approx 22.5$. This constraint fixes $\tilde{F}$ as

$$\tilde{F}(\theta_{\rm wide}, R \,|\, \lambda, z) = \frac{1}{\hat{V}} \frac{p_W(\theta_{\rm wide}, R \,|\, \lambda, z)}{p_{D;{\rm prop}}(\theta_{\rm wide}, R)\big|_W} \tag{B7}$$

$$= \frac{1}{\hat{V}} \frac{p_W(\theta_{\rm wide}, R \,|\, \lambda, z)}{p_D(\theta_{\rm wide})\big|_W \cdot p_{D;{\rm prop}}(R)}, \tag{B8}$$

where $\hat{V}$ is a normalization factor to account for the different volumes of the wide-field and deep-field parameter spaces, e.g., the difference in the limiting depth of $i < 22.5$ versus $i < 24.5$.

From the combination of Equation B3 and Equation B8 we can then write our estimate of the target distribution as

$$\tilde{p}_D(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) \approx \frac{p_D(\theta_{\rm deep}, \theta_{\rm wide}) \, p_W(\theta_{\rm wide}, R \,|\, \lambda, z)}{\hat{V} \cdot p_D(\theta_{\rm wide})\big|_W}, \tag{B9}$$

where $p_{D;{\rm prop}}(R)$ drops out, and the approximation is composed entirely of PDFs which can be directly measured from the wide-field or deep-field data.

APPENDIX C: REJECTION SAMPLING

C1 Defining subtraction as resampling

The cluster member galaxy population can be statistically defined as the feature-dependent galaxy excess compared to a reference random line-of-sight, as shown in Equation 2. In the language of rejection sampling, $p_{\rm memb}$ can be calculated by stochastically estimating the volume between two PDFs (MacKay 2002). In our case the two distributions are $p_{\rm rand}$ and $\frac{\hat{n}_c}{\hat{n}_r} p_{\rm clust}$, the feature PDF of galaxies measured in projection around reference random points and the scaled feature PDF of galaxies measured around galaxy clusters respectively, where $\hat{n}_r$ and $\hat{n}_c$ are the corresponding normalization factors.

In the following we empirically sample $p_{\rm memb}$. For each sample:

(i) Draw a proposal sample $\beta_i \sim p_{\rm prop} \sim U$, where $\beta_i$ is drawn from a uniform distribution whose support covers the support of both $p_{\rm rand}$ and $p_{\rm clust}$.
(ii) Perform a uniform random draw $u_i \sim U[0; 1)$.
(iii) Evaluate the acceptance condition

$$p_{\rm rand}(\beta_i) < u_i \cdot \frac{\hat{n}_c}{\hat{n}_r} \sup\big(p_{\rm clust}(\beta_i)\big) < \frac{\hat{n}_c}{\hat{n}_r} p_{\rm clust}(\beta_i), \tag{C1}$$

and repeat from step (i) until the condition is fulfilled and a sample can be accepted.

The rejection sampling recipe guarantees that accepted samples will be distributed according to $p_{\rm memb}$ (MacKay 2002).
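The acceptance condition of Inequality C1 can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration and not taken from the DES Y3 measurement: the single feature on $[0, 1]$, the flat toy `p_rand`, the toy member PDF `p_memb_true` $= 2x$, and the counts `N_R` and `N_C`. The sketch also draws all proposals at once and keeps the accepted subset, instead of looping until each sample is accepted; the accepted samples follow the same distribution.

```python
import numpy as np

# Toy setup (all values are illustrative assumptions).
N_R, N_C = 100.0, 150.0          # mean galaxy counts: random and cluster line-of-sight


def p_rand(x):
    """Feature PDF around random points: flat on [0, 1]."""
    return np.ones_like(x)


def p_memb_true(x):
    """Hidden member PDF used to build the toy cluster line-of-sight."""
    return 2.0 * x


def p_clust(x):
    """Cluster line-of-sight PDF: background plus member excess, normalized."""
    return (N_R * p_rand(x) + (N_C - N_R) * p_memb_true(x)) / N_C


def sample_members(n_proposals, rng):
    """Keep the proposals that fulfil the two-sided condition of Inequality C1."""
    scale = N_C / N_R
    grid = np.linspace(0.0, 1.0, 1001)
    bound = scale * p_clust(grid).max()        # (n_c / n_r) * sup(p_clust)
    beta = rng.uniform(0.0, 1.0, n_proposals)  # step (i): uniform proposals
    u = rng.uniform(0.0, 1.0, n_proposals)     # step (ii): uniform draws
    # step (iii): p_rand < u * bound < (n_c / n_r) * p_clust
    accept = (p_rand(beta) < u * bound) & (u * bound < scale * p_clust(beta))
    return beta[accept]


n_proposals = 200_000
rng = np.random.default_rng(0)
members = sample_members(n_proposals, rng)
print("accepted fraction:", members.size / n_proposals)
print("mean member feature:", members.mean())
```

In this toy case the accepted samples should follow the member PDF $2x$, whose mean is $2/3$, even though `sample_members` only ever evaluates `p_rand` and `p_clust`.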
| Feature | catalog parameter | description |
|---|---|---|
| Deep-field features | | |
| m | bdf_mag_dered_3 | i-band MOF magnitude with photometric correction |
| c | bdf_mag_dered_2 - bdf_mag_dered_1 | g - r MOF color with photometric correction |
| | bdf_mag_dered_3 - bdf_mag_dered_2 | r - i MOF color with photometric correction |
| | bdf_mag_dered_4 - bdf_mag_dered_3 | i - z MOF color with photometric correction |
| s | sqrt(bdf_g_0² + bdf_g_1²) | absolute MOF ellipticity $\lvert e \rvert$ |
| | FRACDEV | bulge / disk flux fraction at fixed component size |
| | log(1 + bdf_T) | MOF size squared in arcsec², T = ⟨x²⟩ + ⟨y²⟩ |
| z_g | z_mc | ugrizJHK-band based photo-z estimate from EAzY |
| Wide-field features | | |
| R | log sqrt((RA – ra_ref)² + (DEC – dec_ref)²) | log projected separation in arcmin from reference point |
| m | MOF_CM_MAG_CORRECTED_I | i-band MOF magnitude with photometric correction |
| c | MOF_CM_MAG_CORRECTED_G - MOF_CM_MAG_CORRECTED_R | g - r MOF color with photometric correction |
| | MOF_CM_MAG_CORRECTED_R - MOF_CM_MAG_CORRECTED_I | r - i MOF color with photometric correction |
| | MOF_CM_MAG_CORRECTED_I - MOF_CM_MAG_CORRECTED_Z | i - z MOF color with photometric correction |

Table A1. Features and their definitions from the columns of the relevant photometric catalogs. Deep-field features: DES Y3 deep and supernova fields (Hartley & Choi et al. 2020); for further explanation see Section 2.3. Wide-field features: DES Y3 GOLD (Sevilla-Noarbe et al. 2020); for further explanation see Section 2.1.

| Y3A2 GOLD column | value | description |
|---|---|---|
| FLAGS_FOOTPRINT | | |
| FLAGS_FOREGROUND | | |
| bitand(FLAGS_GOLD, 122) | | |
| EXTENDED_CLASS_SOF | | |

Table A2. Y3A2 GOLD catalog query cuts used in obtaining the survey data from the DES Data Management System (DESDM, Mohr et al. 2008).

Since in practice $p_{\rm clust}$ is not known exactly, we can rewrite Inequality C1 by replacing the supremum with an appropriately chosen value $M$ which fulfills $\frac{\hat{n}_c}{\hat{n}_r} p_{\rm clust} < M$ and $p_{\rm rand} < M$:

$$p_{\rm rand}(\beta_i) < u_i \cdot \frac{\hat{n}_c}{\hat{n}_r} M < \frac{\hat{n}_c}{\hat{n}_r} p_{\rm clust}(\beta_i). \tag{C2}$$

We further increase the acceptance rate by drawing samples $\beta_i$ from an appropriately chosen proposal distribution $p_{\rm prop}$ instead of from a uniform distribution. In this case the inequality becomes

$$\frac{p_{\rm rand}(\beta_i)}{\frac{\hat{n}_c}{\hat{n}_r} M \cdot p_{\rm prop}(\beta_i)} < u_i < \frac{p_{\rm clust}(\beta_i)}{M \cdot p_{\rm prop}(\beta_i)}, \tag{C3}$$

where $\hat{n}_c / \hat{n}_r$ is the average relative overdensity of galaxy counts in the cluster line-of-sight compared to a reference random line-of-sight.

C2 Combining resampling and extrapolation

The primary advantage of Equation C3 over directly performing the subtraction of the rescaled PDFs is that it can incorporate the extrapolation according to Equation B9. For this we adopt the proposal distribution as defined by Equation B3:

$$p_{\rm prop} = p_{\rm prop}(\theta_{\rm deep}, \theta_{\rm wide}, R \,|\, \lambda, z) = p_D(\theta_{\rm deep}, \theta_{\rm wide}) \cdot p_{W;{\rm rand}}(R \,|\, \lambda, z) = p_D(m, c, s, z_g) \cdot p_{W;{\rm rand}}(R \,|\, \lambda, z), \tag{C4}$$

which we use to draw the proposal random samples from. Furthermore, we define a restricted proposal distribution which contains only features contained within $\theta_{\rm ref}$, that is

$$p_{\rm rp} = p_{\rm rp}(\theta_{\rm wide}, R \,|\, \lambda, z) = p_D(c_{\rm wide}) \cdot p_{W;{\rm rand}}(R \,|\, \lambda, z), \tag{C5}$$

which can be directly compared with $p_{\rm clust}$ and $p_{\rm rand}$.

Combining the above, we can generate random samples from the survey-extrapolated $\tilde{p}_{\rm memb}$ by drawing samples $\{m_i, c_i, s_i, z_{g;i}, R_i\}$ from Equation 7, and considering the subset which fulfills the extrapolated membership criteria

$$\frac{\hat{n}_r}{\hat{n}_c} \frac{p_{W;{\rm rand}}(c_{{\rm wide};i}, R_i \,|\, \lambda, z)}{M \cdot p_D(c_{{\rm wide};i}) \cdot p_{W;{\rm rand}}(R_i \,|\, \lambda, z)} < u_i \tag{C6}$$

and

$$u_i < \frac{p_{W;{\rm clust}}(c_{{\rm wide};i}, R_i \,|\, \lambda, z)}{M \cdot p_D(c_{{\rm wide};i}) \cdot p_{W;{\rm rand}}(R_i \,|\, \lambda, z)}. \tag{C7}$$

These two inequalities serve as the basis of the computation in this work.

| $z \in$ | $\lambda \in$ | $\langle i \rangle$ | $\langle g - r \rangle$ | $\langle r - i \rangle$ | $\langle i - z \rangle$ | $\langle T_{\rm BCG} \rangle$ [arcsec²] | $\langle \lvert g \rvert \rangle$ |
|---|---|---|---|---|---|---|---|
| [0.3; 0.35) | [30; 45) | 17.76 | 1.36 | 0.54 | 0.32 | 28.90 | 0.14 |
| [0.3; 0.35) | [45; 60) | 17.62 | 1.38 | 0.54 | 0.31 | 33.20 | 0.14 |
| [0.45; 0.5) | [30; 45) | 18.58 | 1.85 | 0.70 | 0.37 | 21.92 | 0.15 |
| [0.45; 0.5) | [45; 60) | 18.50 | 1.85 | 0.71 | 0.37 | 28.43 | 0.14 |
| [0.6; 0.65) | [30; 45) | 19.36 | 1.83 | 1.01 | 0.44 | 16.90 | 0.17 |
| [0.6; 0.65) | [45; 60) | 19.18 | 1.83 | 1.02 | 0.45 | 22.44 | 0.16 |

Table A3. Properties of the mean bright central galaxy (BCG) across the different cluster richness and redshift bins. For each BCG the bulge (de Vaucouleurs) fraction is set to unity. The $T_{\rm BCG}$ parameter is the effective area of the galaxy corresponding to the SOF size squared in arcsec², T = ⟨x²⟩ + ⟨y²⟩.
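The membership criteria of Inequalities C6 and C7 (Appendix C2) can be illustrated with a minimal sketch in which proposals are drawn from a non-uniform, deep-field-like distribution and carry along a feature that the wide-field PDFs never see. All ingredients are assumptions made for illustration: a single wide-field feature `c` on $[0, 1]$ with no radial dependence, the toy PDFs `p_rand`, `p_clust` and `p_prop`, the counts `N_R` and `N_C`, and the hypothetical deep-only feature `d`, modeled as `c` plus Gaussian noise. The magnitude restriction $|_W$ and the factor $\hat{V}$ are omitted.

```python
import numpy as np

# Toy setup (all values are illustrative assumptions).
N_R, N_C = 100.0, 150.0
SCALE = N_C / N_R                      # n_c / n_r


def p_rand(c):
    """Wide-field PDF around random points: flat on [0, 1]."""
    return np.ones_like(c)


def p_clust(c):
    """Wide-field PDF around clusters: flat background plus a 2c member excess."""
    return (2.0 + 2.0 * c) / 3.0


def p_prop(c):
    """Proposal PDF of the wide feature, standing in for the deep-field p_D(c)."""
    return 0.5 + c


def draw_proposals(n, rng):
    """Draw (c, d) pairs; c follows p_prop, d is a correlated deep-only feature."""
    q = rng.uniform(0.0, 1.0, n)
    c = -0.5 + np.sqrt(0.25 + 2.0 * q)   # inverse CDF of p_prop on [0, 1]
    d = c + rng.normal(0.0, 0.1, n)      # never enters the acceptance test
    return c, d


def select_members(c, u, m_bound):
    """Two-sided acceptance test in the form of Inequalities C6 and C7."""
    denom = m_bound * p_prop(c)
    lower = p_rand(c) / (SCALE * denom)  # (n_r / n_c) * p_rand / (M * p_prop)
    upper = p_clust(c) / denom           # p_clust / (M * p_prop)
    return (lower < u) & (u < upper)


n_proposals = 200_000
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 1001)
M = (p_clust(grid) / p_prop(grid)).max()   # smallest bound keeping upper <= 1
c, d = draw_proposals(n_proposals, rng)
u = rng.uniform(0.0, 1.0, n_proposals)
keep = select_members(c, u, M)
c_memb, d_memb = c[keep], d[keep]
print("accepted fraction:", keep.mean())
print("mean c, mean d of members:", c_memb.mean(), d_memb.mean())
```

The acceptance test only evaluates wide-field quantities, yet the accepted `d` values shift along with `c` through the correlation present in the proposal sample. This mirrors how features measured only in the deep fields are carried over to the member population in the scheme described above.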