3D-Matched-Filter Galaxy Cluster Finder I: Selection Functions and CFHTLS Deep Clusters

Abstract

We present an optimised galaxy cluster finder, 3D-Matched-Filter (3D-MF), which utilises galaxy cluster radial profiles, luminosity functions and redshift information to detect galaxy clusters in optical surveys. This method is an improvement over other matched-filter methods, most notably through implementing redshift slicing of the data to significantly reduce line-of-sight projections and related false positives. We apply our method to the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) Deep fields, finding ~170 galaxy clusters per square degree in the 0.2 <= z <= 1.0 redshift range. Future surveys such as LSST and JDEM can exploit 3D-MF's automated methodology to produce complete and reliable galaxy cluster catalogues. We determine the reliability and accuracy of the statistical approach of our method through a thorough analysis of mock data from the Millennium Simulation. We detect clusters with 100% completeness for M_200 >= 3.0x10^(14)M_sun, 88% completeness for M_200 >= 1.0x10^(14)M_sun, and 72% completeness well into the 10^(13)M_sun cluster mass range. We show a 36% multiple detection rate for cluster masses >= 1.5x10^(13)M_sun and a 16% false detection rate for galaxy clusters >~ 5x10^(13)M_sun, reporting that for clusters with masses <~ 5x10^(13)M_sun false detections may increase up to ~24%. Utilising these selection functions we conclude that our galaxy cluster catalogue is the most complete CFHTLS Deep cluster catalogue to date.


arXiv [astro-ph.CO]. Mon. Not. R. Astron. Soc. 000, 000–000 (0000). Printed 2 November 2018 (MN LaTeX style file v2.2)

M. Milkeraitis, L. Van Waerbeke, C. Heymans, H. Hildebrandt, J. P. Dietrich, T. Erben

University of British Columbia, Department of Physics and Astronomy, 6224 Agricultural Rd, Vancouver BC, V6T 1Z1, Canada
The Scottish Universities Physics Alliance, Institute for Astronomy, University of Edinburgh, Blackford Hill, Edinburgh EH9 3HJ, UK
Leiden Observatory, Leiden University, Niels Bohrweg 2, 2333CA Leiden, The Netherlands
ESO, Karl-Schwarzschild-Str. 2, 85748 Garching b. München, Germany
Physics Department, University of Michigan, 450 Church St, Ann Arbor, MI 48109-1040, USA
Michigan Center for Theoretical Physics, 450 Church St, Ann Arbor, MI 48109-1040, USA
Argelander-Institut für Astronomie, Auf dem Hügel 71, 53121 Bonn, Germany

20 March 2010

Key words: galaxies: abundances - galaxies: luminosity function - galaxies: clusters: general - cosmology: theory - large-scale structure of Universe - methods: numerical

Clusters of galaxies pinpoint the densest regions in the Universe. Housed in deep gravitational wells, clusters act as laboratories to study the influence of extreme environments on galaxy formation and evolution (Dressler 1980). The most massive clusters are also natural gravitational telescopes, lensing and magnifying light from the most distant of galaxies (Stark et al. 2007). As clusters trace the high mass tail of the matter distribution, a complete sample can provide a very sensitive probe of cosmology (Battye & Weller 2003). Directly probing the growth of structure in the early Universe, cluster cosmology is fast becoming an important component of future dark energy surveys.

As the future of optical astronomy ushers in large datasets of wide, deep sky coverage, it becomes necessary to develop automated algorithms that methodically and accurately search these datasets for galaxy clusters. Depending on scientific goals, it is desirable to have as complete a galaxy cluster sample as possible, over a range of redshifts; any intrinsic limitations of these search algorithms and the resultant biases introduced into generated galaxy cluster catalogues must be understood. The question of how to methodically select and quantify the completeness of a cluster sample is the subject of this paper.

Many different approaches to searching for galaxy clusters in astronomical data exist: current optical astronomy methods focus on finding clusters as overdensities via friends-of-friends algorithms such as Li & Yee (2008), density maps as in Adami et al. (2010), or Voronoi tessellation methods as in Lopes et al. (2004) and van Breukelen & Clewley (2009), for example. Other search methods look for large, red, elliptical cluster galaxy populations, and are referred to as red sequence techniques (Gladders & Yee (2000), Cohn et al. (2007), Kodama et al. (2007), Lu et al. (2009), and Thanjavur (2009) for example), perhaps also including the existence of bright central galaxies in the search algorithms (such as the maxBCG method of Koester et al. (2007), and Hilbert & White (2009)). Alternatively, other methods search for clusters based on characteristic galaxy cluster luminosity and radial profiles (Postman et al. (1996), Olsen et al. (1999a), Kepner et al. (1999), White & Kochanek (2002), Kochanek et al. (2003), Gilbank et al. (2004), Dietrich et al. (2007), Grove et al. (2009), Menanteau et al. (2009)). Each of these methods makes assumptions about general cluster properties and, as a result, derived galaxy cluster catalogues will primarily include galaxy clusters that reflect those assumptions. Ideally a method that minimises these biases and produces an understood and complete galaxy cluster catalogue will be best suited for statistical science of galaxy clusters and investigations into cosmology.

With this in mind, we present an optimised luminosity function and radial profile based optical galaxy cluster finding algorithm that we call 3D-Matched-Filter (3D-MF). We present thorough tests of 3D-MF on simulations, and use the ascertained galaxy cluster selection functions to produce a galaxy cluster catalogue for the Canada-France-Hawaii Telescope Legacy Survey Deep fields.
To this effect, Section 2 of this paper will discuss the Millennium Simulation catalogues used, Section 3 will discuss matched-filter methodology, as well as 3D-MF itself, and Section 4 will outline the selection functions of 3D-MF from the Millennium Simulation data (without and with photometric redshift error). Subsequently, 3D-MF will be run on the Canada-France-Hawaii Telescope Legacy Survey Deep dataset and the resultant galaxy cluster catalogues will be discussed in Section 5.

A ΛCDM cosmology has been assumed throughout this work: H_0 = 73 km/s/Mpc, Ω_M = 0.25, Ω_Λ = 0.75.

The simulation catalogues used throughout this work are six pencil-beam mock catalogues from Kitzbichler & White (2007) (henceforth KW07). These catalogues are lightcones created from the semi-analytic galaxy catalogue from De Lucia & Blaizot (2007), which in turn is derived from the Millennium Simulation of the concordance ΛCDM cosmogony (Springel et al. (2005); henceforth S05).

The Millennium Simulation reproduces galaxy clustering as a function of luminosity well, as shown in Figure 5 of S05. Furthermore, semi-analytic modelling applied to the original Millennium Simulation catalogues, in the De Lucia & Blaizot (2007) and KW07 catalogues, results in luminosity functions in similarly good agreement with observations. Consequently, these catalogues are well suited to test a cluster finding algorithm based on luminosity, such as the 3D-MF galaxy cluster finding algorithm presented in this work. Conversely, there are difficulties in reproducing observational colours in simulation work (again, see Figure 5 from S05 and a discussion in De Lucia & Blaizot (2007)). This fact hinders the analysis of colour-based cluster finding algorithms, such as red sequence based techniques, on the Millennium Simulation. Cohn et al. (2007) tested their red sequence algorithm to examine changes with redshift using the Millennium Simulation. The authors found the simulation colours did not match observations; they instead determined their red sequence by matching the simulations. In another recent example, Hilbert & White (2009) used the maxBCG method from Koester et al. (2007) (based on the red sequence technique) on the Millennium Simulation and found that the simulation did not reproduce the colours of passively evolving galaxies to the degree required for a direct application of colour-redshift relations derived from observations.
Resultantly, they used altered colour-redshift relations, noting that if they had not adjusted the colour-redshift selection relations by hand, running maxBCG on the Millennium Simulation dataset and subsequent modelling would have found almost no clusters with z > 0.25. The 3D-MF technique does not use Millennium Simulation colour information, and relies instead on the luminosity function of galaxy clusters; therefore, evaluating on the Millennium Simulation provides a robust test of the application of this method to real data.

The KW07 mock catalogues used herein consist of six deep fields of 1 . × . ) in anticipation of the Canada-France-Hawaii Telescope data that 3D-MF will first have the opportunity to examine in Section 5.

$$i_{\rm MegaCam} = i_{\rm SDSS} - 0.085\,(r_{\rm SDSS} - i_{\rm SDSS}) \tag{1}$$

Hereafter we refer to i_MegaCam as i′-band.

A common definition does not exist in the literature with which to define a cluster, other than the agreed upon fact that a cluster consists of multiple galaxies gravitationally bound to one another orbiting a common centre of mass. Smaller numbers of galaxies bound together are more traditionally known as groups, whereas larger galaxy associations are typically called clusters; it is the border between groups and clusters that is not always strictly defined. For the purposes of this work, a cluster will be defined in the Millennium Simulation as a collection of galaxies that belong to the same parent halo and have > z < . clusters are listed in Table 1. Note that M_200 is the mass enclosed within radius r_200: the radius within which the mean density is 200 times the critical density at that redshift. Based on this galaxy cluster definition, the mass distribution of KW07 galaxy clusters is shown in Figure 1. The redshift distribution of these known clusters is presented in Figure 2.

Table 1. Parameters used to compile a list of known galaxy clusters from the KW07 mock lightcones of the Millennium Simulation.

Millennium Simulation Galaxy Cluster Traits
• Cluster members have the same friends-of-friends identification number (as described in the simulations), thus avoiding dark matter substructures
• Cluster members are flagged in the simulations with the M_200 of their parent halo
• Clusters have >

Figure 1. Mass distribution of Millennium Simulation KW07 galaxy clusters (following the cluster definition given in Table 1, and showing the cluster redshift range of 0 . z . ).

Figure 2. Redshift distribution of Millennium Simulation KW07 galaxy clusters (following the cluster definition given in Table 1, and noting the catalogue lightcones extend much further than a redshift of 1.2: higher ranges are not shown here).

The centre for each known Millennium Simulation cluster is defined in this work as the average position of the brightest 10 (or average of all members when the number of galaxies per cluster is < 10) i′-band objects per cluster. The position of the centre is only used when matching the clusters found by 3D-MF to those known in the Millennium Simulation to determine which clusters were found, and a tolerance range is used (described in Section 4); it is not crucial to immediately determine a more exact position of the cluster centre and it will be examined further in Section 4.3.2.
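The cluster-centre definition above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the authors' code: the function name `cluster_centre`, the tuple layout of a member, and the use of a plain arithmetic mean of RA and Dec (which ignores RA wrap-around at 0/360 degrees) are all hypothetical choices.

```python
def cluster_centre(members, n_brightest=10):
    """Return (ra, dec) averaged over the brightest members of one cluster.

    members: list of (ra_deg, dec_deg, i_mag) tuples (hypothetical layout).
    """
    if not members:
        raise ValueError("cluster has no members")
    # Smaller magnitude means brighter, so sort ascending in i'-band magnitude.
    ranked = sorted(members, key=lambda g: g[2])
    # Slicing keeps every member when there are fewer than n_brightest.
    chosen = ranked[:n_brightest]
    ra = sum(g[0] for g in chosen) / len(chosen)
    dec = sum(g[1] for g in chosen) / len(chosen)
    return ra, dec
```

With twelve members, only the ten brightest contribute; with two members, both are averaged.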

The 3D-MF galaxy cluster finding algorithm bases its search on the luminosity and radial profile of a galaxy cluster, appropriately sized for the redshift of the cluster. Prior to implementing our changes (described in Section 3.3), this method was based on Postman et al. (1996, hereafter P96), and the reader is directed there for a more in-depth discussion of the background of this technique. The method can use any sensible choice of luminosity or radial profile to model a galaxy cluster: 3D-MF follows P96 and currently uses a modified Schechter luminosity function (Equation 2) and truncated Hubble radial profile (Equation 3) to describe a fiducial galaxy cluster:

$$\Phi(M) = 0.4 \ln(10)\, \Phi^{*}\, 10^{0.4(\alpha+1)(M^{*}-M)} \exp\left[-10^{0.4(M^{*}-M)}\right] \tag{2}$$

where Φ is the galaxy luminosity function, Φ* sets the overall normalisation of galaxy density, M is absolute magnitude, and α is the slope of the faint end of the luminosity function. M* is a characteristic absolute magnitude: Φ drops with increasing luminosity but at magnitudes brighter than M* the exponentially decreasing slope of the luminosity function cuts off dramatically. The integral of the Schechter luminosity function diverges for α < − M*; the multiplicative term, exp[−10^{0.4(M*−M)}], that has been added to modify the Schechter equation, is a dimensionless power-law cutoff (weighted by the flux of the galaxy) that allows the function to remain integrable.

The radial profile of the cluster is modelled as a projected truncated Hubble radial profile:

$$P\left(\frac{r}{r_c}\right) = \frac{1}{\sqrt{1+\left(\frac{r}{r_c}\right)^{2}}} - \frac{1}{\sqrt{1+\left(\frac{r_{co}}{r_c}\right)^{2}}} \tag{3}$$

where r is the distance of a galaxy from a cluster centre, r_c is the cluster core radius, and r_co is the cutoff radius. r_co is arbitrary and must be chosen such that it extends well beyond the cluster's core radius, importantly allowing non-circularly symmetric clusters to still be detected by this technique.

Once the fiducial cluster has been established by choice of luminosity and radial profile, the data are searched in one wavelength band for areas that maximally and simultaneously match both profiles, thereby filtering galaxy clusters out of the data (hence the term 'matched-filter'). In previous applications of matched-filter methodologies, the data have not been binned per redshift as photometric redshift information has only recently become commonplace.
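As a rough illustration of the two filter ingredients, the sketch below evaluates a Schechter function in absolute magnitude and a truncated Hubble profile. It assumes the standard magnitude form of the Schechter function and assumes the profile is set to zero at and beyond the cutoff radius; the function names and the default values (M* = -20.69, α = -1.11, taken from Table 2 of this paper) are illustrative choices rather than the authors' implementation.

```python
import math

def schechter_mag(M, M_star=-20.69, alpha=-1.11, phi_star=1.0):
    """Schechter luminosity function per unit absolute magnitude."""
    x = 10.0 ** (0.4 * (M_star - M))  # luminosity ratio L / L*
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def hubble_profile(r, r_c, r_co):
    """Truncated Hubble radial profile; assumed zero for r >= r_co."""
    if r >= r_co:
        return 0.0
    inner = 1.0 / math.sqrt(1.0 + (r / r_c) ** 2)
    edge = 1.0 / math.sqrt(1.0 + (r_co / r_c) ** 2)
    return inner - edge
```

Subtracting the edge term makes the profile fall continuously to zero at r_co, so galaxies just inside the cutoff contribute almost nothing to the filter.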
In the absence of photometric redshift information in the past, the profiles that were being matched were themselves re-determined with an assumed cluster size and M* value at a series of trial redshifts.

The likelihood, L, that galaxies within r_co match the luminosity and radial profile of a fiducial cluster at a particular assumed redshift is calculated according to Equation 4:

$$\ln L \propto \int P\left(\frac{r}{r_c}\right) \frac{\Phi(m-m^{*})}{b(m)}\, D(r,m)\, d^{2}r\, dm \tag{4}$$

where m is the apparent magnitude of a galaxy, m* is the apparent magnitude corresponding to the characteristic luminosity of cluster galaxies, which incidentally includes a redshift dependent k-correction, b(m) is the background galaxy count, and D(r, m) is the total number of galaxies at a given magnitude and distance from a cluster centre.

This likelihood is calculated for every possible cluster centre in an input catalogue and can be viewed as an array, or image map; these maps will be referred to as likelihood maps. The likelihood maps are searched for peaks, which are galaxy cluster detections.

The significance of a peak, or galaxy cluster detection, is measured according to Equation 5:

$$\sigma = (S_p - S_{bg})/\sigma_{bg} \tag{5}$$

where S_p is the peak signal, and S_bg is the background signal calculated by binning the distribution of all likelihood values in a likelihood map, and taking the mode of this Gaussian distribution. The background dispersion, σ_bg, is 0.741 × (Q_3 − Q_1) where Q_1 and Q_3 are the first and third quartiles of the ranked likelihood value distribution. Cluster detections can be filtered based on their significance, ensuring detections are well above the background noise, and in this way a galaxy cluster list can be generated.

False detections are defined as galaxy clusters found by an algorithm that do not exist in reality: an important parameter, among others, by which to define the purity of an algorithm.
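The peak significance of Equation 5 can be sketched as follows. This is a minimal illustration, not the authors' code: the background level is taken as the centre of the most populated histogram bin, the number of bins (`n_bins`) is a hypothetical parameter, and the dispersion factor is taken here as 0.741, the usual conversion from interquartile range to standard deviation for a Gaussian. Quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from whatever ranking scheme the paper used.

```python
import statistics

def background_mode(values, n_bins=50):
    """Centre of the most populated histogram bin of the likelihood values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        counts[idx] += 1
    return lo + (counts.index(max(counts)) + 0.5) * width

def peak_significance(peak, values, n_bins=50, iqr_factor=0.741):
    """(S_p - S_bg) / sigma_bg, with sigma_bg = iqr_factor * (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    sigma_bg = iqr_factor * (q3 - q1)
    return (peak - background_mode(values, n_bins)) / sigma_bg
```

Using the mode and the interquartile range, rather than the mean and standard deviation, keeps the background estimate from being dragged upward by the cluster peaks themselves.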
Older versions of matched-filter techniques have uncomfortably high false detection rates, a feature largely attributable to line-of-sight projections.

P96 test their matched-filter algorithm on a self-created simulation (mimicking the Palomar Distant Cluster Survey) and find a 65% false detection rate of galaxy clusters with peak signal thresholds > σ. P96 reduce this false detection rate to 31% by adding the criterion that clusters must be found by their algorithm twice, looking separately in 2 colour bands. The ESO Imaging Survey had some success in finding clusters using their implementation of this same algorithm, but it was initially developed with the purpose of preparing a list of candidate clusters for follow-up observations and not for producing a well-defined sample for statistical analysis (Olsen et al. (1999a), Olsen et al. (1999b); O1&2). These ESO Imaging Survey cluster papers, and subsequent work with their algorithm (Dietrich et al. (2007): D07; and Olsen et al. (2007), Grove et al. (2009): O3&4), have not tested their version of the matched-filter algorithm on simulations in terms of its ability to recover known clusters, other than to very generally quantify the ability of their finder to avoid noise detections. Furthermore, these papers aim for the same galaxy cluster density as the Palomar Distant Cluster Survey in P96 (i.e. O3 quotes ∼ . . ± . R-band magnitude < or bright stars is necessary in reality and clusters on the borders of the masked regions, if not properly weighted or accounted for in terms of the fraction available to the search algorithm, will likely be missed. In terms of false detection reports, WK02 show how their false detection rates change with redshift: for their deepest self-created simulation of 1.5 x 1.5 square degrees out to redshift 1.0, with a limiting R band magnitude of 24, they have false positive rates of > 40% (averaging ≈ 60% for all redshift bins) for clusters > × M_⊙.

Proponents of other cluster finding techniques, such as red sequence based methods, often comment upon older matched-filter methods' high false detection rates as a reason to avoid this technique (Gladders & Yee (2000) or Gilbank et al. (2004) for example). However, the false detection rates quoted above are largely caused by line-of-sight projections in the data; removing this contamination will lead to much improved false detection rates and push down the galaxy cluster mass limit to which reliable detections can be made. Arguably, red sequence based methods have their own line-of-sight projection issues. Gladders & Yee (2000) for example introduced their red sequence based cluster finder by testing it on known CNOC2 clusters (i.e. not simulations, but real data), and found excellent false positive rates of < 5% for clusters at redshifts < 0.5. However, as WK02 also show, false positives increase with redshift. The Gladders & Yee (2000) CNOC2 sample testing provides limitations in the higher redshift ranges: notably a 36% false detection rate in the redshift range 0 . < z < . . < z < .

Using the foundations of the method discussed above in Section 3.1, 3D-MF advances the matched-filter technique by incorporating photometric redshift information, and adjusts additional elements as described below, to update and improve galaxy cluster finding. With the requirement that all next-generation surveys for cosmology come with photometric redshifts, it is natural for 3D-MF to extend the matched-filter technique into this third dimension.

original slices. After running 3D-MF on these original slices, the process is repeated by slicing the whole input catalogue differently, from redshift 0.10 to 1.30 (with the same bin width for example), and these would be considered the shifted slices. The resulting galaxy cluster lists from a complete run of original slices and a complete run of shifted slices are then merged (the details of which are presented in Section 4). Clusters that are multiply detected can be tracked, and false detections in either the original or shifted slices can be lowered by associating them properly with clusters from the shifted or original slices respectively. The aim of 3D-MF is to optimise cluster detections through redshift slicing, thus eliminating line-of-sight projections, but also to take into account the fact that each cluster spans a range of space and subsequently redshift slicing has to be applied carefully to not cut these clusters and count them multiple times, or conversely to miss them entirely.
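The two interleaved slicing passes might be sketched as below. The shifted range of 0.10 to 1.30 comes from the text; the slice width of 0.2 and an original range of 0.0 to 1.2 are assumptions made for this illustration, as are the function names.

```python
def make_slices(z_start, z_end, width):
    """Contiguous redshift slices [(lo, hi), ...] covering z_start..z_end."""
    n = round((z_end - z_start) / width)
    return [(round(z_start + i * width, 6), round(z_start + (i + 1) * width, 6))
            for i in range(n)]

def slices_containing(z, slices):
    """Indices of the slices whose half-open interval [lo, hi) contains z."""
    return [i for i, (lo, hi) in enumerate(slices) if lo <= z < hi]

# Assumed original pass (0.0 to 1.2) and the shifted pass quoted in the text.
original = make_slices(0.0, 1.2, 0.2)
shifted = make_slices(0.1, 1.3, 0.2)
```

A galaxy at z = 0.25 then falls in original slice 1 (0.2 to 0.4) and shifted slice 0 (0.1 to 0.3), so a cluster cut by an edge in one pass sits near the middle of a slice in the other.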
In optimising 3D-MF a variety of slicing styles were examined (various slice widths, non-uniform slice widths, differing overlapping slice amounts, etc), and the slicing method that produced the best results (as seen in Section 4.2.1) used redshift slice widths of 0.2, with shifted slices overlapping original slices by a redshift of 0.1.

$$L_{\rm Mask\ weighted} = L\,(1 + \%{\rm masked}/100)$$

Here L (from Equation 4) is up-weighted by the percentage of the redshift-scaled radial filter that is masked, giving L_Mask weighted.

The mask-weighted likelihood maps, L_Mask weighted, increase the likelihood that a partially masked cluster matches the fiducial cluster at a given redshift, as its luminosity and radial profiles are up-weighted to compensate for the information missing due to the presence of the mask. Importantly, mask-weighting also improves detections of clusters near the edges of an image, as clusters previously missed in these regions were found after the addition of this criterion.
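A minimal sketch of the mask weighting follows. It assumes the up-weighting factor is (1 + %masked/100) and applies the rejection rule listed in Table 2 (more than 50% masked is rejected); the function name and the use of `None` to signal rejection are hypothetical.

```python
def mask_weighted_likelihood(likelihood, percent_masked, reject_above=50.0):
    """Up-weight a likelihood by the masked percentage of the radial filter.

    Returns None when the masked percentage exceeds reject_above.
    """
    if not 0.0 <= percent_masked <= 100.0:
        raise ValueError("percent_masked must lie between 0 and 100")
    if percent_masked > reject_above:
        return None  # too little of the filter footprint is usable
    return likelihood * (1.0 + percent_masked / 100.0)
```

For example, a position with a quarter of its filter footprint masked has its likelihood multiplied by 1.25.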

The requirement for an accurate measure of the background galaxy counts (b(m) in Equation 4) is an issue in P96 and other implementations of matched-filter such as O1-4 discussed in Section 3.2. These older implementations of matched-filter methodology assume a priori that b(m) can be modelled by a power law in m, or they determine b(m) by creating self-simulated backgrounds (following the prescription of Soneira & Peebles 1978). 3D-MF determines the background counts by binning the magnitude distribution of the input data and subsequently interpolates within this measured distribution to determine b(m) for each galaxy (i.e. pulling out an interpolated background number count for a particular galaxy magnitude), including it as a data-driven b(m) in the integral over cluster galaxies in Equation 4. We find that this accurately models the likelihood of a cluster signal against a true data-derived background and increases a cluster's likelihood of detection, with the advantage that our method does not rely on biased modelling of b(m).

Previous matched-filter methods (implementations of P96 such as D07 and O1-4) detect peaks in the likelihood maps using SExtractor; while SExtractor is optimised for the detection of isolated peaks, it tends to blend things. Although this is excellent for object detection, the likelihood maps created with a matched-filter algorithm are more akin to continuous maps and thus need an appropriate peak detection method. Furthermore, SExtractor is parameter dependent; we desire a robust method that withstands reasonable changes in parameter space. 3D-MF detects all peaks above a user-input threshold (such as > . σ) and thereby generates a complete list of possible cluster candidates.

Other versions of the matched-filter technique assign a redshift to a galaxy cluster where it is maximally detected (i.e. the redshift at which the fiducial cluster is maximally matched) but in practice this often assigns an incorrect redshift to the cluster because all data are considered at all redshifts. Several other factors also contribute to this incorrect redshift assignment, among them the assumed background counts discussed above, the k-correction (and the effect on this of the assumed evolution of various galaxy types), and the fact that the cluster signals become over-corrected via the normalisation of the luminosity filter (as in D07 and O1-4). These methods apply a cluster signal correction factor to counteract some of these points, but resulting redshift assignments are still, on average, incorrect and thus are not robust (nor do they claim to be; see O1&2).

A plethora of papers followed up on the clusters found in O1&2 with the goal of obtaining spectroscopic redshifts of the matched-filter found cluster members (Ramella et al. (2000), Hansen et al. (2002), Benoist et al. (2002), Olsen et al. (2003), Olsen et al. (2005a), Olsen et al. (2005b), Grove et al. (2008)). Not all clusters were conclusively found in the spectroscopic analysis; of the ones that were, the spectroscopic redshifts for these systems were compared with their matched-filter found redshifts. Unfortunately, number counts are too low to ascertain anything statistically (i.e. the sample size available to compare spectroscopic and photometric redshifts was often only a few galaxy clusters per paper).

Also investigating redshift comparisons, galaxy clusters in D07 have redshifts derived from their matched-filter implementation and a sample of these are compared to known spectroscopic redshifts from the literature. These authors find their matched-filter has a bias toward lower redshifts and conclude they are underestimating the true redshift of galaxy clusters; a discrepancy that increases with higher redshifts.

As 3D-MF requires photometric redshift information for all of the objects in a catalogue and slices the catalogue according to redshift, once each galaxy cluster is found, its associated redshift slice is automatically, immediately and more accurately known. For example, for a redshift bin width of 0.2, with overlapping bins shifted by a redshift of 0.1 from the original bins, consider a cluster that is first detected in an original bin: the cluster is assigned the redshift of the centre of the bin and this cluster centre is thereby known to within ± . ± .05 in redshift: the width of the overlap region of the original and shifted slices). This of course assumes that photometric redshifts are without error and unbiased. Knowing a cluster's galaxy members (and their redshifts) leads to an even more localised determination of a cluster's redshift; this will be discussed in relation to 3D-MF in Section 4.3.1.
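The data-driven background counts b(m) described earlier in this section (binning the magnitude distribution of the input catalogue, then interpolating for each galaxy) could be sketched as follows. The bin limits, the number of bins, linear interpolation between bin centres, and holding the end values constant outside the binned range are all assumptions for this illustration; the paper does not specify them here.

```python
def bin_magnitudes(mags, m_min, m_max, n_bins):
    """Histogram of magnitudes; returns (bin_centres, counts)."""
    width = (m_max - m_min) / n_bins
    counts = [0] * n_bins
    for m in mags:
        if m_min <= m < m_max:
            counts[int((m - m_min) / width)] += 1
    centres = [m_min + (i + 0.5) * width for i in range(n_bins)]
    return centres, counts

def background_count(m, centres, counts):
    """Linearly interpolate b(m) between bin centres (flat beyond the ends)."""
    if m <= centres[0]:
        return float(counts[0])
    if m >= centres[-1]:
        return float(counts[-1])
    for i in range(len(centres) - 1):
        if centres[i] <= m <= centres[i + 1]:
            frac = (m - centres[i]) / (centres[i + 1] - centres[i])
            return counts[i] + frac * (counts[i + 1] - counts[i])
```

Each galaxy's magnitude is then passed to `background_count` to supply the b(m) term in Equation 4, without assuming any functional form for the counts.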

Table 2.

Schechter Function (Equation 2)
  M*_{i′-band}: -20.69
  α: -1.11
Hubble Profile (Equation 3)
  r_c  r_co  . ′ pix
Peak Detection
  Mask Weighting: L (1 + %masked/100); > 50% masked, reject
  Minimum Significance for Real Detection: > . σ (Equation 5)
Multiple Detections
  2D Grouping (RA, Dec) in Same Redshift Slice: 1.5 Mpc diameter
  Grouping Between z_3D-MF ± .

In this section we present the application of the 3D-MF algorithm to the Millennium Simulation KW07 mock catalogues. We first assume exact redshifts are known for all galaxies, and then investigate the impact of adding photometric redshift errors to the data on 3D-MF's selection functions. When matching the 3D-MF found clusters to known mock clusters, we will use the known clusters above a mass of 1 . × M_⊙, and focus on the redshift range 0 . z .

The Millennium Simulation mock lightcone catalogues were cut at an i′-band magnitude of 25.5 to simulate a magnitude-limited survey. The catalogues were then cut into optimal redshift slices (see Section 4.2.1): redshift slice widths of 0.2 and shifted slices shifted from original slices by 0.1 in redshift (i.e. original and shifted slices overlap by 0.1 in redshift). The mock catalogues were then analysed with 3D-MF, as described in Section 3, using the parameters listed in Table 2.

M* and α (Equation 2) were obtained by fitting a Schechter luminosity function to mock galaxies and found to be M*_{i′-band} = −20.69 ± 0.24 and α = −1.11 ± . > M_⊙ by setting the cutoff radius, r_co, to be 1 Mpc based on the r_200 radius of a 3 × M_⊙ cluster. This is sufficiently large to enclose non-symmetrically shaped clusters and thus allow them to be detected, but still small enough to maximally prevent neighbouring or on-the-verge-of-merging galaxy clusters from becoming blended into one cluster detection. P96 explored a range of cutoff radii and found the significance of their cluster detections dropped by up to 40% when increasing r_co beyond 1 Mpc, but did not test decreasing this window size significantly within 1 Mpc. However, P96 notes that when cluster dimensions are significantly larger than the cutoff radius, the signal will obviously be truncated, though the degree depends on cluster shape. In the Millennium Simulation analysis that follows we find our choice of r_co is very successful at detecting clusters of masses > × M_⊙ and has a very useful by-product of also detecting clusters with masses as low as ∼ M_⊙. As the set cutoff radius for these low mass clusters is significantly larger than their extent, it will be difficult to interpret our results for these objects, as we discuss further in Section 4.3. Future implementations of 3D-MF will experiment with optimising low mass detections using a varying cutoff radius.

Following 3D-MF's search of the Millennium Simulation mock lightcone catalogues, the output detected galaxy cluster list was compared with the known Millennium Simulation galaxy clusters (see Section 2.2). To match a galaxy cluster detection with known galaxy clusters, a cut was first made in a projected two-dimensional radius, set at 0.044 degrees. This value was chosen due to the fact that 0.044 degrees at a mid-redshift range, z ∼ 0.55, corresponds to ∼ r_co) for an individual cluster as per Table 2, and notably at the higher redshift end, z = 0.95, 0.044 degrees corresponds to ∼ candidates. From the list of candidates, each 3D-MF found cluster was matched in 3D space (RA, Dec, z) to the closest known cluster and that known cluster is thus considered detected. The redshift component of this matching was performed within ranges of z_3D-MF ± . > . z_3D-MF denotes the 3D-MF derived cluster redshift, and ± . z_3D-MF ± . ± z . 6. Note that lower redshift clusters will be spread out in redshift bins more appreciably than higher redshift clusters in relation to the bin volume; we wanted to avoid missing proper matches between 3D-MF clusters and known mock clusters seemingly spread out due to this effect and thus a slight widening of the matching redshift parameter for lower redshift clusters was chosen.
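The matching of a detection to the known mock clusters can be sketched as below. The 0.044 degree projected radius is from the text; the redshift tolerance `dz_tol` is left as a required parameter because its value is not legible in this copy. Choosing the candidate with the smallest angular separation, and the small-angle separation formula, are assumptions of this sketch rather than the authors' stated procedure.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Small-angle separation in degrees (adequate for sub-degree offsets)."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    return math.hypot((ra1 - ra2) * math.cos(mean_dec), dec1 - dec2)

def match_detection(det, known, dz_tol, max_sep_deg=0.044):
    """Index of the closest known cluster to det, or None if no candidate.

    det and each entry of known are (ra_deg, dec_deg, z) tuples.
    """
    best_idx, best_sep = None, None
    for idx, (ra, dec, z) in enumerate(known):
        sep = angular_sep_deg(det[0], det[1], ra, dec)
        if sep > max_sep_deg or abs(z - det[2]) > dz_tol:
            continue  # outside the projected radius or the redshift window
        if best_sep is None or sep < best_sep:
            best_idx, best_sep = idx, sep
    return best_idx
```

A detection that returns `None` here has no counterpart among the known clusters and would be a candidate false detection.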

As per the discussion in Section 2.2, it can easily become complicated to define a galaxy cluster. For example, in some cases it can become difficult to realistically determine whether a multiply detected galaxy cluster is either a) one larger known galaxy cluster detected multiple times (for instance, consider a merging system in a slightly 'dumbbell' shape that still has two clear lobes from the individual galaxy clusters, each lobe separate and large enough to be detected individually despite the chosen r_co value), or b) one detection of a cluster and a second detection that instead comes from smaller known nearby clusters in the lower mass ranges. For this reason, we chose to track multiple detections in three-dimensional space and apply grouping criteria. By grouping 3D-MF detections to 3D-MF detections by a physically motivated linking length, and calling these linkages multiple detections, we can track multiple detections within each redshift slice, as well as between redshift slices. We wanted to avoid falsely assigning detected clusters to the wrong known galaxy clusters and to track multiply detected clusters at all mass ranges separately from false detections. As a result, any 3D-MF detections within a predetermined linking length of each other were assigned the same grouping identification number. This linking length was chosen to be 0.75 Mpc (in two dimensions: RA and Dec) based on a physically motivated argument: if one 3D-MF detection lies exactly at the correct known centre position, and clusters typically span 1.5 Mpc in diameter, then 0.75 Mpc from the correct centre is the maximal radius within which to associate detections. The third dimension of linking length, redshift, was selected to be z − MF ± . h − Mpc and redshift difference of ±0.05. 3D-MF's multiple detection grouping criteria are summarised in Table 2. It is important to note that 3D-MF detections were associated to each other and the known cluster centres were not used in this step, because we wanted to develop a method that would transfer to real data where centres are always unknown.

The best match in a multiple detection grouping is chosen to be the one that is closest to a known simulation cluster centre. We point out to the reader that this choice is a difficult one, as there are known issues regarding centroiding; how one decides the true centre of a cluster is an interesting science topic to which many papers are devoted in their entirety. We will provide evidence in Section 4.3.2 that supports our choice of best match, and plan to investigate multiple matches and centroiding in future work. Our selection function plots are free of multiple detections not considered best matches, as we believe these are duplicate detections of single clusters.
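The grouping step described above can be sketched as a friends-of-friends linking of detections. This is only an illustration under stated assumptions: detections are taken to be already converted to projected coordinates in Mpc (the conversion from RA and Dec is not shown), the linking values are the 0.75 Mpc and ±0.05 quoted in the text, and the function name `group_detections` is hypothetical rather than part of 3D-MF.

```python
from itertools import combinations
import math

LINK_MPC = 0.75  # projected linking length quoted in the text
LINK_DZ = 0.05   # redshift linking tolerance quoted in the text


def group_detections(detections, link_mpc=LINK_MPC, link_dz=LINK_DZ):
    """Assign a grouping number to each detection.

    detections: list of (x_mpc, y_mpc, z) tuples, where x and y are assumed
    to be projected positions in Mpc. Detections linked directly or through
    a chain of neighbours receive the same grouping number.
    """
    parent = list(range(len(detections)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(detections)), 2):
        xi, yi, zi = detections[i]
        xj, yj, zj = detections[j]
        close_on_sky = math.hypot(xi - xj, yi - yj) <= link_mpc
        close_in_z = abs(zi - zj) <= link_dz
        if close_on_sky and close_in_z:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive grouping numbers starting at zero.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(detections))]


# Hypothetical detections: three chained neighbours, one isolated detection,
# and one detection close on the sky but far away in redshift.
detections = [(0.0, 0.0, 0.40), (0.5, 0.0, 0.42), (1.0, 0.0, 0.44),
              (5.0, 5.0, 0.40), (0.1, 0.0, 0.80)]
print(group_detections(detections))  # → [0, 0, 0, 1, 2]
```

Linking is transitive in this sketch, so the first and third detections share a group even though they are 1.0 Mpc apart; whether 3D-MF chains detections in this way is an assumption.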

False detections were those 3D-MF cluster detections that did not match to any known galaxy clusters after the above matching and grouping criteria were applied. As described, the cluster finding process is repeated with original and shifted slices. In order to ensure each false detection was not a 'faint' detection of a real structure (for example, to determine whether a false detection was actually found more significantly in a different redshift slice, perhaps due to a cutting of the cluster), and to merge original and shifted slices, the false detections from the original slices were cross-referenced in RA, Dec, and redshift with real cluster detections from overlapping shifted slices (and vice versa). The remaining false detections will be quantified and are likely due to noise fluctuations or residual projection effects.
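The original and shifted slices referred to above can be illustrated with a short sketch. It assumes the slice width of 0.2 and the half-width shift quoted elsewhere in this paper, and a redshift range of 0.2 to 1.0. The function names `make_slices` and `slices_containing`, and the choice of where the shifted set starts and ends, are hypothetical and not taken from the 3D-MF code.

```python
def make_slices(z_min=0.2, z_max=1.0, width=0.2):
    """Return (original, shifted) lists of (lo, hi) redshift slice edges.

    The shifted set is offset by half a slice width (an assumed layout),
    so neighbouring original slices are bridged by one shifted slice.
    """
    n = round((z_max - z_min) / width)
    original = [(round(z_min + i * width, 6), round(z_min + (i + 1) * width, 6))
                for i in range(n)]
    half = width / 2
    shifted = [(round(lo + half, 6), round(hi + half, 6))
               for lo, hi in original[:-1]]
    return original, shifted


def slices_containing(z, slices):
    """Indices of slices whose half-open interval [lo, hi) contains z."""
    return [i for i, (lo, hi) in enumerate(slices) if lo <= z < hi]


original, shifted = make_slices()
print(original)  # four slices between z = 0.2 and z = 1.0
print(shifted)   # three slices offset by 0.1
```

A detection flagged as false in one original slice could then be looked up in the shifted slices returned by `slices_containing` for its redshift, which is the kind of cross-reference the text describes.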

Photometric redshift errors were modelled for this work by examining the area of overlap between the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) Wide field data and the VIMOS VLT Deep Survey (VVDS; Le Fèvre et al. 2005) and the DEEP2 spectroscopic survey (Davis et al. 2007). Photometric redshifts were estimated for the CFHTLS Wide data from images provided by the CFHTLS-Archive-Research Survey (Erben et al. 2009) with the method described in Hildebrandt et al. (2009), and those photometric redshifts were then compared to secure spectroscopic redshifts. This process yields an accuracy of the photometric redshifts for galaxies with magnitudes i′ < 24 of σ_Δz/(1+z) ≈ 0.047 after rejection of 2.8% of outliers.

We confine the redshift range of galaxies to z . z photometric − z spectroscopic ) after rejection of outliers) for each bin was calculated and weighted by the bin number counts. A three-dimensional surface (magnitude, photometric redshift, and photometric redshift rms) was then found to be best fit with a second-degree polynomial: the rms error of any new magnitude and redshift can then be extrapolated from this model (including an extrapolation of errors for magnitudes such as i′ > 24, where deeper spectroscopic redshift catalogues do not currently exist to allow more accurate modelling of this high magnitude region). Using the Millennium Simulation mock lightcone data, a galaxy's magnitude and exact redshift were used in conjunction with the aforementioned rms model to calculate a typical observational photometric redshift error rms for that particular galaxy, from which a Gaussian distribution was created. The assumption of a Gaussian error distribution is supported by the comparisons between photometric redshifts and spectroscopic redshifts mentioned above (no significant secondary peak or skewness was found, as confirmed by the low number of outliers stated above and an overall vanishing photo-z bias respectively). Randomly assigning a photometric redshift error to a given galaxy was then accomplished by randomly drawing an error value from the Gaussian distribution with the rms modelled for that galaxy. The redshift distributions of mock lightcone galaxies before and after the redshift errors were applied are shown in Figure 3 (where the data have also been cut at a limiting i′-band magnitude of 25.5 to mimic real observational data, and are only shown out to a redshift of 1.1 although, as mentioned, the lightcones themselves are much deeper). The redshift error scales with redshift; selecting redshift slices of 0.2 in width ensures our redshift slicing is larger than the rms of ∼ .

Figure 3. Redshift distribution of the 6 Millennium Simulation mock lightcones (KW07) as exact redshifts (solid histograms) and with redshift errors modelled from the photometric redshift errors of Hildebrandt et al. (2009) (blue, dashed histograms). Errors were derived from comparisons between the CFHTLS with VVDS and DEEP2 overlap regions; an explanation of the error modelling can be found in the text (Section 4.2). The data have been cut at a limiting i′-band magnitude of 25.5 to mimic real, observationally limited data, and are only shown out to a redshift of 1.1 as explained in the text.

Investigating the use of various redshift slice widths with 3D-MF, as well as examining the effects of introducing redshift errors into the data and the corresponding recovery rates of galaxy clusters, allows us to fine-tune 3D-MF's parameters and select an optimal redshift slicing.

An examination of the completeness (the fraction of known clusters recovered by 3D-MF) as a function of mass and redshift slice width is essential to recover as many clusters as possible. Figure 4a presents a few variations of redshift slice width as evidence that an analysis of the mock lightcones with 3D-MF set to use redshift slice bin widths of 0.2 recovers the most known clusters with M > × M⊙. Repeating this analysis with photometric redshift errors shows that completeness tends to go up in almost all cases when running 3D-MF separately with various redshift slice widths (see Figure 4b), but the significance of detections against the background goes down (as will be shown).

Maintaining a constant redshift slice width, it is of further interest to note the effect on completeness when the limiting magnitude of the survey is changed. Figure 5 presents a constant redshift slice width of 0.2 with varying survey depths of 23.5 to 25.5 i′-band magnitudes.
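The assignment of photometric redshift errors to mock galaxies described in Section 4.2 can be sketched as follows. This is a minimal illustration, assuming the rms model is a second-degree polynomial in magnitude and redshift; the coefficient values in `COEFFS` and the function names are hypothetical, since the fitted surface itself is not given in the text.

```python
import random

# Hypothetical coefficients for a second-degree polynomial surface
# rms(m, z) = c0 + c1*m + c2*z + c3*m*m + c4*m*z + c5*z*z.
# The fitted values from the paper are not given in the text.
COEFFS = (0.02, 0.0, 0.03, 0.0001, 0.0, 0.0)


def photoz_rms(mag, z, coeffs=COEFFS):
    """Evaluate the assumed polynomial rms model at one magnitude and redshift."""
    c0, c1, c2, c3, c4, c5 = coeffs
    rms = c0 + c1 * mag + c2 * z + c3 * mag * mag + c4 * mag * z + c5 * z * z
    return max(rms, 0.0)  # guard against a negative rms when extrapolating


def scatter_redshift(mag, z_true, rng):
    """Return z_true plus a Gaussian error drawn with the modelled rms."""
    return z_true + rng.gauss(0.0, photoz_rms(mag, z_true))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
galaxies = [(22.5, 0.35), (24.0, 0.62), (25.2, 0.91)]  # (i'-band magnitude, exact z)
noisy = [scatter_redshift(m, z, rng) for m, z in galaxies]
```

Each mock galaxy receives one draw, so a faint galaxy with a larger modelled rms is more likely to be scattered into a neighbouring redshift slice than a bright one.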
As expected, the ability to find all clusters is significantly reduced as the survey depth decreases: fewer galaxies are present in shallower data, making their parent clusters improbable or impossible to optically detect. Interpretations associated with cluster masses below ∼ × M⊙, and the complications in this region, will be addressed throughout Section 4.3.

Figure 4. Completeness with respect to mass, measured as a fraction of known clusters found by 3D-MF. The input data was sliced into overlapping redshift slices as explained in the text. Overlapping slices are in each case shifted by half of the redshift slice width. Figure 4a presents exact redshifts while Figure 4b includes redshift errors in the data.

Figure 5. Similar to Figure 4b (completeness with respect to mass, measured as a fraction of known clusters found by 3D-MF), but here the width of the redshift slices is held fixed and the effect (on the fraction of galaxy clusters found) caused by changing the i′-band limiting magnitude of the survey is shown.

Figure 6. Comparing 3D-MF's real, true galaxy cluster detections with false detections as a function of detection significance (down-sloping lines represent no redshift errors, while up-sloping lines represent data with redshift errors).

Returning to a discussion of significance levels, ignoring any detections below the 3.5σ level is a sensible decision: the probability of a 3σ positive curvature peak from a background Gaussian field, due to chance, is 4% (see Van Waerbeke 2000). Effectively, desiring a contamination due to random fluctuations of less than 1%, we have to be more conservative and accept only those cluster detections above 3.5σ. Figure 6 investigates significance and clearly shows false detections were not found at a high level of significance when compared to known clusters accurately detected by 3D-MF. Low false detection significances are encouragingly still seen in Figure 6 with the introduction of photometric redshift errors (and Figure 6 shows, as expected, that redshift errors reduce detection significances in general). If we cut at a higher significance level, such as 6 . σ, we could remove > 95% of our false detections, but we would also lose many lower significance, real detections; a trade-off must be reached and thus 3.5σ was chosen.

Having determined appropriate parameter settings, we now present 3D-MF's selection functions. The cumulative mass function of galaxy clusters recovered by 3D-MF is presented in Figure 7. An examination of this figure shows the recovery rate of known clusters to be very accurate above an M of ∼ × M⊙; results that are encouraging for 3D-MF, and furthermore are consistent for all redshifts (see Figure 8). Taking the optimal redshift bin width of 0.2 (as per Section 4.2.1) and examining completeness with respect to redshift results in fractional recovery rates of known clusters per redshift that are constant in each mass range, with excellent recovery (i.e. > M > . × M⊙.

Figure 7. Cumulative mass function for galaxy clusters from the Millennium Simulation KW07 catalogue (as per Table 1; redshift errors described in the text).

Figure 8. Completeness with respect to redshift, measured as a fraction of known clusters found.

Completeness rates per mass, multiple detection rates, and false detection information are summarised in Table 3. 3D-MF has a 100% recovery for clusters with M > . × M⊙, 97% of clusters are found with masses above 2 . × M⊙, 88% recovery rates are seen for M > . × M⊙ as mentioned, and 72% of clusters are found in even lower mass ranges (numbers quoted pertain to the presence of photometric redshift errors).

For clusters with redshifts in the range 0 . z . M > . × M⊙, the galaxy cluster number density in the Millennium Simulation KW07 catalogues is ∼18 galaxy clusters per square degree; 3D-MF finds ∼ ∼16 galaxy clusters per square degree (using redshifts with errors) for this same redshift and mass range. Furthermore, for clusters with 1 . × M⊙ M < . × M⊙, the galaxy cluster number density in the Millennium Simulation KW07 catalogues is ∼258 galaxy clusters per square degree; using exact redshifts, 3D-MF finds ∼ ∼100 galaxy clusters per square degree.

Table 3. Completeness (percent of total known clusters found by 3D-MF) and multiple and false detection rates for 3D-MF on KW07 simulation galaxy clusters (photometric redshift errors described in the text). The Detected Galaxy Clusters section reports the number of clusters detected at least one time, the Multiple Detections section reports additional detections, and the False Detections section presents the fraction of total 3D-MF detections not matched to known clusters above 1 . × M⊙.

Detected Galaxy Clusters

Mass Range    Number of         No photometric redshift errors    With photometric redshift errors
              Known Clusters    Number    Completeness            Number    Completeness
> . × M⊙      16                16        100.0%                  16        100.0%
> . × M⊙      32                32        100.0%                  31        96.9%
> . × M⊙      208               197       94.7%                   184       88.5%
> . × M⊙      637               548       86.0%                   456       71.6%

Additional Multiple Detections

Mass Range    No photometric redshift errors      With photometric redshift errors
              Number    Percent of total          Number    Percent of total
                        Cluster Detections                  Cluster Detections
> . × M⊙      14        0.4%                      21        0.7%
> . × M⊙      28        0.8%                      39        1.4%
> . × M⊙      208       6.1%                      227       8.1%
> . × M⊙      504       14.8%                     439       15.6%

False Detections

> . × M⊙      383       11.2%                     438       15.6%

The multiple detection rate, measured as the fraction of total 3D-MF detections matched to known clusters that are already matched to a 3D-MF detection, for clusters with M > . × M⊙, is 36.7% when exact redshifts are used and 36.2% when redshift errors are added to the data. This multiple detection rate is what would be expected considering our redshift slices overlap by 0.1 in redshift: many clusters are included wholly in both original and shifted slices and, depending on their simulation redshift, are found by either preferentially matching our fiducial cluster (recall Section 3.1) in the original or shifted slices, or both.

False detection rates are reported as the fraction of total 3D-MF detections not matched to known clusters above . × M⊙; when simulations have photometric redshift errors added, the false detection rate increases by a mere 4.4 percentage points from the non-error photometric redshift case, to 15.6% false detections overall. Since photometric redshift errors scatter galaxies to random redshift slices they are not expected to induce increased clustering. False detections can be further analysed; they are shown in Figure 9 to be a fairly uniform function of redshift, with the exception of the redshifts at the edges of our redshift range ( z = 0 . with redshifts < . > . 0. This needs to be investigated further, but note that regardless of the redshift range chosen, there will always be potential contamination from galaxy members of simulation clusters whose centres lie outside the range of interest but whose presence is enough to cause a 3D-MF detection. Recall that we had reasons to select this redshift range: introducing redshift errors into the simulations was confined to a redshift region where photometric redshift errors could be accurately modelled (see Section 4.2).

Figure 9.

As previously discussed, 3D-MF measures the significance of each detection above the background galaxy distribution (Equation 5). False detections as a function of the detection significance are presented in Figure 6 with the significance levels at which known clusters were accurately detected: many more highly significant clusters were found which were real detections, rather than false detections. Introducing redshift errors does lower the significance of detections, but as Figure 4 shows, completeness remains high.

Figure 10. Significance of galaxy cluster detections in all 6 simulation lightcones. Only detections > 3.5σ are considered actual real detections. Known clusters are not considered below 1 . × M⊙ (a sensible mass cut, well below the divergence from expectation as shown in Figure 7). The redshift bins in the plot do not reflect redshift slicing, but rather are found clusters binned by their redshifts.

The significance of cluster detections (> 3.5σ) as a function of cluster mass is presented in Figure 10. Focusing on clusters with masses > M⊙ we see higher mass clusters being detected with higher significance, as one might expect. For lower mass clusters we find the detection significance is not correlated with mass, and we believe this is due to a number of reasons. We looked at the few cases of > σ detections matched with known clusters of masses < × M⊙ and found that while the closest known cluster to the detection was a low mass cluster, in the majority of cases galaxy interlopers from a nearby more massive, > M⊙, cluster were more likely responsible for the significance of the detection. In addition, as discussed in Section 4.1, 3D-MF is not fine-tuned for interpreting the detection of low mass clusters as the radial filter is currently much larger than the extent of these low mass clusters. 3D-MF assigns all galaxies within a 1 Mpc radius around a peak detection to a cluster; for low mass clusters, galaxies within this radius and within a redshift slice, but outside the low mass cluster halo, will add noise to a measurement of significance.
This is demonstrated in Figure 11, which presents the number of galaxies found by 3D-MF per known cluster as a function of cluster mass.

The left panel of Figure 11 (Figure 11a) shows the results for exact redshifts and the right panel (Figure 11b) shows the results when photometric redshift errors are included. For clusters with M > × M⊙, 3D-MF correctly measures the number of galaxies belonging to a cluster. However, for lower mass clusters we see the measured galaxy cluster member counts unable to go below ∼30 galaxies per cluster (blue crosses), which is the average number of galaxies that fit 3D-MF's radial profile within a 1 Mpc window. Reducing the cutoff radius, r_co, in Equation 3 would improve the association of correct galaxies to low mass clusters, but this would significantly reduce the high rate of success shown for higher mass cluster detections.

The introduction of redshift errors moves the points in Figure 11a leftward, meaning that as cluster members become scattered in redshift space, 3D-MF associates fewer of them to their parent cluster, as would be expected. Interestingly, for the lowest mass clusters, which are the most difficult to detect, this means galaxy number counts are less likely to be overestimated while remaining detectable with 3D-MF.

The number of galaxies per cluster, as determined by 3D-MF, is examined as a potential proxy for mass in Figure 12. Again focusing on the clusters with masses > M⊙ we see a weak redshift dependent trend: the most massive clusters have the most numerous members. For the low mass clusters however, owing to the incorrect assignment of too many galaxies to those clusters, we see no correlation.

We will investigate not having a clear proxy for mass via 3D-MF in future work. P96, D07 and O1-4 assume that all light after background subtraction follows the Schechter function and that this can be expressed as a luminosity equivalent to the number of L* galaxies (which they refer to as Λ_cl), with good results, but accurate galaxy number counts per cluster are less of an issue with the lower cluster densities and higher mass clusters their methods typically find.
We plan to further investigate our currently overestimated cluster galaxy members in the low mass ranges, by looking at galaxy number counts per cluster, and potentially optimising a variable cutoff radius in the 3D-MF search window.

Methods complementary to 3D-MF would work as mass estimators as well; weak gravitational lensing techniques, for example, can be utilised on real data to measure cluster masses (as in Hoekstra 2007). Alternatively, spectroscopic redshifts could be utilised to precisely identify galaxy cluster members and potentially lead to determining a better contamination fraction in the current relationship between 3D-MF galaxy cluster member number counts and cluster mass. X-ray information is known to be an excellent proxy for mass (Smith et al. 2005), and Sunyaev-Zel'dovich (SZ) information could be added to tell us more about the structure of clusters along the line-of-sight (as in Motl et al. 2005 and Sealfon et al. 2006, for example).

Figure 11. The number of galaxies found per known cluster as a function of cluster mass (the dashed horizontal line denotes the definition in Table 1 which requires clusters have >

Figure 12. The range of cluster masses and the number of galaxies associated to those cluster masses by 3D-MF.

As previously discussed, it is not an easy task to define the centre position of a cluster in an automated way. By construction, 3D-MF does not have a cluster centre predefined. Other cluster finding methods, such as those mentioned in Section 1, use the position of the brightest cluster galaxy member (Koester et al. 2007), or the centre of the distribution of red galaxies, for example. We choose the centre of a galaxy cluster to be based on the luminosity and radial distribution of cluster galaxies. As shown in Figure 11, for clusters with M > × M⊙, 3D-MF correctly associates galaxy members to their parent cluster, and we thereby expect the cluster centroid to be determined by this method with reasonable accuracy. For lower mass clusters however, owing to interloping galaxy members from nearby clusters (interlopers that fall within a radial profile centred near the lower mass cluster), the centroid that we measure is rather difficult to interpret.

Recall that a 3D-MF cluster detection within 0.75 Mpc of another simulation matched 3D-MF cluster detection (a 3D-MF detection that was chosen to be the match as it was closer to the known cluster centre) is considered a secondary detection. This further complicates the issue of centroiding, as it is not clear whether the 3D-MF centroid for clusters with secondary detections should be calculated from the galaxies associated to both the primary and secondary detection or just the primary, especially considering the fact that clusters often have substructures (and perhaps in reality don't have a clear centre). In the analysis that follows we choose the primary detection as the centroid.

Figure 13 shows the number of 3D-MF derived clusters matched to known simulation clusters as a function of the distance between their centres.
For all mass ranges we find there are fewer 3D-MF cluster centre positions with increased radius from known cluster centres, showing that a 3D-MF centroid measurement is frequently quite accurate. The elevated plateau at a matching radius > . × M⊙ M < . × M⊙ mass range, indicates some level of contamination in our low mass matched sample.

To investigate this, we randomised the cluster centroids found by 3D-MF in RA and Dec and repeated our matching to known simulation cluster centre positions. The average number of random matches for all cluster mass ranges, from 50 randomised trials, is presented in Figure 13 as the dotted line, which we call matching noise. Over 80% of the contribution to this matching noise comes from low mass clusters (1 . × M⊙ M < . × M⊙) and is thus a major source of the contamination in the low mass sample's plateau (red, triangle line) in Figure 13. This is what we would expect: low mass (and thus small) clusters contribute to the matching noise increasingly at larger radii from known KW07 centres since randomising their 3D-MF found positions should uniformly place their detections across the field, and with any size of matching window there will always be some of these low mass clusters (of high number density) matched to a known cluster.

Figure 13. Distance of 3D-MF derived cluster centres from predetermined, known simulation centres (as defined in Section 2.2). The dotted line shows results from 3D-MF derived clusters randomised in RA and Dec before matching to known simulation clusters according to our matching criteria: over 80% of the contribution here is from the lowest (i.e. 1 . × M⊙ M < . × M⊙) mass range (see text for further discussion).

The matching noise we have shown is an overestimate of the real matches due to noise because randomising the 3D-MF cluster positions in RA and Dec eliminates properly identified multiple detections of single clusters, thereby overestimating single detections and maximally increasing the matching due to chance incorrect matches. For example, a cluster that is detected by 3D-MF in one original redshift slice and also in the overlapping redshift slice would be considered a multiply detected cluster. After randomisation of the cluster positions, this would now appear as two separate cluster detections in the original and shifted redshift slices. To correct for this, we use our multiple detection rate of 36.2% and conservatively report a maximal upper limit on our false detections of 24.3% (i.e. [Total number of false detections + (1 − 0.362) × Total number counts due to an upper limit of matching noise] / Total 3D-MF detections). As discussed at the beginning of this section, we believe the false detection rate (especially for clusters of masses > . × M⊙) is closer to 15.6%, but for the higher number densities of lower mass clusters this could be up to 24.3%.

Due to the increased sensitivity of 3D-MF over two-dimensional matched-filter methods, we believe 3D-MF is correctly recognising the significance of low mass clusters above background noise, but due to excess galaxy number counts per cluster, and oversized fiducial cluster filters, has trouble correctly choosing a cluster centroid for these low mass clusters. We can safely confirm from this centroid analysis that we have chosen a physically motivated and sensible matching radius within which to properly match 3D-MF's detections with known simulation clusters. Interestingly, the two-dimensional matched-filter of White & Kochanek (2002) successfully implements an iterative centroiding technique: a logical next step for improving the current implementation of 3D-MF and its ability to determine cluster centres.

For the current work we use the four 1 × i′-band object detections. The seeing for this band varies between 0.71 and 0.82 arcseconds within the four fields and we reach a 1σ limiting magnitude of i′_AB,lim ≈ . BPZ (see Benítez 2000 and the prescription in Hildebrandt et al. 2009). In the fields D1, D2 and D3 we quantified the accuracy of our photometric redshifts with spectroscopic redshifts from the VVDS (Le Fèvre et al. 2005), zCOSMOS (Lilly et al. 2007), and DEEP2 (Davis et al. 2007) respectively. Within the magnitude range of 17 < i′-band < 24 we estimate unbiased photometric redshifts, and after rejecting problematic sources (we cut with the BPZ ODDS parameter for sources with ODDS > 0.9; see also Hildebrandt et al. 2009), we find a photometric - spectroscopic redshift scatter of σ ≈ . z = (z_phot − z_spec)/(1 + z_spec). Our outlier rate with |Δz| > 0.15 is 1 . ∼ . × galaxies per square degree with i′-band magnitudes < . z < . ∼ . × galaxies per square degree). M* and α were derived from the CFHTLS Deep fields and found to be M*_i′-band = 22 . ± . 15 and α = − . ± . ∼ − 24% false positives in this catalogue, distributed mostly in the lower mass ranges according to the selection functions in Section 4. Using our multiple detection criteria, we found 37.6% of Deep detections were duplicate detections of clusters (comparable to the ∼36% multiple detection rate found from our Millennium Simulation tests).

Table 4. CFHTLS Deep Galaxy Clusters

Deep     RA              Dec              Redshift    Significance    Grouping Identification Number
Field                                                 (σ)             Same Redshift    Overlapping Redshift
D1       02:25:07.536    -04:01:47.892    0.2         7.16            -                -
D1       02:24:26.040    -04:52:29.676    0.9         5.21            2                58
D2       09:58:54.720    01:54:57.744     0.4         5.35            -                -
D2       09:58:53.759    02:14:16.188     0.5         5.15            -                31
D3       14:17:22.802    52:54:52.920     0.2         7.27            -                -
D3       14:18:19.440    53:05:58.200     0.5         5.61            -                -
D4       22:14:39.120    -18:09:41.760    0.6         5.23            -                37
D4       22:16:08.400    -18:11:48.840    0.4         6.73            -                -
...complete catalogue available upon request...

Grouping Identification Numbers in Table 4 are numbered flags: cluster detections within 0.75 Mpc of each other (recall Section 4.1.2 for details) in the same redshift slice are flagged with identical Same Redshift numbers, and cluster detections within a projected 0.75 Mpc of each other in overlapping redshift slices are separately flagged with identical Overlapping Redshift grouping identification numbers (which can be propagated through more than one overlapping redshift slice if still within 0.75 Mpc in projected radii of each other in overlapping slices). Note that Grouping Identification Numbers in both the Same Redshift and Overlapping Redshift numbering restart at zero for each Deep field.

We use the significance of our detections to select the best galaxy cluster candidate from among multiple detections, and excise the remaining multiple detections from the following discussion and analysis. The redshift distribution of 3D-MF found Deep galaxy clusters is shown in Figure 14. A comparison to other published CFHTLS Deep cluster lists via older matched-filter methodology ensues.
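The selection of the best candidate from among multiple detections can be sketched as below. This is an illustrative sketch only: it assumes each detection carries a single combined grouping flag (`None` when ungrouped) and a significance value, whereas the catalogue in Table 4 carries separate Same Redshift and Overlapping Redshift flags that restart for each field. The function name `best_candidates` and the sample values are hypothetical.

```python
def best_candidates(detections):
    """Keep the most significant detection in each multiple-detection group.

    detections: list of dicts with keys 'id', 'group' (None when the
    detection is not part of any group) and 'sigma' (detection significance).
    """
    best = {}
    kept = []
    for det in detections:
        group = det["group"]
        if group is None:
            kept.append(det)  # ungrouped detections are always kept
        elif group not in best or det["sigma"] > best[group]["sigma"]:
            best[group] = det  # new most significant member of this group
    return kept + list(best.values())


# Hypothetical detections from one field.
sample = [
    {"id": "a", "group": None, "sigma": 7.2},
    {"id": "b", "group": 58, "sigma": 5.2},
    {"id": "c", "group": 58, "sigma": 6.1},
    {"id": "d", "group": 31, "sigma": 5.1},
]
print([det["id"] for det in best_candidates(sample)])  # → ['a', 'c', 'd']
```

Ties in significance keep the first detection encountered, a choice the text does not specify.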

Olsen et al. (2007) (herein O3) published a matched-ﬁlterderived cluster list of the CFHTLS Deep ﬁelds. A subsequentpaper in 2009 (Grove et al. (2009); O4) examined runningtheir matched-ﬁlter method (which does not utilise photo-metric redshifts) separately on diﬀerent wavelength bandsand then merging lists with the 2007 paper, a technique sim-ilar to P96’s eﬀorts to reduce false detection rates. Since theresulting lists are not too diﬀerent, we choose to compare the i ′ − band 2007 O3 cluster list to a cluster list found by 3D-MF, using the same i ′ − band data, in this paper. We focuson the 0 . z . . σ cut, for all four CFHTLS Deepﬁelds is shown overlayed on the 3D-MF results in Figure 14.Drastically fewer galaxy clusters were found by O3 at all red- Figure 14.

Redshift distribution of all 3D-MF found (dashedup-sloping histogram) and Olsen et al. (2007) found (shaded his-togram) galaxy clusters in CFHTLS Deep ﬁelds 1 through 4. shifts; 3D-MF ﬁnds ∼

170 galaxy clusters per square degree(well within reason compared to the Millennium Simulationanalysis cluster number densities in Section 4.3) comparedto ∼

40 galaxy clusters per square degree in O3.In order to match the clusters found using both methods, atwo-dimensional tolerance radius of 0.044 degrees (akin tothe matching strategy in Section 4.1.1) was placed aroundeach O3 cluster centre position and the closest 3D-MF de-tection in RA and Dec was considered a match. Figure 15shows the redshift distribution, and Figure 16 presents theredshift diﬀerences, of the two-dimensional matches betweenthe two cluster lists.As mentioned in Section 3.3.5, older matched-ﬁlter methodstend to wrongly estimate the redshift of clusters. O3searches the entire input catalogue (i.e. no redshift slicing)for a match to a ﬁducial cluster sized to match what wouldbe expected at a particular redshift, and then repeats theprocess with a slightly re-sized ﬁducial cluster size (tomatch what would be expected for a cluster that existed Milkeraitis et al.

Figure 15.

The redshift distribution of only those galaxy clustersfound using both

Figure 16.

Diﬀerences in assigned galaxy cluster redshifts forclusters found using both 3D-MF and O3 methods (using two-dimensional matching as described in Section 5.3) at a slightly diﬀerent redshift); O3 requires each cluster tobe detected in two neighbouring ﬁducial cluster re-sizingsearches in order to be considered further. This aﬀectshigher redshifts disproportionately, as they are less likelyto be found in equally-stepped ﬁducial cluster-resizingbins than lower redshift clusters; higher redshift space ofequivalent bin widths covers much more volume. Figure15 shows the likely result of this: the O3 cluster sample isskewed toward lower redshifts.A second separate matching was performed betweenthe 3D-MF Deep cluster list and O3; this time a three -dimensional tolerance range was considered around each O3cluster. There were less matches, as seen in Figure 17, again

Figure 17.

Redshift distribution of galaxy clusters found usingboth 3D-MF and O3 methods, matched to each other in three -dimensional space (see text for details). The vertical scale of Fig-ure 15 is duplicated here for comparison purposes.

Table 5.

A Comparison of CFHTLS Deep galaxy clusters foundwith 3D-MF and published CFHTLS Deep galaxy clusters fromO3 (Olsen et al. 2007). Cluster redshift ranges of 0 . z .

3D Cluster Comparison

skewed toward the lower redshift end of the distribution. Comparing both the 3D-MF and O3 CFHTLS Deep cluster lists, there were 145 clusters found by both 3D-MF and O3 out of 149 total O3 clusters (97%) with the two-dimensional matching criteria described. 528 additional galaxy clusters were found with 3D-MF. As mentioned, the matching process between 3D-MF clusters and O3 clusters was repeated considering a further third dimension in redshift space (see Table 5); in this case there were 111 clusters found by both 3D-MF and O3, suggesting 34 of the O3 clusters (23%) were not assigned the correct redshift by their algorithm.

The completeness has been shown to be vastly different between the two methods: 3D-MF is 100% complete down to ∼3 × 10^14 M⊙, ∼90% complete down to 1 × 10^14 M⊙ and the false detections are likely concentrated even further down the cluster mass function at the low mass end. It was discussed in general in Section 3.2 that older matched-filter methods have false detection rates of ∼ −

• the cutting of input data into overlapping redshift slices to examine it piecewise, significantly reducing line-of-sight projections;
• an implementation of mask-handling capabilities, improving edge-effects, and cluster detections near bright stars or saturated pixels;
• an accurate modelling of data-dependent background galaxy counts;
• the development of a new peak detection pipeline;

as well as a list of other parameters that can be fine-tuned according to the dataset being examined.

The Millennium Simulation mock catalogue lightcones were used to extensively test and improve 3D-MF, and selection functions for the algorithm were presented. Redshift errors mimicking real data were modelled and added to the simulations and their effect on the selection functions was derived. With redshift errors, and focusing on the cluster range 0.2 ≲ z ≲ 1.0, 3D-MF was found to recover 100% of known galaxy clusters with an M_200 > 3.0 × 10^14 M⊙; 97% of clusters with an M_200 > . × M⊙; 88% of clusters with an M_200 > 1.0 × 10^14 M⊙; and 72% of clusters with an M_200 > . × M⊙. 36% of detections were multiple detections of clusters. False detections were determined to be occurring at a rate of 15.6% of the total cluster detections, and a subsequent analysis showed this to be concentrated in the low significance, low galaxy number (≲ 50) per cluster, and likely lower mass (M < 5 × 10^13 M⊙) range (potentially increased by noise up to a conservatively reported rate of 24%).

After selection functions were quantified (and the effects of adding redshift errors to the catalogues were analysed), 3D-MF was run on the four CFHTLS Deep fields. 3D-MF finds ∼170 galaxy clusters per square degree in the Deep dataset: over 400% more, with a much lower false detection rate, and higher accuracy of redshift determination for true clusters, than found by other authors using two-dimensional matched-filter methods on the same i′-band data.

For future work, there are subtle adjustments to 3D-MF that we are examining. A non-passively evolving k-correction could be applied, taking into consideration the effect of variations in galaxy types. More interestingly perhaps, as shown in Popesso et al. (2005), galaxy clusters are often better fit with two Schechter functions as opposed to one: implementing this result may further improve 3D-MF. The radial profile used in 3D-MF's radial filter could also be tuned to try to find more low mass, smaller clusters (if possible). Different filter bands could be used in 3D-MF when analysing the redshift slices, and results compared with the i′-band used herein, or spectroscopic redshifts could be utilised to confirm cluster detections and precisely identify galaxy cluster members. It would be sensible to try to extend the high redshift range of the detection capabilities of 3D-MF as well, although this method is ultimately limited by the build up of the luminosity function at redshifts ≳ .

ACKNOWLEDGEMENTS

We thank the reviewer for useful comments and feedback. MM also thanks Gabriella De Lucia for helpful discussions on the Millennium Simulation. This work involves observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l'Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at TERAPIX and the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. HH was supported by the European DUEL RTN, project MRTN-CT-2006-036133. TE is supported by the German Ministry for Science and Education (BMBF) through DESY under the project 05AV5PDA/3 and the German Science Foundation (DFG) through the project TR33 'The Dark Universe'. The Millennium Simulation databases used in this paper and the web application providing online access to them were constructed as part of the activities of the German Astrophysical Virtual Observatory.

REFERENCES

Adami C., Durret F., Benoist C., et al., 2010, A&A, 509, 81
Battye R. A., Weller J., 2003, PRD, 68, 083506
Benítez N., 2000, ApJ, 536, 571
Benoist C., da Costa L., Jørgensen H. E., et al., 2002, A&A, 394, 1
Cohn J. D., Evrard A. E., White M., Croton D., Ellingson E., 2007, MNRAS, 382, 1738
Davis M., Guhathakurta P., Konidaris N. P., et al., 2007, ApJL, 660, L1
De Lucia G., Blaizot J., 2007, MNRAS, 375, 2
Dietrich J. P., Erben T., Lamer G., Schneider P., Schwope A., Hartlap J., Maturi M., 2007, A&A, 470, 821
Dressler A., 1980, ApJ, 236, 351
Erben T., Hildebrandt H., Lerchster M., et al., 2009, A&A, 493, 1197
Gilbank D. G., Bower R. G., Castander F. J., Ziegler B. L., 2004, MNRAS, 348, 551
Gladders M. D., Yee H. K. C., 2000, AJ, 120, 2148
Grove L. F., Benoist C., Martel F., 2009, A&A, 494, 845
Grove L. F., da Costa L., Benoist C., 2008, A&A, 490, 945
Hansen L., Olsen L. F., Jørgensen H. E., 2002, A&A, 388, 1
Hilbert S., White S. D. M., 2010, MNRAS, 404, 486
Hildebrandt H., Pielorz J., Erben T., van Waerbeke L., Simon P., Capak P., 2009, A&A, 498, 725
Hoekstra H., 2007, MNRAS, 379, 317
Kepner J., Fan X., Bahcall N., Gunn J., Lupton R., Xu G., 1999, ApJ, 517, 78
Kitzbichler M. G., White S. D. M., 2007, MNRAS, 376, 2
Kochanek C. S., White M., Huchra J., Macri L., Jarrett T. H., Schneider S. E., Mader J., 2003, ApJ, 585, 161
Kodama T., Tanaka I., Kajisawa M., Kurk J., Venemans B., De Breuck C., Vernet J., Lidman C., 2007, MNRAS, 377, 1717
Koester B. P., McKay T. A., Annis J., et al., 2007, ApJ, 660, 221
Le Fèvre O., Vettolani G., Garilli B., et al., 2005, A&A, 439, 845
Li I. H., Yee H. K. C., 2008, AJ, 135, 809
Lilly S. J., Le Fèvre O., Renzini A., et al., 2007, ApJS, 172, 70
Lopes P. A. A., de Carvalho R. R., Gal R. R., Djorgovski S. G., Odewahn S. C., Mahabal A. A., Brunner R. J., 2004, AJ, 128, 1017
Lu T., Gilbank D. G., Balogh M. L., Bognat A., 2009, MNRAS, 399, 1858
Menanteau F., Hughes J. P., Jimenez R., et al., 2009, ApJ, 698, 1221
Motl P. M., Hallman E. J., Burns J. O., Norman M. L., 2005, ApJL, 623, L63
Olsen L. F., Benoist C., Cappi A., et al., 2007, A&A, 461, 81
Olsen L. F., Benoist C., da Costa L., Hansen L., Jørgensen H. E., 2005, A&A, 435, 781
Olsen L. F., Hansen L., Jørgensen H. E., Benoist C., da Costa L., Scodeggio M., 2003, A&A, 409, 439
Olsen L. F., Scodeggio M., da Costa L., et al., 1999a, A&A, 345, 681
Olsen L. F., Scodeggio M., da Costa L., et al., 1999b, A&A, 345, 363
Olsen L. F., Zucca E., Bardelli S., Benoist C., da Costa L., Jørgensen H. E., Biviano A., Ramella M., 2005, A&A, 442, 841
Popesso P., Böhringer H., Romaniello M., Voges W., 2005, A&A, 433, 415
Postman M., Lubin L. M., Gunn J. E., Oke J. B., Hoessel J. G., Schneider D. P., Christensen J. A., 1996, AJ, 111, 615
Ramella M., Biviano A., Boschin W., et al., 2000, A&A, 360, 861
Sealfon C., Verde L., Jimenez R., 2006, ApJ, 649, 118
Smith G. P., Kneib J., Smail I., Mazzotta P., Ebeling H., Czoske O., 2005, MNRAS, 359, 417
Soneira R. M., Peebles P. J. E., 1978, AJ, 83, 845
Springel V., White S. D. M., Jenkins A., et al., 2005, Nat, 435, 629
Stark D. P., Ellis R. S., Richard J., Kneib J., Smith G. P., Santos M. R., 2007, ApJ, 663, 10
Thanjavur K., Willis J., Crampton D., 2009, ApJ, 706, 571
van Breukelen C., Clewley L., 2009, MNRAS, 395, 1845
Van Waerbeke L., 2000, MNRAS, 313, 524
White M., Kochanek C. S., 2002, ApJ, 574, 24