Galaxy and Mass Assembly (GAMA): The GAMA Galaxy Group Catalogue (G3Cv1)
A.S.G. Robotham, P. Norberg, S.P. Driver, I.K. Baldry, S.P. Bamford, A.M. Hopkins, J. Liske, J. Loveday, A. Merson, J.A. Peacock, S. Brough, E. Cameron, C.J. Conselice, S.M. Croom, C.S. Frenk, M. Gunawardhana, D.T. Hill, D.H. Jones, L.S. Kelvin, K. Kuijken, R.C. Nichol, H.R. Parkinson, K.A. Pimbblet, S. Phillipps, C.C. Popescu, M. Prescott, R.G. Sharp, W.J. Sutherland, E.N. Taylor, D. Thomas, R.J. Tuffs, E. van Kampen, D. Wijesinghe
Mon. Not. R. Astron. Soc., 1–31 (2011). Printed 22 October 2018 (MN LaTeX style file v2.2)
SUPA†, School of Physics & Astronomy, University of St Andrews, North Haugh, St Andrews, KY16 9SS, UK
SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK
ICRAR‡, The University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
Astrophysics Research Institute, Liverpool John Moores University, Egerton Wharf, Birkenhead, CH41 1LD, UK
Centre for Astronomy and Particle Theory, University of Nottingham, University Park, Nottingham NG7 2RD, UK
Australian Astronomical Observatory, PO Box 296, Epping, NSW 1710, Australia
European Southern Observatory, Karl-Schwarzschild-Str. 2, 85748 Garching, Germany
Astronomy Centre, University of Sussex, Falmer, Brighton BN1 9QH, UK
Institute for Computational Cosmology, Department of Physics, Durham University, South Road, Durham DH1 3LE, UK
Department of Physics, Swiss Federal Institute of Technology (ETH-Zürich), 8093 Zürich, Switzerland
Sydney Institute for Astronomy, School of Physics, University of Sydney, NSW 2006, Australia
School of Physics, Monash University, Clayton, Victoria 3800, Australia
Leiden University, P.O. Box 9500, 2300 RA Leiden, The Netherlands
Institute of Cosmology and Gravitation (ICG), University of Portsmouth, Dennis Sciama Building, Portsmouth PO1 3FX, UK
HH Wills Physics Laboratory, University of Bristol, Tyndall Avenue, Bristol, BS8 1TL, UK
Jeremiah Horrocks Institute, University of Central Lancashire, Preston PR1 2HE, UK
Research School of Astronomy & Astrophysics, Mount Stromlo Observatory, Cotter Road, Western Creek, ACT 2611, Australia
Astronomy Unit, Queen Mary University London, Mile End Rd, London E1 4NS, UK
Max Planck Institute for Nuclear Physics (MPIK), Saupfercheckweg 1, 69117 Heidelberg, Germany
ABSTRACT
Using the complete GAMA-I survey covering ∼142 deg$^2$ to $r_{AB} = 19.4$, of which ∼47 deg$^2$ is to $r_{AB} = 19.8$, we create the GAMA-I galaxy group catalogue (G3Cv1), generated using a friends-of-friends (FoF) based grouping algorithm. Our algorithm has been tested extensively on one family of mock GAMA lightcones, constructed from ΛCDM N-body simulations populated with semi-analytic galaxies. Recovered group properties are robust to the effects of interlopers and are median unbiased in the most important respects. G3Cv1 contains 14,388 galaxy groups (with multiplicity ≥ 2), and ∼40% of all galaxies are assigned to a group. The similarities of the mock group catalogues and G3Cv1 are multiple: global characteristics are in general well recovered. However, we do find a noticeable deficit in the number of high multiplicity groups in GAMA compared to the mocks. Additionally, despite exceptionally good local spatial completeness, G3Cv1 contains significantly fewer compact groups with 5 or more members, this effect becoming most evident for high multiplicity systems. These two differences are most likely due to limitations in the physics included in the current GAMA lightcone mock. Further studies using a variety of galaxy formation models are required to confirm their exact origin. The G3Cv1 catalogue will be made publicly available as and when the relevant GAMA redshifts are made available at .

Key words: cosmology – galaxies: environment – large scale structure

⋆ E-mail:[email protected]
† Scottish Universities Physics Alliance
A.S.G. Robotham et al.
Galaxy group and cluster catalogues have a long history in astronomy. Early attempts at creating associations of galaxies were quite qualitative in nature (e.g. Abell 1958; Zwicky et al. 1961), but more recently significant effort has been devoted to robustly detecting grouped structures (e.g. Huchra & Geller 1982; Moore, Frenk & White 1993; Eke et al. 2004; Gerke et al. 2005; Yang et al. 2005; Berlind et al. 2006; Brough et al. 2006; Knobel et al. 2009). The pioneering application of this process was by Huchra & Geller (1982), where the catalogue of De Vaucouleurs (1975), the earliest reasonably complete attempt at a group catalogue, was reconstructed using fully quantitative means, i.e. by a method that was reproducible and not subjective.

The power of group catalogues resides in their relation to the theoretically motivated dark matter haloes. ΛCDM, the currently favoured structure formation paradigm in the literature, makes very strong predictions about the self-similar hierarchical merging process that occurs between haloes of dark matter (Springel et al. 2005). Galaxy groups are the observable equivalent of dark matter haloes, and thus offer a direct insight into the physics that has occurred in the dark matter haloes in the Universe up to the present day. Beyond offering a route to studying dark matter dynamics (e.g. Plionis et al. 2006; Robotham et al. 2008), analysis of galaxy groups opens the way to understanding how galaxies populate haloes (e.g. Cooray & Sheth 2002; Yang et al. 2003; Cooray 2006; Robotham et al. 2006, 2010b).

The strongest differentials between competing physical models of dark matter are found at the extremes of the halo mass function (HMF), i.e. on cluster scales (e.g. Eke et al. 1996) and on low mass scales. The HMF describes the comoving number density distribution of dark matter haloes as a function of halo mass. Low mass groups are highly sensitive to the temperature of the CDM.
We either expect to see a continuation of the near power-law prediction for the HMF down to Local Group mass haloes (see Jenkins et al. 2001, and references therein) for a cold dark matter Universe, or, as the dark matter becomes warmer, the slope should become suppressed significantly.

The Galaxy and Mass Assembly project (GAMA) is a major new multi-wavelength spectroscopic galaxy survey (Driver et al. 2011). The final redshift survey will contain ∼[…] to $r_{AB} = 19.$[…] over ∼360 deg$^2$, with a survey design aimed at providing an exceptionally uniform spatial completeness (Robotham et al. 2010a; Baldry et al. 2010; Driver et al. 2011). One of the principal science goals of GAMA is to make a statistically significant analysis of low mass groups ($M$ […] $h^{-1}\,M_\odot$), helping to constrain the low mass regime of the dark matter HMF and galaxy formation efficiency in Local Group like haloes.

As well as allowing us to determine galaxy group dynamics and composition at the highest fidelity possible due to the increased redshift density, GAMA will also provide multi-band photometry spanning the UV (GALEX), visible (SDSS; VST-KIDS), near-IR (UKIDSS-LAS, VIKING), mid-IR (WISE), far-IR (ATLAS) and radio (GMRT, ASKAP). By combining a GAMA Galaxy Group Catalogue (G3C) constructed with spatially near-complete redshifts and 21 band photometry, the GAMA project is in a unique position to answer many of the most pressing questions that exist in extra-galactic astronomy today. Crucially, the interplay between Star Formation Rate (SFR), stellar mass, morphology, QSO activity and Star Formation Efficiency (SFE) with environment can be probed in unprecedented detail. The group catalogue presented here will also enable galaxy evolution to be investigated as a function of halo mass, rather than with coarse environmental markers, in statistically significant low mass regimes for the first time. This is a huge advance on the capabilities of current large spectroscopic surveys like SDSS and 2dFGRS that are almost single pass and hence suffer seriously from spectroscopic incompleteness in clustered regions. GAMA, by being at least 6 pass in every unit of sky, is exceptionally complete on all angular scales (Robotham et al. 2010a; Driver et al. 2011).

‡ International Centre for Radio Astronomy Research

The catalogue and group analyses presented here are based on the first three years of spectroscopic observations (February 2008 to May 2010) made at the Anglo-Australian Telescope (AAT). Within the GAMA project, this period is referred to as GAMA-I, since the deeper, larger area continuation of the GAMA survey is commonly referred to as GAMA-II.

The paper is organized as follows. […] § 3. Group properties (i.e. velocity dispersion, radius, dynamical mass and total luminosity) and their estimates are presented in § […] G3C and corresponding mock group catalogues. A few GAMA group examples are discussed in § 6, with conclusions presented in § 7. We assume throughout an $\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$, $H_0 = h\,100$ km s$^{-1}$ Mpc$^{-1}$ cosmological model, corresponding to the cosmology of the Millennium N-body simulation used to construct the GAMA lightcone mocks.

There are many subtle differences in the specific algorithm used to construct groups from spectroscopic surveys, but the major dichotomy occurs at the scale of association considered: galaxy-galaxy links or halo-galaxy links. Here we adopt galaxy-galaxy linking via a Friends-of-Friends (FoF) algorithm (§ […]).

A standard Friends-of-Friends algorithm creates links between galaxies based on their separation as a measure of the local density. In practice the projected and radial separations are treated separately, due to significant line-of-sight effects from peculiar velocities within groups and clusters. The comoving radial separations within a group appear larger than the projected ones, because radial distances inferred from galaxy redshifts contain peculiar velocity information along the line of sight on top of their underlying Hubble distance away from the observer. Fig. 1 shows schematically how the radial and projected separations are used to detect a group. This shows that neither the radial nor the projected separation provides enough information to unambiguously detect a group, but their combination generates a secure grouping.

Figure 1. Schematic of the two step process used when associating galaxies via a FoF algorithm on redshift survey data. The same set of galaxies is shown in two panels: along the line of sight (left) and projected on the sky (right). Both the radial and projected separations are used to disentangle projection effects and recover the underlying group (galaxies 1, 5 and 6 in this example). The radial linking length has to be significantly larger than the projected one to properly account for peculiar velocities along the line of sight.
In its simplest form we can say that two galaxies are associated in projection when the following condition is met:

$$\tan[\theta_{1,2}]\,(D_{\rm com,1} + D_{\rm com,2})/2 \le b_{i,j}\,(D_{\rm lim,1} + D_{\rm lim,2})/2, \qquad (1)$$

where $\theta_{1,2}$ is the angular separation of the two galaxies, $D_{{\rm com},i}$ is the radial distance in comoving coordinates to galaxy $i$, $b_{i,j}$ the mean required linking overdensity and $D_{{\rm lim},i}$ is the mean comoving inter-galaxy separation at the position of galaxy $i$, here defined as

$$D_{{\rm lim},i} = \left[\int_{-\infty}^{M_{{\rm lim},i}} \phi(M)\,dM\right]^{-1/3}, \qquad (2)$$

where $M_{{\rm lim},i}$ is the effective absolute magnitude limit of the survey at the position of galaxy $i$, and $\phi(M)$ the survey galaxy luminosity function (LF).

$b$ is used to specify the overdensity with respect to the mean required to define a group. The approximate overdensity contour that this linking would recover in a simulation (Universe) with equal mass particles (galaxies) is given by $\rho/\bar{\rho} \sim 3/(2\pi b^3)$ (Cole et al. 1996). For a uniform spherical distribution of mass the virial radius corresponds to a mean overdensity of 178, hence the popularity of masses defined as being within 178 and 200 times the mean overdensity. For an NFW type profile (Navarro, Frenk & White 1996) the overdensity within the virial radius is approximately $178/3 \simeq 60$, which corresponds to $b \simeq 0.2$, i.e. a number overdensity of $1/b^3 = 125$ between galaxies. Linking together 1000s of dark matter particles in a simulation with real-space coordinates is a relatively simple and robust process; extending this methodology to redshift-space using galaxies that trace the dark matter is non-trivial. Consequently, it is not simply true to state that $b = 0.2$ […] $b$ will be recovered from careful application to mock catalogues (see below for full details). Since there are subtle effects that vary the precise $b$ used on a galaxy by galaxy basis, the $b_{i,j}$ used above is the mean $b$ for galaxies $i$ and $j$.
In general, for nearby galaxies, $b$ does not vary significantly.

To this standard form of the mean comoving inter-galaxy separation at the position of galaxy $i$, we introduce an extra term, with Eq. 2 thus becoming:

$$D_{{\rm lim},i} = \left(\frac{\phi(M_{{\rm lim},i})}{\phi(M_{{\rm gal},i})}\right)^{\nu/3} \left[\int_{-\infty}^{M_{{\rm lim},i}} \phi(M)\,dM\right]^{-1/3}, \qquad (3)$$

where $M_{{\rm gal},i}$ is the absolute magnitude of galaxy $i$. This extra term, $(\phi(M_{{\rm lim},i})/\phi(M_{{\rm gal},i}))^{\nu/3}$, allows for larger linking distances for intrinsically brighter galaxies, provided $\nu > 0$. $\nu$ allows the algorithm to be more or less sensitive to the intrinsic brightness of a galaxy, and can be thought of as a softening power. The principle behind introducing this term is that associations should be more significant between brighter galaxies, and tests on mocks show that this generates notably better quality group catalogues as determined from the cost function (see § […]).

With Eq. 1 we have established an association in projection, but we also require that a given pair of galaxies are associated along the line of sight, or radially, i.e.:

$$|D_{\rm com,1} - D_{\rm com,2}| \le b\,R\,(D_{\rm lim,1} + D_{\rm lim,2})/2, \qquad (4)$$

where $b$ is the linking length of Eq. 1, $D_{{\rm lim},i}$ is given by Eq. 3 and $R$ is the radial expansion factor to account for peculiar motions of galaxies within groups. With a redshift survey, the measured redshift contains both information on the Hubble flow redshift and any galaxy peculiar velocity along the line of sight.
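The two linking conditions (Eqs. 1 and 4) and the transitive "friends of friends" grouping can be illustrated with a minimal Python sketch. It is not the GAMA implementation: the dictionary keys, the function names, the brute-force pair loop, and the values of `b`, `R` and `d_lim` are all assumptions chosen for illustration, and $D_{\rm lim}$ is supplied directly rather than computed from a luminosity function.

```python
import math

def angular_separation(ra1, dec1, ra2, dec2):
    """Angular separation in radians between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))

def is_linked(g1, g2, b, R):
    """True if a galaxy pair satisfies both the projected and radial conditions."""
    theta = angular_separation(g1["ra"], g1["dec"], g2["ra"], g2["dec"])
    mean_d_lim = 0.5 * (g1["d_lim"] + g2["d_lim"])
    projected = math.tan(theta) * 0.5 * (g1["d_com"] + g2["d_com"])  # left side of Eq. 1
    radial = abs(g1["d_com"] - g2["d_com"])                          # left side of Eq. 4
    return projected <= b * mean_d_lim and radial <= b * R * mean_d_lim

def friends_of_friends(galaxies, b, R):
    """Group galaxy indices by transitive linking, using a simple union-find."""
    parent = list(range(len(galaxies)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(galaxies)):
        for j in range(i + 1, len(galaxies)):
            if is_linked(galaxies[i], galaxies[j], b, R):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical galaxies: distances in Mpc/h, angles in degrees.
galaxies = [
    {"ra": 180.00, "dec": 0.0, "d_com": 300.0, "d_lim": 5.0},
    {"ra": 180.05, "dec": 0.0, "d_com": 302.0, "d_lim": 5.0},
    {"ra": 180.10, "dec": 0.0, "d_com": 304.0, "d_lim": 5.0},
    {"ra": 180.02, "dec": 0.0, "d_com": 330.0, "d_lim": 5.0},  # close on sky, far in depth
]
print(friends_of_friends(galaxies, b=0.1, R=10.0))  # -> [[0, 1, 2], [3]]
```

In this toy case galaxies 0 and 2 fail the projected condition directly but end up in the same group through their common link to galaxy 1, while galaxy 3 is close in projection but fails the radial condition.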
To construct a group catalogue we link together all associations that meet our criteria given by Eq. 1 and Eq. 4. Galaxies that are not directly linked to each other can still be grouped together by virtue of common links between them. All possible groups are constructed in precisely this manner, leaving either completely ungrouped galaxies or galaxies in groups with 2 or more members.

Despite its apparent simplicity, a FoF algorithm is still a very parametric approach to grouping. On top of the assumed cosmology, it requires the survey selection function, and values for the linking parameters $b$ and $R$. The galaxy LF can be directly estimated from the data (e.g. Loveday et al. 1995; Norberg et al. 2002; Blanton et al. 2003), while the linking parameters cannot. Instead they are commonly determined from either analytic calculations or analyses of N-body simulations populated with galaxies, with the latter approach taken here (see § […]).

[…] $b$ and $R$ is less than optimal for accurately reconstructing groups in the mock data. An obvious shortcoming is that galaxies in clusters are significantly spread out along the line of sight, due to their large peculiar velocities, a result of being bound to massive structures. To account for this we introduce a local environment measure that calculates the density contrast of a cylinder that is centred on the galaxy of interest. Similar to the approach of Eke et al. (2004), we allow the $b$ and $R$ parameters to scale as a function of the observed density contrast, leading to position ($\mathbf{r}$) and faint magnitude limit ($m_{\rm lim}$) dependent linking parameters:

$$b(\mathbf{r}, m_{\rm lim}) = b_0 \left(\frac{1}{\Delta}\,\frac{\rho_{\rm emp}(\mathbf{r}, m_{\rm lim})}{\bar{\rho}(\mathbf{r}, m_{\rm lim})}\right)^{E_b} \qquad (5)$$

$$R(\mathbf{r}, m_{\rm lim}) = R_0 \left(\frac{1}{\Delta}\,\frac{\rho_{\rm emp}(\mathbf{r}, m_{\rm lim})}{\bar{\rho}(\mathbf{r}, m_{\rm lim})}\right)^{E_R} \qquad (6)$$

where $\bar{\rho}$ is the average local density implied by the selection function, $\rho_{\rm emp}$ is the empirically estimated density, $m_{\rm lim}$ the apparent magnitude limit at position $\mathbf{r}$ and $\Delta$ is the density contrast, an additional free parameter together with $E_b$ and $E_R$. For this work $\bar{\rho}$ is estimated from the galaxy selection function at $\mathbf{r}$ (i.e. it varies with the GAMA survey depth). $\rho_{\rm emp}$ is calculated directly from the number density within a comoving cylinder centred on $\mathbf{r}$ and of projected radius $r_\Delta$ and radial extent $l_\Delta$. $\Delta$ determines the transition between where the power scaling reduces or increases the linking lengths, so a galaxy within a local volume precisely $\Delta$ times overdense will not have its links altered. The exact values for $E_b$, $E_R$ and $\Delta$ are determined from the joint optimisation of the group cost function (see § […]).

[…] $b_0$, $R_0$, $\Delta$, $r_\Delta$, $l_\Delta$, $E_b$, $E_R$ and $\nu$. Whilst these are many parameters, $b_0$ and $R_0$ are the dominant ones for the grouping, the latter 6 merely determining how best to modify the linking locally, and typically introducing minor perturbations to the grouping.

Since the GAMA survey is highly complete (∼98% within the $r$-band limits) the effect of incompleteness is minor, and tests on the mocks indicate the final catalogues are extremely similar regardless of whether the linking length is adjusted based on the local completeness. A number of definitions of local completeness were investigated: completeness within a pixel on a mask, completeness on a fixed angular top-hat scale around each galaxy and a completeness window function that represents the physical scale of a group on the sky. The difference between each was quite minor, but defining completeness on a physical scale produced marginally better grouping costs (§ […]). […] $b$ at position $\mathbf{r}$ is given by:

$$b_{\rm comp}(\mathbf{r}, m_{\rm lim}) = \frac{b(\mathbf{r}, m_{\rm lim})}{c(\mathbf{r})^{1/3}}, \qquad (7)$$

where $c(\mathbf{r})$ is the redshift completeness within a projected comoving radius of 1.[…] $h^{-1}$ Mpc centred on $\mathbf{r}$. The effect is to slightly increase the linking length to account for the loss of (possible) nearby galaxies that it could otherwise be linked with. Since GAMA was designed to be extremely complete even at small angular scales (Robotham et al. 2010a), the mean modifications are less than 1%.

Extensive details of the GAMA survey characteristics are given in Driver et al. (2011), with the survey input catalogue described in Baldry et al. (2010) and the spectroscopic tiling algorithm in Robotham et al. (2010a).

Briefly, the GAMA-I survey covers three regions each 12 × 4 deg […] ∼96 deg$^2$ to $r_{AB} = 19.4$ and ∼47 deg$^2$ to $r_{AB} = 19.8$. All regions are more than 98% complete (see Driver et al. 2011, for precise completeness details), with special emphasis on a high close pair completeness, which is greater than 95% for all galaxies with up to 5 neighbours within 40″ of them (see Fig. 19 of Driver et al. 2011). Despite this high global redshift completeness, we still apply completeness corrections to the FoF algorithm (as described in § […]). […] ∼50 km s$^{-1}$ (Driver et al. 2011), slightly larger than the nominal SDSS velocity uncertainties of ∼35 km s$^{-1}$ but significantly better than the typical ∼80 km s$^{-1}$ associated with 2dFGRS redshifts (Colless et al. 2001).

Footnote: See Baldry et al. (2010) for additional GAMA-I selections.

For this study, we use a global GAMA $(k+e)$-correction, of the form:

$$(k+e)(z) = \sum_{i=0}^{N} a_i(z_{\rm ref}, z_p)\,(z - z_p)^i + Q_{z_{\rm ref}}\,(z - z_{\rm ref}), \qquad (8)$$

where $z_{\rm ref}$ is the reference redshift to which all galaxies are $(k+e)$-corrected, $Q_{z_{\rm ref}}$ is a single luminosity evolution parameter (as in e.g. Lin et al. 1999), $z_p$ is a reference redshift for the polynomial fit to the median KCORRECT-v4.2 k-correction (Blanton & Roweis 2007) of GAMA-I galaxies, and $a_i(z_{\rm ref}, z_p)$ the coefficients of that polynomial fit. The present study uses $z_{\rm ref} = 0$, $Q = 1.75$, $z_p = 0.$[…] and $N = 4$, with $a = $ […]. […] $Q = 1.75$ is not essential, as our estimate of the luminosity function accounts for any residual redshift evolution.

Once the global $(k+e)$-correction has been defined, it is straightforward to estimate the redshift dependent galaxy luminosity function using a non-parametric estimator like the Step-Wise Maximum Likelihood (SWML) of Efstathiou et al. (1988). We perform this analysis in five disjoint redshift bins, which are all correlated through the global normalisation constraint. This is set by the cumulative number counts at $r_{AB} = 19.$[…] (∼[…]), as estimated directly from the full GAMA survey and compared to ∼[…] of the SDSS DR6 survey (to account for possible sample variance issues). This LF estimate is used both to describe the survey selection function (as required by Eqs. 1–6) and to adjust the galaxy magnitudes in the GAMA mock catalogues (see § […]) […] $\phi_{\rm GAMA}$.

To appropriately test the quality and understand the intrinsic limitations of a given group finder it is essential to test it thoroughly on a series of realistic mock galaxy catalogues, for which the true grouping is known. Those tests should include all the limitations of the real spectroscopic survey, e.g. spectroscopic incompleteness, redshift uncertainties, varying magnitude limits, etc.

In this first paper on GAMA groups, we limit our tests of the group finding algorithm to one single type of mock galaxy catalogue, constructed from the Millennium dark matter simulation (Springel et al. 2005), populated with galaxies using the GALFORM (Bower et al. 2006) semi-analytic galaxy formation recipe. The galaxy positions are interpolated between the Millennium snapshots to best mimic the effect of a proper lightcone output, enabling the mocks to include the evolution of the underlying dark matter structures along the line of sight, key for a survey of the depth of GAMA that spans ∼[…]

[…] $r$-band filter magnitudes modified to give a perfect match to the redshift dependent GAMA luminosity and selection function (see § […])

[…] $r_{AB}$ […] $h \simeq -$[…].05. This limit is faint enough to not attempt to address this issue in this first generation of GAMA mocks.

7) the halo definition used in these mocks corresponds to the standard halo definition of GALFORM (Cole et al. 2000; Bower et al. 2006; Benson & Bower 2010), i.e. DHalo (Helly et al. 2003), as listed in the Millennium GAVO database. DHalo is a collection of SubFind subhaloes (Springel et al. 2001) grouped together to make a halo. The differences between DHalo and FoFHalo are subtle. A preliminary analysis on a small fraction of the mock data shows that the log ratio of the DHalo and FoFHalo masses is median unbiased, and exhibits a 1-σ scatter of 0.05 dex. The 10% population that exhibits the largest mass mismatch is still median unbiased (i.e. it will not affect the median relationship between the FoF masses we measure and the intrinsic dark matter mass of the halo), but can scatter more than 1 dex away from the median. Because the two halo mass definitions are not biased w.r.t. each other, the DHalo mass can be used safely in this paper as a halo mass definition.

8) the most luminous galaxy of a halo is nearly always at its centre and at rest w.r.t. the dark matter halo.

These mocks are a subset of the first generation of wide and deep mock galaxy catalogues for the Pan-STARRS PS1 survey. Further details on their construction are given in Merson et al. (in prep.).

Footnote: FoFHalos are identified with a linking length of $b = 0.2$.
The minimisation or maximisation of non-analytic functions that depend on multiple parameters is an intense research area in statistics and computational science. When the dimensionality of the dataset is low, typically 2–4 dimensions, it is straightforward to completely map out the whole parameter space on a grid. However, when the number of parameters is large (e.g. up to 8 for our FoF algorithm) such a computationally intensive approach is not ideal, especially if each set of parameter values requires a series of complex calculations. For our data size and the problem considered, each complete grouping takes tens of seconds, and the full parameter space is not necessarily obvious to define. Hence we use the Nelder-Mead optimisation technique (i.e. downhill simplex, see Nelder & Mead 1965), which allows maxima (or minima) to be investigated for non-differentiable functions. The onus is still on the user to choose an appropriate function to maximise. For this work we desire a high group detection rate with a low interloper fraction in each group, and these are the criteria that define the cost function to be minimised.
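As an illustration of the downhill simplex idea, the following is a minimal pure-Python sketch of a simplified Nelder-Mead loop (reflection, expansion, inside contraction and shrink only). It is a teaching sketch under stated assumptions, not the optimiser used for the catalogue: `toy_cost` is a hypothetical smooth stand-in for a grouping cost, and the step size, tolerance and coefficients are arbitrary choices.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-12, max_iter=5000):
    """Minimise f with a basic downhill simplex (no derivatives required)."""
    n = len(x0)
    # Initial simplex: the start point plus one offset point per dimension.
    simplex = [list(x0)] + [
        [x + (step if k == i else 0.0) for k, x in enumerate(x0)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if f(worst) - f(best) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[k] for p in simplex[:-1]) / n for k in range(n)]

        def along(t):
            # Point on the line through the centroid, directed away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the current best vertex.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
    return min(simplex, key=f)

def toy_cost(p):
    """Hypothetical cost surface with its minimum at (1.0, -0.5)."""
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 0.5) ** 2

solution = nelder_mead(toy_cost, [0.0, 0.0])
print(solution)
```

In a real application the function handed to the optimiser would run the full grouping for a given parameter set and return the negative of the quantity to be maximised, so each evaluation is expensive and the number of evaluations matters far more than in this toy case.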
One of the defining characteristics of how we decide to determine grouping quality is that the statistics measured should be two-way (bijective). By this we mean that the group catalogue made with this algorithm is an accurate representation of the mock group catalogue, and vice versa. This is an important distinction since it is possible for the group catalogue to perfectly recover every mock group, but for these to be the minority of the final catalogue, i.e. most of the groups are spurious. This has a serious effect on almost any science goal involving use of the GAMA groups since any given group would be more likely to be false than real: follow up proposals making use of the groups would be highly inefficient, and any science involving the stacking of detections of multiple groups (X-ray, HI) would be hard to achieve.

With this two-way nature of defining grouping quality in mind, there are two global measures that can be ascertained: how well the groups are recovered, and how well the galaxies within them are recovered. To retrieve a group accurately we require the joint galaxy population of the FoF groups and mock haloes to include more than 50% of their respective group members. This is called a bijective match, and it ensures that there is no ambiguity when we associate groups together: it is impossible for a group to bijectively match more than one group. To turn this into a global grouping efficiency statistic we define the following quantities:

$$E_{\rm FoF} = \frac{Ng_{\rm bij}}{Ng_{\rm FoF}} \qquad (9)$$

$$E_{\rm mock} = \frac{Ng_{\rm bij}}{Ng_{\rm mock}} \qquad (10)$$

$$E_{\rm tot} = E_{\rm FoF}\,E_{\rm mock} \qquad (11)$$

where $Ng_{\rm bij}$, $Ng_{\rm FoF}$ and $Ng_{\rm mock}$ are the number of bijective, FoF and mock groups respectively. $E_{\rm tot}$ is the global halo finding efficiency measurement (or purity product) we want to use in our maximisation statistic, and will be 1 if all groups are bijectively found, and 0 if no groups are determined bijectively.

The second measure of group quality determines how significantly matched individual groups are; in effect it determines the 'purity' of the matching groups.
The best two-way matching group is the one which has the largest product for the relative membership fractions between the FoF and mock group. Take for example a FoF group with 5 members where 3 of these galaxies are shared with a mock group that has 9 members and the other 2 are shared with a mock group that has 3 members. In this case the two possible purity products are $\frac{3}{5} \times \frac{3}{9} = \frac{9}{45} = 0.2$ and $\frac{2}{5} \times \frac{2}{3} = \frac{4}{15} \sim 0.27$, so the second mock group is the best match […]

$$Q_{\rm FoF} = \frac{\sum_{i=1}^{Ng_{\rm FoF}} P_{\rm FoF}[i]\,Nm_{\rm FoF}[i]}{\sum_{i=1}^{Ng_{\rm FoF}} Nm_{\rm FoF}[i]} \qquad (12)$$

$$Q_{\rm mock} = \frac{\sum_{i=1}^{Ng_{\rm mock}} P_{\rm mock}[i]\,Nm_{\rm mock}[i]}{\sum_{i=1}^{Ng_{\rm mock}} Nm_{\rm mock}[i]} \qquad (13)$$

$$Q_{\rm tot} = Q_{\rm FoF}\,Q_{\rm mock} \qquad (14)$$

where $Nm_{\rm FoF}[i]$ and $Nm_{\rm mock}[i]$ are the number of group members in the $i$th FoF and mock group respectively. $P_{\rm FoF}[i]$ and $P_{\rm mock}[i]$ are the purity products of the $i$th best matching FoF and mock group respectively. In the example above $P_{\rm FoF} \sim 0.27$ and $Nm_{\rm FoF} = 5$. If a halo is perfectly recovered between the FoF and mock then $P_{\rm FoF}$ and $P_{\rm mock}$ both equal 1 for that matching halo. $Q_{\rm tot}$ is the global grouping purity we want to use in our maximisation statistic, and will be 1 if all groups are found perfectly in the FoF catalogue. The lower limit must be more than 0 (since it is always possible to break a catalogue with $N_{\rm gal}$ galaxies into a catalogue of $N_{\rm gal}$ groups), and at worst $Q_{\rm tot} = Ng$[…]$/N$[…].

Using $E_{\rm tot}$ and $Q_{\rm tot}$ we can now calculate our final summary statistic:

$$S_{\rm tot} = E_{\rm tot}\,Q_{\rm tot}, \qquad (15)$$

where $S_{\rm tot}$ will span the range 0–1.

Whilst it is possible to optimise the set of grouping parameters such that the absolute maximum value for $S_{\rm tot}$ is obtained, in practice some of the parameters barely affect the returned group catalogue as long as sensible values are chosen. For FoF group finding, $\Delta$, $r_\Delta$ and $l_\Delta$ have a weak effect on the final grouping, and fixing them at 9, 1.5 $h^{-1}$ Mpc and 12 proved to be almost as effective as allowing them to be freely optimised. For expediency they were fixed after this initial determination. The other 5 FoF parameters do require optimisation; in descending order of importance they are: $b$, $R$, $E_b$, $E_R$ and $\nu$.

As well as choosing the set of parameters to adjust, the set of groups chosen as the basis of optimisation must be considered carefully. The optimisation strategy has to be defined depending on the desired goals. Most further studies will make use of the largest and best fidelity groups, and these groups suffer disproportionately if the optimisation is carried out using smaller systems and then applied to all of the mock data. Because of this only groups with 5 or more members were used to determine the appropriate combination of parameters.
Part of the justification for this is that 5 or more members are required to make a meaningful estimate of the dynamical velocity dispersion ($\sigma_{\rm FoF}$) and 50th percentile radius (Rad$_{50-{\rm group}}$).

To optimise the overall grouping to maximise the output of $S_{\rm tot}$ we used a standard Nelder-Mead (Nelder & Mead 1965) approach, using the optim function available in the R programming environment. We simultaneously attempted to find the optimal combination of the 5 specified parameters for all 9 mock GAMA volumes, a process that took ∼[…]

[…] $E_b$ and $E_R$ are so close to zero that their effect is completely negligible. Interestingly, if we instead attempt the same optimisation problem but remove $\nu$, these parameters become more significant, but the final cost for the optimisation is not as good. This means the 3 parameters adapt in a degenerate manner, but the luminosity based adaptation is the most successful, and the parameter most fundamentally related to optimal galaxy groups. The GAMA galaxy group catalogue will still use all 5 parameters as specified, but we note that in future extensions to this work $E_b$ and $E_R$ may be removed.

It is clear that the chosen set of parameters produces very similar final $S_{\rm tot}$ for all depths (∼0.[…]). […] $E_{\rm FoF}$, $E_{\rm mock}$, $Q_{\rm FoF}$ and $Q_{\rm mock}$ are all ∼0.[…] […] $N_{\rm FoF} >$ […] $E_{\rm tot} = 0.$[…] […] $Q_{\rm tot} = 0.53$. The contribution to the overall cost is also slightly asymmetric from the mock and FoF components: $E_{\rm mock} = 0.$[…], $E_{\rm FoF} = 0.$[…], $Q_{\rm mock} = 0.73$ and $Q_{\rm FoF} = 0.$[…]. […] $S_{\rm tot}$ is 0.65, and from the FoF groups it is 0.62. These numbers indicate that the FoF algorithm must recover, on average, more groups than actually exist in the mock data. Also, the FoF algorithm is slightly better at constructing the groups it finds than it is at recovering haloes from the data. These statistics mean that the most successful algorithm is necessarily a conservative one where real haloes are robustly and unambiguously detected, and interloper rates are kept low in these systems. This is required since it is very easy to create spurious group detections once the grouping is more generous.

To assess how sensitive the best parameters found are to perturbations in the volume investigated (sample variance) we made optimisations for each of the 9 GAMA mock volumes. The distribution of the parameters gives us an indication of both how well constrained they are, and how degenerate they are with respect to the other parameters.

A principal component analysis (PCA) of the outcome for 5 parameters optimised to 9 volumes suggests nearly all the parameter variance is explained with just two principal components. The most significant parameters are $b$ and $\nu$, and these are anti-correlated. $R$ is the only other significant parameter that contributes to component 1, and this is closely anti-correlated with $b$. $E_b$ and $E_R$ dominate the second component, and they are strongly anti-correlated.

Table 2 shows the 1-σ spread in optimal parameter values obtained, and gives an indication of how stable our parameters are to the sample selection. The only surprising fact is that $E_R$ is prone to vary quite a large amount depending on the volume; however, this is precisely because it has the least influence on the quality of any grouping outcome, and hence a large change can cause minor improvements in the grouping. $b$ is extremely well constrained, which is important to know since it is comfortably the most significant parameter for any FoF grouping algorithm.
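The grouping-quality statistics that drive this optimisation (the bijective efficiency $E_{\rm tot}$, the purity $Q_{\rm tot}$ and their product $S_{\rm tot}$) can be sketched in a few lines of Python. This is one reading of the definitions under stated assumptions: groups are represented as sets of galaxy identifiers, each group's purity product is taken as the maximum over all groups in the other catalogue, and the function names are hypothetical.

```python
def best_purity_products(groups, others):
    """For each group, the largest product of shared-member fractions with any other group."""
    best = []
    for g in groups:
        products = [(len(g & o) / len(g)) * (len(g & o) / len(o)) for o in others]
        best.append(max(products, default=0.0))
    return best

def is_bijective(g, o):
    """A bijective match shares more than 50% of the members of both groups."""
    shared = len(g & o)
    return shared > 0.5 * len(g) and shared > 0.5 * len(o)

def grouping_cost(fof, mock):
    """Summary statistic S_tot = E_tot * Q_tot for two catalogues of member sets."""
    n_bij = sum(any(is_bijective(g, o) for o in mock) for g in fof)
    e_tot = (n_bij / len(fof)) * (n_bij / len(mock))

    def purity(groups, others):
        # Membership-weighted mean of the best purity products.
        p = best_purity_products(groups, others)
        return sum(pi * len(g) for pi, g in zip(p, groups)) / sum(len(g) for g in groups)

    q_tot = purity(fof, mock) * purity(mock, fof)
    return e_tot * q_tot

# Worked example from the text: a 5-member FoF group sharing 3 galaxies with a
# 9-member mock group and 2 galaxies with a 3-member mock group.
fof_example = [{0, 1, 2, 3, 4}]
mock_example = [{0, 1, 2, 10, 11, 12, 13, 14, 15}, {3, 4, 20}]
print(best_purity_products(fof_example, mock_example))  # best product is 4/15

# A catalogue compared against itself is perfectly recovered.
perfect = [{1, 2, 3}, {4, 5}]
print(grouping_cost(perfect, perfect))  # -> 1.0
```

In the worked example neither mock group shares more than half of its own members and more than half of the FoF group's members at the same time, so there is no bijective match and the summary statistic for that pair of catalogues is zero even though the purity product is not.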
Whilst the primary aim of the grouping algorithm is to maximise the accuracy of the content of the groups, it is essential to derive well determined global group properties. The group velocity dispersion ($\sigma_{\rm FoF}$) and radius ($r_{\rm FoF}$) are key properties to recover accurately, as they form the most directly inferred group characteristics, together with the group centre and total group luminosity ($L_{\rm FoF}$). The importance of their precise recovery is further strengthened by the expectation that a reasonable dynamical mass estimator is proportional to $\sigma_{\rm FoF}^2$ and $r_{\rm FoF}$ (§). There are several ways to estimate $\sigma_{\rm FoF}$ and $r_{\rm FoF}$, but it is essential for the estimates to be median unbiased and robust to slight perturbations in group membership. Both constraints are important so as to not make our group properties overly sensitive to some precise aspect of the grouping algorithm (a process that will never produce a perfect catalogue).

Hereafter we adopt the following notation. $X_{\rm FoF}$ and $X_{\rm halo}$ correspond to a quantity $X$ measured using galaxies of the Friends-of-Friends mock group and of the underlying/true dark matter halo respectively. The estimate of $X$ is done with the same method both times, i.e. only the galaxy membership changes between the two measurements for matched FoF and halo groups. Matching in the mocks corresponds to the best group matching between FoF groups and intrinsic haloes, defined as the two-way match that produces the highest $Q_{\rm tot}$ (see §). $N_{\rm FoF}$ is the number of group members a given FoF group has, which has to be distinguished from $N_{\rm halo}$, the true number of group members of a given halo. $X_{\rm mock}$ is a value based on an output of the semi-analytic mock groups directly; it is not measured using a similar method as for the FoF groups. In practice, only the total luminosity of the galaxies in the mock group ($L_{\rm mock}$) requires this notation, since it is found by summing up the flux of all group members beyond the magnitude limit of the simulated lightcone. Finally, $X_{\rm DM}$ refers to a property that is measured from the Millennium Simulation dark matter haloes themselves (so not dependent on the semi-analytics in any manner). In practice, only the total mass of all dark matter particles within the halo ($M_{\rm DM}$) requires this notation.

Table 1. The optimal global parameters for all groups with N FoF > (columns: $b$, $R$, $E_b$, $E_R$, $\nu$, and $S_{\rm tot}$ for each of the three $r_{\rm AB}$ limits).

Table 2. The 1-σ spread of the optimal grouping parameters found for the 9 different mock GAMA lightcones. For the three most important parameters, their relative spread is indicated as well (columns: $\sigma_b$, $\sigma_R$, $\sigma_{E_b}$, $\sigma_{E_R}$, $\sigma_\nu$, $\sigma_b/b$, $\sigma_R/R$, $\sigma_\nu/\nu$).

The group velocity dispersion, $\sigma_{\rm FoF}$, is measured with the gapper estimator introduced by Beers et al. (1990), and used for velocity dispersion estimates in e.g. 2PIGG (Eke et al. 2004). This estimator is unbiased, even for low multiplicity systems, and is robust to weak perturbations in group membership.

In summary, for a group of multiplicity $N = N_{\rm FoF}$, all recession velocities are ordered within the group and the gaps between each velocity pair are calculated using $g_i = v_{i+1} - v_i$ for $i = 1, ..., N-1$, as well as weights defined by $w_i = i(N-i)$. The velocity dispersion is then estimated via:

$$\sigma_{\rm gap} = \frac{\sqrt{\pi}}{N(N-1)} \sum_{i=1}^{N-1} w_i g_i . \qquad (16)$$

Based on the fact that in the majority of mock haloes the brightest galaxy is moving with the halo centre of mass, the velocity dispersion is increased by an extra factor of $\sqrt{N/(N-1)}$ (as implemented in Eke et al. 2004). Eq. 16 assumes no uncertainty on the recession velocities, while in reality the accuracy of the redshifts (and therefore recession velocities) depends among other things on the galaxy survey considered. To account for this the velocity dispersion is further modified by the total measurement error $\sigma_{\rm err}$ being removed in quadrature, giving:

$$\sigma = \sqrt{\frac{N}{N-1}\,\sigma_{\rm gap}^2 - \sigma_{\rm err}^2} . \qquad (17)$$

The total measurement error $\sigma_{\rm err}$ is the result of adding together the expected velocity error for each individual galaxy in quadrature, where we account for the survey origin of the redshift, the leading source of uncertainty in estimating $\sigma_{\rm err}$. The GAMA redshift catalogue is mainly composed of redshifts from GAMA ( ∼ ∼ ∼ ∼ 50 km s$^{-1}$, ∼ 30 km s$^{-1}$ and ∼ 80 km s$^{-1}$ (see Driver et al. 2011, for further details on the redshift uncertainties in the GAMA catalogue).

Fig. 2 shows the distribution of the log-ratio of the measured/recovered velocity dispersion ($\sigma_{\rm FoF}$) to the intrinsic galaxy velocity dispersion ($\sigma_{\rm halo}$), for best matching FoF/halo mock groups. Explicitly, $\sigma_{\rm halo}$ is estimated using Eq. 16 with mock GAMA galaxies belonging to the same underlying halo, i.e. $\sigma_{\rm halo}$ does not correspond to the underlying dark matter halo velocity dispersion. Furthermore $\sigma_{\rm halo}$ is estimated using only the line-of-sight velocity information. Hence a perfect grouping would result in Dirac δ distributions in Fig. 2. The fact that these distributions are so tight is a reflection of the quality of the FoF grouping. For ∼ . ∼ σ FoF is within ∼ 50% ( ∼

More contentious quantities to define and estimate are the centre and the projected radius of a group. Firstly, there is no unique way to define the group centre (e.g. centre of mass (CoM), geometric centre (GC), brightest group/cluster galaxy (BCG), ...) from which the projected radius is defined. Secondly, the projected radius definition will depend on what fraction of galaxies should be enclosed within it and on what assumption is made for the distance to the group.

To determine the most robust and appropriate definitions for the centre and projected radius of a group a number of schemes were investigated. Hereafter we implicitly assume projected radius when referring to the group radius.
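As a minimal sketch of the velocity dispersion estimate in Eqs. 16 and 17, the gapper calculation can be written in a few lines of Python. The function names are illustrative only and are not taken from the GAMA pipeline; $\sigma_{\rm err}$ is passed in as a single number rather than derived per survey, and flooring a negative variance at zero is an assumption made here so that the square root is always defined.

```python
import numpy as np


def gapper_sigma(velocities):
    """Gapper estimate (Eq. 16) from line-of-sight recession velocities in km/s."""
    v = np.sort(np.asarray(velocities, dtype=float))
    n = v.size
    if n < 2:
        raise ValueError("the gapper estimator needs at least two velocities")
    gaps = np.diff(v)            # g_i = v_{i+1} - v_i, for i = 1 .. N-1
    i = np.arange(1, n)
    weights = i * (n - i)        # w_i = i (N - i)
    return np.sqrt(np.pi) / (n * (n - 1)) * np.sum(weights * gaps)


def group_sigma(velocities, sigma_err=0.0):
    """Eq. 17: apply the N/(N-1) factor and remove sigma_err in quadrature.

    Assumption (not from the paper): a negative variance is floored at zero.
    """
    n = len(velocities)
    variance = n / (n - 1) * gapper_sigma(velocities) ** 2 - sigma_err ** 2
    return float(np.sqrt(max(variance, 0.0)))
```

For a pair of galaxies separated by 100 km s$^{-1}$ there is a single gap with unit weight, so Eq. 16 reduces to $\sqrt{\pi}\times 100/2$; this makes a convenient sanity check of the weights.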
For the group centre three approaches were considered. Firstly, the group centre was defined as the centre of light (CoL) derived from the $r_{\rm AB}$-band luminosity of all the galaxies associated with the group, which is an easily observable proxy for the CoM. Secondly, an iterative procedure was used where at each step the $r_{\rm AB}$-band CoL was found and the most distant galaxy rejected. When only two galaxies remain, the brighter $r_{\rm AB}$-band galaxy is used as the group centre. We refer to it as Iter. Thirdly, the brightest group/cluster member (BCG) was assumed to be the group centre. For mock groups with $N_{\rm FoF} > 5$, 95% of the time the iterative procedure produces the same group centre as the BCG definition.

Figure 2. Probability distribution function (PDF) of $\log_{10}(\sigma_{\rm FoF}/\sigma_{\rm halo})$, i.e. the log-ratio of the measured/recovered velocity dispersion ($\sigma_{\rm FoF}$) to the intrinsic galaxy velocity dispersion ($\sigma_{\rm halo}$), for best matching FoF/halo mock groups. Each panel shows groups of different multiplicities, as labelled. The vertical dashed lines indicate where $\sigma_{\rm FoF}$ is a factor 2/5/10 off the intrinsic $\sigma_{\rm halo}$. The more peaked and centred on 0 the PDF is, the more accurately the underlying $\sigma_{\rm halo}$ is recovered.

Figure 3. Distribution of position offsets between different group centre definitions and the underlying/true group centre for bijectively matched mock groups. Each panel shows groups of different multiplicities, as labelled. Solid/dashed/dotted lines indicate the Iter/CoL/BCG centre definitions (see text). The nearly vertical lines at small radii correspond to groups which have a perfectly recovered centre position (i.e. zero offset). Their fraction is indicated in the panel as “Perfect”.

Fig. 3 presents a comparison between three group centre definitions (Iter, CoL, BCG) and the true/underlying group centre for the best matching (highest $Q_{\rm tot}$) mock groups. In this context “true” refers to the centre we obtain when running the same algorithm on the exact mock group. The plot shows the distribution of the positional offsets for the different definitions of group centre when compared to the “truth” for different group multiplicities, with the fraction that agrees perfectly stated in each panel for each group centre definition.

The iterative method always produces the best agreement for the exact group centre and seems to be slightly more robust to the effects of group outliers. As should be expected, the flux weighted CoL definition is the least good at recovering the underlying/true halo centre position. With the CoL definition, the group needs to be recovered exactly to get a perfect match and any small perturbation in membership influences the accuracy with which the centre is recovered. This is very different to the BCG or Iter centre definitions, which are only very mildly influenced by perturbations in membership.

The iterative centre is therefore preferable over merely using the BCG: it has a larger precise matching fraction and a smaller fraction of groups with spuriously large centre offsets. It is very stable as a function of multiplicity, with a fraction of precise group centre matches of ∼

The group centre definitions as considered in §
An alternative solution would be to select the group redshift as the median redshift of the group members. Fig. 4 presents the distribution of the difference between the recovered median redshift and the intrinsic median redshift for best matching FoF/halo mock groups. The fraction of group redshifts that agree precisely is stable as a function of multiplicity at ∼ − . 80% of the time the redshift differences are within the GAMA velocity error of ∼ 50 km s$^{-1}$ (see Driver et al. 2011, for details). It is essential to notice that this radial centre is defined in redshift space (i.e. including peculiar velocities) as opposed to real space (i.e. based on Hubble flow redshift), as only information for the former is available from a redshift survey. A comparison between the real and the redshift space centre shows directly the importance and the impact of bulk flow motions, i.e. the galaxy groups themselves are not at rest.

The radius definition must be a compromise between containing a large enough number of galaxies to be stable statistically and being small enough to not be overly biased by or sensitive to outliers and interlopers (which tend to lie at larger projected distances). Three radius definitions were considered: Rad$_{50}$, Rad$_{1-\sigma}$ and Rad$_{100}$, containing 50%, 68% and 100% of the galaxies in the group respectively. The latter, Rad$_{100}$, is mainly used for illustrative purposes, as it is extremely sensitive to outliers. Rad$_X$ is defined using the default quantile definition in R, i.e. the group members are sorted in ascending radius value, assigned a specific percentile (the most central 0% and the furthest away 100%) and finally a linear interpolation between the radii of the two relevant percentiles is performed. This implies that only the radial distances of the two galaxies bracketing the percentile definition used are considered in the estimate of Rad$_X$, explaining why Rad$_{100}$ is expected to be the most sensitive to outliers.

Fig. 5 shows a comparison between three radius definitions as measured from the iterative centre for recovered mock groups (Rad$_{X-{\rm FoF}}$) and for true mock haloes (Rad$_{X-{\rm halo}}$) for best matching FoF/halo mock groups. Rad$_{50}$ is marginally more centrally concentrated than Rad$_{1-\sigma}$ for all multiplicity subsets and is hence the least affected by interlopers and outliers.

The subsets plotted in Fig. 5 up to 10 N FoF N FoF N FoF 19. This does not affect the median of the distribution, but requires the mean to be offset from the median in these cases.

The highest multiplicity subset (right most panel of Fig. 5) has an identifiable excess of low radius groups, leading to a biased median that is ∼ 15% lower than the original aim. Hence the estimated Rad$_{50-{\rm FoF}}$ for half of the highest multiplicity groups is underestimated by at least ∼ − halo . We note however that this definition still behaves better than either of the other two considered.

Whilst the accuracy of the measured velocity dispersion noticeably improves as a function of multiplicity (see Fig. 2), the accuracy of the observed radius does not. This observation should be expected since groups have their centres iterated towards the optimal solution. During this process they, in effect, become lower multiplicity as the outliers are removed, and thus will suffer from similar numerical artefacts.

Based on the improvement in radius agreement for $N_{\rm FoF} > 5$, Rad$_{50}$ was selected as the preferred definition of radius for use in the GAMA galaxy group catalogue. For the remainder of this paper, and in any future discussion of GAMA galaxy groups, any mention of group radius implicitly refers to Rad$_{50}$. However it is to be noted that Rad$_{1-\sigma}$ is better behaved for low multiplicity groups ( N FoF ± .

Once an unbiased and robust group velocity dispersion and a nearly unbiased group radius can be estimated, the final step is to combine this information into a dynamical mass estimator. To first order for a virialized system we expect its dynamical mass to scale as $M \propto \sigma^2 R$, where $\sigma$ and $R$ are calculated as described in § and §. Fig. 6 shows the $\log_{10}({\rm Rad}_{X-{\rm FoF}}/{\rm Rad}_{X-{\rm halo}})$ – $\log_{10}(\sigma_{\rm FoF}^2/\sigma_{\rm halo}^2)$ plane, split as a function of redshift and multiplicity, with ranges specified in each panel. The green dashed lines delineate regions where $\sigma_{\rm FoF}^2\,{\rm Rad}_{50-{\rm FoF}}$ is 2/5/10 times off the expectation given by $\sigma_{\rm halo}^2\,{\rm Rad}_{50-{\rm halo}}$, reflecting to some extent the implied uncertainty on any dynamical mass estimate. As a matter of fact, if the dynamical mass is proportional to $\sigma^2 R$ as expected for a virialized system and can be directly estimated from $\sigma_{\rm halo}^2\,{\rm Rad}_{50-{\rm halo}}$, then the green dashed lines indicate by what amount the halo mass as inferred from $\sigma_{\rm FoF}^2\,{\rm Rad}_{50-{\rm FoF}}$ deviates from the true one (assuming the same proportionality factor). Additionally any asymmetry in the density distribution with respect to those guide lines is a sign of a bias in the inferred mass: a density excess in the top-right/bottom-left of any panel indicates a bias towards incorrectly high/low dynamical masses. Note that a density excess orthogonal to these lines is not problematic for the mass estimates since the individual biases cancel out in this parametrisation.

As a function of redshift the density distributions in Fig. 6 are well behaved.
As a function of multiplicity the main effect is a tightening of the distribution, which is expected since the velocity dispersion and, to a lesser degree, the radius can be better estimated with more galaxies. The 5 N FoF N FoF N FoF > distributions rather than in the median or the mode. However, for low multiplicity groups ($N_{\rm FoF} \le 4$) the situation is rather different. First of all, there is an extensive scatter in the recovered velocity dispersion at $\log_{10}({\rm Rad}_{X-{\rm FoF}}/{\rm Rad}_{X-{\rm halo}}) \simeq \pm 0.3$. This is entirely related to the “bumps” seen in Fig. 5 and is due to mismatches in the grouping, explaining why the velocity dispersions are so poorly recovered for some of those systems. The reason for an overdensity of groups at ± . works. When a $N_{\rm FoF} = 2$ group misses one member and when a $N_{\rm FoF} = 3$ group contains one interloper this results most often in a FoF group where the calculated group centre is the same (because the group centre is so accurately recovered, see Fig. 3) but with a radius that is half and double the halo radius respectively. Additionally any asymmetry seen in the top panels of Fig. 6 can be attributed to low multiplicity groups. Generally Fig. 6 gives us confidence that measurement errors in $\sigma$ and $R$ are not highly correlated.

Figure 4. Probability distribution function (PDF) of $z_{\rm FoF} - z_{\rm halo}$ for best matching FoF/halo mock groups, where $z$ is the median redshift of the group. Each panel shows groups of different multiplicities, as labelled. The fraction of exact matches is indicated in each panel, as “Perfect”.

Figure 5. Probability distribution function (PDF) of $\log_{10}({\rm Rad}_{X-{\rm FoF}}/{\rm Rad}_{X-{\rm halo}})$, i.e. the log-ratio of the measured/recovered radius (Rad$_{X-{\rm FoF}}$) to the intrinsic galaxy radius (Rad$_{X-{\rm halo}}$), for best matching FoF/halo mock groups. Each panel shows groups of different multiplicities, as labelled. Solid/dashed/dotted lines indicate the Rad$_{50}$, Rad$_{1-\sigma}$ and Rad$_{100}$ radius definitions, respectively encompassing 50%, 68% and 100% of the galaxies in the group. The solid line, Rad$_{50}$, produces the tightest distribution of the three considered. The vertical dashed lines indicate where Rad$_{X-{\rm FoF}}$ is a factor 2/5/10 off the intrinsic Rad$_{X-{\rm halo}}$.

The dynamical mass of a system is estimated using

$$\frac{M_{\rm FoF}}{h^{-1}\,M_\odot} = \frac{A}{G/(M_\odot^{-1}\,{\rm km^2\,s^{-2}\,Mpc})} \left(\frac{\sigma_{\rm FoF}}{\rm km\,s^{-1}}\right)^2 \frac{{\rm Rad}_{50-{\rm FoF}}}{h^{-1}\,{\rm Mpc}} , \qquad (18)$$

where $G$ is the gravitational constant in suitable units, i.e. $G = 4.301 \times 10^{-9}\,M_\odot^{-1}\,{\rm km^2\,s^{-2}\,Mpc}$. $A$ is the scaling factor required to create a median unbiased mass estimate of $M_{\rm DM}/M_{\rm FoF}$. For a ‘typical’ cluster with a 1 $h^{-1}$ Mpc radius and a velocity dispersion of 1000 km s$^{-1}$, the mass given by Eq. 18 is $\sim A \times 2.3 \times 10^{14}\,h^{-1}\,M_\odot$. $A$ is likely to be larger than unity, since the estimated velocity dispersion using Eq. 16 traces the velocity dispersion along the line-of-sight only (for isotropic systems $\sigma_{\rm 3D} \sim \sqrt{3}\,\sigma_{\rm 1D}$) and the average projected radius is smaller than the average intrinsic radius. (For isotropic systems the relation depends on the exact radius definition. Conceptually the 3D and 2D radius will agree for Rad$_{100}$ but increasingly disagree as the radius measured becomes smaller, due to the relative concentration of objects towards the centre when observing a projected 2D, as opposed to intrinsic 3D, distribution.) Finally, Eq. 18 can only be truly valid for a system in virial equilibrium, which many of our systems will not necessarily be. Hence the best approach is to determine $A$ in a semi-empirical manner by requiring it to produce a median unbiased halo mass estimate when comparing best matching FoF/halo mock groups.

Figure 6. The $\log_{10}({\rm Rad}_{X-{\rm FoF}}/{\rm Rad}_{X-{\rm halo}})$ – $\log_{10}(\sigma_{\rm FoF}^2/\sigma_{\rm halo}^2)$ plane, split as a function of redshift and multiplicity (top and bottom panel respectively). The x and y-axes show the relative accuracy of the recovered radius and velocity dispersion (squared) respectively. The contours represent the regions containing 10/50/90% of the data for three magnitude limits, i.e. r AB . r AB . r AB . The green dashed lines delineate regions where $\sigma_{\rm FoF}^2\,{\rm Rad}_{50-{\rm FoF}}$ is 2/5/10 times off the expectation given by $\sigma_{\rm halo}^2\,{\rm Rad}_{50-{\rm halo}}$, reflecting to some extent the implied uncertainty on any dynamical mass estimate (see text for details).

Performing a single global optimisation using all bijectively matched groups with N FoF > A = 10 . A = 5 factor found in Eke et al. (2004). This should not be surprising since there are differences in the style of grouping optimisation, and we have used a more compact definition of the group radius and a different approach to recovering the group centre. It is interesting to note that this scaling of A = 10 . A = 10 . distribution is globally unbiased for N FoF > N FoF A is to the specific subset of data considered, combined cuts in redshift and multiplicity were made. Table 3 contains the various $A$ factors required for the different subsets as a function of the possible limiting magnitudes for the GAMA group catalogue.

Using the data in Table 3 the best fitting plane that accounts for the variation of $A$ as a function of $\sqrt{N_{\rm FoF}}$ and $\sqrt{z_{\rm FoF}}$ is calculated. To prevent strong biases to low $N_{\rm FoF}$ systems purely by virtue of their overwhelming numbers, the plane was not weighted by frequency and should produce the appropriate corrections throughout the parameter space investigated. The plane function for $A$ is given by

$$A(N_{\rm FoF}, z_{\rm FoF}) = A_c + \frac{A_N}{\sqrt{N_{\rm FoF}}} + \frac{A_z}{\sqrt{z_{\rm FoF}}} , \qquad (19)$$

where $A_c$, $A_N$ and $A_z$ are constants to be fitted. Table 4 contains the parameters that produce the best fitting planes for the three different GAMA magnitude limits. The motivation for the functional form is mainly driven to ensure positivity of $A(N_{\rm FoF}, z_{\rm FoF})$ over the range of GAMA multiplicities and redshifts, and a good fit to the data within these limits.

Figure 7. FoF –M DM plane, split as a function of redshift and multiplicity (top and bottom panel respectively). These panels objectively compare the recovered group masses to the underlying DM halo masses. The contours represent the regions containing 10/50/90% of the data for three magnitude limits, i.e. r AB . r AB . r AB . M FoF – M DM pairs. For $M_{\rm FoF}$ we use Eq. 18 and $A = 10.0$. The green dashed lines delineate regions where $M_{\rm FoF}$ is 2/5/10 times off the underlying $M_{\rm DM}$.

Table 3. Values of $A$, the dynamical mass scaling factor of Eq. 18, required to create an unbiased median mass estimate for different disjoint subsets of bijectively matched groups (binned in $N_{\rm FoF}$ and $z_{\rm FoF}$).

The errors shown in Table 4 are estimated from finding the best fitting plane for the 9 mock GAMA volumes separately and measuring the standard deviation of the individual best fitting planes, much like the approach used for Table 2.
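The radius and mass steps can be sketched together in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the projected radii are taken as already measured from the chosen group centre, and `numpy.percentile` with its default linear interpolation stands in for R's default quantile definition (the two use the same interpolation rule). The global value $A = 10.0$ is used as the default; a subset-dependent $A$ from Eq. 19 could be passed in instead.

```python
import numpy as np

# Gravitational constant in Msun^-1 km^2 s^-2 Mpc, the units required by Eq. 18.
G_MPC = 4.301e-9


def rad_x(projected_radii, percent=50.0):
    """Radius enclosing `percent` of the members, by linear interpolation
    between the two bracketing galaxies (numpy default, as in R's default)."""
    radii = np.asarray(projected_radii, dtype=float)
    return float(np.percentile(radii, percent))


def dynamical_mass(sigma_kms, rad50_mpc, A=10.0):
    """Eq. 18: mass in h^-1 Msun from sigma (km/s) and Rad50 (h^-1 Mpc)."""
    return A / G_MPC * sigma_kms ** 2 * rad50_mpc


# Hypothetical four-member group: projected offsets from the centre in h^-1 Mpc.
radii = [0.1, 0.2, 0.4, 0.8]
mass = dynamical_mass(sigma_kms=300.0, rad50_mpc=rad_x(radii))
```

With `percent=100` the function simply returns the largest offset, which shows directly why Rad$_{100}$ is the definition most exposed to a single outlying member.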
It is important to highlight that even though the observed dynamical mass estimates and halo masses are well correlated (in particular the scatter is approximately mirrored across the 1–1 line in Fig. 7), it is impossible to select an unbiased subset of mass unless the selection is across the mode of the distribution. This is due to Eddington bias rather than any intrinsic issue with the mass estimates: since most haloes in GAMA will have moderate masses ( ∼ $h^{-1}\,M_\odot$ ), if simple Gaussian scatter in mass estimates is assumed, then a high mass subset must contain a larger fraction of lower mass haloes scattered up in mass, and a low mass subset must contain a larger fraction of higher mass haloes scattered down in mass, hence the medians are biased. This effect is different to a Malmquist bias, which explains the observational bias in the distribution of halo masses as a function of distance.

Table 4. Table of parameters that create the best fitting plane to the data in Table 3 (columns: $A_c$, $A_N$, $A_z$; one row per $r_{\rm AB}$ limit). The plane is a function of group redshift and multiplicity, as given in Eq. 19. Errors are estimated from running plane fits to the 9 mock GAMA volumes separately and measuring the standard deviation of the individual best fitting planes.

This effect can be modelled quite accurately by assuming we have median unbiased log-normal relative error in the mass estimate, where the standard deviation of the distribution ($M_{\rm err}$) is a function of system multiplicity. The effect multiplicity has on the accuracy of the mass can be seen clearly in Fig. 8, where although median unbiased for $N_{\rm FoF} > 4$, the standard deviation of the distribution decreases strongly as a function of multiplicity. The approximate function for this effect is given by

$$\log_{10}\left(\frac{M_{\rm err}}{h^{-1}\,M_\odot}\right) = 1.0 - 0.43\,\log_{10}(N_{\rm FoF}) , \qquad (20)$$

where the appropriate range of use is $2 \le N_{\rm FoF} \le 50$, beyond which the standard deviation is ∼ 0.27. We recast this error function back onto the intrinsic mock halo masses to give a new mass with simulated dynamical mass errors:

$$\frac{M_{\rm sim}}{h^{-1}\,M_\odot} = \frac{M_{\rm DM}}{h^{-1}\,M_\odot}\; 10^{\,G\left(0,\ \log_{10}(M_{\rm err}/h^{-1}\,M_\odot)\right)} , \qquad (21)$$

where $G(\bar{x}, \mu)$ is a random sample from the normal distribution with a mean $\bar{x}$ and standard deviation $\mu$. Fig. 9 shows how the intrinsic halo mass compares for the same halo masses but with our fiducial error function applied. This shows the main contour twisting features described above; particularly clear is the sampling bias you would expect when selecting groups based on the observed halo masses. For instance, the manner in which the mode of the contours appears to be more vertical than the 1–1 line in Fig. 7 (the slight rotation of the contours) is well replicated in Fig. 9 and can be explained by the random scatter of the measured dynamical mass from the intrinsic halo mass.

The total group luminosity is an equally important global group property. It should not be just the total luminosity of the observed group members but the total luminosity as inferred from an arbitrarily faint absolute magnitude limit cut in order to address residual selection effects. To do this we calculate the effective absolute magnitude limit of each group, measure the $r_{\rm AB}$-band luminosity contained within this limit and then integrate the global GAMA galaxy LF (see §):

$$L_{\rm FoF} = B\,L_{\rm ob}\,\frac{\int_{-30}^{-14} 10^{-0.4 M_r}\,\phi_{\rm GAMA}(M_r)\,dM_r}{\int_{-30}^{M_{r-{\rm lim}}} 10^{-0.4 M_r}\,\phi_{\rm GAMA}(M_r)\,dM_r} , \qquad (22)$$

where $L_{\rm ob}$ is the total observed $r_{\rm AB}$-band luminosity of the group, $B$ is the scaling factor required to produce a perfectly median unbiased luminosity estimate and $M_{r-{\rm lim}}$ is the effective $r_{\rm AB}$-band absolute magnitude limit for the group. This limit depends on the redshift of observation and the apparent magnitude limit used. Corrections are only a few percent at low redshift when using r AB . z FoF ∼ . 5. To convert magnitudes into solar luminosities we take the $r_{\rm AB}$-band absolute magnitude of the Sun to be M r ⊙ = 4 . The limits of $-30 \le M_r \le -14$ used in the numerator of Eq. 22 are effectively limits of $-\infty \le M_r \le \infty$ since the luminosity density of a typical LF is nearly all recovered within a couple of magnitudes of $M^*$. Assuming the Schechter function parameters of Blanton et al. (2003) we would expect to retrieve 99.5% of the intrinsic flux using these limits, assuming the LF continues down to infinitely faint galaxies. More practically, the bright limit ($M_r > -30$) is much brighter than any known galaxy, and the faint limit ($M_r \le -14$) is the limit of the GAMA SWML LF used for this work, and thus is also the effective limit of the mock catalogues used since the galaxy luminosities were adjusted to return the GAMA LF.

Since the median redshift of GAMA is z ∼ . r AB = 19.4, most groups will contain members faintwards of $M^*_h$ (with $M^*_h = M^* - 5\log_{10} h = -20.44$, Blanton et al. 2003). Because the luminosity density is dominated by galaxies around $M^*_h$, the extrapolation required to get a total group luminosity will be quite conservative since most groups are sampled well beyond $M^*_h$.

This process assumes that a global LF is appropriate for all groups over a range of masses and environments, which is known not to be the case (e.g. Eke et al. 2004; Croton et al. 2005; Robotham et al. 2006). However, since the median luminosity scaling is less than a factor 1.6, the difference that adjusting to halo specific LFs would make to the integrated light will usually be smaller than the statistical scatter observed (which is many 10s of percent).

Performing a single global optimisation using all bijectively matched groups with N FoF > B = 1 . α ) and the characteristic magnitude ($M^*$) varying between grouped environments and the global average, and the effects of interloper flux biasing the extrapolated group luminosities. Overall the effects are rather small, and globally we see a value close to 1, which implies neither a large amount of under-grouping nor over-grouping.

Fig. 10 compares the inferred total group luminosity ($L_{\rm FoF}$) to the underlying mock luminosity ($L_{\rm mock}$) for best matching FoF/mock galaxy groups. The typical scatter as a function of mock group luminosity is quite constant regardless of group multiplicity, with only an excessive amount of scatter for the lowest multiplicity groups, as evidenced in the bottom left panel of Fig. 10. The relations are mostly unbiased, except for the two higher redshift samples (top right panels of Fig. 10).

The scatter in extrapolated group luminosity is much smaller than seen for dynamical masses in Fig. 7. This is expected since fewer observables are required in its estimate and the effect of interlopers is much smaller.
By their nature, interlopers are more likely to systematically affect “geometrical” quantities, like biasing the observed velocity dispersion and radius, while having a lesser impact on e.g. total luminosities. This is because the nature of the optimal grouping used for this work means that on average we should miss as many true group galaxies as we add interlopers, so the net loss and gain of galaxy luminosities tend to balance out.

As with the dynamical mass estimates, scaling factors, listed in Table 5, are calculated for various redshift and multiplicity subsets in order to properly quantify outstanding biases that remain after scaling the observed luminosities to account for galaxies below the survey flux limit. They are distributed around unity, which is what we would expect if the extrapolated flux fully accounts for all of the missing flux. The variation in the median seen in the table is larger than seen for the dynamical mass scaling factors. This is because we have applied a global LF correction to the data and the LF is known to vary strongly as a function of group environment (e.g. Robotham et al. 2006). Since we are naturally more sensitive to higher mass groups at higher redshifts, this explains the strong redshift gradient in the scaling factor required, and in comparison the multiplicity variation is very small. For the dynamical $A$ factors the dominant variable was the group multiplicity. When using the groups this is an important consideration: the group dynamical masses are more intrinsically stable (require smaller corrections) as a function of redshift, whilst group luminosities are more stable as a function of multiplicity.

As with the dynamical masses, the total group luminosity correction factors ($B$) can be well described by a plane that fits Table 5, viz.

$$B(N_{\rm FoF}, z_{\rm FoF}) = B_c + \frac{B_N}{\sqrt{N_{\rm FoF}}} + \frac{B_z}{\sqrt{z_{\rm FoF}}} , \qquad (23)$$

where $B_c$, $B_N$ and $B_z$ are constants to be fitted. Table 6 contains the parameters that produce the best fitting planes for the three different GAMA magnitude limits. The errors shown in Table 6 are estimated from finding the best fitting plane for the 9 mock GAMA volumes separately and measuring the standard deviation of the individual best fitting planes, much like the approach used for Table 2.

http://mips.as.arizona.edu/~cnaw/sun.html

Figure 8. Relative difference between measured and underlying group masses as a function of multiplicity for different redshift subsets. The improvement in the measurement of the velocity dispersion and the radius tightens the distribution until $N_{\rm FoF} \sim 50$. The lines represent the 3 survey depths of interest: r AB . r AB . r AB . For $M_{\rm FoF}$ we use Eq. 18 and $A = 10.0$.

Figure 9. As Fig. 7, but for the simulated relation between $M_{\rm DM}$ and $M_{\rm sim}$ ($M_{\rm DM}$ with the expected random errors applied using Eq. 21), by modelling the expected error just as a function of group multiplicity. The contours represent the regions containing 10/50/90% of the data for three different magnitude limits, r AB . r AB . r AB . M sim is estimated using Eq. 20.

Figure 10. FoF –L mock plane, split as a function of redshift and multiplicity (top and bottom panel respectively). These panels objectively compare the recovered group luminosities to the underlying total luminosity in the mocks. The contours represent the regions containing 10/50/90% of the data for three magnitude limits, i.e. r AB . r AB . r AB . FoF –L mock pairs. The green dashed lines delineate regions where $L_{\rm FoF}$ is 2/5/10 times off the underlying $L_{\rm mock}$. For $L_{\rm FoF}$ we use Eq. 22 and $B = 1.04$.

Table 6. Table of parameters that create the best fitting plane to the data in Table 5 (columns: $B_c$, $B_N$, $B_z$; one row per $r_{\rm AB}$ limit). The plane is a function of group redshift and multiplicity, as given in Eq. 23. Errors are estimated from running plane fits to the 9 mock GAMA volumes separately and measuring the standard deviation of the individual best fitting planes.
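The luminosity extrapolation of Eq. 22 can be illustrated with a short numerical sketch in Python. It is a stand-in rather than the procedure used for the catalogue: a Schechter function replaces the GAMA SWML LF, with $M^* - 5\log_{10}h = -20.44$ as quoted in the text and a faint-end slope $\alpha = -1.05$ that is assumed here for illustration. The function names and the simple trapezoidal integration are likewise illustrative choices.

```python
import numpy as np


def schechter_mag(M, M_star=-20.44, alpha=-1.05, phi_star=1.0):
    """Schechter LF per unit magnitude. alpha is an assumed value; the
    normalisation phi_star cancels in the ratio of Eq. 22."""
    x = 10.0 ** (-0.4 * (M - M_star))
    return 0.4 * np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)


def _lum_integral(M_lo, M_hi, lf, n=4001):
    """Trapezoidal integral of 10^(-0.4 M) * lf(M) between two magnitudes."""
    M = np.linspace(M_lo, M_hi, n)
    y = 10.0 ** (-0.4 * M) * lf(M)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(M)))


def extrapolated_luminosity(L_ob, M_lim, B=1.04, lf=schechter_mag,
                            M_bright=-30.0, M_faint=-14.0):
    """Eq. 22: scale the observed luminosity L_ob by the ratio of the LF
    luminosity integral to M_faint over the integral to the group limit M_lim."""
    if not (M_bright < M_lim <= M_faint):
        raise ValueError("M_lim must lie between the bright and faint limits")
    numerator = _lum_integral(M_bright, M_faint, lf)
    denominator = _lum_integral(M_bright, M_lim, lf)
    return B * L_ob * numerator / denominator
```

A group whose effective limit already reaches $M_r = -14$ receives only the factor $B$, while a brighter limit gives a larger correction, so the ratio behaves as the text describes for low and high redshift groups.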
The $M/L$ observed in groups is a fundamental property of interest in the analysis of galaxy groups. It is therefore important that any intrinsic scatter in the estimates of the mass and luminosity of groups is not strongly correlated. Fig. 11 shows the observed fidelity of the group dynamical masses compared to the total group luminosities for a variety of data subsets. Encouragingly, the dynamical mass and luminosity estimates do not correlate strongly in any direction. The most significant concern would be strong scatter along the $-45^\circ$ direction, since this would mean that the dynamical mass estimates tend to be erroneously small when the luminosity estimates tend to be erroneously large (creating a very small $M/L$ ratio) and vice versa. Instead the two group measurements show no strong correlations in the accuracy of their recovery.

[Table 5: grid of $B$ values binned in $N_{\rm FoF}$ and $z_{\rm FoF}$. The numerical entries are not recoverable.]

Table 5. Values of $B$, the luminosity scaling factor of Eq. 22, required to create an unbiased median halo luminosity estimate for different disjoint subsets of bijectively matched groups.

Figure 11. Comparison of the fidelity of the recovered group mass (x-axis) against the group luminosity (y-axis), split as a function of redshift and multiplicity (top and bottom panel respectively). For both axes only a global median correction optimised for $N_{\rm FoF} \ge 5$ is applied, i.e. $A = 10.0$ and $B = 1.04$ for the mass and luminosity estimates respectively. The vertical (horizontal) green dashed lines present accuracy factors of 2/5/10 for mass (luminosity) estimates. The contours represent the regions containing 10/50/90% of the data for the three different magnitude limits.

To demonstrate the improvement witnessed when using the multiplicity and redshift scaling relations, Fig. 12 compares side by side the scatter expected for a simple median correction for $N_{\rm FoF} >$ […] $L_{\rm FoF}/L_{\rm mock}$, where we see the contours tighten into very close agreement once the correction is made. This means that groups extracted from regions of different depths (e.g. G09 and G15 versus G12) can be compared more directly. It is also clear that the mode and median are brought into better agreement, moving up towards $L_{\rm FoF}/L_{\rm mock} = 1$.

Whether the full scaling equations should be used depends on the precise science goal. They are particularly appropriate for any comparison of extremely dissimilar groups over a large redshift baseline. However, in small volume limited samples a simple median correction factor might be desirable. This is particularly true at small redshift, where the asymptotic nature of the plane function used could produce spurious results.
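The fidelity comparison of Figs. 11 and 12 amounts to checking whether the logarithmic errors on mass and luminosity are correlated, and how many groups fall within the factor 2/5/10 accuracy lines. The following is a minimal sketch of that diagnostic under stated assumptions: the function name and the choice of the Pearson coefficient are illustrative and are not taken from the paper.

```python
import numpy as np


def recovery_diagnostics(m_fof, m_true, l_fof, l_true, factors=(2, 5, 10)):
    """Summarise how well group masses and luminosities are recovered.

    Returns the Pearson correlation of the two log-error vectors and, for
    each accuracy factor f, the fraction of groups recovered to within a
    factor f in mass and in luminosity.
    """
    # Log ratios are the natural axes for "within a factor of f" statements.
    x = np.log10(np.asarray(m_fof, dtype=float) / np.asarray(m_true, dtype=float))
    y = np.log10(np.asarray(l_fof, dtype=float) / np.asarray(l_true, dtype=float))
    # A coefficient near -1 would signal the worrying anti-correlated case,
    # i.e. masses too small exactly when luminosities are too large.
    corr = float(np.corrcoef(x, y)[0, 1])
    within = {
        f: (
            float(np.mean(np.abs(x) <= np.log10(f))),
            float(np.mean(np.abs(y) <= np.log10(f))),
        )
        for f in factors
    }
    return corr, within
```

A correlation coefficient close to zero, together with high fractions inside the factor 2 lines, corresponds to the well-behaved situation described in the text.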
Figure 12. Comparison of the fidelity of the recovered group mass (x-axis) against the group luminosity (y-axis). The left panel uses only a global median correction for mass and luminosity, optimised for $N_{\rm FoF} \ge 5$ ($A = 10.0$, $B = 1.04$) […]

The accuracy with which the galaxy composition of a group is recovered is a distinct issue, but nevertheless equally important as the precise recovery of intrinsic group properties, as considered in § […] § […] $Q_{\rm tot}$, as defined by Eq. 14 in § […] $Q_{\rm tot}$ and $E_{\rm tot}$ vary within different group subsets for best matching FoF/halo mock groups. The grouping optimisation was not done with the whole sample, rather only groups with $N_{\rm FoF} >$ […] $N_{\rm FoF}$ […] 4) did not drive the optimisation, but demonstrate the consequence of it. The parameter that best constrains the group quality is the multiplicity, where the spread in observed grouping quality reduces for higher multiplicity systems. The most accurate groups tend to be at redshifts $z \sim$ […] $N_{\rm FoF}$ is high (middle panels of Fig. 14), while the quality of the groups is, on average, quite constant with $N_{\rm FoF}$ (middle panel of Fig. 13). Bijection and quality are obviously related, and these results should be interpreted as low multiplicity groups possessing a large amount of scatter in the quality of grouping, meaning that they can be scattered below the quality limits required for a successful bijection even though the median quality is quite high. Higher multiplicity systems possess less intrinsic scatter in the quality of grouping, meaning they are very rarely scattered below the bijection limits, and consequently the average bijection fraction remains higher.

The exception to this is that the lowest mass groups appear to be the most accurately recovered, even though most observed have masses $M \sim$ […] $h^{-1}\,M_\odot$. This can be understood when careful attention is paid to how the FoF algorithm constructs the groups. It creates upper limits for the allowed difference in either the radial (velocity) or tangential (physical) separation between galaxies. Groups that are constructed from galaxies at the limit of the allowed separations must be larger in terms of projected radius and observed velocity dispersion than groups with galaxy separations well within these limits. This means they will have larger dynamical masses and, assuming interlopers are spread uniformly in space, they will have a lower $Q_{\rm tot}$, since they cover a larger volume in redshift space and so are more likely to include interlopers.
This is an interesting effect of the grouping, because although the masses measured are likely to be too small the actual groups are extremely secure.

With this understandable effect in mind, different methods for estimating the intrinsic $Q_{\rm tot}$ using observed linking characteristics were investigated. The most successful proved to be calculating the following for each group:

$$L_{\rm proj} = \frac{1}{N_{\rm links}} \sum_{i=1}^{N_{\rm FoF}} \sum_{j=1}^{N_{\rm FoF}} \left[ 1 - \frac{\tan\theta_{i,j}\,(D_{{\rm com},i} + D_{{\rm com},j})}{b_{i,j}\,(D_{{\rm lim},i} + D_{{\rm lim},j})} \right] \delta_{i,j}, \qquad (24)$$

where $\delta_{i,j}$ is unity if $i$ and $j$ are directly linked (and zero otherwise), while all other terms are as described in Eq. 1. Hence the sum is done over allowed links within the group ($N_{\rm links}$), which has a limit of $N_{\rm FoF}(N_{\rm FoF} -$ […] $L_{\rm proj}$ correlates loosely with $Q_{\rm tot}$. Interestingly, the equivalent statistic measuring the radial linking shows very little correlation with $Q_{\rm tot}$. This means that outliers tend to fit quite comfortably in velocity space, but look anomalous in projection. To aid the selection of high-fidelity groups, $L_{\rm proj}$ will be released along with the group catalogue.

So far in this work we have made the implicit assumptions that the mocks are to a large extent a good representation of the real Universe and that optimising the grouping algorithm to recover mock groups as accurately as possible will have the desired effect of also returning the best groups from the GAMA data. Clearly we should be wary of the effects of over-tuning our algorithm to the mocks, especially given the limitations listed in § […]

[…] were applied to the $r_{AB}$ […] mock$_+$.
2) reducing all galaxy peculiar velocities along the line of sight by 10%, creating groups that are more compact in velocity space than the default mocks: mock$_-$.
3) convolving all galaxy peculiar velocities along the line of sight with a Gaussian velocity distribution of width $\sigma = 50$ km s$^{-1}$, mimicking the GAMA velocity errors: mock$_\sigma$.

The first two sets of mocks, mock$_+$ and mock$_-$, test the sensitivity of the grouping to the fidelity with which small scale redshift space distortions are accounted for in the mocks. From Kim et al. (2009) (and Norberg et al. in prep) we know that the Bower et al. (2006) semi-analytic galaxy formation model does not reproduce very accurately the redshift space clustering on $h^{-1}$ Mpc scales and smaller. By systematically modifying the peculiar velocities by $\pm 10$% while keeping the same FoF grouping parameters, we attempt to address this mismatch between data and mocks and measure how sensitive the grouping is to such differences. From Norberg et al. (in prep) we expect that an additional velocity bias of $\sim +10$% to the mock galaxies should be enough to reconcile the redshift space clustering of the mocks and the data. The third set, mock$_\sigma$, tests the sensitivity of the grouping to velocity errors, which were not considered in the nominal mocks described in § […]

Figure 13. Total group quality ($Q_{\rm tot}$) as a function of group redshift ($z_{\rm FoF}$), group multiplicity ($N_{\rm FoF}$) and group mass ($M_{\rm FoF}$). Each panel presents a specific subsample of groups, as indicated by the key. Solid lines represent the moving median for $r_{AB}$ […]

Figure 14. Bijective group fraction ($E_{\rm tot}$) as a function of group redshift ($z_{\rm FoF}$), group multiplicity ($N_{\rm FoF}$) and group mass ($M_{\rm FoF}$). Each panel presents a specific subsample of groups, as indicated by the key. Solid lines represent the moving median for $r_{AB}$ […]

Figure 15.
Comparison of the observed linking strength $L_{\rm proj}$ with the intrinsic group quality $Q_{\rm tot}$. The colour of each data point represents the group multiplicity, going from $N_{\rm FoF} = 5$ (red) to $N_{\rm FoF} = 200$ (blue). The correlation is strongest for low multiplicity systems, which is important since it is these that can be pathologically bad. The black line is the linear regression fit to the entire data, so it predominantly describes the lower multiplicity systems.

[…] mocks might have on the grouping is on the group assignments themselves, so $S_{\rm tot}$ was calculated for all 3 varieties of new mocks where the reference mock data is now the original mock lightcone. This means we are only analysing how similar the new mock FoF groupings are to the original set, not to the “true” mock groupings. $S_{\rm tot}$ is found to be $\sim 0.97$ for all three varieties of mock perturbation for $N_{\rm halo} \ge 2$, and only drops slightly for $N_{\rm halo} \ge 20$, which shows the greatest discrepancy. In this regime mock$_+$ has $S_{\rm tot} = 0.94$, mock$_-$ has $S_{\rm tot} = 0.96$ and mock$_\sigma$ has $S_{\rm tot} = 0.$[…] mock$_-$ and mock$_+$ will require slightly different scaling relations to recover unbiased halo masses. The global mass scaling factor (where $N_{\rm FoF} \ge 5$) for mock$_-$, $A_-$, needs to be 11.6, so 16% larger than $A$, while $A_+$ needs to be $\sim$ […].7, so 15% smaller than $A$. This implies that we have an underlying systematic uncertainty of at least 15% on all masses, assuming we expect the true physics to vary the galaxy velocities at the 10% level. Naively we might have expected the difference to be at the $\sim 20$% level since $1.1^2 = 1.21$, but the random nature of peculiar velocities and the slight variation in grouping conspire to reduce the variation.

For mock$_\sigma$ we require exactly the same global scaling relation as before, i.e. $A_\sigma = A = 10.0$. This implies that removing the velocity error in quadrature is the correct procedure, and means we certainly do not expect the uncertainty in radial velocities to have a significant effect on the implied masses.

The implications for the group luminosities are, as expected, very marginal w.r.t. these modifications of the mocks, which is a result of the grouping still being rather good for all three sets of mocks (as evidenced by the marginal change in $S_{\rm tot}$) despite the algorithm not being tuned to them.
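Two pieces of arithmetic in the passage above can be made concrete with a short sketch: the removal of a velocity error in quadrature, and the naive expectation that a 10% change in velocities changes a mass that scales as $\sigma^2$ by about 21%. The function names are hypothetical, and flooring the corrected dispersion at zero is an assumption consistent with the text's later mention of a corrected $\sigma$ of 0 km s$^{-1}$; this is not the catalogue's actual code.

```python
import math


def correct_dispersion(sigma_obs, sigma_err):
    """Remove a velocity error from an observed dispersion in quadrature.

    Both arguments are in km/s. If the error exceeds the observed value the
    corrected dispersion is floored at zero (an assumption for this sketch).
    """
    variance = sigma_obs**2 - sigma_err**2
    return math.sqrt(variance) if variance > 0.0 else 0.0


def naive_mass_ratio(velocity_scale):
    """Mass change expected if all velocities are rescaled, given M ~ sigma^2."""
    return velocity_scale**2


# A group observed at 130 km/s with a 50 km/s velocity error:
# sqrt(130^2 - 50^2) = sqrt(14400) = 120 km/s.
corrected = correct_dispersion(130.0, 50.0)

# Velocities boosted by 10% give a naive mass increase of 1.1^2 = 1.21.
boost = naive_mass_ratio(1.1)
```

The measured changes in the scaling factor (16% and 15%) are smaller than this naive 21%, which is the point the text makes about random peculiar velocities and small regrouping differences.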
Having run extensive optimisations and calculated refinements based on the mock catalogues, the algorithm was run over the real GAMA data. In total, taking the deepest version of each GAMA survey region possible, 14,388 groups were formed containing 44,186 galaxies out of 110,192 galaxies in our volume limited selection, meaning 40% of all galaxies are assigned to a group. This is almost identical to the average grouping rate found in the mocks, also 40%.

The headline group number statistics are listed in Table 7 for each of the GAMA regions, i.e. G09, G12 and G15.

                     r_AB ≤ 19.0                               r_AB ≤ 19.4                               r_AB ≤ 19.8
                     G09    G12    G15    Mock (low, high)     G09    G12    G15    Mock (low, high)     G12    Mock (low, high)
N_group 2–4          2051   2409   2436   2334 (3154, 4100)    3334   3703   3776   3623 (3154, 4100)    5687   5520 (4861, 6101)
N_group 5–9          190    233    234    253 (188, 294)       329    395    339    390 (322, 455)       539    584 (509, 661)
N_group 10–19        45     55     59     66 (43, 82)          75     79     102    102 (69, 133)        121    155 (98, 189)
N_group 20+          8*     16     16     26 (15, 39)          17*    26     25     40 (20, 55)          44     62 (34, 88)
z_group 0–0.1        419    577    512    531 (318, 856)       514    705    597    634 (379, 1028)      857    746 (437, 1204)
z_group 0.1–0.2      973    1369   1450*  1144 (803, 1381)     1338   1829   1967*  1552 (1076, 1841)    2331   2024 (1424, 2424)
z_group 0.2–0.3      725    640    633    814 (606, 996)       1372   1217   1198   1377 (1074, 1683)    1997   2124 (1683, 2584)
z_group 0.3–0.5      178    127    100*   189 (125, 258)       531    452    480    593 (421, 730)       1206   1426 (1044, 1708)
Total                2294   2713   2745   2678 (2204, 3107)    3755   4203   4242   4156 (3578, 4728)    6391   6321 (5535, 7025)

Table 7. Number of galaxy groups as a function of multiplicity, redshift and survey depth. The GAMA groups are split by GAMA regions, i.e. G09, G12 and G15. For the mocks, the mean number of groups between all 9 mock GAMA lightcones in a single GAMA region of $\sim 48$ deg$^2$ is listed together with their low and high extreme across all mocks (within brackets). Samples with an asterisk are those which are outside the min-max range of the mocks.

[…] ($z < 0.1$) where the mean galaxy number density is the highest, such voids are still very evident in the GAMA data.

We still see groups of significant size ($N_{\rm FoF} \sim 20$) beyond a redshift of 0.3 in G09, and there is evidence of filamentary structure in the underlying galaxy population beyond $z \sim$ […] $z_{\rm group}$ […] 0.2. The black points show the location of individual galaxies, and as expected the groups closely trace overdensities seen in the galaxy distribution. Intriguingly, we see evidence of extremely fine filamentary structure that is not associated with any of the defined groups. If these structures were purely radial in direction then they could be claimed as misidentified systems, for which the filamentary structure merely betrays the velocity dispersion along the line of sight. Instead we witness gentle sweeping arcs that move round steadily radially and in projection, implying that they are real fine filamentary structure that connects group nodes. This is probably one of the first times that the galaxy distribution is seen to mimic so closely the filamentary distribution which is so commonly seen in large dark matter dominated numerical simulations.

The most striking of these filaments can be found in the top-right panel of Fig. 17, where fine strands can be seen extending out from $\alpha \sim$ […], $z \sim 0.18$, and also from $\alpha \sim$ […], $z \sim 0.19$. In both of these cases it is possible to identify group and cluster nodes that connect the filaments together, but there are no groups detected within the filaments themselves. It is important to highlight that without GAMA redshifts these regions would previously have been identified as void-like, and that the additional galaxies are not randomly distributed ‘field’ galaxies, but appear to be in extremely well defined environments that are nevertheless non-grouped w.r.t. the GAMA mean galaxy number density.

After considering the spatial distribution of GAMA galaxy groups, Fig. 18 shows the distributions of four basic properties of the GAMA galaxy group catalogue (G3Cv1): the observed group multiplicity, mass, velocity dispersion and radius distributions. We now discuss them in turn.

The top left panel of Fig. 18 presents the distribution of group multiplicities for three survey depths (coloured solid lines) to be compared to the equivalent average mock multiplicity distributions (dashed lines). Unsurprisingly the raw number of groups increases with survey depth, explaining why the three coloured curves are ordered as a function of survey depth, i.e. $r_{AB} \le 19.0$, $19.4$ and $19.8$. More importantly, the number of high multiplicity systems is significantly different between data and mocks, a result already seen in Table 7, while their numbers are much more similar for low multiplicity systems. The difference at the high multiplicity end is important and puts key constraints on the galaxy formation model used. The group multiplicity distribution is mostly sensitive to the Halo Occupation Distribution (HOD), as for a given number of haloes the group multiplicity distribution is entirely dependent on its HOD. A known feature of the GALFORM Bower et al. (2006) galaxy formation model is its tendency to populate the more massive haloes with an excess of faint satellite galaxies (e.g. Kim et al. 2009).

The top right panel of Fig. 18 presents the distribution of group masses for three survey depths (coloured solid lines) to be compared to the equivalent average mass distributions from the mocks (dashed lines). For the comparison to be as fair as possible, the group masses used for the mocks are estimated in exactly the same way as for the data. Because velocity uncertainties have not been included in the mocks it is essential to remove from this comparison all groups whose velocity dispersion estimate is significantly affected by this uncertainty, as the group mass is proportional to $\sigma^2$ (see Eq. 18) and this would bias the distribution. To achieve this we simulated mock$_\sigma$ groups with 80 km s$^{-1}$ velocity error and calculated the velocity dispersion at which more than 95% of the population should be robust to being scattered below the presumed GAMA group velocity error (which would give a corrected $\sigma$ of 0 km s$^{-1}$). This velocity dispersion limit was found to be 130 km s$^{-1}$.
Thus the top-right panel only shows a comparison of groups where this selection has been applied.

The agreement between data and mocks beyond $\sim$ […] $h^{-1}\,M_\odot$ is remarkably good for all survey depths, with possibly only the normalisation being slightly lower for GAMA data than for the mocks (however within the typical scatter expected from sample variance). The relative profiles are all very similar. We note that this mass distribution has been convolved with the error distribution on the group masses, which have been estimated using a single correction factor ($A = 10$). This explains why unrealistically large group masses are found (e.g. greater than $10^{[\ldots]}\,h^{-1}\,M_\odot$). More detailed work on estimating the group masses is underway (Alpaslan in prep).

The bottom left panel of Fig. 18 presents the distribution of group velocity dispersions for three survey depths (coloured solid lines) to be compared to the equivalent average group velocity dispersion distributions from the mocks (dashed lines). For the comparison to be as fair as possible, the velocity dispersion used for the mocks is estimated in exactly the same way as for the data. Because velocity uncertainties have not been included in the mocks, it is essential to remove from this comparison all groups for which the velocity dispersion estimate is significantly affected by this uncertainty. This can be straightforwardly done by ignoring groups with $\sigma$ below 130 km s$^{-1}$ (as discussed above). Beyond that limit in the velocity dispersion distribution, the data and mock distributions are very comparable, showing yet again how closely matched the mocks and the data are. For smaller velocity dispersion systems a more careful modelling of the velocity errors (and hence velocity dispersion errors) is needed before any conclusions can be drawn on how appropriate the mocks are. Work is currently ongoing within GAMA to better understand the precise nature, and distribution, of the redshift velocity errors. A full comparison is deferred until these errors have been fully characterised.

Finally, the bottom right panel of Fig. 18 presents the distribution of group radius for three survey depths (coloured solid lines) to be compared to the equivalent average group radius distributions from the mocks (dashed lines). Considering the full sample of groups, the mocks and the data seem to be very comparable.

Figure 16. Redshift space position of GAMA galaxy groups projected onto the equatorial plane, split by survey area and with symbol size reflecting the group multiplicity and symbol colour the group velocity dispersion (see figure keys for exact values). G09 and G15 are for a survey depth of $r_{AB} \le 19.4$, while G12 is for $r_{AB} \le 19.8$, explaining why the number of groups detected at higher redshifts is larger in G12 compared to G09 and G15. At low redshifts, where the projection effects are the smallest, groups are still visually strongly associated with the filaments and nodes of the larger scale cosmic structure. Fewer groups are found at higher redshift, a result of the GAMA survey being magnitude limited.

Figure 17. Four one degree wide declination slices of the GAMA G12 region covering the $0.$[…] $< z < 0.20$ redshift range. Declination increases from left to right and top to bottom, as indicated by the panel key. Galaxies are shown with black dots, and galaxy groups with the same symbols as in Fig. 16.

Figure 18.
Global group properties of the GAMA galaxy group catalogue (G3Cv1) compared to the corresponding mock group catalogue: group multiplicity distribution (top left), dynamical group mass distribution limited to $\sigma_{\rm FoF} > 130$ km s$^{-1}$ (top right), group velocity dispersion distribution limited to $\sigma_{\rm FoF} > 130$ km s$^{-1}$ (bottom left) and group radius distribution (bottom right). Solid (dashed) lines are for GAMA (mock) for $r_{AB}$ […]

To investigate in more detail where differences between the GAMA data and the mocks may reside we divided the mass, velocity dispersion and radius distributions into multiplicity subsets (Fig. 19). For clarity, Fig. 19 only uses the $r_{AB} \le 19.4$ […] $N_{\rm FoF}$ […] $M_{\rm FoF}$ […] $M_\odot$, where we see excess number counts for the mock groups. This difference is most evident for $5 \le N_{\rm FoF} \le 9$. The most likely explanation for this low mass excess comes from the finding that mock groups are typically more compact than GAMA groups, which will naturally cause a lower estimation of the mass. The radial discrepancies are discussed in more detail below.

The velocity dispersion (middle panel of Fig. 19) only shows strong evidence of a normalisation offset, where the agreement is excellent for low multiplicity systems but as the multiplicity increases we find the GAMA groups have a general count deficit. Since the strength of the normalisation offset varies with multiplicity the difference cannot be simply due to sample variance, where all multiplicity subsets would betray the same deficit.

The differences between GAMA and the mocks are most pronounced for the group radius (bottom panel of Fig. 19). The most significant deviations are seen where Rad […] $h^{-1}$ Mpc: GAMA finds many fewer systems, and the effect is much more significant for higher multiplicities, where the mocks contain a significant excess of compact systems not seen at all in the data. At the GAMA median redshift ($z \simeq$ […]) […] $h^{-1}$ Mpc (comoving) radius corresponds to an angular separation of 25″ on the sky. Whilst the simplest explanation might be that the GAMA survey suffers from significant close pair incompleteness, Fig. 19 of Driver et al. (2011) suggests this is not the case: GAMA is better than 95% complete for systems with up to 5 neighbours within 40″ (on the sky). These separations are much larger than the expected optical confusion limit (1–2″), so photometric bias (i.e. close pairs not being deblended) cannot explain the discrepancies we find. Since the main variance witnessed for velocity dispersions between the mocks and GAMA data is the normalisation, the more compact mock groups appear to be the origin of the low mass population we find in the top panels of Fig. 19.

The differences seen in Fig. 19 could well be due to limitations in the physics implemented in the GALFORM Bower et al. (2006) semi-analytic galaxy formation model, where the exact distribution of galaxies within a halo depends on their dynamical friction timescale and on which dark matter particle the galaxy was originally associated with. Despite the high numerical resolution of the Millennium simulation, the vast majority of the satellite galaxies in the galaxy formation model are not resolved in sub-haloes, implying that their merging timescales are governed by an analytic calculation and their position is given by the most bound dark matter particle of their parent halo. A consequence of too long a merging timescale is an overabundance of galaxies at small distances from the centre of the halo. This, together with the definition of group radius adopted for this work (i.e. Rad), is the most likely explanation for the apparent excess of compact groups in the mocks compared to the data. This has the consequence of also creating a deficit of low mass groups in the GAMA data in comparison to the mocks, since the dynamical masses are directly proportional to the group radius measured.

In summary, the GAMA group catalogue (G3Cv1) and its mock counterpart are similar in many respects, but not all. In the discussion of Fig. 18 and Fig. 19 it has become clear that G3Cv1 is already providing new constraints on the galaxy formation model used to construct the mocks, and these will be implemented in the next generation of mocks. Investigating the discrepancies between GAMA and mock group catalogues, and the impact this has on any measured HMF, is a complex and important task. A full analysis is deferred to a GAMA paper in preparation, which will present a more in-depth analysis of a series of statistically equivalent mocks as well as galaxy formation based mocks as used here. Only with a large variety of mocks will it be possible to put realistic constraints on the underlying dark matter model. The analysis in the present paper is entirely limited to one family of mock realisations, which explains why the constraints from the GAMA groups are so far mostly limited to possible constraints on the galaxy formation model rather than on the underlying dark matter physics.
For every group we create an rgb image as a $K_{AB}$-$r_{AB}$-$u_{AB}$-band composite, along with visual diagnostics that allow interesting features to be easily identified. Example images are shown in Fig. 20, Fig. 21 and Fig. 22 and discussed hereafter.

Fig. 20 highlights 4 cluster-scale groups extracted from the GAMA data. The top panels show relatively low redshift clusters with high multiplicities, whilst the bottom panels are examples of low multiplicity groups that show evidence for many associated galaxies that are fainter than the GAMA survey limits (shown by a dashed red line on the luminosity distribution plotted in each panel). All of these groups are quite circularly symmetric and concentrated towards the centre, both of which are indicators of a well virialised population of galaxies.

Fig. 21 shows groups at radically different stages of evolution. The top panels show examples of fossil groups with one exceptionally dominant BCG. In both cases only the BCG had a known redshift before GAMA, and the large peak in the redshift distribution suggests particularly strong radial linking, an indication that the grouping is reliable. The bottom panels show groups with very loose association in comparison. Both groups are quite massive (in the cluster regime) and have identifiable background galaxies, but neither exhibits a centrally concentrated distribution of galaxies or a dominant BCG. Both of these groups have a relatively uniform redshift distribution, showing none of the strong central peak seen for the fossil groups in the top panels. The bottom-right group in particular has a very flat luminosity distribution and an extremely non-circular distribution of galaxies. The most likely scenario is that this group has two distinct sub-structures (top and bottom) collapsing into each other, where the bottom structure is physically nearer to us in space and thus exhibits a large extra component of recessional velocity towards the CoM.

Fig. 22 shows particularly pleasing examples of galaxy-galaxy merging/interactions. A natural outcome of the GAMA group catalogue is that nearly all possible close pairs will be grouped (modulo a very small amount of incompleteness). Often these merging systems will be found in higher multiplicity systems, but here are examples of two member groups that exhibit evidence for mergers. The top-left and top-right panels show quite similar looking systems: a red (likely passive) galaxy interacting with a blue (late-type) galaxy. The top-left panel has larger tidal tails and more of the flux is in the late-type system, suggesting it is at an early stage of the merging process. The top panels are examples where the multi-pass nature of GAMA has overcome the problems of fibre collisions to give us redshifts for both galaxies in the merging system. The bottom panels show merging systems that are both too faint and too close to be obtainable with SDSS data. The bottom-left panel system appears to be a triple merger system, where the blue galaxy to the right does not have a GAMA redshift because it is too faint. The bottom-right panel shows two extremely faint and relatively $u$-band bright galaxies merging: a tidal connection can be seen between them. In both of these bottom panels the groups in question have extremely low velocity dispersions ($\sim 45$ km s$^{-1}$) and very low implied dynamical masses ($\sim$ […] $h^{-1}\,M_\odot$).

In such systems dynamical friction is acting in such a manner that the dynamical mass will likely not be a good indicator of the intrinsic halo mass; rather it highlights a system where the energy has been transferred from group scale kinematics (energy in galaxies) to galaxy scale kinematics (energy in the stars/gas). Dynamical friction conspires to reduce the velocity difference and physical distance between merging galaxies, and since we use $M_{\rm FoF} \propto \sigma^2 R$ this will also reduce the implied dynamical mass that we measure.

Figure 19. Distribution of GAMA and mock galaxy group mass (top panels), velocity (middle panels) and radius (bottom panels), for a survey depth of $r_{AB} \le 19.4$. GAMA is shown in red while the mocks are in grey. Multiplicity subsets are as stated in each panel. For the mass and velocity panels the mocks and GAMA data are limited to $\sigma > 130$ km s$^{-1}$, required to avoid the effects of velocity errors in the GAMA data biasing the results. For the mass and velocity plots the clearest differences are normalisation offsets, and for $N_{\rm FoF} >$ […] $M_{\rm FoF}$ and $\sigma_{\rm FoF}$ for a given multiplicity subset. The distributions are significantly different for compact systems (Rad […] $h^{-1}$ Mpc) with $N_{\rm FoF} \ge 5$, where GAMA groups are less compact in projection. This effect becomes more significant for higher multiplicity subsets.

Figure 20. Top panels show two cluster scale groups confirmed spectroscopically. Bottom panels show low multiplicity groups with significant, possibly associated, background galaxies. The rgb image is a $K_{AB}$-$r_{AB}$-$u_{AB}$-band composite. The size of the circle marking group members scales with the $r_{AB}$-band flux and its colour reflects the galaxy $u_{AB} - r_{AB}$ colour. A galaxy redshifted w.r.t. the group median redshift has a red upwards pointing line whose length scales with the velocity difference, while for a blueshifted one the line is blue and points downwards. The rings represent the 50th, 68th and 100th percentiles of the radial galaxy distributions relative to the iterative group centre. The velocity PDF smoothed with a Gaussian kernel of width $\sigma = 50$ km s$^{-1}$ (the typical GAMA velocity error) is shown on the left of each panel, where the group median is shown with a green dashed line and the BCG with a black dashed line. The bottom plot presents the raw absolute $r_{AB}$ magnitude distribution of the group, with the effective GAMA survey limit shown with a red dashed line, the group median absolute magnitude with green, and the BCG absolute magnitude with black.

The generation of a group catalogue produces a myriad of outputs, most of which are not of interest to the typical user. To ease interpretation for the average user, a deliberately simplified set of outputs will be made available. For each GAMA region two tables are released. The first is a two column link list that identifies which group every galaxy belongs to. The second is a table of group properties with the most important attributes of each group. This includes the group radius Rad, the velocity dispersion $\sigma_{\rm FoF}$, and the implied dynamical mass.
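The two released tables can be pictured as a link list keyed by galaxy and a property table keyed by group. The sketch below only illustrates that structure: the identifiers, field names (`rad`, `sigma_fof`, `mass`) and all numerical values are made up for the example and do not come from the catalogue, whose actual column names are not given here.

```python
from collections import defaultdict

# Hypothetical link list: (galaxy_id, group_id) pairs, one row per galaxy.
link_list = [(101, 1), (102, 1), (103, 1), (201, 2), (202, 2)]

# Hypothetical group property table keyed by group_id. Units assumed for
# illustration: rad in Mpc/h, sigma_fof in km/s, mass in Msun/h.
group_table = {
    1: {"rad": 0.35, "sigma_fof": 310.0, "mass": 2.3e13},
    2: {"rad": 0.05, "sigma_fof": 45.0, "mass": 7.0e10},
}


def members_by_group(links):
    """Invert the link list into group_id -> list of member galaxy ids."""
    members = defaultdict(list)
    for galaxy_id, group_id in links:
        members[group_id].append(galaxy_id)
    return dict(members)


def properties_of_galaxy(galaxy_id, links, groups):
    """Look up the properties of the group hosting a given galaxy.

    Returns None for galaxies that are not in the link list (ungrouped).
    """
    lookup = dict(links)  # galaxy_id -> group_id
    group_id = lookup.get(galaxy_id)
    return None if group_id is None else groups[group_id]


members = members_by_group(link_list)
# The group multiplicity follows directly from the link list.
multiplicity = {group_id: len(ids) for group_id, ids in members.items()}
```

Keeping membership and properties in separate tables means a per-galaxy analysis only needs the link list, while group-level selections (for example on velocity dispersion) only need the property table.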
Other metrics related to eachgroup are also calculated to aid the analysis and interpre-tation of individual grouping quality. As well as the L proj linking quality discussed above, the kurtosis of the radialseparation of all galaxies from the group centre is calcu-lated and the ‘modality’ of the system is also computed us- c (cid:13) , 1–31 AMA: The GAMA Galaxy Group Catalogue (G Cv1) Figure 21.
Top panels are potential fossil-groups, where the BCG is at least 2 magnitudes brighter than the second ranked galaxy inthe r AB -band (in the case of the top-right groups the second rank galaxy is nearer in magnitude than this, but it is separated a largedistance in projection). Bottom panels show groups with complex in-fall structure. See Fig. 20 for figure description. ing (1 + skewness ) / (3 + kurtosis ). This will be 1/3 for anormal distribution and 0.555 for a uniform, and is a usefulmetric since it does not just provide information on how non-Gaussian the velocity profile of each system is— it also pro-vides information on the whether the velocity profile is morecusped or cored than a Gaussian distribution. Additionally,in a similar manner to how the local over-density was cal-culated in a comoving cylinder centred around each galaxy,the local relative density is calculated for each group. This iscalculated using a comoving cylinder of radius 1 . h − Mpcand total radial depth of 36 h − Mpc, and gives a measureof how isolated the groups are relative to much larger scalestructure.Finally, as a separate but useful output from creatingthe GAMA galaxy group catalogue, a full pair cataloguewill be released. This is a natural output of the galaxy–galaxy linking stage of the grouping algorithm, and in- cludes all pairs that are within a common velocity sepa-ration of 1000 km s − and a physical projected separationof 50 h − kpc. This will be used within the team for workinvolving the study of galaxy pairs. In this paper we have presented a new group catalogue basedon the spectroscopic component of the GAMA survey. TheFoF based grouping algorithm used has been extensivelytested on semi-analytic derived mock catalogues, and hasbeen designed to be extremely robust to the effects of out-liers and linking errors. The velocity dispersion and radius ofthe groups are median unbiased, even when allowing for thepossibility of catastrophic grouping errors. 
Globally, 77% of the recovered FoF groups bijectively (unambiguously) match a mock group, and 89% of all mock groups are bijectively recovered. The purity of all FoF groups is 80%, and for mock groups the equivalent figure is 73%. This suggests that the FoF algorithm is quite well balanced and does not have a strong preference for over-grouping or for conservatively recovering just the strongly bound core of systems.

Figure 22. Examples of ultra low-mass groups that are also excellent candidates for merging systems. The top plots are groups that are within the nominal SDSS $r_{\rm AB} < 17.77$ limit, but one or both galaxies are missing from that survey due to fibre collisions. The bottom plots are groups that are both too faint and too close together to be present in a spectroscopic SDSS catalogue. See Fig. 20 for figure description.

The overall number of groups within from 0 z . C will be made publicly available on the GAMA website ( ) as soon as the associated redshift data are made available. Interested parties should contact the author at [email protected] if they wish to make use of the group catalogue data before the full public release.

ACKNOWLEDGMENTS
We thank Vincent Eke for his helpful refereeing comments. These added clarity to various aspects of the paper. ASGR acknowledges STFC and SUPA funding that was used to do this work. PN acknowledges a Royal Society URF, an ERC StG grant (DEGAS-259586) and STFC funding. GAMA is a joint European-Australasian project based around a spectroscopic campaign using the Anglo-Australian Telescope. The GAMA input catalogue is based on data taken from the Sloan Digital Sky Survey and the UKIRT Infrared Deep Sky Survey. Complementary imaging of the GAMA regions is being obtained by a number of independent survey programs including GALEX MIS, VST KIDS, VISTA VIKING, WISE, Herschel-ATLAS, GMRT and ASKAP providing UV to radio coverage. GAMA is funded by the STFC (UK), the ARC (Australia), the AAO, and the participating institutions. The GAMA website is .