Non-Parametric Cell-Based Photometric Proxies for Galaxy Morphology: Methodology and Application to the Morphologically-Defined Star Formation -- Stellar Mass Relation of Spiral Galaxies in the Local Universe
M. W. Grootes, R. J. Tuffs, C. C. Popescu, A. S. G. Robotham, M. Seibert, L. S. Kelvin
arXiv [astro-ph.CO], Nov. Mon. Not. R. Astron. Soc., 1–39 (2002). Printed 18 September 2018 (MN LaTeX style file v2.2)
Max-Planck Institut für Kernphysik, Saupfercheckweg 1, 69117 Heidelberg, Germany
Jeremiah Horrocks Institute, University of Central Lancashire, Preston PR1 2HE, UK
ICRAR, The University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
SUPA School of Physics & Astronomy, University of St. Andrews, North Haugh, St. Andrews KY16 9SS, UK
Observatories of the Carnegie Institution for Science, 813 Santa Barbara Street, Pasadena, CA 91101, USA
Institut für Astro- und Teilchenphysik, Universität Innsbruck, Technikerstrasse 25, 6020 Innsbruck, Austria
Accepted ???? Received ????; in original form ????
ABSTRACT
We present a non-parametric cell-based method of selecting highly pure and largely complete samples of spiral galaxies using photometric and structural parameters as provided by standard photometric pipelines and simple shape fitting algorithms. The performance of the method is quantified for different parameter combinations, using purely human-based classifications as a benchmark. The discretization of the parameter space allows a selection markedly superior to that of commonly used proxies relying on a fixed curve or surface of separation. Moreover, we find structural parameters derived using passbands longwards of the g band and linked to older stellar populations, especially the stellar mass surface density $\mu_*$ and the r band effective radius $r_e$, to perform at least as well as parameters more traditionally linked to the identification of spirals by means of their young stellar populations, e.g. UV/optical colours. In particular the distinct bimodality in the parameter $\mu_*$, consistent with expectations of different evolutionary paths for spirals and ellipticals, represents an often overlooked yet powerful parameter in differentiating between spiral and non-spiral/elliptical galaxies. We use the cell-based method for the optical parameter set including $r_e$ in combination with the Sérsic index $n$ and the i-band magnitude to investigate the intrinsic specific star-formation rate - stellar mass relation ($\psi_* - M_*$) for a morphologically defined volume limited sample of local universe spiral galaxies. The relation is found to be well described by $\psi_* \propto M_*^{-[\ldots]}$ over the range of $10^{[\ldots]}\,M_\odot \leq M_* \leq 10^{[\ldots]}\,M_\odot$ with a mean interquartile range of 0.[…]

Key words: galaxies: fundamental parameters – galaxies: spiral – galaxies: structure – galaxies: photometry.
With the advent of large optical photometric ground- and space-based surveys which are ongoing or commencing (The Sloan Digital Sky Survey (SDSS; York et al. 2000; Abazajian et al. 2009), The Galaxy And Mass Assembly Survey (GAMA; Driver et al. 2011), SKYMAPPER (Keller et al. 2007), The VST Atlas, The Kilo Degree Survey (KiDS; de Jong et al. 2012), The Dark Energy Survey (DES; The DES collaboration 2005)) or scheduled to commence in the next years (e.g., EUCLID; Laureijs et al. 2011), the number of extragalactic sources with reliable, uniform data is increasing dramatically, further opening the door to statistical studies of the population of galaxies, both at local and intermediate redshifts.

To first order, the visible matter distributions of galaxies may be classified as being best described either as an exponential disk, i.e. a largely rotationally-supported system, or a spheroid, i.e. a largely pressure-supported system. This dichotomy forms the basis of the standard morphological categorization of galaxies into late-types/spirals and early-types/ellipticals, introduced by Hubble (1926) and in widespread use ever since. This basic morphological bimodality of the galaxy population appears to be mirrored in a range of physical properties, with late-type/spiral galaxies having blue UV/optical colours and showing evidence of star formation, on average, while early-type/elliptical galaxies appear red on average, and mostly only display a low level of star formation, if any at all (e.g. Strateva et al. 2001; Baldry et al. 2004; Balogh et al. 2004). However, a wide variety of exceptions to this rule exists.
For example, spiral galaxies may appear red due to the attenuation of their emission by dust in their disks, or a spiral may truly have very low star formation and red colours whilst maintaining its morphological identity, while, on the other hand, an elliptical galaxy may appear blue due to a localized recent burst of star-formation.

It is assumed that different modes of assembly of the stellar populations of these galaxy categories are responsible for the distinction. This, in turn, necessitates the ability to reliably identify and distinguish between the types of galaxies when investigating the physical processes determining galaxy formation and evolution on the basis of large statistical samples of galaxies. Furthermore, it is clear that in any investigation of galaxy properties for a given morphological class, the classification itself should not introduce a bias into the property being investigated. For example, a pure sample of spiral galaxies used to investigate star-formation as a function of galaxy environment must include the population of red, passively star-forming spiral galaxies.

Visual classifications of galaxy morphology by professional astronomers therefore remain the method of choice and the benchmark for robustly identifying the morphology of a galaxy. However, such classifications may suffer from biases arising from the individual performing the classification, and the uncertainty/robustness of the classification is difficult to quantify. Furthermore, in the case of marginally resolved data, even the ability of the human eye to identify morphological structure may be limited, so that the decreasing linear resolution as a function of redshift may introduce systematic biases. In such cases, quantitative photometric measures of the light-profile may be at least as reliable as human classifications.
The overriding fact which immediately stymies the visual classification by professionals of all sources in modern imaging surveys such as SDSS, however, is the size of the galaxy samples provided by the surveys, and accordingly the time required for classification. Thus, one is forced to develop alternative schemes for obtaining morphological classifications of large samples of galaxies.

Recently, in an attempt to circumvent the limitations in sample size, reduce the possibility of bias, and provide an objective measure of robustness, Lintott et al. (2008) have enlisted the help of 'citizen scientists' in visually classifying a large fraction of SDSS DR7 galaxies in the GALAXY ZOO project (Lintott et al. 2008, 2011), releasing a catalogue of probability-weighted visual classifications into spirals and ellipticals. Although demonstrably feasible, such an approach is nevertheless very time consuming, especially on large data-sets.

The often adopted alternative is to attempt an automatic classification of galaxies based on some proxy for a galaxy's morphology. These automatic classification schemes can be roughly divided into three categories: i) those relying on a detailed analysis of the full imaging products, ii) those using a wide variety of photometric and spectroscopic proxies, in combination with a sophisticated algorithmic decision process, and iii) those using one or two simple, usually photometric, parameters and a fixed or simply parameterized separator. Of course, hybrids between these categories also exist.

Examples of the first category include the Concentration, Asymmetry, and Clumpiness (CAS, Conselice 2003) parameters, derived directly from the data reduction and model fitting of the imaging data, as well as the Gini coefficient (Gini 1912; Abraham, van den Bergh & Nair 2003; Lotz, Primack, & Madau 2004) and the $M_{20}$ coefficient (Lotz, Primack, & Madau 2004). Forming a hybrid between this and the second category, Scarlata et al. (2007) have introduced the Zurich Estimator of Structural Types (ZEST) based on a principal component analysis of these and other model-independent quantities, which has been applied to various data sets. Examples of the second category are given by classification schemes based on neural networks (e.g. Banerji et al. 2010) and making use of support vector machines (Huertas-Company et al. 2011). Finally, the third category, which finds widespread use, includes, for example, the concentration index (Strateva et al. 2001; Stoughton et al. 2002; Kauffmann et al. 2003), the location in colour-magnitude space (Baldry et al. 2004), the Sérsic index (Blanton et al. 2003; Bell et al. 2004; Jogee et al. 2004; Ravindranath et al. 2004; Barden et al. 2005), the location in the NUV − r or u − r vs. log(n) plane (Kelvin et al. 2012; Driver et al. 2012), the location in the space defined by the SDSS $f_{deV}$ parameter (i.e., the fraction of a galaxy's flux which is fit by the de Vaucouleurs profile (de Vaucouleurs 1948) in the best fit linear combination of a de Vaucouleurs and an exponential profile) and the axis ratio of the best fit exponential profile, $q_{exp}$ (Tempel et al. 2011), and, in the case of high-z galaxies, the location in the (U − V) - (V − J) restframe colour-colour plane (Patel et al. 2012).

Overall, the advantages and disadvantages of the automatic schemes can also be categorized in a similar manner. Schemes in category i) ideally require well resolved imaging, which may be difficult to obtain for faint galaxies in wide field imaging surveys, even in the local universe. Furthermore they require detailed imaging products, often including intermediate data reduction products which are not archived, making an independent morphological classification very time consuming and/or computationally expensive, especially for large data sets. Schemes in category ii), on the other hand, require the implementation of a complex analysis algorithm in addition to the existence of a training set of objects with known morphologies, and may require assumptions about the nature of the statistical distribution of the parameters considered. Finally, for the third category, the simple parameterization must limit either the degree to which the selection recovers all members of a given morphological category, or the level at which the classification is robust against contamination, even for proxies which make use of structural information.

Furthermore, it should be noted that the majority of the methods considered make use of parameters linked directly to ongoing star-formation, and as such may introduce a bias into the star-formation properties of a selected galaxy sample.
For example, in category i), the clumpiness parameter in the CAS scheme traces localized current star formation in spirals, while in category ii) both the methods of Banerji et al. (2010) and Huertas-Company et al. (2011) make use of galaxy colours, and Banerji et al. (2010) use the texture of the imaging as well. Finally, in category iii) a range of simple proxies make use of the colour bimodality, linked to star-formation, of the galaxy population.

In the following we present a non-parametric method for selecting spirals based on combinations of two and three photometric and simple structural parameters. The method is based on a discretization of the parameter space spanned by the parameter combination, performed using an adaptive grid which increases the resolution in regions of high galaxy parameter space density. The division of the discretized parameter space into a spiral and a non-spiral subvolume is calibrated using the morphological classifications of GALAXY ZOO Data Release 1 (DR1; Lintott et al. 2011). We quantify the performance of each parameter combination in terms of completeness and purity, identifying those with the best performance, also investigating parameter combinations which make no use of properties directly linked to ongoing star-formation. This approach can be considered formally analogous to the classification of stars in discrete spectral classes as discussed in the review of Morgan & Keenan (1973).

We describe the data used in Sect. 2 and the method in Sect. 3. We then investigate the performance of the parameter combinations in Sect. 4 and compare the performance of our selection with other methods in Sect. 5. We discuss our results and the applicability of the method in Sect. 6, and apply the selection method to obtain a reliable sample of spirals as a basis for investigating the intrinsic scatter in the stellar mass - specific star-formation rate relation of this class of galaxies in Sect. 7. Finally, we close by summarizing our results in Sect. 8.
Throughout the paper we assume an $\Omega_M = 0.3$, $\Omega_\lambda = 0.[\ldots]$, $H_0 = 70\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$ cosmology.

Within this work we aim to investigate the efficacy and performance as proxies of various combinations of UV/optical photometric parameters for the morphological selection of spiral galaxies. To facilitate this comparison and broaden the range of possible proxies we have endeavored to create an unbiased sample of galaxies with as much available data as possible. We have selected all spectroscopic objects with
SpecClass = 2 (Galaxies) from the seventh data release (DR7) of SDSS (Abazajian et al. 2009) which lie within the GALEX MIS depth (1500 s; Martin et al. 2005; Morrissey et al. 2007) footprint. We have matched this sample to the catalogue of the MPA/JHU analysis of SDSS DR7 spectra (providing emission line fluxes) and to the catalogue of single Sérsic fits recently published by Simard et al. (2011) using the SDSS unique identifiers, and to the preliminary GALEX MIS depth unique NUV source galaxy catalogue GCAT MSC (Seibert et al., 2013 in prep.) using a 4 arcsec matching radius. Given the uncertainties involved with flux redistribution (e.g., Robotham & Driver 2011), we have chosen to treat only one-to-one matches between SDSS and GALEX as possessing reliable UV data.

Where multiple spectra are available for a single photometric object, we have used the spectrum corresponding to the MPA/JHU entry. Where multiple spectra from the MPA/JHU reductions are available, we have chosen the spectrum with the smallest redshift error. In order to obtain a reliable benchmark morphological classification, we have matched the sample to the GALAXY ZOO data release 1 (DR1) (Lintott et al. 2008, 2011) catalogue of visual, redshift debiased morphological classifications (Bamford et al. 2009; Lintott et al. 2011) using the photometric SDSS ObjId, limiting ourselves to local universe sources (redshift z ≲ […] Opticalsample), with a subsample of 114047 NUV detected, uniquely matched sources (referred to as the NUVsample). Finally we have cross-matched these samples to the catalogue of ∼14k bright SDSS DR4 (Adelman-McCarthy et al. 2006) galaxies with detailed morphological classifications of Nair & Abraham (2010). This results in a subsample of 6220 sources with two independent morphological classifications (which we refer to as the NAIRsample). 4470 sources in the NAIRsample have NUV detections, and we refer to this subsample as the
NUVNAIRsample.

We have retrieved Petrosian magnitudes, the foreground extinction, the $f_{deV}$ and $q_{exp}$ parameters from the SDSS photometric pipeline, and the Petrosian 50th ($R_{50}$) and 90th ($R_{90}$) percentile radii in the u, g, r, and i pass-bands from the SDSS database using CasJobs. To obtain total (Sérsic) magnitudes we use the algorithms for converting SDSS Petrosian magnitudes to total Sérsic magnitudes derived by Graham et al. (2005). The obtained magnitudes have been corrected for foreground extinction using the extinction values supplied by SDSS (derived from the Schlegel, Finkbeiner, & Davis (1998) dust maps). K-corrections to z = 0 have been performed using kcorrect_v4.2 (Blanton & Roweis 2007).

GALEX sources with NUV artifact flag indicating window or dichroic reflections have been removed from the sample. The FUV and NUV magnitudes of the matched GALEX sources have been corrected for foreground extinction using the Schlegel, Finkbeiner, & Davis (1998) dust maps and $A_{FUV} = 8.[\ldots]\,E(B-V)$ and $A_{NUV} = 8.[\ldots]\,E(B-V)$ following Wyder et al. (2007). (We note that the GCAT MSC includes a cut on S/N > […])

Photometric stellar mass estimates have been calculated from the extinction and k-corrected magnitudes using the g − i colour and the i-band absolute magnitude $M_i$ as

$$\log(M_*) = -0.68 + 0.70\,(g - i) - 0.4\,M_i + 0.4 \cdot 4.58, \qquad (1)$$

where the factor 4.58 is identified as the solar i-band magnitude, following the prescription provided by Taylor et al. (2011).

We make use of the emission line fluxes from the Hα, Hβ, [NII]6584, and [OIII]5007 emission lines, and of the underlying continuum flux for the Hα emission line. Using these data we calculate the Hα equivalent width and the Balmer Decrement. We use the Hα equivalent width as an independent observable in the investigation of possible biases in the morphological proxies for spiral galaxies, and the Balmer Decrement in the correction of observed UV photometry for the effects of attenuation due to dust using the prescription of Calzetti et al. (2000) (cf. section 7). The ratios of Hα to [NII]6584 and Hβ to [OIII]5007 are used to identify galaxies hosting an AGN following the prescription of Kewley et al. (2006). The emission line data are taken from the MPA/JHU analysis of the SDSS DR7 spectra (performed by Stephane Charlot, Guinevere Kauffmann, Simon White, Tim Heckman, Christy Tremonti, and Jarle Brinchmann). We calculate the Hα EQWs as the ratio of emission line to continuum flux. As the listed uncertainties are formal we multiply the uncertainties on the emission line fluxes by the factors listed on the website, in particular by 2.473 for Hα, 2.039 for [NII]6584, 1.882 for Hβ, and 1.566 for [OIII]5007. These factors have been determined by the MPA/JHU group using comparisons of duplicate spectra of objects within the sample. For sources with S/N < […]
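As an illustration of Eq. (1) and of the uncertainty rescaling described above, the following minimal Python sketch evaluates the photometric stellar mass and the Hα equivalent width. All function and variable names are hypothetical. The colour and magnitude coefficients 0.70 and 0.4 are assumed from the Taylor et al. (2011) prescription cited in the text, while the offset −0.68, the solar magnitude 4.58, and the error scale factors are the values quoted here. It is a sketch under these assumptions, not the pipeline used for this work.

```python
# Solar i-band absolute magnitude, as quoted in the text.
SOLAR_ABS_MAG_I = 4.58

# Multiplicative factors applied to the formal MPA/JHU line-flux uncertainties
# (values quoted in the text; the dictionary keys are hypothetical labels).
LINE_ERROR_SCALE = {
    "Halpha": 2.473,
    "NII6584": 2.039,
    "Hbeta": 1.882,
    "OIII5007": 1.566,
}


def log_stellar_mass(g_minus_i, abs_mag_i):
    """Return log10(M*/Msun) following the form of Eq. (1).

    Inputs are assumed to be extinction- and k-corrected. The coefficients
    0.70 and 0.4 are assumptions taken from the Taylor et al. (2011) relation.
    """
    return -0.68 + 0.70 * g_minus_i - 0.4 * abs_mag_i + 0.4 * SOLAR_ABS_MAG_I


def halpha_equivalent_width(line_flux, continuum_flux_density):
    """Equivalent width as the ratio of emission line flux to continuum flux."""
    return line_flux / continuum_flux_density


def rescaled_line_error(line, formal_error):
    """Scale a formal line-flux uncertainty by the factor listed for that line."""
    return LINE_ERROR_SCALE[line] * formal_error


if __name__ == "__main__":
    # Hypothetical galaxy with g - i = 1.0 and M_i = -21.0.
    print(round(log_stellar_mass(1.0, -21.0), 3))
    print(rescaled_line_error("Halpha", 2.0))
```

Writing the solar magnitude as an explicit constant, rather than folding it into a single zero point, mirrors the way Eq. (1) is presented and makes it easy to swap in a different passband calibration.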
In constructing the parameter combinations for use as proxies, we have made use of the structural information supplied by the simultaneous fits in the g- and r-band of single Sérsic profiles to SDSS photometry made available by Simard et al. (2011), performed using GIM2D. In particular, we have used the Sérsic index $n$, the single Sérsic effective radius $r_e$ (half-light semi-major axis) in the r-band, and the ellipticity $e$. Simard et al. (2011) find that multiple component fits are not justified for most SDSS sources given the resolution of the imaging, and similar issues will afflict other surveys as well. Therefore we have chosen to use the largely robust single Sérsic profile fits in this work. We note, however, that Bernardi et al. (2012) have recently argued that for the brightest sources two component fits are preferable over single Sérsic fits and that for these sources the sizes derived by Simard et al. (2011) are systematically too small. This will not affect the analysis presented here, as these sources form a minority of the population considered and the effect will be accounted for in the calibration of the proxies.

The GALAXY ZOO DR1 (Lintott et al. 2008; Bamford et al. 2009; Lintott et al. 2011) represents the largest and faintest sample of galaxies with morphological classifications based on visual inspection. We have employed these morphological classifications, specifically those of the sources with redshift debiased classifications as provided by Bamford et al. (2009), as a benchmark morphological classification. Such a debiased estimate is only possible for sources with spectroscopic redshifts. Rather than a binary classification, GALAXY ZOO DR1 provides a probability for the source being an elliptical ($P_{E,DB}$) or a spiral ($P_{CS,DB}$; CS denotes the combined spiral class, i.e. summed over the sub-classes available in GALAXY ZOO DR1, namely clockwise spiral, anti-clockwise spiral, and spiral edge-on/other), based on the outcome of all classifications of the object. It is then up to the user to decide where to place the threshold for assuming a classification is reliable. After visually inspecting a selection of galaxies we have chosen to treat a debiased probability of 0.7 or greater as being a reliable classification in the context of this work.
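The thresholding of the debiased probabilities can be sketched as follows. The function name and the order in which the two probabilities are tested are assumptions made for illustration; only the threshold of 0.7 is taken from the text.

```python
def benchmark_class(p_spiral_debiased, p_elliptical_debiased, threshold=0.7):
    """Assign a benchmark morphology from debiased GALAXY ZOO probabilities.

    A class is accepted only if its debiased probability reaches the
    threshold. Testing the spiral probability first is an arbitrary
    (hypothetical) tie-break for the unlikely case that both pass.
    """
    if p_spiral_debiased >= threshold:
        return "spiral"
    if p_elliptical_debiased >= threshold:
        return "elliptical"
    return "undefined"


if __name__ == "__main__":
    # Hypothetical probability pairs (P_CS,DB, P_E,DB).
    for pair in [(0.85, 0.05), (0.10, 0.75), (0.55, 0.40)]:
        print(pair, benchmark_class(*pair))
```

Because the two debiased probabilities need not sum to unity, they are tested independently rather than deriving one from the other.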
Such a choice results in three populations: i) spirals, ii) ellipticals, and iii) undefined. We will show that this choice leads to highly pure samples of spirals. (It should be noted that, due to the debiasing procedure, $P_{CS,DB} + P_{E,DB}$ for a given galaxy is not necessarily equal to unity.)

Nair & Abraham (2010) have provided detailed visual morphological classifications of 14,034 galaxies in the SDSS DR4 (Adelman-McCarthy et al. 2006), with 0.[…] ≲ z ≲ […], g′ < […]

In order to obtain reliable morphological selections of galaxies based upon photometric parameters, the parameter chosen must ideally display a distinct separation into two populations corresponding to the different morphological categories. Prominent examples of such one-parameter separation criteria are the concentration index $C_{idx} = R_{90}/R_{50}$ (e.g., Strateva et al. 2001) and the Sérsic index $n$ (e.g., Blanton et al. 2003).

Other schemes make use of combinations of two or more parameters such as the u − r colour and r-band absolute magnitude (Baldry et al. 2004), or the $q_{exp}$ and $f_{deV}$ parameters, possibly in combination with u − r colour information (Tempel et al. 2011). Recently, Kelvin et al. (2012) and Driver et al. (2012) have suggested the use of a UV/optical colour (u − r or NUV − r, respectively) and the Sérsic index $n$ in separating spiral and elliptical galaxies, and a variant of the NUV − r, $n$ selection has been used by Grootes et al. (2013) to select spiral galaxies for the purpose of a radiation transfer analysis and has proven to be efficient.

Common to all these approaches is the difficulty of selecting a curve/surface of separation between the two populations which includes as large a fraction of the desired category as possible, whilst simultaneously keeping the level of contamination as low as feasible.
In addition, this choice may be influenced by further requirements upon the recovery fraction and purity of the sample, which can be envisioned to vary with application.

The functional form of the curve or hypersurface providing the optimal separation of the two populations is not known a priori, and an appropriate choice can be non-trivial, even if the population of spiral galaxies is easily separable from the non-spiral population by eye. Furthermore, a sharp division between the two is generally not exhibited by the galaxy populations, which show a more gradual transition. Accordingly, sharp transitions in combination with simple parameterizations where the functional form may be ill-suited can give rise to large contaminations.

Rather than making assumptions about the functional form of the separation, we discretize the space spanned by the parameters used into individual cells. For each cell we can, using the GALAXY ZOO classifications, measure the fraction of the galaxies residing therein which are spirals (i.e. $P_{CS,DB} \geq 0.7$), $F_{sp}$, as

$$F_{sp} = \frac{N_{GZ,sp}}{N_{cell}}, \qquad (2)$$

where $N_{GZ,sp}$ is the number of GALAXY ZOO spirals (i.e., $P_{CS,DB} \geq 0.7$) in the cell and $N_{cell}$ is the total number of galaxies in the cell. The associated relative error $\Delta F_{sp,rel}$ is calculated using Poisson statistics and error propagation. We then define those cells with $F_{sp} \geq \mathcal{F}_{sp}$ (where $\mathcal{F}_{sp}$ is the threshold spiral fraction) and $\Delta F_{sp,rel} \leq$ […] to be spiral cells, i.e., we treat every object in the cell as a spiral galaxy, and thus obtain a decomposition of the parameter space into a spiral and a non-spiral subvolume. The choice of $\Delta F_{sp,rel} \leq$ […] has little effect in terms of the total population, as large values of $\Delta F_{sp,rel}$ correspond to sparsely populated cells. The population is obviously more sensitive to the choice of the limiting fraction $\mathcal{F}_{sp}$, with lower values leading to larger recovery fractions but lower purity. Here we have experimented with different values of $\mathcal{F}_{sp}$ and find $\mathcal{F}_{sp} = 0.[\ldots]$ […] $\mathcal{F}_{sp} = 0.5$; however, note that if a larger recovery fraction or an even greater purity is desired this choice can be altered.

In this work we focus on combinations of two and three parameters. While the approach is theoretically applicable to higher dimensional parameter spaces, the requirements on resolution and cell population impose an effective limit of three dimensions for the calibration sample available. We provide a decomposition of the parameter space for three combinations of three parameters in appendix A, which also provides the values of $F_{sp}$ and $\Delta F_{sp,rel}$ for all cells. We emphasize that any reader wanting to use the discretizations provided must check for systematic differences between their data/parameters and those used in this work, and refer the reader to Sect. 6.3 for a further discussion of the application of the results presented here to other surveys.

In order to provide a robust and reliable decomposition of the parameter space, the calibration sample must adequately sample the parameter space and the galaxy population, i.e. it must contain sufficient galaxies to achieve the required level of resolution and to sufficiently populate the individual cells, as well as be representative of the galaxy population as a whole. On the other hand, as the calibration sample must be visually classified, it is desirable to understand how the
performance of the method relies on the size of the calibration sample. In particular, it is of interest how the purity, completeness, and contamination by ellipticals of the sample depend on the size of the calibration sample.

Figure 1. Cell grid obtained for the parameter combination (log($n$), log($r_e$), log($\mu_*$)) using a calibration sample of 10000 galaxies. The 10k galaxies of the calibration sample are overplotted with colour-coding according to the probability of being a spiral (blue: spiral, red: non-spiral).

We define the purity fraction $P_{pure}$ as

$$P_{pure} = \frac{N_{sel,SP}}{N_{sel}}, \qquad (3)$$

where $N_{sel}$ is the number of galaxies selected as spirals by the cell-based method, and $N_{sel,SP}$ is the number of those galaxies which are visually classified as being spiral galaxies. Analogously the contamination fraction $P_{cont}$ is defined as the fraction of the selected galaxies which are visually classified as ellipticals, i.e.

$$P_{cont} = \frac{N_{sel,E}}{N_{sel}}. \qquad (4)$$

The completeness fraction of the sample $P_{comp}$ is defined as

$$P_{comp} = \frac{N_{sel,SP}}{N_{SP}}, \qquad (5)$$

where $N_{SP}$ is the total number of visually classified spirals in the sample being classified by the cell-based method.

Fig. 2 shows the fractional purity, completeness, and contamination by elliptical galaxies for samples selected using a combination of the parameters Sérsic index (log($n$)), effective radius in the r-band (log($r_e$)), and stellar mass surface density (log($\mu_*$)), as a function of the size of the calibration sample (this parameter combination is found to perform well in selecting simultaneously pure and complete samples of spirals; for further details on the parameters, the parameter combinations, and their performance we refer the reader to Sect. 4). The values at each sample size correspond to the mean obtained from 5 random realizations of a calibration sample of that size, with the error bars corresponding to the 1-σ standard deviation.
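The cell-based calibration of Eq. (2) and the performance measures of Eqs. (3)-(5) can be sketched together in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used for this work: it uses a fixed regular grid instead of the adaptive grid, it assumes independent Poisson errors on both counts when propagating the relative error, and the default limit `max_rel_err=1.0` is a placeholder. All function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def cell_index(params, grid):
    """Map a parameter tuple onto a tuple of bin indices.

    grid holds one (low, high, n_bins) triple per parameter. A fixed regular
    grid is an assumption made for brevity; the text uses an adaptive grid.
    """
    index = []
    for value, (low, high, n_bins) in zip(params, grid):
        i = int((value - low) / (high - low) * n_bins)
        index.append(min(max(i, 0), n_bins - 1))  # clamp outliers to edge cells
    return tuple(index)


def calibrate_spiral_cells(calib_params, calib_is_spiral, grid,
                           f_threshold=0.5, max_rel_err=1.0):
    """Return the set of cells designated as spiral cells (cf. Eq. 2)."""
    n_cell = defaultdict(int)
    n_spiral = defaultdict(int)
    for params, is_spiral in zip(calib_params, calib_is_spiral):
        cell = cell_index(params, grid)
        n_cell[cell] += 1
        n_spiral[cell] += 1 if is_spiral else 0

    spiral_cells = set()
    for cell, n_total in n_cell.items():
        n_sp = n_spiral[cell]
        if n_sp == 0:
            continue  # no spirals, so the cell cannot pass the threshold
        f_sp = n_sp / n_total
        # Assumed error model: independent Poisson errors on both counts,
        # propagated to the relative error of their ratio.
        rel_err = math.sqrt(1.0 / n_sp + 1.0 / n_total)
        if f_sp >= f_threshold and rel_err <= max_rel_err:
            spiral_cells.add(cell)
    return spiral_cells


def select_spirals(params_list, grid, spiral_cells):
    """Flag every galaxy that falls into a spiral cell."""
    return [cell_index(p, grid) in spiral_cells for p in params_list]


def performance(selected, is_spiral, is_elliptical):
    """Return (purity, completeness, contamination) as in Eqs. (3)-(5)."""
    n_sel = sum(1 for s in selected if s)
    n_sel_sp = sum(1 for s, sp in zip(selected, is_spiral) if s and sp)
    n_sel_e = sum(1 for s, e in zip(selected, is_elliptical) if s and e)
    n_sp = sum(1 for sp in is_spiral if sp)
    purity = n_sel_sp / n_sel if n_sel else float("nan")
    completeness = n_sel_sp / n_sp if n_sp else float("nan")
    contamination = n_sel_e / n_sel if n_sel else float("nan")
    return purity, completeness, contamination
```

In use, `calibrate_spiral_cells` would be run once on the visually classified calibration sample, and `select_spirals` then applied to any sample with the same parameters. Purity and contamination need not sum to unity, because galaxies with an undefined benchmark class count towards neither.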
In each case, the calibration sample is drawn from the whole of the GALAXY ZOO sample. The figure shows the performance in classifying three test samples: i) the entire optical galaxy sample using the visual classifications of spirals provided by GALAXY ZOO (solid), ii) the optical galaxy sample with independent morphological classifications provided by Nair & Abraham (2010), making use of these to define which galaxies really are spirals (dash-dotted), and iii) the optical galaxy sample with morphological classifications provided by Nair & Abraham (2010), but making use of the visual classifications provided by GALAXY ZOO (dashed). When calculating the contamination by ellipticals for GALAXY ZOO-based definitions we assume all sources with $P_{E,DB} >$ […]
Cell grid obtained for the parameter combination (log( n ),log( r e ),log( µ ∗ )) using a calibration sample of 10000 galaxies. The10k galaxies of the calibration sample are overplotted with colour-coding according to the probability of being a spiral (blue : spiral, red:non-spiral). performance of the method relies on the size of the calibra-tion sample. In particular, it is of interest how the purity,completeness, and contamination by ellipticals of the sampledepend on the size of the calibration sample.We define the purity fraction P pure as P pure = N sel , SP N sel , (3)where N sel is the number of galaxies selected as spirals bythe cell-based method, and N sel , SP is the number of thosegalaxies which are visually classified as being spiral galax-ies. Analogously the contamination fraction P cont is definedas the fraction of the selected galaxies which are visuallyclassified as ellipticals, i.e. P cont = N sel , E N sel . (4)The completeness fraction of the sample P comp is defined as P comp = N sel , SP N SP , (5)where N SP is the total number of visually classified spiralsin the sample being classified by the cell-based method.Fig. 2 shows the fractional purity, completeness, and con-tamination by elliptical galaxies for samples selected usinga combination of the parameters S´ersic index (log( n )), effec-tive radius in the r -band (log( r e )), and stellar mass surface density (log( µ ∗ )), as a function of the size of the calibrationsample (this parameter combination is found to perform wellin selecting simultaneously pure and complete samples ofspirals; for further details on the parameters, the parametercombinations, and their performance we refer the reader toSect. 4). The values at each sample size correspond to themean obtained from 5 random realizations of a calibrationsample of that size, with the error bars corresponding to the1- σ standard deviation. 
In each case, the calibration sampleis drawn from the whole of the GALAXY ZOO sample.The figure shows the performance in classifying three testsamples: i) the entire optical galaxy sample using the visualclassifications of spirals provided by GALAXY ZOO (solid),ii) the optical galaxy sample with independent morpho-logical classifications provided by Nair & Abraham (2010)making use of these to define which galaxies really are spi-rals (dash-dotted), and iii) the optical galaxy sample withmorphological classifications provided by Nair & Abraham(2010), but making use of the visual classifications providedby GALAXY ZOO (dashed). When calculating the contam-ination by ellipticals for GALAXY ZOO-based definitionswe assume all sources with P E , DB > . c (cid:13)000 , 1–39 hotometric Proxies for Selecting Spirals sizes greater than ∼
50k galaxies no longer lead to a largeimprovement of the performance. The improvement in per-formance with increasing size of the calibration sample isparticularly striking for the optical sample matched to thebright galaxy sample of Nair & Abraham (2010). The in-creasing sample size enables a higher resolution, thus in-creasing purity and decreasing contamination by allowingregions of parameter space to be excluded, while simultane-ously allowing the full extent of the parameter space occu-pied by spiral galaxies to be sufficiently sampled, increasingcompleteness by including other sections of the parameterspace.Even for the smallest sample sizes the performance of themethod does not appear to depend strongly on the specificrealization of the calibration sample, as shown by the er-rorbars. However, there is nevertheless a notable decrease inthe 1- σ uncertainty around the mean with increasing samplesize from ∼ − .
5% to . . z .

In the context of this work we focus on a suite of directly observed and derived parameters for the purpose of identifying spiral galaxies, which consists of a UV/optical colour (u − r, respectively NUV − r for the NUV matched sample), the Sérsic index n, the effective radius r_e (half-light semi-major axis), the i-band absolute magnitude, the ellipticity e, the stellar mass M∗, and the stellar mass surface density µ∗, calculated as

$$\mu_* = \frac{M_*}{2\pi r_e^2}. \qquad (6)$$

The usefulness of the u − r colour and the Sérsic index in selecting spirals is well documented (e.g., Baldry et al. 2004 and Barden et al. 2005, respectively). Similarly, as spiral galaxies are often assumed to be largely star-forming, the NUV − r colour may be assumed to be of use. We have chosen to include the i-band magnitude M_i (a directly observable tracer of stellar mass) and the derived parameter stellar mass M∗, as early-type galaxies are, on average, more massive than late-types. Furthermore, at a given stellar mass, it appears likely that a rotationally-supported spiral will be more radially extended than a pressure-supported early-type galaxy, hence we make use of the effective radius. This also implies that the stellar mass surface density of sources may be useful in separating spirals from non-spirals. While for a spiral the value of µ∗ derived using Eq. 6 is readily interpretable in a physical sense (a spiral galaxy can be assumed to be circular to first order, so the effective radius can be used to derive a reasonable estimate of the surface area and consequently of the stellar mass surface density), the value derived in this manner for a true ellipsoid will tend to underestimate the actual surface density of the object, as the approximation of the surface area using r_e as in Eq. 6 will tend to overestimate the projected surface area. Hence, any observed separation of the spiral and non-spiral populations in this parameter will represent a lower limit to the actual separation. Finally, we have included the observed ellipticity e, as the objects on the sky which appear most elliptical are likely to be spirals observed at a more edge-on orientation. We note, however, that the use of ellipticity as a parameter will bias any selection of spirals towards sources seen edge-on.

Figure 2. Fractional purity (top), fractional completeness (middle), and fractional contamination by ellipticals (bottom) for a selection of spirals obtained using the Sérsic index (i.e. log(n)), the effective radius in the r-band (i.e. log(r_e)), and the stellar mass surface density (i.e. log(µ∗)), as a function of the size of the calibration sample N. The solid line corresponds to the results obtained when classifying the optical sample (i.e. without the requirement of an NUV detection), while the dash-dotted line corresponds to the results obtained when classifying the optical sample with morphological classifications by Nair & Abraham (2010), defining spirals using these detailed classifications, and the dashed line corresponds to the optical sample matched to the Nair & Abraham (2010) catalogue but using the GALAXY ZOO visual classifications. The data points correspond to the mean of 5 random realizations of the calibration sample drawn from the optical galaxy sample, with the error bars corresponding to the 1-σ standard deviation about the mean.

Our goal is to identify (multiple) optimal sets of parameters which can be used as morphological proxies in the selection of highly pure and largely complete samples of spiral galaxies. As NUV data is only available for a subset of the total sample we perform the investigations in parallel both for the OPTICALsample, as well as the
NUVsample. For the OPTICALsample we perform the discretization of the parameter space using a sample of 50k galaxies randomly drawn from the OPTICALsample (the same sample is used for all parameter combinations) and quantify the performance using the OPTICALsample and the NAIRsample (i.e. the subsample with morphological classifications from Nair & Abraham (2010)). For the NUV preselected sample (the NUVsample) we perform the discretizations using a sample of 30k galaxies with NUV detections (randomly sampled from the sample of 50k galaxies used for the OPTICALsample), and in this case quantify the performance using the entire NUVsample and the NUVNAIRsample (i.e., the subsample of galaxies with morphological classifications from Nair & Abraham (2010) and NUV detections).

Fig. 3 shows the distributions of the parameters for the entire
OPTICALsample (dashed), as well as for the randomly selected subset of 50k galaxies in the calibration subsample (solid). As expected, the distributions for the two samples are so similar as to be indistinguishable in Fig. 3, with the differences being smaller than the line width. (This is quantitatively supported by the fact that Kolmogorov-Smirnov tests, and two-sample χ²-tests for similarity for the discrete distributions in e, n, and r_e, support the null hypothesis that the samples have the same distribution (p > .) The figure shows the distributions for the galaxies in the samples classified as spirals (P_CS,DB > 0.7, blue), ellipticals (P_E,DB > 0.7, red), non-spirals (P_CS,DB < . P_CS,DB < . P_E,DB < . µ∗ notably also displays a distinct separation of the two populations, and even shows a separation between the spiral and undefined populations. The parameters stellar mass, effective radius, and i-band absolute magnitude show the expected trends in the populations as previously discussed. The distribution of ellipticities, however, is noteworthy. As expected, the spiral sample dominates the largest values of ellipticity and displays a separation from the undefined population at high ellipticity. However, at intermediate and lower values of e there is considerable overlap with the other populations. Furthermore, the population of spirals as defined by GALAXY ZOO appears biased towards high values of ellipticity, i.e. galaxies seen edge-on. As a consequence, a discretization of parameter space using this calibration sample and e in the parameter combination will also be biased towards high values of ellipticity (even more so than due to the intrinsic overlap of the spiral and non-spiral sample at low and intermediate values of e). However, the bias will not affect the discretization of the parameter space for combinations of parameters which are, to first order, independent of the orientations of the galaxies with respect to the observer (e.g. log(r_e), log(M∗), log(µ∗), M_i, log(n)). In such cases, the distribution of ellipticities of spiral galaxies in each of the cells may be expected to be similar to that of the entire calibration sample, hence the bias towards edge-on systems will have no effect.

The bias of the GALAXY ZOO spiral sample must also be taken into account when quantifying the performance of different combinations of parameters.
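
The two-sample Kolmogorov-Smirnov comparison of a full sample with its calibration subsample, mentioned above, rests on the maximum distance between the two empirical cumulative distributions. The sketch below computes only that statistic, using the Python standard library; the function name `ks_two_sample_statistic` and the toy values are assumptions for illustration, and the conversion to a p-value (as quoted in the text) is deliberately left out. It is not the procedure used in this work.

```python
from bisect import bisect_right


def ks_two_sample_statistic(a, b):
    """Maximum absolute difference between the empirical CDFs of a and b."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    d_max = 0.0
    # The supremum of |F_a - F_b| is attained at one of the data points.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Toy values standing in for one parameter (e.g. log(r_e)); invented numbers.
full_sample = [1.0, 2.0, 3.0, 4.0]
shifted_sample = [3.0, 4.0, 5.0, 6.0]

d_same = ks_two_sample_statistic(full_sample, full_sample)
d_shifted = ks_two_sample_statistic(full_sample, shifted_sample)
print(d_same, d_shifted)
```

A statistic close to zero, as for a random subsample of the parent sample, is consistent with both samples following the same distribution; larger values indicate a shift between the distributions.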
When using samples relying on the GALAXY ZOO classifications as test samples, the bias in e can give rise to spuriously complete samples when e is used as a selection parameter. In spite of this bias, we nevertheless choose to use the GALAXY ZOO sample for calibration and testing purposes, as it represents the only large and faint sample of visually classified galaxies with a wide range of homogeneous ancillary data available. We check for effects arising from the ellipticity bias using the bright subsample of galaxies with independent visual classifications by Nair & Abraham (2010), which does not display an ellipticity bias.

Fig. 4 shows the same for the parameter distributions of the NUVsample and the randomly selected subset of 30k galaxies constituting the NUV calibration sample.

Comparing the parameter distributions between the OPTICALsample and the NUVsample shown in Figs. 3 & 4, the samples appear remarkably similar. Nevertheless, Kolmogorov-Smirnov and χ² tests indicate that, in spite of their similar appearance, the null hypothesis that the parameter distributions in these samples are the same has low probability (p . OPTICALsample and the NUVsample (p > . u − r and NUV − r colours (p · − ), indicating that the NUV pre-selection mainly affects the undefined population and its size relative to the spiral and elliptical populations. Despite these differences, the use of UV-preselection overall has only a small effect on the parameter distributions, in comparison with the large shift in the distributions between the morphological categories. This qualitative impression is confirmed for the optical properties of spirals and ellipticals, the null hypothesis being supported with p > .
37. As might be expected, the null hypothesis is, however, rejected for the NUV − r and u − r colours (p · − ). The NUV pre-selection also appears to affect the undefined population and its size relative to the spiral and elliptical distributions, even in the optical parameters, the null hypothesis being rejected for this class for all parameters.

Footnote: For an unbiased sample one would expect a flat distribution in ellipticity.

Footnote: A bias in ellipticity can potentially give rise to a slight bias towards redder UV/optical colours, as edge-on spirals appear redder on average. However, we have found no significant evidence of such a bias. Recent work by Pastrav et al. (2013a) has also found that fully resolved dust-rich galaxies seen edge-on may appear larger than when seen face-on; however, the strength of this effect remains to be quantified for marginally resolved sources.

Our goal in this work is to identify parameter combinations which provide a pure, but also largely complete sample of spiral galaxies. As such, an additional important figure of merit in quantifying the performance of the different parameter combinations is the bijective discrimination power P_bij, which we define as the product of P_pure and P_comp as defined in Eqs. 3 and 5, i.e.

$$P_{\rm bij} = P_{\rm pure} \cdot P_{\rm comp}. \qquad (7)$$

This provides a measure of the efficacy of the parameter combination at simultaneously selecting a pure and complete sample of spirals from the test samples. P_bij can take on values between 0 and 1, with 1 corresponding to a perfectly pure and complete sample. As a reference, a selected sample with P_pure = 0.75 and P_comp = 0. P_bij = 0. P_pure = 0.984 and P_comp = 0. P_bij = 0. P_CS,DB > . P_cont as defined in Eq. 4, where we define all sources with P_E,DB > .

In the following we investigate the performance of selections using parameters which can be applied to samples without the requirement of UV data, i.e. u − r colour, log(n), log(r_e), log(M∗), log(µ∗), M_i, and e. The figures of merit involving completeness, P_comp and P_bij, are given in relation to the OPTICALsample and the NAIRsample.

Tables 1 and 2 show the figures of merit achieved when testing using the OPTICALsample and the NAIRsample, respectively, for all 21 unique combinations of two parameters drawn from the suite applicable to optical samples.

Footnote: This statement is valid for the combination of UV and optical photometric depths in the dataset used in this work. We caution that for different datasets this may not necessarily be true.
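
The bijective discrimination power of Eq. 7 is a simple product, which can be checked against the tabulated figures of merit. The short sketch below does so for two parameter pairs, using the P_pure and P_comp values listed for them in Table 1; the function name `bijective_discrimination_power` and the dictionary layout are assumptions made for this illustration.

```python
def bijective_discrimination_power(p_pure, p_comp):
    """Return P_bij = P_pure * P_comp (Eq. 7) for fractions in [0, 1]."""
    for name, value in (("p_pure", p_pure), ("p_comp", p_comp)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return p_pure * p_comp


# (P_pure, P_comp) as listed in Table 1 for two parameter combinations.
table1 = {
    "(log n, log r_e)": (0.724, 0.731),
    "(log mu*, e)": (0.710, 0.744),
}

p_bij = {
    name: bijective_discrimination_power(p_pure, p_comp)
    for name, (p_pure, p_comp) in table1.items()
}
for name, value in p_bij.items():
    print(f"{name}: P_bij = {value:.3f}")
```

Rounded to three decimals, the products reproduce the P_bij column of Table 1 for these two rows (0.529 and 0.528). Because P_bij is a product, a selection must be both pure and complete to score highly; a very pure but incomplete selection is penalised as strongly as a complete but impure one.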
Testing the performance of different parameter combinations using the OPTICALsample, we find that the parameters log(µ∗) and log(r_e) are efficient at selecting complete samples, with all samples with P_comp > . P_pure > 0.7. In concert with either log(µ∗) or log(r_e), the parameter log(n) also leads to pure and complete samples of spirals (in particular (log(n), log(r_e)) attains the highest value of P_bij = 0. e in parameter combinations leads to selections which are highly pure on average (P_pure ≳ . P_comp < . µ∗), e) with P_pure = 0. P_comp = 0. P_bij = 0. P_bij overall. However, this may be influenced by the ellipticity bias in the test sample (see the previous discussion in Sect. 4).

Interestingly, use of the u − r colour does not of itself lead to very pure samples, as the purity of, e.g., the combinations (u − r, log(M∗)) and (u − r, M_i) is only ∼0.6, while similar combinations (e.g., (log(r_e), log(M∗))) attain much greater values. In addition, the completeness attained by using the u − r colour is strongly dependent upon the second parameter used. If the second parameter is more bimodal, e.g. log(µ∗), the combination provides good purity and completeness, while the completeness drops for parameters with less separation of the populations (e.g. M_i). Similarly, the Sérsic index is less efficient than expected, as the bijective discrimination power of the combinations of log(n) with log(M∗) and M_i (but also u − r) is low compared to that attained in combination with log(r_e) and log(µ∗).

Overall, the combination (log(n), log(r_e)) has the greatest bijective discrimination power (P_bij = 0. µ∗), e) with (P_bij = 0. r_e), log(M∗)), (log(r_e), log(µ∗)), and (log(n), log(µ∗)) all with P_bij ≈ 0.5. Amongst these combinations (log(n), log(r_e)) and (log(n), log(µ∗)) have the lowest values of contamination by ellipticals with P_cont . NAIRsample, using both the independent morphological classifications of Nair & Abraham (2010) and the GALAXY ZOO visual classifications.

Overall, the purity of the selections obtained when testing the parameter combinations using the NAIRsample with GALAXY ZOO visual classifications is greater than for the OPTICALsample with values of P_pure ∼ . − 0.9, indicating that some of the 'impurities' in the selections from the OPTICALsample are very likely unreliably classified spirals. On the other hand, the fractional completeness of the selections is of order 0. − . OPTICALsample. An exception to this are the combinations including e, for which the fractional completeness is ∼ . e in the OPTICALsample which is not present in the NAIRsample. As for the
OPTICALsample, the parameter combination with the greatest bijective discrimination power is (log(n), log(r_e)). Unlike for the OPTICALsample, however, the combination with the second largest value of P_bij is (log(n), log(µ∗)), which also attains the lowest value of contamination by ellipticals, rather than (log(µ∗), e) (likely due to the removal of the ellipticity bias as previously discussed). As for the OPTICALsample, the 5 combinations with the highest values of P_bij ((log(n), log(r_e)), (log(n), log(µ∗)), (u − r, log(µ∗)), (log(r_e), log(M∗)), (log(µ∗), M_i)) all include either log(r_e) or log(µ∗). Furthermore, log(n) again leads to very pure and complete selections in combination with log(r_e) or log(µ∗). In addition, its efficiency in combination with other parameters is also increased (e.g., (log(n), M_i)).

Testing using the NAIRsample with the independent classifications of Nair & Abraham (2010) leads to very similar results. However, the fractional purity of the selections is even larger, further underscoring the conclusion that a large contribution to the 'impurity' of the selections is due to unreliably classified spirals. which also has amongst the lowest contamination by ellipticals. The combinations with the highest bijective discrimination power again include either log(r_e), log(µ∗), and/or log(n), supporting the previous findings.

Overall, the parameters log(µ∗), log(r_e), and log(n) appear to be most efficient at selecting pure and complete samples of spirals.

Figure 3. Distribution of the parameters in the entire OPTICALsample (dashed) and the calibration sample as defined in Sect. 3.2 for the population of spirals (blue), ellipticals (red), non-spirals (green), and undefined (orange). The distributions of the whole sample and the calibration subsample are nearly indistinguishable as differences are smaller than the line width. (Panels shown include log(r_e/kpc), log(M∗/M⊙), log(µ∗/M⊙ kpc⁻²), and M_i.)

Figure 4. Distribution of the parameters in the NUVsample (dashed) and the NUV calibration sample as defined in Sect. 3.2 (solid) for the population of spirals (blue), ellipticals (red), non-spirals (green), and undefined (orange). The distributions of the whole sample and the calibration sample are nearly indistinguishable. (Panels shown include log(r_e/kpc), log(M∗/M⊙), log(µ∗/M⊙ kpc⁻²), and M_i.)

Table 1. N_sel, P_pure, P_comp, P_bij, and P_cont for combinations of two parameters applied to the OPTICALsample.

Parameter combination      N_sel   P_pure  P_comp  P_bij   P_cont
(u − r, log(n))            67436   0.617   0.655   0.404   0.060
(u − r, log(r_e))          57168   0.710   0.639   0.453   0.054
(u − r, log(M∗))           63194   0.580   0.577   0.334   0.084
(u − r, log(µ∗))           65254   0.690   0.709   0.489   0.054
(u − r, M_i)               61275   0.584   0.563   0.329   0.079
(u − r, e)                 47567   0.719   0.538   0.387   0.042
(log(n), log(r_e))         64179   0.724   0.731   0.529   0.032
(log(n), log(M∗))          67304   0.623   0.660   0.412   0.055
(log(n), log(µ∗))          67026   0.688   0.726   0.499   0.027
(log(n), M_i)              ...     ...     ...     ...     ...
(log(n), e)                55547   0.685   0.599   0.410   0.038
(log(r_e), log(M∗))        63985   0.711   0.716   0.509   0.048
(log(r_e), log(µ∗))        61678   0.721   0.700   0.504   0.048
(log(r_e), M_i)            61263   0.699   0.674   0.471   0.071
(log(r_e), e)              44938   0.760   0.538   0.409   0.051
(log(M∗), log(µ∗))         60231   0.724   0.686   0.496   0.040
(log(M∗), M_i)             45243   0.578   0.412   0.238   0.069
(log(M∗), e)               34862   0.737   0.405   0.298   0.062
(log(µ∗), M_i)             65086   0.697   0.714   0.497   0.049
(log(µ∗), e)               66627   0.710   0.744   0.528   0.035
(M_i, e)                   35006   0.730   0.402   0.293   0.072

While the performance of selections using only two parameters is already encouraging, it seems likely that the purity and completeness, and hence the bijective discrimination power, as well as the fractional contamination, can be improved by using more information in the selection, i.e. by using a third parameter.

Tables 3 and 4 show the figures of merit achieved when testing using the
OPTICALsample and the NAIRsample, respectively, for all 35 unique combinations of three parameters drawn from the suite applicable to optical samples.

Testing the performance of different combinations of three parameters using the OPTICALsample, we find that both the purity and completeness attained are greater, on average, than for combinations of two parameters, as shown in Table 3. In most cases, the use of additional information in the form of a third parameter leads to a simultaneous increase in purity and completeness. In some cases, however, the deprojection along the additional third axis can lead to the inclusion of more parameter space, causing an increase of completeness at the cost of a decrease in purity or, vice versa, to the exclusion of parameter space, increasing purity at the expense of completeness (e.g., (log(r_e), log(M∗)) with P_pure = 0.711 & P_comp = 0.716 and (log(r_e), log(M∗), M_i) with P_pure = 0.707 & P_comp = 0. n), M_i) with P_pure = 0.615 & P_comp = 0.694 and (log(n), M_i, e) with P_pure = 0.708 & P_comp = 0. e attain high values of purity (13/15 with P_pure > 0.7, and 6/15 with P_pure > . r_e) and log(µ∗)) also attain very high values of completeness (≳ . P_bij (of the 10 combinations with the highest values of P_bij, the first 6 include e). However, as for the combinations of two parameters, these high values of completeness are partially due to the ellipticity bias of the OPTICALsample. We will discuss the performance of these combinations on the basis of tests using the NAIRsample below. However, we note that all six combinations with the highest values of P_bij include log(r_e) and/or log(µ∗). The remaining four parameter combinations of the 10 with the highest values of P_bij are (in descending order) (log(n), log(r_e), M_i) with P_bij = 0. n), log(r_e), log(µ∗)) with P_bij = 0. n), log(M∗), log(µ∗)) with P_bij = 0. n), log(r_e), log(M∗)) with P_bij = 0. r_e) and/or log(µ∗) in addition to log(n), indicating the potential of these parameters to select pure and complete samples of spirals. In addition, these four combinations exhibit the lowest contamination by ellipticals with P_cont ≲ 0.02. As for combinations of two parameters, however, log(n) is only efficient in combination with another efficient parameter. The same is true for the parameter u − r colour. Finally, the parameters M_i and log(M∗) are efficient in combination with combinations of log(r_e), log(µ∗), and log(n).

Testing the performance of three-parameter combinations using the NAIRsample with GALAXY ZOO visual classifications (Table 4), we again find that the values of P_pure and P_comp are greater than for combinations of two parameters. Comparison of the values of purity with those obtained for the OPTICALsample also again indicates that a fraction of the 'impurity' arises from the unreliable classification of spirals.

Of the 10 combinations with the highest values of P_bij none include e, indicating that the high values attained for the OPTICALsample are, at least partially, due to the ellipticity bias. In descending order, the combinations with the greatest bijective discrimination power are (log(n), log(r_e), log(µ∗)), (log(n), log(M∗), log(µ∗)), (log(n), log(µ∗), M_i), (log(n), log(r_e), M_i), and (log(n), log(r_e), log(M∗)), supporting the results obtained using the OPTICALsample.

Testing using the NAIRsample with the independent classifications of Nair & Abraham (2010) again leads to very similar results. In terms of choice of the most effective parameters, the 5 parameter combinations with the greatest values of P_bij are the same as found when using the GALAXY ZOO visual classifications, although the combination with the overall greatest bijective discrimination power is (log(n), log(µ∗), M_i) rather than (log(n), log(r_e), log(µ∗)).
Table 2. N_sel, P_pure, P_comp, P_cont, and P_bij for combinations of two parameters applied to the NAIRsample using the GALAXY ZOO visual classifications (columns 3-6) and the independent classifications of Nair & Abraham (2010, columns 7-9). In the case of the independent classifications the contamination fraction is taken to be the complement of the purity (i.e. this includes sources with T-type = 99).

                                   GALAXY ZOO                       Nair & Abraham (2010)
Parameter combination      N_sel   P_pure  P_comp  P_bij   P_cont   P_pure  P_comp  P_bij
(u − r, log(n))            2104    0.839   0.601   0.505   0.048    0.923   0.575   0.530
(u − r, log(r_e))          1828    0.882   0.549   0.485   0.040    0.9234  0.496   0.458
(u − r, log(M∗))           1856    0.799   0.505   0.403   0.075    0.883   0.481   0.425
(u − r, log(µ∗))           2053    0.884   0.618   0.546   0.030    0.950   0.572   0.544
(u − r, M_i)               1815    0.803   0.496   0.398   0.068    0.888   0.473   0.420
(u − r, e)                 1111    0.832   0.315   0.262   0.038    0.926   0.302   0.280
(log(n), log(r_e))         2479    0.821   0.693   0.569   0.086    0.874   0.641   0.560
(log(n), log(M∗))          2173    0.824   0.609   0.502   0.055    0.904   0.581   0.525
(log(n), log(µ∗))          2124    0.873   0.631   0.551   0.023    0.950   0.597   0.567
(log(n), M_i)              ...     ...     ...     ...     ...      ...     ...     ...
(log(n), e)                1435    0.833   0.407   0.339   0.033    0.929   0.394   0.366
(log(r_e), log(M∗))        2006    0.893   0.610   0.545   0.026    0.947   0.558   0.528
(log(r_e), log(µ∗))        1948    0.901   0.598   0.538   0.024    0.956   0.546   0.523
(log(r_e), M_i)            1868    0.866   0.551   0.477   0.050    0.926   0.507   0.469
(log(r_e), e)              1354    0.792   0.365   0.289   0.091    0.854   0.339   0.290
(log(M∗), log(µ∗))         1858    0.906   0.573   0.519   0.021    0.959   0.523   0.502
(log(M∗), M_i)             1351    0.827   0.380   0.314   0.057    0.899   0.356   0.320
(log(M∗), e)               798     0.786   0.213   0.168   0.056    0.905   0.212   0.192
(log(µ∗), M_i)             2012    0.891   0.610   0.543   0.027    0.953   0.562   0.535
(log(µ∗), e)               1880    0.874   0.559   0.489   0.023    0.950   0.522   0.497
(M_i, e)                   793     0.784   0.212   0.166   0.067    0.898   0.209   0.187

Overall we find that the optimum results in terms of purity and simultaneous completeness for optical samples are obtained by combinations of three 
parameters including log(r_e), log(µ∗), log(n), and log(M∗) or M_i, notably (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(µ∗), M_i).

Spirals are very often found to be systems with on-going star formation, consequently possessed of a younger stellar population emitting in the UV (FUV and NUV) and displaying blue UV/optical colours. Early-type galaxies on the other hand are generally found to be more quiescent and redder. Where available, the use of UV properties of sources may thus prove efficient in the selection of spiral galaxies. Similarly, a pre-selection on UV emission will enhance the purity of a sample of star-forming spiral galaxies, at the expense of removing UV-faint, quiescent spirals. In the following we investigate the performance of selections using parameters which can be applied to samples preselected on the availability of NUV data (the NUVsample and NUVNAIRsample in this case), i.e. NUV − r colour, log(n), log(r_e), log(M∗), log(µ∗), M_i, and e. The figures of merit involving completeness, P_comp and P_bij, are given in relation to the NUV preselected samples (P_comp,n and P_bij,n) and to the optical samples for comparison (P_comp,o and P_bij,o).

Tables 5 and 6 show the figures of merit for all 21 unique combinations of two parameters applied to the NUV preselected samples.

Testing using the NUVsample, the combinations with the greatest values of P_bij,n are (log(µ∗), e) with P_bij,n = 0.542 (although the completeness may be influenced by the ellipticity bias), (log(r_e), log(M∗)) with P_bij,n = 0. n), log(r_e)) with P_bij,n = 0. r_e), log(µ∗)) with P_bij,n = 0. r_e), M_i) with P_bij,n = 0. r_e) and log(µ∗) again result in the most simultaneously pure and complete samples, particularly in combination with log(M∗), M_i, or log(n). In particular log(µ∗) leads to selections with high purity (4/5 with P_pure > . P_pure > . NUV − r colour and Sérsic index are less efficient at selecting pure and complete samples than expected, only attaining values of P_pure ≳ . NUV − r colour does, however, predominantly lead to samples with high completeness (≳ . M∗) and M_i.

Making use of the NUVNAIRsample with GALAXY ZOO visual classifications we find that the combinations with the greatest bijective discrimination power are (NUV − r, log(r_e)) with P_bij,n = 0. NUV − r, log(M∗)) with P_bij,n = 0.612 and (NUV − r, M_i) with P_bij,n = 0. n), log(r_e)) with P_bij,n = 0.
568 and (log(n), log(µ∗)) with P_bij,n = 0. NUV − r and a marginally efficient parameter applied to the NUV preselected sample leads to highly complete samples (P_comp,n ∼ . NUV − r in combination with efficient parameters leads to pure samples (e.g. (NUV − r, log(µ∗)) with P_pure = 0. µ∗) all result in very pure samples with P_pure > . usually, however, at the cost of completeness.

Using the independent morphological classifications of Nair & Abraham (2010) we obtain very similar results, with the most bijectively powerful combinations including NUV − r with M_i, log(M∗), or log(r_e), followed by those combining log(n), log(r_e), and log(µ∗).

For the bright subsample of Nair & Abraham (2010) NUV − r efficiently selects pure and complete samples of spirals; however, the efficiency of the parameters log(M∗) and log(r_e) also remains high.

Overall, the parameters log(n), log(r_e), and log(µ∗) appear efficient in selecting pure and complete samples of spirals, as for optical samples. In addition, the NUV − r colour in combination with NUV preselection is also efficient in this respect.

Table 3. N_sel, P_pure, P_comp, P_bij, and P_cont for combinations of three parameters applied to the OPTICALsample.

Parameter combination               N_sel   P_pure  P_comp  P_bij   P_cont
(u − r, log(n), log(r_e))           65154   0.724   0.743   0.539   0.024
(u − r, log(n), log(M∗))            69906   0.625   0.688   0.430   0.058
(u − r, log(n), log(µ∗))            66453   0.709   0.741   0.526   0.033
(u − r, log(n), M_i)                70880   0.623   0.695   0.433   0.058
(u − r, log(n), e)                  60259   0.682   0.647   0.442   0.042
(u − r, log(r_e), log(M∗))          65727   0.713   0.737   0.525   0.038
(u − r, log(r_e), log(µ∗))          63633   0.720   0.721   0.520   0.042
(u − r, log(r_e), M_i)              67015   0.710   0.749   0.532   0.047
(u − r, log(r_e), e)                63993   0.764   0.770   0.588   0.022
(u − r, log(M∗), log(µ∗))           62888   0.719   0.712   0.512   0.039
(u − r, log(M∗), M_i)               64714   0.582   0.593   0.345   0.082
(u − r, log(M∗), e)                 56811   0.701   0.626   0.439   0.045
(u − r, log(µ∗), M_i)               62289   0.720   0.706   0.508   0.037
(u − r, log(µ∗), e)                 66140   0.735   0.766   0.563   0.023
(u − r, M_i, e)                     56083   0.713   0.629   0.449   0.045
(log(n), log(r_e), log(M∗))         65708   0.738   0.764   0.564   0.018
(log(n), log(r_e), log(µ∗))         66581   0.739   0.774   0.572   0.017
(log(n), log(r_e), M_i)             66937   0.740   0.779   0.576   0.021
(log(n), log(r_e), e)               60988   0.776   0.745   0.577   0.019
(log(n), log(M∗), log(µ∗))          67149   0.731   0.773   0.565   0.019
(log(n), log(M∗), M_i)              68977   0.624   0.678   0.423   0.052
(log(n), log(M∗), e)                58955   0.692   0.643   0.445   0.042
(log(n), log(µ∗), M_i)              68151   0.716   0.768   0.549   0.018
(log(n), log(µ∗), e)                67837   0.715   0.763   0.546   0.020
(log(n), M_i, e)                    57541   0.708   0.641   0.454   0.036
(log(r_e), log(M∗), log(µ∗))        63189   0.717   0.713   0.511   0.044
(log(r_e), log(M∗), M_i)            66491   0.706   0.739   0.521   0.052
(log(r_e), log(M∗), e)              64608   0.754   0.767   0.579   0.027
(log(r_e), log(µ∗), M_i)            66374   0.707   0.739   0.523   0.055
(log(r_e), log(µ∗), e)              65079   0.759   0.777   0.590   0.026
(log(r_e), M_i, e)                  58887   0.753   0.698   0.525   0.038
(log(M∗), log(µ∗), M_i)             63574   0.713   0.713   0.509   0.045
(log(M∗), log(µ∗), e)               65408   0.754   0.776   0.585   0.027
(log(M∗), M_i, e)                   49084   0.686   0.530   0.363   0.061
(log(µ∗), M_i, e)                   66104   0.745   0.775   0.577   0.033

A comparison of the figures of merit of the selections applied to the NUV pre-selected samples with those of comparable parameter combinations applied to the optical samples indicates that the use of such a preselection enhances the ability of the method to select pure and complete samples of spirals, with P_bij,n being, on average, greater than P_bij for comparable parameter combinations applied to the optical samples. This is due to the NUV pre-selection removing non-spiral contaminants, thus enlarging the spiral subvolume by making spirals more dominant and increasing the purity of spiral cells. 
In many cases both the completeness and the purity of the selections increase (e.g., (log(r_e), log(M∗))). However, in some cases the increase in completeness is accompanied by a (slight) decrease in the purity, indicating that the enlargement of parameter space is the dominant effect.

Nevertheless, it must be borne in mind that these samples are complete with respect to the preselected sample and may be biased against intrinsically UV-faint spiral galaxies as well as strongly attenuated spirals seen edge-on if these sources lie below the NUV detection threshold.

Table 4. N_sel, P_pure, P_comp, P_cont, and P_bij for combinations of three parameters applied to the NAIRsample using the GALAXY ZOO visual classifications (columns 3-6) and the independent classifications of Nair & Abraham (2010, columns 7-9). In the case of the independent classifications the contamination fraction is taken to be the complement of the purity (i.e. this includes sources with T-type = 99).

                                            GALAXY ZOO                       Nair & Abraham (2010)
Parameter combination               N_sel   P_pure  P_comp  P_bij   P_cont   P_pure  P_comp  P_bij
(u − r, log(n), log(r_e))           2339    0.867   0.690   0.598   0.041    0.925   0.640   0.592
(u − r, log(n), log(M∗))            2280    0.829   0.643   0.533   0.053    0.910   0.614   0.559
(u − r, log(n), log(µ∗))            2270    0.872   0.674   0.588   0.033    0.941   0.632   0.595
(u − r, log(n), M_i)                2353    0.826   0.662   0.546   0.052    0.909   0.633   0.576
(u − r, log(n), e)                  1627    0.846   0.469   0.396   0.030    0.930   0.448   0.416
(u − r, log(r_e), log(M∗))          2100    0.897   0.641   0.575   0.020    0.951   0.587   0.558
(u − r, log(r_e), log(µ∗))          2068    0.894   0.630   0.563   0.024    0.951   0.577   0.549
(u − r, log(r_e), M_i)              2059    0.888   0.622   0.553   0.030    0.944   0.571   0.538
(u − r, log(r_e), e)                1872    0.888   0.566   0.502   0.017    0.947   0.521   0.493
(u − r, log(M∗), log(µ∗))           1995    0.896   0.609   0.546   0.022    0.956   0.560   0.535
(u − r, log(M∗), M_i)               2066    0.809   0.569   0.460   0.071    0.886   0.537   0.476
(u − r, log(M∗), e)                 1375    0.834   0.391   0.326   0.038    0.919   0.371   0.341
(u − r, log(µ∗), M_i)               1992    0.896   0.608   0.545   0.020    0.958   0.560   0.536
(u − r, log(µ∗), e)                 1932    0.893   0.587   0.524   0.019    0.962   0.546   0.525
(u − r, M_i, e)                     1452    0.842   0.416   0.351   0.035    0.915   0.390   0.356
(log(n), log(r_e), log(M∗))         2319    0.881   0.696   0.613   0.024    0.941   0.646   0.608
(log(n), log(r_e), log(µ∗))         2364    0.884   0.712   0.629   0.024    0.945   0.660   0.624
(log(n), log(r_e), M_i)             2360    0.879   0.706   0.621   0.032    0.935   0.652   0.610
(log(n), log(r_e), e)               2142    0.867   0.632   0.548   0.045    0.920   0.582   0.536
(log(n), log(M∗), log(µ∗))          2347    0.885   0.707   0.626   0.024    0.946   0.657   0.621
(log(n), log(M∗), M_i)              2283    0.833   0.647   0.539   0.049    0.908   0.613   0.557
(log(n), log(M∗), e)                1703    0.847   0.491   0.416   0.039    0.926   0.466   0.432
(log(n), log(µ∗), M_i)              2363    0.881   0.709   0.625   0.020    0.950   0.664   0.631
(log(n), log(µ∗), e)                1989    0.873   0.591   0.516   0.019    0.953   0.560   0.534
(log(n), M_i, e)                    1686    0.856   0.492   0.421   0.035    0.921   0.459   0.422
(log(r_e), log(M∗), log(µ∗))        1983    0.901   0.608   0.548   0.023    0.955   0.556   0.531
(log(r_e), log(M∗), M_i)            2098    0.884   0.631   0.558   0.032    0.939   0.578   0.543
(log(r_e), log(M∗), e)              1888    0.895   0.575   0.514   0.019    0.953   0.528   0.504
(log(r_e), log(µ∗), M_i)            2091    0.885   0.630   0.557   0.035    0.940   0.577   0.542
(log(r_e), log(µ∗), e)              1908    0.899   0.584   0.525   0.018    0.958   0.536   0.514
(log(r_e), M_i, e)                  1731    0.870   0.513   0.446   0.034    0.932   0.473   0.441
(log(M∗), log(µ∗), M_i)             1980    0.893   0.602   0.538   0.028    0.952   0.552   0.526
(log(M∗), log(µ∗), e)               1926    0.899   0.590   0.530   0.017    0.958   0.541   0.518
(log(M∗), M_i, e)                   1447    0.838   0.413   0.346   0.048    0.909   0.430   0.391
(log(µ∗), M_i, e)                   1922    0.900   0.589   0.530   0.017    0.957   0.539   0.516

Application of combinations of three parameters to the NUV preselected samples has much the same effect as for the optical samples, i.e. the purity and completeness, and consequently the bijective discrimination power, increase with respect to selections based on two parameters. The same processes as discussed in Sect. 4.1.2 apply. Tables 7 and 8 show the figures of merit for combinations of three parameters applied to the NUVsample and
NUVNAIRsample. The combination of three parameters with the highest value of P_bij when applied to the NUVsample is (NUV−r, log(r_e), e) with P_bij,n = 0.617 (P_pure = 0.777, P_comp,n = 0.794). Seven of the ten combinations with the highest values of P_bij,n in Table 7 include e (and are likely affected by the ellipticity bias). However, all 10 combinations include log(r_e), log(µ∗) and/or log(n).

The three most efficient parameter combinations not including e are (log(n), log(r_e), log(µ∗)) (P_pure = 0.744, P_comp,n = 0.780), (log(n), log(r_e), M_i) (P_pure = 0.749, P_comp,n = 0.775), and (NUV−r, log(r_e), M_i) (P_pure = 0.731, P_comp,n = 0.789). Application to the NUVsample leads to very complete selections. Of the 20 combinations not including e, 18 have P_comp,n > 0.7 in Table 7, 6 of which have P_comp,n > 0.77. The use of NUV−r in combination with at least one efficient parameter leads to very complete selections.

For the NUVNAIRsample with GALAXY ZOO visual classifications the most bijectively powerful combination is (NUV−r, log(r_e), e) with P_bij,n = 0.645 (P_pure = 0.908, P_comp,n = 0.711). However, of the ten most efficient combinations, this is the only one including e. The following 5 combinations with the highest values of P_bij,n are (in descending order): (NUV−r, log(n), log(r_e)), (NUV−r, log(r_e), M_i), (log(n), log(r_e), log(M∗)), (NUV−r, log(n), log(M∗)), and (NUV−r, log(n), log(µ∗)). Clearly, NUV−r applied in combination with another efficient parameter and NUV preselection leads to very pure and complete selections recovered from the bright subsample. Similar purity, but at the cost of completeness, is also achieved by the parameter log(µ∗), even without the parameter NUV−r (e.g. (log(r_e), log(M∗), log(µ∗))).

Table 5.
Purity, completeness, bijective discrimination power, and contamination for combinations of two parameters applied to the NUVsample. Completeness and bijective discrimination power are listed w.r.t. the OPTICALsample (P_comp,o and P_bij,o) and the NUVsample (P_comp,n and P_bij,n).

Parameter combination   N_sel  P_pure  P_comp,n  P_bij,n  P_cont  P_comp,o  P_bij,o
(NUV−r, log(n))         53285  0.603   0.678     0.408    0.069   0.506     0.305
(NUV−r, log(r_e))       46791  0.722   0.713     0.514    0.042   0.532     0.384
(NUV−r, log(M∗))        56682  0.581   0.695     0.404    0.082   0.518     0.301
(NUV−r, log(µ∗))        47516  0.717   0.719     0.516    0.031   0.536     0.385
(NUV−r, M_i)            55825  0.582   0.685     0.399    0.081   0.511     0.298
(NUV−r, e)              40000  0.714   0.603     0.431    0.041   0.450     0.321
(log(n), log(r_e))      46867  0.731   0.723     0.529    0.033   0.540     0.395
(log(n), log(M∗))       53124  0.608   0.681     0.414    0.063   0.508     0.309
(log(n), log(µ∗))       51284  0.688   0.744     0.512    0.032   0.555     0.382
(log(n), M_i)           ...    ...     ...       ...      ...     ...       ...
(log(n), e)             37343  0.705   0.556     0.392    0.044   0.415     0.293
(log(r_e), log(M∗))     47184  0.731   0.727     0.532    0.039   0.543     0.397
(log(r_e), log(µ∗))     45305  0.741   0.708     0.525    0.036   0.529     0.392
(log(r_e), M_i)         49531  0.707   0.739     0.523    0.070   0.552     0.390
(log(r_e), e)           40215  0.734   0.623     0.457    0.083   0.465     0.341
(log(M∗), log(µ∗))      44472  0.742   0.696     0.517    0.032   0.520     0.386
(log(M∗), M_i)          38529  0.567   0.461     0.262    0.097   0.344     0.195
(log(M∗), e)            28449  0.731   0.439     0.321    0.075   0.327     0.239
(log(µ∗), M_i)          47342  0.718   0.717     0.515    0.037   0.535     0.384
(log(µ∗), e)            49323  0.721   0.751     0.542    0.030   0.560     0.404
(M_i, e)                24399  0.767   0.395     0.302    0.061   0.294     0.226

Table 6.
Purity, completeness, bijective discrimination power, and contamination for combinations of two parameters applied to the NUVNAIRsample using the GALAXY ZOO visual classifications (columns 3-8) and the independent classifications of Nair & Abraham (2010, columns 9-13). Completeness and bijective discrimination power are listed w.r.t. the OPTICALsample (P_comp,o and P_bij,o) and the NUVsample (P_comp,n and P_bij,n). In the case of the independent classifications the contamination fraction is taken to be the complement of the purity (i.e. this includes sources with T-type = 99).

                               GALAXY ZOO                                            Nair & Abraham (2010)
Parameter combination   N_sel  P_pure  P_comp,n  P_bij,n  P_cont  P_comp,o  P_bij,o  P_pure  P_comp,n  P_bij,n  P_comp,o  P_bij,o
(NUV−r, log(n))         1551   0.853   0.607     0.518    0.053   0.450     0.384    0.919   0.565     0.519    0.418     0.384
(NUV−r, log(r_e))       1801   0.869   0.719     0.624    0.044   0.533     0.463    0.914   0.650     0.594    0.483     0.441
(NUV−r, log(M∗))        1970   0.822   0.744     0.612    0.064   0.552     0.454    0.895   0.695     0.622    0.517     0.463
(NUV−r, log(µ∗))        1497   0.888   0.611     0.543    0.030   0.453     0.402    0.948   0.560     0.531    0.416     0.394
(NUV−r, M_i)            1950   0.824   0.738     0.608    0.064   0.547     0.451    0.896   0.689     0.617    0.512     0.459
(NUV−r, e)              1127   0.859   0.444     0.382    0.031   0.330     0.283    0.933   0.415     0.387    0.308     0.287
(log(n), log(r_e))      1790   0.831   0.683     0.568    0.084   0.507     0.421    0.879   0.623     0.548    0.461     0.405
(log(n), log(M∗))       1591   0.813   0.594     0.482    0.069   0.440     0.358    0.894   0.564     0.504    0.417     0.373
(log(n), log(µ∗))       1616   0.873   0.648     0.566    0.032   0.480     0.419    0.942   0.603     0.568    0.446     0.421
(log(n), M_i)           ...    ...     ...       ...      ...     ...       ...      ...     ...       ...      ...       ...
(log(n), e)             944    0.815   0.353     0.288    0.049   0.262     0.213    0.915   0.342     0.313    0.253     0.232
(log(r_e), log(M∗))     1512   0.900   0.625     0.562    0.026   0.463     0.417    0.950   0.567     0.539    0.421     0.400
(log(r_e), log(µ∗))     1447   0.902   0.599     0.540    0.025   0.444     0.401    0.956   0.546     0.522    0.405     0.388
(log(r_e), M_i)         1630   0.842   0.630     0.531    0.075   0.467     0.394    0.890   0.572     0.509    0.425     0.378
(log(r_e), e)           1488   0.728   0.498     0.363    0.160   0.369     0.269    0.776   0.456     0.354    0.339     0.263
(log(M∗), log(µ∗))      1387   0.906   0.577     0.523    0.021   0.428     0.388    0.960   0.525     0.504    0.390     0.374
(log(M∗), M_i)          1263   0.792   0.459     0.364    0.097   0.340     0.270    0.859   0.428     0.368    0.318     0.273
(log(M∗), e)            728    0.731   0.244     0.178    0.092   0.181     0.132    0.865   0.249     0.215    0.185     0.160
(log(µ∗), M_i)          1488   0.898   0.613     0.551    0.026   0.455     0.408    0.953   0.559     0.533    0.416     0.396
(log(µ∗), e)            1397   0.886   0.568     0.504    0.022   0.422     0.374    0.953   0.525     0.500    0.390     0.372
(M_i, e)                631    0.751   0.218     0.163    0.094   0.161     0.121    0.876   0.218     0.191    0.162     0.142

Testing using the
NUVNAIRsample with the independent morphological classifications of Nair & Abraham (2010) supports the importance of NUV−r as a parameter for selecting pure and complete samples of spirals under NUV preselection. The combinations with the largest bijective discrimination power are (NUV−r, log(n), log(M∗)), (NUV−r, log(n), log(r_e)), and (NUV−r, log(r_e), e), with the use of NUV−r leading to very complete samples, as visible in the comparison of (NUV−r, log(n), log(r_e)) with (log(n), log(r_e), log(µ∗)), or (log(n), log(r_e), M_i).

To summarize, we find that for NUV preselected samples the use of NUV−r as a parameter leads to very complete, and in the case of the bright subsample of Nair & Abraham (2010) also pure, selections of spiral galaxies. This is particularly the case in combination with log(r_e) and log(n), while combinations with log(µ∗) are also efficient, but mostly improve the purity of selections at the expense of completeness. A comparison of the figures of merit for comparable parameter combinations applied to the optical and NUV samples shows, as for the combinations of two parameters, that the use of NUV preselection increases both purity and completeness on average. We again note, however, that the values of completeness are with respect to the NUV samples, and will be biased against UV-faint sources (these may be intrinsically UV-faint or UV-faint due to being seen edge-on and experiencing severe attenuation due to dust).

Overall, the parameters log(r_e), log(µ∗), and log(n) appear efficient at selecting pure and complete samples of spirals, as for the optical samples. Under NUV preselection, however, the NUV−r colour becomes efficient at selecting complete and pure spiral samples, much more so than the u−r colour for the optical samples. The most efficient combinations include (NUV−r, log(r_e), e), (NUV−r, log(n), log(r_e)), and (log(n), log(r_e), log(µ∗)).

As shown in Sect. 4.2.2, the use of NUV preselection results, on average, in samples with greater completeness and often also greater purity for comparable combinations of selection parameters. Under NUV preselection the parameter NUV−r leads to efficient selections of complete samples of spirals, while attaining high values of purity for the bright subsample. As spiral galaxies are often star forming systems, this result is unsurprising. However, as discussed, NUV preselection will bias samples of spirals against intrinsically UV-faint systems, as well as against systems which are UV-faint due to severe attenuation (e.g. on account of being seen edge-on).

Overall, the efficiency of the considered parameter combinations in selecting pure and complete (under the aforementioned caveat) samples is enhanced by NUV preselection, with larger volumes of the parameter space being included in the spiral volume than for the whole sample, as indicated by increases in completeness accompanied by slight reductions in purity when using comparable parameter combinations with and without preselection. In addition, especially for combinations of three parameters, NUV preselection can also lead to an increase in purity accompanied by a decrease in completeness, as regions marginally dominated by spirals in the whole sample are excluded. On average, however, in both cases the value of P_bij,n is larger than P_bij for a comparable parameter combination applied to the OPTICALsample. Thus, depending upon the science goal of the selection, UV information could be a valuable asset in selecting samples of spirals. However, we caution that, in addition to the biases previously discussed, if the UV coverage is not deep enough to match the depth of the optical data and to encompass the entire (realistic) colour range, UV preselection will strongly suppress the completeness attainable and introduce biases into any selections.

In light of these effects, the greater completeness of using only optical parameters applied to optical samples, as evidenced by the values of P_comp,o in, for example, Table 7, and the robustness against bias will likely outweigh the gain in purity achievable by NUV preselection for most applications.
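The relation between the figures of merit compared above can be illustrated with a minimal sketch. It assumes that the purity is the fraction of selected sources that are reliable spirals, that the completeness is the fraction of reliable spirals in a chosen reference sample that are selected, that the contamination is the fraction of selected sources that are reliable non-spirals, and that P_bij is the product of purity and completeness (consistent with the tabulated values, e.g. 0.777 × 0.794 ≈ 0.617 in Table 7). The function name, argument names, and toy data are hypothetical and are not taken from the analysis pipeline.

```python
import numpy as np


def figures_of_merit(selected, is_spiral, is_nonspiral, n_spiral_ref=None):
    """Return N_sel, P_pure, P_comp, P_bij and P_cont for one selection.

    selected     : True for sources lying in cells assigned to spirals
    is_spiral    : True for sources visually classified as reliable spirals
    is_nonspiral : True for sources visually classified as reliable non-spirals
    n_spiral_ref : number of reliable spirals in the reference sample used for
                   the completeness (assumed default: the input sample itself)
    """
    selected = np.asarray(selected, dtype=bool)
    is_spiral = np.asarray(is_spiral, dtype=bool)
    is_nonspiral = np.asarray(is_nonspiral, dtype=bool)

    n_sel = int(selected.sum())
    if n_sel == 0:
        raise ValueError("the selection is empty")
    if n_spiral_ref is None:
        n_spiral_ref = int(is_spiral.sum())
    if n_spiral_ref <= 0:
        raise ValueError("the reference sample contains no reliable spirals")

    n_recovered = int((selected & is_spiral).sum())
    p_pure = n_recovered / n_sel          # fraction of the selection that is spiral
    p_comp = n_recovered / n_spiral_ref   # fraction of reference spirals recovered
    p_cont = int((selected & is_nonspiral).sum()) / n_sel
    return {
        "N_sel": n_sel,
        "P_pure": p_pure,
        "P_comp": p_comp,
        "P_bij": p_pure * p_comp,         # assumed definition: purity x completeness
        "P_cont": p_cont,
    }


# Toy preselected sample of 10 sources (the last two are unclassified).
selected = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
is_spiral = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
is_nonspiral = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0]

# Completeness w.r.t. the preselected sample (analogous to the ",n" columns).
fom_n = figures_of_merit(selected, is_spiral, is_nonspiral)
# Completeness w.r.t. a larger parent sample assumed to hold 8 reliable
# spirals in total (analogous to the ",o" columns).
fom_o = figures_of_merit(selected, is_spiral, is_nonspiral, n_spiral_ref=8)

print(fom_n)
print(fom_o)
```

In this toy case the purity is unchanged between the two calls, while the completeness and P_bij drop when the larger reference sample is used, which mirrors the difference between the values listed w.r.t. the preselected and the parent samples.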
Based on the figures of purity, completeness, and bijective discrimination power it is readily apparent that the use of combinations of three parameters generally leads to purer and simultaneously more complete samples of spirals than using only two parameters. Furthermore, the most important parameters appear to be log(r_e) and log(µ∗), which provide the most efficient selection when complemented by log(n) and/or M_i. Applying an NUV preselection appears to further improve the attainable purity, and makes NUV−r a further important selection parameter. However, although the purity, completeness, and bijective discrimination power are good indicators of a selection's performance, they provide little information about possible biases in the selections. While the cell-based method allows for a flexible surface of separation, any boundary in parameter space used in classifying objects entails that reliable spirals with strongly outlying values in the selection parameters may be missed, and that the selection may not be fully representative of the actual population of spirals.

In the following we will investigate the potential biases caused by the selection on the basis of four different representative combinations of three parameters ((u−r, log(r_e), e) or, for the NUV preselected samples, (NUV−r, log(r_e), e); (log(n), log(r_e), log(µ∗)); (log(n), log(r_e), M_i); and (log(n), log(M∗), log(µ∗))), chosen to be amongst the most bijectively powerful. We will consider the distributions of the suite of parameters investigated for these selections, as well as the distributions of the Hα equivalent width as an independent observable and the T-type classification given by Nair & Abraham (2010), to investigate possible biases in the selections of spiral galaxies. Finally, we will investigate the redshift dependence of the selections of spiral galaxies.

Figs. 5 & 6 show the normalized distributions of all eight parameters in the suite investigated, after selection by the four representative combinations of three parameters ((u−r, log(r_e), e) or (NUV−r, log(r_e), e) in red, (log(n), log(r_e), log(µ∗)) in green, (log(n), log(r_e), M_i) in blue, and (log(n), log(M∗), log(µ∗)) in orange), applied to both the OPTICALsample (Fig. 5) and to the NAIRsample (Fig. 6). For comparison the parameter's distribution for reliable spirals in the respective sample as defined by GALAXY ZOO is shown as a dash-dotted black line. Finally, the parameter's distribution for reliable spirals as defined by the independent morphological classifications of Nair & Abraham (2010), i.e. in the NAIRsample, is shown as a grey dash-dotted line.

Table 7. Purity, completeness, bijective discrimination power, and contamination for combinations of three parameters applied to the NUVsample. Completeness and bijective discrimination power are listed w.r.t. the OPTICALsample (P_comp,o and P_bij,o) and the NUVsample (P_comp,n and P_bij,n).

Parameter combination         N_sel  P_pure  P_comp,n  P_bij,n  P_cont  P_comp,o  P_bij,o
(NUV−r, log(n), log(r_e))     50514  0.726   0.774     0.562    0.028   0.577     0.419
(NUV−r, log(n), log(M∗))      56380  0.617   0.733     0.452    0.064   0.547     0.337
(NUV−r, log(n), log(µ∗))      48707  0.716   0.736     0.527    0.032   0.549     0.39
(NUV−r, log(n), M_i)          56496  0.616   0.734     0.452    0.064   0.548     0.337
(NUV−r, log(n), e)            43708  0.695   0.641     0.445    0.044   0.478     0.332
(NUV−r, log(r_e), log(M∗))    48885  0.736   0.759     0.559    0.029   0.567     0.417
(NUV−r, log(r_e), log(µ∗))    49163  0.737   0.765     0.564    0.029   0.571     0.421
(NUV−r, log(r_e), M_i)        51151  0.731   0.789     0.577    0.033   0.589     0.430
(NUV−r, log(r_e), e)          48396  0.777   0.794     0.617    0.014   0.592     0.460
(NUV−r, log(M∗), log(µ∗))     46269  0.746   0.728     0.543    0.029   0.543     0.405
(NUV−r, log(M∗), M_i)         56066  0.582   0.689     0.401    0.085   0.514     0.299
(NUV−r, log(M∗), e)           43874  0.730   0.676     0.493    0.035   0.504     0.368
(NUV−r, log(µ∗), M_i)         48991  0.730   0.755     0.551    0.030   0.563     0.411
(NUV−r, log(µ∗), e)           49430  0.748   0.780     0.583    0.015   0.582     0.435
(NUV−r, M_i, e)               44092  0.734   0.683     0.501    0.033   0.509     0.374
(log(n), log(r_e), log(M∗))   49304  0.744   0.773     0.575    0.020   0.577     0.429
(log(n), log(r_e), log(µ∗))   49665  0.744   0.780     0.580    0.022   0.582     0.433
(log(n), log(r_e), M_i)       49054  0.749   0.775     0.580    0.023   0.578     0.433
(log(n), log(r_e), e)         47441  0.765   0.766     0.586    0.029   0.571     0.437
(log(n), log(M∗), log(µ∗))    49945  0.736   0.775     0.571    0.020   0.579     0.426
(log(n), log(M∗), M_i)        53302  0.611   0.687     0.420    0.062   0.513     0.313
(log(n), log(M∗), e)          41242  0.702   0.611     0.429    0.044   0.456     0.320
(log(n), log(µ∗), M_i)        50378  0.719   0.764     0.550    0.019   0.570     0.410
(log(n), log(µ∗), e)          51054  0.715   0.770     0.551    0.026   0.575     0.411
(log(n), M_i, e)              42160  0.705   0.627     0.443    0.046   0.468     0.330
(log(r_e), log(M∗), log(µ∗))  46264  0.738   0.721     0.532    0.033   0.538     0.397
(log(r_e), log(M∗), M_i)      48838  0.727   0.749     0.545    0.042   0.559     0.407
(log(r_e), log(M∗), e)        48793  0.764   0.786     0.600    0.028   0.586     0.448
(log(r_e), log(µ∗), M_i)      48671  0.729   0.749     0.546    0.045   0.559     0.407
(log(r_e), log(µ∗), e)        49571  0.762   0.797     0.607    0.027   0.595     0.453
(log(r_e), M_i, e)            46084  0.757   0.736     0.556    0.043   0.549     0.415
(log(M∗), log(µ∗), M_i)       47355  0.729   0.729     0.531    0.039   0.544     0.397
(log(M∗), log(µ∗), e)         49250  0.762   0.791     0.603    0.028   0.590     0.450
(log(M∗), M_i, e)             40952  0.698   0.603     0.421    0.065   0.450     0.314
(log(µ∗), M_i, e)             49331  0.757   0.787     0.596    0.031   0.588     0.445

Overall, the distributions of the parameters derived from the selections applied to the OPTICALsample (Fig. 5) coincide well with those of the GALAXY ZOO defined sample, indicating that the non-parametric method using three parameters is neither heavily influencing the parameter ranges available to the sample, nor itself introducing large biases. Similarly, the parameter distributions for the selections applied to the NAIRsample also agree well with the distributions as defined by the GALAXY ZOO and Nair & Abraham (2010) visual classifications. Nevertheless, the effect of the individual choice of parameter combinations is visible in the distributions, with this being more pronounced for the application to the NAIRsample. For example, all combinations involving log(n) are biased towards lower values of this parameter than the visually defined samples, while the combination (u−r, log(r_e), e) traces them with higher fidelity. The discontinuous steep fall-off towards redder u−r colours of the selection determined by (u−r, log(r_e), e) (most pronounced in the NAIRsample) is also an example of the effects of the discretization.

The largest differences, both between the selections and the visually defined samples, as well as between the selections themselves, are visible, however, in the distributions of ellipticity. While the distribution of e is more or less flat in the NAIRsample, as is to be expected for an unbiased sample, the GALAXY ZOO-defined spiral subsample of the OPTICALsample displays a bias towards high values of e. Using e as selection parameter, as in the combination (u−r, log(r_e), e), gives rise to a bias in the distribution of e for the selected sample as visible in Fig. 6, causing the selection provided by (u−r, log(r_e), e) to largely coincide with the GALAXY ZOO defined spiral sample for the OPTICALsample. This bias may also give rise to the agreement between the
NUV−r colour distributions of the GALAXY ZOO defined sample and the (u−r, log(r_e), e) selection in Fig. 5 (i.e. for the OPTICALsample), which extend to redder colours than the other selections, as NUV emission from highly inclined galaxies will be strongly attenuated, more so than in optical bands (e.g., Tuffs et al. 2004).

Table 8. Purity, completeness, bijective discrimination power, and contamination for combinations of three parameters applied to the NUVNAIRsample using the GALAXY ZOO visual classifications (columns 3-8) and the independent classifications of Nair & Abraham (2010, columns 9-13). Completeness and bijective discrimination power are listed w.r.t. the NAIRsample (P_comp,o and P_bij,o) and the NUVNAIRsample (P_comp,n and P_bij,n). In the case of the independent classifications the contamination fraction is taken to be the complement of the purity (i.e. this includes sources with T-type = 99).

                                     GALAXY ZOO                                            Nair & Abraham (2010)
Parameter combination         N_sel  P_pure  P_comp,n  P_bij,n  P_cont  P_comp,o  P_bij,o  P_pure  P_comp,n  P_bij,n  P_comp,o  P_bij,o
(NUV−r, log(n), log(r_e))     1879   0.864   0.745     0.644    0.047   0.553     0.477    0.915   0.681     0.623    0.504     0.461
(NUV−r, log(n), log(M∗))      1934   0.841   0.747     0.628    0.055   0.554     0.466    0.906   0.694     0.629    0.514     0.465
(NUV−r, log(n), log(µ∗))      1564   0.878   0.630     0.553    0.033   0.467     0.410    0.943   0.584     0.551    0.432     0.408
(NUV−r, log(n), M_i)          1906   0.839   0.735     0.617    0.055   0.545     0.457    0.902   0.681     0.615    0.504     0.455
(NUV−r, log(n), e)            1299   0.856   0.511     0.437    0.038   0.379     0.324    0.928   0.478     0.443    0.354     0.328
(NUV−r, log(r_e), log(M∗))    1687   0.893   0.691     0.617    0.027   0.513     0.458    0.942   0.627     0.591    0.466     0.439
(NUV−r, log(r_e), log(µ∗))    1713   0.891   0.701     0.624    0.025   0.520     0.463    0.941   0.636     0.599    0.473     0.445
(NUV−r, log(r_e), M_i)        1770   0.884   0.718     0.635    0.034   0.533     0.471    0.928   0.648     0.602    0.482     0.447
(NUV−r, log(r_e), e)          1705   0.908   0.711     0.645    0.014   0.527     0.479    0.956   0.643     0.615    0.478     0.457
(NUV−r, log(M∗), log(µ∗))     1594   0.897   0.657     0.589    0.025   0.487     0.437    0.946   0.595     0.563    0.442     0.418
(NUV−r, log(M∗), M_i)         1970   0.815   0.737     0.601    0.069   0.547     0.446    0.887   0.690     0.612    0.512     0.455
(NUV−r, log(M∗), e)           1478   0.884   0.600     0.531    0.020   0.445     0.394    0.941   0.549     0.516    0.408     0.384
(NUV−r, log(µ∗), M_i)         1647   0.888   0.672     0.597    0.029   0.498     0.442    0.943   0.613     0.578    0.455     0.429
(NUV−r, log(µ∗), e)           1494   0.908   0.623     0.566    0.017   0.462     0.420    0.967   0.570     0.551    0.424     0.410
(NUV−r, M_i, e)               1467   0.883   0.595     0.526    0.022   0.441     0.390    0.938   0.543     0.509    0.403     0.378
(log(n), log(r_e), log(M∗))   1745   0.886   0.710     0.629    0.028   0.526     0.466    0.940   0.650     0.611    0.481     0.452
(log(n), log(r_e), log(µ∗))   1736   0.885   0.705     0.624    0.028   0.523     0.463    0.940   0.646     0.607    0.478     0.449
(log(n), log(r_e), M_i)       1757   0.874   0.705     0.617    0.042   0.523     0.457    0.923   0.642     0.593    0.475     0.438
(log(n), log(r_e), e)         1754   0.831   0.669     0.556    0.078   0.496     0.412    0.884   0.615     0.543    0.455     0.402
(log(n), log(M∗), log(µ∗))    1698   0.894   0.697     0.623    0.025   0.517     0.462    0.948   0.638     0.605    0.472     0.448
(log(n), log(M∗), M_i)        1695   0.820   0.638     0.523    0.069   0.473     0.388    0.895   0.601     0.538    0.445     0.398
(log(n), log(M∗), e)          1189   0.834   0.455     0.380    0.049   0.338     0.282    0.918   0.432     0.396    0.320     0.293
(log(n), log(µ∗), M_i)        1694   0.888   0.691     0.614    0.021   0.512     0.455    0.950   0.638     0.606    0.472     0.449
(log(n), log(µ∗), e)          1545   0.869   0.617     0.536    0.029   0.457     0.397    0.939   0.575     0.540    0.425     0.400
(log(n), M_i, e)              1307   0.828   0.497     0.411    0.060   0.368     0.305    0.896   0.464     0.416    0.343     0.308
(log(r_e), log(M∗), log(µ∗))  1465   0.903   0.607     0.549    0.024   0.450     0.407    0.954   0.552     0.526    0.410     0.391
(log(r_e), log(M∗), M_i)      1567   0.886   0.637     0.564    0.036   0.473     0.419    0.936   0.579     0.542    0.430     0.403
(log(r_e), log(M∗), e)        1528   0.889   0.624     0.554    0.026   0.462     0.411    0.944   0.569     0.537    0.423     0.399
(log(r_e), log(µ∗), M_i)      1567   0.880   0.633     0.557    0.041   0.470     0.413    0.934   0.577     0.539    0.429     0.400
(log(r_e), log(µ∗), e)        1536   0.896   0.632     0.566    0.022   0.469     0.420    0.951   0.577     0.548    0.428     0.407
(log(r_e), M_i, e)            1450   0.870   0.579     0.504    0.044   0.430     0.374    0.916   0.524     0.480    0.389     0.357
(log(M∗), log(µ∗), M_i)       1516   0.888   0.618     0.549    0.032   0.458     0.407    0.942   0.563     0.531    0.419     0.394
(log(M∗), log(µ∗), e)         1556   0.894   0.639     0.571    0.021   0.474     0.423    0.951   0.584     0.555    0.434     0.413
(log(M∗), M_i, e)             1154   0.792   0.420     0.332    0.074   0.311     0.246    0.885   0.403     0.356    0.299     0.265
(log(µ∗), M_i, e)             1548   0.897   0.637     0.571    0.023   0.473     0.424    0.946   0.578     0.547    0.429     0.406

In contrast to the selection using (u−r, log(r_e), e), the other investigated parameter
combinations show distributions which are more or less flat in e, also justifying the use of the GALAXY ZOO sample as a calibration sample.

Comparison of the distribution of the parameters in the selections applied to the OPTICALsample with those of the galaxies classified as spirals in the NAIRsample using the classifications of Nair & Abraham (2010) shows a systematic difference in the distributions of the parameters between these samples. Overall, the spiral galaxies in the NAIRsample are more weighted towards redder NUV−r and u−r colours, as well as towards larger values of log(M∗) and log(µ∗), and brighter i-band absolute magnitudes. Furthermore, the distributions of log(n) and log(r_e) are weighted towards larger values of n and lower values of r_e, respectively. The observable differences are largely consistent with the bright NAIRsample (g′-band mag < 16) being more weighted towards large spirals which, on average, are more massive and redder than lower mass spiral galaxies. Furthermore, they often also have more dominant bulges, increasing the values of n and decreasing those of r_e, while simultaneously decreasing the value of e, in agreement with the observed distributions. However, the differences may also be due, in part, to the fact that the cell-based selection misses regions of parameter space which are sparsely populated by spirals and in which they do not represent the dominant galaxy population. Nevertheless, Fig. 6 shows that the selections using combinations of three parameters trained on the GALAXY ZOO visual classifications of the OPTICALsample perform well at recovering the NAIRsample.

Fig. 7 shows the parameter distributions for the combinations applied to the NUVsample (we make use of (NUV−r, log(r_e), e) instead of (u−r, log(r_e), e)). The results of applying the combinations to the NUVsample are nearly identical to those obtained for the OPTICALsample. However, the use of NUV preselection does bias the selected galaxy populations towards bluer objects, as can be seen in the shift of the distributions of the u−r and, to a lesser extent, the NUV−r colour between Figs. 5 & 7. The use of NUV preselection and NUV−r colour also slightly lessens the bias against sources with low values of e selected using the combination (NUV−r, log(r_e), e), rendering the distribution in e of this selection flatter than that of the GALAXY ZOO defined sample. The overall similarity to the results obtained for the optical samples shows that the requirement of an NUV detection itself is only mildly influencing the selections.

T-type and Hα equivalent width as independent observables

Although the agreement between the parameter distributions of the visually defined samples and the selections is very good, the fact that a bias towards bluer u−r and NUV−r colours is discernible, and that the selections slightly favour lower values of log(n) and log(µ∗) and higher values of log(r_e), raises the possibility that the selections may nevertheless be biased against a subclass of spirals.

T-type distributions of the NAIRsample
In order to investigate to what extent such a bias may be present, we first make use of the distributions of the T-type classifications of Nair & Abraham (2010). Fig. 8 shows the normalized distributions of the T-type values for the four selections, compared with the distributions of the visually classified spiral samples (GALAXY ZOO: black, Nair & Abraham (2010): grey). The distribution of the T-types of galaxies classified as spirals by the selection is shown in green, while the magenta line shows the T-type distributions of the GALAXY ZOO defined reliable spirals located in spiral cells following the selection.

For the NAIRsample the GALAXY ZOO classifications (black solid line) appear moderately biased against early type spirals (mainly against Sa, and less against Sa/b). The selections based on the combinations of three parameters (green line) display a similar, but more pronounced bias, favoring spiral galaxies of type Sa/b, Sb and later, underscored by the stronger bias against early type spirals of GALAXY ZOO spirals in spiral cells (magenta line). Overall, the parameter based selections recover relatively more earlier type spirals than the GALAXY ZOO classifications, in line with the findings that a large fraction of the 'impurity' arises from spiral galaxies which fail to meet the P_CS,DB criterion for reliable spirals.

T-type distributions of the NUVNAIRsample
Fig. 9 shows the resultant distributions of T-types for the selections applied to the NUVNAIRsample (using NUV−r rather than u−r). Overall, the results are very similar, with both the GALAXY ZOO classified spirals and the spirals selected by the parameter combinations being more weighted towards later type galaxies than the classifications of Nair & Abraham (2010). We note that the NUVNAIRsample is more weighted towards earlier type spirals than the NAIRsample.

[Figure 8: four panels, one per parameter combination: (u−r, log(r_e), e); (log(n), log(r_e), log(µ∗)); (log(n), log(r_e), M_i); (log(n), log(M∗), log(µ∗)). Horizontal axis: T-type (Nair & Abraham, 2010); vertical axes: relative frequency f and, in the inset panels, Δf.]

Figure 8. Distribution of T-types for galaxies in the NAIRsample classified as spirals based on the classifications of Nair & Abraham (2010) (grey), GALAXY ZOO (black), and the parameter combination listed top left (green). The T-type distribution of galaxies meeting the P_CS,DB criterion for reliable spirals and located in cells associated with spiral galaxies is shown in magenta. The inset panel below each distribution shows the difference in relative frequency for each galaxy type relative to that of the Nair & Abraham (2010) classifications.

Hα Equivalent Width Distribution of the NAIRsample & NUVNAIRsample
A similar investigation of the possible bias against subclasses of spiral galaxies for the OPTICALsample, or for the NUVsample, is not possible, as these lack independent visual classifications and T-types. However, to at least gain a qualitative insight into the possible biases for these larger samples, we make use of the distributions of Hα equivalent width (EQW), an observable used neither in our classification nor in that supplied by GALAXY ZOO.

Based on Hα EQW, galaxies are often divided into two main populations, 'line-emitting' galaxies (i.e. galaxies with non-negligible Balmer line emission, usually actively star forming) and passive galaxies (very little/no line emission, usually quiescent). In general, spirals tend to exhibit Hα line emission (although a non-negligible fraction has very small Hα EQWs indicative of passive systems), while early-types are predominantly passive. Similarly, earlier type spirals often have smaller values of Hα EQW than later types (see e.g., Robotham et al. 2013 for a detailed discussion).

Figs. 10 & 11 show the distributions of Hα EQW for the NAIRsample and NUVNAIRsample, respectively. The distribution of the samples defined using the classifications of Nair & Abraham (2010) is again shown in grey, with that of the sample defined by GALAXY ZOO in black. In both cases the GALAXY ZOO defined sample is weighted more towards intermediate values of Hα EQW with respect to the classifications of Nair & Abraham (2010), showing evidence of a bias against low values of Hα EQW as well as, to a lesser extent, against the highest values. The distributions of Hα EQW of the samples defined by the selections (green) all display a similar, yet more pronounced bias against low values of Hα EQW. The selections, with the exception of (u−r, log(r_e), e), all also appear weighted against the highest values of Hα EQW. These biases against low values of Hα EQW may be considered to be consistent with the distributions of the T-types in the samples, with the selections favoring later type spirals.

In summary, we find that the GALAXY ZOO classifications display a simultaneous mild bias against early type spirals and systems with low values of Hα EQW for the NAIRsample and NUVNAIRsample, and that this bias is slightly more pronounced for the parameter combination based selections.

[Figure 5: panels of normalized parameter distributions; recoverable horizontal-axis labels: log(r_e / kpc), log(M∗ / M⊙), log(µ∗ / M⊙ kpc^−2), M_i.]

Figure 5. Normalized distribution of the suite of 8 parameters as recovered for all GALAXY ZOO reliable spirals in the OPTICALsample (black dashed) and the selections defined using (u−r, log(r_e), e) (red), (log(n), log(r_e), log(µ∗)) (green), (log(n), log(r_e), M_i) (blue), and (log(n), log(M∗), log(µ∗)) (orange), applied to the OPTICALsample. The parameter distribution of spirals as defined by the classifications of Nair & Abraham (2010) in the NAIRsample is shown as a grey dash-dotted line.

[Figure 6: panels of normalized parameter distributions; recoverable horizontal-axis labels: log(r_e / kpc), log(M∗ / M⊙), log(µ∗ / M⊙ kpc^−2), M_i.]

Figure 6. As Fig. 5 but for the NAIRsample.

[Figure 7: panels of normalized parameter distributions; recoverable horizontal-axis labels: log(r_e / kpc), log(M∗ / M⊙), log(µ∗ / M⊙ kpc^−2), M_i.]

Figure 7. Normalized distribution of the suite of 8 parameters as recovered for all GALAXY ZOO reliable spirals in the NUVsample (black dashed) and the selections defined using (NUV−r, log(r_e), e) (red), (log(n), log(r_e), log(µ∗)) (green), (log(n), log(r_e), M_i) (blue), and (log(n), log(M∗), log(µ∗)) (orange), applied to the NUVsample. The parameter distribution of spirals as defined by the classifications of Nair & Abraham (2010) in the NUVNAIRsample is shown as a grey dash-dotted line.

Hα Equivalent Width Distribution of the OPTICALsample & NUVsample
Bearing this mild simultaneous bias in mind, we consider the distributions of Hα EQW for parameter combinations as applied to the OPTICALsample and the NUVsample, shown in Figs. 12 & 13, respectively.

The samples selected by the same parameter combinations as previously applied to the NAIRsample display a bias against low values of Hα EQW when applied to the OPTICALsample, similar to that observed for their application to the NAIRsample. Overall, all the considered parameter combinations recover the peak in the Hα EQW corresponding to star-forming galaxies well, with high values of Hα EQW being only minimally favored with respect to the GALAXY ZOO defined sample. However, all selections display a bias against very low values of Hα EQW, least so for the combination (u − r, log(r_e), e). The general trends in the distributions of Hα EQW appear very similar to those identified for the selections applied to the NAIRsample, hence we expect that the selections applied to the OPTICALsample will also exhibit a similar bias towards later type spirals.

It is important to note the very good agreement between the Hα EQW distributions of all reliable spirals in the OPTICALsample (black) and NUVsample (gray) shown in the panels of Fig. 13. This indicates that the NUV preselection itself is not introducing a strong bias. Nevertheless, NUV preselection does appear to lead to a slight bias against systems with low Hα EQW, favoring high Hα EQW systems.

As for the OPTICALsample, the selections applied to the NUVsample display a bias against low values of Hα EQW, although the bias is reduced under NUV preselection. However, the parameter combinations are slightly more weighted towards high values of Hα EQW than for the OPTICALsample. Overall, the trends in the Hα EQW distributions are similar to those observed in the selections drawn from the OPTICALsample, the NAIRsample, and the NUVNAIRsample. Accordingly, we expect that the parameter based selections will be, to some extent, biased against early type spirals.
A final avenue of possible bias we address here is the dependence of the performance of the selection on the distance/redshift of the sources. This is of particular interest, as the parameters with the best performance are largely structural or related parameters, e.g. log(n), log(r_e), log(µ∗), and as such may depend on the resolution of the images in terms of physical sizes.

Over the time span corresponding to the redshift range of z = 0 − 0.13 we do not expect the distribution of galaxy morphologies to evolve in a significant manner (e.g. Bamford et al. 2009), hence the fraction of spirals should be approximately constant. However, as massive bright galaxies are less likely to be spirals than less massive, fainter galaxies, this will only be the case for volume-limited samples.

[Figure 9: T-type distributions; panels: (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), (log(n), log(M∗), log(µ∗)); x-axis: T-Type (Nair & Abraham, 2010); y-axes: relative frequency f and Δf.]

Figure 9. As Fig. 8 but for galaxies in the NUVNAIRsample.

In Fig. 14 we show the fraction of galaxies classified as spirals by the parameter combinations (u − r, log(r_e), e), resp. (NUV − r, log(r_e), e) in the case of NUV preselection, (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) for different volume-limited samples of galaxies. At top left we show the spiral fractions as a function of z for a volume-limited subsample of the NAIRsample extending to z = 0.07 (i.e. M_g < 16 − D(z = 0.07), where D(z) is the distance modulus and M_g is the absolute magnitude in the g band). We find that the spiral selections recovered by the parameter combinations (with the exception of (u − r, log(r_e), e)) are flat in z, and are in good agreement with the z dependence of the spiral selection for this sample defined by the visual classifications of Nair & Abraham (2010) (black dash-dotted line). The middle left panel shows that the distribution of spirals selected from a volume-limited subsample of the OPTICALsample extending to z = 0.09 (i.e. M_r < 17.7 − D(z = 0.09)) [...] z for the selections not using colour as a parameter, while the bottom left panel shows a similar result for a volume-limited subsample of the OPTICALsample extending to z = 0.13 (i.e. M_r < 17.7 − D(z = 0.13) [...] z). In the latter two panels, the dash-dotted black line indicates the z dependence of the spiral fraction as defined by the GALAXY ZOO visual classifications. The decline in the spiral fraction is largely due to the certainty of the classifications decreasing with increasing z. If the assumption of a constant spiral fraction as a function of z is valid, these results may be seen to imply that for marginally resolved sources, the automatic cell-based non-parametric classification schemes may be superior to the GALAXY ZOO DR1 classifications.

[Figure 10: Hα EQW distributions; panels: (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), (log(n), log(M∗), log(µ∗)); x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 10. Distribution of Hα EQW for galaxies in the NAIRsample classified as spirals based on the classifications of Nair & Abraham (2010) (gray), GALAXY ZOO (black), and the parameter combination listed top left (green). The Hα EQW distribution of galaxies with P_CS,DB > [...] located in cells associated with spiral galaxies is shown in magenta. The inset panel below each distribution shows the difference in relative frequency for each bin in Hα EQW relative to that of the Nair & Abraham (2010) classifications.

The right hand panels of Fig. 14 show the results of applying the parameter combinations to NUV preselected samples, taking into account the UV sensitivity limits (i.e. with the additional requirement on the samples that M_NUV < 23 − D(z_sel), where z_sel is the limiting redshift of the sample). For the volume-limited subsample of the NUVNAIRsample we find, as for the
NAIRsample, that the spiral fraction is flat in z. For the other volume-limited samples, although the selections are largely flat in z, there is nevertheless an increase with increasing redshift. Notably, selections which only depend on parameters determined at long wavelengths (e.g. (log(n), log(r_e), M_i)), and whose spiral fractions are flat in z without the requirement of an NUV detection, also display an increase of the spiral fraction with z under NUV preselection. This can most readily be understood in the context of an evolution in the UV properties of the volume-limited samples of spirals considered, with an increasing fraction of spiral galaxies with NUV emission as a function of increasing redshift z. Such a scenario is consistent with the observed decline in star-formation rate density from z [...] M∗ ≳ [...] M⊙ over this redshift range (Moustakas et al. 2013, and references therein). The volume-limited samples considered will be dominated by galaxies in this mass range and be accordingly sensitive to such evolutionary effects.

[Figure 11: Hα EQW distributions; panels: (NUV − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), (log(n), log(M∗), log(µ∗)); x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 11. As Fig. 10 but for galaxies in the NUVNAIRsample.

We note that as the redshift range spans over a Gyr in lookback time, some evolution in the spiral fraction may be expected, linked to a slight decline in the fraction of spirals with decreasing z, i.e. we do not expect a perfectly constant fraction of spirals. Nevertheless, the lack of any major dependence of the spiral fraction on redshift implies that no major redshift dependent biases are introduced into the selection when using combinations of three parameters with the non-parametric cell-based method, and that the method may even prove to be more reliable than visual classifications.

[Figure 12: Hα EQW distributions; panels: (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), (log(n), log(M∗), log(µ∗)); x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 12.
Distribution of Hα EQW for galaxies in the OPTICALsample classified as spirals by GALAXY ZOO (black), and the parameter combination listed top left (green). The Hα EQW distribution of galaxies with P_CS,DB > [...] located in cells associated with spiral galaxies is shown in magenta. The inset panel below each distribution shows the difference in relative frequency for each bin in Hα EQW relative to that of the GALAXY ZOO classifications.
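The volume-limited subsamples and the binned spiral fractions shown in Fig. 14 can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the flat ΛCDM cosmology (H0 = 70 km/s/Mpc, Ω_m = 0.3), the neglect of K-corrections, the form of the Poisson error (sqrt(N_spiral) / N_total), and all function names are assumptions introduced here. Only the selection rule M < m_lim − D(z_lim) and the bin width of 0.01 are taken from the text and the caption of Fig. 14.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def distance_modulus(z, h0=70.0, omega_m=0.3, steps=1000):
    """D(z) = 5 log10(d_L / 10 pc) for an assumed flat LambdaCDM cosmology (z > 0)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    dz = z / steps
    # Trapezoidal integration of dz' / E(z') from 0 to z.
    integral = sum(0.5 * (inv_e(i * dz) + inv_e((i + 1) * dz)) * dz
                   for i in range(steps))
    d_l_mpc = (1.0 + z) * (C_KM_S / h0) * integral
    return 5.0 * math.log10(d_l_mpc * 1.0e6 / 10.0)


def spiral_fraction_by_z(z, abs_mag, is_spiral, z_lim, m_lim, bin_width=0.01):
    """Spiral fraction per redshift bin for a volume-limited subsample.

    Keeps galaxies with z <= z_lim and abs_mag < m_lim - D(z_lim), then
    returns a list of (bin lower edge, fraction, assumed Poisson 1-sigma error).
    """
    mag_cut = m_lim - distance_modulus(z_lim)
    bins = {}  # bin index -> [number of spirals, number of galaxies]
    for zi, mi, si in zip(z, abs_mag, is_spiral):
        if zi <= z_lim and mi < mag_cut:
            counts = bins.setdefault(int(zi / bin_width), [0, 0])
            counts[0] += 1 if si else 0
            counts[1] += 1
    return [(k * bin_width, n_sp / n_all, math.sqrt(n_sp) / n_all)
            for k, (n_sp, n_all) in sorted(bins.items())]


# Hypothetical toy catalogue: redshift, absolute r-band magnitude, spiral flag.
z = [0.052, 0.055, 0.058, 0.081, 0.084, 0.120]
abs_mag = [-22.0, -22.0, -20.0, -22.0, -22.0, -22.0]
is_spiral = [True, False, True, True, True, True]

# Volume-limited to z = 0.09 with M_r < 17.7 - D(z = 0.09), as in Fig. 14.
for z_low, frac, err in spiral_fraction_by_z(z, abs_mag, is_spiral, 0.09, 17.7):
    print(f"z >= {z_low:.2f}: spiral fraction {frac:.2f} +/- {err:.2f}")
```

The faint galaxy at z = 0.058 and the galaxy beyond the limiting redshift are both excluded by the volume limit, so that the remaining bins are complete in absolute magnitude.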
Using the cell-based method presented in Sect. 3 we have identified combinations of parameters including log(r_e), log(µ∗), log(n), log(M∗), and M_i, in particular (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)), to result in simultaneously pure and complete samples of spirals. These selections appear to be robust against redshift dependent biases, and to be largely unbiased in their parameter distributions, only displaying a slight bias against early type spirals. Accordingly, the cell-based method using these combinations appears well suited to selecting samples of spiral galaxies. In the following we investigate the contribution of the cell-based method to the demonstrable success, and compare its performance to a selection of widely used morphological proxies, as well as to a novel algorithmic approach based on support vector machines (Huertas-Company et al. 2011).

[Figure 13: Hα EQW distributions; panels: (NUV − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), (log(n), log(M∗), log(µ∗)); x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 13. As Fig. 12 but for galaxies in the NUVsample.

While the use of the parameter combinations in concert with the cell-based method presented in Sect. 3 can lead to simultaneously pure and complete samples of spiral galaxies, the use of the cell-based method requires a training sample, ideally of ≳ 30k galaxies (cf. Fig. 2). In contrast to this, the advantage of simple hard cuts on parameters is that they require no (or much smaller) such calibration samples. In our investigations we have made use of a suite of parameters including ones traditionally used in the morphological classification of spirals (e.g. n), as well as novel parameters such as µ∗. In order to investigate to what extent the demonstrable success is due to the parameters used, and what the effect of the cell-based algorithm is, we have applied the combinations (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) to the OPTICALsample and the NAIRsample using fixed boundaries derived by eye from the parameter distributions shown in Fig. 3. In this context we have chosen to treat galaxies with u − r [...] 1, log(r_e) [...], e > 0.3, log(n) [...], log(µ∗) [...] 3, log(M∗) [...] 7, and M_i > −22 as spirals.

The results tabulated in Table 9 show that the bijective discrimination power of the selections using fixed boundaries is much lower than when the same parameter combinations are used with the cell-based method. It is clear that the use of fixed boundaries entails a strong trade-off between purity and completeness. Although the parameter combinations (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), and (log(n), log(r_e), M_i) all attain high values of purity (even ∼0.05 greater than with the cell-based method), they are highly incomplete, with completeness values ∼ [...] − [...].

[Figure 14: spiral fraction versus redshift z; panel selection criteria: M_r < 17.7 − D(z = 0.13), M_r < 17.7 − D(z = 0.09), and M_g < 16 − D(z = 0.07), each shown with and without the additional requirement M_NUV < 23 − D(z) at the same limiting redshift.]
Figure 14. Spiral fraction as a function of redshift z in bins of width 0.01 for selections defined using (u − r, log(r_e), e) resp. (NUV − r, log(r_e), e) (red), (log(n), log(r_e), log(µ∗)) (green), (log(n), log(r_e), M_i) (blue), and (log(n), log(M∗), log(µ∗)) (orange), respectively. The top left panel shows the results for the combinations applied to a volume-limited subsample of the NAIRsample (the selection criteria are indicated in each panel). The redshift dependence of the spiral fraction defined by the classifications of Nair & Abraham (2010) in the considered subsample is shown as a black dash-dotted line. Error bars indicate Poisson 1-σ uncertainties. The top right panel shows the same, but applied to a subsample of the NUVNAIRsample as defined in the panel. The middle and bottom left panels show the redshift dependence of the spiral fraction for the selection applied to two volume-limited subsamples of the OPTICALsample with the GALAXY ZOO defined reliable spiral fraction shown as a black dash-dotted line. The middle and bottom right panels show the same for the NUVsample.

The parameter combination (log(n), log(M∗), log(µ∗)), on the other hand, attains a completeness only ∼0.07 less than the cell-based method, but with the purity of the selection reduced by ∼0.1. The high values of completeness, attained simultaneously with the high values of purity when making use of the parameter combinations together with the cell-based method, thus appear largely due to the flexibility of the boundaries given by the cell-based method.

Table 9.
Purity, completeness, bijective discrimination power, and contamination for the combinations (u − r, log(r_e), e), (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) using fixed boundaries, applied to the OPTICALsample (columns 2-5) and the NAIRsample using the GALAXY ZOO visual classifications (columns 6-9) as well as the independent classifications of Nair & Abraham (2010, columns 10-12).

                                OPTICALsample                 NAIRsample, GALAXY ZOO        NAIRsample, Nair & Abraham 2010
Parameter combination           P_pure P_comp P_bij  P_cont   P_pure P_comp P_bij  P_cont   P_pure P_comp P_bij
(u − r, log(r_e), e)            0.793  0.398  0.316  0.015    0.911  0.257  0.234  0.006    0.961  0.236  0.227
(log(n), log(r_e), log(µ∗))     0.794  0.567  0.450  0.006    0.934  0.487  0.455  0.007    0.976  0.442  0.431
(log(n), log(r_e), M_i)         0.782  0.507  0.396  0.007    0.922  0.372  0.343  0.013    0.965  0.339  0.327
(log(n), log(M∗), log(µ∗))      0.654  0.700  0.458  0.028    0.861  0.573  0.493  0.023    0.946  0.547  0.517

Having identified the cell-based method used with combinations of three parameters including log(r_e), log(µ∗), log(n), log(M∗), and M_i, in particular (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)), as a method to select simultaneously pure and complete samples of spirals, we compare its performance to that of a selection of widely used morphological proxies, as well as to that of a novel algorithmic approach based on support vector machines (Huertas-Company et al. 2011).

Two well-known proxies for the general morphological type of a galaxy are the concentration index in the r band, defined as C_r = R_90,r / R_50,r, where R_90,r and R_50,r are the radii within which 90 resp. 50 per cent of the galaxy's (Petrosian) flux are contained, and the Sérsic index n, i.e., the index obtained for the best fit of a Sérsic profile (Sérsic 1968) to the galaxy's light distribution. Strateva et al. (2001) suggest the use of the concentration index as a proxy for morphological classification with galaxies with C_r < 2.6 [...] n < 2.5 [...] u − r colour vs. absolute r magnitude diagram, with the separator parameterized by a combination of a constant and a tanh function dependent on the absolute r band magnitude (their Eq. 11).

A different approach, also making use of two parameters, has been adopted by Tempel et al. (2011).
They define a subvolume in the two dimensional space spanned by the SDSS parameters f_deV (i.e., the fraction of a galaxy's flux which is fit by the de Vaucouleurs profile (de Vaucouleurs 1948) in the best fit linear combination of a de Vaucouleurs and an exponential profile) and q_exp (the axis ratio of the SDSS best fit exponential profile) associated with spiral galaxies and calibrated on visual classifications of SDSS galaxies in the Sloan Great Wall region (Einasto et al. 2010) and GALAXY ZOO.

Recently Huertas-Company et al. (2011) have published a catalogue of morphological classifications of SDSS DR7 spectroscopic galaxies based on support vector machines, which compare well with GALAXY ZOO classifications of the same sample. Similarly to GALAXY ZOO, Huertas-Company et al. (2011) assign probabilities to the possible galaxy classes, so that for the purposes of our comparison we have chosen to treat objects with a probability greater than 70 per cent of being a spiral as a spiral, analogously to our treatment of the GALAXY ZOO sample.

Table 10 shows the purity, completeness, and bijective discrimination power for the five morphological proxies discussed above as well as the three parameter combinations applied to the OPTICALsample and the NAIRsample. All morphological proxies, with the exception of that proposed by Tempel et al. (2011), attain values of completeness similar to, or larger than, that of the cell-based method when applied to the OPTICALsample, although only the classification of Huertas-Company et al. (2011) achieves a completeness notably exceeding that of the cell-based method (P_comp = 0.903). [...] OPTICALsample, much lower than the value of ≈ 75 per cent achieved by the cell-based method, the exception again being the method of Tempel et al. (2011). As a result, the bijective discrimination power of these selections is lower than that achieved by the optimal combinations of three parameters, using the cell-based method, with only the method of Huertas-Company et al. (2011) attaining a comparable value of P_bij. However, the contamination by ellipticals introduced by the proxies considered is at least a factor 3 greater than that resulting from the cell-based method.

Applied to the brighter NAIRsample the purity of the considered proxies increases notably, while the completeness slightly decreases. The purity of the selections resulting from the use of the considered proxies remains significantly lower than that achieved by the parameter combinations, both when using the GALAXY ZOO visual classifications as well as those of Nair & Abraham (2010), as can also be seen in the distributions of the T-types in the samples selected by the considered proxies (Fig. 15). The completeness, on the other hand, is greater than for the parameter based selections, so that the bijective discrimination power of the considered proxies is comparable to that of the parameter based selections when applied to the
NAIRsample.

As can be seen in Fig. 15, the T-type distributions of the considered proxies display a bias towards later type spirals very similar to that of the cell-based selections. However, the bias against Sa and Sa/b galaxies appears to be slightly less pronounced, with the relative frequency of early type galaxies being marginally higher for the samples recovered by the proxies than by the cell-based selections. On the other hand, the T-type distributions in Fig. 15 also show the considerably larger contamination by ellipticals not present in the cell-based selections.

Footnote: Huertas-Company et al. (2011) provide probabilistic morphological classifications for all but 311 of the sources in our sample.

Considering the distributions of Hα EQW for the samples obtained by these proxies applied to the NAIRsample as shown in Fig. 16 one finds that the samples recovered by the proxies (with the exception of the methods of Huertas-Company et al. 2011 and Tempel et al. 2011) display a bias towards sources with large values of Hα EQW, considerably more so than the cell-based selections, with ∼10% more of the sample consisting of high Hα EQW sources than in the samples recovered by the cell-based method. This result is most pronounced for the samples selected by the concentration index, the Sérsic index and the method of Baldry et al. (2004). Similar but more pronounced results are obtained if one considers the distributions of Hα EQW for the samples obtained by these proxies applied to the OPTICALsample, as shown in Fig. 17. In contrast, the selections based on the method of Tempel et al. (2011) and Huertas-Company et al. (2011) appear to be weighted more towards high and low values of Hα EQW than the GALAXY ZOO reference and the selections based on the parameter combinations used in concert with the cell-based method.

Overall, we find that the selections resulting from the proxies are similar to, or more biased than, the selections based on the cell-based method, and are clearly more contaminated.

In conclusion, we thus find that for the purpose of selecting a pure, yet nevertheless largely complete, sample of spiral galaxies, not limited to the brightest galaxies, the use of the cell-based method presented in combination with one of the optimal parameter combinations is preferable over the investigated well-established proxies, and at least comparable to the sophisticated approach of Huertas-Company et al. (2011).
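To make the quantities compared in Tables 9 and 10 concrete, the sketch below computes purity, completeness, bijective discrimination power, and contamination for a toy selection, and contrasts them with a deliberately simplified cell-based selection. Several points are assumptions rather than the method of Sect. 3: the regular grid of equal-width cells, the threshold `min_frac` on the spiral fraction per cell, the definition of contamination as the fraction of selected sources that are ellipticals, and all function names are hypothetical. The relation P_bij = P_pure × P_comp is consistent with the tabulated values (e.g. 0.793 × 0.398 ≈ 0.316 in Table 9).

```python
from collections import defaultdict


def cell_index(point, lows, widths):
    """Index of the regular grid cell containing `point` (one integer per axis)."""
    return tuple(int((p - lo) // w) for p, lo, w in zip(point, lows, widths))


def train_spiral_cells(points, is_spiral, lows, widths, min_frac=0.5):
    """Cells whose training-sample spiral fraction is at least `min_frac`."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [spirals, all galaxies]
    for pt, sp in zip(points, is_spiral):
        cell = counts[cell_index(pt, lows, widths)]
        cell[0] += 1 if sp else 0
        cell[1] += 1
    return {c for c, (n_sp, n_all) in counts.items() if n_sp / n_all >= min_frac}


def select_spirals(points, spiral_cells, lows, widths):
    """Flag every galaxy that falls in a cell associated with spirals."""
    return [cell_index(pt, lows, widths) in spiral_cells for pt in points]


def selection_metrics(selected, is_spiral, is_elliptical):
    """Return (P_pure, P_comp, P_bij, P_cont); needs at least one selected source."""
    n_selected = sum(selected)
    n_true = sum(s and t for s, t in zip(selected, is_spiral))
    n_ell = sum(s and e for s, e in zip(selected, is_elliptical))
    purity = n_true / n_selected
    completeness = n_true / sum(is_spiral)
    return purity, completeness, purity * completeness, n_ell / n_selected


# Hypothetical two-parameter toy sample (values are illustrative only).
lows, widths = (0.0, 0.0), (0.5, 0.5)
points = [(0.1, 0.2), (0.3, 0.1), (0.6, 0.7), (0.7, 0.6), (1.6, 1.7), (1.8, 1.9)]
is_spiral = [True, True, True, False, False, False]
is_elliptical = [not s for s in is_spiral]

cells = train_spiral_cells(points, is_spiral, lows, widths, min_frac=0.6)
selected = select_spirals(points, cells, lows, widths)
print(selection_metrics(selected, is_spiral, is_elliptical))
```

In this toy case the cell shared by one spiral and one elliptical is rejected, which keeps the selection pure at the cost of completeness; lowering `min_frac` to 0.5 would accept that cell and trade purity for completeness instead.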
Using the non-parametric cell-based method presented, we have successfully identified several combinations of three parameters which allow for an efficient and rapid selection of pure and simultaneously complete, largely unbiased samples of spiral galaxies. When applied to parent samples not limited to the brightest galaxies, these are superior in performance, in terms of bijective discrimination power and bias (e.g. in Hα EQW), to the widely established simple morphological proxies investigated, such as the concentration index C_r, the Sérsic index n, and the division into red and blue galaxies. Furthermore, they are at least comparable in performance to the algorithmic approach using SVMs of Huertas-Company et al. (2011).

[Figure 15: T-type distributions; panel titles: n < 2.5, Baldry+2004, Huertas-Company+2011, Tempel+2010, C_r < 2.6; x-axis: T-Type (Nair & Abraham, 2010); y-axes: relative frequency f and Δf.]

Figure 15. T-type distributions of the discussed selection methods applied to the NAIRsample indicated top left in each panel. The distribution of GALAXY ZOO spirals with P_CS,DB > [...] P_CS,DB > [...]

Table 10. Purity, completeness, bijective discrimination power, and contamination for other widely used morphological proxies, applied to the OPTICALsample (columns 2-5) and the NAIRsample using the GALAXY ZOO visual classifications (columns 6-9) as well as the independent classifications of Nair & Abraham (2010, columns 10-12). The values attained by the combinations (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) are shown for comparison.

                                OPTICALsample                 NAIRsample, GALAXY ZOO        NAIRsample, Nair & Abraham 2010
Method                          P_pure P_comp P_bij  P_cont   P_pure P_comp P_bij  P_cont   P_pure P_comp P_bij
(log(n), log(r_e), log(µ∗))     0.739  0.774  0.572  0.017    0.884  0.712  0.629  0.024    0.945  0.660  0.624
(log(n), log(r_e), M_i)         0.740  0.779  0.576  0.021    0.879  0.706  0.621  0.032    0.935  0.652  0.610
(log(n), log(M∗), log(µ∗))      0.731  0.773  0.565  0.019    0.885  0.707  0.626  0.024    0.946  0.657  0.621
Huertas-Company et al., 2011    0.588  0.903  0.531  0.077    0.806  0.836  0.673  0.054    0.898  0.802  0.720
Baldry et al., 2004             0.522  0.802  0.419  0.081    0.745  0.747  0.557  0.115    0.834  0.721  0.601
Tempel et al., 2011             0.648  0.411  0.266  0.078    0.786  0.387  0.304  0.064    0.896  0.380  0.340
n < 2.5                         [...]
C_r < 2.6                       [...]

However, depending upon the effort required to obtain a given parameter, either in terms of data processing or acquisition, the ‘cost’ of parameters, and hence of parameter combinations, will vary. For example, a parameter combination including only quantities such as r_e, M_i, u − r, and e which can, at least for reasonably resolved sources, often be measured directly by SExtractor (Bertin & Arnouts 1996) is ‘cheaper’ than a combination involving parameters which require additional data reduction such as fitting Sérsic profiles using, e.g., GIM2D (Simard et al. 2002) or
GALFIT (Peng et al. 2002). Similarly the relative ‘cost’ of additional NUV data is much higher than that of relying solely on optical passbands, as it involves the use of additional observational facilities.

Encouragingly, we find that various parameter combinations perform similarly well, allowing for a choice of parameter combination informed by both the envisioned science application as well as the relative ‘expense’ of the parameters used.

Overall, the most important parameters in selecting a sample of spiral galaxies are the effective radius log(r_e), the stellar mass surface density log(µ∗), and the Sérsic index log(n). These parameters perform especially well in combination with the stellar mass or a tracer thereof (e.g. M_i). We find the combinations (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) to be those with the greatest bijective discrimination power when applied to the OPTICALsample. These are also amongst the most powerful under NUV preselection, although the combination (
NUV − r, log(r_e), M_i) is comparably powerful. In the latter case, however, the selection appears to be driven by the parameters M_i and, in particular, log(r_e). In terms of relative ‘expense’ the combinations requiring NUV preselection are more ‘expensive’ than those applicable to the whole sample. Although the best-performing combinations all require Sérsic profiles to be fit, the cost is strongly ameliorated by the fact that only single Sérsic profiles are required.

Unsurprisingly, the ellipticity e proves to be an effective parameter, as only spirals seen edge-on appear strongly elliptical. In this sense, it even counters the bias against edge-on spirals, which can be introduced by using UV-optical colours as selection parameters, as dusty edge-on spirals may drop out of a colour selection due to attenuation of their UV emission. However, selections using e as a parameter are strongly biased against any spirals seen approximately face-on, respectively not edge-on. Thus, while the observed ellipticity represents a powerful criterion for selecting a pure sample of spirals and has a low relative cost, it leads to generally less complete samples, which are strongly biased towards edge-on systems.

Footnote: Where high resolution imaging is available these codes themselves present a different method of automatic morphological classification, as they can perform multiple component fits which can be used to determine the morphological type of a galaxy. However, the requirements on resolution are severe and fitting multiple components is often not justified (Simard et al. 2011).

Although our results indicate that simple structural parameters derived at longer wavelengths are efficient at selecting spirals, the combinations (
NUV − r, log(r_e), M_i), and to a lesser extent (u − r, log(n), log(r_e)), indicate that UV/optical colours linked to younger stellar populations do provide valuable information for selecting spiral galaxies. As mentioned above, however, use of UV-optical colour as a parameter can lead to biases in the selection. Dust in spirals will cause galaxies seen edge-on to appear very red, hence the use of a UV-optical colour can bias the selection against these systems. Furthermore, UV-optical colour selection can introduce a bias against any spirals which appear intrinsically red due to lack of star formation. This is the case both for the u − r and NUV − r colours. Finally, when using a colour as a parameter (in particular a UV colour) the possibility of different depths of photometry must be accounted for, i.e., the photometry in both bands must be deep enough to ensure that the entire range of colour normally attributed to the galaxy population is covered over the entire redshift range of the sample. Failure to do so will give rise to both additional incompleteness and a colour bias in the resulting sample.

Depending on the science application for which the sample is intended, and on the availability of data, different combinations may be optimal in selecting spiral galaxies. For example, using the combination (log(n), log(r_e), M_i) would be appropriate to obtain a selection of spiral galaxies for a project aiming at investigating the total star formation rates of a large sample of spiral galaxies as derived from the UV. Such a selection would avoid a bias against quiescent systems, as would be introduced by using a NUV preselection or a UV-optical colour, while also guarding against any orientation biases which could arise if e was used as a selection parameter. Accordingly such a sample would be largely unbiased with respect to star formation characteristics.
Another suitable combination for such an application would be (log(n), log(r_e), log(µ∗)), which is also largely independent of UV-optical colours. Conversely, however, a sample which required the greatest achievable purity should include both NUV preselection and e as a parameter. Thus, the selection can and should be adapted to the science case at hand, although the fact that UV data are not required allows the method to be easily applied to very large samples with minimum requirements on wavelength coverage.

Footnote: The stellar mass estimate used in deriving µ∗ does depend on an optical colour, i.e. the g − i colour; however, this colour is linked mainly to intermediate age and old stellar populations. Given photometry of sufficient depth, the g − i colour does not present a direct selection criterion but is only used in calculating the stellar mass, such that M∗ and µ∗ can be considered unbiased in terms of star-formation properties. Furthermore, the stellar mass M∗ derived in this manner is largely independent of dust attenuation (Bell & de Jong 2001; Nicol et al. 2011; Taylor et al. 2011).

[Figure 16: Hα EQW distributions; panel titles: n < 2.5, Baldry+2004, Huertas-Company+2011, Tempel+2010, C_r < 2.6; x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 16. Hα EQW distributions of the discussed selection methods indicated top left in each panel applied to the NAIRsample. The distribution of GALAXY ZOO spirals with P_CS,DB > [...] P_CS,DB > [...] Hα EQW relative to that of the Nair & Abraham (2010) classifications.

[Figure 17: Hα EQW distributions; panel titles: n < 2.5, Baldry+2004, Huertas-Company+2011, Tempel+2010, C_r < 2.6; x-axis: log(Hα EQW [Angstrom]); y-axes: relative frequency f and Δf.]

Figure 17. Hα EQW distributions of the discussed selection methods indicated top left in each panel applied to the OPTICALsample. The distribution of GALAXY ZOO spirals with P_CS,DB > [...] P_CS,DB > [...] Hα EQW relative to that of the GALAXY ZOO classifications.

As discussed in Sect. 6.1, we find that the most important parameters in selecting spirals are the effective radius log(r_e), the stellar mass surface density log(µ∗), and the Sérsic index log(n) in combination with the stellar mass or a tracer thereof (e.g. M_i). In addition e leads to very pure if incomplete selections. All these properties are derived in passbands normally associated with older stellar populations (g, r, and i), rather than with recent star formation. The success achieved by using parameters not obviously directly related to the young stellar population is remarkable and implies that the spiral and non-spiral populations are more or less distinct in these parameters. While the success of e is based on the appearance in projection of spiral galaxies, that of log(r_e) and log(µ∗), on the other hand, entails that the radial extent and in particular the ratio of mass to size of the old stellar population is distinctly different in spirals and ellipticals. Rotationally supported systems (i.e.
spirals) appear to be significantly more extended than pressure supported systems (i.e. spheroidals/ellipticals) at a given stellar mass.

(Footnote: This size dichotomy can be boosted further by the presence of dust in the disks, which can increase the apparent size of disks relative to the intrinsic size (Möllenhoff, Popescu, & Tuffs 2006; Pastrav et al. 2013a).)

This is consistent with the notion that the stellar populations evolve via distinct evolutionary tracks for disks and spheroids, with the evolution of present day spirals thought to involve a smooth infall of gas and inside-out star formation, with merger activity restricted to minor mergers. In contrast, ellipticals are thought to be the products of major mergers in which angular momentum is redistributed, making the central system more compact (e.g., Bournaud, Jog, & Combes 2007, and references therein).

In light of our results we emphasize that parameters linked to the old stellar population of galaxies, normally not employed in the classification of spirals, may provide valuable information on the morphology of a galaxy. In particular the stellar mass surface density and/or the radial extent (together with another parameter, e.g. M_i) may be powerful because they are physically motivated characterization parameters.

We have shown the cell-based method to work well for SDSS galaxies, in particular a subset of the SDSS spectroscopic sample. Hence we expect the method to be applicable to samples of similar depth and similar angular resolution, and thus to upcoming surveys similar to SDSS, e.g. SKYMAPPER (The Skymapper Southern Sky Survey; Keller et al. 2007). Many upcoming surveys (DES, VST ATLAS, KiDS, and GAMA (Galaxy And Mass Assembly; Driver et al. 2011)), as well as SDSS itself, however, extend to greater photometric depths than the sample used here. To answer the question of how applicable the method is to other, deeper surveys we have used a sample consisting of the 50k r-band brightest galaxies in the OPTICALsample (i.e. m_r < . 48) as a calibration sample and have subsequently classified the faintest 50k galaxies (m_r > . 24) for the combinations (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)). The results are shown in Table 11, where we have included the results obtained using the calibration sample employed in Sect. 4, as well as the results obtained using the widely used proxies discussed in Sect. 5 for comparison. Using the bright subsample to classify the faint subsample we find that the selections are very complete, yet appear to be less pure than when classifying the entire OPTICALsample. However, this is largely due to a decrease in the certainty of the GALAXY ZOO classifications for sources which appear fainter, as they predominantly lie at greater redshifts and are smaller and less resolved. This is underscored by the very low values of contamination achieved for the different combinations. The performance of the cell-based method remains easily superior to that of the simple proxies, achieving much greater purity and similar completeness. These results suggest that galaxy samples extending faintwards of the SDSS spectroscopic limit can also be classified using the method presented (cf. also Sect. 4.3.3).

Penultimately, the increased angular resolution and sensitivity of the upcoming surveys with respect to SDSS may allow the method to be extended to sources at higher redshifts than the current very local sample. A somewhat similar approach defining subspaces associated with early- and late-type galaxies using U − V and V − J restframe colours, calibrated using HST ACS imaging, has recently been proposed by Patel et al. (2012) for galaxies at z ∼ .
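The calibrate-on-bright, classify-faint test described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the function names `calibrate_cells` and `classify_cells`, the thresholds `min_frac` and `min_count`, the regular grid, and the two synthetic Gaussian populations standing in for spirals and non-spirals. The criteria actually used to assign cells to the spiral subvolume are defined in the earlier sections of the paper and are not reproduced here.

```python
import numpy as np

def calibrate_cells(params, is_spiral, edges):
    """Per-cell spiral fraction and total count from a calibration sample."""
    total, _ = np.histogramdd(params, bins=edges)
    spirals, _ = np.histogramdd(params[is_spiral], bins=edges)
    # Empty cells get a spiral fraction of zero.
    frac = np.divide(spirals, total, out=np.zeros_like(total), where=total > 0)
    return frac, total

def classify_cells(params, frac, total, edges, min_frac=0.5, min_count=5):
    """Select galaxies lying in cells that the calibration marks as spiral.

    min_frac and min_count are hypothetical thresholds, not the paper's.
    """
    inside = np.ones(len(params), dtype=bool)
    idx = []
    for d, e in enumerate(edges):
        i = np.searchsorted(e, params[:, d], side="right") - 1
        inside &= (i >= 0) & (i < len(e) - 1)   # galaxy falls on the grid
        idx.append(np.clip(i, 0, len(e) - 2))
    idx = tuple(idx)
    spiral_cell = (total[idx] >= min_count) & (frac[idx] >= min_frac)
    return inside & spiral_cell

def mock_sample(rng, n):
    """Synthetic 3-parameter sample: two Gaussian clouds (illustrative only)."""
    is_spiral = rng.random(n) < 0.5
    centre = np.where(is_spiral[:, None], 0.0, 2.0)
    return centre + rng.normal(scale=0.5, size=(n, 3)), is_spiral

rng = np.random.default_rng(42)
edges = [np.linspace(-2.0, 4.0, 13)] * 3          # 12 cells per axis

calib_params, calib_truth = mock_sample(rng, 20000)   # "bright" calibration set
faint_params, faint_truth = mock_sample(rng, 20000)   # "faint" set to classify

frac, total = calibrate_cells(calib_params, calib_truth, edges)
selected = classify_cells(faint_params, frac, total, edges)

# Fraction of selected objects that are true spirals, and fraction of true
# spirals that are selected (assumed here to mirror purity and completeness).
purity = (selected & faint_truth).sum() / selected.sum()
completeness = (selected & faint_truth).sum() / faint_truth.sum()
print(f"purity={purity:.3f} completeness={completeness:.3f}")
```

Because the cells are fixed by the calibration sample, the same `frac` and `total` arrays can be applied to any other sample measured in the same parameters, which is the property the bright-to-faint test probes.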
Table 11. Purity, completeness, bijective discrimination power, and contamination for the combinations (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) and the proxies discussed in Sect. 5 applied to the faintest 50k galaxies in the OPTICALsample, i.e. m_r > . 24. The results are presented for calibrations of the cell based method using the brightest 50k galaxies in the OPTICALsample (m_r < . 48) and the calibration sample employed in Sect. 4.

Cell-based method              Brightest 50k calibration       Sect. 4 calibration
                               P_pure P_comp P_bij  P_cont     P_pure P_comp P_bij  P_cont
(log(n), log(r_e), log(µ∗))    0.596  0.860  0.513  0.009      0.657  0.787  0.517  0.005
(log(n), log(r_e), M_i)        0.607  0.861  0.523  0.009      0.664  0.799  0.530  0.006
(log(n), log(M∗), log(µ∗))     0.602  0.844  0.508  0.009      0.647  0.781  0.506  0.006

Proxy                          P_pure P_comp P_bij  P_cont
Huertas-Company et al., 2011   0.477  0.934  0.446  0.078
Baldry et al., 2004            0.434  0.825  0.358  0.098
Tempel et al., 2011            0.549  0.551  0.302  0.071
n < 2.5
C_r < 2.6

The cell-based method presented here could, in principle, be adapted to identifying reliable samples of elliptical galaxies in an analogous fashion to that described for the identification of spirals. A certain population of the cells, dependent upon the requirements imposed, will not be assignable to either the spiral or the elliptical subvolume and will remain undefined. However, it is by no means clear that the parameter combinations which perform best at selecting a pure and complete population of spirals will do the same for ellipticals. As our focus has been to identify a method of reliably selecting spirals, we do not further discuss the selection of ellipticals. We note, however, that it would be straightforward to implement and optimize such a method. We have also supplied the elliptical fractions and relative errors for the three discretizations supplied in appendix A.
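As a quick consistency check on Table 11, the tabulated bijective discrimination power agrees, to within rounding, with the product of purity and completeness in every row that survives in the table. The short sketch below verifies this numerically. The relation P_bij = P_pure × P_comp is inferred here from the numbers themselves, not quoted from the definition given earlier in the paper, and the dictionary keys are labels chosen for this example.

```python
# (P_pure, P_comp, P_bij) as listed in Table 11.
table11 = {
    "n,re,mu (bright calib)":  (0.596, 0.860, 0.513),
    "n,re,Mi (bright calib)":  (0.607, 0.861, 0.523),
    "n,M,mu (bright calib)":   (0.602, 0.844, 0.508),
    "n,re,mu (Sect. 4 calib)": (0.657, 0.787, 0.517),
    "n,re,Mi (Sect. 4 calib)": (0.664, 0.799, 0.530),
    "n,M,mu (Sect. 4 calib)":  (0.647, 0.781, 0.506),
    "Huertas-Company+2011":    (0.477, 0.934, 0.446),
    "Baldry+2004":             (0.434, 0.825, 0.358),
    "Tempel+2011":             (0.549, 0.551, 0.302),
}

# Deviation between the tabulated P_bij and the product P_pure * P_comp.
deviations = {name: abs(p * c - b) for name, (p, c, b) in table11.items()}
max_dev = max(deviations.values())

for name, dev in deviations.items():
    print(f"{name:26s} |P_pure*P_comp - P_bij| = {dev:.4f}")
print(f"largest deviation: {max_dev:.4f}")
```

All deviations stay below 0.001, which is the level expected from rounding the three quantities to three decimal places.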
The use of parametric methods, such as linear discriminant analysis for example, in classifying galaxies is attractive, as these methods are capable of assigning a probabilistic classification to the morphology of a galaxy, rather than a binary one such as that presented here, which will suffer from contamination due to quantization effects. Furthermore, as also discussed in Section 3, calibrating the cell based method requires substantial samples of galaxies with visual classifications, while the training sets for parametric methods can be smaller. However, the applicability of such a parametric method depends on the probability distributions of galaxy properties conforming to the assumed parameterization, which may not be the case. A strength of the non-parametric method presented in this work is that it removes such biases arising from assumptions about the correct parameterization.

We suggest that the non-parametric method presented here can also be used to investigate the performance of parametric methods. If the results of both approaches are in reasonable agreement it may be possible to confidently employ the parametric method to select samples, relaxing the required size of a putative calibration sample. A further investigation into the performance of multi-parameter morphological classifications using linear discriminant analysis and the cell-based method presented here as a comparison will be presented in a companion paper (Robotham et al., in prep.).
As an application of the cell-based technique for selecting spiral galaxies we use it to rederive the empirical scaling relation between the specific star-formation rate and the stellar mass (the ψ∗ − M∗ relation) for this class of objects. Previous derivations of the ψ∗ − M∗ relation have used galaxy samples sensitive to star formation properties in their definition, thus potentially biasing the obtained results. A further factor influencing the derivation of the ψ∗ − M∗ relation is the attenuation of stellar emission from the galaxy due to its dust content, which introduces a large component of scatter, as well as potentially of bias, into the relation. Here we capitalize on the selection of a relatively pure sample of galaxies of known disk-like geometry, by applying a radiation transfer technique to correct for the attenuation of stellar emission by dust, utilizing the geometrical information (effective radii & axis ratio) of each galaxy. To this end, we utilize the method of Grootes et al. (2013), who have presented a method to obtain highly accurate radiation transfer based attenuation corrections on an object-by-object basis, using only broadband optical photometric observables not directly linked to star formation, in particular the stellar mass surface density. The method of Grootes et al. (2013), however, critically relies on the underlying radiation transfer model of Popescu et al. (2011) being applicable to the galaxies considered, and thus requires a clean sample of galaxies with disk geometry not hosting AGN.

ψ∗ − M∗ relation for morphologically-selected spiral galaxies in the local universe

Starting from the OPTICALsample we define a sample of spirals using the cell-based method and the parameter combination (log(n), log(r_e), M_i) and impose a redshift limit of z = 0.05. As shown by e.g. Taylor et al. (2011), the SDSS with a limiting depth of r_petro, = 17.77 is ≳ 80 % complete for M∗ > . M⊙ to this redshift. The sample considered thus represents a volume-limited sample for this mass range. The sample is further limited to objects with an NUV detection as well as those for which there is no UV counterpart to the SDSS galaxy in the preliminary GCAT MSC (Seibert et al., 2013 in prep.), excluding ambiguous multiple matches which would require flux redistribution. For the sources lacking an NUV counterpart, 3-σ upper limits have been calculated. Finally, objects defined as AGN following the prescription of Kewley et al. (2006) using the ratios of [NII] to Hα and [OIII] to Hβ have been excluded. This results in a total of 9885 galaxies, 536 of which have no counterpart in the preliminary GCAT MSC. A visual inspection of a random selection of these non-detected sources finds that a large fraction (∼ 50 %) of these non-detections lie in the vicinity of bright stars or at the very edge of GALEX tiles, so may actually have an NUV counterpart. In the following, we therefore proceed by considering two samples: i) the entire selected sample of spiral galaxies, treating all non-detections as real non-detections, and ii) only the subset of spirals with an NUV counterpart, implicitly assuming that all non-detections actually possess an NUV counterpart, and can thus be discarded. By comparing the ψ∗ − M∗ relation for the two samples, we will show that the effect of the NUV non-detections is negligible on the derivation of the ψ∗ − M∗ relation.

For all spiral galaxies, we have corrected the observed UV photometry (detections and upper limits) for the effects of attenuation by dust using the radiation-transfer based method presented in Grootes et al. (2013), and have derived values of ψ∗ from the de-attenuated UV photometry using the conversion factors given in Kennicutt (1998), scaled from a Salpeter (1955) IMF to a Chabrier (2003) IMF as in Treyer et al. (2007) and Salim et al. (2007). The required stellar masses have been derived as detailed in Sect. 2. Inclinations (required for the attenuation corrections alongside the effective radii) have been derived from the observed ellipticity as i = arccos(1 − e) and subsequently corrected for the effects of finite disk thickness as detailed in Sect. 3 of Driver et al. (2007), using an assumed intrinsic ratio of scale-height to semi-major axis of 0.12.

Fig. 18 shows the values of ψ∗ as a function of M∗ before and after correction for dust attenuation (middle and top panel, respectively), with the median in bins of 0 . M∗ shown as large filled circles with the errorbars indicating the interquartile range in logarithmic scatter in each bin. Without attenuation corrections, the ψ∗ − M∗ relation displays a mean logarithmic scatter of 0.70 dex (0.63 dex considering only NUV-detected sources) for the volume-limited sample. A pure power-law fit to the median distribution of the uncorrected sample finds an index of γ ≈ − . 8, but also shows that a pure power-law is only marginally suited to describing the distribution.

(Footnote: The mean logarithmic scatter is calculated as the difference between the quartiles of the distribution in ψ∗, averaged over 15 equal sized bins in M∗ spanning 10 . M⊙ M∗ M⊙, and weighted by the number of galaxies in each bin.)

After applying attenuation corrections, we find that the mean logarithmic scatter is reduced to 0.48 dex (0.43 dex considering only NUV-detected sources). In addition to this large reduction in scatter, we find that the median ψ∗ − M∗ relation for the volume-limited corrected sample is well represented by a pure power-law with an index of γ ≈ − . [...] M∗, and that this power-law also provides a good parameterization of the relation at least down to M∗ = 10 M⊙. The exact value of the power-law index found using a linear regression analysis of the binwise median of ψ∗ as shown in Fig. 18 is γ = − . ± . [...] M∗ considered, despite the use of a sample incorporating red, quiescent spirals not considered in previous studies.

Both for the corrected and uncorrected samples the median ψ∗ − M∗ relation is largely invariant between the whole sample and the subsample considering only NUV-detected galaxies, indicating that the true distribution of NUV detections and upper limits would provide similar results.

As the selection of spiral galaxies is purely morphologically based, the sample is capable of including very red and potentially passive spiral galaxies and should have a low contamination rate by ellipticals (∼ [...] ψ∗, which might affect the ψ∗ − M∗ relation. To investigate to what extent the visible population of passive spirals is in fact a population of misclassified ellipticals, we have visually inspected a random sample of galaxies with NUV detections, M∗ > . M⊙, and log(ψ∗/yr⁻¹) −11 after correction for dust. Fifteen randomly selected such galaxies are shown in Fig. 19. All but two galaxies (top right panel and middle panel of second row) are clearly disk dominated spirals, showing that the large majority of the considered population appear to be disk-like galaxies. This serves as further validation of the cell-based selection technique, and implies that the derived ψ∗ − M∗ relation is not biased by a large contamination of elliptical galaxies.

Conversely, even for the combination (log(n), log(r_e), M_i) a slight bias against early type spirals remains, which could potentially affect the ψ∗ − M∗ relation, in particular if a large fraction of the massive, red spiral population were missed by the cell-based selection method. To investigate this potential effect we begin by considering the early-type spirals in the NAIRsample (i.e. T-type > [...] NAIRsample using the cell-based method to be red (u − r > . 2) and massive (M∗ > . M⊙), compared to 38 % red and massive galaxies amongst the
Figure 18. Specific star formation rate (ψ∗) versus stellar mass (M∗) for a sample of spiral galaxies selected using the cell-based method and the parameter combination (log(n), log(r_e), M_i) and not hosting an AGN following the prescription of Kewley et al. (2006), with z ≤ 0.05. Individual sources are plotted as filled circles with the grayscale color indicating the relative source density at their position in the ψ∗ − M∗ plane. Values of ψ∗ have been derived from NUV photometry as described in Sect. 7.1. Galaxies without an NUV counterpart in the GCAT MSC (Seibert et al., 2013 in prep.) are shown as 3-σ upper limits. The limiting stellar mass of M∗ > . M⊙ above which the sample can be considered volume limited is indicated by a vertical dash-dotted line. The median value in bins of 0 . M∗ is shown as large filled circles, with errorbars depicting the interquartile range in each bin. The medians and scatter for the whole sample are shown in black, while those of the sample considering only sources with NUV counterparts are shown in red. The top panel shows the distribution and median relations after radiation transfer based attenuation corrections following Grootes et al. (2013) have been applied, while the middle panel shows the uncorrected distribution and median relations. The black and red dashed lines in the top and middle panels show power-law fits to the median relation in the mass range 10 . M⊙ M∗ M⊙, corresponding to the volume-limited sample. The bottom panel shows the corrected (circles) and uncorrected (stars) relations to facilitate a direct comparison of the slope and scatter before and after correction for dust attenuation. Spirals found to host an AGN following the prescription of Kewley et al. (2006) are shown by blue stars in the middle panel. The relations found using the prescription of Baldry et al. (2004) and a simple Sérsic index cut are shown in azure and orange respectively, with the dashed line showing the relation as determined from all galaxies considered, and the dash-dotted line indicating the relation as recovered using only the detected sources.
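The binned statistics shown in Fig. 18 (binwise medians, interquartile ranges, a number-weighted mean scatter, and a power-law fit to the medians) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the mass range 9.0 to 12.0 in log(M∗/M⊙), the input slope of −0.5 and the 0.3 dex Gaussian scatter are arbitrary choices for a synthetic test. Only the use of 15 equal-sized bins, of quartile differences weighted by bin occupancy, and of a linear regression on the binwise medians follows the text.

```python
import numpy as np

def binned_relation(log_m, log_psi, lo, hi, n_bins=15):
    """Median and interquartile range of log_psi in equal-width bins of log_m."""
    edges = np.linspace(lo, hi, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    which = np.digitize(log_m, edges) - 1
    med = np.full(n_bins, np.nan)
    iqr = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        y = log_psi[which == b]
        counts[b] = y.size
        if y.size:
            q25, q50, q75 = np.percentile(y, [25, 50, 75])
            med[b], iqr[b] = q50, q75 - q25
    return centres, med, iqr, counts

def mean_log_scatter(iqr, counts):
    """Interquartile range averaged over bins, weighted by galaxies per bin."""
    ok = counts > 0
    return np.average(iqr[ok], weights=counts[ok])

def power_law_index(centres, med):
    """Slope of a straight-line fit to the binwise medians in log-log space."""
    ok = np.isfinite(med)
    slope, _intercept = np.polyfit(centres[ok], med[ok], 1)
    return slope

# Synthetic test data (all numbers below are arbitrary, not from the paper).
rng = np.random.default_rng(0)
true_gamma, true_sigma = -0.5, 0.3
log_m = rng.uniform(9.0, 12.0, 20000)
log_psi = -10.0 + true_gamma * (log_m - 10.0) + rng.normal(0.0, true_sigma, log_m.size)

centres, med, iqr, counts = binned_relation(log_m, log_psi, 9.0, 12.0)
gamma = power_law_index(centres, med)
scatter = mean_log_scatter(iqr, counts)

# For a normal distribution the interquartile range equals about 1.349 sigma.
print(f"gamma={gamma:.3f} scatter={scatter:.3f} dex expected_iqr={1.349 * true_sigma:.3f} dex")
```

The factor 1.349 is the standard conversion between the interquartile range and the standard deviation of a normal distribution, so an interquartile scatter can be translated into an equivalent 1-σ value by dividing by it.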
early-type spirals not recovered by the cell-based method, implying that the early-type galaxies not recovered are not strongly weighted more towards massive red objects than those recovered. To judge the impact of the bias against early-type spirals on the ψ∗ − M∗ relation, however, it is necessary to consider not only the early-type galaxies, but the entire populations of spiral galaxies in the NAIRsample recovered and not recovered by the cell-based method. Overall, one finds that for galaxies classified as spirals by Nair & Abraham (2010) and recovered by the cell-based method with the parameter combination (log(n), log(r_e), M_i) massive red galaxies constitute 15 % of the sample, while massive red galaxies constitute 27 % of the spirals not recovered by the cell-based method. This relatively small shift in weight at the massive red end (∼ 12 %) combined with the high completeness fraction (> 65 %) attained by the cell-based selection implies that the results obtained for the ψ∗ − M∗ relation for spiral galaxies in the local universe are robust. Thus, although it is possible that the actual ψ∗ − M∗ relation may still be slightly steeper, this further steepening will be small compared to the steepening to the ψ∗ ∝ M∗^(− . ) law found for the cell-based sample.

Finally, Fig. 18 shows the location of spiral galaxies hosting an AGN on the ψ∗ − M∗ relation. Although the interpretation of the NUV emission of such sources as being indicative of their SFR is by no means secure, since the AGN can also significantly contribute to the NUV emission, we find that the ratio of NUV emission to stellar mass of spiral galaxies hosting optically identified AGN is not readily distinguishable from that of similar galaxies without an AGN. AGN host galaxies do, however, appear to be more massive than ∼ M⊙ as a rule, and display a larger scatter. Fig. 21 shows the locations of optically identified AGN in a sample of galaxies with the additional requirement of Hα and Hβ lines with S/N > [...]

ψ∗ − M∗ relations for colour-selected and Sérsic-index selected samples

We have previously argued and demonstrated that the cell-based method of selecting pure and complete samples of spiral galaxies is capable of including quiescent spirals and is therefore well-suited to investigating the ψ∗ − M∗ relation for a morphologically defined sample of spiral galaxies. In order to illustrate the effect that the choice of classification method has on the results derived for the ψ∗ − M∗ relation and demonstrate the necessity of an adequate selection method, Fig. 20 shows the relation for galaxy samples drawn from the OPTICALsample and limited to z ≤ 0.05 selected using the prescription of Baldry et al. (2004) (left) and the Sérsic index (right). Attenuation corrections have been applied using the method of Grootes et al. (2013) as previously described. The derived relations have also been overplotted in Fig. 18 for comparison. For the sample selected following the method of Baldry et al. (2004) we find a power law index of γ = − . ± . [...] γ = − . ± . 11 after applying attenuation corrections. Both before and after correction a single power-law appears to be an adequate representation of the ψ∗ − M∗ relation for this sample. Considering the scatter in the ψ∗ − M∗ relation we find that the relation is tight both before and after applying attenuation corrections, with values of 0.52 dex interquartile and 0.40 dex, respectively.

Figure 19. SDSS DR7 5 band images of a random selection of 15 spiral galaxies from the sample considered with an NUV counterpart in the GCAT MSC, M∗ > . M⊙, and log(ψ∗/M⊙ kpc−) −11 after attenuation corrections have been applied. All but two of the sources (top right and second row middle) display a disk-like morphology. The images have been retrieved using the SDSS Explore tool.

Using the Sérsic index to select a sample of spiral galaxies, we find a power-law index of γ = − . ± . [...] γ = − . ± . 14 after applying attenuation corrections. The ψ∗ − M∗ relation before correction, however, is not well described by a single power-law. For the sample selected in this manner, the ψ∗ − M∗ relation displays a scatter of 0.89 dex interquartile before applying attenuation corrections which is reduced to 0.59 dex interquartile by applying attenuation corrections.

For both these sample selection methods - by Sérsic index and by colour - the power-law indices recovered are indicative of a shallower relation than for the cell-based selection. Given the similarity of the relations at lower stellar masses (∼ . M⊙) this appears to be largely due to a difference in the samples in the high stellar mass range, with the cell-based selection recovering more quiescent spirals. This is in line with the finding that the samples selected by these widely used proxies are more strongly biased towards sources with large values of Hα equivalent width. It is particularly noteworthy that the colour based selection of Baldry et al. (2004) leads to a much shallower slope and a very low scatter, most likely due to the exclusion of quiescent galaxies.

This comparison demonstrates the care necessary in constructing galaxy samples for the purpose of statistical investigations and illustrates the suitability of the cell-based method of morphological classification for the investigation of the star formation properties of morphologically selected samples of spiral galaxies. A further discussion of the effects of sample construction on the ψ∗ − M∗ relation is given in Sect. 7.3.

In deriving the intrinsic ψ∗ − M∗ relation for spiral galaxies in the local universe we have made use of the prescription for obtaining attenuation corrections given by Grootes et al. (2013) and the radiation transfer model of Popescu et al. (2011), as empirically calibrated on a sample of nearby spirals (see Xilouris et al. 1999; Popescu et al. 2000, 2004; Misiriotis et al. 2001) and incorporating corrections for the effects of dust on the perceived effective radii of disks by Pastrav et al. (2013b). In order to investigate to what extent the results obtained depend on the chosen method of deriving attenuation corrections, we compare the results obtained using the prescription of Calzetti et al. (2000) with those obtained using the method of Grootes et al. (2013). These two correction methods, while both being empirically based, have a very different basis. Whereas the method of Grootes et al. (2013) is calibrated on a sample of local universe spirals with FIR-UV detections, the method of Calzetti et al. (2000) is calibrated on a sample of distant starburst galaxies, utilizing measurements of emission line fluxes. Furthermore, whereas, by virtue of its radiation transfer treatment, the method of Grootes et al. (2013) does not assume a fixed attenuation law in the UV/optical, this is the case for the method of Calzetti et al. (2000). This is potentially a critical factor when correcting for dust attenuation in spiral galaxies which lie on the transition between optically thick and thin systems, for which one expects a large range in the shape of the attenuation curve. Because of the requirement of emission line fluxes, the comparison must be based on a different sample, this time incorporating galaxies with Hα and Hβ line fluxes
measured at > σ, which effectively removes the population of red, quiescent galaxies. Thus, we select a sample of spiral galaxies with NUV counterparts, selected using the cell-based method with the parameter combination (log(n), log(r_e), M_i), with z ≤ 0.05, not hosting an AGN, and with Hα and Hβ line fluxes measured at > σ as the basis for the following comparison. We emphasize that the requirements on the spectroscopic information serve only to facilitate the comparison with the corrections obtained using the prescription of Calzetti et al. (2000).

Figure 20. Specific star formation rate (ψ∗) versus stellar mass (M∗) for a sample of spiral galaxies selected using the method of Baldry et al. (2004) (left top and bottom) and a simple Sérsic index cut (right top and bottom) and not hosting an AGN following the prescription of Kewley et al. (2006), with z ≤ 0.05. Individual sources are plotted as filled circles with the grayscale color indicating the relative source density at their position in the ψ∗ − M∗ plane. Values of ψ∗ have been derived as previously detailed. Galaxies without an NUV counterpart in the GCAT MSC (Seibert et al., 2013 in prep.) are shown as 3-σ upper limits. The median values of ψ∗ in bins of 0 . M∗ are shown as large symbols, with errorbars depicting the interquartile range in each bin. The medians and scatter for the whole sample are shown in filled symbols and colour, while the medians of the sample considering only sources with NUV counterparts are shown as black outlines. The top panels show the distribution and median relations after radiation transfer based attenuation corrections following Grootes et al. (2013) have been applied, while the bottom panels show the uncorrected distribution and median relations. The dashed and dash-dotted lines in the top and bottom panels show power-law fits to the median relations in the mass range 10 . M⊙ M∗ M⊙, corresponding to the volume-limited samples, with the dashed line showing the relation derived for the entire sample and the dash-dotted line showing the relation as derived only for the detected sources. Spiral galaxies found to host an AGN following the prescription of Kewley et al. (2006) are shown by blue stars in the bottom panels.

Fig. 21 shows the distributions of ψ∗ as a function of M∗ without corrections for dust (top left) and with corrections obtained using the radiation-transfer based method of Grootes et al. (2013) (bottom left). The ψ∗ − M∗ relation obtained using the method detailed in Calzetti et al. (2000) for correcting dust attenuation is shown in the top right panel. As in the case for the full sample incorporating red disks, the radiation transfer based corrections lead to a significant tightening of the relation, in this case reducing the mean logarithmic scatter from 0.58 dex to 0.37 dex. This lends confidence that the radiation transfer method also has the ability to predict the correct overall shift in the relation (see also discussion in Grootes et al. 2013, Sects. 5 & 6). By contrast, under the application of the corrections based on the Balmer decrement the scatter remains at 0.49 dex. Nevertheless, the overall shift in the relation towards larger values of ψ∗ by 0 . [...] ψ∗ on M∗ than found for the uncorrected relation, with the slope of the relation obtained using the prescription of Calzetti et al. (2000) being slightly shallower than that of the relation obtained by applying the method of Grootes et al. (2013). The power-law index found under both corrections is close to γ ≈ − . 4. The flattening compared to the power-law index of γ ≈ − . [...] M∗ ≈ . M⊙, not found when using the Grootes et al. (2013) attenuation corrections. The fact that the Grootes et al. attenuation corrections significantly reduce the overall scatter in the relation may imply that the break is actually not physical in nature, but rather may be an artifact of the application of the Calzetti et al. (2000) corrections to high mass spiral galaxies.

ψ∗ − M∗ relation with previous determinations

Previous determinations of the ψ∗ − M∗ relation have generally necessarily been restricted to galaxy samples encompassing the complete population of galaxies (e.g. Salim et al. 2007; Elbaz et al. 2007; Noeske et al. 2007), or to samples selected on the basis of colour or star-formation activity (e.g. Peng et al. 2010; Whitaker et al. 2012). As such, the ψ∗ − M∗ relation has been defined in terms of a blue sequence, or more generally a sequence of star-forming galaxies, and has been contrasted with a red sequence, or more generally a sequence of non-star-forming galaxies (Peng et al. 2010, respectively Noeske et al. 2007; Whitaker et al. 2012). However, the more fundamental distinction may be the morphology of the galaxy. This is because, while rotationally supported galaxies can support an extended cold ISM which can sustain distributed star-formation, any extended ISM in a spheroid must be hot and tenuous if it is in virial equilibrium with the total mass distribution as traced by the stars, in which case it would be expected to be inefficient in forming stars. To constrain processes driving star-formation in galaxies, it is therefore instructive to establish the ψ∗ − M∗ relation for a pure disk sample.

We have found this relation to be a relatively tight (0.42 dex mean logarithmic interquartile range, corresponding to 0.31 dex 1-σ for a normal distribution) power-law with an index of γ = − . ± . 12, with no indication of a cut-off at high stellar mass. This result shows that the phenomenon of down-sizing is also exhibited by a morphologically pure sample of disk galaxies, and is not just due to an increasing fraction of spheroids with increasing stellar mass in the general galaxy population.

(Footnote: Down-sizing describes the phenomenon that star-formation in the current epoch is biased towards low mass structures, in contrast to the sequence of growth in dark matter structures, which progresses from low mass to high mass.)

The lack of an obvious turn-off in the ψ∗ − M∗ relation for spirals, despite the inclusion of red quiescent spirals, suggests that if a mechanism exists to restrict the growth of spiral galaxies beyond the stellar mass range probed, such a mechanism must be accompanied by an abrupt transformation of galaxy morphology.

As outlined above, previous works addressing the ψ∗ − M∗ relation have concentrated on the sequence of star-forming galaxies rather than a morphologically defined sample. For example Peng et al. (2010) make use of a U − B color selection (their Eq. 2) akin to that of Baldry et al. (2004) investigated in Sect. 7.1 of this paper, applying it to a sample of SDSS galaxies with star formation rates derived from Hα line measurements as provided by Brinchmann et al. (2004). These authors find a power law index of γ = − . 1, much shallower than the relation found in this work. Similarly, Whitaker et al. (2012) find that for local universe star-forming galaxies selected using U − V & V − J restframe colors, selecting a blue subset of these galaxies results in a shallow slope similar to that of Peng et al. (2010). However, considering their full sample of star-forming galaxies Whitaker et al. (2012) find a steeper slope of γ ≈ − . 4. Finally, Noeske et al. (2007) find a slope of γ = − . ± . 08 for local universe galaxies with indications of on-going star-formation either in form of 24 µm emission and/or Hα emission.

The fact that these previously determined values of γ are all shallower than the relation found for a morphologically selected sample of spirals presented in this work can be readily understood. By selecting actively star-forming systems, quiescent galaxies of similar morphology are excluded from the samples. As passive spirals tend to be more massive, on average, this leads to a flattening of the ψ∗ − M∗ relation with respect to a morphologically defined sample, as similarly argued by Whitaker et al. (2012) in the context of the result of Peng et al. (2010). Indeed, for the sample of spirals selected using the cell-based method with the combination (log(n), log(r_e), M_i) and the additional requirement of Hα and Hβ detections, as used in Sect. 7.2, we find the ψ∗ − M∗ relation to be well described by a single power-law with an index of γ = − . ± . 09 and a scatter of 0.37 dex interquartile (0.27 dex 1-σ, assuming a normal distribution), very similar to the results for star-forming galaxies as obtained by other authors as previously discussed.

Overall, we thus find that the ψ∗ − M∗ relation for a morphologically selected sample of spiral galaxies with an index of γ = − . [...] γ = − . · · · − . [...]

We have presented a non-parametric cell-based method of selecting robust, pure, complete, and largely unbiased samples of spirals using combinations of three parameters derived from (UV/)optical photometry. We find that the parameters log(r_e), log(µ∗), log(n), and M_i perform well in selecting simultaneously pure and complete samples, while the use of the ellipticity e leads to pure yet incomplete samples. These parameters, which are linked to older stellar populations, perform at least as well as selections using the u − r colour or the NUV − r colour after NUV preselection. The remarkable success/importance of these seldom utilized parameters is consistent with the expected contrast in the structural properties of rotationally supported systems (spirals) and pressure supported systems (ellipticals), in agreement with different evolutionary tracks for spiral and elliptical galaxies.

For a selection of combinations of three parameters, the cell-based method is superior to a range of (widely used) photometric morphological proxies, and comparable to the
Figure 21.
Specific star formation rate ψ ∗ versus stellar mass M ∗ for a subsample of spirals galaxies drawn from the OPTICALsample using the cell-based method and the parameter combination (log( n ),log( r e ), M i ) with z .
05, NUV detections and H α and H β fluxesat > σ , not hosting an AGN. The linear grayscale indicates the relative galaxy density in the ψ ∗ − M ∗ plane at the position of thegalaxy. The same scale has been applied to all panels. The vertical dashed-dotted line indicates the stellar mass limit above which thesample can be considered complete. The sources are binned in bins of equal size in M ∗ , with the bars showing the interquartile range andthe filled symbols (stars, inverted triangles and circles) showing the median value of ψ ∗ in each bin. The dashed line in the top panelsand the bottom left panel shows a single power-law fit to the binwise median values in the mass range 10 . M ⊙ M ∗ M ⊙ . Thebottom right panel shows the median relations to facilitate comparison. The uncorrected relation is shown as inverted triangles and adash-dotted line. The relation corrected for dust attenuation following Grootes et al. (2013) is shown as circles and a solid line, whilethe relation corrected for dust attenuation following Calzetti et al. (2000) is shown as stars and a dashed line. The bin centers have beenoffset by 0.01 in log( M ∗ ) for improved legibility. The scatter in the relation due to the scatter in the NUV is significantly reduced forthe corrections based on the radiation transfer model, while the Balmer decrement based corrections have no discernible effect on thescatter. In both cases the intrinsic values of ψ ∗ are shifted upwards w.r.t. the uncorrected values. Spiral galaxies fulfilling the criteria ofthe sample but hosting an AGN have been overplotted as blue stars in the top left panel. algorithmic classification approach using support vector ma-chines presented by Huertas-Company et al. (2011) in select-ing pure and complete samples of spirals from faint galaxysurveys.The optimum combinations for use with the method mayvary according to the science application for which the sam-ple is being constructed. 
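As a rough illustration of the procedure described in the caption of Figure 21 (binning in M∗, taking the median and interquartile range of ψ∗ in each bin, and fitting a single power law to the binwise medians), a minimal Python sketch follows. It is not the authors' code: the function names, the synthetic data, the bin edges, the unweighted least-squares fit, and the input slope and scatter (`TOY_GAMMA`, `TOY_SIGMA`) are all assumptions chosen purely for illustration. The only non-arbitrary number is the factor of about 1.349 relating the interquartile range of a normal distribution to its standard deviation, which is consistent with the conversions quoted in the text (0.42 dex to 0.31 dex, and 0.37 dex to 0.27 dex).

```python
import numpy as np

# Interquartile range of a normal distribution in units of its standard deviation.
IQR_PER_SIGMA = 1.3489795

# Hypothetical inputs for the synthetic demonstration (not values from the paper).
TOY_GAMMA = -0.6   # assumed power-law index of the toy relation
TOY_SIGMA = 0.3    # assumed Gaussian scatter in dex


def binned_medians(log_mass, log_ssfr, edges):
    """Return bin centers, medians and interquartile ranges of log_ssfr per mass bin."""
    centers, medians, iqrs = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (log_mass >= lo) & (log_mass < hi)
        if not in_bin.any():
            continue  # skip empty bins
        q25, q50, q75 = np.percentile(log_ssfr[in_bin], [25, 50, 75])
        centers.append(0.5 * (lo + hi))
        medians.append(q50)
        iqrs.append(q75 - q25)
    return np.array(centers), np.array(medians), np.array(iqrs)


def fit_power_law(centers, medians):
    """Unweighted straight-line fit in log-log space; the slope is the index gamma."""
    gamma, intercept = np.polyfit(centers, medians, 1)
    return gamma, intercept


def iqr_to_sigma(iqr_dex):
    """Convert an interquartile range to the equivalent 1-sigma of a normal distribution."""
    return iqr_dex / IQR_PER_SIGMA


# Synthetic sample: a pure power law plus Gaussian scatter.
rng = np.random.default_rng(42)
log_mass = rng.uniform(9.5, 11.0, size=5000)
log_ssfr = -10.0 + TOY_GAMMA * (log_mass - 10.0) + rng.normal(0.0, TOY_SIGMA, size=5000)

edges = np.linspace(9.5, 11.0, 7)  # six bins of equal size in log(M*)
centers, medians, iqrs = binned_medians(log_mass, log_ssfr, edges)
gamma, intercept = fit_power_law(centers, medians)

print(f"fitted index gamma = {gamma:.2f}")
print(f"mean IQR = {iqrs.mean():.2f} dex, equivalent 1-sigma = {iqr_to_sigma(iqrs.mean()):.2f} dex")
```

Fitting the binwise medians rather than the individual points makes the result less sensitive to outliers and to the uneven sampling of the mass range, at the cost of a dependence on the chosen bin edges.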
For application to optically defined galaxy samples comparable in depth to or deeper than SDSS, we identify the combinations (log(n), log(r_e), log(µ∗)), (log(n), log(r_e), M_i), and (log(n), log(M∗), log(µ∗)) to be the most efficient in selecting a sample of spirals balanced between purity and completeness.

While using NUV data can lead to purer samples, it introduces a possible bias against UV-faint sources and edge-on systems. Furthermore, we caution that making use of UV/optical colours places additional stringent requirements on the depths of the samples used in order to provide complete and unbiased samples.

In this paper, we have used the cell-based classification scheme with the parameter combination (log(n), log(r_e), M_i) to investigate the specific star-formation rate - stellar mass (ψ∗ − M∗) relation for a purely morphologically defined sample of spiral galaxies. Using this approach, which is unbiased in terms of star-formation properties and includes red, quiescent spiral galaxies, we find that the intrinsic, i.e. dust corrected, ψ∗ − M∗ relation for spiral galaxies can be represented as a single continuous power-law with an index of − . . M ⊙ M ∗ M ⊙ , likely even extending to 10 M ⊙ M ∗ . Despite the inclusion of quiescent galaxies, the relation is also found to be very tight, with a mean interquartile range of 0 . n), log(r_e), M_i), as used in the investigation of the ψ∗ − M∗ relation, as well as for the combinations (log(n), log(r_e), log(µ∗)) and (log(n), log(M∗), log(µ∗)) in Appendix A, together with a brief instruction on their use.

Immediate future work will focus on using the method presented to test the performance of linear discriminant analysis using multiple parameters in the morphological classification of galaxies (Robotham et al., in prep.), as well as on defining samples of spirals for use in applications of radiation transfer modelling techniques (Popescu et al. 2011), which critically rely on the existence of the appropriate geometry (in this case spiral disk geometry), to derive self-consistent corrections of the attenuation of UV/optical light by dust in these objects.

ACKNOWLEDGEMENTS
We thank Ted Wyder for his assistance in compiling the sample. Some of the results in this paper have been derived using the HEALPix package (http://healpix.jpl.nasa.gov).

REFERENCES

Abazajian K. N., et al., 2007, ApJS, 182, 543
Abraham R. G., van den Bergh S., Nair P., 2003, ApJ, 588, 218
Adelman-McCarthy J. K., et al., 2006, ApJS, 162, 38
Baldry I. K., Glazebrook K., Brinkmann J., Ivezić Ž., Lupton R. H., Nichol R. C., Szalay A. S., 2004, ApJ, 600, 681
Balogh M. L., et al., 2004, ApJ, 615, L101
Bamford S., et al., 2009, MNRAS, 393, 1324
Banerji M., et al., 2010, MNRAS, 406, 342
Barden M., et al., 2005, ApJ, 635, 959
Bell E., de Jong R. S., 2001, ApJ, 550, 212
Bell E., et al., 2004, ApJ, 600, L11
Bernardi M., et al., 2012, MNRAS, in press, arXiv:1211.6122
Bertin E., Arnouts S., 1996, A&AS, 112, 393
Blanton M. R., et al., 2003, ApJ, 594, 186
Blanton M., Roweis S., 2007, AJ, 133, 734
Bournaud F., Jog C. J., Combes F., 2007, A&A, 476, 1179
Brinchmann J., et al., 2004, MNRAS, 351, 1151
Calzetti D., et al., 2000, ApJ, 533, 682
Chabrier G., 2003, PASP, 115, 763
Conselice C. J., 2003, ApJS, 147, 1
de Jong J. T. A., Verdoes Kleijn G. A., Kuijken K. H., Valentijn E. A., 2012, ExA, in press, arXiv:1206.1254
de Vaucouleurs G., 1964, AnAp, 11, 247
Driver S. P., Popescu C. C., Tuffs R. J., Liske J., Graham A. W., Allen P. D., de Propris R., 2007, MNRAS, 379, 1022
Driver S. P., et al., 2011, MNRAS, 413, 971
Driver S. P., et al., 2012, MNRAS, 427, 3244
Einasto M., et al., 2010, A&A, 522, 92
Elbaz D., et al., 2007, A&A, 468, 33
Gini C., 1912, reprinted in Memorie di Metodologia Statistica, ed. E. Pizetti & T. Salvemini (1955; Rome: Libreria Eredi Virgilio Veschi)
Górski K. M., et al., 2005, ApJ, 622, 759
Graham A. W., Driver S. P., Petrosian V., Conselice C. J., Bershady M. A., Crawford S. M., Goto T., 2005, AJ, 130, 1535
Grootes M. W., et al., 2013, ApJ, 766, 59
Hopkins A. M., McClure-Griffiths N. M., Gaensler B. M., 2008, ApJ, 682, L13
Hubble E. P., 1926, ApJ, 64, 321
Huertas-Company M., Rouan D., Tasca L., Soucail G., Le Fèvre O., 2008, A&A, 478, 971
Huertas-Company M., Aguerri J. A. L., Bernardi M., Mei S., Sánchez Almeida J., 2011, A&A, 525, 157
Jogee S., et al., 2004, ApJ, 615, L105
Kauffmann G., et al., 2003, MNRAS, 341, 54
Keller S. C., et al., 2007, PASA, 24, 1
Kelvin L. S., et al., 2012, MNRAS, 421, 1007
Kennicutt R. C., 1998, ARA&A, 36, 189
Kewley L. J., Groves B., Kauffmann G., Heckman T., 2006, MNRAS, 372, 961
Laureijs R., et al., 2011, arXiv:1110.2193v1
Lintott C. J., et al., 2008, MNRAS, 389, 1179
Lintott C. J., et al., 2011, MNRAS, 410, 166
Lotz J. M., Primack J., Madau P., 2004, AJ, 128, 163
Martin D. C., et al., 2005, ApJL, 619, L1
Morrissey P., et al., 2007, ApJS, 173, 682
Morgan W. W., Keenan P. C., 1973, ARA&A, 11, 29
Misiriotis A., Popescu C. C., Tuffs R. J., Kylafis N. D., 2001, A&A, 372, 775
Möllenhoff C., Popescu C. C., Tuffs R. J., 2006, A&A, 456, 941
Moustakas J., et al., 2013, ApJ, 767, 50
Patel S. G., Holden B. D., Kelson D. D., Franx M., van der Wel A., Illingworth G. D., 2012, ApJ, 748, L27
Nair P. B., Abraham R. G., 2010, ApJS, 186, 427
Nicol M.-H., Meisenheimer K., Wolf C., Tapken C., 2011, ApJ, 727, 51
Noeske K. G., et al., 2007, ApJ, 660, L43
Pastrav B. A., Popescu C. C., Tuffs R. J., Sansom A., 2013, A&A, 553A, 80
Pastrav B. A., Popescu C. C., Tuffs R. J., Sansom A., 2013, A&A, 557A, 137
Peng C. Y., Ho L. C., Impey C. D., Rix H.-W., 2002, AJ, 124, 266
Peng Y.-j., et al., 2010, ApJ, 721, 193
Popescu C. C., Misiriotis A., Kylafis N. D., Tuffs R. J., Fischera J., 2000, A&A, 362, 138
Popescu C. C., Tuffs R. J., Kylafis N. D., Madore B. F., 2004, A&A, 414, 45
Popescu C. C., Tuffs R. J., Dopita M. A., Fischera J., Kylafis N. D., Madore B. F., 2011, A&A, 527, 109
Ravindranath S., et al., 2004, ApJ, 604, L9
Robotham A. S. G., Driver S. P., 2011, MNRAS, 413, 2570
Robotham A. S. G., et al., 2013, MNRAS, in press, arXiv:1301.7129
Salim S., et al., 2007, ApJS, 173, 267
Salpeter E. E., 1955, ApJ, 121, 161
Scarlata C., et al., 2007, ApJS, 172, 406
Schlegel D. J., Finkbeiner D. P., Davis M., 1998, ApJ, 500, 525
Sérsic J.-L., 1968, Atlas de Galaxias Australes (Cordoba: Observatorio Astronomico)
Simard L., et al., 2002, ApJS, 142, 1
Simard L., Mendel J. T., Patton D. R., Ellison S. L., McConnachie A. W., 2011, ApJS, 196, 11
Stoughton C., et al., 2002, AJ, 123, 485
Strateva I., et al., 2001, AJ, 122, 1861
Taylor E. N., et al., 2011, MNRAS, 418, 1587
Tempel E., Saar E., Liivamägi L. J., Tamm A., Einasto J., Einasto M., Müller V., 2011, A&A, 529, A53
The DES collaboration, 2005, 2005astro.ph.10346T
Treyer M., et al., 2007, ApJS, 173, 256
Tuffs R. J., Popescu C. C., Völk H. J., Kylafis N. D., Dopita M. A., 2004, A&A, 419, 821
Whitaker K. E., van Dokkum P. G., Brammer G., Franx M., 2012, ApJ, 754, L29
Wyder T., et al., 2007, ApJS, 173, 293
Xilouris E. M., Byun Y. I., Kylafis N. D., Paleologou E. V., Papamastorakis J., 1999, A&A, 344, 868
York D. G., et al., 2000, AJ, 120, 1579
APPENDIX A: CELL DECOMPOSITIONS OF PARAMETER SPACE
We have found the parameter combinations (log(n), log(r_e), M_i), (log(n), log(r_e), log(µ∗)), and (log(n), log(M∗), log(µ∗)) to be most efficient in retrieving a simultaneously pure and complete, largely unbiased sample of spiral galaxies when applied to the optically defined galaxy sample used in this work. In addition to the high values of purity and completeness, these selections require a minimal amount of spectral coverage, and hence can readily be applied to various samples of galaxies.

Tabs. A1, A2, & A3 provide the decompositions of the parameter space spanned for the combinations (log(n), log(r_e), M_i), (log(n), log(r_e), log(µ∗)), and (log(n), log(M∗), log(µ∗)), respectively. These discretizations have been performed using the entire OPTICALsample as a calibration sample to maximize the purity and completeness. The full tables are available in the online version of the paper and in machine readable form from the VizieR Service at the CDS (via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/). Rather than supply a binary classification into spiral and non-spiral cells, we supply the spiral fraction and its relative error for each cell, allowing readers to adapt the classification to their purposes. We note, however, that the underlying definition of a reliable spiral (P_CS,DB > . 7) is fixed.

In addition, we have chosen to provide the elliptical fraction for each cell and its relative error, where ellipticals are, analogously to spirals, defined as sources with P_EL,DB > . F_sp, its relative error ∆F_sp,rel, the elliptical fraction F_el, its relative error ∆F_el,rel, and the resolution level the cell belongs to (1: 1 division per axis; 2: 4 divisions per axis; 3: 8 divisions per axis; 4: 16 divisions per axis). With this information the entire grid can, if desired, be reconstructed. For classifying galaxies the tables can be used as follows:

• select criteria for being a spiral (or elliptical) cell in terms of F_sp and ∆F_sp,rel (respectively F_el and ∆F_el,rel)
• for each source identify the nearest grid point to its forward lower left
• assign the values of F_sp and ∆F_sp,rel from the corresponding cell to the source in question
• after completion for all sources select those corresponding to the selection criteria determined

Table A1. Excerpt of cell grid for the combination (log(n), log(r_e), M_i). For cells with a spiral (elliptical) population of 0 the relative error is set to 1e6.
Columns: resolution | corner coordinates: log(n), log(r_e), M_i | cell dimensions: dlog(n), dlog(r_e), dM_i | spiral fractions: F_sp, ∆F_sp,rel | elliptical fractions: F_el, ∆F_el,rel

Table A2. Excerpt of cell grid for the combination (log(n), log(r_e), log(µ∗)). For cells with a spiral (elliptical) population of 0 the relative error is set to 1e6.
Columns: resolution | corner coordinates: log(n), log(r_e), log(µ∗) | cell dimensions: dlog(n), dlog(r_e), dlog(µ∗) | spiral fractions: F_sp, ∆F_sp,rel | elliptical fractions: F_el, ∆F_el,rel

Table A3. Excerpt of cell grid for the combination (log(n), log(M∗), log(µ∗)). For cells with a spiral (elliptical) population of 0 the relative error is set to 1e6.
Columns: resolution | corner coordinates: log(n), log(M∗), log(µ∗) | cell dimensions: dlog(n), dlog(M∗), dlog(µ∗) | spiral fractions: F_sp, ∆F_sp,rel | elliptical fractions: F_el, ∆F_el,rel

This paper has been typeset from a TEX/LATEX file prepared by the author.
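The lookup steps listed in Appendix A can be sketched in Python as below. This is a hedged illustration, not the authors' implementation: the array layout, the function names `assign_cells` and `select_spirals`, the toy two-cell grid, and the thresholds on F_sp and ∆F_sp,rel are all hypothetical. The sketch assumes that the tabulated corner coordinates are the lower bounds of each cell on every axis and that the tabulated cells do not overlap, so that finding the grid point to the lower left of a source is equivalent to finding the cell whose corner and dimensions enclose it.

```python
import numpy as np


def assign_cells(points, corners, dims):
    """Return, for each point, the index of the enclosing cell, or -1 if there is none.

    A point lies in a cell if corner <= x < corner + dim on every axis.
    """
    points = np.asarray(points, dtype=float)
    corners = np.asarray(corners, dtype=float)
    dims = np.asarray(dims, dtype=float)
    # inside[i, j] is True if point i lies within cell j on all three axes.
    above_lower = points[:, None, :] >= corners[None, :, :]
    below_upper = points[:, None, :] < (corners + dims)[None, :, :]
    inside = np.all(above_lower & below_upper, axis=2)
    idx = np.argmax(inside, axis=1)      # first enclosing cell (0 if none)
    idx[~inside.any(axis=1)] = -1        # flag points outside the grid
    return idx


def select_spirals(points, corners, dims, f_sp, df_sp_rel, min_fraction, max_rel_error):
    """Boolean mask of sources whose cell satisfies the chosen spiral criteria."""
    f_sp = np.asarray(f_sp, dtype=float)
    df_sp_rel = np.asarray(df_sp_rel, dtype=float)
    idx = assign_cells(points, corners, dims)
    found = idx >= 0
    selected = np.zeros(len(idx), dtype=bool)
    cell = idx[found]
    selected[found] = (f_sp[cell] >= min_fraction) & (df_sp_rel[cell] <= max_rel_error)
    return selected


# Toy grid in (log(n), log(r_e), M_i): two hypothetical cells split along log(n).
corners = [[-0.5, -1.0, -24.0], [0.3, -1.0, -24.0]]
dims = [[0.8, 2.0, 8.0], [0.7, 2.0, 8.0]]
f_sp = [0.9, 0.1]         # hypothetical spiral fractions
df_sp_rel = [0.05, 0.3]   # hypothetical relative errors

# Three hypothetical sources; the last one falls outside the grid.
sources = [[0.0, 0.5, -20.0], [0.6, 0.5, -20.0], [2.0, 0.5, -20.0]]

cells = assign_cells(sources, corners, dims)
mask = select_spirals(sources, corners, dims, f_sp, df_sp_rel,
                      min_fraction=0.5, max_rel_error=0.2)
print(cells, mask)
```

Cells with no spirals carry a relative error of 1e6 in the tables, so any finite cut on the relative error removes them automatically. The pairwise comparison uses memory proportional to the number of sources times the number of cells; for large catalogues the sources can be processed in chunks.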