NOISE-BASED DETECTION AND SEGMENTATION OF NEBULOUS OBJECTS
Mohammad Akhlaghi and Takashi Ichikawa
Astronomical Institute, Tohoku University, Aoba, Sendai 980-8578, Japan; [email protected]
The Astrophysical Journal Supplement Series, 220:1 (33pp), 2015 September. ©2015. The American Astronomical Society. All rights reserved. doi:10.1088/0067-0049/220/1/1
Received 2014 November 17; accepted 2015 April 6; published 2015 August 26
ABSTRACT
A noise-based, non-parametric technique for detecting nebulous objects, for example, irregular or clumpy galaxies, and their structure in noise is introduced. "Noise-based" and "non-parametric" imply that this technique imposes negligible constraints on the properties of the targets and that it employs no regression analysis or fitting. The sub-sky detection threshold is defined and initial detections are found independently of the sky value. False detections are then estimated and removed using the ambient noise as a reference. This results in a purity level of 0.88, as compared to 0.29 for SExtractor, when a completeness of 1 is desired for a sample of extremely faint and diffuse mock galaxy profiles. The difference between the mean of the undetected pixels and the known background of the mock images is also decreased by a factor of ∼4. NoiseChisel is our software implementation of this new technique. Contrary to the existing signal-based approach to detection, in its various implementations, signal-related parameters such as the image point spread function or known object shapes and models are irrelevant here. Such features make this technique very useful in astrophysical applications such as detection, photometry, or morphological analysis of nebulous objects buried in noise, for example, galaxies that do not generically have a known shape when imaged.
Key words: galaxies: irregular – galaxies: photometry – galaxies: structure – methods: data analysis – techniques: image processing – techniques: photometric
Free reproduction: All the data-generated numbers and figures in this paper are exactly reproducible/configurable with a make command. See reproduce/README in the arXiv source files. All results were generated by free software.

1. INTRODUCTION

Galaxies are one of the most prominent forms of nebulous or amorphous objects in astrophysics and astronomical image processing. They appear to have a very diverse kinematic history and thus display a very large variety of shapes and forms. They also have complex three-dimensional structures, which we can only observe in two dimensions. Galaxy images can host an arbitrary number of clumps positioned anywhere over their light profiles. These clumps might be minor or major mergers, supernovae, other galaxies or stars on the same line of sight, or highly localized unobscured star-forming regions. With such hurdles, astronomical image processing, and in particular galaxy morphological analysis, can be considered one of the most demanding disciplines in image processing.

Observations are inevitably diluted with noise. Any systematic bias in measuring the noise properties will directly propagate to all higher-level measurements on the detected signal or scientific targets. Therefore, accurately measuring the noise characteristics is the most fundamental and primary issue in data analysis and subsequent astrophysical interpretation. However, noise can only be accurately characterized when the signal or object(s) buried in the noise are detected and removed. Therefore, any difficulty in detection directly translates into a systematic bias in the measured noise properties. For example, objects that are very faint and diffuse overall are very hard to detect. For brighter objects that can be detected, identifying their possible boundaries is a major concern.
The uniformity in shape, or morphology, between various objects can also play a significant role in detectability.

Detection of nebulous astronomical targets significantly suffers from all the problems mentioned above. Because of the instrumental and atmospheric point spread function (PSF), and also the intrinsic shapes of galaxies, their profiles sink into the noise very slowly, with hardly any detectable clear cutoff. Astronomical targets also have an extremely wide range of apparent luminosities and surface brightness profiles, from very bright nearby stars and galaxies to faint objects far below the detection limits.

The most commonly used detection methods in astronomy, regardless of their implementation, can be classified as signal-based detection (see Appendix B for a review). The signal's a priori known properties are the basis of this method. Thus the inherent shape (for example, being an ellipse or a functional radial profile) of the galaxies and the atmospheric and instrumental effects, or PSF, have to be known prior to running the detection algorithm. This approach to detection will therefore be most accurate for objects that satisfy the a priori known shapes, and it will become less reliable as the inherent shapes of the targets deviate from the expected shape. Failing to correctly find the PSF will also hamper the accuracy of the results.

Systematic biases caused by cosmological surface brightness dimming, along with K corrections and morphological evolution, can hamper the accuracy of any study attempting to compare galaxy populations at different redshifts. Therefore it is very important that the detection and measurement tools used provide less biased results. An example of such comparisons is the observed morphological evolution of blue galaxies, which appear more clumpy and irregular at higher redshifts than their lower-redshift counterparts (Cowie et al. 1995; Abraham et al. 1996; Lotz et al. 2006; Elmegreen et al. 2007; Murata et al. 2014). This observed morphological evolution biases the photometric measurements of any detection method that is based on a fixed elliptical shape and surface brightness profile for all redshifts, for example, the Petrosian (1976) and Kron (1980) methods. It is currently thought that the peak of cosmic star formation density was at z ∼ 2.

The ΛCDM cosmological model is another field that heavily depends on detection ability. It predicts much fainter, lower surface brightness dwarf galaxies than currently observed, which may be lurking below the currently used detection thresholds (Kauffmann et al. 1993; Klypin et al. 1999). From the theoretical point of view, changing the astrophysical constraints of the semi-analytic models makes the simulations and observations more consistent (see Macciò et al. 2010, and references therein). However, an observational confirmation is still lacking. Since such low-mass galaxies are very faint and have irregular/amorphous morphologies, existing detection methods that rely on an ellipse and a priori known radial profiles are insufficient, and new detection methods without this limitation might provide the solution from the observational point of view.

In the study of individual galaxies, the faint outer wings are a treasure trove for expanding our knowledge of galaxy evolution. Being much less affected by internal processes, the fainter outskirts of galaxies preserve valuable information about the dynamic history of the galaxy. For example, tidal tails present in these regions can be a very good indicator of possible previous merger events (for example, see Ellison et al. 2013). Accurate detection of galaxy light profiles at large radii is also very dependent on the detection technique and Sky subtraction (see Section 2.2). It has been observed that the radial profiles of star-forming galaxies deviate from a purely exponential profile at large radii (for example, see de Grijs et al. 2001; Pohlen and Trujillo 2006).
Successfully detecting this behavior in the outskirts can help explain the dependency of star formation on gas surface density (Kennicutt 1989; Kregel and van der Kruit 2004).

Intra-cluster light is another field in galaxy evolution that relies heavily on the ability to accurately detect and subtract the faintest parts of cluster galaxies. These regions harbor stars that have been stripped from the galaxies but remain gravitationally bound to the cluster (see Presotto et al. 2014, and references therein). Using creative and simple hardware (Abraham and van Dokkum 2014), van Dokkum et al. (2015) recently found 47 very low surface brightness but large galaxies in the Coma cluster. Some of them could be visually confirmed in archived images, but they remained undetectable by the existing detection methods because of their very diffuse structure.

Increasing the detection ability with larger and more accurate hardware is one of the primary reasons that new telescopes and cameras are being commissioned. Over the past several decades the instruments used for astronomical observations have undergone significant advancements, from analog photographic plates to digital
CCDs. However, the most commonly used detection techniques still rely on the Petrosian (1976) or Kron (1980) approximations, or on de Vaucouleurs (1959) or Sérsic (1963) profile fitting. These functions are defined on a continuous space, so they can be considered analog techniques. The detection technique introduced in this paper exploits the digital nature of modern data sets. Most of the operation is done on the noise (undetected regions), which is also used as the basis for identifying false detection and segmentation results. Therefore, statistically significant prior knowledge of signal-related properties becomes irrelevant. The technique also makes no use of any regression analysis or fitting, which would bias the results when the targets deviate from the required model functional forms.

In Section 2 the properties of the images used and some definitions are provided. The new noise-based solution is fully elaborated, with images showing every step, in Section 3. In Section 4 the application of the concepts to a large image, for increased accuracy and efficiency, is explained. In Section 5 the detection success and the purity of the proposed algorithm on mock images with mock noise are analyzed. Results on mock profiles and noise are only used for this "proof of concept" paper; a full completeness, purity, number count, etc., analysis on real images with no mock profiles will be provided in a companion paper (M. Akhlaghi et al. 2015, in preparation). Finally, in Section 6 the usefulness of this approach for existing and future astronomical research is discussed. In Appendices A and B the existing approaches to calculating the Sky and to detection and segmentation/deblending are critically reviewed. Appendix C explains a new algorithm for finding the mode of a distribution. This algorithm lies at the heart of our novel sub-Sky thresholding, which is independent of the Sky value. Appendices D and E show the parameter lists used for running SExtractor and NoiseChisel.

NoiseChisel is our software implementation of the technique proposed in this paper. Throughout this paper, NoiseChisel is used to refer to the proposed new approach for detection and segmentation. NoiseChisel is distributed as part of the GNU Astronomy Utilities, a new collection of tools for astronomical data analysis and manipulation. It is introduced here as the first astronomical software package that fully conforms to the GNU coding standards and has been evaluated and endorsed by GNU. The copyright was also assigned to the Free Software Foundation to guarantee its "freedom" and to facilitate its use and future development by the worldwide astronomical community, similar to all their major "free" software projects. It integrates nicely with the GNU/Linux operating system and can also be natively compiled and installed on most other major operating systems. The existing tools in the GNU Astronomy Utilities were used for all the data analysis and display steps in this paper (making mock profiles, convolving, adding simulated noise, producing a catalog, cropping parts of large survey tiles, FITS image conversion to document-friendly image formats, creating masks, and calculating image statistics). The GNU Astronomy Utilities is "free" software, as defined in the GNU General Public License (GPL), version 3.

2. DEFINITIONS

Prior to a complete explanation of the proposed new algorithm, some basic issues need to be addressed. This new technique is non-parametric (with no functional profiles or regression-based fittings); therefore, demonstrating the concepts and steps
[Footnote: The GNU coding standards define specific, generally accepted requirements for software configuration, compilation, and installation, as well as coding style, command-line user interface, and a manual in various web-based, print, and command-line formats.]

Figure 1: Mock galaxies and their pixel distributions with and without noise. Each plot has two inset images. The left insets show the image of mock galaxies after convolution with a PSF, prior to adding noise; logarithmic scaling is used in (a.1) and (b.1) to show the full extent of the light profiles, while the other insets all use a linear scale. The right insets show the image of the mock galaxies with noise added. The histogram starting from or after zero (yellow) shows the pixel count distribution of the non-noised mock image (left inset); it is vertically truncated for a clear comparison with the noised histogram. The wider histogram (red) shows the histogram of the noised image (right inset). The five vertical lines on each histogram show the median of the noised image after (4,5) σ-clipping (see Section A.3). Insets have been truncated to the range of the plot's horizontal axis: white is −250 counts and black is 1000 counts. (a) 95 random Sérsic profiles with total magnitudes between −11 and −12. The zero-point magnitude for all images in this paper is set to 0.00. (b) A −16 magnitude Sérsic profile with r_e = 30 pixels and q = 0.25. (c) 6 large and bright Sérsic profiles with −15 to −18 magnitude. The PSF is a circular 2D Moffat profile, applied prior to adding noise. All the galaxies and the PSF are truncated at 5 times the effective radius and FWHM, respectively. B is the background count; see Section 2.2.

on actual images plays a vital role in explaining the technique (Anscombe 1973). In Section 2.1 a short explanation of the postage stamps used in this paper (including the appendices) is given. Fundamentally, this new technique grew out of the definition of the Sky value, which is a very important concept in any measurement significantly affected by noise and is elaborated in Section 2.2.

2.1. Displayed Images

All the images in the figures are inverted: the darker the pixel, the larger its value. Unless otherwise stated, the displayed images are 200 ×
200 pixels. They are all crops from the original 1000 × 1000 pixel images that were given to SExtractor and NoiseChisel. The reason for setting these sizes was SExtractor's modeling deficiencies in background interpolation and profile modeling on the corners and sides of the images (see Appendices A.6 and B.4). Using larger images ensures that for both programs there is an abundant area of background pixels regardless of the central object's shape and size. In the purity analysis of Section 5.2 the pure noise regions of the mock images are used to identify false detections. Unless otherwise stated, all the 200 × 200 pixel cutouts are from the center of the larger image.

The real images used in this paper for real demonstrations are either cutouts from the COSMOS survey (Scoville et al. 2007) Hubble Space Telescope (HST) F814W images (see Koekemoer et al. 2007; Massey et al. 2010), or one full CCD image taken by the Subaru Telescope's Suprime-Cam (Miyazaki et al. 2002).

2.2. Background, Sky, and Noise

Throughout this paper, pixel units are assumed to be in photon counts, or simply 'count' (e−). All the methods can be readily applied to count sec−1 units; for example, all the displayed HST images were in units of counts sec−1. Let us assume that all instrument defects (bias, dark, and flat) have been corrected and that the total count of a detected object, O, is desired. The sources of e− in a pixel (i, j) of the image can be written as follows: (1) contribution from the target object, O_ij; (2) contribution from other detected objects, D_ij; (3) undetected objects or the fainter undetected regions of bright objects, U_ij; (4) cosmic rays, C_ij; and (5) the background count, defined to be the count if none of the others exists on that pixel, B_ij. The total count in the pixel, T_ij, can thus be written as

T_ij = B_ij + D_ij + U_ij + C_ij + O_ij.

Figure 1 shows the pixel count distribution before and after adding noise in three sample mock images. By definition, D_ij is detected and it can be assumed to be correctly subtracted, so that D_ij can be set to zero. There are also methods to detect and remove cosmic rays (for example, the method of van Dokkum 2001), enabling us to set C_ij = 0. Note that in practice, D_ij and U_ij are correlated, because both directly depend on the detection algorithm and its input parameters. Also note that no detection or cosmic-ray removal algorithm is perfect. With these limitations in mind, the observed Sky value for this pixel (S_ij) can be defined as

S_ij = B_ij + U_ij. (1)

Therefore, as the detection process (algorithm and input parameters) becomes more accurate, or U_ij → 0, the Sky value tends toward the background value, S_ij → B_ij. Thus, while B_ij is an inherent property of the data (a pixel in an image), S_ij depends on the detection process. Over a group of pixels, for example, an image or part of an image, this equation translates to the average of the undetected pixels. With this definition of Sky, the object count in the data can be calculated with

T_ij = S_ij + O_ij  ⇒  O_ij = T_ij − S_ij. (2)

Hence, the more accurately S_ij is measured, the more accurately the count of the target object can be calculated. Similarly, any under- (over-)estimation of the Sky will directly translate into an over- (under-)estimation of the measured object's count. In the fainter outskirts of an object only a very small fraction of the photo-electrons in the pixels actually belongs to the object. In Figure 1(b.1), for example, only 19.83% of the pixels belonging to the central object have a value larger than 10100 counts, while the background is 10000 ± 100 counts. Therefore even a small overestimation of the Sky value will result in the loss of a very large portion of this mock galaxy.

Based on the definition above, the Sky value is only correctly found when all the detected objects (D_ij and C_ij) have been removed from the data. However, as shown in Section B.1, existing methods define their detection threshold using the Sky value (see Section B.1.2 in particular). Therefore they cannot use this definition in finding the Sky, and they have to apply the methods explained in Appendix A to approximate it. The foundation of the technique presented here is that the detection threshold and the initial detection process are defined independently of the Sky value; see Figure 2 and Section 3.

In the mock images in this paper each pixel value (T_ij) is composed of the object contribution, O_ij, and the background B: T_ij = B + O_ij; see Figure 1. Noise is added to each pixel by replacing its value with a random value taken from a Gaussian distribution with a mean of T_ij and σ_ij = √T_ij.
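As a concrete illustration of the noise model and of Equations (1) and (2), the following Python sketch simulates a mock image with B = 10000 e− and Gaussian noise of σ_ij = √T_ij, then measures the Sky as the average of the undetected pixels. The exponential toy profile and the "detected" mask below are hypothetical illustrative choices, not NoiseChisel's actual detection map.

```python
import numpy as np

rng = np.random.default_rng(1)

B = 10000.0                        # background count (e-), as in the mock images
ny = nx = 200
y, x = np.mgrid[0:ny, 0:nx]
r = np.hypot(y - ny / 2, x - nx / 2)
O = 50.0 * np.exp(-r / 10.0)       # toy object contribution O_ij (e-)

T = B + O                          # noiseless pixel value: T_ij = B + O_ij
noisy = rng.normal(loc=T, scale=np.sqrt(T))   # sigma_ij = sqrt(T_ij)

detected = O > 5.0                 # hypothetical stand-in for a detection map
S = noisy[~detected].mean()        # Sky: average of undetected pixels (Eq. 1)
# Object count over the detected pixels: O = T - S, summed (Eq. 2)
counts = noisy[detected].sum() - S * detected.sum()
```

Because the mask misses the profile's faint wings (U_ij > 0), S comes out slightly above B, illustrating why S only tends to B as detection improves.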
Figure 2: Flowchart of the complete detection algorithm. The steps of the top and bottom boxes can be seen in the example images of Figures 3 and 7, respectively. CI stands for convolved image, CC for connected components, S/N for signal-to-noise ratio, ∧ for logical 'and', ← for assignment, = for comparison, and ∅ for the empty set. Each CC is defined as a set of image pixels. I is the number of CCs in column 1 of Figure 7. J and K represent the number of CCs in the bottom and top of column 4 in Figure 7, respectively. A_i, B_i, and C_i are sets of pixels belonging to the i-th connected component in their respective images. A, B, and C are families of sets (connected components). D_i and S_i are the total signal-to-noise ratios of B_i and C_i, respectively.
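The false-detection removal step of the flowchart (take the quantile s_q of the S/N values of connected components found over the undetected sky regions, then discard initial detections whose S/N falls below it) can be sketched as follows. The S/N arrays here are illustrative stand-ins; in practice each value comes from summing the pixels of a labeled connected component.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in S/N values of connected components found over pure-noise (sky)
# regions, and of the initial detections (hypothetical numbers).
sky_cc_sn = np.abs(rng.normal(0.0, 1.0, 500))
detection_sn = np.array([0.8, 1.5, 4.0, 9.0, 25.0])

# s_q: quantile of the sky-region S/N distribution; 0.99 here means accepting
# roughly a 1% false-detection rate from ambient noise.
s_q = np.quantile(sky_cc_sn, 0.99)

# Keep only initial detections stronger than what pure noise produces.
true_detections = detection_sn[detection_sn >= s_q]
```

The key point is that the rejection threshold is measured from the ambient noise of the same image, not from any a priori signal model.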
Figure 3: Demonstration of the major detection steps in NoiseChisel. The rows are: (a) the same mock profile as in Figure 1(b); (b) and (c) 25 mock profiles with the same parameters; (d) a real star-forming galaxy imaged with HST/ACS in F814W. The columns represent (1) the mock profiles before the addition of noise (blank for the real image, since the signal without noise is not known in a real image; all mock images in this column are plotted in logarithmic scale), (2) the noisy image, and (3) the image convolved with a 2D Gaussian kernel.

In this paper, B = 10000 e−. The magnitude of each object is defined as −2.5 log10(F), where F (the sum of pixel values) is in e−. In other words, the zero-point magnitude is 0.00, since this paper is not limited to any particular instrument. The background magnitude is therefore −10.00. The total magnitude is calculated from the sum of all the object pixels in the image (not integrated to infinity). All profiles are truncated at 5r_e.

3. NOISECHISEL

The general flowchart of the proposed technique can be seen in Figure 2. In the proposed algorithm, a Sky-independent approach is used to arrive at an initial detection. Individual initial detections are subsequently classified as true or false based on their signal-to-noise ratio (S/N), using the surrounding undetected regions as a basis. As seen in Figure 2, this second classification step requires an initial measurement of the Sky value, as the average of the initially undetected pixels, and is fully elaborated in Section 3.1. The substructure of each detection is then found and classified through the definition of "clumps" and "objects", which is explained in detail in Section 3.2.

The input parameters of NoiseChisel are specified in the text as --parametername=value. [Footnote: Not to be confused with the non-parametric basis of this algorithm. By non-parametric it is implied that there are no functions and fittings, not that there are no parameters.] [Footnote: This is the long format of how the options are called on the NoiseChisel command line.] The parameter values reported here are chosen such that, once applied to the mock and real image tests shown in this paper, accurate results are achieved. The full parameter list can be seen in Appendix E. The criteria are to have fewer false detections in pure noise while detecting the faintest pixels of the mock profiles (see Section 5). We do not claim that these are "the best" set of parameters for any generic data set.

3.1. Detection and Sky

The basic idea behind this detection algorithm is that if there is a true signal buried in the noise over a certain contiguous area, it will systematically (contiguously) augment that region (keeping the noise fluctuations). The steps explained here are specifically designed to best separate these connected, augmented regions deep in the noise. The initial detection steps of NoiseChisel are schematically shown in the top group of steps of the flowchart in Figure 2, and their application to sample images is demonstrated in Figure 3. First, a kernel with a very small FWHM is convolved with the image (Section 3.1.1, column 3 in Figure 3). A very low threshold is then applied to the smoothed image (Section 3.1.2, column 4). The regions below the threshold are expanded by eroding those that are above it (Section 3.1.3, column 6). Since the objects are not completely separated yet, "opening" is applied (Section 3.1.4, column 7). False detections are estimated and removed (Section 3.1.5), and the fainter regions of true detections, which were carved off during erosion, are returned in a process of dilation (Section 3.1.6).

Three of the examples in Figure 3 are mock images and one is a real galaxy. Figure 3(a) is the same mock profile as in Figure 1(b). Figures 3(b) and (c) contain 25 mock 2D Sérsic profiles placed with equal spacing on a grid across the image.
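The magnitude convention of Section 2.2 (zero-point magnitude 0.00) is a one-liner and can be checked numerically; for example, the B = 10000 e− background corresponds to −10.00 magnitudes per pixel:

```python
import math

def magnitude(total_count_e, zeropoint=0.0):
    """Magnitude from summed pixel values F in e-: m = -2.5*log10(F) + zeropoint."""
    return -2.5 * math.log10(total_count_e) + zeropoint

m_bg = magnitude(10000.0)   # background magnitude: -2.5*log10(10^4) = -10.00
```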
Figure 4: Effect of convolution with wider kernels on Figure 3(c.2). The green histogram shows the count distribution of the convolved image. The blue histogram shows the distribution of the original image pixel count values within the green histogram's range. The convolution kernel used is a Moffat function. Panels (a) through (e) correspond to kernel FWHMs of 1, 3, 5, 10, and 20 pixels, respectively. Note how the convolved histogram becomes more skewed as a wider kernel is used for convolution. The range of the horizontal axis significantly decreases with increasing kernel width. B is the background count; see Section 2.2.

Other than their magnitude and position, all of the profile parameters are equal, with a position angle of 45°, an axis ratio of q = 0.5, and an effective radius of r_e = 10 pixels. The Sérsic index (n) of all the galaxies in each image is also the same. The bottom left profile is the faintest, and the profiles become 0.06 magnitudes brighter towards the top right. Such extremely faint profiles were intentionally chosen to push
NoiseChisel (and SExtractor, in Section B.1.3) to the limits. Figure 3(d) shows the same steps, with exactly the same parameters, applied to a real star-forming galaxy in the COSMOS field imaged with the HST Advanced Camera for Surveys (ACS) using the F814W filter.

Convolution is used to maximize the ratio of an object's peak signal to the local noise level (Bijaoui and Dantel 1970; Irwin 1985). Convolution smooths the image by removing high spatial frequencies. This means that after convolution, "nearby" pixels with large differences in value, which are commonly due to noise, will receive an intermediate value. Therefore, by smoothing the noise, the signal-to-noise ratio is increased. Without convolution, the large spatial frequencies due to noise would significantly limit the ability to detect a contiguous region in noise. Hence, in order to maximize the detection ability, convolution is necessary. The definition of a "nearby" pixel, and the functional form used to set the intermediate value, are defined by the kernel given to the convolution.

Figure 4 shows the noised image of the 25 very faint profiles of Figure 3(c.2), with kernels of several different FWHMs applied. As the kernel used for convolution widens, the image becomes more and more blurred, its histogram becomes more skewed, and the range of pixel values (dynamic range) in the image decreases, with more pixels having similar values. Comparing the blue and green histograms in Figure 4 shows this effect. The skewness induced in the histogram is another manifestation of the increase in the S/N of connected objects. Due to the skewness, separation of the brightest sections of the objects from the noise is facilitated. However, a wider convolution kernel has important consequences. (1) The shapes of the objects tend toward the shape of the kernel used to convolve the image. (2) The dynamic range significantly decreases, so separating the fainter parts of the brighter objects from the noise becomes much harder. (3) Compact and faint objects, that is, those that are not wide relative to the convolution kernel, will be lost.

The effects above are particularly harmful in ground-based images, where the PSF FWHM is generally very large. The decrease in the dynamic range is not a considerable issue in the signal-based technique, because only the brightest pixels of the objects are sought and the fainter parts are modeled (see Section B.4). As seen in the sequence of steps in Figure 3, the proposed detection technique is exactly the opposite: it aims to impose negligible constraints and models on the detected signal. The dynamic range is hence extremely important to successfully find the faint but valuable object pixels. Therefore a sharp convolution kernel is used.

Unlike contiguous (multi-pixel) signal, noise is ideally independent from pixel to pixel and thus is directly connected with the sampling, or pixel size. Inverting the Nyquist sampling theorem, the sharpest useful convolution kernel can be considered one with an FWHM of 2 pixels. Here a Gaussian kernel with an FWHM of 2 pixels is used, so that the kernel is 11 × 11 pixels. The fact that the same convolution kernel gives very accurate results on any type of image is one example of the objectivity of NoiseChisel's output.

Throughout NoiseChisel, the convolved image is only used for relative pixel values, for example, in thresholding (Section 3.1.2) or over-segmentation (Section 3.2.1). The input image is used for absolute pixel values, for example, the Sky value, the magnitude of an object, or its S/N. This distinction is not commonly practiced in most existing techniques. For example, in SExtractor, the threshold value is found from the input image and applied to the convolved image; see Section B.1.2.
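To make the kernel discussion concrete, here is a numpy-only sketch of a sharp (2 pixel FWHM, 11 × 11) Gaussian kernel and a direct "same"-size convolution; on pure noise the smoothed standard deviation drops substantially, which is the S/N gain described above. NoiseChisel itself builds its kernel with MakeProfiles; this is only an illustration, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_kernel(fwhm=2.0, size=11):
    """Normalized square 2D Gaussian kernel; sigma = FWHM / (2*sqrt(2*ln 2))."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    r = np.arange(size) - size // 2
    g = np.exp(-0.5 * (r / sigma) ** 2)
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def convolve(img, kernel):
    """Direct convolution via shifted sums (kernel assumed square and symmetric)."""
    k = kernel.shape[0]
    pad = np.pad(img, k // 2, mode="edge")
    out = np.zeros(img.shape)
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 100.0, (64, 64))       # pure background noise
smoothed = convolve(noise, gaussian_kernel())  # noise std shrinks after smoothing
```

Since this kernel is symmetric, correlation and convolution coincide, which keeps the sketch short.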
[Footnote: In processed images we have correlated noise, which is addressed in Section 3.1.5.] [Footnote: The convolution kernel is created with MakeProfiles, which is part of the GNU Astronomy Utilities. The non-zero pixels in the square kernel have a circular shape.]

Pixels with values far below the inherent background or the measured Sky value (see Section 2.2) can harbor valuable information (see Figure 1). After accurate Sky subtraction, a real object will always have a positive total count. However, the standard deviation of the noise is always much larger than the smallest detectable positive count of the real objects. For example, assume that the Gaussian noise standard deviation is 30 e−. After accurate Sky subtraction, ∼37% of the pixels that harbor 10 e− [Footnote: The cumulative distribution function of a normal distribution with a mean of 10 e− and a standard deviation of 30 e−, calculated at 0 e−.] of real counts will have a negative recorded pixel value. Once a large number of such pixels are found and the Sky value is accurately subtracted, the random effects of noise will be decreased and useful information can be extracted from such pixels.

Figure 5: Thresholds used in the demonstration examples of Figure 3, with the same labels. The blue histograms belong to the actual image (column 2 in Figure 3); only the range that overlaps with the green histogram is plotted here. The green histogram is the histogram of the convolved image (column 3 of Figure 3). The thick curve shows the cumulative frequency of the convolved image. The thin solid vertical lines show the 0.3 quantile threshold. The noise standard deviation of the input image (blue histogram) is 100 e−. B is the background count; see Section 2.2.

Therefore it is wrong to assume that the Sky value is the lowest possible threshold. To accurately detect the faintest pixels of an object, the threshold has to be less than the Sky value. This is also very important for an accurate measurement of the Sky value as the average of undetected pixels.

In NoiseChisel the threshold is defined using the cumulative distribution of the pixels in the convolved image. To ensure that the threshold is below the Sky value, the quantile has to be less than 0.
5, or the median. For the examples in this paper, the quan-tile threshold is set to --qthresh=0.3 , which means that 70% ofthe pixels in the convolved image will have a value above it. Theresults of applying a threshold to the convolved image can beseen in Figure 3, column 4, where black pixels are the ones thathad a count above this threshold. The thresholds can be seenrelative to the actual and convolved image histograms in Fig-ure 5. Notice that the threshold is safely below the backgroundvalue which is 0 e − for the mock images. In Section 4.2.1 thecomplete procedure for finding the threshold on a large image iselaborated. Applying a threshold to the image (Section 3.1.2) results in abinary image with pixels either having a count above the thresh-old or below it. The former are known as foreground pixelswhich are black in Figure 3, column 4. The latter, called back-ground pixels, are displayed in white in the same figure. Be-cause of this extremely low threshold, it is clear from column 4of Figure 3 that the regions harboring true signal in them covera 2D contiguous (non-porous) region of connected pixels, whileregions where no data is present form a porous structure in theforeground. The figure shows that when signal is present, evenif it is very close to the background value, it will augment theregion that it covers. The augmentation can be seen throughsmaller and fewer white holes in the black regions that harbordata. The first step in
NoiseChisel after applying the threshold is thus to exploit this porous structure and expand the holes, through erosion of the foreground, to separate the augmented regions. Erosion is an operation in morphological image processing (Starck and Murtagh 2006; Gonzalez and Woods 2008). By eroding the foreground, it is implied that all foreground pixels that neighbor a background pixel are carved off the foreground and changed to background pixels (hence the name NoiseChisel: a tool for carving off noise, similar to what wood chisels or stone chisels do on their respective materials). The neighborhood of, or pixels connected to, a pixel is defined based on a structuring element and generally has two types: 4 and 8 connectivity (see Figures 6(a) and (b)). Eroding based on 8 connectivity will remove more pixels. Therefore, in this step, NoiseChisel uses a 4 connected structuring element to erode the foreground. The statistical properties of this and other operations in mathematical morphology are reviewed in Dougherty (1992).

Eroding the foreground expands the holes (white regions in Figure 3, column 4) and connects them to each other. The best number of erosions (--erode) and the type of connectivity (--erodengb) applied to an image is intertwined with --qthresh. The results of respectively setting them to 2 and 4 can be seen in Figure 3, column 5. If a set of foreground pixels had the shape of the non-white pixels of Figure 6(c), after applying this erosion the pixels with the two brighter shades of gray would be carved off and only the central 9 pixels would remain.

Dilation is the inverse of erosion in morphological image processing. When dilating the foreground, if a background pixel touches a foreground pixel, it will be changed to the foreground. Applying these two steps in order, namely first eroding and then dilating the foreground, is known as opening. Opening is very useful for separating regions of the foreground that are only connected through a thin thread of pixels. It also has the benefit of removing extremely small objects (see Vollmer et al. 2013, for such an application in astronomy). If the image is first eroded n times and then dilated the same n times, then n is referred to as the "depth" of the opening. Note that the erosion discussed here is part of opening and separate from the one in Section 3.1.3. The depth can be set in NoiseChisel through the parameter --opening, and the type of connectivity can be set through --openingngb. In the examples shown in Figure 3, they are set to 1 and 8, respectively. The opened image can be seen in column 6 of Figure 3. Compared with column 5, it is clear that, as expected, opening has separated the foreground into separately connected regions, or islands without bridges.

Multiple erosions and dilations with the same structuring element will result in the outer shapes of the foreground becoming like a diamond in 4 connectivity or a rectangle in 8 connectivity. Therefore, since the erosion of Section 3.1.3 was based on a 4 connected structuring element, in this step an 8 connected structuring element was used, to compensate for the shape loss in the previous step as well as to more successfully remove thin connections. The composite effect of the two steps of erosion in Section 3.1.3 and the opening done here can be seen in Figure 6(c). Before the dilating section of the opening step, only the central black pixel will remain. If a region above the threshold is any smaller than the colored pixels of Figure 6(c), it will disappear after this step, since no pixels remain for dilation.

Figure 6: Structuring elements. The black pixel is the pixel under consideration and the gray pixels are the neighbors of that pixel. (a) 4 connected neighbors. (b) 8 connected neighbors. (c) Non-white pixels show the smallest area an object should have to be detected with the given two 4 connected erosions (see Section 3.1.3) and one 8 connected opening (see Section 3.1.4). It is assumed that all the pixels outside the box shown are background (white). The three shades of gray show the pixels that are removed in each step (brightest first). After the three steps finish, the inner 9 pixels remain. In processing, white pixels have a value of 0 and others have a value of 1.

The erosion and dilation defined here are based on a binary image, as discussed in Section 3.1.3. It is also possible to define such operations on grayscale images, where a pixel can have a range of values, not just 0 or 1. Perret et al. (2009) used a combination of such grayscale erosion and dilation (the hit-or-miss transform) for a demonstration of detecting low surface brightness galaxies. Since the structuring element is no longer binary, the variation of pixel values over the structuring element becomes very important and has to be modeled based on specific elliptical parameters, surface brightness profiles, and the PSF; see Figures 4 and 5 of Perret et al. (2009). Therefore, it is optimal only for profiles that satisfy the count distribution of the structuring element. Applying it generically to any image with any distribution of galaxy morphologies will require a high level of customization that may fail for complex galaxy morphologies, like those of the real galaxies shown in this paper.
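The threshold, erosion, and opening steps above can be sketched with SciPy's binary morphology routines. This is only an illustrative stand-in for NoiseChisel's implementation: the mock image, its size, and the survives helper are assumptions made here for demonstration.

```python
import numpy as np
from scipy import ndimage

conn4 = ndimage.generate_binary_structure(2, 1)   # Figure 6(a): cross
conn8 = ndimage.generate_binary_structure(2, 2)   # Figure 6(b): 3x3 square

rng = np.random.default_rng(0)
# Mock convolved, Sky-subtracted image: pure Gaussian noise.
convolved = rng.normal(0.0, 30.0, size=(100, 100))

# --qthresh=0.3: a sub-median threshold; ~70% of pixels become foreground.
qthresh = np.quantile(convolved, 0.3)
fg = convolved > qthresh

# Two 4 connected erosions (--erode=2, --erodengb=4), then one
# 8 connected opening (--opening=1, --openingngb=8).
eroded = ndimage.binary_erosion(fg, conn4, iterations=2)
opened = ndimage.binary_opening(eroded, conn8)

assert qthresh < np.median(convolved)            # threshold below the Sky
assert opened.sum() <= eroded.sum() < fg.sum()   # each step only removes pixels

# Smallest surviving region: a solid 7x7 square keeps its inner 9 pixels
# (compare Figure 6(c)); anything smaller disappears entirely.
def survives(side):
    a = np.zeros((side + 6, side + 6), bool)
    a[3:3 + side, 3:3 + side] = True
    return ndimage.binary_opening(
        ndimage.binary_erosion(a, conn4, iterations=2), conn8).sum()

assert survives(7) == 9 and survives(6) == 0
```

The two final assertions reproduce the Figure 6(c) statement numerically: after two 4 connected erosions and one 8 connected opening, 9 pixels remain of a just-large-enough region, while a smaller one vanishes because no pixels are left for the dilation half of the opening.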
The foreground is now composed of separately connected objects that we define as initial detections. However, on the mock images (particularly comparing Figure 3(a.1) with (a.6)), it is clear that some detections can be associated with true signal while most cannot. These "false" detections are sets of pixels with random values (above the very low threshold on the convolved image) that were randomly positioned close enough to each other to pass the erosion and opening steps of detection explained above. As the threshold decreases, the probability of such random positioning increases. If a larger threshold, number of erosions, or number of openings is used, false detections will decrease or completely disappear.

The problem with more stringent detection parameters is that, besides rejecting more inherently faint objects, they will also remove the valuable fainter parts of the final detections. Therefore, a classification scheme is defined to find and remove false detections very accurately. This allows keeping the valuable faint pixels of bright detections and most of the detections that would otherwise have been lost.

The bottom group of steps in the flowchart of Figure 2 shows all the steps used to define and remove false detections. The same steps can also be seen in practice in Figure 7 and are elaborated here. The detections have divided the pixels into two sets: (1) the undetected sky, R_s (white regions of column 1 in Figure 7), and (2) the detected regions, R_d (black regions in column 1 of Figure 7). It can reasonably be assumed that no significant contribution from a detectable object exists in R_s. Some of the detections in R_d harbor signal and some are localized random noise pixels, namely false detections.

In order to identify the false detections in R_d, a secondary detection process is applied to the pixels of both R_s and R_d independently. The detections in the former will provide a scale on which the certainty of the detections in the latter can be defined. Let S_a and s_s be the average and standard deviation of the count in R_s, respectively. The second threshold is defined by these two parameters. Note that this S_a is not the final Sky value, because some of the high-valued, localized noise pixels (false detections) have systematically been removed from it; it is just used here as an approximation.

The user is free to set this second threshold based on S_a + dthresh × s_s. In the examples here, --dthresh=-0.1. The result of applying this threshold to the detected and blank regions can be seen in Figure 7, column 2. Note that, as discussed in Section 3.1.1, absolute values such as S_a and s_s come from the input image. Therefore, while the pixel indices comprising R_s and R_d are defined from column 6 of Figure 3, which used the convolved image, the pixel values for finding and applying this second absolute threshold come from the actual image. The pixels that were above this threshold are marked in black in column 2 of Figure 7.

As discussed in Section 3.1, a significant advantage of data over noise detections is that the data augment a contiguous region. In order to exploit this contiguous property of true detections, all the holes that are engulfed in 4 connected foreground regions are filled. Note that holes are white pixels in Figure 7, column 2. The result of filling such holes can be seen in column 3 of Figure 7. Such holes are most probably due to noise, since the 4 connectivity of all the surrounding pixels is a very strong constraint (compare Figures 7(b.2.2) and (b.3.2)). Combined with the next step (opening), filling holes in this manner significantly enhances the ability to identify a faint contiguous signal.

To remove the thin connections between thicker regions and obtain individual connected components, one level of 4 connected opening is applied. The result can be seen in column 4 of Figure 7.
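The secondary threshold, hole filling, and opening that produce pseudo-detections can be sketched as follows. A toy Gaussian noise field stands in for a region, and SciPy's binary_fill_holes is used as an approximation of the 4 connected hole filling described above; all names and values here are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
region = rng.normal(0.0, 30.0, size=(120, 120))   # stand-in for R_s or R_d pixels

# Approximate Sky (S_a) and its scatter (s_s) from the undetected pixels,
# and the second, near-Sky threshold S_a + dthresh * s_s (--dthresh=-0.1).
S_a, s_s = region.mean(), region.std()
second = S_a + (-0.1) * s_s

binary = region > second                           # Figure 7, column 2
filled = ndimage.binary_fill_holes(binary)         # column 3: fill enclosed holes
conn4 = ndimage.generate_binary_structure(2, 1)
opened = ndimage.binary_opening(filled, conn4)     # column 4: 4 connected opening

# Label the remaining pseudo-detections; those with an area below
# --detsnminarea would be discarded before the S/N comparison.
labels, n_pseudo = ndimage.label(opened, conn4)

assert second < S_a        # the threshold sits just below the Sky estimate
assert filled.sum() >= binary.sum()
assert n_pseudo > 0
```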
Figure 7: Identifying and removing false detections, demonstrated for three of the examples from Figures 3(a), (b), and (d). Column 1 is the same as column 6 in Figure 3. Columns 2 to 4 apply the same steps (explained below) on the undetected regions (top) and detections (bottom) separately. Column 2: applying the threshold. Column 3: filling holes surrounded by one 4 connected region above the threshold. Column 4: application of opening to the previous column, with small regions removed. Column 5, top: distribution of the total signal-to-noise ratio of all the dark connected components in column 4 (top). The dashed vertical line in the histogram shows the --detquant=0.99 quantile of this distribution. This value is used as a threshold to define false detections in column 4 (bottom). (a.5.1) is from a nearby large mesh (see Section 4) because the number of connected components in (a.4.1) is less than --minnumfalse=100. Column 5, bottom: all detections with total signal-to-noise ratio above that threshold. The arrows in (a) visualize the steps which are schematically shown in the lower box of Figure 2.

The two images of column 4 in Figure 7 are now composed of separate connected components (pseudo-detections) that were created in exactly the same manner, but on different regions of the input image. The parameter used to quantify the definition of a false detection is the total S/N (S/N_T) of each pseudo-detection, which is a combination of its total count and area.

Let F be the average count in each pseudo-detection and N be its area. The areas vary widely, ranging from one pixel to thousands of pixels. Therefore, in order to get a reasonable average count comparison, all pseudo-detections with an area smaller than --detsnminarea are removed from the analysis in both R_s and R_d. In the examples in this section, it is set to …. Note that since a threshold approximately equal to the Sky value is used, this is a very weak constraint. For each pseudo-detection, S/N_T can be written as

$$\mathrm{S/N_T} = \frac{NF - NS_a}{\sqrt{NF + Ns_s}} = \frac{\sqrt{N}\,(F - S_a)}{\sqrt{F + s_s}}. \qquad (3)$$

See Section 3.3 for the modifications required when the input image is not in units of counts or has already been Sky subtracted. The distribution of S/N_T from the objects in R_s for the three examples in Figure 7 can be seen in column 5 (top) of that figure. Image processing effects on real data, mainly due to shifting, rotating, and re-sampling the images for co-adding, further increase the size and count, and hence the S/N, of false detections in real, reduced/co-added images. A comparison of scales on the S/N histograms between the mock ((a.5.1) and (b.5.1)) and real (c.5.1) examples in Figure 7 shows the effect quantitatively. In the histograms of Figure 7, the bin with the largest number of false pseudo-detections respectively has an S/N of 1.89, 2.….

The S/N_T distribution of detections in R_s provides a very robust and objective scale that can be used to select or reject a given detection in R_d. The level of certainty for true detections can be input by the user and is defined based on --detquant, which is the desired quantile in the S/N_T distribution of R_s. In the flowchart shown in Figure 2, this is displayed with D_j > s_q. The accepted detections from the second thresholding can be seen in the bottom row of column 5 in Figure 7. Any initial detection that overlaps with at least one of the accepted pseudo-detections (B_j ∩ A_i ≠ ∅ in Figure 2) is defined as a true detection, while those that do not are considered false detections and removed. The successful detections can be seen in the last two columns of Figure 3 (after a final dilation; see Section 3.1.6).

Outside of astronomical data analysis, the technique proposed here to separate true from false detections (and later clumps; see Section 3.2.1) can be considered a form of anomaly detection in data mining. An anomaly or outlier is defined as "an observation which deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism" (Hawkins 1980, page 1). This is a very general and qualitative definition applicable to any form of anomaly (see Chandola et al. 2009, for a review).

The S/N of the connected components in R_s (after applying the threshold, filling holes, and opening) gives a scale to define the outliers in R_d (true pseudo-detections). However, a quantile and not the histogram is used (the histogram is only displayed in Figure 7 for demonstration). Therefore, based on the classification of Chandola et al.
2009, the true/false classification technique proposed here can be considered a non-parametric, "quantile based" statistical anomaly detection technique.

This technique for determining an "anomaly" (or a real object within noise) has some similarities to the Higher Criticism statistic (Donoho and Jin 2004). It was used in detecting non-Gaussianities in the cosmic microwave background (Cayón et al. 2005) with the modification of using p-values. However, the Higher Criticism statistic is derived from and applied to the same set of observations, while here the S/N threshold is found using a completely separate set of pixels (those in R_s). It is then applied to another set of pixels (R_d). Also, while a Gaussian/Poisson distribution is assumed for the noise, there is no such assumption for the distribution of the S/N.

The final step is to restore, through dilation, the pixel layers of each remaining initial detection that were removed by erosion in Section 3.1.3. Dilating --erode times would presumably restore the object pixels out to the initial --qthresh quantile. The user can set the number of final dilations with the --dilate option. Images of astronomical objects never have a strong cutoff. Even if they have a physical cutoff, the PSF will soften any sharp boundary, except for cosmic rays, which are independent of the PSF. Therefore it can be assumed that the object extends beyond the initial threshold. To exploit this fact, this final dilation is based only on 8 connectivity, and in the examples of this paper it is set to --dilate=3.

3.2. Segmentation

The detection thresholds used in NoiseChisel are extremely low (see Section 3.1.2). Combined with the fact that galaxies have no clear cutoff, it often happens that several apparently nearby objects on the image will be detected as one region. There are two approaches to separating a detected region into potentially several sub-detections, or to finding substructure: segmentation and deblending. Segmentation is the procedure that assigns each pixel to one sub-component. Deblending, on the other hand, identifies the contribution in each pixel from different blended sources. Therefore, in deblending, each pixel can belong to more than one object. Deblending is more realistic: the sum of the light profiles of multiple overlapping objects specifies the final pixel value. However, to be done accurately, deblending requires parametric analysis and fitting algorithms. Segmentation, on the other hand, is a very low-level measurement and only goes so far as to assign each pixel to the sub-component that contributes most to it. Therefore segmentation provides the best starting point for higher-level deblending, if it is needed. NoiseChisel does not do deblending, but only performs segmentation.

In order to segment each detection into sub-components, clumps (Section 3.2.1) and objects (Section 3.2.2) are defined. A detected region might contain one or more objects and anything from none to multiple "true" clumps. Finding the clumps in an object or detected region is a low-level measurement that provides us with a multitude of false clumps mixed with possibly true clumps. True clumps can be found robustly by studying the noise properties in the vicinity of the detected regions. This process is very similar to how true/false detections were separated in Section 3.1.5. In Section 3.2.1 the proposed definition of a clump and the true/false classification are explained. The segmentation of a detected region into objects is a higher-level measurement thoroughly discussed in Section 3.2.2.
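That quantile-based true/false classification of Section 3.1.5 (which will be reused for clumps) can be sketched numerically with the total S/N of Equation (3). The pseudo-detection statistics below are fabricated purely for illustration.

```python
import numpy as np

def total_sn(F, N, S_a, s_s):
    # Equation (3): F = mean count, N = area of a pseudo-detection;
    # S_a, s_s = mean and standard deviation of the undetected (R_s) pixels.
    return np.sqrt(N) * (F - S_a) / np.sqrt(F + s_s)

rng = np.random.default_rng(2)
S_a, s_s = 100.0, 30.0

# Hypothetical pseudo-detections over the undetected regions (R_s):
# mean counts slightly above the Sky, modest areas.
F_s = rng.normal(S_a + 3.0, 2.0, size=1000)
N_s = rng.integers(15, 60, size=1000)

# --detquant=0.99: the 0.99 quantile of the R_s distribution is the S/N
# scale; detections over R_d above it are kept as true.
sn_cut = np.quantile(total_sn(F_s, N_s, S_a, s_s), 0.99)

assert sn_cut > 0
assert (total_sn(F_s, N_s, S_a, s_s) > sn_cut).mean() <= 0.011  # ~1% pass
```

By construction, only about one percent of pure-noise pseudo-detections exceed this cut, which is what makes the quantile an objective purity control.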
(We will be working on such a fitting tool for deblending as part of the GNU Astronomy Utilities, to start using the outputs of NoiseChisel.)

Pixels at a local maximum are defined as those whose value is larger than that of their 8 connected neighbors. The 8 connected region around each local maximum whose pixels all have a lower count than the peak is defined as a clump. Note that the clump can be much larger than 9 pixels.

The approach introduced here for finding the clumps associated with all the local maxima of an image is conceptually very similar to, and inspired by, the method proposed by Vincent and Soille (1991), without using layers, because of the biases they produce (see the discussion in Section B.2). Figure 8 shows the process for 124 pixels of the central region of Figure 9(b.2).

Figure 8: Pixel-by-pixel demonstration of oversegmentation (see Section 3.2.1 for a complete explanation of the algorithm). The order is from the top left to the bottom right. First: a small region in the center of Figure 9(b.2). The rest show how pixels are labeled in order of their value. The shades of gray become darker as the labels increase. The darkest shade of gray represents river pixels.

A zero-valued array with the size of the image is first created to store the labels of the different clumps. All the pixels of each object are ordered based on their value. Starting from the brightest pixel and in decreasing order, the pixels are labeled based on the following conditions.

1. If the pixel is not 8 connected to any other labeled pixels, it is a local maximum and is assigned a new label. This can be seen as new labels (darker shades of gray) are added in Figure 8.

2. If its 8 connected neighbors all have the same label, the pixel is given that label as well. In Figure 8, this occurs most often (when the colored areas expand).

3. If its neighbors have different labels, the pixel is a local minimum, like a river flowing between two mountains. The river pixels act as separators of each mountain, or clump. River pixels are the darkest pixels in Figure 8. The pixel-by-pixel construction of the rivers can be seen from the middle of the third row in Figure 8, as the growing regions start touching. In this setup the rivers are 4 connected.
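The three labeling rules can be sketched as a simplified, unoptimized Python function. Tie-breaking and image-border handling here are naive assumptions of this sketch, not NoiseChisel's implementation.

```python
import numpy as np

RIVER = -1

def oversegment(img):
    """Label clumps in decreasing pixel order (rules of Section 3.2.1):
    a new label if no labeled 8 connected neighbor exists, the existing
    label if all labeled neighbors agree, and a river pixel otherwise."""
    lab = np.zeros(img.shape, int)
    order = np.column_stack(
        np.unravel_index(np.argsort(img, axis=None)[::-1], img.shape))
    nlab = 0
    for y, x in order:
        nb = lab[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]
        labs = np.unique(nb[nb > 0])       # rivers (-1) and unlabeled (0) excluded
        if labs.size == 0:                 # rule 1: local maximum, new clump
            nlab += 1
            lab[y, x] = nlab
        elif labs.size == 1:               # rule 2: interior of one clump
            lab[y, x] = labs[0]
        else:                              # rule 3: touches several clumps
            lab[y, x] = RIVER
    return lab, nlab

# Two Gaussian peaks separated by a valley: expect 2 clumps and a river.
yy, xx = np.mgrid[0:20, 0:20]
img = (np.exp(-((yy - 10)**2 + (xx - 5)**2) / 8.0)
       + np.exp(-((yy - 10)**2 + (xx - 14)**2) / 8.0))
lab, nlab = oversegment(img)

assert nlab == 2 and (lab == RIVER).any()
assert (lab != 0).all()                    # every pixel ends up assigned
```

Checking 8 connected neighbors, as here, yields 4 connected rivers; checking 4 connected neighbors would yield 8 connected rivers, as noted above.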
If 4 connectivity was chosen for checking neighbors, the rivers would be 8 connected.

The application of this simple algorithm to the non-detected regions of the image can be seen in column 3 of Figure 9, and on the detected regions in column 5. (As Vincent and Soille (1991, page 583) put it, "In the field of image processing and more particularly in mathematical morphology, grayscale pictures are often considered as topographic reliefs. In the topographic representation of a given image I, the numerical value (that is, the gray tone) of each pixel stands for the elevation at this point. Such a representation is extremely useful since it allows one to better appreciate the effect of a given transformation on the image under study." In this analogy, the noisy image can be considered as a mountain range; when rain comes, the water gathers in the local minima of those mountain ranges to form rivers.) Similar to our discussion in Section 3.1.5, the region of no detection is used as a basis, or calibrator, to find the true clumps over the detected regions. This algorithm for finding the clumps uses their relative values. Therefore, as discussed in Section 3.1.1, the convolved image is used to find the indices of the pixels in each clump. The small convolution kernel is beneficial for this step, because spatial resolution is very important for an accurate result. This pixel-by-pixel expansion process provides the ultimate usage of the dynamic range of the convolved image.

A very large fraction of the clumps found over a detected object (column 5 in Figure 9) are due to background and correlated noise. To identify the false clumps, the segmentation results on the noise regions of the image (column 3 of Figure 9) are used as a reference. As discussed in Section 3.1, it can be assumed with high certainty that there is no significant signal (from a true physical object) in the noise regions.
Furthermore, the noise regions in the vicinity of the object contain the same undetected astronomical objects and the same instrumental and data-analysis biases as the object pixels. Therefore this region is the most accurate and objective reference available to judge between a true and a noise clump.

On the original image, which is neither convolved nor Sky subtracted, F_i and F_o are respectively defined as the average count in a clump and over all the river pixels surrounding it. The average value of the river pixels around each clump approximately shows the base elevation, or count, if the clump were not present. Let N_i be the total area, or number of pixels, inside a clump. The S/N of each clump can be written as

$$\mathrm{S/N} = \frac{N_iF_i - N_iF_o}{\sqrt{N_iF_i + N_iF_o}} = \frac{\sqrt{N_i}\,(F_i - F_o)}{\sqrt{F_i + F_o}}. \qquad (4)$$

See Section 3.3 for how this equation can be modified for images that are not in units of counts or have already been Sky subtracted. The S/N distribution of all the clumps in the blank Sky can be seen for all the examples of Figure 9 in column 4 of that figure. The --segquant=0.99 quantile of the distribution is used to find the S/N threshold to accept or reject a clump over the detected object. This threshold is thus completely independent of the particular object in which the potential clumps are embedded. Note that the criterion in SExtractor and some other existing tools for deblending an object is the fraction of counts in each clump relative to that in the parent detection (see Section B.2 for a discussion of the resulting biases). Since no layering is necessary and the clump counts are no longer compared to the object, the clumps in various galaxies can now be objectively compared with each other.

Figure 9: Finding clumps through oversegmentation. Column 1: input images. Column 2: convolved images (Section 3.1.1). Column 3: oversegmentation applied to non-detected regions of the convolved image. Column 4: signal-to-noise ratio histogram (Equation (4)) of all segments in column 3 with an area larger than --segsnminarea=25. The dashed line shows the position of the --segquant=0.99 quantile. Column 5: oversegmentation applied to the central object. Column 6: clumps in all the detected objects with a signal-to-noise ratio larger than the threshold of column 4. Note that some detections have no true clumps.

Since the average count in each clump is very important in the S/N calculation, only clumps having an area larger than --segsnminarea are considered in the comparison. Any clump that is smaller than this area is discarded as noise. For the examples in this paper it is set to 25. This is larger than --detsnminarea, because the areas there were found by applying a threshold to the input image, while the clumps here are found on the convolved (smoothed) image, where the areas become larger. Clumps smaller than this area add very strong skewness to the S/N distributions.
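As a numeric check, the two forms of Equation (4) are algebraically identical. The clump below is hypothetical, chosen only for illustration: a mean count of 120 e−, a river level of 100 e−, and an area of 25 pixels (the minimum --segsnminarea used in this paper's examples).

```python
import numpy as np

def clump_sn(F_i, F_o, N_i):
    # Equation (4), simplified form: F_i = mean clump count, F_o = mean
    # count of the surrounding river pixels, N_i = clump area in pixels.
    return np.sqrt(N_i) * (F_i - F_o) / np.sqrt(F_i + F_o)

# Long form of Equation (4) for the same hypothetical clump.
long_form = (25 * 120.0 - 25 * 100.0) / np.sqrt(25 * 120.0 + 25 * 100.0)

assert np.isclose(clump_sn(120.0, 100.0, 25), long_form)
assert clump_sn(120.0, 100.0, 25) > 0      # clump rises above its rivers
```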
The very low thresholds that were used (see Section 3.1.2)cause objects that are close enough to each other on the image tobe blended in one detection. Three real examples are displayedin the first column of Figure 10. The basis for separating po-tential objects in a given region is the true clumps which havebeen found in that detection. Therefore, if a detected region has none or only one clump, it is considered to be only one object.If there is more than one clump in a detected region, the clumpsare grown until a certain threshold of the input image. If theSky value (average of non-detected pixels) and its standard devi-ation are labeled with S and s s , then the threshold to stop clumpgrowth is S + gthresh × s s . In the examples of Figure 10 it is setto --gthresh=0.5 .The process of growing the clumps is very similar to the al-gorithm in Section 3.2.1 (Figure 8) with the exception that nonew labels are added. If a pixel has no labeled neighbors, it iskept in a queue to be checked on a next loop. The loop continuesuntil no new pixel can be labeled. The grown segments of eachclump of the examples in Figure 10 can be seen in column 3.Objects are defined based on the average S/N of the riverpixels between the grown clumps of Figure 10, column 3. Twogrown clumps are defined as separate objects if the river betweenthem is below a user-defined
S/N . If it is larger, then the connec-tion between the grown clumps is too strong to be regarded asseparate objects. Therefore the grown clumps are considered asparts of one object. Let F i j represent the average count on theriver between the grown clumps i and j . The average S/N of theriver between these two clumps is defined as:S / N i j = F i j − S p F i j + s s . (5)12 HE A STROPHYSICAL J OURNAL S UPPLEMENT S ERIES , 220:1 (33pp), 2015 September A
KHLAGHI , & I
CHIKAWA a.1 a.2 a.3 a.4 a.5b.1 b.2 b.3 b.4 b.5c.1 c.2 c.3 c.4 c.5
Figure 10:
Segmenting objects. Column 1: actual images. Column 2: true clumps in the detections (see Section 3.2.1). Each clump is color-coded based on its label.The underlying light gray region shows the detection region. Column 3: the clumps are grown until --gthresh=0.5 (see Section 3.2.2). Column 4: all objects withboundary signal-to-noise ratio larger than --objbordersn=1 are given the same label. Column 5: the remaining pixels of the region are filled by growing the labels ofcolumn 4. In the last column only the central segmented objects are shown for clarity. If S/N i j < objbordersn , then these two grown clumps are con-sidered separate objects and if not, they are defined as belong-ing to the same object. In the examples of Figure 10 this isset to --objbordersn=1 . S/N i j (= S/N ji ) is calculated for all thegrown clumps of each detection. Finally, all the clumps that areconnected to each other within a detection are given one label(see column 4 of Figure 10). Note that if two grown clumpshave S/N i j < objbordersn , and they are both connected to a thirdclump with S/N ik > objbordersn and S/N jk > objbordersn thenall three grown clumps will be given the same label. The riversbetween two grown clumps are only one pixel wide. Thereforein the calculation of F i j , the count of each pixel is taken as theaverage of that pixel and its 8 connected neighbors. Hence, inpractice the rivers between two grown clumps are 3 pixels wide.Because of the absolute nature of the two parameters forobject definition (as opposed to the relative nature in the caseof detection and clumps), namely the user directly providing --gthresh and --objbordersn , the object definition is not ro-bust and objective like that of detections and clumps. In fact --objbordersn is the only user given S/N value in
NoiseChisel. The definition of objects made here is based purely on the one image input to NoiseChisel. The example above shows that this definition of objects is fundamentally subjective (directly determined by the particular data set and the user, and not necessarily based on physical reality). Two galaxies at vastly different redshifts might be sufficiently close to each other, on the same line of sight, that they are each defined as clumps in one larger object. On the other hand, due to surface brightness limits, the connection (for example, through a spiral arm or a tidal tail) of a real star-forming region to its galaxy might not be detected, so that region will be identified as a separate object. This also applies to rings and filaments. To solve these issues, the information in one data set (an image in this case) is not sufficient; ancillary data, for example, spectroscopy or images in other wavelengths, will be necessary. Therefore the physical reality of the “objects” defined here is beyond the scope of any analysis based solely on one image.

3.3. S/N Equation Modifications

The standard deviation of the non-detected regions (σ_s) incorporates all sources of error: Poisson noise, read-out noise, etc. Hence, if the image is in units of counts and the background has not already been subtracted, Equations (3)–(5) can be used regardless of whether the noise is background-dominated or read-out-noise-dominated. However, input images do not necessarily satisfy this condition. For example, the HST/ACS images shown above are processed (already Sky-subtracted) and distributed in units of counts per second.

When the image is in units of counts s⁻¹ and the noise is background-dominated, all the terms in the equations have to be multiplied by a constant in units of time. In finding true detections and clumps, which depend on the quantile of the S/N distribution, the absolute value of this constant makes no difference. However, in identifying objects it is important, because the user specifies an absolute S/N value for rivers.
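The role of this time constant can be illustrated with a short sketch (a hypothetical helper, not NoiseChisel's implementation; it applies the minimum-mesh-standard-deviation heuristic described in Section 4 for deciding whether an image is in counts or counts s⁻¹):

```python
import numpy as np

def counts_conversion_factor(image, mesh=32):
    """Return the factor that converts `image` to units of counts.

    If the minimum standard deviation over all meshes is larger than
    unity, the image is assumed to already be in counts (factor 1).
    Otherwise the inverse of that minimum, which has units of time,
    is used as the multiplicative time constant."""
    nrows, ncols = image.shape
    stds = [image[r:r + mesh, c:c + mesh].std()
            for r in range(0, nrows - mesh + 1, mesh)
            for c in range(0, ncols - mesh + 1, mesh)]
    min_std = min(stds)
    return 1.0 if min_std > 1.0 else 1.0 / min_std
```

For a background-dominated image in counts the factor is unity; dividing the same image by an exposure time pushes the mesh standard deviations below unity, and a time-like factor is returned instead.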
The Astrophysical Journal Supplement Series, 220:1 (33pp), 2015 September. Akhlaghi & Ichikawa.

NoiseChisel will automatically detect whether an image is in units of counts or counts s⁻¹, using the minimum standard deviation over all the meshes (see Section 4). If this minimum is larger than unity, the image is assumed to be in units of counts. If not, the inverse of this minimum (which has units of time) is used as the time constant discussed above.

If the input image is processed, then Sky subtraction has already been applied. Therefore the error in the average count F in the denominators should be F + σ_s², not F alone as is currently present in the equations. It is up to the user to specify this condition through the --skysubtracted parameter (see the manual for more details).

4. LARGE (REAL) IMAGES

Real astronomical images are not as small and clean as the examples in the figures of Section 3. They are often much larger, or might contain masked pixels. The former often causes variations in the background and noise (see Figure 11). These gradients do not necessarily have to be created by the background; they might be due to processing, for example, bad fittings in flat fielding. Even if the image is very clean and the targets are already accurately defined, the statistics from such small postage stamps will not be accurate. By statistics we mean the numbers of false detections and of clumps in the non-detected region that go into calculating the S/N quantile; see Sections 3.1.5 and 3.2.1. Therefore much larger postage stamps, for example, from surveys, have to be used. As discussed in Section 2.1, the input
FITS images of all the figures so far were much larger than those shown here.

The image is considered to be a grid of meshes. All the steps explained in Section 3 can be applied independently on each mesh. Since the operation on each mesh is completely independent of the rest, parallel processing can be applied to significantly improve the processing time. This is only possible because all the methods in NoiseChisel are non-parametric.

Some procedures are directly linked to the potential gradients in the image, for example, convolution (Section 4.1), thresholds (Section 4.2.1), and Sky subtraction (Section 4.2.2), while others are only defined after the gradients have been removed and need a large area for statistical accuracy, for example, removing false detections (Section 3.1.5) and finding clumps (Section 3.2.1). To adequately do both jobs, two mesh sizes are defined: a small mesh, specified by --smesh, for the former class of operations and a large mesh, specified by --lmesh, for the latter. Each specifies the side of a square mesh. As explained in Section 2.1, in the examples of Section 3 --lmeshsize=200 (so one mesh covers the displayed postage stamps); a different value is used for Figure 11. In all the examples of this paper, --smeshsize=32.

4.1. Convolution

Image convolution is mostly done using the discrete Fourier transform in the frequency domain (see Gonzalez and Woods 2008). When the convolution kernels are large, for example, when the image
PSF is used, this technique provides significant performance benefits. However, it fails on the edges of the image. Spatial domain convolution, on the other hand, can be corrected to account for edge effects or masked pixels, and when the kernels are small it can even be faster. The kernels in NoiseChisel are small, so in addition to this extra capability, spatial domain convolution is used (see the Convolve section of the GNU Astronomy Utilities manual for a complete explanation).

4.2. Interpolation and Smoothing

Some meshes will not be able to provide an input in the grid, for example, because they are covered by an object that is larger than the mesh. Therefore interpolated values are used for such unsuccessful meshes, and finally the mesh grid is smoothed. We discuss the cases of the quantile threshold (Section 4.2.1) and the Sky value (Section 4.2.2) more accurately here.

Following the non-parametric nature of NoiseChisel, functional interpolation techniques like spline or cubic interpolation are not used: they can result in strong outliers on the edges or corners, like those in Figures 16(a)–(c) and (l). To interpolate over each blank mesh, the median value of the nearest --numnearest acceptable meshes is used. The nearest non-blank neighbors are found efficiently through a breadth-first search strategy from graph theory. Such a median-interpolated grid will not be smooth, so an average filter is applied to the interpolated grid to produce smoother variations between neighboring meshes. The final interpolated grid for the Sky value can be seen in Figure 11(c).

4.2.1. Quantile Threshold

The quantile threshold of Section 3.1.2 can only have a comparable value between all meshes if the signal in each mesh is not significant compared to the noise. Otherwise, with more signal contributing to the mesh, the pixel distribution in the mesh becomes more skewed, so the quantile value found will also shift and will not be comparable between different meshes. In order to find the threshold, the novel algorithm of Appendix C is used to find the mode of the image. The mode acts as a gauge for the amount of data that is mixed with the noise in that mesh.

With more signal contributing to the mesh, the average, median, and mode will shift (with decreasing rates) to the positive (see Figure 1 and Appendix C). Exploiting this property, the significance of signal in the noise of a mesh, regardless of its morphology, can be assessed using the difference between the median and the mode: the closer the mode is to the median, the less significant the signal in that mesh. Through the parameter --minmodeq, the user can set the minimum acceptable quantile for the mode (the value used in this paper is listed in Appendix E). Note that the skewness induced by convolution further skews the distribution, or distances the mode from the median (see Section 3.1.1 and Figure 4). Hence this value is a very strong constraint.
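The blank-mesh interpolation and smoothing described above can be sketched as follows (a minimal illustration under stated assumptions, not NoiseChisel's C implementation: blank meshes are marked with NaN, the function names are hypothetical, and the smoothing uses the 8-connected-neighbor average applied to the mesh grid):

```python
from collections import deque
import numpy as np

def interpolate_mesh_grid(grid, num_nearest=4):
    """Fill blank (NaN) meshes with the median of the nearest
    non-blank meshes, found through a breadth-first search."""
    filled = grid.copy()
    rows, cols = grid.shape
    for (r, c) in zip(*np.where(np.isnan(grid))):
        seen = {(r, c)}
        queue = deque([(r, c)])
        values = []
        while queue and len(values) < num_nearest:
            i, j = queue.popleft()
            if not np.isnan(grid[i, j]):
                values.append(grid[i, j])
            # Expand to the 8-connected neighboring meshes.
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1),
                           (-1, -1), (-1, 1), (1, -1), (1, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols and (ni, nj) not in seen:
                    seen.add((ni, nj))
                    queue.append((ni, nj))
        filled[r, c] = np.median(values)
    return filled

def smooth_mesh_grid(grid):
    """Replace each mesh value with the average of itself and its
    8 connected neighbors (edge meshes reuse their border values)."""
    padded = np.pad(grid, 1, mode="edge")
    out = np.zeros_like(grid)
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            out[r, c] = padded[r:r + 3, c:c + 3].mean()
    return out
```

The breadth-first search expands ring by ring from each blank mesh, so the first acceptable meshes it encounters are approximately the nearest ones, and the median of their values is robust to a single outlying mesh.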
4.2.2. Sky Value

As defined in Section 2.2, the Sky value is the average of the undetected pixels in each mesh, and its error (σ_sky) is their standard deviation. Figure 11(a) shows an example Subaru Telescope SuprimeCam image that has undergone initial processing after flat fielding, prior to Sky subtraction. The visible gradients in the input image are most probably due to problems in the flat fielding rather than actual variation in the Sky. (In the smoothing step of Section 4.2, the value of each mesh is replaced with the average of itself and its 8 connected neighbors; see Figure 6(b).)

Figure 11: Applying NoiseChisel to a raw Subaru SuprimeCam image prior to Sky subtraction. The image is 2048 × . Panels (a)–(f).

Finding the cause of this gradient is very important, but beyond the scope of this paper. No matter how it was created, a good detection algorithm has to be as resilient to such gradients as possible. This particular image was intentionally chosen to demonstrate NoiseChisel's ability to account for them.

Figure 11(b) shows the detected objects removed from the input image. The image is covered with a mesh grid of size --smesh. The Sky on each mesh is found from the average of the undetected pixels in that mesh. Only those meshes with a sufficiently large fraction of undetected pixels are used: if too much of the mesh area is covered with detected objects, that mesh is not used, because of potential fainter wings penetrating into the undetected regions; compare Figures 3(a.1) and (a.8). The fraction can be set with --minbfrac (the value used in this paper is listed in Appendix E).

Cosmic rays can be a problem for determining the average and standard deviation. For example, long exposures in raw
HST/ACS images are heavily peppered with cosmic rays (see Figure 15 for a typical example). Since cosmic rays are very sharp and can be very small, some will not be caught by the detection algorithm. Therefore a simple calculation of the average and standard deviation will be significantly biased if cosmic rays are present. To remove their effect, σ-clipping with the same convergence-based approach as SExtractor is used (see Section A.3). Note that by this step the mode and median are approximately equal, so σ-clipping is only used to remove the effect of cosmic rays on the average and standard deviation.

The interpolated and smoothed (see Section 4.2) Sky map can be seen in Figure 11(c); each pixel in this image represents a 32 × 32 pixel mesh. Finally, in Figure 11(d), the Sky value of each mesh is subtracted from the pixels in that mesh. The gradients and differences between the different CCD chips have been significantly suppressed. Notice that even with this small mesh size some of the strong top right gradient still remains.

5. ANALYSIS
NoiseChisel is based on detecting an object deeply buried in noise through its fainter parts (see Section 3). In contrast, existing techniques rely on detecting the brightest regions and growing them (based on a priori models) to include the fainter parts (see Appendix B). Hence these methods are fundamentally different. The detection accuracy is analyzed here using three fundamental and basic statistics that can be derived from mock profiles in mock noise.

Table 1: Sky values for the figures, in units of counts (counts s⁻¹ data multiplied by 10 s). For mock images, the background (10000 e⁻) was subtracted. The columns are: figure label; median; mode; σ-clipping by convergence; σ-clipping (4,5); average of undetected pixels (SExtractor); average of undetected pixels (NoiseChisel); and N_f for NoiseChisel (Section 5.2). LS: low symmetricity, see Appendix C. S/A: same as above. N/A: not applicable.

Comparing the Sky and background value when the latter is accurately known (in a mock image) is the most basic test to evaluate the success of a detection algorithm. This comparison is done in Section 5.1. Purity is a quantitative measure of the contamination by false detections that remain in the final result. In Section 5.2 purity is defined and used for an analysis of false detections at the faintest limits. In Section 5.3, the faint-end magnitude dispersion is studied. Finally, the statistical significance of the requirements imposed on the objects detected as true is discussed in Section 5.4.

While being demonstrative for a “proof-of-concept” paper like this one, tests using mock objects or noise can never simulate all real circumstances accurately. Thus NoiseChisel is tested using only real data, without any modeling, in a companion paper by M. Akhlaghi et al. (2015, in preparation). Its accuracy, for example its completeness, purity, and number counts on real data, is further evaluated in that work. The details of the real data set, the necessary modifications to the existing pipeline, the tests, and an analysis of their results are beyond the scope of this definition or “proof-of-concept” paper.

5.1. Sky Value

The Sky is the average of the undetected regions. Therefore the success of a detection algorithm can be assessed through the effect of undetected objects on the Sky value. The more successful the detection algorithm, the closer the Sky value will be to the background value, which is 10000 e⁻ for all the mock images (see Section 2.2). There are two types of “faint” pixels that can be left undetected: (1) when the brightest pixels of an object are faint, for example, Figure 3(c.2); and (2) when the object is bright enough to be detected but has wings that penetrate into the noise very slowly (for example, Figure 3(a.2)).
The failure to detect both will be imprinted in the average of the undetected regions of the image.

Table 1 contains all the relevant data for this comparison. Except for the last column, all the columns in Table 1 show the pixel statistics of the images shown in the figures of column 1. In these statistics, only the pixels within the 200 × 200 pixel box are used, not the whole image (see Section 2.1). As σ-clipping is still extensively used in existing pipelines (see Section A.3), Table 1 also includes the results of the two major σ-clipping strategies: by convergence and (4,5). SExtractor, which uses the former, takes the average as the Sky value, while the latter uses the median; we take the same approach here.

The σ-clipping results show that, due to the skewness caused by galaxies (which do not have a sharp cutoff), none are significantly different from the median. Since the final average is used in the former, its reported values can be even larger than the initial median. The correlated noise, which acts like a convolution and skews the distribution, causes this difference with the mode to be much larger in the real (processed) images of Figures 13 and 19(a)–(d) than in the mock images.

In order to check the sensitivity of NoiseChisel, it is applied to the set of mock and real images of Figures 12 and 13 with the default parameters discussed in Sections 3 and 4 and fully listed in Appendix E. The Sérsic index differs between the panels of Figure 12; the real images are HST/ACS F814W images from the COSMOS survey of what are thought to be three z ∼  UV galaxies. The same input images are given to SExtractor (Figures 18 and 19 in Appendix B, with the configuration file shown in Appendix D) and the results are reported here. Some studies use the average of undetected pixels after SExtractor's detections have been removed. Therefore, in column 6 of Table 1, the average pixel value when all SExtractor detections have been masked out to 3 r_k is reported.

The mock images (with a number in the last column) show that, compared to σ-clipping, removing SExtractor's detected objects is more successful in approaching the real background value (10000 e⁻), but the average of the undetected regions is still a significant overestimation. The last four rows of Table 1, which successively mask more of the bright profile, show that this is not only due to undetected objects, but also to the fainter parts of bright objects. The last row shows that when too much of the image is masked, the average of the non-detected pixels goes below the real Sky value. This is reasonable, because localized high-valued noise pixels are systematically removed.

Figure 12: NoiseChisel sensitivity test. Each image contains 25 equally spaced mock profiles (see Figures 3(b.1) and (c.1) for the images prior to adding noise; r_e = 10 pix). The Sérsic index in each input image is different. The profiles on the bottom left are the faintest. The second column shows the regions of the detected objects masked from the input image. The third column is the inverse of the second.

Overestimating the Sky value results in a systematic underestimation of the total count of the detected objects. In the most diffuse and low surface brightness cases, a severe Sky overestimation can even result in a negative total count for the targets. Some of the individual pixels of a real object can be below the Sky value (Section 3.1.2), but if the Sky value is accurately found, the total count of a real detection can never be negative. If it occurs, it can only be due to the overestimation of the Sky. This problem can only occur in the existing signal-based approach to detection, where any pixel above the threshold is assumed to be the top of a profile. Negative total counts have been reported and extensively used in the
SDSS survey, for example; however, only the problem of not being able to measure the logarithm of a negative value (for conversion to a magnitude) was addressed (see Lupton et al. 1999). The source of the problem, an overestimation of the Sky, was not sought. In NoiseChisel, if the numerator of any of the S/N measurements (Equations (3)–(5)) becomes zero or negative, that detection, clump, or river is discarded.

For the mock galaxies, where the background is accurately known, when the area covered by a diffuse source is the greatest (Figures 12(a), 13(e), and 17(a) and (b)), NoiseChisel's detection algorithm is 2.3, 2.9, 4.6, and 3. times more accurate than SExtractor.

For Figures 13(e) and 19(e), Table 1 shows the edge and crowded-field effects on both SExtractor and NoiseChisel. Compared to the other mock images, NoiseChisel's Sky value is systematically higher in this case. This is not due to the edge, but to the six large profiles of Figure 1(b) that exist in this image (see Figure 1(b.1) for the complete pixel coverage of one of those six
profiles). If --smesh were set to a larger size for finding the threshold (see Section 4.2.1), the meshes covered by the objects would be ignored and NoiseChisel's final Sky value would be lower. However, Figure 11(d) shows that even such a small --smesh was not enough to completely remove some strong gradients; this higher Sky value can therefore only be corrected if there are no gradients in the image. Only one value for the full displayed region was discussed here. In practice, SExtractor interpolates over such regional values to find a background value for each pixel; Section A.6 discusses the process and shows the histogram of the pixel-by-pixel Sky values used.

Figure 13: NoiseChisel applied to realistic conditions. (a)–(d) Four real galaxies. (e) Six mock profiles with the same parameters as Figure 1(b) and a position angle of θ = °. Five of the profiles are centered 35 pixels outside the bottom and left edges of the image; see Section 2.1. The columns are the same as those in Figure 12. The truncation count (where black pixels are defined) in (e.1) is half that of Figure 1(b), to emphasize the faint wings of the profiles outside the image edge.

5.2. Purity

Assume that we have made mock profiles in mock noise, where the number and positions of the mock objects buried in the noise are already known. By construction we know there
Figure 14: Comparison of purity (false detections) and dispersion in measured magnitude when completeness is 1, for identical and very faint mock profiles. Panel columns: SExtractor and NoiseChisel. All the profiles are within the central 200 × 200 region, so any detection outside of it is false; see Section 2.1. Completeness is constrained to 1 for both techniques. SExtractor and NoiseChisel have 69 and 7 false detections, respectively. The two histograms, (c) and (d), show the distribution of magnitude (with a zero point magnitude of 0) for the true and false detections of the respective technique. The dashed line shows the true magnitude of all the profiles. The histograms are derived from four runs (100 true detections) on an identical noiseless image with different simulated noise seeds. Panels (a) and (b) had the largest number of false detections for each method.

is no other source of signal in such an image. Take N_f to be the number of false detections, N_t the number of true detections, N_T the total number of detections, and N_I the total number of input mock profiles; therefore N_f + N_t = N_T. The completeness, C, and purity, P, of a given detection algorithm with a given set of parameters are defined as

    C ≡ N_t / N_I,    P ≡ N_t / N_T = 1 − N_f / N_T.    (6)

Completeness is commonly measured with mock profiles, for example, mock point sources or mock galaxies. The results from such mock objects are not realistic. This is because, as discussed throughout this paper, real galaxies display very diverse morphologies and do not generically satisfy the clean elliptical shapes and radial profiles that can be modeled. Therefore any result based on mock objects will overestimate C. When the detection algorithm also depends on profiles having simple shapes, with a clearly defined center and a radially decreasing profile, the systematic bias is exacerbated. Purity, on the other hand, deals with false detections, which do not require modeling.

Without any true detections, purity will be 0, since N_t = 0 and N_f = N_T. In general, for a given set of parameters in any detection algorithm, C and P are anti-correlated: as purity increases, completeness decreases, and vice versa. Therefore neither can be studied independently. For example, decreasing the SExtractor threshold in Figure 18 will allow the detection of more of the faint mock profiles, therefore increasing completeness.
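Equation (6) can be made concrete with a minimal numerical sketch (the counts below are hypothetical, not the measurements of this section):

```python
def completeness_purity(n_true, n_false, n_input):
    """Completeness C = N_t/N_I and purity P = N_t/N_T = 1 - N_f/N_T
    (Equation (6)), where the total detections N_T = N_t + N_f."""
    n_total = n_true + n_false
    return n_true / n_input, n_true / n_total

# Hypothetical example: 25 input profiles, all detected, plus 5
# false detections.
C, P = completeness_purity(n_true=25, n_false=5, n_input=25)
```

With no true detections (N_t = 0 and N_f = N_T), the purity evaluates to 0, matching the statement above.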
Simultaneously, however, the number of false detections will increase (see Figure 17, for example), thereby decreasing the purity of the output.

The methodologies of NoiseChisel and SExtractor are fundamentally different, so comparing their purity has to be done carefully. To have a reasonably objective measure of purity in this case, the only way is to constrain the completeness of the two to be equal. Therefore a mock image for which NoiseChisel gives C = 1 is created. All the parameters of SExtractor are fixed to those of Appendix D, except DETECT_THRESH, which is decreased until C = 1 for SExtractor too. This strategy is chosen based on the advice of SExtractor's manual to keep the sensitivity related only to DETECT_THRESH. The result will not be independent of the mock profiles used to define completeness: the sharper the profiles, the fewer faint pixels are needed to reach C = 1. Hence an ideal test would be on the flattest profiles, or those with n = 0.5, for which both SExtractor and NoiseChisel show their lowest completeness (see Figures 12 and 18).

To get an accurate measure of purity, a large area of the image has to have no data. The mock image is very similar to the input image of Figure 12(a.1). Recall that 96% of the area of all the mock images in this paper is blank noise (Section 2.1). The only difference is that all 25 profiles have the same total magnitude. NoiseChisel detects all 25 in the image. To make the test more robust, the mock noisy image is created four times, such that only the noise differs; for all 25 × 4 = 100 profiles NoiseChisel gives C = 1. SExtractor could not detect all the objects without flagging some of their magnitudes as unreliable. So, ignoring the flags, the largest threshold at which
SExtractor also gives C = 1 is DETECT_THRESH=0.6. The masked results for one of the images can be seen in Figure 14, once masked with SExtractor's detections and once with NoiseChisel's. In total, in the four images, 100 mock galaxies are present, and in all four the completeness is 1. Therefore the results of all four images can be reported as one purity value each for SExtractor (P_s = 0.29) and NoiseChisel (P_n). Even though the threshold of NoiseChisel is much lower than the smallest possible threshold for SExtractor (which is the Sky value), this result shows that the purity of NoiseChisel is about three times higher in detecting real, faint objects with the parameters used (see Appendices D and E). In the last column of Table 1 the number of false detections (N_f) in all the mock images is reported. Recall that, like Figure 14, all the mock images are covered by Gaussian noise over 96% of their area. With exactly the same input parameters, the reported N_f values agree well with those found in this test.

As discussed in Sections 3.1.5 and 3.2.1, NoiseChisel is very robust in dealing with the correlated noise that processed real images also have. However,
SExtractor and all existing signal-based detection methods in general treat the pixels above the threshold as the top of a known profile; therefore they are not immune to the effects of correlated noise, resulting in lower purity when applied to a real processed image.

5.3. Magnitude Dispersion

All the profiles in Figure 14 are identical; however, Figure 14(a) displays a large dispersion in the shapes and areas of the detections. This results in the roughly 2 magnitude range that is visible in SExtractor's true detections (see Figure 14(c)). It is very important to note that the bimodality observed in Figure 14(c) (ignoring color) is due to the fact that all mock profiles were identical. In a real image, the faintest true detections can have a variety of inherent profiles, so the distribution will be unimodal with a negative skew. The net result is a common plot in existing catalogs (for example, see Figure 10 in Illingworth et al. 2013). On the contrary, NoiseChisel's true detections all have roughly the same area in Figure 14(b), and thus the range in measured magnitude for true detections in Figure 14(d) is about half that of SExtractor. The two NoiseChisel detections with magnitudes brighter than −
11 are the result of four of the profiles being detected as two detections.

5.4. Imposed Restrictions on the Signal

The proposed algorithm imposes some restrictions on the objects it will detect as true. Therefore it is not “absolutely” noise-based, or independent of any signal parameter. Here we discuss its methodological limits. With the parameters chosen in this paper (listed in Appendix E), a target has to satisfy the following conditions.

1. To be detected, an object has to have an area equal to or larger than that of Figure 6(c), above the initial quantile threshold (which is below the Sky value; see Section 3.1.2), on the convolved image.

2. To be classified as a true detection, it has to contain at least one connected component larger than 15 pixels above the Sky threshold, with a sufficiently large S/N specified from the ambient noise (see Section 3.1.5), on the actual image.

As the methodological limits or requirements imposed on the target by a detection technique decrease, the detection algorithm becomes more successful, because it becomes more generic. The “success” of a detection algorithm can be defined by how accurately its Sky measurement approximates a known background, by the purity of its result, and by the scatter in its photometry, as discussed above. It was shown that, because of far fewer constraints on the target objects, this technique was significantly more “successful” compared to SExtractor (with the parameters of Appendix D).

In astronomical data analysis, an
S/N above five is usually considered a “solid” result; any measurement with a lower S/N is usually included in an analysis with caution (McLean 2008, Section 1.5.2). Using ideal simulated noise, we showed that objects and clumps with an S/N as low as 3.61 (Figure 7) can be accurately detected. Note that the reported S/Ns are overestimated; hence the true limiting S/N is slightly less than what is reported in those plots.

Through setting fewer constraints (if a similar purity level can be achieved), it would be possible to decrease the S/N of true detections even further. The extreme case would be to detect objects with an S/N of unity. However, the statistical significance of any resulting measurement (for example, the effect of that detection on the measured Sky, its total count, its central position, or any other measurement on the object) would correspondingly decrease to values that are no longer significant in any scientific/statistical measurement or analysis.

6. DISCUSSION

A new method for detecting extremely faint objects, and the fainter parts of brighter objects, in noise is introduced. Unlike the existing approach to detection, where the signal is the basis, this approach is based on operations and calculations on the noise, with insignificant requirements on the signal. The method is also non-parametric; in other words, it does not involve a functional regression analysis. It is therefore ideal for the study of nebulous/amorphous objects that are heavily immersed in noise, for example, galaxies.
Galaxies can have rich dynamic histories involving internalprocesses as well as external interactions with the halo or othergalaxies. This creates very complicated, diverse morphologies.The fact that we cannot image them in their full 3D glory furthercomplicates their final observed image. Therefore, imposing anya priori model on their shape or radial light profile during the de-tection process is a self-inflicted systematic bias in their study.Modeling the light profiles of galaxies is vital to our understand-ing of the galaxies and testing our models, but it should be done after detection is complete. Mixing detection and modeling willonly bias both.One major argument in favor of elliptical-based detectionand photometry techniques like the Kron and Petrosian radiias opposed to isophotal methods (including
NoiseChisel ) is thatthey are impervious to ( + z ) dimming and K-correction andfor any profile, they are “well defined.” By well defined it ismeant that if the same galaxy (with the same morphology) existsin multiple redshifts, then such methods will yield the same frac-tion of total light for all the redshifts. Ideally, when the aperturesused extend to infinity (see Graham and Driver 2005), and thegalaxies have no morphological evolution, the result would beindependent of the noise and thus the threshold used. However,this is not necessarily the case in the real world.The interpretation of signal-based detection results dependson the morphology of the galaxies under study. To demonstratethis point, consider the simple mean in a 1D distribution as ananalogy. Note that the Kron radius is a weighted mean. Themean is a well defined statistic for most distributions because itgives a unique and unambiguous value. Regardless of whetherthe underlying distribution is a Gaussian or a log-normal dis-tribution, for example, an unambiguous mean can be defined ,but for interpreting the mean, the distributions should not be ig-nored; otherwise the analysis will be biased when such differentdistributions are involved.The same applies to the Kron or Petrosian radii. For anygiven surface brightness (morphology), a unique and unambigu-ous radius can be found, rendering them well defined. How-ever, when it comes to interpreting what that radius physicallymeans (in kiloparsecs or fraction of total light, for example), theinterpretation will be biased when the morphology (spatial andcount distribution of pixels) differs between the sample of galax-ies. Therefore, comparing Kron magnitudes for the galaxies ofa general population can be problematic. 
For example, all thegalaxies in a (noisy) survey or image which can simultaneouslycontain irregulars, spirals, and elliptical galaxies.The Kron or Petrosian radii satisfy the redshift independencein studies of galaxies that are already known to have approx-imately the same morphology; in other words, a morphologi-cally selected sample where the sample has been detected in-dependently of the shape of the galaxies. For example, unlikethe references in Section 1, galaxies might indeed have simi-lar morphologies across redshifts (for example, see Lee et al.2013). However, this claim, or any other claims, including thosein Section 1, about the morphology of detected sources can onlybe tested when the detection technique does not depend on themorphology as an assumption. In other words, if the detectiontechnique depends on galaxies having a similar morphology atsimilar or various redshifts, it should not come as a surprise ifgalaxies with similar morphologies are detected.Isophotal detection methods like NoiseChisel do have thelimitation that they use a different threshold for objects at differ- ent redshifts, or ( + z ) dimming. However, it should be con-sidered that the Sky value and its standard deviation in the inputimage are also completely independent of such issues. Thesevital image statistics, which propagate in all subsequent mea-surements, only depend on the arriving photons in each pixelregardless of the physical origin of the source. Therefore im-posing such high-level (related to physical interpretation and notthe data) constraints on low-level measurements (to do with thedata, regardless of later physical interpretations) is a source ofsystematic bias. 
This noise-based approach to detection was designed to create an accurate low-level measurement tool with as few assumptions as possible, so that high-level interpretations can be less biased.

High-level interpretations of the outputs of low-level measurements can easily be done after the low-level procedure is complete. For example, to account for (1 + z) dimming, an individual threshold can be defined based on the measured redshift (a high-level product) for each object in a study. In the 1970s, when the Kron and Petrosian radii were defined, the processing power to achieve this level of customization was not available. Therefore such high-level constraints had to be imposed on the techniques that originated decades ago. However, with the very fast computers cheaply available today, there is no longer any need for such self-inflicted biases.

Through the particular definition of the threshold (in Figure 2, Sections 3.1.2 and 4.2.1), the cycle of the Sky estimate depending on detection and detection depending on the Sky is significantly weakened. In the proposed technique, the detection threshold and initial detection process are defined independently of the Sky value. An independent initial approximation of the Sky is found by averaging the initial undetected regions and is only used in classifying and removing false initial detections (see the flowchart of Figure 2).

Due to all the novel, low-level, and non-parametric methods employed in this noise-based detection, fainter profiles can be detected more successfully than with the existing signal-based detection technique (at least with the parameters used here). The purity measurements of Section 5.2 show how successful this algorithm and its implementation have been in removing false detections, while detecting the very faint true profiles and the fainter parts of large profiles.
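For reference, the purity and completeness used in these comparisons reduce to simple ratios. The function names and counts below are hypothetical, only to fix the definitions:

```python
def purity(n_true_detections, n_false_detections):
    """Fraction of all reported detections that are real."""
    return n_true_detections / (n_true_detections + n_false_detections)

def completeness(n_true_detections, n_input_profiles):
    """Fraction of the known input (mock) profiles that were detected."""
    return n_true_detections / n_input_profiles

# For example (hypothetical counts): recovering all 100 input mock
# profiles (completeness of 1) while also reporting 245 false
# detections gives a purity of about 0.29:
print(purity(100, 245))        # ≈ 0.2899
print(completeness(100, 100))  # 1.0
```

A detector can therefore trivially reach a completeness of 1 by thresholding low; the purity at that completeness is what separates the methods compared here.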
Section 5.3 then showed the significantly smaller scatter in the photometry of those very diffuse and faint galaxies.

A new approach to segmenting a detected region into objects that are heavily immersed in noise is also developed. In this approach, the true clumps are first found using the global properties of the noise, not based directly on a user input value or the particular object (compare Section 3.2.1 with Section B.2). By expanding the clumps over the light profile of the detected region, possible objects are found and the detected region is segmented.

One of the design principles of NoiseChisel was to decrease the dependency of the output on user input, to ensure that the results are more dependent on the image than on the user's subjective experience. To achieve this, the true/false criteria were defined based on the image and not an absolute hand-input value (see Sections 3.1.5 and 3.2.1). This approach accounts for local variations in the noise properties or correlated noise, which becomes a significant issue in processed data products. Note that by using relative measures the dependence on user input has significantly decreased, but not disappeared.

The Astrophysical Journal Supplement Series, 220:1 (33pp), 2015 September: Akhlaghi & Ichikawa

This noise-based approach to detection allows the input parameters to require minimal customization between images of different instruments and in different processing stages. For example, in this paper, all the outputs for mock images with ideal noise, raw ground-based images with real noise, and
HST/ACS drizzled space-based images with correlated noise were processed with nearly identical input parameters.
NoiseChisel is designed with efficiency in mind; therefore it is also very fast in processing a large image. As an example, the image of Figure 11 was processed in 4.43 seconds using a 3. GHz, 4 physical core CPU on a desktop computer. Until now the design and applicability of the concepts introduced here were the primary concern, but in time more efficient algorithms and methods might be found to increase its efficiency. For example, including
GPU functionality can significantly decrease the running time. The detection and segmentation techniques introduced through NoiseChisel are easily expandable to the detection of any signal in noise, for example, to 3D data in radio studies, similar to clumpfind (Williams et al. 1994), and to 1D data like spectroscopic data.

The technique proposed here for detection and segmentation, along with the ancillary NoiseChisel program, provides a fundamentally fresh and new approach to astronomical data analysis. The limited preliminary tests on mock profiles and noise in this paper show that it significantly increases the detection ability, with a reasonable purity and very small photometric scatter, to detect the fainter parts of bright galaxies with any shape or to find small faint objects that would remain undetected with the existing signal-based approach to detection. Both of these new abilities can play a major role in most branches of galaxy evolution, or astronomy research as a whole (some applications are reviewed in Section 1). Along with the new instruments, such as the James Webb Space Telescope, the Large Synoptic Survey Telescope (LSST), and the Thirty Meter Telescope, which will be available in the next few years, this new approach to detection will play a very important role in current and future astronomical discoveries.

ACKNOWLEDGMENTS

We are most grateful to John Silverman for providing very constructive comments and suggestions, particularly on the scientific questions that led to the creation of this technique. Robert Lupton's very critical assessment of this work during several discussions at various stages of its development was very constructive and we are most grateful to him. Anupreeta More provided some very critical and useful comments on testing, particularly in Section 5.2. Paul Price kindly supplied us with Figure 11 along with his valuable opinions on this work. Alan Lefor reviewed this text and provided very useful writing style comments for which we are most grateful; he also patiently helped in the initial portability tests of the
GNU
Astronomy Utilities. The ApJS Scientific Editor (Eric Feigelson) provided valuable comments on the layout and presentation. The critical comments, and references introduced, by the anonymous referee were also very important for the final presentation. Mohammad-reza Khellat, Henry McCracken, Masashi Chiba, Marcin Sawicki, Surhud More, James Gunn, Emanuelle Daddi, Steve Bickerton, and Claire Lackner also provided valuable critical suggestions throughout the duration of this work. M.A. is supported by a Ministry of Education, Culture, Sports, Science and Technology in Japan (MEXT) scholarship. This work has been supported in part by a Grant-in-Aid for Scientific Research (21244012, 24253003) from MEXT and by the Global Center of Excellence (GCOE), Young Scientist Initiative.

A. EXISTING SKY CALCULATION METHODS

In existing algorithms for detection, the first step of detection, namely setting a threshold, is based on the Sky value (see Appendix B). To approximate the Sky value prior to detection, researchers use multiple techniques, such as the mode, σ-clipping, etc. A critical analysis of the most common techniques for approximating the Sky value is given here.

A.1. Local Mode

In a symmetric random distribution, for example, pure noise in an astronomical image, the mean, median, and mode are equal within their statistical errors. With the addition of signal (astronomical objects, yellow histogram in Figure 1), a skewed distribution is formed (red histogram in Figure 1). The mean is the most highly dependent on the skewness, since it shifts toward the positive much faster. The median is less affected, since it will be smaller than the mean for a given positively skewed distribution. Finally, the mode of the distribution is the least affected statistic of the three.
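This ordering can be sketched numerically with a toy model (not the mock images of Figure 1; the histogram-peak mode estimate and all parameters below are only illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Background noise with a true Sky of 0, plus 10% faint "object" pixels
# that add a positively skewed tail (hypothetical values):
noise = rng.normal(0.0, 1.0, 90_000)
objects = rng.exponential(2.0, 10_000) + rng.normal(0.0, 1.0, 10_000)
pixels = np.concatenate([noise, objects])

mean = pixels.mean()                       # shifts the most
median = np.median(pixels)                 # shifts less
counts, edges = np.histogram(pixels, 200)  # crude histogram-peak mode
mode = 0.5 * (edges[np.argmax(counts)] + edges[np.argmax(counts) + 1])
# Expected: mean > median, with the mode closest to the true Sky (0),
# yet all three remain positively biased estimators of the Sky.
```

Even the mode, the least affected of the three, does not recover the true background exactly, which is the bias discussed for the mode-based methods below.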
This argument has thus led many to take the mode of a noisy image as the Sky value. There are several approaches to finding the mode of the image. Bijaoui (1980) finds a functional form for the image histogram in the vicinity of the mode using Bayes's theorem. Kron (1980) finds the mode of the pixel distribution in a 50 × 50 pixel box about the peak of each object, based on the mean of the 7 bins in the histogram with the largest number of pixels, and considers that the Sky value. Beard et al. (1990) use a more elaborate approach in the search to find the mode. The histogram is first smoothed with a moving box filter. The highest peaks are found and used in a cubic fit to find the mode, which they consider to be the Sky.

DAOPHOT (Stetson 1987) also uses a very similar method: a circular annulus with an inner radius several times the stellar
FWHM is taken from a nearby part of the image, sufficiently close to the target, and the mode of the pixels in that region is considered as the Sky value. The mode is approximated based on the following relation between the mean and median: 3 × median − 2 × mean. This particular relation between the mode, median, and mean can only exist when one assumes a particular pixel count probability distribution function (PDF), or histogram, for the image pixels.
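This mean-median mode approximation can be sketched as follows (an illustrative snippet, not DAOPHOT's actual code; the sample parameters are hypothetical):

```python
import numpy as np

def approx_mode(pixels):
    """Pearson-style mode approximation, mode ~ 3*median - 2*mean.
    Only meaningful for mildly skewed, unimodal distributions."""
    pixels = np.asarray(pixels, dtype=float)
    return 3.0 * np.median(pixels) - 2.0 * np.mean(pixels)

# On a positively skewed sample the approximated mode falls below
# both the median and the mean, as expected:
rng = np.random.default_rng(3)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)
print(approx_mode(sample), np.median(sample), sample.mean())
```

The snippet also makes the limitation explicit: the relation silently assumes a particular shape for the pixel-count PDF, so it is only as good as that assumption.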
SExtractor can be considered a hybrid, first relying on σ-clipping, and then on finding the mode to find the Sky value. The former will be discussed in Section A.3. In the latter, like DAOPHOT, SExtractor uses the following relation to find the mode: 2.5 × median − 1.5 × mean. In a recent re-analysis of SDSS Stripe 82 (the SDSS deep field), Jiang et al. (2014) use a similar but more accurate hybrid method. The main difference is that they remove SExtractor detections first, with DETECT_THRESH=2 and DETECT_MINAREA=4 (see Section B.1). If the mean is not smaller than the median, the mode is assumed to be found based on a similar relation to DAOPHOT between the mean and mode.

(The photon noise actually comes from a Poisson distribution, which is not symmetric, especially for very low mean (λ) values. However, with very high mean values (λ = B, in e−, from Figure 1), the Poisson distribution can safely be considered approximately equal to a Gaussian distribution with µ = λ and σ = √λ. Images taken with the HST commonly have much smaller Sky values. In such cases, the main source of noise is produced by the instrumentation, for example, read-out noise, which can outweigh the non-symmetric low-λ Poisson noise.)

Therefore existing methods that use the mode as a proxy for the Sky either employ the parametric approximations of Bijaoui (1980),
DAOPHOT, SExtractor, and Jiang et al. (2014), based on an assumed function for the pixel count PDF, or a non-parametric one (Kron 1980; Beard et al. 1990). It is clear from Figure 1 that the histogram, which can be considered a binned PDF, can take any form of skewness depending on the type and positioning of astronomical data in the image. Therefore by assuming a fixed generic functional form for all possible PDFs, significant systematic errors will be induced.

The existing non-parametric methods reviewed here rely on the image histogram. The results from a histogram depend on the bin width and the bin positioning, especially in the vicinity of the mode. For example, note how the peaks near the mode vary in Figure 20 only due to bin positioning. The method of Kron (1980) for finding the mode (explained above), for the histogram in Figure 1(b.1), yields 4. e− ± . e−, where the histogram is made from 200 × 200 pixels; recall that Kron (1980) used a 50 × 50 pixel box, and the area here with no signal is much larger than the object. Different bin widths would change this result, but since no generic distribution can be assumed for all images, any generic bin width choice would ultimately be arbitrary. A very accurate, non-parametric method for finding the mode of a distribution that does not rely on the histogram, but on the cumulative frequency plot of the pixels, is introduced in Appendix C.

Regardless of how the mode is found, such attempts at using the mode of a distribution as a proxy for the Sky fail to consider the fact that the mode of the distribution also shifts significantly from the actual image background, depending on the data that are embedded in the noise. Figure 1 shows that, like the mean and median, the mode is a biased estimator of the Sky value due to the data. As the fainter parts of large objects or a large number of faint objects cover a larger fraction of the image, the mode of the distribution shifts to the positive. Figure 1 shows how this shift increases, with mode values of 9. e− (symmetricity of 1.26, see Appendix C), 9. e− (0.48), and 156. e− (0. ), compared to a true background of 0 e− in all three. See Section 5.1 for further discussion of the mode as compared to the true Sky.

Reversing the argument above (that data cause a difference between the mode and median), we conclude that if the mode and median are approximately equal, there is no significant contribution of signal or data. In fact this argument is very important for NoiseChisel (see Section 4.2.1). Note that this argument is only possible when the mode is found independently from the median, as in Appendix C.

Based on the idea in the previous paragraph we have also created
SubtractSky , which is also distributed as part of the
GNU
Astronomy Utilities. It can be used for cases where only Sky subtraction is desired. Note that the argument above "detects" signal through its effect on the distance between the mode and median, but only based on pixel values and independently of the signal's spatial distribution. See the manual for more information; its operation is very similar to that explained in Section 4, with small modifications.

A.2. Removing Detections

If the detection algorithm is independent of noise, this method of finding the Sky is the most accurate (see Section 2.2). However, the existing detection algorithms depend on knowing the Sky before they run (see Section B.1). Even so, this method is used by some researchers to find the Sky value; for example, Stoughton et al. (2002) used this technique over a grid to estimate the Sky in the SDSS survey. Bright objects, which are defined as those with at least one pixel above 200σ, are first subtracted from the image. They are all assumed to have power-law wings, which are modeled and subtracted from the image, and finally a median filter is applied to the image in a box of side 100″ ≈ 252 pixels. Increasing the box size will be a serious burden on the computational process of the pipeline, and there will always be objects that are, collectively or individually, large enough to bias the result.

Mandelbaum et al. (2005) reported a systematic decrease in the number of faint galaxies within < ″ of the center of bright objects. It was also observed that this method underestimates the total count of the brighter galaxies (for example, Lauer et al. 2007). The issue was finally addressed in Aihara et al. (2011). The major solution was to change the threshold to detect bright sources to 51σ. Detected objects were deblended, and a linear combination of the best exponential and de Vaucouleurs models was subtracted.

While the decreased threshold and new fitting functions do increase the number of objects that are masked, they still fail in the following respects. (1) Subtraction is done based on parametric model fitting, which can potentially be severely wrong in the fainter outer parts of the galaxies that are the primary source of bias. Such cases are when the galaxy cannot be characterized by a simple ellipse or when it is on the edge of the image (see Section B.4). Other cases are when the galaxy might not be a simple n = 1 or n = 4 profile. (2) Extremely high thresholds such as 51σ are still a cause of systematic bias. For example, none of the bright profiles in Figure 1((c), untruncated) have a pixel above that extremely high threshold. The brightest pixel in Figure 1(c.1) is only 39. σ above the background.

A.3. σ-clipping

When the outliers of a data set are sufficiently distinct from the majority of the (noisy) distribution, σ-clipping can be very useful in removing their effect. Assuming m is the distribution median, any data point lying beyond m ± ασ can be clipped (removed), and this process can be repeated on the smaller data set n times. The parameters defining σ-clipping can thus be written as (α, n).
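An (α, n) clipping loop of this kind can be sketched as follows (an illustrative implementation; the choice of median as the center and the stopping rule vary between packages, and the sample values are hypothetical):

```python
import numpy as np

def sigma_clip(data, alpha=3.0, n=5):
    """Repeat up to n times: remove points beyond median +/- alpha*std.
    Returns the surviving points."""
    data = np.asarray(data, dtype=float)
    for _ in range(n):
        med = np.median(data)
        std = data.std()
        keep = np.abs(data - med) <= alpha * std
        if keep.all():          # nothing left to clip; converged early
            break
        data = data[keep]
    return data

# Sharp outliers (like cosmic rays) are removed easily, because they
# are far from the bulk of the (noisy) distribution:
rng = np.random.default_rng(5)
pixels = np.concatenate([rng.normal(0.0, 1.0, 10_000), [500.0, 800.0]])
clipped = sigma_clip(pixels)
```

The two extreme "cosmic-ray" values vanish after the first iteration; by contrast, the gradually fading wings of a galaxy profile would survive every iteration, which is exactly the failure mode discussed next.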
Such σ-clipping is only useful if the objects (outliers) have a very high S/N with very sharp boundaries, for example, cosmic rays (see Figure 15). Some prominent uses of this approach in finding the Sky value in an image are: the
HST image processing pipeline (Gonzaga et al. 2012) with ( , ), SDSS Stripe 82 (Annis et al. 2014) with ( , ), and again Jiang et al. (2014) with ( , ). As its primary tool, SExtractor also relies on this approach, though the termination criterion is not a fixed number of iterations but the convergence of σ. Asymmetric clipping was proposed by Ratnatunga and Newell (1984). Gonzaga et al. (2012) obtains the value globally and uses it for the whole image; others use this technique to find the local Sky value and use interpolation over the whole image. (Jiang et al. 2014 do not mention the first parameter.)

Figure 15: Effect of cosmic rays on the histogram in a small postage stamp (100 × 100 pixels) of a raw Sky-subtracted HST/ACS (flt.fits) image. The histogram is vertically truncated and the color range in the displayed image has the same range as the histogram. 84 cosmic-ray pixels have values above 150 e− and are not shown in the histogram. The outlying cosmic rays can easily be distinguished in the histogram due to their sharp boundaries, in contrast to the fading boundaries of galaxies, for example, Figure 1(b).

The five vertical lines in all three examples of Figure 1 show the position of each iteration's σ-clipped median with ( , ) on the mock images. Due to their extreme proximity, they might not be resolved. Once σ converges, SExtractor uses the mean as the Sky value. For the displayed area of the three examples of Figure 1, the mean, when SExtractor's convergence is achieved, is, respectively, 11. ± . e−, 23. ± . e−, and 236. ± . e−. Prior to σ-clipping, the median of each image was 11. e−, 20. e−, and 199. e−. Hence, a simple median (no σ-clipping) was a better approximation to the true background value (0 ± e−). The medians of the ( , ) σ-clipping technique are shown on Figure 1. It is clear that, because the profiles of astronomical objects go slowly into the noise, the correction provided by σ-clipping is very insignificant for the mean and median. For the standard deviation, however, σ-clipping is very useful. The initial standard deviation of these three examples was 167. e−, 455. e−, and 341. e− (see Section 5.1 for a more complete comparison).

Figure 1 shows how, through their very gradual penetration into the noise, stars (the PSF), galaxies, and undetected objects in astronomical images are very different from cosmic rays. Therefore σ-clipping will not suffice when such objects are present. In other words, no matter how much the brightest pixels of objects in an image are clipped, their faint wings penetrate very deep into the noise. Therefore when astronomical objects are present in the image, a generic σ-clip is a very crude overestimation of the Sky value.

A.4. Radius of Flat Profile

Another possible solution to finding the Sky was presented recently as a proxy for the local Sky background behind an object. Barden et al. (2012) has proposed finding the σ-clipped mean value on elliptical annuli after removing all known source contributions, using SExtractor. As long as the σ-clipped mean is very high above the noise, the slope of the mean pixel value on an annulus as a function of radius will be negative. However, as it approaches the noise, due to the scatter caused by noise, the slope will become repeatedly positive and negative.
They consider the position of the second positive slope to be the annulus on which to find the Sky value for each object.

The drawbacks of this method are that it assumes that the profiles can be represented as an ellipse with a well-defined center. It is also based on the Kron radius (Section B.4), which can underestimate the full extent of an object when DETECT_THRESH>1 (see Section B.1.2 for a complete discussion). The most important issue with this approach is the area (or number of pixels) used to estimate the Sky value. As Stetson (1987) noted, to reach a statistically acceptable precision, the area used to find the Sky value has to be sufficiently larger than the object itself. However, the average value on an annulus is based on a smaller number of pixels than that spanning the parent object, especially when all neighboring objects are removed (see their Figure 5). Therefore the random scatter due to the small number of pixels used will limit the ability to measure the true Sky level with accuracy. This is further exacerbated by the fact that the growth of the elliptical annuli is halted based on the scatter in the measured Sky of each.

A.5. Median Value of Dithered Images

Median values of dithered images can only be used in images prior to co-adding. This method exploits the dithering, where the field of view of the telescope is shifted slightly between different exposures (for example, in Kajisawa et al. 2011). In order to find the Sky value between several temporally close observations, the median pixel value of those observations is taken as the Sky value on each pixel.

The weaknesses of this method are: (1) the Sky value might change between separate observations, particularly in infrared imaging. (2) The main objects in the image have to be far smaller than the dithering length. (3) The field should not be crowded. (4) It assumes perfect bias, dark, and flat fielding, since it depends on operations that are done prior to them.

A.6. Background Interpolation
SExtractor and most other algorithms do not find one Sky value for the whole image. Instead, the Sky values are found on a mesh grid. In SExtractor the sizes of the meshes are set to BACK_SIZE pixels. It then uses bicubic-spline interpolation to find the Sky value for each pixel. Figures 16(a)-(c) show SExtractor's Sky value calculated for each pixel for the three mock images of Figure 1, along with the images in Figures 18 and 19. Recall that the background value for all mock images is zero. It is clear that, as the area covered by the faint parts of mock galaxies increases from (a) to (c), the background values calculated by SExtractor's σ-clipping and mode approximation are shifted very strongly to the positive. In this figure the problems with using model-based interpolation techniques are also evident. Note how the corners of the images have extremely high and low values compared to the centers of the images in Figures 16(a)-(c) and (l). Note that the other cases of Figure 16 do not include image corners.
Figure 16: Sky value check images and their histograms, produced by SExtractor with the input parameters of Appendix D applied to the images in some of the figures in this paper. First row (a)-(c): the three mock images of Figure 1. Only for these three, the displayed 200 × 200 region was input to SExtractor. Second row (d)-(g): the four mock images of Figure 18. Third row (h)-(l): the four real and one mock image of Figure 19. Failure to interpolate accurately on the corners of the images is clearly visible in (a)-(c) and (l), where the image edge can be seen. Real images are in units of counts s−1, so the horizontal axes of their histograms are multiplied by 10. The true background (B) value of all mock images is subtracted in the histograms.

In SExtractor, through the parameter
BACK_SIZE (see Appendix D), a user can set a smaller grid on the image. SExtractor then finds the Sky in those sub-grids, removes the outliers with median filtering, and interpolates over them. Therefore for each particular image of a particular target, a good choice of BACK_SIZE will slightly correct the very large overestimations discussed here. However, for any BACK_SIZE, there might be objects in the image that cause a similar or even larger bias. Therefore, unless the best BACK_SIZE is found for each particular input image, depending on the objects and their position in the input image, there will be a bias. Such customized application is not practical because of the sheer number of galaxies studied in most papers. If the user chooses a custom background for each image of a large set of targets, comparison between the different images cannot be accurate because of the different statistical properties of the background. Therefore, such significant systematic overestimations will plague any astrophysical analysis or hypothesis testing that is based on them.

B. EXISTING DETECTION AND SEGMENTATION/DEBLENDING METHODS

The detection technique used in astronomy up until now can generally be classified as a signal-based approach with various implementations. It is impractical to review all implementations and their minor differences here. Thus in this appendix we use
SExtractor. SExtractor was chosen here because it can easily be installed and has a relatively complete manual. Because of this, it is by far the most commonly used tool by most authors in generating large catalogs or in source extraction of known targets, and it is thus widely recognized by the community. Its techniques are also conceptually very similar to, and inherited from, other packages, for example, DAOPHOT (Stetson 1987), clumpfind (Williams et al. 1994), and Kron (1980).

To extract sources from a noisy image, SExtractor first needs to find the noise characteristics, or the Sky value and its error, for every pixel. A comprehensive review of the common methods for finding the Sky value is thus provided in Appendix A. Based on the noise, a threshold is applied and regions above the noise are chosen as detections (Section B.1). The detections are then deblended (Section B.2). The deblended detections are extended beyond the threshold using the Kron radius concept, which is critically analyzed in Section B.4. In Section B.3 a discrete hierarchical Markov image modeling method for simultaneous detection and segmentation of objects in an image is discussed. Since it also requires growing the objects afterwards, it is discussed before Section B.4. The images are outputs of SExtractor. The interpretations of how it has operated are based on its manual or corresponding paper. The configuration file used to run SExtractor is shown in Appendix D.

B.1. Detection

After finding the Sky value, SExtractor uses it to define a threshold (see Section B.1.2). The threshold is applied to a smoothed image (see Section B.1.1). Of the various regions found above the threshold, those with an area smaller than a user-defined value are excluded as false detections (Section B.1.3), and the remaining objects (true detections) are deblended (Section B.2). Using the Kron (1980) radius concept, the deblended detections are grown on a fixed ellipse to include pixels below the threshold (see Section B.4).
B.1.1. Convolution
A convolution kernel comparable with the PSF has been the best choice for this purpose. A qualitative argument in support of this choice is that it is the widest convolution kernel that can be used such that no object is smoothed out, since no object can be smaller than the image PSF (see Bijaoui and Dantel 1970, for a detailed quantitative discussion). Based on this argument, for the example runs of SExtractor on the mock images, the same PSF (a Moffat function with β = 4.76 and FWHM = 3 pixels) that was used to create the input image was used. In SExtractor the convolution kernel is specified by the parameter
FILTER_NAME.

B.1.2. Threshold
In order to detect objects above the noise, a threshold needs to be applied to the convolved image. Only pixels with values above this threshold are considered for object detection. A higher threshold will decrease the probability of false detections, because fewer noise pixels will have a count above it. However, it will miss fainter objects and the fainter parts of brighter objects.

In SExtractor, the threshold is defined in terms of the approximated Sky (S, see Appendix A) and its standard deviation (σ): S + DETECT_THRESH × σ, which are measured on the original input image (prior to convolution). The values set for DETECT_THRESH are different in various studies. The recently released HST/WFC3-selected photometric catalogs in the five CANDELS/HST fields (Skelton et al. 2014) set this parameter to 1.8. In the Ks-selected catalog in the COSMOS/ULTRAVISTA field (Muzzin et al. 2013), it is set to 1.7. As a final example, in the Subaru/MOIRCS deep survey of the GOODS-N field (Kajisawa et al. 2011), it is set to 1. . Some studies use two runs: (1) a "cold" run, where DETECT_THRESH is set to a very high value of 2.3 and only the brightest regions of the brightest objects are detected, and (2) a "hot" run, where it is set to a lower value of 1.65 to detect the fainter objects. The two resulting catalogs are then compared and "hot" objects that lie over the "cold" ones are removed from the final catalog. In their "hot" run, such techniques use the lowest possible threshold they can consider. Working on deeper images, Leauthaud et al. (2007) used 2. . Note that ∼95% of the noise pixels have a value below 2σ in the unconvolved image, yet far fewer pixels exceed such a threshold after convolution. This occurs because of the decreased dynamic range after convolution (see Section 3.1.1 and Figure 4). In Figure 4, 1σ, which is the standard deviation of the un-convolved image, corresponds to 100 e−. However, when applied to the convolved image of Figure 4(b), only 44 (of the 40,000) pixels are above this threshold. Note that the kernel used in Figure 4(b) is the same PSF that was used to make the mock image. In Figure 4(c)-(e), 100 e− far exceeds the brightest convolved pixel.

Based on this idea, Galametz et al. (2013) adopt a different approach to the cold and hot detection technique. DETECT_THRESH is approximately similar between the two (0.75 and 0.7), but the convolution kernels (see Section B.1.1) used are significantly different, such that the cold run has a much wider kernel (see their Appendix A for the respective SExtractor parameters).

A different approach to finding the threshold of an image, called the "False-Discovery Rate," was introduced in astronomy by Miller et al. (2001). In this approach the average fraction of false pixel detections to true pixel detections is held fixed when assuming a fixed null hypothesis (background and its noise). It was implemented as a detection technique in Hopkins et al. (2002). All the pixels in the region must be sorted and a p value has to be calculated from an approximated mean and standard deviation. Thus this thresholding technique also needs the Sky and its error prior to the actual detection process.

B.1.3. Detection: True/False
Once the threshold of Section B.1.2 is applied to the smoothedimage of Section B.1.1, the image is divided into two groups ofpixels: those that are above and below the threshold. The non-white pixels in Figure 17 (a.1)–(d.1) and column 2 in Figures18 and 19 show this image with a threshold applied (they are all
SExtractor ’s segmentation maps). In Figure 17,
SExtractor is runon the mock image of Figure 1(b.2) with the configuration file ofAppendix D. All of
SExtractor ’s parameters in all four cases arefixed to those in Appendix D except for
DETECT_THRESH , which isset to , , , and , respectively. The first observation afterexamining Figure 17 is that as the detection threshold decreases,the number of false detections increases: 0, 0, 319 and 11296,(in the 1000 × SExtractor to define and thus re-move such false detections is
DETECT_MINAREA. With this parameter the user can specify the minimum area that a true detection should have. As the threshold decreases, more and more pixels become connected. As an example, when DETECT_THRESH=0.1 is fixed but DETECT_MINAREA is set to five successively larger values, 8989, 3880, 966, 350, and 260 false detections are found, respectively. A larger DETECT_MINAREA will also result in not being able to detect many compact bright objects that will also be present in a real image. The SExtractor manual therefore correctly suggests setting DETECT_MINAREA to small values (1–5) so that the convolution kernel and DETECT_THRESH define the sensitivity.

Figure 18 shows the outputs of
SExtractor, configured with Appendix D, for extremely faint mock galaxies; see Sections 3.1 and 5.1. SExtractor's results on a bright and large profile are already analyzed in Figure 17. With DETECT_THRESH=1, only the top few pixels of the brightest and sharpest profiles have been successfully detected. Note that, as discussed in Section B.1.2, most surveys use much larger thresholds.

When a large S/N for detections is required for accurate fittings, for example, in photometric redshifts or SED fittings, some researchers might opt for discarding such faint detections. However, their detection in the image is nevertheless very important, because if they are not detected and removed, they will cause an overestimation in the Sky value, hence systematically underestimating the total count of the brighter, high-S/N objects (see Section 5.1).
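The Sky overestimation caused by undetected faint sources can be demonstrated numerically. The 100 e− noise level matches the mock images discussed earlier, but the source fraction and per-pixel signal below are hypothetical numbers chosen only to make the bias visible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure background: zero mean, sigma = 100 e- (as in the mock images
# of this paper; the remaining numbers are illustrative assumptions).
sky = rng.normal(0.0, 100.0, (200, 200))

# Sprinkle faint, undetectable sources over 5% of the pixels, each
# adding 40 e- (well below a 1-sigma detection threshold).
img = sky.copy()
faint = rng.random(img.shape) < 0.05
img[faint] += 40.0

# Estimating the Sky as the mean of all "undetected" pixels is now
# biased high, so sky-subtracted counts of bright objects come out low.
sky_bias = img.mean() - sky.mean()   # ~ 0.05 * 40 = 2 e- per pixel
```

Even though every added source is individually invisible, their collective flux shifts the estimated Sky upward by roughly 2 e− per pixel in this toy setup.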
Figure 17: SExtractor results for various thresholds, with configuration parameters from Appendix D except for DETECT_THRESH. The input image only has the one profile of Figure 1(b) (see Section 2.1). The threshold values for DETECT_THRESH are (a) 2, (b) 1, (c) 0.5, (d) 0.1. For each threshold there are two images. Left: the segmentation map. Right: the elliptical region within three times the Kron radius (r_k, see Section B.4) is masked (white) from the original image. Notice the darker profile regions surrounding the elliptical mask that have not been included in the 3r_k elliptical aperture. The smaller elliptical border marks the aperture containing ∼90% of the counts in the profile. The larger elliptical border shows the region containing ∼95% of the counts. The segmentation maps in (c.1) and (d.1) have different color codes for the main object (gray) and any other detections (black). The marked regions show how connectivity is not a criterion in SExtractor's deblending algorithm.
B.2. Segmentation/Deblending

Segmentation and deblending are defined in Section 3.2. An interesting deblending algorithm is proposed by Lupton (2015) and is employed in the SDSS, Subaru Telescope Hyper Suprime-Cam, and LSST pipelines. In one image (one color), real peaks are defined as those that are at least 3σ_sky above the saddle point or valley that separates them from neighboring peaks in terms of count. True peaks also have to be at least one pixel distant from a neighboring peak. Once the true "child" peaks are found over a detected region, a circular template is built for each child by using the smaller value of the two pixels on opposite sides of the peak. A weight is given to each child, and a cost function is defined and minimized to deblend the sources (see Figure 1 of Lupton 2015 for a 1D demonstration).

In choosing a true peak, the area belonging to the peak and the count in it are ignored, and a peak pixel is defined as true only if that pixel alone is sufficiently above its immediate valley or saddle point (compare with the discussion in Section 3.2.1). The very important technical issue of how to find the saddle point count is not discussed in that report. One drawback is that this selection criterion will miss real peaks that are below this large threshold but are not due to noise, for example in a very flat profile. In this approach, deblending is completely independent of segmentation. Such pure deblending will also suffer when the detected regions cannot be fitted (the cost function minimized) with combinations of templates over the detection.

SExtractor uses a concept very similar to contour maps, known as multi-thresholding. The count range of the detected region is divided into several layers. Then the separate objects in those layers are analyzed and "real" peaks are modeled to find the separate local maxima and their contribution to the total count (see Figure 2 in Bertin and Arnouts 1996). clumpfind (Williams et al. 1994) also uses a very similar approach. Therefore in this approach, first a segmentation map is derived and used to model the objects, and then deblending is done.

In SExtractor's implementation, a fixed number of layers is defined for all objects, specified by DEBLEND_NTHRESH (see Appendix D). Therefore brighter objects, with a very large difference between their peak count and the Sky, will receive layers that are much more widely separated than fainter objects. This will over-segment the fainter objects, where the spacing between the layers is smaller, and under-segment the brighter ones. The scale on which the count is divided will also significantly alter the result. In SExtractor, DETECT_TYPE specifies the scale: if set to CCD, an exponential scale is used, and setting it to PHOTO will use a linear scale. Each will output significantly different results.

Once peaks are found between the layers, they are accepted as a true detection or rejected as false based on the parameter
DEBLEND_MINCONT, which specifies the fraction of the total intensity in that peak, or clump, compared to the total, undeblended, detection. Similar to the layer-spacing problem mentioned above, this parameter is by definition biased toward detecting more clumps in fainter objects. Since the threshold to accept a clump differs from one detection to another (based on the total count within each parent detection), a fixed peak will be detected in a fainter object and discarded in a brighter one. Therefore, when this selection criterion is adopted for a comparison of clumps in galaxies, added to the layering bias mentioned above, it will bias the results by preferentially detecting more segments (or clumps) in objects with a lower total count before deblending.

Figure 18: SExtractor sensitivity test. Column 1: the input images, which are the same as those in Figure 12. Column 2: SExtractor's segmentation maps. Column 3: SExtractor's detections extended to three times the Kron radius. The SExtractor configuration parameters can be seen in Appendix D.

Another consequence of this method in finding true clumps is that the result will vary significantly depending on
DETECT_THRESH. When the threshold is high, the detected total count will be less, and therefore more peaks will be detected in a given object.

The highlighted (magnifying glasses and red arrows) regions in Figure 17 (for mock galaxies) and the large fraction of flagged detections in Figure 19 (for real images) show how SExtractor's deblending algorithm has failed in the low surface brightness regions. The primary reason for this failure is that SExtractor was designed for high thresholds. When reliable peaks are found (with a high threshold), each pixel is assigned to a peak, assuming all the peaks are 2D Gaussians far larger than the region above the threshold. This is useful in cases where the peaks' S/N and the threshold are sufficiently high and all peaks follow a Gaussian profile over the whole image.

In practice, hardly any non-stellar object of astronomical interest has a Gaussian profile, and postage stamps are usually chosen to be larger than the objects in order to get an accurate estimate of the Sky value. Figure 17 shows two ways that
SExtractor fails in deblending as the threshold decreases: (1) some of the very far detections of Figure 17(c.1) and (d.1), designated with arrows, have been identified as belonging to the main object, and (2) a connected region can be shared between two objects that are not physically connected to it, as shown in the magnified regions of the same figure. These problems might not cause a significant problem when the magnitude of the very bright mock galaxy in this figure is desired, but when applied to low surface brightness and diffuse galaxies like Figure 19(a)–(d), very serious systematic errors can be produced.

Figure 19: Tests of SExtractor on the same input images as Figure 13. First column: input image. Second column: SExtractor's segmentation maps. Third column: 3r_k elliptical apertures (see Section B.4). In the real images, because of their complexity, only the borders of the elliptical apertures are shown. In the mock image, the elliptical apertures are filled. Most of the central object detections in (a.3), (b.3), and (d.3) were flagged by SExtractor for possibly biased magnitude measurements.

Figure 19(a.1)–(d.1) shows
SExtractor ’s result for the samegalaxies in Figure 13. Appendix D shows the configuration pa-rameters used to run
SExtractor on these galaxies. The input
PSF for these four real galaxies is obtained from the web interface of
TinyTim (Krist et al. 2011). The PSF of TinyTim is the theoret-ical
PSF produced on the
CCD . In order to create the processedimages shown here, various steps of image processing have beenapplied to these images (see Koekemoer et al. 2007; Massey etal. 2010). Therefore, in practice, the actual PSF on the image http://tinytim.stsci.edu/cgi-bin/tinytimweb.cgi. On Chip 1, pixel position(2047, 1048), filter F814W, Spectrum value 12, PSF diameter 3 ′′ , focus 0.0. SExtractor does not accept kernels larger than 31 ×
31 pixels. Therefore, thecentral region of this size was cropped from
TinyTim ’s output for input into
SExtractor . HE A STROPHYSICAL J OURNAL S UPPLEMENT S ERIES , 220:1 (33pp), 2015 September A
KHLAGHI , & I
CHIKAWA will be slightly different from that produced by
TinyTim (see vander Wel et al. 2012). While this difference plays a very importantrole in model fitting, for the demonstrative purposes on detectionhere, the
TinyTim output is adequate.It is clear from Figure 19 that the two more diffuse objectsin Figure 19(b) and (d) have been deblended into many moreseparate objects than the two brighter cases of Figure 19 (a) and(c). As explained above, this is due to the inherent biases of
SExtractor ’s deblending algorithm.B.3.CombiningDetectionandSegmentationA discrete hierarchical Markov image modeling approachwas introduced in astronomy recently by Vollmer et al. (2013).This is a fundamentally different approach to detection and seg-mentation. It is reviewed here because like the more popularmethods used today, the detected region needs to be grown tocomplete the detection as we will discuss next in Section B.4. Inthis approach object detection and segmentation are done simul-taneously. The basic technique comes from image processingapplications in particular the quadtree of Laferte et al. (2000).Hierarchical Markov models have the advantage that they al-low to use multi-band images at potentially different pixel reso-lutions without the need to modify the data as most techniquesused in astronomy currently need to do. This makes them “scalefree.” However Vollmer et al. (2013) acknowledge that their im-plementation (
MARSIAA ) does not account for
PSF variation be-tween the inputs. In this approach the problem is labeling ofthe image pixels “which are spatially connected through a pre-defined neighborhood system. These labels correspond to dis-crete classes of objects with similar surface brightness”(Vollmeret al. 2013, page 3). Using the multi-band data as input, the pro-cedure iterates along the hierarchies to find a distribution for thelabels over the image, see Laferte et al. (2000) for details.In this approach there is no specific functional basis to do thedetection, therefore it is considered as a model-independent de-tection and segmentation technique. However, instead of knownfunctions, it classifies the image pixels into “classes” which areshared by all the objects in the image. For example, noise has aclass of 0 and as the gradient over the object increases, the classalso increase; see Figure 9 in Vollmer et al. (2013) for how theclass changes toward the center of the object. The number ofclasses is a free parameter specified by the user. The final simul-taneous detection/segmentation depends on the total number ofclasses and the dynamic range of the image. To detect low sur-face brightness galaxies, Vollmer et al. (2013) had to truncate orclip the image pixels beyond 20 s , therefore this approach cannotsimultaneously detect/segment very bright and very faint objectsin one image.An interesting consequence of this technique is that it doesnot need to define a count threshold, however, it can only get ∼ . s close to the Sky value. Figures 8–10 of Vollmer et al.(2013) show the results of this technique compared to SExtrac-tor ’s segmentation maps for three galaxies. In the three cases, thedetected area of
SExtractor ’s segmentation map with a thresholdat 1 . s was comparable or even larger (better covering the ob-jects). This technique heavily relies on iterative processes andthus can be very slow. Vollmer et al. (2013) report it was at least90 times slower than SExtractor for the same data set. B.4.GrowingDetectionsBelowtheThresholdThe high thresholds that were used to avoid false detectionscause a very large fraction of the object to not be initially de-tected (see Figure 17(a.1)–(d.1)). Failing to detect a bright ob-ject’s complete area will deprive us of all the valuable infor-mation lying hidden there and cause a systematic under- (over-)estimation of the object’s magnitude (Sky value). Thereforewhen using such large thresholds, it is necessary to use the re-gions above the threshold as a basis to “grow” the detections andinclude their fainter parts.The Kron radius, r k (Kron 1980), is one of the most popularmethods to grow the detection into regions of the image with acount lower than the threshold. It is defined as the light-weightedaverage radius of a profile, r k = (cid:229) rI ( r ) (cid:229) I ( r ) . From Figure 17 it is clear that the elliptical parameters and r k are found based on the region above the threshold, thereforethe fainter regions of the galaxies that often harbor valuable in-formation on the dynamical history of the galaxy and might havedifferent morphological parameters cannot be directly measuredin the existing signal based approach to detection.It is generally considered that 2 r k contains ≥
90% of the to-tal count, integrated to infinity ( F I ), of a galaxy light profile (seeKron 1980; Infante 1987). Therefore in the literature it is com-mon to use 2 . r k to find the total count of a galaxy. In orderto test this claim, SExtractor ’s FLUX_AUTO is divided by F I forthe four different masked 3 , . r k apertures shown in Figure 17.For this profile, 3 , . r k contains 74 . . .
22% and87 .
33% of the total count. The area within the 3 , . r k has beenmasked to show the extent of the object that has not been de-tected. The elliptical apertures containing 90% and 95% of F I are also shown in Figure 17 for visual comparison.Graham and Driver (2005) did a thorough analytic study ofthe mathematical (continuous) 1D Sérsic profile to show howsuch differences from the expected >
90% could occur in thecalculation of the Kron radius. They showed how dependent thetotal count within the 2.5 times the Kron radius is to the area thatis used to find the Kron radius. If this area is extended to infinity,it is indeed as expected and 2.5 times the Kron radius contains >
90% of F I , even for n =
10 profiles. However, integrationto infinity is not a realistic condition. In practice any measure-ment of the Kron radius is confined to a certain aperture. Byconsidering the effects of the finite aperture that is used in realmeasurements of r k , they show how the values reported above,which have also been reported by other authors are reasonable(see Bernstein et al. 2002; Benítez et al. 2004).This discussion demonstrates that the statement that an aper-ture > r k will contain nearly all ( ≥ F I is approximated by the total count within 30 r e . Note that the larger ellipsein Figure 17 which contains 95% of the total light is on 7 r e . HE A STROPHYSICAL J OURNAL S UPPLEMENT S ERIES , 220:1 (33pp), 2015 September A
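The aperture dependence of r_k analyzed by Graham and Driver (2005) can be reproduced numerically on a noiseless exponential (n = 1 Sérsic) profile; the `kron_radius` helper and the chosen sizes below are illustrative assumptions:

```python
import numpy as np

def kron_radius(image, xc, yc, rmax):
    """First-moment (Kron) radius r_k = sum(r * I) / sum(I), measured
    only from pixels within `rmax` of the center (xc, yc)."""
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - xc, yy - yc)
    inside = r <= rmax
    return (r[inside] * image[inside]).sum() / image[inside].sum()

# Noiseless circular exponential profile (Sersic n = 1), r_e = 10 pix.
npix, re = 401, 10.0
yy, xx = np.indices((npix, npix))
r = np.hypot(xx - npix // 2, yy - npix // 2)
img = np.exp(-1.678 * r / re)        # b_1 ~= 1.678 for n = 1

# The measured r_k depends strongly on the aperture it is computed in:
rk_small = kron_radius(img, npix // 2, npix // 2, rmax=2 * re)
rk_large = kron_radius(img, npix // 2, npix // 2, rmax=15 * re)
```

With the measurement aperture extended far enough, this profile's r_k approaches its analytic limit of 2 r_e / 1.678 ≈ 11.9 pixels, while truncating the measurement at 2 r_e underestimates it by roughly 20%, so the count inside any fixed multiple of r_k drops as well.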
The mock profile in this example has a Sérsic index (n) of 4.00, while in the local universe, galaxies with n as large as 11.84 have been reported (Kormendy et al. 2009). The case for distant galaxies becomes much worse, since their apparent r_e in pixels, due to the angular diameter distance, is much smaller, with much lower surface brightness.

Figure 18 shows SExtractor's output for mock galaxies, both flat (low Sérsic index) and sharp (high Sérsic index). Even with DETECT_THRESH=1, only a few pixels have been found for each object, and the elliptical parameters, namely the center, axis ratio, position angle, and major-axis length (r_k), are all measured using these few pixels. The second argument to PHOT_AUTOPARAMS is set to a minimum radius such that if r_k is too small, this minimum radius is used instead of it. This is why most of the ellipses in Figure 18 have similar sizes. However, such hand-input values could add more bias to the final result (see Section 5.3 for a discussion on the dispersion of the magnitudes measured in this method).

By definition, r_k depends on finding the full 1D light profile of an object. In order to do that, it is necessary that the center of the profile, its position angle, and its axis ratio are accurately defined. Furthermore, a good enough sampling of the profile at various (r, θ) is needed in order to approximate the count at a certain radius. These preconditions are only valid in the most optimistic cases, like the bright, purely elliptical mock profile of Figure 1(b). In general, only a very rare fraction of galaxies display such a simple and bright profile, clearly separated from neighbors or image edges.

Galaxies can have multiple components, for example, a bulge, a bar, and a disk. Each can have a different axis ratio and position angle. The central parts (bulge and a possible bar) are much brighter than the outer disk. Therefore their effect on the overall, light-weighted estimation of ellipticity will completely outweigh the fainter outer parts, which might not even be detected because of the high thresholds (DETECT_THRESH>1) that are commonly used. This compounds the problem mentioned in the previous paragraph for a real galaxy's total light, because all the elliptical parameters in one profile can change in a real galaxy due to their rich dynamical history.

The four real galaxies in Figure 19 visually appear to have a distinct region in the noise.
Whether they are composed of separate objects on the same line of sight or are actually one object cannot be judged from one image. In all four examples, because the connectivity of the objects is below the threshold (DETECT_THRESH=1), SExtractor considers them to be spatially separated objects. Furthermore, in none of these examples can the connectivity be modeled as a simple ellipse.

Figure 19(a.1) appears to have one bright clump and a diffuse host. Because of the relatively flat, diffuse structure, the ellipses showing 3r_k for the three detections have become extremely large. When this happens, the final result is dependent on the accuracy of Sky subtraction, because a large area of sky is also included as part of the object. The large fraction of flagged detections in the real galaxies of Figure 19 shows how MAG_AUTO has been flagged as unreliable for most of their sub-components. A user of SExtractor will thus have to either ignore flagged detections or rely on MAG_ISO, using only the segmented regions above the threshold. If they choose the latter, for such diffuse objects, MAG_ISO is going to miss a very large fraction of the object's area and photon count.

Due to the sheer number of galaxies that researchers use for their studies, it is not feasible to inspect each detection by eye to decide when MAG_AUTO should be used and when MAG_ISO. SExtractor does provide MAG_BEST to automatically choose between the two, but it is not commonly used in studies today. The main reason is that MAG_AUTO and MAG_ISO are not consistent and cannot simply be compared with each other in the study of a large number of galaxies.

In Figure 19(e), because of the strong gradient created by bicubic-spline interpolation in SExtractor's Sky measurement (see Figure 16(l)), no pixel from the profile on the corner of the image has been detected. Looking at SExtractor's detections in Figure 19(e.3), it is clear that for such objects, lying on the edges of images, any modeling method trying to account for their faint regions fails. A very large portion of their total count is left out of their designated Kron aperture. Like the case for clumpy objects, all the necessary requirements to define r_k fail when enough of the object is beyond the edge of an image. Such objects have to be taken into account, particularly if they are the wings of bright objects. While the total magnitude of such objects would always be flagged by software like SExtractor, and not included in any scientific work, failing to account for as much of their light as possible will systematically increase the Sky value that is calculated in such an image. Such cases are common when postage-stamp images (for example, Figure 19(a.1)–(d.1)) of a selected group of galaxies in a larger survey are studied.

The discussions in this section show that the only way to get as close as possible to the "true" total photon count and area of an object of interest is to use lower thresholds. Trying to correct for the photon count that is lost when a high threshold, required by the signal-based detection technique, is used will only be a source of systematic bias in measurements and the resulting scientific results. The problems mentioned in this section were the primary motivation in finding a new approach to detection and segmentation that would mitigate many of the problematic issues described above.

C. FINDING THE MODE

In Section A.1, the existing methods to find the mode of an image were reviewed. Here a novel approach to finding the mode of a distribution is proposed. It is very accurate as long as the distribution is approximately symmetric around its mode, like the pixel count distributions of Figure 1. This is a safe constraint for images of astronomical targets, which generally have more faint pixels than bright ones (the yellow histograms of Figure 1).
When the noise is significant, the noise will ensure a symmetric mode similar to the examples of Figure 1. As the signal becomes more significant, the quantile of the mode will shift to lower values.

Given an ordered data set X with n elements, where X_i ≥ X_{i−1} (1 < i ≤ n), a mirrored distribution is defined about the mirror point located at index m as follows. All the data prior to m are placed in opposite order after m, as if a mirror was placed there. The elements of the mirrored data set are thus defined with

    M_i = X_i,                          i ≤ m
    M_i = X_m + (X_m − X_{m−(i−m)}),    i > m

The mirrored distribution is therefore identical to the original distribution up to and including X_m. M has 2(m−1)+1 elements and, like X, it is ordered. Figure 20 shows one example data set (the pixels of Figure 9(b.2)) with mirrored distributions overplotted. The mirror points in Figure 20 are placed at various quantiles of the original distribution in each case. The histogram is only displayed in this demonstration to aid understanding; it is not used at all during the analysis.

Figure 20: Mirrored distribution compared with the original distribution. The data set is the pixels of Figure 9(b.2) multiplied by 10 seconds. M_m is the value at the mirror point. Blue and green histograms are the histograms of the original and mirrored distributions, respectively. The thick and thin lines are the cumulative frequency plots of the original and mirrored distributions, respectively. The percentage value shown after the label of each plot shows the percentile of the original distribution on which the mirror was placed. It is clear that as the mirror point gets closer to the mode of the distribution, the two lines diverge more smoothly. The cumulative frequency curve is scaled to the tip of the blue histogram.

Figure 20 shows that when the mirror is very far from the mode, the cumulative frequency plot of the mirrored distribution will deviate below or above that of the original pixels very sharply. As the mirror approaches the mode, the maximum difference of the two cumulative frequency plots in the vicinity of the mode becomes smaller. Therefore, on the symmetric mode of a distribution, the two cumulative frequency plots should remain very similar for a large number of pixels (or distance in the ordered array) after the mirror.

Let d(M_i, m) represent the difference between the cumulative frequency plots of the original and mirrored data sets for M_i, when the mirror is placed on the index m. For any given mirror position and positive integer N, let D(m) be the maximum |d(M_i, m)| for all m < i < m + N. Then the symmetric mode of a distribution can be found by finding the m such that D(m) is minimized.
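The construction of M and the minimization of D(m) can be sketched as follows. For simplicity this sketch scans a grid of candidate mirror indices rather than using the modified golden-section search of the text, and it omits the −a√m limit introduced below; the helper names and all numbers are assumptions:

```python
import numpy as np

def mirrored(X, m):
    """Mirror the sorted array X about index m (0-based):
    M_i = X_i for i <= m, and M_i = 2*X_m - X_{2m-i} for i > m."""
    return np.concatenate([X[:m + 1], 2.0 * X[m] - X[m - 1::-1]])

def D(X, m, N):
    """Maximum |difference| between the cumulative frequency plots of
    X and its mirror, over the N mirrored points that follow index m."""
    M = mirrored(X, m)
    tail = M[m + 1:min(m + 1 + N, M.size)]
    cdf_orig = np.searchsorted(X, tail, side="right")  # counts in X
    cdf_mirr = np.arange(m + 2, m + 2 + tail.size)     # M is ordered
    return np.abs(cdf_orig - cdf_mirr).max()

# Toy data: 9000 noise pixels around 0 plus a bright tail of "object"
# pixels; the true (noise) mode is at 0.
rng = np.random.default_rng(3)
X = np.sort(np.concatenate([rng.normal(0.0, 1.0, 9000),
                            rng.normal(6.0, 2.0, 1000)]))

# Grid scan over candidate mirror indices.
cands = np.arange(2000, 8000, 50)
mode = X[cands[np.argmin([D(X, m, N=2000) for m in cands])]]
```

Far from the true mode, the systematic offset between the two cumulative frequency plots dominates; near it, only Poisson-level differences remain, so the minimum of D(m) recovers a mode close to zero in this toy data set.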
The golden section search algorithm (Kiefer 1953) is used to find the mirror point that minimizes D(m).

If the distribution has several symmetric local maxima, and as long as N is large enough, the first (with the lowest quantile) will be found, even if the brighter peak has more pixels in its proximity (see Figure 21(c) for a distribution with two local maxima in the histogram). This is exactly what is desired in astronomical image processing, because if there is a second local maximum with a larger count and a larger number of pixels, then that local maximum is not due to noise. Noise will always produce the first (lowest count) local maximum, which can be called the mode. In most astronomical applications of low-S/N objects, the first local maximum will be the only one. If not, it will have more pixels in its vicinity than any other local maximum in the pixel distribution.

The mirror cumulative frequency plot found for the mode should be approximately equal to the actual data in the vicinity of the mode. Beyond that, it should ideally always remain below the actual data (see Figure 21). However, in minimizing D(m) as defined above, the resulting mirror cumulative frequency plot will zig-zag around the actual distribution, since |d(M_i, m)| was used. The cumulative frequency plot is fundamentally a counting function. Therefore, the error in choosing an index, say m, follows a Poisson distribution, and thus the standard deviation in finding m is √m. Hence a limit can be set on how negative d(M_i, m) can become. Taking a to be a given constant specified by the user, if d(M_i, m) < −a√m, D(m) is set to a fixed constant, call it C. The golden section search is modified from its standard algorithm to choose the interval with the lowest values when it confronts C. Recall that such conditions only occur when m is a significant overestimation of the mode.

Figure 21(a) shows the result for the same distribution of Figure 20.
In Figure 21(b) this technique is applied to the more skewed distribution of the convolved image of Figure 10(c.1). A very bright spiral galaxy has caused the postage-stamp pixel distribution to be more skewed. Figures 21(c) and (d) show the distributions of two mock images with two very bright and large Sérsic profiles. In these tests, N is set to a fixed fraction of the data set size, d(M_i, m) is sampled in 1000 equally spaced points between M_m and M_{m+N}, and a is also taken to be 1.

Let a be the value of the 0.01 quantile of the mirrored distribution (the minimum is not chosen due to its scatter), take b to be the value at the mode, and finally, let c be the value of the first M_i beyond the mode where |d(M_i, m)| > a√m (the same a as above). This point is marked on the plots of Figure 21 with a dotted vertical line. These three values can define a measure of "symmetricity" (s) about the mode with:

    s = (c − b) / (b − a)

The symmetricity values for the examples of Figure 21 are, respectively, 0.43, 0.25, 0.84, and 0.74. Based on these and other simulations, a lower limit on s is adopted for accepting a "symmetric" mode, when the mode is above the ∼0.02 quantile of the image. When the output mode approaches such quantiles, b − a in the equation above will be too small to give an accurate result. (The value of C is an internal issue and is irrelevant to this discussion or the user; it is only necessary that it be larger than the largest possible D(m).)

Figure 21: Finding the mode by minimizing D(m). Similar to Figure 20, with the mirror placed on the mode. The dotted line shows the first value where |d(M_i, m)| > a√m, which is used to find the "symmetricity" of the mode (see text). The percentile of the mode in each panel is (a) 43.37%, (b) 34.35%, (c) 15.05%, and (d) just over 4%. (a) Same data set as Figure 20. (b) The convolved image of Figure 10(c.1): a bright and large galaxy skewing the distribution, further shifting the mode to lower percentiles. (c) The convolved and noisy image of a circular −20 magnitude mock Sérsic profile with r_e = 50 pixels, truncated at 2r_e, with center on (100, 140) in a 200 × 200 pixel image. (d) Similar to (c) but a −28 magnitude profile, placed at (110, 285).

In addition to the initial cost of sorting the data set, the rest of the process is very simple and not CPU intensive, since only 1000 points are checked in each round of the golden section search. Therefore the
CPU cost of running this technique is similar to that of σ-clipping, which also requires sorting (to find the median) but needs multiple passes through the whole data set.

D. SEXTRACTOR PARAMETERS

Below is a list of the SExtractor parameters used in this study. We make no claim that these are "the best" (or worst) possible set of parameters. The values for each parameter were selected based on values reported by other research teams that extensively used SExtractor in their astrophysical data analysis, and on its manual; see Appendices A and B.

The file PSF.conv refers to the PSF used to create the mock images when SExtractor was applied to the mock images, and to that produced by TinyTim when real HST images were used; see Appendix B for more details.
MEMORY_OBJSTACK 3000
MEMORY_PIXSTACK 300000
MEMORY_BUFSIZE 1024
DETECT_TYPE CCD
GAIN_KEY GAIN
PIXEL_SCALE 1.0
FITS_UNSIGNED N
BACK_TYPE AUTO
BACK_SIZE 64
BACK_FILTERSIZE 3
BACKPHOTO_TYPE LOCAL
BACKPHOTO_THICK 24
WEIGHT_TYPE BACKGROUND
WEIGHT_GAIN N
FILTER Y
FILTER_NAME PSF.conv
MASK_TYPE CORRECT
INTERP_MAXXLAG 16
INTERP_MAXYLAG 16
INTERP_TYPE ALL
DETECT_THRESH 1
ANALYSIS_THRESH 1
THRESH_TYPE RELATIVE
DETECT_MINAREA 5
DEBLEND_NTHRESH 64
DEBLEND_MINCONT 0.001
CLEAN Y
CLEAN_PARAM 1.5
FLAG_IMAGE flag.fits
FLAG_TYPE OR
SEEING_FWHM 1.2
STARNNW_NAME default.nnw
MAG_ZEROPOINT 0.0
PHOT_APERTURES 5
PHOT_AUTOPARAMS 3,3.5
PHOT_AUTOAPERS 0,0
PHOT_FLUXFRAC 0.5
PHOT_PETROPARAMS 2.0,3.5
SATUR_LEVEL 50000.0
SATUR_KEY SATURATE
CHECKIMAGE_TYPE SEGMENTATION,APERTURES,BACKGROUND
CHECKIMAGE_NAME seg.fits,aper.fits,back.fits
CATALOG_NAME SEresults.txt
CATALOG_TYPE ASCII_HEAD
PARAMETERS_NAME ./parameters/sextractor/sextractor.param
VERBOSE_TYPE NORMAL
E. NOISECHISEL PARAMETERS

The full list of parameters used for NoiseChisel in this paper is printed here. If a different parameter was used, it is explained in the text. Note that we do not consider these parameters to be "the best" for any generic data set. For the processing of Figure 11, --nch1=4, --lmesh=256, and the following two on/off options were also activated: --fullinterpolation and --fullsmooth. For the HST images, --skysubtracted was activated.

minnumfalse 100
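The golden-section search mentioned in the cost discussion of Appendix C works for any unimodal objective: each iteration shrinks the search bracket by a constant factor at the cost of a single new function evaluation, which is why the search is cheap once the data are sorted. The sketch below is a generic, hypothetical illustration; the quadratic objective is a placeholder, not the mirror-distribution mode criterion of this paper.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi, approximately 0.618

def golden_section_min(f, a, b, tol=1e-4):
    """Minimize a unimodal function f on [a, b] by golden-section search.

    Each iteration shrinks the bracket [a, b] by a factor of 1/phi and
    reuses one of the two previous evaluations, so only one new call to
    f is needed per round.
    """
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:                      # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

# Placeholder objective over the quantile range [0, 1]; in the paper the
# objective would instead quantify the asymmetry of the distribution
# mirrored about a trial mode quantile.
q_mode = golden_section_min(lambda q: (q - 0.4) ** 2, 0.0, 1.0)
```

With the default tolerance the bracket converges to the placeholder minimum at q = 0.4 in a few dozen evaluations, independent of the size of the underlying (sorted) data set.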
REFERENCES
Abraham, R. G. and P. G. van Dokkum (Jan. 2014). PASP, 126, 55
Abraham, R. G. et al. (Nov. 1996). ApJS, 107, 1
Aihara, H. et al. (Apr. 2011). ApJS, 193, 29
Annis, J. et al. (Oct. 2014). ApJ, 794, 120
Anscombe, F. J. (Feb. 1973). The American Statistician, 27, 17
Barden, M. et al. (May 2012). MNRAS, 422, 449
Beard, S. M., H. T. MacGillivray, and P. F. Thanisch (Nov. 1990). MNRAS, 247, 311
Benítez, N. et al. (Jan. 2004). ApJS, 150, 1
Bernstein, R. A., W. L. Freedman, and B. F. Madore (May 2002). ApJ, 571, 56
Bertin, E. and S. Arnouts (June 1996). A&AS, 117, 393
Bijaoui, A. (Apr. 1980). A&A, 84, 81
Bijaoui, A. and M. Dantel (May 1970). A&A, 6, 51
Cayón, L., J. Jin, and A. Treaster (Sept. 2005). MNRAS, 362, 826
Chandola, V., A. Banerjee, and V. Kumar (July 2009). ACM Comput. Surv.
de Vaucouleurs, G. (1959). Handbuch der Physik, 53, 275
Donoho, D. and J. Jin (2004). The Annals of Statistics, 32, 962
Dougherty, E. (1992). 1st ed. CRC Press. ISBN: 9780824787240
Ellison, S. L., J. T. Mendel, D. R. Patton, and J. M. Scudder (Nov. 2013). MNRAS, 435, 3627
Elmegreen, D. M., B. G. Elmegreen, S. Ravindranath, and D. A. Coe (Apr. 2007). ApJ, 658, 763
Galametz, A. et al. (June 2013). ApJS, 206, 10
Gonzaga, S., W. Hack, A. Fruchter, and J. Mack (June 2012).
Gonzalez, R. C. and R. E. Woods (2008). 3rd ed. Prentice Hall. ISBN: 978-0131687288
Graham, A. W. and S. P. Driver (2005). PASA, 22, 118
Hawkins, D. M. (1980). Chapman & Hall. ISBN: 978-94-015-3996-8
Hopkins, A. M. and J. F. Beacom (Nov. 2006). ApJ, 651, 142
Hopkins, A. M. et al. (Feb. 2002). AJ, 123, 1086
Illingworth, G. D. et al. (Nov. 2013). ApJS, 209, 6
Infante, L. (Sept. 1987). A&A, 183, 177
Irwin, M. J. (June 1985). MNRAS, 214, 575
Jiang, L. et al. (July 2014). ApJS, 213, 12
Kajisawa, M. et al. (Nov. 2010). ApJ, 723, 129
Kajisawa, M. et al. (Mar. 2011). PASJ, 63, 379
Kauffmann, G., S. D. M. White, and B. Guiderdoni (Sept. 1993). MNRAS, 264, 201
Kennicutt Jr., R. C. (Sept. 1989). ApJ, 344, 685
Kiefer, J. (1953). Proc. Amer. Math. Soc., 4, 502
Klypin, A., A. V. Kravtsov, O. Valenzuela, and F. Prada (Sept. 1999). ApJ, 522, 82
Koekemoer, A. M. et al. (Sept. 2007). ApJS, 172, 196
Kormendy, J., D. B. Fisher, M. E. Cornell, and R. Bender (May 2009). ApJS, 182, 216
Kregel, M. and P. C. van der Kruit (Nov. 2004). MNRAS, 355, 143
Krist, J. E., R. N. Hook, and F. Stoehr (Oct. 2011). Proc. SPIE, 8127
Kron, R. G. (June 1980). ApJS, 43, 305
Lackner, C. N. and J. E. Gunn (Apr. 2012). MNRAS, 421, 2277
Laferte, J.-M., P. Perez, and F. Heitz (Mar. 2000). ITIP, 9, 390
Lauer, T. R. et al. (June 2007). ApJ, 662, 808
Leauthaud, A. et al. (Sept. 2007). ApJS, 172, 219
Lee, B. et al. (Sept. 2013). ApJ, 774, 47
Lotz, J. M. et al. (Jan. 2006). ApJ, 636, 592
Lundgren, B. F. et al. (Jan. 2014). ApJ, 780, 34
Lupton, R. H. (Aug. 2015). Personal home page
Lupton, R. H., J. E. Gunn, and A. S. Szalay (Sept. 1999). AJ, 118, 1406
Macciò, A. V. et al. (Mar. 2010). MNRAS, 402, 1995
Mandelbaum, R. et al. (Aug. 2005). MNRAS, 361, 1287
Massey, R. et al. (Jan. 2010). MNRAS, 401, 371
McLean, I. S. (2008). 2nd ed. Praxis Publishing Ltd. ISBN: 978-3-540-76582-0
Miller, C. J. et al. (Dec. 2001). AJ, 122, 3492
Miyazaki, S. et al. (Dec. 2002). PASJ, 54, 833
Murata, K. L. et al. (May 2014). ApJ, 786, 15
Muzzin, A. et al. (May 2013). ApJS, 206, 8
Perret, B., S. Lefévre, and Ch. Collet (2009). Pattern Recognition, 42, 2470
Petrosian, V. (Oct. 1976). ApJ, 209, L1
Pohlen, M. and I. Trujillo (Aug. 2006). A&A, 454, 759
Presotto, V. et al. (May 2014). A&A, 565, 126
Ratnatunga, K. U. and E. B. Newell (Jan. 1984). AJ, 89, 176
Rix, H.-W. et al. (June 2004). ApJS, 152, 163
Scoville, N. et al. (Sept. 2007). ApJS, 172, 1
Sérsic, J. L. (1963). BAAA, 6, 41
Skelton, R. E. et al. (Oct. 2014). ApJS, 214, 24
Starck, J. and F. Murtagh (2006). 2nd ed. Springer. ISBN: 978-3-662-04908-2
Stetson, P. B. (Mar. 1987). PASP, 99, 191
Stoughton, C. et al. (Jan. 2002). AJ, 123, 485
van der Wel, A. et al. (Dec. 2012). ApJS, 203, 24
van Dokkum, P. G. (Nov. 2001). PASP, 113, 1420
van Dokkum, P. G. et al. (Jan. 2015). ApJ, 798, 45
Vincent, L. and P. Soille (June 1991). ITPAM, 13, 583
Vollmer, B. et al. (Feb. 2013). AJ, 145, 36
Williams, J. P., E. J. de Geus, and L. Blitz (June 1994). ApJ, 428, 693