Multiscale, multiwavelength extraction of sources and filaments using separation of the structural components: getsf
AAstronomy & Astrophysics manuscript no. getsf © ESO 2021March 2, 2021
Multiscale, multiwavelength extraction of sources and filamentsusing separation of the structural components: getsf
A. Men’shchikov
AIM, IRFU, CEA, CNRS, Université Paris-Saclay, Université Paris Diderot, Sorbonne Paris Cité, F-91191 Gif-sur-Yvette, Francee-mail: [email protected]
Received 16 November 2020 / Accepted 22 February 2021
ABSTRACT
High-quality astronomical images delivered by modern ground-based and space observatories demand adequate, reliable software fortheir analysis and accurate extraction of sources, filaments, and other structures, containing massive amounts of detailed informationabout the complex physical processes in space. The multiwavelength observations with highly variable angular resolutions acrosswavebands require extraction tools that preserve and use the invaluable high-resolution information. Complex fluctuating backgroundsand filamentary structures appear di ff erently on various scales, calling for multiscale approaches for complete and reliable extractionof sources and filaments. The availability of many extraction tools with varying qualities highlights the need to use standard modelbenchmarks for choosing the most reliable and accurate method for astrophysical research.This paper presents getsf , a new method for extracting sources and filaments in astronomical images using separation of their structuralcomponents, designed to handle multiwavelength sets of images and very complex filamentary backgrounds. The method spatiallydecomposes the original images and separates the structural components of sources and filaments from each other and from theirbackgrounds, flattening their resulting images. It spatially decomposes the flattened components, combines them over wavelengths,detects the positions of sources and skeletons of filaments, and measures the detected sources and filaments, creating the outputcatalogs and images. The fully automated method has a single user-defined parameter (per image), the maximum size of the structuresof interest to be extracted, that must be specified by users. This paper presents a realistic multiwavelength set of simulated benchmarkimages that can serve as the standard benchmark problem to evaluate qualities of source- and filament-extraction methods.This paper describes hires , an improved algorithm for the derivation of high-resolution surface densities from multiwavelength far-infrared Herschel images. The algorithm allows creating the surface densities with angular resolutions that reach 5 . (cid:48)(cid:48) when the70 µ m image is used. If the shortest-wavelength image is too noisy or cannot be used for other reasons, slightly lower resolutionsof 6 . − . (cid:48)(cid:48) are available from the 100 or 160 µ m images. These high resolutions are useful for detailed studies of the structuraldiversity in molecular clouds.The codes getsf and hires are illustrated by their applications to a variety of images obtained with ground-based and space telescopesfrom the X-ray domain to the millimeter wavelengths. Key words.
Stars: formation – Infrared: ISM – Submillimeter: ISM – Methods: data analysis – Techniques: image processing –Techniques: photometric
1. Introduction
Multiwavelength far-infrared and submillimeter dust continuumobservations with the large space telescopes
Spitzer , Herschel ,and
Planck in the past decades greatly increased the amount andimproved the quality of the available data in various areas of as-trophysical research. Observed images with di ff raction-limitedangular resolutions and high sensitivity reveal an impressive di-versity of the enormously complex structures, covering orders ofmagnitude in intensities and spatial scales. The images featureforemost the bright fluctuating backgrounds, omnipresent fila-ments, and huge numbers of sources of di ff erent physical nature,all blended with each other, whose appearance and resolution areoften markedly di ff erent at short and long wavelengths. The mas-sive amount of information that is coded in the fine structure ofthe observed images must contain clues to the complex physicalprocesses taking place in space, but these clues are extremelydi ffi cult to decipher. It is quite clear that the era of these high-quality data from space telescopes and large ground-based inter- Send o ff print requests to : Alexander Men’shchikov ferometers, such as the Atacama Large Millimeter / submillimeterArray ( ALMA ), requires more sophisticated tools for their accu-rate analysis and correct interpretation than those developed forthe lower-quality images of the past. Adequate extraction meth-ods must be explicitly designed for the multiwavelength imagingobservations with highly dissimilar angular resolutions acrosswavebands. They must also be able to handle the bright filamen-tary backgrounds that vary on all spatial scales, whose fluctua-tion levels di ff er by several orders of magnitude across the ob-served images.The source- and filament-extraction methods are growing innumbers. In the area of star formation, a new method was pub-lished every year or two within the seven-year period after thelaunch of Herschel . Rosolowsky et al. (2008) devised dendro-grams to describe the hierarchical structure of clumps observedin the data cubes from molecular line observations. The methodcarries out topological analysis of image structures by isophotalcontours at varying intensity levels and represents them graph-ically as a tree. Molinari et al. (2011) created cutex to extractsources in star-forming regions observed with
Herschel . Themethod analyzes multidirectional second derivatives of the ob-
Article number, page 1 of 37 a r X i v : . [ a s t r o - ph . I M ] M a r & A proofs: manuscript no. getsf served image to detect sources, and it measures them by fit-ting elliptical Gaussians on a planar background to their peaks.Men’shchikov et al. (2012) developed getsources , the multi-wavelength source extraction method for the
Herschel observa-tions of star-forming regions. The method spatially decomposesimages, combines them into wavelength-independent detectionimages, subtracts the backgrounds of detected sources, and mea-sures the sources, deblending them when they overlap. Kirk et al.(2013) presented csar for the
Herschel images. The method ana-lyzes areas of connected pixels that are bound by closed isopho-tal contours, descending to a predefined background level andpartitioning peaks at their lowest isolated contours into sources.Berry (2015) created fellwalker to identify clumps in submil-limeter data cubes. The method finds image peaks by tracingthe line of the steepest ascent and identifies sources as the hillwith the highest value found for all pixels in its neighborhood.Sousbie (2011) produced disperse to identify structures in thelarge-scale distribution of matter in the Universe. The methodapplies the computational topology to trace filaments and otherstructures. Men’shchikov (2013) developed getfilaments to im-prove the source extraction with getsources on the filamentarybackgrounds observed with
Herschel . The method separates fil-aments from sources in spatially decomposed images and sub-tracts them from the detection images, thereby reducing the rateof spurious sources. Schisano et al. (2014) devised a Hessianmatrix-based approach to extract filaments in
Herschel observa-tions of the Galactic plane. The method analyzes multidimen-sional second derivatives to identify filaments and determinetheir properties. Clark et al. (2014) presented rht to characterizefibers in the interstellar H i medium. The method has been ap-plied to various observations of di ff use H i , revealing alignmentof the fibers along magnetic fields. Koch & Rosolowsky (2015)published filfinder to identify filaments in the Herschel imagesof star-forming regions. the method applies a mathematical mor-phology approach to isolating filaments in observed images. Ju-vela (2016) presented tm to trace filaments in observed images.The method matches a predefined template (stencil) of an elon-gated structure at each pixel of an image by shifting and rotatingthe template and analyzing the parameters of the matches.These extraction methods all have several important issues.Sources and filaments are handled completely independently bythese methods, although numerous Herschel observations havedemonstrated that there is a tight physical relation between them.Most sources are found in filamentary structures, and the corre-sponding starless, prestellar, and protostellar cores are thoughtto form inside the structures that are created by dynamical pro-cesses, magnetic fields, and gravity within a molecular cloud. Allmajor structural components of the observed images, that is, thebackground cloud, filamentary structures, and sources, are heav-ily blended with each other; curved filaments are even blendedwith themselves. The degree of their blending increases at longerwavelengths with lower angular resolutions, which increases theinaccuracies in their detections, measurements, and interpreta-tions.Most of the extraction methods focus on detecting structures,whereas the most important and di ffi cult problem is measuringthem accurately. Numerous algorithms only partition the imagebetween sources and do not allow them to overlap, although de-blending of the mixed emission of the structural componentsis an indispensable property of an accurate extraction method.For best detection and measurement results, source-extractionmethods must be able to separate underlying filamentary struc-tures and filament-extraction methods must be able to separatesources. The existing source- and filament-extraction methods use completely di ff erent approaches, and the quality of their re-sults is expected to be very dissimilar.It seems unlikely that methods that are based on very di ff er-ent approaches would give consistent results in terms of detec-tion completeness, number of false-positive detections, and mea-surement accuracy. In practice, various methods do perform verydi ff erently, as can be shown quantitatively on simulated bench-marks for which the properties of all components are known.This highlights the need of systematic comparisons of di ff erentmethods in order to understand their qualities, inaccuracies, andbiases. The danger is real that numerous uncalibrated methodsare applied for the same type of star formation studies, whichwould give inconsistent, contradictory results and incorrect con-clusions. This would create serious, lasting problems for the sci-ence.Source- and filament-extraction methods are the criticallyimportant tools that must be calibrated and validated using astandard set of benchmark images with fully known properties ofall components before they are applied in astrophysical contexts.It would be desirable to use the same extraction tool to excludeany biases or dissimilarities that are caused by di ff erent meth-ods. If a new extraction method is to be used, it must be testedon standard benchmarks to ensure that its detection and measure-ment qualities are consistent or better. This approach is usuallypracticed within research consortia, but this does not solve theglobal problem that the results obtained from the same data bydi ff erent consortia or research groups using di ff erent tools maystill be a ff ected by the uncalibrated (or suboptimal) tools thatwere used.This paper presents getsf , a new multiwavelength methodfor extracting sources and filaments. It also describes a realisticsimulated benchmark, resembling the Herschel images of star-forming regions, which is used below to illustrate the methodand in a separate paper (Men’shchikov 2021, submitted) to quan-titatively evaluate its performance. The multiwavelength bench-mark simulates the images of a dense cloud with strong non-uniform fluctuations, a wide dense filament with a power-lawintensity profile, and hundreds of radiative transfer models ofstarless and protostellar cores with wide ranges of sizes, masses,and profiles. The simulated benchmark with fully known pa-rameters allows quantitative analyses of extraction results andconclusive comparisons of di ff erent methods by evaluating theirextraction completeness, reliability, and goodness, along withthe detection and measurement accuracies. The multiwavelengthimages can serve as the standard benchmark problem for othersource- and filament-extraction methods, allowing researchers toperform their own tests and choose the most reliable and accurateextraction method for their studies. Instead of publishing bench-marking results for some of the existing methods, it seems a bet-ter idea to provide researchers with the benchmark and a qual-ity evaluation system (Men’shchikov 2021, submitted) to enablecomparisons of the methods of their choice. In practice, this ap-proach of having own experience is much more convincing andit allows a consistent evaluation of newly developed methods.The new source and filament extraction method getsf rep-resents a major improvement over the previous algorithms get-sources , getfilaments , and getimages (Men’shchikov et al. 2012;Men’shchikov 2013, 2017, hereafter referred to as Papers I, II,and III); throughout this paper, the three predecessors are collec-tively referred to as getold . The new method (Fig. 1) consistentlyhandles two types of structures, sources and filaments, that areimportant for studies of star formation, separating the structural http://irfu.cea.fr/Pisp/alexander.menshchikov/ Article number, page 2 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
Decompose the original I λ in single-scale images I λ j , iterate ! λ j Resample H λ in I λ on a pixel ∆, create masks M λ , estimate maximum sizes X λ , Y λ Decompose the detection images S λD , F λD in single scales S λD j , F λD j , iterate ! λ j Remove from S λD j , F λD j structures fainter than 5 ! λ j , 2 ! λ j , obtaining clean S λD j C , F λD j C Combine S λD j C , F λD j C over λ into S D j C , F D j C , detect and catalog sources and filamentsMeasure sources and filaments, create their images, catalog profiles, sizes, fluxes, etc.Subtract the background, computing the sources component S λ = I λ − B λ X Remove from I λ j all sources with sizes < X λ , obtaining the X λ -scale background B λ X Remove from U λ j all structures with sizes < X λ , obtaining the X λ -scale background Q λ X Decompose the standard deviations U λ = sd O ( S λ ) in single-scale images U λ j , iterate ! λ j Median filter Q λ X , obtaining the flattening factor Q λ and detection image S λD = S λ / Q λ Subtract the background, computing the filaments component F λ = B λ X − B λ Y Remove from B λ Xj all filaments with sizes < Y λ , obtaining the Y λ -scale background B λ Y Decompose the X λ -scale background B λ X in single-scale images B λ Xj , iterate ! λ j Decompose the standard deviations V λ = sd O ( F λ ) in single-scale images V λ j , iterate ! λ j Remove from V λ j all structures with sizes < Y λ , obtaining the Y λ -scale background R λ Y Median filter R λ Y , obtaining the flattening factor R λ and detection image F λD = F λ / R λ Fig. 1.
Flowchart of the image processing steps in getsf . The colored blocks represent preparation ( purple ), background subtraction ( blue ), imageflattening ( green ), and extraction of sources and filaments ( red ). components from each other and from their backgrounds. Allmajor processing steps of getsf employ spatial decompositionof images into a number of finely spaced single-scale images tobetter isolate the contributions of structures with various widths.The method produces accurately flattened detection images withuniform levels of the residual background and noise fluctuations.To detect sources and filaments, getsf combines independent in-formation contained in the multiwaveband single-scale imagesof the structural components, preserving the higher angular res-olutions. Then getsf measures and catalogs the detected sourcesand filaments in their background-subtracted images. The fullyautomated method needs only one user-defined parameter, themaximum size of the structures of interest to extract, constrainedby users from the input images on the basis of their research in-terests.This work follows Papers I–III in advocating a clear distinc-tion between the words source and object , unlike many publi-cations in which it is implicitly assumed that the two are com-pletely equivalent. "Source" used in the context of the source ex-tractions and statistical analysis of their results, and "object" isonly used in the context of the physical interpretation of the ex-tracted sources. In this paper, the sources are defined as the emis-sion peaks (mostly unresolved) that are significantly strongerthan the local surrounding fluctuations, indicating the presenceof the physical objects in space that produced the observed emis-sion. The implicit assumption that an unresolved far-infraredsource on a complex fluctuating background contains emissionof just one single object is invalid in general. Too often, an emis-sion peak is actually a blend of many components, produced bydi ff erent physical entities. This is illustrated by the recent im- ages of the massive star-forming cloud W43-MM1 (Motte et al.2018), obtained with the ALMA interferometer. This object ap-pears as a single source in the
Herschel images, even with the5 . . (cid:48)(cid:48) resolutions at 70 and 160 µ m. However, the ALMA image (Sect. 4.8) displays a rich cluster of much smaller sourcesthat are unresolved or just slightly resolved even at the 0 . (cid:48)(cid:48) resolution.Section 2 describes the new multiwavelength benchmark forsource- and filament-extraction methods, resembling the Her-schel observations of star-forming regions. Section 3 presents getsf , the new source- and filament-extraction method, employ-ing separation of the structural components. Section 4 illus-trates the performance of getsf on a large variety of images thatwere obtained with di ff erent telescopes in a wide spectral range,from X-rays to millimeter wavelengths. Section 5 describes allstrengths and limitations of getsf . Section 6 presents a summaryof this work. Appendix A discusses inaccuracies of the surfacedensities and temperatures, derived by spectral fitting of the im-ages. Appendix B describes the single-scale spatial decomposi-tion that is used by getsf in its processing steps. Appendix Cgives details on the software.In this paper, images are represented by capital calligraphiccharacters (e.g., A , B , C ) and software names and numericalmethods are typeset slanted (e.g., getsf ) to distinguish them fromother emphasized words. The curly brackets {} are used to col-lectively refer to either of the characters, separated by verticallines. For example, { a | b } refers to a or b and { A | B } { a | b } c expandsto A { a | b } c or B { a | b } c , as well as to A ac , A bc , B ac , or B bc . Article number, page 3 of 37 & A proofs: manuscript no. getsf backgroundbackground background+filamentbackground+filament temperaturetemperature
Fig. 2.
Background surface densities ( D B , D C ) and average line-of-sight dust temperatures ( T C ) used to compute the simulated Herschel images C λ of the filamentary cloud from Eq. (4). Square-root color mapping.
2. Benchmark for source and filament extractions
Realistic multiwavelength, multicomponent images of a simu-lated star-forming region were computed to present getsf in thispaper and to compare its performance with the previous bench-mark that was used in Papers I and III. The benchmark imageswere created for all
Herschel wavebands (at λ of 70, 100, 160,250, 350, and 500 µ m). They consist of independent structuralcomponents: a background cloud B λ , a long filament F λ , roundsources S λ , and small-scale instrumental noise N λ : H λ = B λ + F λ + S λ + N λ = C λ + S λ + N λ , (1)where C λ = B λ + F λ is the emission intensity of the filamentarybackground. All simulated images were computed on a 2 (cid:48)(cid:48) pixelgrid with 2690 × . ◦ × . ◦ or 3 . D =
140 pc of the nearest star-forming regions (e.g.,those in Taurus or Ophiuchus).
An image of the background surface density was computed froma purely synthetic scale-free background D A (cf. Paper I), with N H ∼ . × to 5 × cm − that had uniform fluctuationsacross the entire image. To simulate complex astrophysical back-grounds with strongly nonuniform fluctuations (e.g., Könyveset al. 2015), D A was multiplied by a circular shape P with a ra-dial profile defined by Eq. (2) below (with Θ = (cid:48)(cid:48) and ζ = . × cm − was added to increase the mini-mum value. The surface densities of the resulting backgroundcloud image D B (Fig. 2) are 1 . × to 4 . × cm − andthe fluctuations di ff er by approximately two orders of magni-tude. The total mass of the cloud is M B = . × M (cid:12) .To simulate filamentary backgrounds, a long spiral filamentwas added to the background cloud D B . The spiral shape waschosen so that the filament occupied various areas of the cloudwith very di ff erent surface densities and to cause the filamentto be blended (with itself) to some extent. The spiral filamentimage D F has a crest value of N = cm − , a full width athalf-maximum (FWHM) W = . Herschel in star-forming regions (e.g.,Arzoumanian et al. 2011, 2019), N H ( θ ) = N (cid:16) + (2 /ζ −
1) ( θ/ Θ ) (cid:17) − ζ , (2) where θ is the angular distance, Θ is the structure half-width athalf-maximum, and ζ is a power-law exponent. With Θ = (cid:48)(cid:48) (or0 .
05 pc at D =
140 pc) and ζ = .
5, this Mo ff at (Plummer) func-tion approximates a Gaussian of 0 . N H ( θ ) ∝ θ − for θ (cid:29) Θ .The filament mass M F = . × M (cid:12) and length L F = . Λ F = M (cid:12) pc − . The resultingsurface densities D C = D B + D F of the filamentary cloud are inthe range of 1 . × to 1 . × cm − (Fig. 2), and its totalmass is M C = . × M (cid:12) .To approximate the nonuniform line-of-sight dust tempera-tures of the star-forming clouds observed with Herschel (e.g.,Men’shchikov et al. 2010; Arzoumanian et al. 2019), an imageof average line-of-sight temperatures was improvised as T C = (cid:16) − D C + (cid:17) − +
15 K . (3)The pixel values of the resulting temperature image T C rangebetween 15 K in the innermost areas of the filamentary cloudand 20 K in its outermost parts (Fig. 2). The temperatures fromEq. (3) were used to simulate the cloud images C λ in all Herschel wavebands, assuming optically thin dust emission: C ν = B ν ( T C ) D C κ ν ηµ m H , (4)where B ν is the blackbody intensity, κ ν is the dust opacity, η = .
01 is the dust-to-gas mass ratio, µ = . molecule, and m H is the hydrogen mass. Thedust opacity was parameterized as a power law κ ν = κ ( ν/ν ) β with κ = .
31 cm g − (per gram of dust), λ = µ m, and β = To populate the filamentary cloud with realistic sources, 156 ra-diative transfer models were computed by a numerical solutionof the dust continuum radiative transfer problem in spherical ge-ometry (using modust , Bouwman 2001). The models adoptedtabulated absorption opacities κ abs for dust grains with thin icemantles (Ossenkopf & Henning 1994), corresponding to a den-sity n H = cm − and coagulation time t = yr. The opacityvalues at λ > µ m were replaced with a power law κ λ ∝ λ − ,consistent with the parameterization used in Eq. (4).The models of three populations of starless cores and onepopulation of protostellar cores cover wide ranges of masses Article number, page 4 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf µ m µ m
19 74 168 298 466 670 µ m µ m µ m µ m Fig. 3.
Component of sources S λ that is composed of the images of radiative transfer models of 828 starless and 91 protostellar cores andconvolved to the Herschel resolutions O λ (cf. Sect. 1), shown at three selected wavelengths. Only the bright unresolved emission peaks of theprotostellar cores, clearly visible at 100 µ m, appear in the 70 µ m image (not shown). Square-root color mapping. (from 0 .
05 to 2 M (cid:12) ) and half-maximum sizes (from ∼ .
001 to0 . ρ ( r ) ∝ r − . Starless cores consist oflow-, medium-, and high-density subpopulations, following the M ∝ R relation for the isothermal Bonnor-Ebert spheres (with T BE = , ,
28 K) in the area of the mass-radius diagram occu-pied by prestellar cores observed in the Ophiuchus and Orionstar-forming regions (Motte et al. 1998, 2001).Both types of cores were embedded in background sphericalclouds with a uniform surface density of 3 × cm − and outerradius of 1 . × AU (1000 (cid:48)(cid:48) or 0 .
68 pc). In an isotropic inter-stellar radiation field (Black 1994) with the strength parameter G =
10 (e.g., Parravano et al. 2003), the embedding clouds ac-quired temperatures of T ≈
22 K at their edges, consistent withthe highest values of T C from Eq. (3). The embedding cloudslowered T ( r ) toward the interiors of both starless and protostel-lar cores. Accreting protostars in the centers of the protostellarcores, however, produced luminosity L A ∝ M and thus sharplypeaked temperature distributions deeper in their central parts. Individual surface density images of the models of 828 starlessand 91 protostellar cores were distributed in the dense areas( N H ≥ × cm − ) of the filamentary cloud D C . They wereadded quasi-randomly, without overlapping, to the D C imageat positions, where their peak surface density exceeded that ofthe cloud N H value. An initial mass function (IMF)-like bro-ken power-law mass function with a slope d N / d M of − . M ≤ . M (cid:12) and − . M > . M (cid:12) was used to determine thenumbers of models per mass bin δ log M ≈ . D S , theintensities S λ of sources (Fig. 3), and in the complete simulatedimages C λ + S λ .The final simulated Herschel images H λ from Eq. (1) of themodeled star-forming region were obtained by adding di ff erentrealizations of the random Gaussian noise N λ at 70, 100, 160,250, 350, and 500 µ m and convolving the resulting images to theangular resolutions O λ of 8 .
4, 9 .
4, 13 .
5, 18 .
2, 24 .
9, and 36 . (cid:48)(cid:48) ,respectively (Fig. 4). The resulting images H λ have σ noise lev-els of 6, 6, 5 .
5, 2 .
5, 1 .
2, and 0 . − , resembling the actual noise measured in the Herschel images of the Rosette molecularcomplex (Motte et al. 2010).
3. Source- and filament-extraction method
The main processing steps of getsf are outlined in Fig. 1,where several major blocks of the algorithm are highlighted.The method may be summarized as follows: (1) preparation ofa complete set of images for an extraction, (2) separation of thestructural components of sources and filaments from their back-grounds, (3) flattening of the residual noise and background fluc-tuations in the images of sources and filaments, (4) combinationof the flattened components of sources and filaments over se-lected wavebands, (5) detection of sources and filaments in thecombined images of the components, and (6) measurements ofthe properties of the detected sources and filaments.Like its predecessors, getsf has just a single, user-definableparameter: the maximum size (width) of the structures of inter-est to extract. Internal parameters of getsf have been carefullycalibrated and verified in numerous tests using large numbers ofdiverse images (both simulated and real-life observed images)to ensure that getsf works in all cases. This approach rests onthe conviction that high-quality extraction methods for scientificapplications must not depend on the human factor. It is the re-sponsibility of the creator of a numerical method to make it asgeneral as possible and to minimize the number of free param-eters as much as possible. An internal multidimensional param-eter space of complex numerical tools must never be delegatedto the end user to explore if the aim is to obtain consistent andreliable scientific results.
The multiwavelength extraction methods must be able to use allavailable information contained in the observed images acrossvarious wavebands with di ff erent angular resolutions. It is usu-ally beneficial to collect all available images for a specific regionof the sky under study. Article number, page 5 of 37 & A proofs: manuscript no. getsf
275 410 636 950 1356 1850 µ m µ m
326 766 1500 2522 3844 5450 µ m µ m
60 166 342 587 905 1290 µ m µ m Fig. 4.
Images H λ of the simulated star-forming region, defined by Eq. (1), shown at three selected wavelengths. The benchmark images are asuperposition of four structural components: the background B λ , the filament F λ , the sources S λ , and the noise N λ . Two simpler variants of thisbenchmark are also available: without the filament and without the background. Square-root color mapping. To prepare multiwavelength H λ for processing with getsf , it isnecessary to convert them into the images I λ , all on the samegrid of pixels. To this end, getsf resamples all images (using swarp , Bertin et al. 2002) on a pixel size, chosen to be optimalfor the highest-resolution images available. It is very importantto carefully verify alignment of the resampled I λ and correct it(if necessary) to ensure that all unresolved intensity peaks re-main on the same pixel across all wavebands. To reveal possiblemisalignments, it is su ffi cient to open each pair of prepared im-ages in ds9 (Joye & Mandel 2003) and blink the two frames,going from the highest-resolution to the lowest-resolution im-ages.Most astronomical images have irregularly shaped coverageand limited usable areas that di ff er between wavebands. To in-clude only the “good” parts of the I λ coverage in the image pro-cessing, it is necessary to create masks M λ (with pixel values 1or 0). With these masks, getsf can process only the good areasof I λ that have a mask value of 1. To facilitate the image prepa-ration, getsf always creates default masks M λ =
1. However, formost real observations, the masks must be prepared very care-fully and independently for each image. To manually create themasks, one can use imagej (Abràmo ff et al. 2004) or gimp thatallows users to create a polygon over an image, convert the poly-gon into a mask, and save it in the FITS format. The multiwavelength far-infrared
Herschel images open the pos-sibility of computing maps of surface density and dust temper-ature by fitting the spectral shapes Π λ of the image pixels. Thestandard procedure assumes that (1) the original images repre-sent optically thin thermal emission of dust grains with a power-law opacity κ λ ∝ λ − β and a constant β value, (2) the dust temper-ature is constant along the lines of sight passing through eachpixel of the images, and (3) the lines of sight are not contami-nated by unrelated radiation at either end, in front of the observedstructures and behind them. Unfortunately, one or more of theassumptions are likely to be invalid, especially the stipulation ofthe opacity law and the constant line-of-sight temperatures (e.g., Men’shchikov 2016). The values of the derived surface densitiesand temperatures therefore must be considered as fairly unreli-able and implying large error bars.When we assume that the observations include the
Herschel images, the spectral shapes Π λ of each pixel can be fit at severalwavelengths (160 − µ m) and resolutions (18 . − . (cid:48)(cid:48) ), whichresults in three sets of surface densities and dust temperatures.The highest-resolution derived images are the least reliable be-cause they are obtained from fitting only two images (at 160 and250 µ m), whereas the lowest-resolution maps are the most accu-rate because they come from fitting four independent images (at160, 250, 350, and 500 µ m).In an attempt to combine the higher accuracy of the lower-resolution images with the higher angular resolutions of the lessaccurate images, Palmeirim et al. (2013) published a simple al-gorithm that uses complementary spatial information containedin the observed images to create a surface density image with theresolution O P = . (cid:48)(cid:48) of the 250 µ m image. When this approachis extended to temperatures, the sharper images can be com-puted by adding the higher-resolution information to the low-resolution images as di ff erential terms, {D|T } P = {D|T } + δ {D|T } + δ {D|T } , (5)where the base surface density and temperature {D|T } are de-rived by fitting the 160, 250, 350, and 500 µ m images at the low-est resolution O = . (cid:48)(cid:48) . The additional terms, containing thehigher-resolution contributions, are produced by unsharp mask-ing, δ {D|T } { | } = {D|T } { | } − G { | } ∗ {D|T } { | } , (6)where {D|T } are computed by fitting the three images at 160,250, and 350 µ m at the resolution O = . (cid:48)(cid:48) , and {D|T } areobtained by fitting the two images at 160 and 250 µ m at the res-olution O = . (cid:48)(cid:48) ; the Gaussian kernels G { | } convolve theimages to the next lower resolutions O { | } .The following generalization of the above algorithm allowsderiving surface densities and temperatures with any (arbitrarilyhigh) angular resolution existing among the observed I λ . Thethree independently derived maps of temperatures T { | | } withthe resolutions of 18 . − . (cid:48)(cid:48) and six observed Herschel imageswith their native resolutions O λ of 8 . − . (cid:48)(cid:48) define 18 surface Article number, page 6 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
D 13.5" trueD 13.5" true
D 13.5"D 13.5"
15 16 18 19 20 21
T 13.5"T 13.5"
Fig. 5.
Derived surface densities and temperatures (Sect. 3.1.2). The true model image D C + D S and the hires surface density D (cid:48)(cid:48) and temperature T (cid:48)(cid:48) derived from Eq. (8) with λ H = µ m ( O H = . (cid:48)(cid:48) ) are shown. Many of the sources, clearly visible in the true image ( left ), are not discerniblein the derived surface density ( middle ) because of the inaccuracies in the temperatures from fitting spectral shapes Π λ . Square-root color mapping,except the right panel with linear mapping. densities, D O λ { | | } = I ν B ν ( T { | | } ) κ ν ηµ m H , (7)with the assumptions and parameterizations of Eq. (4). It is re-quired that the resolution of temperatures must not be higherthan O λ , which excludes D O and D O { | } from the algorithmand provides 15 independently computed variants of the surfacedensities of the observed region, with di ff erent resolutions. Thehigh-resolution surface density image is computed as D O H = D O + (cid:88) λ = λ H max (cid:0) δ D O λ , δ D O λ , δ D O λ (cid:1) , (8)where λ H denotes the wavelength of the image I λ H with the de-sired angular resolution O H ≡ O λ H and the di ff erential terms withhigher-resolution information are obtained by the same unsharpmasking, δ D O λ { | | } = D O λ { | | } − G O λ + ∗ D O λ { | | } , (9)where G O λ + is the Gaussian kernel (regarded as the delta functionat 500 µ m), convolving D O λ { | | } to a lower resolution of the nextlonger wavelength. For images at λ < µ m, only the positivevalues of δ D O λ { | | } are used in Eq. (8) to circumvent the prob-lem of creating artificial depressions and negative pixels aroundstrong peaks due to the resolution mismatch ( O λ < O ) between I λ and the lower-resolution T { | | } in Eq. (7).The problem is caused by the sharp radial temperature gra-dients toward the unresolved centers of protostellar cores (cf.Men’shchikov 2016). They are smeared out by the low resolu-tions of 18 . − . (cid:48)(cid:48) , hence the fitting of Π λ leads to underesti-mated temperatures T { | | } and overestimated values (within anorder of magnitude) of peak surface densities at higher resolu-tions O λ < O . This means that unsharp masking of the overes-timated peaks could create negative annuli in δ D O λ { | | } and neg-ative pixels in D O H . Fortunately, the surface densities are quiteaccurate outside the unresolved peaks (Appendix A).A slight modification of Eq. (8) allows deriving the high-resolution surface densities with an enhanced contrast of all un- resolved or slightly resolved structures, D + O H = D O H + (cid:88) λ = λ H (cid:88) n = max (cid:0) δ D O λ n , (cid:1) , (10)where the positive parts of the di ff erential high-resolution termsfrom Eq. (9) are added to D O H . These high-contrast images maybe useful for detection of unresolved structures because the latterare usually diluted by the observations with insu ffi cient angularresolution; a higher contrast improves their visibility.A high-resolution temperature T O H , consistent with the high-resolution surface density D O H , is computed by numerically in-verting the Planck function, T O H = B − ν H (cid:32) I ν H D O H κ ν H ηµ m H (cid:33) , (11)with ν H = c λ − , where c is the speed of light. The high-resolutionimages {D|T } (cid:48)(cid:48) are shown in Fig. 5 along with the true sim-ulated D C + D S (Sect. 2.3). A comparison demonstrates thatthe pixel-fitting procedure reduces visibility of many unresolvedor slightly resolved starless cores, which is the manifestationof the invalid assumption of the uniform line-of-sight tempera-tures. Starless (prestellar) cores have lower temperatures in theircenters, and their smearing by an insu ffi cient resolution leadsto overestimated temperatures that suppress the surface densitypeaks (cf. Fig. A.1).The new hires algorithm, outlined by Eqs. (7)–(11), bringsthe benefits of a resolution O H ≈ (cid:48)(cid:48) , twice better than O P andfour times better than O , if the image quality at the short-est wavelengths permits this. Moreover, the angular resolutionsof the Herschel images at 70, 100, and 160 µ m, obtained with aslow scanning speed of 20 (cid:48)(cid:48) s − , are even higher: 6, 7, and 11 (cid:48)(cid:48) , re-spectively. These observations, illustrated in Fig. 6, allow deriv-ing the surface densities and temperatures with O H ≈ (cid:48)(cid:48) , a threetimes better resolution than when using Eq. (5). If the 70 µ mimage is too noisy or there is evidence of its strong contami-nation by emission unrelated to that of the adopted dust grains(e.g., polycyclic aromatic hydrocarbons or transiently heatedvery small dust grains), then the derived images may still havethe 7 − (cid:48)(cid:48) resolution of the 100 or 160 µ m wavebands. In ad-dition to the high resolution, the images from Eq. (8) also have Article number, page 7 of 37 & A proofs: manuscript no. getsf
D 18.2"D 18.2"
D 5.9"D 5.9"
D 5.9" high contrastD 5.9" high contrast d 18.2"d 18.2" d 11.7"d 11.7" d 5.9"d 5.9"
Fig. 6.
High-resolution surface densities obtained for the
Herschel images of Cygnus X (HOBYS project, Motte et al. 2010; Hennemann et al.2012, Bontemps et al., in prep.). The top row shows the hires surface densities D O H from Eq. (8) with O H = . (cid:48)(cid:48) and 5 . (cid:48)(cid:48) resolutions, and thehigh-contrast D O H + from Eq. (10) with O H = . (cid:48)(cid:48) . The bottom row displays the relative di ff erences of D (cid:48)(cid:48) , D (cid:48)(cid:48) , and D (cid:48)(cid:48) with respect to the nextlower-resolution surface densities D (cid:48)(cid:48) , D (cid:48)(cid:48) , and D (cid:48)(cid:48) , respectively. Logarithmic and square-root color mapping in the top and bottom rows,correspondingly. a better quality than those from Eq. (5) because they accumu-late all available high-resolution information from the (up to 15)independently computed images D O λ { | | } that use all three tem-peratures T { | | } with each original I λ .The hires algorithm works with any number 2 ≤ N ≤ Herschel wavebands. If the 160 µ m image is unavailable or dis-abled, then the temperature T at the resolution O is re-moved from Eq. (7) and T { | } at the resolutions O { | } areobtained from fitting of only the 250, 350, and 500 µ m im-ages. If the 250 µ m image is also unavailable or disabled, thenonly the single temperature T at the lowest resolution O re-mains in Eq. (7), obtained from the 350 and 500 µ m images. Al-though the algorithm is una ff ected by the changes, the reduc-tion in the number of independently derived temperatures wouldlower the angular resolution and accuracy of the resulting sur-face density image. The improved algorithm can use the realistic,non-Gaussian point-spread functions (PSF) published by Anianoet al. (2011). However, the surface densities are largely deter-mined by the SPIRE bands with nearly Gaussian PSFs, whereasonly the PACS 160 µ m band is used in the pixel fitting. Thebenchmark tests have shown that e ff ects of the realistic PSFs onsurface densities are very small, at percent levels, much smallerthan the general uncertainties of the pixel-fitting methods (Ap- pendix A). It is therefore su ffi cient to use the Gaussian PSFswhen surface densities are derived.For some studies, it might be useful to have all images at thesame wavelength-independent angular resolution. With the high-resolution surface densities and temperatures from Eqs. (8) and(11), it is straightforward to obtain such images: J ν O H = B ν ( T O H ) D O H κ ν ηµ m H , (12)with the assumptions and parameterizations of Eq. (4). For ex-ample, the intensities J λ (cid:48)(cid:48) at 250, 350, and 500 µ m would besharper than I λ by the factors 1 .
3, 1 .
8, and 2 .
7, respectively.When the available original set of images I λ allows creationof D O H , it is advantageous to have it complement the originaldata set, handling it as an image I (cid:111) “observed” in a fictitiouswaveband (cid:111) . In the multiwavelength extractions with getsf , itmay be recommended to use D O H for better detections and de-blending of dense structures. The surface densities are not ac-curate enough for source measurements, as demonstrated in Ap-pendix A and Men’shchikov (2016).The following presentation and discussion of getsf implicitlyassumes that the additional detection images are contained in theset of images I λ . In other words, all supplementary wavebandsare included in the set of λ prepared for extraction. The latter was Article number, page 8 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf done for a multiwavelength data set that included all images inthe
Herschel wavebands (Fig. 4) and the high-resolution surfacedensity I (cid:111) ≡ D (cid:48)(cid:48) (Fig. 5), a total of N W = Before starting any extraction with getsf , it is necessary to for-mulate the aim of the study and determine what structures ofinterest are to be extracted. The method knows and is able toseparate three types of structures: sources, filaments, and back-grounds. To separate the structural components with getsf , themaximum size { X | Y } λ of the sources ( X λ ) and filaments ( Y λ ) ofinterest needs to be manually (visually) estimated from the pre-pared I λ independently for each waveband, which can be accom-plished by opening an image in ds9 and placing a circular regionfully covering the width of the largest structure. The maximumsize of structures is the single physical parameter that the methodneeds to know for each observed image. Being a function of thetype of structures (sources, filaments) and the waveband λ , it issplit into X λ and Y λ in this paper for convenience.The maximum size { X | Y } λ is defined as the footprint radius(in arcsec) of the largest source and the widest filament to be ex-tracted. A footprint size has the meaning of a full width at zero(background) level: the largest two-sided extent from a sourcepeak or filament crest at which this structure is still visible in I λ against its background. For a Gaussian intensity distribution, thefootprint radius is slightly larger than the half-maximum width H λ of a structure. For a power-law intensity profile, the foot-print radius may become much larger than H λ . If the widestfilaments of interest are blended (overlapping each other withtheir footprints), Y λ must be increased accordingly to approxi-mate the full extent of the blend. In contrast, it is not necessaryto adjust X λ for blended sources because their final backgroundwill be determined from their footprints at the measurement step(Sect. 3.4.6).It is not necessary (also not possible) to evaluate the max-imum size parameter { X | Y } λ very precisely, a 50% accuracy isquite su ffi cient. Its purpose is to set a reasonable limit to the spa-tial scales when separating the structural components, and to thesize of the structures to be measured and cataloged. The methodworks with spatially decomposed images, and it needs to knowthe maximum scale. It makes no sense to perform the decom-position up to a very large scale if the extraction is aimed atmuch smaller sources or narrower filaments. The method has nolimitations with respect to the sizes (widths) of the structures toextract. However, it is important to avoid detecting, measuring,and cataloging the peaks that are unnecessarily too wide becausethey would likely overlap with other sources of interest, whichpotentially would make their measurements less accurate.To extract all structures in the benchmark images presentedin Sect. 2, the estimated X λ values for sources are 16, 25, 30, 150,150, and 150 (cid:48)(cid:48) , whereas the estimated Y λ values for the filamentare 350 (cid:48)(cid:48) in all six Herschel wavebands (Fig. 4); in the additionalsurface density image I (cid:111) ≡ D (cid:48)(cid:48) (Fig. 5), the { X | Y } (cid:111) values arethe same as those for the 250 − µ m images. Complex observed images may be radically simplified by sub-tracting backgrounds on spatial scales much larger than the max-imum size { X | Y } λ . The independent largest sizes for sources andfilaments e ff ectively define two di ff erent backgrounds for thetwo scales. The X λ -scale background B λ X is derived to separate the component of sources S λ , whereas the Y λ -scale background B λ Y is obtained to separate the component of filaments F λ . Thebackgrounds are collectively referred to as B λ { X | Y } .The true background under the observed structures is funda-mentally unknown, and it is a major source of large uncertaintiesand measurement inaccuracies, especially for the faintest struc-tures. In practice, the backgrounds B λ { X | Y } are defined in getsf asthe smooth intensity distributions on spatial scales S j larger than4 { X | Y } λ that remain in I λ after a complete removal of all sourcesor filaments with the maximum size of { X | Y } λ . In contrast tothe background derivation by median filtering ( getimages , PaperIII), which may become extremely slow for very large imagesand wide structures, getsf employs a more direct, precise, ande ff ective clipping algorithm to separate the structures. In general, observed images are very complex blends of vari-ous structural components on di ff erent spatial scales, and greatadvantages are obtained when a spatial decomposition is usedto simplify the images (cf. Papers I and II). Following the get-old approach, getsf employs successive unsharp masking (Ap-pendix B) to decompose the original images I λ into a set ofsingle-scale images I λ j (Fig. 7). It also uses an iterative algo-rithm (Appendix B) to determine a single-scale standard devia-tion σ λ j , as well as its total value σ λ , which are used to separatethe structural components present in I λ . The backgrounds B λ { X | Y } are computed by cutting small roundpeaks and elongated structures o ff the decomposed images I λ j and recovering the full images using Eq. (B.4). It is importantto note that the appearance of the structures in the decomposedimages depends on both the spatial scale and intensity level.To remove the structural components, getsf slices I λ j by anumber N L of intensity levels I λ jl , spaced by δ ln I λ j = .
05 fromthe image maximum down to σ λ j for sources and to 0 . σ λ j forfilaments. Each slice l cuts through all the structures present in I λ j on that intensity level, producing various shapes of connectedpixels, I λ jl = min (cid:16) max (cid:16) I λ j , I λ jl (cid:17) , I λ jl (cid:17) , l = , , . . . , N L . (13)Relatively round source-like peaks in I λ j may be e ff ectivelydistinguished from elongated structures by the number of con-nected pixels N λ jl that their shapes occupy in the slice I λ jl (cf.Papers I and II). The single-scale images indeed most clearlyshow the structures with matching sizes ( H λ ≈ S j ), whereas thesignals from much narrower and much wider structures are sup-pressed. As a consequence, the source-like shapes occupy rela-tively small areas of connected pixels in I λ jl that are comparableto the area π S j of the convolution kernel G j . In contrast to theround peaks, elongated shapes in I λ jl have greater lengths L λ than widths W λ ≈ S j , which means that the filamentary shapes inslices I λ jl extend over much larger areas than π S j .In addition to N λ jl , getsf uses two more quantities to discrim-inate between sources and filaments: elongation E λ jl and sparsity S λ jl . They are defined by the major and minor sizes ( a λ jl and b λ jl )of each cluster of connected pixels, obtained from intensity mo-ments (cf. Appendix F in Paper I), E λ jl ≡ a λ jl b λ jl , S λ jl ≡ π a λ jl b λ jl N λ jl ∆ , (14) Article number, page 9 of 37 & A proofs: manuscript no. getsf
D 13.5"D 13.5" -1.68e+20 2.67e+20 7.01e+20 1.13e+21 1.57e+21 2.00e+21 -1.68e+20 2.67e+20 7.01e+20 1.13e+21 1.57e+21 2.00e+21 -1.68e+20 2.67e+20 7.01e+20 1.13e+21 1.57e+21 2.00e+21 -1.68e+20 2.67e+20 7.01e+20 1.13e+21 1.57e+21 2.00e+21 >1400">1400"
Fig. 7.
Spatial decomposition (Sect. 3.2.1, Appendix B) for I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8) in single scales between 4 and 1400 (cid:48)(cid:48) . The original hires surface density ( top left ) and decomposed I (cid:111) j on selected scales S j that di ff er by a factor of 4 are plotted. The remaining largest scales G N S ∗ I (cid:111) ( bottom right ) are outside the decomposition range. Linear color mapping. where ∆ is the pixel size. Only simple and relatively straight fila-mentary shapes can be identified in I λ jl by their elongation. Mostof the actually observed filaments in space are shaped quite ir-regularly on di ff erent scales and intensity levels. The elongation E λ jl alone cannot be used to quantify strongly curved, not verydense clusters of connected pixels that meander around (e.g., aspiral structure). Although E λ jl may well be close to unity forsparse shapes, high values of S λ jl for these structures would in-dicate that they do not belong to sources.The structural components are separated in single scales I λ j using the three quantities described above. The shapes producedby sources in a slice I λ jl are not very elongated, not very sparse,and not very large. In contrast, the shapes produced by filamentsin a slice I λ jl are elongated or sparse. Hence, these definitionsfor the source-like and filament-like shapes are written as E λ jl ≤ . ∧ S λ jl ≤ . ∧ N λ jl ≤ π (cid:16) ξ λ j S j (cid:17) ∆ − , E λ jl > . ∨ S λ jl > . , (15)where the limiting values of elongation and sparsity were deter-mined empirically from numerous benchmark extractions. The ξ λ j factor accounts for the fact that the area of a decomposed un-resolved peak increases nonlinearly toward the smallest spatialscales S j < ∼ O λ . The factor may be determined empirically by de-composing an unresolved peak P in single scales P j (Fig. B.1) and finding the distances θ , where the one-dimensional profile P j ( θ ) through the peak has d P j / d θ = P j < ξ λ j = . (cid:16) O λ S − j (cid:17) . + . . (16)The ξ λ j factor ensures that N λ jl has appropriate values and thatsingle-scale peaks are clipped cleanly on all spatial scales.Various shapes formed by connected pixels are identified andanalyzed in each single-scale slice using the tintfill algorithm(Smith 1979) , previously employed by getold to detect sourcesand filaments (Papers I and II). Deriving the background B λ X of sources, getsf decomposes I λ and removes all source-likeshapes from I λ jl , according to their definition in Eq. (15), in aniterative procedure (Sect. 3.2.3). Deriving the background B λ Y of filaments, getsf decomposes B λ X and removes all filament-like shapes from B λ X jl , according to their definitions in Eq. (15),in the same iterative procedure. The shapes are erased from eachslice l by setting all their pixels to zero. When we denote with B λ { X | Y } jl C either of the single-scale back-ground slices B λ X jl C or B λ Y jl C after the shape removal, the back-grounds on scale j are reassembled from the clipped slices as http://portal.acm.org/citation.cfm?id=800249.807456 Article number, page 10 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf backgroundbackground XX bg-subtractedbg-subtracted -0.4 -0.2 0.00059 0.2 0.4 0.6 background errorbackground error XX backgroundbackground YY bg-subtractedbg-subtracted -0.4 -0.2 0.00059 0.2 0.4 0.6 background errorbackground error YY Fig. 8.
Background derivation (Sect. 3.2) for I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8). The left panels show the backgrounds B (cid:111) X and B (cid:111) Y , obtained using theprocedure described by Eqs. (20)–(22). The middle panels show the corresponding background-subtracted S (cid:111) and F (cid:111) from Eq. (23). The right panels show the relative errors of B (cid:111) X and B (cid:111) Y with respect to the true model backgrounds D C and D B (Fig. 2), convolved to the same resolution.The filament is heavily blended with itself in the central area, therefore its background is systematically underestimated there ( lower right ).Square-root color mapping, except in the right panels, which show linear mapping. B λ { X | Y } j C = N L (cid:88) l = B λ { X | Y } jl C . (17)To properly reconstruct the complete backgrounds B λ { X | Y } from B λ { X | Y } j C , it is not su ffi cient to just sum them over scales. Thesingle-scale processing scheme requires that it must be done in-directly in several steps by reconstructing the complete imagesof sources and filaments.In the first step, getsf recomputes the single-scale sourcesand filaments that have been clipped, removing all negative val-ues from the reassembled single-scale backgrounds, {S|F } λ j = {I λ |B λ X } − max (cid:16) B λ { X | Y } j C , (cid:17) . (18)In the second step, getsf computes the full images of the sourcesand filaments over all scales, recursively summing the clippedstructures from the largest to the smallest scales and removingall negative values from each partial sum, {S|F } λ = max (cid:16) {S|F } λ + {S|F } λ j , (cid:17) , j = J λ { X | Y } , . . . , , , (19)where J λ { X | Y } is the number of the largest spatial scales 4 { X | Y } λ for the backgrounds B λ { X | Y } and the initial value of the recursive sum is set to zero. The complete backgrounds are obtained bysubtracting the structures from the original images, B λ { X | Y } = {I λ |B λ X } − {S|F } λ . (20)The initial backgrounds in Eq. (20) are only the first approx-imations because they contain substantial residual contributionsfrom the original structures. It is straightforward to define iter-ations to improve the backgrounds by decomposing them andclipping the residual shapes from each single scale. The algo-rithm described by Eqs. (17)–(20) remains the same, with twosubstitutions, {I λ |B λ X } i ← B i − λ { X | Y } , i = , , . . . , N I , (21)where N I is the number of iterations. Each successive iterationreduces contributions of the residual structures and improves thebackgrounds until corrections in all pixels become small com-pared to the originals, δ B i λ { X | Y } < .
003 ( {I λ |B λ X } + σ λ ) , (22)where the additional term helps avoid unnecessary iterations inrare cases when the images contain extremely faint pixels. Article number, page 11 of 37 & A proofs: manuscript no. getsf originaloriginal bg-subtractedbg-subtracted std deviationsstd deviations flattening imageflattening image bg-subtracted (flat)bg-subtracted (flat) std deviations (flat)std deviations (flat)
Fig. 9.
Flattening for the component S (cid:111) (Sect. 3.3) for I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8). The top row shows the original I (cid:111) , the background-subtracted S (cid:111) from Eq. (23), and the standard deviations U (cid:111) from Eq. (24). The bottom row shows the flattening image Q (cid:111) , the flat sources S (cid:111) D from Eq. (27),and its standard deviations sd O λ ( S λ R Q − λ ) that are much flatter (outside the sources) across the image. Square-root color mapping, except in the right panels, which show logarithmic mapping. The final background-subtracted structural components arecomputed as {S|F } λ = {I λ |B λ X } − B λ { X | Y } . (23)The original images can be recovered by summing the three sep-arated components: I λ = S λ + F λ + B λ Y . The positive parts of thesmall-scale background fluctuations and instrumental noise arecontained in the component S λ , hence the component F λ appearsfairly smooth (Fig. 8). Observations demonstrate that the levels of the large-scale back-grounds and their smaller-scale fluctuations often di ff er by or-ders of magnitude in various parts of large images. Although thesubtraction of the smooth backgrounds B λ { X | Y } greatly simplifiesthe original images, it does not reduce the strong variations ofthe smaller-scale fluctuation levels across {S|F } λ . As a conse-quence, many structures detected with global thresholds in theareas of stronger fluctuations may actually be spurious and un-related to any real physical objects. On the other hand, faint realstructures in the image areas with the lower levels of fluctua-tions may escape detection because the global threshold valueis likely to be overestimated for those areas. To produce com- plete and reliable extractions using constant thresholds, it is nec-essary to make small-scale fluctuations uniform over the entirebackground-subtracted images.The fluctuation levels are equalized using flattening images Q λ and R λ that are derived by getsf from the images U λ and V λ of the standard deviations computed in the structural componentswith a circular sliding window of a radius O λ , {U|V} λ = sd O λ ( {S|F } λ R ) , (24)where S λ R and F λ R are the regularized images S λ and F λ , ob-tained using a smoother version of their backgrounds that ismedian-filtered using a sliding window of a radius 2 O λ and con-volved with a Gaussian kernel of a half-maximum size O λ , {S|F } λ R = {I λ |B λ X } − G O λ ∗ mf O λ (cid:0) B λ { X | Y } (cid:1) . (25)This is done to improve the quality of {U|V} λ for further pro-cessing because the structural components from by Eq. (23) arepositively defined and have large areas of zero pixels. The regu-larized components in Eq. (25) acquire small-scale fluctuationsresembling the background and noise fluctuations of the originalimages. Article number, page 12 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf originaloriginal bg-subtractedbg-subtracted std deviationsstd deviations flattening imageflattening image
54 217 488 867 1356 1950 bg-subtracted (flat)bg-subtracted (flat) std deviations (flat)std deviations (flat)
Fig. 10.
Flattening for the component F (cid:111) (Sect. 3.3) for I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8). The top row shows the original I (cid:111) , the background-subtracted F (cid:111) from Eq. (23), and the standard deviations V (cid:111) from Eq. (24). The bottom row shows the flattening image R (cid:111) , the flat filaments F (cid:111) D from Eq. (27)and its standard deviations sd O λ ( F λ R R − λ ) , which are much flatter (outside the filament) across the image. Square-root color mapping, except in the right panels, which show logarithmic mapping. The advantages of the spatial decomposition (Appendix B) applyalso to the standard deviations {U|V} λ . The getsf method pro-duces the single-scales {U|V} λ j and employs the same iterativealgorithm (Appendix B) to determine the single-scale standarddeviation σ λ j and its total value σ λ . This is done using the sameprocedure as was applied to I λ in Sect. 3.2.1. The {U|V} λ images sample local fluctuations and intensity gra-dients, revealing all sources and filaments present in I λ (Figs. 9,10). To produce the corresponding flattening images, it is nec-essary to remove all such features from {U|V} λ , hence to deter-mine their { X | Y } λ -scale backgrounds. Deriving the latter, getsf creates single-scale slices {U|V} λ jl , in a complete analogy with I λ jl in Sect. 3.2.2, and clips from them all source- and filament-like shapes according to their definitions in Eq. (15). The recon-structed backgrounds Q λ X and R λ Y are computed using the iter-ative algorithm described in Sect. 3.2.3, with the largest spatialscale set to 2 . { X | Y } λ .When the background iterations converge, numerous sharpcraters remain in the derived backgrounds {Q|R} λ { X | Y } that could create spurious structures if the images were used to flatten thestructural components. To avoid this, the final flattening images Q λ and R λ (Figs. 9, 10) are obtained by median filtering the back-ground in circular sliding windows of radii 2 O λ and 5 O λ , respec-tively, {Q|R} λ = mf { | } O λ (cid:0) {Q|R} λ { X | Y } (cid:1) . (26)This important step ensures that flattening would never producespurious structures in the detection images. The detection images of the separated structural componentsare used to identify peaks of the sources and skeletons of thefilaments, respectively (Sect. 3.4). Both source- and filament-detection images are flattened, that is, divided by the flatteningimages, {S|F } λ D = {S|F } λ {Q|R} λ . (27)The standard deviations sd O λ ( {S|F } λ R {Q|R} − λ ) in the regularizedflattened components demonstrate that the detection images areremarkably flat outside the structures, as shown in Figs. 9 and Article number, page 13 of 37 & A proofs: manuscript no. getsf
10. This ensures an accurate separation of significant structuresfrom the fainter background and noise fluctuations during thesubsequent extraction of sources and filaments.
To extract sources and filaments means to detect them and mea-sure their properties. The background subtraction and flatteningalgorithms presented in Sects. 3.2 and 3.3 radically simplify theoriginals I λ , separating two distinct structural components andcreating the independent flat detection images {S|F } λ D . In con-trast to the originals, the flat images are suitable for the detectiontechniques that apply a threshold value for the entire image. In order to accurately extract various structures that widely rangein brightness and size, it is essential to use the benefits o ff ered bythe single-scale spatial decomposition (Appendix B). Followingits general approach (Fig. 1 and Sects. 3.2.1, 3.3.1), getsf de-composes the detection images S λ D and F λ D into single scales {S|F } λ D j and estimates the corresponding standard deviations σ λ S j and σ λ F j (Appendix B) that are necessary for separating sig-nificant structures from all other fluctuations. The decomposedcomponents S λ D j and F λ D j are shown in Figs. 11 and 12 afterthey were cleaned and combined over wavebands. Cleaning is the removal of insignificant background and noisefluctuations from detection images that needs to be done beforecombining them over wavebands (Sect. 3.4.3). The clean imagesof the structures are obtained by preserving only the pixels withvalues above the cleaning thresholds (cid:36) λ { S | F } j and by setting allfainter pixels to zero, {S|F } λ D j C = max (cid:16) {S|F } λ D j , (cid:36) λ { S | F } j (cid:17) , (28)where (cid:36) λ S j = σ λ S j and (cid:36) λ F j = σ λ F j . The filament threshold issignificantly lower than that for sources because getsf addition-ally cleans F λ D j C of the residual source-like clusters of con-nected pixels according to their definition in Eq. (15).The resulting clean images {S|F } λ D j C (Figs. 11 and 12) aredeemed to have signals only from the sources and filaments, re-spectively. In practice, some of them may have several faint spu-rious peaks and other structures that are discarded during thesubsequent detection and measurement steps. λ All previous image processing was done independently for eachwavelength. It is recommended to always process the input im-ages in parallel, independent getsf runs to reduce the total ex-traction time approximately by a factor N W , the number of wave-lengths. Now, getsf accumulates the clean single-scale images S λ D j C and F λ D j C over the wavebands in order to use the inde-pendent information from all images and enhance the signal ofthe significant structures. This procedure follows the getold ap-proach (Paper I), with the important improvement that filamentsare handled in the same way as sources. When decomposed im-ages on each scale are combined, di ff erences in the angular reso-lutions between the wavebands are much less important becausethe single-scale images select and enhance the structures withwidths similar to the scale size S j , not the resolution O λ . The clean detection images are normalized before their accu-mulation over wavelengths to make all cleaning thresholds equalto unity in all bands. The combination process is described by thefollowing expression: {S|F } D j C = N − { S | F } (cid:88) λ f λ j max (cid:16) {S|F } λ D j C , Z λ { S | F } j (cid:17) (cid:36) − λ { S | F } j , (29)where N { S | F } ≤ N W is the number of the wavebands chosen to beused in the combination, Z λ { S | F } j is the threshold image (equal to (cid:36) λ { S | F } j in all pixels), and f λ j is a factor that gradually turns thesmallest scales on, f λ j = min (cid:18)(cid:16) S j O − λ (cid:17) , (cid:19) . (30)This factor ensures that the small-scale noise or artifacts appear-ing on top of the resolved structures do not produce spurious de-tections in the combined images on scales S j < O λ . Su ffi cientlybright unresolved structures still contribute to {S|F } D j C on thesmallest scales below O λ . This super-resolution is useful to de-tect blended unresolved peaks. Selected combined images of thetwo structural components are shown in Figs. 11 and 12.The normalization to a common threshold in Eq. (29) is anatural way of maximizing sensitivity of the combined images.This procedure modifies the original dependence of the sourcebrightness on spatial scales, however, which is analyzed by thedetection algorithm (Sect. 3.4.4) to determine the characteristicsize for each source. Therefore a second set of combined im-ages is defined for the component of sources, normalized to thesmallest scale in each waveband,˜ S D j C = (cid:88) λ w λ S λ D1C S λ D j C , (31)where w λ is the weight that enhances the contribution of the im-ages with higher angular resolutions, w λ = (cid:32) ¯ OO λ (cid:33) , ¯ O = N − (cid:88) λ O λ , (32)where ¯ O is the average resolution, and the power of 7 ensurescomplete separation of the contributions of di ff erent wavebandsin Eq. (31). After the weighting, the summation of S λ D j C pre-serves the individual dependence of the peak intensity of eachsource on spatial scales, which provides an initial estimate of itssize during detection before the actual measurements. Sources are detected in S D j C with almost the same algorithmthat was used by getold (Sect. 2.5 of Paper I), which is brieflysummarized here for completeness. An inspection of the entireset of single-scale images S D j C shows that sources appear onrelatively small scales become the brightest on scales roughlyequal to their size and vanish on significantly larger scales (cf.Fig. B.1). All detectable sources appear isolated on small scalesand become blended with other nearby sources on larger scales.The getsf source detection scheme identifies the sources in S D j C and tracks their evolution from small to large scales, until theydisappear or merge with a nearby brighter source.To detect sources, getsf slices S D j C by a number N L of inten-sity levels I jl , spaced by δ ln I j = .
01, from the image maximumdown to the lowest non-zero value. Each slice l cuts through allpeaks brighter than I jl , producing a set of partial images, S D j C l = max (cid:16) S D j C , I jl (cid:17) , l = , , . . . , N L . (33) Article number, page 14 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
Fig. 11.
Combination of the detection images S λ D j C (Sect. 3.4.2) for the set of images I λ containing all Herschel wavebands and I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8). The clean S D j C thresholded above (cid:36) λ S j = σ λ S j and combined over all wavebands are shown. Several faint spurious peaks visible onlarge scales near edges in the bottom row are the background and noise fluctuations that happened to be stronger than the threshold. They may bediscarded during the subsequent detection and measurement steps. Logarithmic color mapping. The source detection algorithm works on the sequence of partialimages, creating and updating source segmentation masks (foreach j and l ). This is done with the same tintfill algorithm thatwas applied in Sect. 3.2.2 to remove the source- and filament-like shapes. The resulting single-scale segmentation images ofsources sets all pixels belonging to a source to its number.The scale j F on which a source n becomes the brightest is re-ferred to as the footprinting scale. It provides an initial estimatefor its half-maximum size H n = S j F (cf. Appendix B), which de-fines the initial footprint, that is, the entire area of all pixels mak-ing non-negligible contributions to the total flux of the source.From a practical point of view, getsf defines the initial footprintdiameter of a circular source as φ n H n , where the footprint factor φ n =
3. For the Gaussian sources (e.g., Fig. B.1), these footprintslead to the total fluxes that are underestimated by only 1 . getsf creates their initial footprints with the diameters { A , B } F n = φ n H n . The footprints become elliptically shaped in thewavelength-dependent measurements, reflecting the elongationof sources that is obtained from intensity moments. During themeasurement iterations (Sect. 3.4.6), getsf changes φ n to expandor shrink the footprints for those sources that are bright enoughand whose intensity distributions indicate that their initial foot-prints are not optimal. Filaments are detected in F D j C with a completely new approach.In the getold algorithm (Sect. 2.4.4 of Paper II), intensity pro-files at each pixel of the component of filaments are measuredin four directions, and the pixel is deemed to belong to the crest(marks a skeleton point) if it has the highest value for each ofthe profiles. In practice, this simple approach sometimes createsartifacts at the filament end points, where the skeletons some-times appear forked like a snake tongue. An important limitationof the getold skeletons is that they trace crests of the images offilaments without any dependence on the spatial scales.The Herschel observations of nearby star-forming molecu-lar clouds demonstrated that filaments are very complex, multi-scale structures (e.g., Men’shchikov et al. 2010), unlike the sim-ple case of the relatively round sources, whose intensities rapidlydecrease in all directions from their peaks. Resolved sources areproduced by the emission of dense, compact objects and maybe reasonably well characterized by a single value of their half-maximum size (or spatial scale). In contrast to the sources, de-tection of filaments is fundamentally a scale-dependent problem,and a single skeleton that may be appropriate for a certain spatialscale cannot fully describe the complexity of the observed mul-tiscale, profoundly substructured filaments. Resolved filamentsoften appear to be composed of thinner filaments on smaller
Article number, page 15 of 37 & A proofs: manuscript no. getsf
Fig. 12.
Combination of the detection images F (cid:111) D j C (Sect. 3.4.2) for the set of images I λ containing all Herschel wavebands and I (cid:111) ≡ D (cid:48)(cid:48) from Eq. (8). The clean F D j C thresholded above (cid:36) λ F j = σ λ F j and combined over five wavebands are shown (excluding the noisier 70 and 100 µ mimages). The faint ring-like structures that are visible on some scales are the source residuals originating from the derived surface densities thathave substantial inaccuracies over the sources (cf. Figs. 5 and 8; Sect. A). Square-root color mapping. scales, down to the angular resolution, and their widths, profiles,and crest intensities are quite variable along their skeletons.The strong dependence of the observed filaments on spatialscales is illustrated in Fig. 13, which shows the surface densitiesof the filaments in three well-studied star-forming regions: Tau-rus, Aquila, and IC 5146. The observed images of the regionswere downloaded from the Herschel
Gould Belt Survey (Andréet al. 2010) archive , and the hires surface densities D (cid:48)(cid:48) of theregions were computed from Eq. (8). Figure 13 shows the imagesof the spatially decomposed filaments on three selected scales:small, intermediate, and large. The images demonstrate that theobserved filaments are highly substructured in the regions, andtheir shapes as traced by the skeletons are very di ff erent on var-ious spatial scales. The skeletons, obtained on the small scales,are completely incompatible with the shapes and crests of the fil-aments on larger scales. The detected small-scale skeletons areoften very curved, meandering back and forth even at the rightangles, which implies a high degree of self-blending and leadsto significant inaccuracies in the measured profiles and other de-rived properties of filaments. Therefore it is necessary to detecttheir skeletons on the scales that correspond to the widths of thestructures being studied. Moreover, the small-scale substructuresof the larger-scale filaments may even be the key to understand- http://gouldbelt-herschel.cea.fr/archives ing the filament properties, the physical processes taking placeinside them, and the formation of stars.Instead of tracing the original image intensity profiles, getsf employs the Hilditch algorithm (Hilditch 1969), which skele-tonizes two-dimensional shapes by erasing their outer pixels un-til the thinnest representation of the shapes is found. The originalHilditch algorithm has a deficiency in that the shapes orientedalong the two main diagonals become completely erased duringthe iterations. To enable its application in getsf , the algorithmhas been improved to preserve the diagonal skeletons.Through the multiscale decomposition, getsf allows findingcrests without any explicit analysis of the filament intensities.The single-scale images F D j C not only enhance the structures ofthe widths W ≈ S j , but also cause these filtered intensity distri-butions to become well centered on their zero-level footprints.The crests of the isolated decomposed filaments always approx-imate the medial axes of their footprints (cf. Figs. 12 and 13).If a single-scale filament blends with other filaments (or with it-self), there is always a smaller scale on which it is isolated. Thisallows determining the scale-dependent skeletons as the medialaxes of the zero-level filament masks.The single-scale skeletons K j are created using the Hilditchalgorithm, with a width of three pixels to tolerate one-pixel dis-placements in the skeleton coordinates between scales. They are Article number, page 16 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
Fig. 13.
Filaments extracted by getsf on selected spatial scales in three star-forming regions: Taurus ( top ), Aquila ( middle ), and IC 5146 ( bottom ).The flattened components F (cid:111) D derived from the hires surface densities D (cid:48)(cid:48) obtained from Eq. (8) using the Herschel µ mimages are shown. The minimum scales of 36 (cid:48)(cid:48) ( left column) correspond to 2 . right column) correspond to 0 . middle column. The images were cleaned using the default threshold (cid:36) (cid:111) F j = σ (cid:111) F j . Overlaid on the filaments aretheir skeletons obtained from the images using the Hilditch algorithm (Sect. 3.4.5). The observed filaments are heavily substructured, and theirappearance, detected skeletons, and measured properties depend strongly on spatial scales. Logarithmic color mapping.Article number, page 17 of 37 & A proofs: manuscript no. getsf further accumulated over a limited range of scales to produce aset of N K skeletons tracing the filamentary structures of variouswidths, K k = J + (cid:88) j = J − K j , k = , , . . . , N K , (34)where J − and J + are the numbers of the smallest and the largestscales, S J − = − / S k and S J + = + / S k , in the accumulated skele-ton K k . The scale-dependent skeletons K k sample the followingscales: S k = / S k − , k = , , . . . , N K , (35)where the scale S = ¯ O is defined by Eq. (32) as the aver-age angular resolution over the wavebands combined in F D j C (Sect. 3.4.3), and S N K = λ ( Y λ ) is the largest spatial scale forthe filament detection.Each pixel of the accumulated skeleton K k in Eq. (34) con-tains information on the filament detection significance ξ , de-fined as the number of scales between J − and J + , on whichthe single-scale skeleton K j contributes to K k in that pixel. De-pending on the filament intensity at the skeleton pixel, the sig-nificance range is 1 ≤ ξ < ∼ ln 2 (ln f ) − ( ≈
14, assuming f ≈ . K k ξ = max ( K k , ξ ) with adefault ξ = , and applying the Hilditch algorithm to the resultingshapes. Segmentation images of the skeletons K k ξ are computedusing the tintfill algorithm, which sets all pixels belonging to afilament to its number. The source-measurement algorithm is an improved version ofthe one employed by getold (Sect. 2.6 of Paper I). Sources can-not be measured in their component S λ X (Sect. 3.2.3) becausethe subtracted background B λ X contains substantial source resid-uals at low intensity levels (Fig. 8). The background B λ X is de-rived specifically for the most complete and reliable source de-tection, not for accurate measurements. The sources are mea-sured by getsf in the original I λ after subtracting their back-grounds and deblending them from overlapping sources, whichentails iterations. The background determination and deblendingare more accurate for the sources with relatively small footprints.However, in crowded regions with larger areas of overlappingfootprints and strongly fluctuating backgrounds, they may be-come very inaccurate.The background B F λ of each source is determined by a linearinterpolation of I λ across its footprint. The interpolation alongtwo image axes and two diagonals is based on the adjacent pix-els (not belonging to any source) outside the footprint, as wasdone by getold . To improve the background estimate in the pres-ence of overlapping footprints, getsf evaluates the backgroundonly along those of the four directions for which the distancesbetween the outside points being interpolated are within a fac-tor of two of the smallest distance. For each pixel of the sourcefootprint area, the background value is averaged between the di-rections used in the interpolation. The background B F λ is me-dian filtered using a sliding window with a radius O λ , and thebackground-subtracted image of a source is then obtained as I S λ = I λ − B F λ .In the measurements, the source coordinates x n , y n are knownfrom the detection step and are kept unchanged. For the firstmeasurement iteration, it uses the initial characteristic size H n = S j F , provided by the detection algorithm (Sect. 3.4.4). Thecorresponding initial footprint { A , B } F n = φ n H n is a good approx-imation for only Gaussian sources, when H n is close to the actualwidths { A , B } λ n . However, the initial factor φ n = { A , B , ω } λ n from the previous iteration are used.The size derivation algorithm in getsf has become more ac-curate, hence it requires some clarifications. The half-maximumsizes were computed by getold using intensity moments (cf. Ap-pendix F in Paper I) that give accurate sizes only for the sourceswith Gaussian shapes. In real-life observations, however, somesources are markedly non-Gaussian and their intensity momentsgive either over- or underestimated sizes, corresponding to thelevels well below or above the half-maximum intensity.The inaccuracies of the moment sizes become very large forthe resolved sources with power-law intensity distributions. Thesimulated image of such a source shown in Fig. 14 has a half-maximum size of 10 (cid:48)(cid:48) . However, according to the intensity mo-ments (over the entire image), the model source has a diameter of76 (cid:48)(cid:48) . It is easy to see that this value corresponds to a level that islower by an order of magnitude than the half-maximum intensity.The source size depends on the adopted footprint. Within the twofootprints shown in the middle and right panels of Fig. 14, themoment sizes are 10 . . (cid:48)(cid:48) . The source flux is also under-estimated by correspondingly large factors of 5 . . λ < ∼ µ m, cf. Fig. 3), where the emis-sion of their low-temperature interiors fades away. A simulatedimage of such a source is shown in Fig. 15, with the model half-maximum size of 49 (cid:48)(cid:48) . However, the intensity moments (over theentire image) indicate that its diameter is 31 (cid:48)(cid:48) , which correspondsto a level by a factor of 2 above the half-maximum intensity. Inthe simple example in Fig. 15, the source size and flux do notdepend on the footprint size because the intensity profile in itsouter parts is steep and the background is flat (zero).The above examples demonstrate that the intensity momentsdo not provide accurate estimates of the half-maximum sourcesizes in the general case of arbitrary non-Gaussian intensity pro-files. Therefore getsf determines accurate half-maximum sizesby the direct Gaussian interpolation of the source intensity dis-tribution at its half-maximum and averaging the resulting dis-tances from the source peak, thereby estimating an average ra-dius h λ n . The source elongation E M λ n = A M λ n / B M λ n and posi-tion angle ω M λ n are computed independently from the inten-sity moments above the 10% level of the peak, excluding thelow-intensity pixels that may be a ff ected by the noise and back-ground fluctuations. The major and minor half-maximum axesof the source are then computed from B λ n = e − . E λ n − h λ n E − / λ n , A λ n = B λ n E M λ n , (36)where the (empirical) exponential factor converts the average ra-dius h λ n into the equivalent-area radius ( A λ n B λ n ) / of an ellipse.The FWHM ellipse { A , B , ω } λ n from Eq. (36) is guaranteed tocorrespond to the source half-maximum intensity, in contrast tothe ellipse estimated from the intensity moments. The momentsizes { A , B , ω } M λ n are also computed by getsf because they con-tain independent information that can be useful for the analysisof the extracted sources.During the measurement iterations (Sect. 2.6 of Paper I), getsf employs a footprint expansion and shrinkage algorithm to Article number, page 18 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
Fig. 14.
Footprint expansion, illustrated in an image with 3 (cid:48)(cid:48) resolution of a source with a peak intensity of 100, half-maximum size of 10 (cid:48)(cid:48) , andS / N of 100 ( left ). The source has an intensity profile defined by Eq. (2) with
Θ = (cid:48)(cid:48) and ζ =
1, transforming into a power law I ∝ θ − for θ (cid:29) Θ andfilling up the entire image, its faint outer areas ( I ∼ .
2) are largely lost within the noise. The initial footprint factor φ n = middle ). The footprint expansion algorithm (Sect. 3.4.6) enlarges φ n by a factor of 2 . right ), which lowers the source background by a factor of5 , and as a result, increases the source flux by a factor of 2 .
7. The improved flux is still below the true value by a factor of 1 . correct the footprint areas of those sources that need such adjust-ments. It is based on a simple observation that when a footprintarea is too small, the source background contains a pedestal ofthe residual intensity distribution of the source (Fig. 14); whenthe source pedestal does not exist or is negative (Fig. 15), thefootprint may be accurate or too large. The analysis is made inthe regularized component S λ R from Eq. (25) without contribu-tion from the complex background and filaments.The presence of the background pedestal is indicated by thepositive di ff erence between the background values below thesource and those in an external annulus just outside the source,1 . B Φ λ n > B Ψ λ n + D Ψ λ n , (37)where B Φ λ n is the median value within the footprint and B Ψ λ n and D Ψ λ n are the mean and the standard deviation inside the annu-lus. When the condition of Eq. (37) is fulfilled and the source isnot too elongated ( A λ n < . B λ n ) and bright enough ( Ξ λ n >
50 and Ω λ n >
15, see Eq. (41)), getsf increases the factor φ n by 5% be-fore proceeding to the next measurement iteration. The footprintexpansion terminates when the residual background pedestals(Fig. 14) are reduced as much as possible and the condition inEq. (37) becomes false. As a final adjustment, the footprint isexpanded once more by the factor 0 . + . φ n /
3) to reduce theresidual pedestal.The footprints of the sources that do not need any expansionare attempted to be reduced in size. It is important to confinethe footprints to the most local area occupied by the sources be-cause oversized footprints may strongly decrease the accuracyof background subtraction and flux measurement for sources oncomplex (filamentary) backgrounds and in crowded areas. Theneed to shrink a source footprint is indicated by a negative dif-ference between the background values below the source and inan external annulus just outside the source,1 . B Φ λ n < B Ψ λ n + D Ψ λ n , (38)where the quantities are the same as in Eq. (37). When thecondition of Eq. (38) is fulfilled, getsf decreases the factor φ n by 2% before proceeding to the subsequent measurement itera-tions. The footprint shrinkage is completed when the conditionin Eq. (38) becomes false, that is, when the reduced footprintcauses a small residual background pedestal. In a final adjust-ment, the footprint is expanded by a factor of 1 . I S λ , getsf deblends overlapping sources, calculating the peak intensi-ties F P λ n and the total fluxes F T λ n for each source n . The iterativedeblending algorithm employs the Gaussian shapes G λ n ( x , y ) de-fined by the source ellipse { A , B , ω } λ n and peak intensity F P λ n .The intensity I S λ ( x , y ) is split between the source n and all over-lapping sources n (cid:48) according to a fraction of the shape intensities, I λ n ( x , y ) = G λ n ( x − x n , y − y n ) (cid:80) n (cid:48) G λ n (cid:48) ( x − x n (cid:48) , y − y n (cid:48) ) I S λ ( x , y ) , (39)where the summation is done over all surrounding sources whosefootprints cover the pixel ( x , y ). The iterative deblending of thepeak intensities starts with the original image values I S λ ( x n , y n )of each source and proceeds with the splitting of the pixel val-ues until I λ n ( x n , y n ) converges to the deblended peak intensity F P λ n . After obtaining F P λ n for all sources, getsf computes the de-blended intensities I λ n ( x , y ) of all pixels within their footprints,estimates the ellipses { A , B , ω } λ n and { A , B , ω } M λ n , and integratesthe total fluxes F T λ n . It also computes an independent flux esti-mate F G λ n by integrating G λ n ( x , y ), which may only be accuratewhen a source shape resembles the two-dimensional Gaussian.Uncertainties of the peak intensities F P λ n are estimated by getsf as the standard deviations σ P λ n , evaluated in the original Article number, page 19 of 37 & A proofs: manuscript no. getsf
Fig. 15.
Footprint shrinkage, illustrated in an image with 3 (cid:48)(cid:48) resolution of a flat-topped source with a peak intensity of 100, half-maximum sizeof 49 (cid:48)(cid:48) , and S / N of 100 ( left ), modeled as a 50 (cid:48)(cid:48) cylinder convolved with a 10 (cid:48)(cid:48)
Gaussian kernel. The initial footprint factor φ n = middle ), whose actual footprint relates to the FWHM value by a factor φ n = .
5. The footprint shrinkagealgorithm (Sect. 3.4.6) reduces φ n by a factor of 1 . right ), which shrinks the footprint and confines it to the pixels belonging to the source alone.This footprint adjustment improves the accuracy of background interpolation and flux measurement on complex backgrounds. Square-root colormapping. image I λ , in an elliptical annulus around each source n just out-side its footprint. In heavily crowded fields, no local source-freeannulus can be found near the sources, in which case the uncer-tainties are estimated from the more distant source-free pixels.The uncertainties σ T λ n of the total fluxes F T λ n are computed withthe same assumptions as in getold (Sect. 2.6 of Paper I), σ T λ n = σ P λ n ( A F λ n B F λ n ) / φ n O λ , (40)where A F λ n and B F λ n are the major and minor axes of the sourcefootprints.It is convenient to define the detection significance Ξ λ n andthe signal-to-noise ratios (S / Ns) Ω λ n and Ψ λ n , describing the de-tection and measurement properties of each extracted source, Ξ λ n = S λ D j F n σ λ S j F , Ω λ n = F P λ n σ P λ n , Ψ λ n = F T λ n σ T λ n , (41)where j F is the footprinting scale (Sect. 3.4.4) and S λ D j F n is theintensity at the source position in S λ D j F (Sect. 3.4.1). The abovequantities can be combined together to characterize the overall“goodness” of a source, Γ λ n = Ξ λ n Ω λ n Ψ λ n ) / B λ n A λ n , (42)normalized such that all acceptable sources in the extraction cat-alogs have Γ λ n > ∼
1. The sources with Γ λ n < ∼ λ . The corresponding globalquantities Ξ n and Γ n describe the source detection significanceand goodness, respectively, in all wavebands, Ξ n = (cid:88) λ Ξ λ n / , Γ n = (cid:88) λ Γ λ n / . (43)The getsf source extraction catalogs contain detailed head-ers, documenting the extraction parameters and explaining thetabulated quantities. Each data line presents the source number n , coordinates x n , y n (in pixels), world coordinates α n , δ n (com-puted with the xy2sky utility, Mink 2002), global flag f n , signif-icance Ξ n , and goodness Γ n , n x n y n α n δ n f n Ξ n Γ n , followed (in the same line) by the measured quantities in each ofthe N W wavebands,( f λ n Ξ λ n Γ λ n F P λ n σ P λ n F T λ n σ T λ n A λ n B λ n A M λ n B M λ n ω λ n ) N W , where f λ n is a wavelength-dependent flag. In addition to this in-formation, an expanded version of the catalog adds (to the sameline) the Gaussian flux F G λ n , characteristic size S j F , footprint fac-tor φ n , and footprint axes A F λ n , B F λ n . For surface density images,the F G λ n column is replaced with source mass M (cid:111) n .It is necessary to emphasize that sources from extraction cat-alogs must always be carefully selected (for each waveband sep-arately) to ensure that only su ffi ciently good and accurately mea-surable sources are used in further analysis. This is especiallyimportant for the multiwavelength extraction catalogs, wheresources can be prominent in one waveband and completely un-detectable or not measurable in another. The getsf catalogs pro-vide various quantities to enable the evaluation and selection ofonly acceptable sources and recommended the following selec-tion criteria: Ξ λ n > ∧ Γ λ n > ∧ Ω λ n > ∧ Ψ λ n > ∧ A λ n < B λ n ∧ A F λ n > . A λ n . (44)These empirical conditions, based on numerous test resultsobtained in various benchmarks (Pouteau et al., in prep.;Men’shchikov 2021, submitted), and verified in applications toa variety of observed images (e.g., Sect. 4), ensure that the se-lected subset of sources is reliable (not contaminated by signifi-cant numbers of spurious sources) and that selected sources haveacceptably accurate measurements. Filaments are measured in their background-subtracted F λ Y , de-rived in Sect. 3.2.3. When the maximum size Y λ of the filaments Article number, page 20 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf of interest is estimated su ffi ciently accurately (Sect. 3.1.3), theirbackground B λ Y does not reveal any filamentary residuals. Nev-ertheless, the background may well have substantial inaccura-cies, especially when the filaments are wide and blended (Fig. 8).Observed filaments are the two-dimensional projections of thecomplex three-dimensional structures, which are much more dif-ficult to disentangle, deblend, measure, and analyze than sourceswith their well-defined round shapes and compact footprints.Sources can be represented by their peak intensity and half-maximum size, but filaments are extremely complicated in theirshapes and widths, often interconnected with each other and withvarious nearby branches, and have variable intensity along theircrests. It is quite clear that blending of the structures is a majorsource of large inaccuracies in the measured quantities of generalinterest (widths, fluxes, masses, profiles) and in other properties,derived from the measurements.Another di ffi culty in understanding filaments (distinct phys-ical structures) is that the filament length cannot be determinedobjectively. In most cases, it is quite unclear where a physicalfilament starts, where it ends, and which branches of the three-dimensional filamentary network belong to that filament. Fortu-nately, the global properties of the entire filaments (even if thelatter could be clearly defined) are not as important for studyingstar formation as the local properties of their relatively short seg-ments that develop appropriate physical conditions for the for-mation of prestellar cores.The approach that is adopted in getsf is to simplify thevery complex problem by separating all branches of the skele-ton network, converting the latter into the simple, non-branchingskeletons. The set of non-branching skeletons is derived duringthe segmentation of the skeletons K k ξ , the last step of the fil-ament detection process (Sect. 3.4.5). The simplified skeletonsenable an easy selection and better measurements of only thewell-behaving, preferably isolated (not blended), and relativelystraight parts of the filaments. No attempt is made by getsf todeblend filaments because a general algorithm for accurately de-blending them is not available.The segmentation image of all skeletons is scanned to traceeach skeleton n and find coordinates of all its pixels; to smooththe skeletons, the integer coordinates of their pixels are averagedwithin a seven-pixel window. The resulting high-resolution coor-dinates x n ( i ) , y n ( i ) of each skeleton point i are cataloged, togetherwith the local position angles ϑ n ( i ) of the skeleton direction and α n ( i ) , β n ( i ) of the left and right normals. A normal is called left( α ) or right ( β ) depending on which side it is located from thefirst skeleton point to the last. With an adopted distance to the ob-served region, getsf converts the angular units of the pixels intoparsecs and measures each filament as a function of the length l along its skeleton and the distance r along its normals. If thedistance is unknown or unspecified, a default distance of 100 pcis used; the measurements can be scaled to another distance afterthe extraction.The observed filaments usually meander, hence their skele-ton normals diverge from each other on one side and intersectwith each other on the other side. In the absence of deblending,more accurate measurements for them are usually obtained fromthe one-sided quantities that correspond to the side on which thefilament is the least a ff ected by blending with itself and withother nearby structures. The filament surface density (or inten-sity) profiles and their full half-maximum widths are catalogedas the one-sided quantities D { α | β } n ( l , r ) and W { α | β } n ( l ) and as theaverage quantities D n ( l , r ) and W n ( l ) between the two sides. Alsocataloged are the corresponding average profiles D { α | β } n ( r ) and D n ( r ) along the skeleton with their standard deviations ς { α | β } n ( r ) and ς n ( r ), as well as the median widths W n and the slopes γ ( r )of the filament profiles.Although the total length L n of a skeleton and mass M n of afilament may not always be objective and physically meaningfulquantities (see the discussion above), getsf derives the mass bydirect integration of F λ Y within a filament footprint, assumingthat the image is obtained from surface densities, M { α | β } n = µ m H (cid:34) Υ { α | β } n F λ Yn ( x , y ) d x d y , (45)where M { α | β } n are the one-sided mass estimates, from which theaverage mass M n between the two sides is obtained. The one-sided footprints Υ { α | β } n used in Eq. (45) are defined as the areasbetween the skeleton and the maximum extent of the filamenton either side. In practice, a filament footprint Υ n is the set ofall pixels whose shortest distances from the skeleton are smallerthan the filament normals.When the filament mass M { α | β } n and length L n are known, theone-sided estimates of the average linear density of the entirefilament are readily obtained,¯ Λ { α | β } n = M { α | β } n L − n , (46)together with the average linear density ¯ Λ n between the twosides. The linear density of filaments is also computed by getsf as a function of the coordinate l along their skeletons, Λ { α | β } n ( l ) = µ m H R { α | β } n ( l ) (cid:90) F λ Yn ( l , r ) d r , (47)where the integration limits R { α | β } n ( l ) along the left and rightnormals are chosen at zero surface density values or at a ra-dial distance of the profile minimum at which the filament be-comes blended with another structure. The median one-sided lin-ear densities Λ { α | β } n for the entire length L n of the filament and itsaverage linear density Λ n are also computed and cataloged. Thelinear density values from Eq. (46) and Eq. (47) are expected tobe similar to each other for the well-behaved filaments.
4. Applications to observed regions
The multiscale, multiwavelength source- and filament-extractionmethod presented in Sect. 3 was very extensively tested using ∼
40 images that were observed with di ff erent instruments andboth ground-based and orbital telescopes during the past twodecades. Multiwaveband observations of star-forming regionsobtained in the Herschel
Gould Belt Survey (André et al. 2010)and HOBYS (Motte et al. 2010) key projects, as well as the mostrecent interferometric images observed in the
ALMA -IMF pro-gram (Motte et al., in prep.), played an important role in validat-ing getsf.
The new extraction method has demonstrated very good re-sults in
ALMA benchmarks (Pouteau et al., in prep.) on im-ages, created from a magnetohydrodynamic (MHD) simulationof a star-forming region (Ntormousi & Hennebelle 2019) thatwas populated with model cores and processed by the casa task simobs (McMullin et al. 2007) to resemble the real
ALMA observations (Louvet et al., in prep.). Furthermore, getsf has In some publications, the filament linear density is also referred to asthe mass per unit length. Article number, page 21 of 37 & A proofs: manuscript no. getsf been applied to source extraction in 15 regions of the
ALMA -IMF program and 12 infrared dark clouds of the ASHES sur-vey (Sanhueza et al. 2019; Li et al. 2020). However, the mostsignificant and definitive testing and validation of extractiontools is achieved with simulated benchmarks for which every-thing is fully known about their components. The second paper(Men’shchikov 2021, submitted) presents a quantitative analysisof getsf extractions using several variants of the new benchmark(Sect. 2) and old benchmark (Papers I and III).Sections 4.1–4.8 illustrate the performance of getsf on nineimages obtained with di ff erent telescopes: XMM-Newton , theGalaxy Evolution Explorer (
GALEX ), Hubble , Spitzer , Herschel ,the Atacama Pathfinder Experiment (
APEX ), the James ClerkMaxwell Telescope (
JCMT ), and
ALMA from the X-ray domainto the millimeter wavelengths. These examples are presented todemonstrate that the method is applicable to a wide variety of ob-served images, visualizing the e ff ects of the separation of struc-tural components and flattening of detection images. Scientificanalyses and discussions of these results, as well as their com-parisons with previous studies, are beyond the scope of this pa-per. This can be accomplished using the corresponding extrac-tion catalogs that are available on the getsf website . RXJ 1713.7-3946 was observed with
XMM-Newton (EPIC cam-era) in the X-ray waveband (0 . − . µ m.The 0 . ◦ × . ◦ image in Fig. 16 is a mosaic of multiple obser-vations , first presented in Acero et al. (2017). With an aver-age angular resolution of 7 (cid:48)(cid:48) , it reveals the southeast segment ofthe supernova remnant shell that may have been created by theexplosion of the historical supernova SN 393, whose center ofexplosion is located beyond the upper right image corner. Forthis source and filament extraction with getsf , maximum sizes { X | Y } λ = { , } (cid:48)(cid:48) were adopted (Sect. 3.1.3).The observed X-ray image (Fig. 16) has relatively low countsof the detected photons per pixel and high levels of Poissonnoise. The image is contaminated by linear artifacts and severalspurious single-pixel spikes. The latter may appear in these im-ages when just one or several photons are detected at an edge ofthe mapped area.The image features several elongated shock fronts createdby the expanding supernova shell, and a number of faint andbright point sources, all of them well isolated. The getsf extrac-tion greatly simplified the image by separating the componentsof sources S λ , filaments F λ , and their backgrounds B λ { X | Y } . Thesmall-scale fluctuation levels across the observed image are onlywithin a factor of two, therefore the improvement caused by theflattening is not clearly discernible in S λ . However, the imagesof standard deviations show that the flat source detection im-age S λ D has uniform fluctuations over the entire image, which isbeneficial for source detection.The extraction catalog contains measurements of 41 sources,all of them selected as acceptably good by Eq. (44). Although thespurious one-pixel spikes were not removed before the extrac-tion, getsf identified them as such (red squares in Fig. 16) andeliminated them from the catalog. Despite the faintness of theobserved X-ray image and the Poisson noise, the three promi-nent shocks of the supernova shell become clearly visible andare extracted in the filament component. http://irfu.cea.fr/Pisp/alexander.menshchikov/ http://nxsa.esac.esa.int/nxsa-web/ NGC 6744 was observed with
GALEX in a far-ultraviolet (FUV)waveband (1350 − . µ m (Lee et al. 2011).The 0 . ◦ × . ◦ image in Fig. 17 with an angular resolution of4 (cid:48)(cid:48) shows the spiral galaxy, which is considered to be similar toour own Galaxy. Despite noisiness of the FUV image, it dis-plays the spiral arms with many hundreds of unresolved emis-sion sources. These are the regions of ongoing star formation,heated by the embedded young massive stars. For this source andfilament extraction, maximum sizes { X | Y } λ = (cid:48)(cid:48) were adopted(Sect. 3.1.3).Separation of the structural components by getsf providedindependent images of sources S λ , filaments F λ , and their back-grounds B λ { X | Y } (Fig. 17). Fluctuation levels in the observed im-age vary within a factor of two, largely in the central, brighterpart of the galaxy. Flattening of the S λ component e ff ectivelyequalized the fluctuations across the detection image S λ D , im-proving the extraction results.The source catalog contains measurements of 1169 sources,1130 of which are selected as acceptably good by Eq. (44). Mostof the sources likely correspond to the star-forming regions alongthe galactic spiral arms; many of them overlap with each other,hence they required deblending for accurate measurements oftheir fluxes. The filaments extracted in the F λ component repre-sent the spiral arms and their branches. The 147 skeletons tracethe simple, non-branching segments of the filamentary network(Sect. 3.4.5). NGC 6960 was observed with
Hubble in five UVIS wavebands(0 . − . . µ m, within the frame of the Hub-ble
Heritage project (Mack et al. 2015, PI: Z. Levay). The small73 (cid:48)(cid:48) × (cid:48)(cid:48) image in Fig. 18 with an angular resolution of 0 . (cid:48)(cid:48) represents a small fragment of the Veil Nebula, which is a seg-ment of the Cygnus Loop, the large expanding shell of a su-pernova remnant (Fesen et al. 2018). For this source and fil-ament extraction with getsf , maximum sizes { X | Y } λ = { . , } (cid:48)(cid:48) were adopted (Sect. 3.1.3).The observed image (Fig. 18) is dominated by impressivefine filamentary structure of the nebula, seen in emission of anumber of atomic lines. Many unresolved intensity peaks ofsources are less prominent on this bright backdrop. The struc-tural components were separated by getsf in the independentimages of sources S λ , filaments F λ , and backgrounds B λ { X | Y } ;together with the flattening of detection images, this greatly fa-cilitates their extraction and analysis.The source catalog contains measurements of 786 sources,690 of which are selected as acceptably good by Eq. (44). Thestrings of sources that run up through the middle of the imageare the spurious peaks created by the linear artifacts. The spuri-ous spikes were not cut out of the image before this extraction toillustrate that they need to be removed to avoid contamination ofthe source catalogs. The finely structured filamentary network ofthe nebula that is extracted by getsf in the F λ component com-prises 100 skeletons representing its simple, non-branching seg-ments (Sect. 3.4.5). https://archive.stsci.edu/missions-and-data/galex/ https://archive.stsci.edu/prepds/heritage/veil/ Article number, page 22 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf
L 1688 was observed with
Spitzer in the IRAC 8 µ m waveband(Evans et al. 2009). The 1 ◦ × ◦ image in Fig. 19 with an an-gular resolution of 6 (cid:48)(cid:48) shows a complex intensity distribution inthis well-known star-forming region, with the background vary-ing by almost two orders of magnitude and many sources situ-ated in both faint and bright background areas. For this sourceand filament extraction with getsf , maximum sizes { X | Y } λ = (cid:48)(cid:48) were adopted (Sect. 3.1.3).The clean separation of the components of sources S λ andfilaments F λ from their backgrounds B λ { X | Y } provided by getsf (Fig. 19) represents an obvious improvement over the resultsobtained with getimages (Fig. 6 in Paper III). The old methodof background derivation was indiscriminate with respect to theshapes of the components, hence the background-subtracted im-age also contained some filamentary structures on small scales.In contrast to getimages, which produced a single background, getsf derived and subtracted individual backgrounds for S λ and F λ . The component S λ of sources (Fig. 19) is completely free ofthe elongated structures. The standard deviations U λ reveal thatthe small-scale background fluctuation levels vary by roughlythree orders of magnitude across the image. If not equalized, thefluctuations would be extracted as numerous spurious sourcesand contaminate the source catalog. The very e ff ective flatten-ing of the detection image S λ D leads to a much more reliableextraction.Several bright unresolved sources in the lower part of theobserved image have very wide power-law wings and cross-likeartifacts that are induced by the complex PSF of the Spitzer
IRAC camera at 8 µ m. Their intensity profiles are markedlynon-Gaussian, and for a proper measurement of their integratedfluxes, getsf expanded their footprints by factors ∼ getsf as filaments andwere moved to the filament component, thereby improving S λ D for source detection. In addition to the cross shape, the com-plex PSF has ∼
20 faint peaks that surround the main beam.They were extracted as several spurious sources, surroundingthe brightest peaks; they must be eliminated in a post-extractionanalysis.The source catalog gives measurements of 1474 sources,1162 of which are selected as acceptably good using Eq. (44).The filament component produced by getimages (Fig. 7 in PaperIII) contains only the brightest parts of the filaments, their fainterintensities are missing. In contrast, getsf determines the intensitydistributions down to very low intensity levels, with 286 skele-tons tracing the simple, non-branching segments of the filaments(Sect. 3.4.5).
L 1689B, one of the nearest well-resolved starless cores (a dis-tance of 140 pc) embedded in a resolved filament, was observedwith
Herschel in five PACS and SPIRE wavebands (Ladjelateet al. 2020). The 160 − µ m images and Eq. (8) were used tocompute a 1 . ◦ × . ◦ surface density image D (cid:48)(cid:48) in Fig. 20 witha resolution of 13 . (cid:48)(cid:48) to illustrate the new extraction method ona single image. In addition to the reduction of the number of im-ages, the use of surface densities allows getsf to catalog physical https://sha.ipac.caltech.edu/applications/Spitzer/SHA/ http://gouldbelt-herschel.cea.fr/archives parameters of the core and filament. For this extraction, maxi-mum sizes { X | Y } (cid:111) = { , } (cid:48)(cid:48) were adopted (Sect. 3.1.3).The D (cid:48)(cid:48) image in Fig. 20 presents L 1689B in the wide fil-amentary structure near the edge of a di ff use cloud, all compo-nents are blended. The filament surface density is a factor of ∼ N H = . × cm − , whereasat the values, lower by just a factor of 2, a round shape of thesource becomes distorted by its complex environment. Separa-tion of the components by getsf greatly simplifies the image,isolating the structures in their individual images S (cid:111) , F (cid:111) , and B (cid:111) { X | Y } (Fig. 20). Subsequent flattening of the small-scale fluc-tuation levels allowed a reliable identification of the filamentand sources in both low- and high-background areas of the ob-served image. The extraction catalog contains measurements of20 sources, 12 of which are selected as acceptably good byEq. (44). The single skeleton was obtained on spatial scales of ∼ (cid:48)(cid:48) , corresponding to the maximum width Y (cid:111) .The main physical parameters of the starless core L 1689B, M = . M (cid:12) and N H = cm − , are underestimated because ofthe inaccuracies (Appendix A) of the standard surface densityderivation approach (Sect. 3.1.2). The errors and correction fac-tors can be found using the benchmark models from Sect. 2.2.A model of the critical Bonnor-Ebert sphere with T BE =
14 K, M = M (cid:12) , and N H = . × cm − has an FWHM size of 57 (cid:48)(cid:48) ,almost the same as the size A = (cid:48)(cid:48) of L 1689B, measured by getsf . However, in the derived D (cid:48)(cid:48) image, the same model has M = . M (cid:12) and N H = cm − , implying correction factorsof 1 . . M ≈ . M (cid:12) ; masses of the other sources in the imageare lower by (at least) a factor of ∼
10 . The filament measure-ments (Sect. 3.4.7) give its median value N = . × cm − ,length L = . W = .
14 pc (205 (cid:48)(cid:48) ),mass M = M (cid:12) , and linear density Λ = M (cid:12) pc − ; the valuesare little a ff ected by the fitting inaccuracies. NGC 6334 was observed with
APEX at 350 µ m, equipped withthe ArTéMiS camera (André et al. 2016). The 0 . ◦ × . ◦ im-age in Fig. 21 with an angular resolution of 8 (cid:48)(cid:48) represents animprovement by a factor of 3 with respect to the Herschel imagesat 350 µ m. Subtraction of the correlated sky noise resulted in animage without signals on spatial scales above 120 (cid:48)(cid:48) (André et al.2016). Therefore the large-scale background and the zero levelof the image are not known, and the structures in the image aresmaller than the largest scale. Fortunately, these observationalproblems are entirely unimportant for getsf . For this source andfilament extraction, maximum sizes { X | Y } λ = (cid:48)(cid:48) were adopted(Sect. 3.1.3).The observed image of NGC 6334 displays complex blendedstructures of various shapes and intensities (Fig. 21), includingsubstantial numbers of negative areas and artifacts from the datareduction and map-making algorithms. The separated S λ com-ponent shows all source-like peaks very clearly, even those thatare hardly visible in the original image, because getsf is ableto distinguish sources from the elongated filamentary shapes.Many of the sources overlap each other, therefore they requiredeblending for accurate measurements. The background B λ Y offilaments is fairly low, hence its subtraction enhanced the visi-bility of filaments in F λ only little. Nonuniform small-scale fluc- http://cdsarc.unistra.fr/viz-bin/cat/J/A+A/592/A54 Article number, page 23 of 37 & A proofs: manuscript no. getsf tuations in S λ were e ff ectively equalized in the detection image S λ D by the flattening algorithm.The source catalog contains measurements of 124 sources,91 of which are selected as acceptably good by Eq. (44). In thecomponent of filaments, getsf identified 26 skeletons, tracing thesimple, non-branching segments of the filaments (Sect. 3.4.5) onspatial scales of ∼ (cid:48)(cid:48) , corresponding to the maximum width Y λ . Orion A was observed with
JCMT at 450 and 850 µ m with theSCUBA-2 camera (Lane et al. 2016) with angular resolutions of9 . . (cid:48)(cid:48) , respectively. The 0 . ◦ × . ◦ image at 850 µ min Fig. 22 displays the northern part of the integral-shaped fila-ment (ISF). Like with other ground-based submillimeter obser-vations that must subtract sky background, large-scale emissionin the images has been filtered out (Kirk et al. 2018). A visual es-timate suggests that the image contains substantial signal on spa-tial scales of up to ∼ (cid:48)(cid:48) . For the two-wavelength getsf extrac-tion, employing both 450 and 850 µ m images, maximum sizes { X | Y } { | } = { , , , } (cid:48)(cid:48) were adopted (Sect. 3.1.3).The 850 µ m image of the ISF in Fig. 22 reveals the small-scale structures of the area most clearly because of the spatialfiltering e ff ect of the observational technique. However, the cen-tral bright part of the ISF remains blended, and the spatial de-composition by getsf helps isolate the sources S λ in that areafrom the filaments F λ and their backgrounds B λ { X | Y } . The back-ground of filaments is found to be slightly negative, except in itscentral bright round area. In comparison with an average valueof small-scale fluctuations in S λ , they are larger by a factor of2 . . U λ reveal imprints of thefive overlapping scans from the observations. The flattening al-gorithm of getsf e ff ectively equalizes them and creates the flatdetection images {S|F } λ D of sources and filaments, improvingtheir detection reliability.The two-band source extraction in ISF with getsf cataloged344 sources, detected and measured in both wavebands simulta-neously. Only 257 and 212 sources at 450 and 850 µ m, respec-tively, are selected as acceptably good by Eq. (44); the S / N forthe remaining detections is too low or they have other defectsthat are identified by the measurements. Two additional getsf ex-tractions, done on each image independently, resulted in catalogswith 319 and 283 sources at 450 and 850 µ m, respectively. In-dependent extractions ignore the valuable information from theother image, hence there are higher chances of spurious sources.With the additional condition that cataloged sources must be de-tected in both images, the combined extraction catalog contains223 sources;196 and 183 of these sources at 450 and 850 µ m, re-spectively, are acceptably good. They represent the most reliablesources in the images, hence it is highly unlikely that there arespurious sources among them.The missing large-scale emission of the SCUBA-2 imagehelped getsf expose the many relatively faint, narrow filamentswithin the wide, massive ISF. In the F λ component, getsf iden-tified 267 and 199 simple, non-branching segments of the fila-ments (Sect. 3.4.5) at 450 and 850 µ m, respectively, on trans-verse scales of 28 and 39 (cid:48)(cid:48) . This is similar to the existence ofnarrow sub-filaments on small scales within the resolved Tau-rus, Aquila, and IC 5146 filaments (Fig. 13) and consistent withthe recent ALMA observations of ISF (Hacar et al. 2018). W 43-MM1 was observed with the 12 m array of the
ALMA in-terferometer (baselines 13 − µ m (Motte et al. 2018; Nony et al. 2020). Thesmall 68 (cid:48)(cid:48) × (cid:48)(cid:48) image in Fig. 23 with an angular resolutionof 0 . (cid:48)(cid:48) contains spatial scales of up to 12 (cid:48)(cid:48) , beyond whichthe interferometer was insensitive to the emission. For thissource and filament extraction with getsf , the maximum size { X | Y } λ = { . , . } (cid:48)(cid:48) was adopted (Sect. 3.1.3).The interferometric image of W 43-MM1 (Fig. 23) shows acluster of relatively bright sources, some of them blended, andthree faint filamentary structures that appear to connect them.Separation of the components of sources S λ and filaments F λ confirms that most sources are concentrated on (or near) the faintcontinuous filaments. Almost the entire background B λ Y of thefilamentary structures is negative, which is caused by the missinglarge scales in the observed images.The small-scale fluctuation levels steeply increase toward theimage center by more than an order of magnitude (Fig. 23),as evidenced by the standard deviations U λ . The small-scalestructured noise from the interferometric observations have bothround or irregular, elongated shapes. Consequently, the separa-tion of structural components by getsf leads to many faint peaksin S λ and F λ . The flattening algorithm equalizes the fluctuationlevels very e ff ectively, providing reliable detection of sources inthe flat {S|F } λ D . If not suppressed, such highly variable struc-tured noise would produce many spurious sources and filamentsin the central area of the image.This ALMA image of W 43-MM1 contains only moderatenumbers of sources and filaments. The extraction catalog con-tains measurements of 44 sources, and all of them are selectedas acceptably good by Eq. (44). This simple field allows a visualverification that they all are the true sources and are not con-taminated by the noise fluctuations. In the filament component, getsf identified 15 skeletons, tracing the simple, non-branchingsegments of the filaments (Sect. 3.4.5) on spatial scales of ∼ (cid:48)(cid:48) ,similar to the maximum width Y λ adopted for the extraction.
5. Strengths and limitations
In contrast to the other methods, getsf extracts sources and fila-ments simultaneously by combining available information fromall wavebands. Its flexible multiwavelength design enables han-dling of up to 99 images, not necessarily all of them observedin di ff erent wavebands. The maximum number of images is ar-bitrary, representing the largest two-digit integer number used inthe output file names; the code can be updated to use a highervalue if required for some applications. Any subset of the in-put images that is deemed beneficial for the detection purposescan be used to detect the sources and filaments, whereas mea-surements of the identified structures are provided for all inputimages. In a nonstandard application, the method can also beemployed with the position-velocity cubes if they are split intoseparate images along the velocity axis ( getold was used in thisway by Shimajiri et al. 2019).The images that are selected for detection are spatially de-composed to isolate the contributions of similar scales (Ap-pendix B) and are then combined in a wavelength-independentset of single-scale detection images (Sect. 3.4.3). This eliminatesthe necessity of associating independent detections across wave-lengths in images with greatly di ff erent angular resolutions and Article number, page 24 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf improves the detection and measurement accuracy. For example,positional association of nearby sources detected at 160 µ m andcompletely blended into a single clump at 500 µ m does not makesense.Separation of structural components in the images of highlystructured observed regions in space provides independent im-ages of sources, filaments, and their backgrounds (Sect. 3.2),which is highly beneficial for the analysis and interpretationof observations. Flattening of detection images equalizes the(nonuniform) small-scale background and noise fluctuations(Sect. 3.3). This greatly simplifies the images and allows reli-able detection of sources and filaments in decomposed single-scale images using a constant threshold, with a very low rate ofspurious sources.Sources and filaments of any size and width can be extractedby getsf provided that they are significantly smaller than the im-age. Only the maximum size of the structures of interest mustbe specified for each image in order to limit the range of spatialscales considered and the size of the structures to be measuredand cataloged. The single parameter of the observed images that getsf needs to know is the maximum size, which is determinedfrom the images by users (Sect. 3.1.3) on the basis of their re-search interests. This single constrained parameter reduces thedependence of the extraction results on the human factor to aminimum and makes their analysis and derived conclusions asobjective as possible.The numerical code is designed to be user-friendly and easyto run, providing diagnostics to help users avoid common prob-lems. It verifies the getsf configuration, input images, and masksfor consistency, and it suggests solutions in various circum-stances during extractions. The software includes 21 utilities andscripts (Appendix C), providing all kinds of image processingnecessary for getsf to run and more. They include the fitfluxes utility for spectral energy distribution fitting of source fluxes orimage pixels (and mass derivation) and the hires script that com-putes the high-resolution surface density images (Sect. 3.1.2).Most of the utilities are very useful for command-line image ma-nipulations, even without source and filament extractions. The method is designed and expected to work for the imagesthat are not very sparse: most pixels must contain detectable sig-nals (measurable data). Examples of the images for which getsf might not produce reliable results are some extremely faint X-ray or UV low-count images with isolated spiking pixels thatare surrounded by large areas of pixels that were not assignedany detectable signal. For such nonstandard images, getsf wouldstill work and complete extractions, but its results might not bereliable because the method relies on the standard deviations ofthe background or noise fluctuations outside structures, whosevalues may not correctly represent the statistics of the observeddata in these images. On the other hand, the images for getsf ex-tractions must not be extremely smooth: they must have somevariations on scales of about the angular resolution. However,such smooth images can easily be made perfectly suitable for getsf just by adding Gaussian noise at some faint level that doesnot alter the structures of interest.Separation of sources from filaments is not (and cannot be)perfect. It leaves very faint residuals of sources that end up in thefilament component. In practice, this is not important becausemost of the residuals are too faint (Fig. 12) to a ff ect the filamentproperties. The background of very wide and / or overlapping fil-aments is likely to be derived less accurately than that of the narrower and / or isolated filaments because the filaments are sep-arated from the wider background areas. Filaments that are sep-arated from wider background peaks of comparable widths arelikely to receive some contribution from the background (Fig. 8).In very rare cases, the footprint of a bright power-law peak mightnot be su ffi ciently expanded, which leads to an underestimatedflux.The method takes quite considerable time to complete ex-tractions, although getsf was optimized to run as fast as possible.The aim of its design was to produce extraction results that areas reliable as possible because completeness and accuracy, notspeed, are of prime importance in astrophysical research. Theruntime for the getsf applications presented in Sect. 4 is in therange of three hours to a week (the images with 430 to 2000 pixels and file sizes of 800 KB to 16 MB). The two-wavelengthextraction of sources and filaments for the subfield of Orion Adescribed in Sect. 4.7 took 43 hours and required ∼
10 GB ofdisk space. The total processing time with getsf depends on thenumbers of pixels, wavelengths, iterations, detected sources andfilaments, and on the processor and file system speed and load. Asource extraction run on eight large images, each with 4800 pix-els (92 MB file size), that detects and measures ∼ ∼
200 GB of disk space. Mostof the time getsf spends in the iterative separation of structuralcomponents: the actual extraction of sources and filaments takesless than 10% of the runtime. For the source extraction alone, theexecution time is halved. In a properly planned research, the pro-cessing time is almost never a limiting factor: much more time isusually spent on the analysis and interpretation of the informa-tion delivered by the extraction and on describing the findings ina paper.Many intermediate images are produced in the getsf extrac-tions at each wavelength (for spatial decomposition, iterations,etc.), hence they require large storage space. Between hundredsof MB and GB may be necessary for an extraction, depending onthe image size and the numbers of wavebands and iterations. Itis necessary to keep many images until the end of the extractionprocess; however, most of them may be deleted by getsf after theextraction has finished. The extraction results themselves rep-resent only ∼
20% of the total size of the extraction directory.Computers with su ffi ciently large random access memory arerequired to run getsf extractions on very large images. For theabove range of image sizes, between 8 and 64 GB may be nec-essary (the more memory, the better). The actual memory usagestrongly depends on the number of sources being detected andmeasured. Numbers of sources up to ∼ getsf , but substantially larger numbers of detectedsources require very large memory and long execution time.
6. Conclusions
This paper presented getsf , the new multiscale method for ex-tracting both sources and filaments in astronomical images usingseparation of their structural components. It is specifically de-signed to handle multiwavelength sets of images and extremelycomplex filamentary backgrounds, but it is perfectly applicableto a single image or very simple backgrounds. The new code isfreely downloadable from its website , from the AstrophysicsSource Code Library , and also available from the author.The main processing steps of getsf include (1) preparation ofa complete set of images and derivation of high-resolution sur- http://irfu.cea.fr/Pisp/alexander.menshchikov/ https://ascl.net/2012.001 Article number, page 25 of 37 & A proofs: manuscript no. getsf face densities, (2) spatial decomposition of the original imagesand separation of the structural components of sources and fila-ments from each other and from their backgrounds, (3) flatteningof the residual noise and background fluctuations in the sepa-rate images of sources and filaments, (4) spatial decompositionof the flattened components of sources and filaments and theircombination of the over wavelengths, (5) detection of sources(positions) and filaments (skeletons) in the combined images ofthe components, and (6) measurements of the properties of thedetected sources and filaments and creation of the output cata-logs and images. Like its predecessor ( getold , Papers I–III), getsf has a single user-definable parameter (per wavelength), the max-imum size of the structures of interest to extract. All internal pa-rameters of getsf have been calibrated and verified in numeroustests using various images from simulations and observations toensure that the method works well in all cases.This paper formulated hires , the algorithm for the deriva-tion of high-resolution surface densities and temperatures fromthe di ff raction-limited multiwavelength far-infrared and submil-limeter continuum observations, such as those obtained with Herschel . A substantial improvement over the original algorithm(Palmeirim et al. 2013) is the angular resolution of the derivedsurface densities that may become as high as that of the shortest-wavelength image of a su ffi cient quality. In the case of the Her-schel observations, the resolution may be as high as 5 . (cid:48)(cid:48) for theslow scanning speed (20 (cid:48)(cid:48) s − ) or 8 . (cid:48)(cid:48) for the fast parallel mode(60 (cid:48)(cid:48) s − ). If the 70 µ m image appears too noisy, excessively con-taminated by the emission of polycyclic aromatic hydrocarbonsor transiently heated very small dust grains, or if it cannot beused for other reasons, then the highest resolution of surface den-sities is limited to that of the 100 or 160 µ m images, that is, to6 . − . (cid:48)(cid:48) or 8 . − . (cid:48)(cid:48) , for the slow- or fast-scanning modes,respectively. These high-resolution surface density images areespecially useful for the detailed studies of the highly complexstructural diversity in the observed images and for deeper under-standing of the physical processes within the heavily substruc-tured filaments and their relation to the formation of stars.This paper described the set of simulated multiwavelengthbenchmark images for testing and comparing the source and fil-ament extraction methods to allow the researchers who need toperform such extractions to choose the most accurate algorithmfor their projects. Although the benchmark was designed to re-semble the Herschel observations of star-forming regions, theimages are suitable for testing and evaluating extraction meth-ods for any astronomical projects and applications. It consists ofthe complex fluctuating background cloud, the long dense fila-ment, and many starless and protostellar cores with wide rangesof sizes, masses, and intensity profiles, computed with a radiativetransfer code. A separate paper (Men’shchikov 2021, submit-ted) presents a series of the multiwavelength source extractionswith getsf using three variants of the new benchmark with in-creasing complexity levels and compares their results with thoseproduced by getold . All benchmark images, the truth catalogscontaining the model parameters, and the reference extractioncatalogs obtained by the author with getsf are available on itswebsite.The new extraction method can be used to conduct consis-tent and comparable studies of sources and filaments in var-ious projects: getsf is designed to work for all images withnonzero background or noise fluctuations, where most pixelscarry nonzero measured signal. The method is not limited toany particular area of astronomical research nor to the type ofthe telescopes or instruments used, as demonstrated by its ap-plications to the images obtained with
XMM-Newton , GALEX , Hubble , Spitzer , Herschel , APEX , and
ALMA . Although no finitenumbers of specific examples can prove that getsf is universallyapplicable, they confirm a remarkably wide applicability of themethod.
Acknowledgements.
This study used the cfitsio library (Pence 1999), developedat HEASARC NASA (USA), saoimage ds9 (Joye & Mandel 2003) and wcstools (Mink 2002), developed at the Smithsonian Astrophysical Observatory (USA),and the stilts library (by Mark Taylor), developed at Bristol University (UK).The plot utility and ps12d library, used in this work to draw figures directly inthe PostScript language, were written by the author using the psplot library (byKevin E. Kohler), developed at Nova Southeastern University OceanographicCenter (USA), and the plotting subroutines from the MHD code azeus (Ramseyet al. 2012), developed by David Clarke and the author at Saint Mary’s Uni-versity (Canada). This work used observations obtained with
XMM–Newton , anESA science mission with instruments and contributions directly funded by ESAMember States and NASA. This work used observations made with the
Spitzer
Space Telescope, which is operated by the Jet Propulsion Laboratory, Califor-nia Institute of Technology, under a contract with NASA. This work used ob-servations made with the NASA / ESA
Hubble
Space Telescope, and obtainedfrom the
Hubble
Legacy Archive, which is a collaboration between the SpaceTelescope Science Institute (STScI / NASA), the Space Telescope European Co-ordinating Facility (ST-ECF / ESA) and the Canadian Astronomy Data Centre(CADC / NRC / CSA). This paper used the SCUBA-2 data obtained at
JCMT un-der program MJLSG31. The James Clerk Maxwell Telescope is operated by theEast Asian Observatory on behalf of The National Astronomical Observatory ofJapan; Academia Sinica Institute of Astronomy and Astrophysics; the Korea As-tronomy and Space Science Institute; Center for Astronomical Mega-Science (aswell as the National Key R&D Program of China with No. 2017YFA0402700).Additional funding support is provided by the Science and Technology FacilitiesCouncil of the United Kingdom and participating universities and organizationsin the United Kingdom and Canada. Additional funds for the construction ofSCUBA-2 were provided by the Canada Foundation for Innovation. This paperused the following ALMA data: ADS / JAO.ALMA / NRAO and NAOJ. The simulated surfacedensity background B was derived from a synthetic scale-free background im-age created by Ph. André. A large set of images, used for testing and validationof getsf , includes those obtained in the Herschel
Gould Belt Survey (HGBS,PI Ph. André), HOBYS (PIs F. Motte, A. Zavagno, S. Bontemps), and ALMA -IMF (PIs F. Motte, A. Ginsburg, F. Louvet, P. Sanhoueza). HGBS and HOBYSare the
Herschel
Key Projects jointly carried out by SPIRE Specialist Astron-omy Group 3 (SAG3), scientists of several institutes in the PACS Consortium(e.g., CEA Saclay, INAF-IAPS Rome, LAM / OAMP Marseille), and scientists ofthe
Herschel
Science Center (HSC). The author appreciates the valuable feed-back, received from G. Zhang, F. Louvet, and N. Kumar, on the getsf extractionsin the X-shaped nebula, MHD simulations, and Mon R2, respectively. The authoris grateful to A. Zavagno, T. Nony, Y. Shimajiri, Ph. André, D. Arzoumanian, andP. Palmeirim for their comments on the manuscript. http://gouldbelt-herschel.cea.fr http://hobys-herschel.cea.fr Article number, page 26 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf originaloriginal s backgrounds background f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat)
Fig. 16.
Application of getsf to the
XMM-Newton λ ≈ . µ m image (7 (cid:48)(cid:48) resolution) of the supernova remnant RX J1713.7-3946, adopting { X | Y } λ = { , } (cid:48)(cid:48) . The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows thecomponent S λ , the footprint ellipses of 41 acceptably good sources on S λ D (red squares mark the spurious peaks), and the component F λ D with 13non-branching skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in the flattened component S λ R Q − λ . Intensities (in photons cm − s − ) are limited in rangewith square-root color mapping, except for Q λ , which is shown with linear mapping. Article number, page 27 of 37 & A proofs: manuscript no. getsf originaloriginal s backgrounds background f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat)
Fig. 17.
Application of getsf to the
GALEX λ = . µ m image (4 (cid:48)(cid:48) resolution) of the spiral galaxy NGC 6744, adopting { X | Y } λ = (cid:48)(cid:48) . The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , the footprintellipses of 1130 acceptably good sources on S λ D , and the component F λ D with 147 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in the flattenedcomponent S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (in counts s − )are limited in range with logarithmic color mapping, except for Q λ , which is shown with squared mapping.Article number, page 28 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf originaloriginal s backgrounds background f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat) Fig. 18.
Application of getsf to the
Hubble λ = . µ m image (0 . (cid:48)(cid:48) resolution) of the supernova remnant NGC 6960, adopting { X | Y } λ = { . , } (cid:48)(cid:48) .The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , thefootprint ellipses of 690 acceptably good sources on S λ D , and the component F λ D with 100 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) .The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in theflattened component S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (inelectrons s − ) are limited in range with logarithmic color mapping, except for Q λ , which is shown with square-root mapping.Article number, page 29 of 37 & A proofs: manuscript no. getsf originaloriginal s backgrounds background f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat)
Fig. 19.
Application of getsf to the
Spitzer λ = µ m image (6 (cid:48)(cid:48) resolution) of the L 1688 star-forming cloud, adopting { X | Y } λ = (cid:48)(cid:48) . The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , the footprintellipses of 1162 acceptably good sources on S λ D , and the component F λ D with 286 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in the flattenedcomponent S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (in MJy sr − )are limited in range, with logarithmic color mapping.Article number, page 30 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf originaloriginal s backgrounds background f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat) Fig. 20.
Application of getsf to the
Herschel surface density (13 . (cid:48)(cid:48) resolution) of the starless core L 1689B, embedded in a filament, adopting { X | Y } λ = { , } (cid:48)(cid:48) . The top row shows the original hires image D (cid:48)(cid:48) obtained from Eq. (8) and the backgrounds B λ { X | Y } of sources and filaments.The middle row shows the component S λ , the footprint ellipses of 12 acceptably good sources on S λ D , and the component F λ D with one skeleton K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flatteningimage Q λ , and the standard deviations in the flattened component S λ R Q − λ . Surface densities (in N H cm − ) are limited in range with logarithmiccolor mapping, except for Q λ , which is shown with linear mapping. Article number, page 31 of 37 & A proofs: manuscript no. getsf -483 206 2394 9269 31151 99899 originaloriginal -483 206 2394 9269 31151 99899 s backgrounds background -483 206 2394 9269 31151 99899 f backgroundf background
93 447 1800 6900 26402 99967 sourcessources extractedextracted filamentsfilaments
55 142 421 1297 4085 12844 stdevstdev
43 56 70 83 97 110 s flatfactors flatfactor stdev (flat)stdev (flat)
Fig. 21.
Application of getsf to the
APEX λ = µ m image (8 (cid:48)(cid:48) resolution) of the NGC 6334 star-forming cloud, adopting { X | Y } λ = (cid:48)(cid:48) . The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , the footprintellipses of 91 acceptably good sources on S λ D , and the component F λ D with 26 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in the flattenedcomponent S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (in MJy sr − )are limited in range with logarithmic color mapping, except for Q λ , which is shown with linear mapping.Article number, page 32 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf -10 7 86 450 2152 9999 originaloriginal -10 7 86 450 2152 9999 s backgrounds background -10 7 86 450 2152 9999 f backgroundf background sourcessources extractedextracted filamentsfilaments stdevstdev s flatfactors flatfactor stdev (flat)stdev (flat) Fig. 22.
Application of getsf to the
JCMT λ = µ m image (14 . (cid:48)(cid:48) resolution) of the Orion A star-forming cloud, adopting { X | Y } λ = (cid:48)(cid:48) . The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , the footprintellipses of 212 acceptably good sources on S λ D , and the component F λ D with 199 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in the flattenedcomponent S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (in MJy sr − )are limited in range with logarithmic color mapping, except for Q λ , which is shown with linear mapping. Article number, page 33 of 37 & A proofs: manuscript no. getsf -491 -145 955 4409 15405 49949 originaloriginal -491 -145 955 4409 15405 49949 s backgrounds background -491 -145 955 4409 15405 49949 f backgroundf background
108 450 1536 4950 15815 49950 sourcessources extractedextracted filamentsfilaments
22 90 307 990 3163 9990 stdevstdev
10 21 38 62 93 130 s flatfactors flatfactor stdev (flat)stdev (flat)
Fig. 23.
Application of getsf to the
ALMA λ = µ m image (0 . (cid:48)(cid:48) resolution) of the W43-MM1 star-forming cloud, adopting { X | Y } λ = { . , . } (cid:48)(cid:48) .The top row shows the original image I λ and the backgrounds B λ { X | Y } of sources and filaments. The middle row shows the component S λ , thefootprint ellipses of 44 acceptably good sources on S λ D , and the component F λ D with 15 skeletons K k corresponding to the scales S k ≈ (cid:48)(cid:48) . The bottom row shows the standard deviations U λ in the regularized component S λ R , the flattening image Q λ , and the standard deviations in theflattened component S λ R Q − λ . Some skeletons may only appear to have branches because they were widened for this presentation. Intensities (inMJy sr − ) are limited in range with logarithmic color mapping, except for Q λ , which is shown with square-root mapping.Article number, page 34 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf Appendix A: Inaccuracies of the derived surfacedensities and temperatures
The algorithms described in Sect. 3.1.2 imply that the 160, 250,350, and 500 µ m images have an accurate (consistent) intensitycalibration. When we assume that the calibration inaccuraciescan be described by constant wavelength-dependent o ff sets, sim-ple consistency checks and corrections can be made. Three in-dependent estimates of low-resolution temperatures ( T L1 , T L2 , T L3 ) are readily available from fitting the images in three pairsof wavebands (160 − − − µ m) with a lowresolution of O . If the median values of the three temperaturemaps di ff er by more than several percent, it would be necessaryto adjust some of the o ff sets and estimate T L { | | } again. This it-erative process is stopped when the three temperatures becomeconsistent.The higher-resolution images are obtained at the cost of sig-nificantly stronger noise and greater chances of distortions andspurious peaks. The quality of the resulting {D|T } P from Eq. (5)strongly depends on the quality of the original short-wavelengthimages. Higher levels of noise or map-making artifacts in the 250and 160 µ m images would be amplified in the resulting maps inthe process of fitting the spectral shapes Π λ of pixels, which islikely to create significant small-scale distortions, predominantlyin the pixels with strong line-of-sight temperature gradients thatare usually located over the dense sources or filaments. The dif-ferential terms δ {D|T } { | } that contain the higher-resolution in-formation are increasingly less accurate because they are ob-tained from fitting of only three and two (noisier) images. It isvery important to carefully inspect {D|T } P to ensure that theyare free of spurious small-scale structures before using them inany extraction. The hires images {D|T } O H from Eqs. (8) and (11)are much less a ff ected by the problems because they use the con-tributions δ {D|T } { | | } from all three variants of the fitted tem-peratures for each of the six resolutions of the original images.The essential idea of the di ff erential algorithm for improv-ing the angular resolution of surface density was validated us-ing the benchmark images (Sect. 2). The complete surface den-sity D C + D S was first convolved to the resolutions of all Her-schel wavebands (Sect. 2.3). The algorithm of Eq. (5), gener-alized to all six wavebands, was then applied to improve thelowest-resolution surface density using the unsharp masking ofEq. (6) and to successively recover each of the higher-resolutionsurface densities, all the way up to the highest adopted resolu-tion O = . (cid:48)(cid:48) , with a resulting maximum error below 0 . Π λ , hence they inevitably su ff er from larger inaccuracies(Fig. A.1).The derived surface densities D P and D O H (Sect. 3.1.2) arenot suitable for measuring dense structures, especially those witha central source of heating, because their inaccuracies in the pix-els with strong line-of-sight temperature gradients are too large(e.g., Men’shchikov 2016). Comparisons with the true surfacedensities in Fig. A.1 show that the vast majority of the pixelsoutside bright sources are quite accurate, to better than 0 . ff erent radial temperature profiles, therefore theerrors that are induced in the derived surface densities are alsovery dissimilar in both their sign and magnitude. Appendix B: Single-scale spatial decompositionand standard deviations
Following the getold general approach, getsf employs succes-sive unsharp masking to decompose the prepared original images I λ (Sect. 3.1.1) into N S single scales, I λ j = G j − ∗ I λ − G j ∗ I λ , j = , , . . . , N S , (B.1)where G j are the circular Gaussian convolution kernels ( G is tobe regarded as the delta function) with progressively increasinghalf-maximum sizes, S j = f S j − , S = S f − , S min ≤ S j ≤ S max , (B.2)where f > f ≈ .
05) andthe limiting scales of the decomposition range are S min = max (2 ∆ , .
33 min λ ( O λ )) , S max = max λ (max (4 X λ , Y λ )) , (B.3)where ∆ is the pixel size. The first image I λ contains the con-tribution from all scales below S min , whereas the last image I λ N S does not contain the signals from the scales above S max , they areoutside the range of scales being analyzed. The convolution isdone with rescaling to conserve the total flux, hence the origi-nals I λ can be recovered by summation of the N S scales and allremaining largest spatial scales, I λ = N S (cid:88) j = I λ j + G N S ∗ I λ . (B.4)The spatial decomposition is illustrated in Fig. B.1 usingan example of a simple two-dimensional Gaussian shape P . Asdemonstrated in Papers I and II, the spatial decomposition hasmany useful properties. The filtered single-scale images con-tain signals from a relatively narrow range of spatial scales, andtheir properties resemble the Gaussian statistics much better thanthose of the originals, which are blends of all spatial scales. Onthe scales much smaller than the image size, the decomposed im-ages are well described by the global value of the standard devia-tion σ λ j . Significant departures from the Gaussian distribution insingle scales above a certain threshold (e.g., I λ j > ∼ σ λ j ) indicatethe presence of the real structures. The decomposition highlightsthe structures of a specific width in the decomposed images on amatching scale. For example, a resolved isolated circular sourcewith a half-maximum size H λ has its maximum brightness in I λ j on the scale S j ≈ H λ and a completely unresolved source pro-duces the brightest signal on the smallest spatial scales S j < ∼ O λ .Following the getold approach (Papers I and II), getsf em-ploys an iterative algorithm to determine the single-scale σ λ j over the entire usable area I λ j M λ of the image to separate thereal structures from other insignificant background or noise fluc-tuations. Before the iterations, the global σ λ j and the thresh-old (cid:36) λ j = σ λ j are computed over all pixels. At the first andall subsequent iterations ( i = , , . . . , N I ), significant peaks andhollows with | I λ j | ≥ (cid:36) λ ji − are masked. The absolute value istaken, because structures have both positive and negative coun-terparts in the decomposed images. Then getsf calculates a new(lower) σ λ ji value outside the masked areas and all structureswith | I λ j | ≥ (cid:36) λ ji are masked again. The iterations continue untilthe threshold converges to a stable value of (cid:36) λ ji , with corrections δ(cid:36) λ ji < σ λ j = (cid:36) λ j / σ λ = (cid:80) j σ λ j . The constant 3,chosen empirically, provides both suitable values of the resulting σ λ j values and good convergence of the iterations. Article number, page 35 of 37 & A proofs: manuscript no. getsf -0.13 -0.067 0.0002 0.067 0.13 0.2
D 8.4" relative errorsD 8.4" relative errors -0.13 -0.067 0.0002 0.067 0.13 0.2
D 18.2" relative errorsD 18.2" relative errors -0.13 -0.067 0.0002 0.067 0.13 0.2
D 36.3" relative errorsD 36.3" relative errors -0.067 -0.033 9.8e-05 0.033 0.067 0.1
T 8.4" relative errorsT 8.4" relative errors -0.067 -0.033 9.8e-05 0.033 0.067 0.1
T 18.2" relative errorsT 18.2" relative errors -0.067 -0.033 9.8e-05 0.033 0.067 0.1
T 36.3" relative errorsT 36.3" relative errors
Fig. A.1.
Relative accuracies (cid:15) of the hires surface densities and temperatures derived from Eq. (8) (Sect. 3.1.2) with respect to the true modelimages convolved to the same resolutions. The top row shows the errors in D (cid:48)(cid:48) ( σ = . D (cid:48)(cid:48) ( σ = . D (cid:48)(cid:48) ( σ = .
05) and the bottom rowshows the errors in T (cid:48)(cid:48) ( σ = . T (cid:48)(cid:48) ( σ = . T (cid:48)(cid:48) ( σ = . (cid:48)(cid:48) , the derived images are the most accurate,with the exception of the unresolved protostellar peak surface densities (Fig. 3), which become strongly overestimated (up to a factor of ∼ T { | | } along the lines of sight with large temperature gradients are underestimated. The range of displayed values isreduced for better visibility. Linear color mapping. A notable di ff erence with getold is that getsf does not needto correct the iterated thresholds using the higher-order statisticalmoments (skewness and kurtosis) because significant structuresare detected in accurately flattened detection images (Sect. 3.4),which ensures that the majority of pixels resemble a normal dis-tribution. Furthermore, precise σ λ j values are of relatively minorimportance for the separation of structural components becausethe separation is done in iterations and is based on the shapesthat are removed from the single-scale slices (Sect. 3.2.2), noton the σ λ j value itself. Appendix C: Software suite
The method has been developed as a bash script getsf that exe-cutes a number of FORTRAN utilities, doing all numerical com-putations. Linux or macOS systems with the ifort or gfortran compilers can be used to install the code. For reading and writ-ing images, getsf uses the cfitsio library (Pence 1999); for re-sampling and reprojecting images, it calls swarp (Bertin et al.2002); for convolving images, it uses the fast Fourier transformroutine rlft3 (Press et al. 1992); for determining the source coor-dinates α and δ , it applies xy2sky from wcstools (Mink 2002); and for a colored screen output, it uses the highlight utility (byAndré Simon) , if the latter is installed.The following list of the getsf utilities and scripts explainstheir purpose and functions. They are quite useful for command-line image manipulations, even if there is no need in a source orfilament extraction. Their usage information is displayed whena utility is run without any parameter. The utilities are sorted inthe decreasing sequence of their general usability outside getsf . modfits modify an image or its header in various ways:math transformations; profiling an image along aline; image segmentation; filament skeletonization;removal of connected clusters of pixels; addition orremoval of border areas; correction of saturated orbad pixel areas; conversion of intensity units;changes of the header keywords; etc . operate operate on two input images: addition, subtraction,multiplication, division; relative di ff erencing;minimization or maximization; extension orexpansion of masks; copying of an image header;computation of surface densities, temperatures, or Article number, page 36 of 37. Men’shchikov: Multiscale, multiwavelength extraction of sources and filaments: getsf -60 -40 -20 0 20 40 60
Angular distance ( † ) -1.0 0.0 1.0 2.0 3.0 4.0 I n t en s i t y P † S < S † S † S † S † S † S > Fig. B.1.
Single-scale spatial decomposition for an unresolved source P with a peak value of 100 and resolution O λ = (cid:48)(cid:48) into 99 scales between S min = (cid:48)(cid:48) and S max = (cid:48)(cid:48) , with the scale factor f = . S < S min to S max ), and of the largest scales ( G ∗ P ), outside thedecomposition range ( S > S max ). intensities; etc . imgstat display and / or save image statistical quantities;produce mode-, mean-, or median-filtered images;compute images of standard deviations, skewness,kurtosis; etc . ff tconv fast Fourier transform or convolve image with fewpredefined kernels or an external kernel image fitfluxes fit spectral shapes of source fluxes or image pixelintensities to derive masses or surface densities convolve convolve an image to a desired lower resolution resample resample and reproject an image with rotation hires high-resolution surface densities and temperatures prepobs convert observed images into the same pixel grid installg install getsf on a computer (macOS, Linux) iospeed test I / O speed of a hard drive for a specific image readhead display an image header or save selected keywords cleanbg interpolate background below source footprints ellipses overlay an image with ellipses of extracted sources sfinder detect sources in combined single-scale images smeasure measure and catalog properties of detected sources fmeasure measure and catalog properties of detected filaments finalcat produce the final catalogs of detected sources expanda expand masked areas of an image to its edges extractx extract all image extensions in separate images splitcube split a data cube into separate imagesThe code is automated, flexible, and user-friendly; it can bedownloaded from the website , the Astrophysics Source CodeLibrary , and it is also available from the author upon request.The website also contains a detailed User’s Guide and a com-plete validation extraction of sources and filaments in a smallimage for those who would like to verify that their installed getsf produces correct results. http://irfu.cea.fr/Pisp/alexander.menshchikov/ https://ascl.net/2012.001 References
Abràmo ff , M. D., Magalhães, P. J., & Ram, S. J. 2004, Biophotonics Interna-tional, 11, 36Acero, F., Katsuda, S., Ballet, J., & Petre, R. 2017, A&A, 597, A106André, P., Men’shchikov, A., Bontemps, S., et al. 2010, A&A, 518, L102 + André, P., Revéret, V., Könyves, V., et al. 2016, A&A, 592, A54Aniano, G., Draine, B. T., Gordon, K. D., & Sandstrom, K. 2011, PASP, 123,1218Arzoumanian, D., André, P., Didelon, P., et al. 2011, A&A, 529, L6 + Arzoumanian, D., André, P., Könyves, V., et al. 2019, A&A, 621, A42Berry, D. S. 2015, Astronomy and Computing, 10, 22Bertin, E., Mellier, Y., Radovich, M., et al. 2002, in Astronomical Society of thePacific Conference Series, Vol. 281, Astronomical Data Analysis Softwareand Systems XI, ed. D. A. Bohlender, D. Durand, & T. H. Handley, 228Black, J. H. 1994, in Astronomical Society of the Pacific Conference Series,Vol. 58, The First Symposium on the Infrared Cirrus and Di ff use InterstellarClouds, ed. R. M. Cutri & W. B. Latter, 355Bouwman, J. 2001, PhD thesis, University of AmsterdamClark, S. E., Peek, J. E. G., & Putman, M. E. 2014, ApJ, 789, 82Evans, Neal J., I., Dunham, M. M., Jørgensen, J. K., et al. 2009, ApJS, 181, 321Fesen, R. A., Weil, K. E., Cisneros, I. A., Blair, W. P., & Raymond, J. C. 2018,MNRAS, 481, 1786Hacar, A., Tafalla, M., Forbrich, J., et al. 2018, A&A, 610, A77Hennemann, M., Motte, F., Schneider, N., et al. 2012, A&A, 543, L3Hilditch, C. J. 1969, in Machine Intelligence, ed. B. Meltzer & D. Michie, Vol. 4,403–420Joye, W. A. & Mandel, E. 2003, in Astronomical Society of the Pacific Confe-rence Series, Vol. 295, Astronomical Data Analysis Software and SystemsXII, ed. H. E. Payne, R. I. Jedrzejewski, & R. N. Hook, 489Juvela, M. 2016, A&A, 593, A58Kirk, H., Hatchell, J., Johnstone, D., et al. 2018, ApJS, 238, 8Kirk, J. M., Ward-Thompson, D., Palmeirim, P., et al. 2013, MNRAS, 432, 1424Koch, E. W. & Rosolowsky, E. W. 2015, MNRAS, 452, 3435Könyves, V., André, P., Men’shchikov, A., et al. 2015, A&A, 584, A91Ladjelate, B., André, P., Könyves, V., et al. 2020, A&A, 638, A74Lane, J., Kirk, H., Johnstone, D., et al. 2016, ApJ, 833, 44Lee, J. C., Gil de Paz, A., Kennicutt, Robert C., J., et al. 2011, ApJS, 192, 6Li, S., Sanhueza, P., Zhang, Q., et al. 2020, ApJ, 903, 119Mack, J., Levay, Z. G., Christian, C. A., et al. 2015, Hubble
Heritage projectMcMullin, J. P., Waters, B., Schiebel, D., Young, W., & Golap, K. 2007, in As-tronomical Society of the Pacific Conference Series, Vol. 376, AstronomicalData Analysis Software and Systems XVI, ed. R. A. Shaw, F. Hill, & D. J.Bell, 127Men’shchikov, A. 2013, A&A, 560, A63, (Paper II)Men’shchikov, A. 2016, A&A, 593, A71Men’shchikov, A. 2017, A&A, 607, A64, (Paper III)Men’shchikov, A., André, P., Didelon, P., et al. 2010, A&A, 518, L103 + Men’shchikov, A., André, P., Didelon, P., et al. 2012, A&A, 542, A81, (Paper I)Mink, D. J. 2002, in Astronomical Society of the Pacific Conference Series, Vol.281, Astronomical Data Analysis Software and Systems XI, ed. D. A. Bohlen-der, D. Durand, & T. H. Handley, 169– + Molinari, S., Schisano, E., Faustini, F., et al. 2011, A&A, 530, A133 + Motte, F., André, P., & Neri, R. 1998, A&A, 336, 150Motte, F., André, P., Ward-Thompson, D., & Bontemps, S. 2001, A&A, 372, L41Motte, F., Nony, T., Louvet, F., et al. 2018, Nature Astronomy, 2, 478Motte, F., Zavagno, A., Bontemps, S., et al. 2010, A&A, 518, L77 + Nony, T., Motte, F., Louvet, F., et al. 2020, A&A, 636, A38Ntormousi, E. & Hennebelle, P. 2019, A&A, 625, A82Ossenkopf, V. & Henning, T. 1994, A&A, 291, 943Palmeirim, P., André, P., Kirk, J., et al. 2013, A&A, 550, A38Parravano, A., Hollenbach, D. J., & McKee, C. F. 2003, ApJ, 584, 797Pence, W. 1999, in Astronomical Society of the Pacific Conference Series, Vol.172, Astronomical Data Analysis Software and Systems VIII, ed. D. M.Mehringer, R. L. Plante, & D. A. Roberts, 487– + Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Nu-merical recipes in FORTRAN. The art of scientific computing (CambridgeUniversity Press, 2nd ed.)Ramsey, J. P., Clarke, D. A., & Men’shchikov, A. B. 2012, ApJS, 199, 13Rosolowsky, E. W., Pineda, J. E., Kau ff mann, J., & Goodman, A. A. 2008, ApJ,679, 1338Sanhueza, P., Contreras, Y., Wu, B., et al. 2019, ApJ, 886, 102Schisano, E., Rygl, K. L. J., Molinari, S., et al. 2014, ApJ, 791, 27Shimajiri, Y., André, P., Ntormousi, E., et al. 2019, A&A, 632, A83Smith, A. R. 1979, in SIGGRAPH’79: Proc. of the 6th annual conference onComputer graphics and interactive techniques (New York: ACM), 276–283Sousbie, T. 2011, MNRAS, 414, 350mann, J., & Goodman, A. A. 2008, ApJ,679, 1338Sanhueza, P., Contreras, Y., Wu, B., et al. 2019, ApJ, 886, 102Schisano, E., Rygl, K. L. J., Molinari, S., et al. 2014, ApJ, 791, 27Shimajiri, Y., André, P., Ntormousi, E., et al. 2019, A&A, 632, A83Smith, A. R. 1979, in SIGGRAPH’79: Proc. of the 6th annual conference onComputer graphics and interactive techniques (New York: ACM), 276–283Sousbie, T. 2011, MNRAS, 414, 350