A comparative analysis of denoising algorithms for extragalactic imaging surveys
V. Roscani, S. Tozza, M. Castellano, E. Merlin, D. Ottaviani, M. Falcone, A. Fontana
Astronomy & Astrophysics manuscript no. aanda © ESO 2020, September 7, 2020
V. Roscani, S. Tozza, M. Castellano, E. Merlin, D. Ottaviani, M. Falcone, and A. Fontana

INAF - Osservatorio Astronomico di Roma, Via Frascati 33, I-00040, Monteporzio, Italy. e-mail: [email protected]
Department of Mathematics, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
Department of Mathematics and Applications "Renato Caccioppoli", University of Naples Federico II, Via Cintia, Monte S. Angelo, I-80126 Naples, Italy
ABSTRACT
Aims.
We present a comprehensive analysis of the performance of noise-reduction ("denoising") algorithms, to determine whether they provide advantages in source detection by mitigating the noise on extragalactic survey images.
Methods.
The methods under analysis are representative of different algorithmic families: Perona-Malik filtering, Bilateral filter, Total Variation denoising, Structure-texture image decomposition, Non-local means, Wavelets, and Block-matching. We tested the algorithms on simulated images of extragalactic fields with resolution and depth typical of the Hubble, Spitzer, and Euclid Space Telescopes, and of ground-based instruments. After choosing their best internal parameter configurations, we assess their performance as a function of resolution, background level, and image type, also testing their ability to preserve the fluxes and shapes of the objects. Finally, we analyze in terms of completeness and purity the catalogs extracted after applying denoising algorithms on a simulated Euclid Wide Survey VIS image, and on real H160 (HST) and K-band (HAWK-I) observations of the CANDELS GOODS-South field.
Results.
Denoising algorithms often outperform the standard approach of filtering with the Point Spread Function (PSF) of the image. Applying Structure-texture image decomposition, Perona-Malik filtering, the Total Variation method by Chambolle, or Bilateral filtering on the Euclid-VIS image, we obtain catalogs that are both purer and more complete by 0.2 magnitudes than those based on the standard approach. The same result is achieved with the Structure-texture image decomposition algorithm applied to the H160 image. The relative advantage of denoising techniques with respect to PSF filtering increases with increasing depth. Moreover, these techniques better preserve the shape of the detected objects with respect to PSF smoothing.
Conclusions.
Denoising algorithms provide significant improvements in the detection of faint objects and enhance the scientific return of current and future extragalactic surveys. We identify the most promising denoising algorithms among the 20 techniques considered.
Key words.
Techniques: image processing – Methods: numerical, data analysis – Surveys
1. Introduction
Measuring the amount of photons that we receive from astronomical sources over a given range of wavelengths is the primary way to gather information about the Universe. Since the advent of digital photography in the 1980s, charge-coupled device (CCD) imaging has been one of the primary ways by which we do so. Currently, CCD devices can reach 100 million pixels, with read noise as low as one electron, almost 100% quantum efficiency, and sensitivity from the X-rays to the near-infrared (see e.g. Lesser 2015, for a review).

Before being ready for the extraction of meaningful scientific content, astronomical images must be processed to, for instance, combine different observations into a single mosaic; correct for flat-field, transients, artifacts, and defects; subtract a global or local background; etc. Once these preparatory steps are completed, the quality of the image mainly depends on its resolution (which is proportional to λ/D, the ratio between the observed wavelength and the diameter of the telescope, in the case of diffraction-limited instruments, e.g. space observatories, or is set by the atmospheric seeing for ground-based facilities), and on its depth (i.e. the magnitude at a given reference signal-to-noise ratio (SNR), which mainly depends on the duration of the observations, the exposure time). Since increasing the latter is often unfeasible or too demanding, searching for alternative methods to increase the SNR is important. A possible solution is the application of noise-reduction ("denoising") techniques.

Wavelet transforms are a standard and popular tool for denoising and source detection on astronomical images. The technique is extremely versatile, as it can be applied to a wide range of scientific cases (e.g. X-ray images: the XMM-LSS survey (Pierre et al. 2004) and the Fermi catalog (Ackermann et al. 2013; Principe et al. 2018); Cosmic Microwave Background maps (Starck et al. 2004); N-body simulations (Romeo et al. 2003); etc.).
Other interesting applications are summarized in Starck et al. (2006). Among the different possible implementations, the most widely used is the so-called "à trous" algorithm (Shensa 1992). This algorithm is an undecimated wavelet transform (UWT), which is also isotropic, and because of this it is very efficient for the detection of isotropic objects. This explains its popularity for astronomical image processing, where many objects are nearly isotropic (e.g. stars, galaxies, galaxy clusters) (Starck et al. 2014).

Concerning extragalactic optical/near-infrared imaging, images are typically convolved with a PSF-shaped kernel to enhance source detection (an application of the lemma by Neyman & Pearson 1933); this is the most standard example of a denoising algorithm, since filtering reduces the noise variance, allowing real sources to rise above the background. In many familiar cases, the typical PSFs of telescopes are quite similar to 2D Gaussians, making PSF filtering basically indistinguishable from Gaussian filtering. However, in several non-astronomical applications of image analysis this approach is often outclassed by other, more refined methods, designed to be more efficient and to better preserve the borders and edges of the sources.

A special mention is required for machine learning and deep learning techniques, lately becoming popular in the image processing field and in astronomy. An interesting application is proposed by Beckouche et al. (2013), where a sparse dictionary learning algorithm has been tested on multiple astronomical images. Autoencoder neural networks for image denoising also seem very promising (Vincent et al. 2008, 2010; Xie et al. 2012).
Some applications of autoencoders in different areas of astronomy can be found in the literature, even if they are primarily used for other purposes (e.g. spectral energy distribution recovery (Frontera-Pons et al. 2017), gravitational wave denoising (Shen et al. 2017), stellar cluster detection (Karmakar et al. 2018)). Although these techniques are undoubtedly highly appealing, the aim of this work is to perform a comprehensive comparison of "traditional" denoising algorithms based on non-linear partial differential equations (PDEs) and variational methods. The comparison with approaches based on machine learning will be developed in future works.

We compared several classes of denoising techniques, in order to find which ones yield the best improvements in source detection. To this aim, we have performed an extended set of tests. We considered many different noise-reduction algorithms, roughly belonging to the following families: Perona-Malik (PM) filtering, Bilateral filter, Total Variation (TV) denoising, Structure-texture image decomposition, Block-matching, Non-local means, and Wavelets. Note that the numerical methods employed range from variational methods to PDE-based techniques, also including some statistical methods.

We tested them using two different datasets. First, we focused on simulated images, created by state-of-the-art codes and prescriptions in order to mimic different realistic cases. This simplified environment has the advantage of allowing a detailed analysis of the results, since the "truth" is perfectly known. For real images, we applied the algorithms giving the best results on the simulated dataset, to check whether the improvement is confirmed. To our knowledge, this is the first attempt to extensively compare a large number of denoising algorithms in an astrophysical context. In general, the performance of any of these methods depends on the kind of noise that affects the image.
Here we are mainly interested in extragalactic imaging, and in particular we focus on the next generation of optical/near-infrared instruments and surveys such as Euclid (Laureijs et al. 2011), LSST (LSST Science Collaboration et al. 2009), DES (Dark Energy Survey Collaboration et al. 2016), and WFIRST (Spergel et al. 2015). The paper is organized as follows: in Sect. 2 we list and briefly describe all the denoising methods used in our tests, providing mathematical formulations and code information. In Sect. 3 we present our datasets. In Sect. 4 we describe and discuss all the tests we carried out, showing the results we obtained in Sect. 5. In Sect. 6 we apply the methods on real images from space- and ground-based facilities. Finally, Sect. 7 summarizes the main results, also discussing possible future applications. Throughout this paper we adopt the AB magnitude system (Oke & Gunn 1983) and a standard ΛCDM cosmology.
2. Denoising techniques
The focus of this paper is the comparison of different techniques proposed in the literature when they are applied to astronomical images. As said in the introduction, these images have specific features, so an efficient denoising method is crucial to extract the information contained in the image, and could also be used as a preliminary step for other image processing problems, such as image segmentation and/or deblurring.

In order to select the most efficient denoising approach for extragalactic survey images, we have analyzed different classes of methods, covering the main families of noise-reduction techniques, namely non-linear filtering (2.2), bilateral filter (2.3), TV denoising (2.4), image decomposition (2.5), wavelets (2.6), and non-local means (2.7). We briefly summarize the mathematical formulations of these techniques and we provide information on the codes for reproducibility.

Let us consider the intensity function I(x, y) of a noisy image, with (x, y) ∈ Ω, where Ω ⊂ R² is the reconstruction domain. Let I_clean be the desired clean image. An image with a Gaussian noise component is

I(x, y) = I_clean(x, y) + η,    (1)

where η ∼ N(µ, σ²) is the additive noise component. Of course, we want to reconstruct I_clean from I.

2.1. Gaussian filtering

This filter uses a Gaussian function for calculating the transformation to apply to each pixel in the image. Mathematically, applying a Gaussian filter to an image corresponds to convolving the image with a Gaussian function. Since the Fourier transform of a Gaussian is another Gaussian, applying a Gaussian smoothing has the effect of reducing the image's high-frequency components; a Gaussian filter is thus a low-pass filter.
In two dimensions, it is the product of two Gaussian functions, one in each dimension, so that the low-pass Gaussian filter is

G_σ(x, y) := (1 / (2πσ²)) exp( −(x² + y²) / (2σ²) ),    (2)

where x is the distance from the origin along the horizontal axis, y is the distance from the origin along the vertical axis, and σ is the standard deviation of the Gaussian distribution.

Filtering the image I_0 : Ω ⊂ R² → R with a "low-pass" Gaussian filter corresponds to processing it with the heat equation (Gabor 1965; Lindenbaum et al. 1994), that is, solving the following linear PDE:

∂I/∂t (x, y, t) = ΔI(x, y, t),   ∀(x, y, t) ∈ Ω × (0, T_C],
∂I/∂η (x, y, t) = 0,             ∀(x, y, t) ∈ ∂Ω × (0, T_C],
I(x, y, 0) = I_0(x, y),           ∀(x, y) ∈ Ω,    (3)

which has a diffusive effect on the initial datum I_0, for a small fixed time T_C > 0. The relation between the Gaussian filter (2) and problem (3) is that the solution of the heat equation is a convolution with the Gaussian filter, i.e.

I(x, y, t) = (G_σ ∗ I_0)(x, y),    (4)

with σ = √(2t).

It is well known that applying this filter does not preserve edges. This edge blurring is due to the isotropic diffusion. We can improve on it in three different ways:
1. modifying the heat equation (see Sect. 2.2);
2. making the convolution "nonlinear" (see Sect. 2.3);
3. defining an optimization problem (see Sect. 2.4).

Code information
In this paper we have used a simple Gaussian smoothing with a kernel that approximates a PSF of known Full Width at Half Maximum (FWHM) (Winkler 2011), referring to it as "PSF", whereas with "Gaussian" we refer to the Gaussian filter with internal parameter σ. We made use of the "gaussian_filter" routine implemented in the Python package Scipy (Jones et al. 2001; https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.gaussian_filter.html), with σ ≈ FWHM/2.355 pixels: defining the Gaussian kernel radius r² = x² + y², the kernel falls to half of its maximum at r = FWHM/2, so that FWHM = 2√(2 ln 2) σ ≈ 2.355 σ; see also https://brainder.org/2011/08/20/gaussian-kernels-convert-fwhm-to-sigma/ for further details.

2.2. Perona-Malik filtering

An improvement of the simple Gaussian filter is obtained by modifying the heat equation. Following the PM model (Perona & Malik 1990), we take large values of |∇I| as an indicator of the edge points of the image, in order to stop the diffusion at those points. In this way we move from an isotropic to an anisotropic diffusion as follows:

∂I/∂t = div(∇I)   ⇒   ∂I/∂t = div( g(|∇I|) ∇I ).    (5)

Equation (5) must be complemented with suitable boundary conditions (e.g. homogeneous Neumann boundary conditions) and an initial condition. Perona and Malik pioneered the idea of anisotropic diffusion and proposed two functions for the diffusion coefficient (also called edge-stopping functions):

g(|∇I|) := 1 / ( 1 + (|∇I|/K)² ),    (6)

g(|∇I|) := exp( −(|∇I|/K)² ),    (7)

where K is the gradient magnitude threshold parameter, which decides the amount of diffusion to take place.

We also consider three other edge-stopping functions that have been proposed after the original work by Perona and Malik. Black et al. (1998) proposed an edge-stopping function called Tukey's biweight function, defined as:

g(|∇I|) := (1/2) [ 1 − (|∇I|/(K√5))² ]²   if |∇I| ≤ K√5,
           0                               otherwise.    (8)

Guo et al. (2012) proposed the following function:

g(|∇I|) := 1 / ( 1 + (|∇I|/K)^α(|∇I|) ),    (9)

where

α(|∇I|) := 2 − 2 / ( 1 + (|∇I|/K)² ).    (10)

Weickert (1998) proposed:

g(|∇I|) := 1 − exp( −3.315 (K/|∇I|)⁴ )   if |∇I| ≠ 0,
           1                              otherwise.    (11)
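To make the scheme concrete, the anisotropic diffusion of Eq. (5) with the original edge-stopping functions (6)-(7) can be sketched in a few lines of numpy. This is an illustrative explicit finite-difference sketch, not the authors' C++ implementation; the parameter defaults (K, dt, n_iter) are assumptions chosen only for the example.

```python
import numpy as np

def perona_malik(img, n_iter=20, K=0.1, dt=0.2, exponential=True):
    """Explicit finite-difference sketch of Perona-Malik diffusion.

    exponential=True uses the edge-stopping function of Eq. (7),
    otherwise the rational function of Eq. (6). dt <= 0.25 keeps the
    four-neighbour explicit scheme stable.
    """
    if exponential:
        g = lambda d: np.exp(-((np.abs(d) / K) ** 2))       # Eq. (7)
    else:
        g = lambda d: 1.0 / (1.0 + (np.abs(d) / K) ** 2)    # Eq. (6)
    u = np.asarray(img, dtype=float).copy()
    for _ in range(n_iter):
        # edge-value padding approximates homogeneous Neumann conditions
        p = np.pad(u, 1, mode="edge")
        dN = p[:-2, 1:-1] - u    # one-sided differences towards the
        dS = p[2:, 1:-1] - u     # four nearest neighbours
        dE = p[1:-1, 2:] - u
        dW = p[1:-1, :-2] - u
        # g(|∇I|) ~ 1 in flat regions (diffusion acts), ~ 0 across edges
        u = u + dt * (g(dN) * dN + g(dS) * dS + g(dE) * dE + g(dW) * dW)
    return u
```

Gradients much smaller than K are smoothed almost as with the heat equation, while gradients much larger than K are left nearly untouched, which is exactly the edge-preserving behaviour discussed above.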
Code information
This method has been implemented by us in C++ and it is available at: https://github.com/valerioroscani/perona-malik.git.

2.3. Bilateral filter

The Bilateral filter is an edge-preserving denoising algorithm that was first introduced by Tomasi & Manduchi (1998). It is defined as (see also Banterle et al. 2012)

I_f(x) = (1/w) Σ_{x_i ∈ Ω} I(x_i) f_r(‖I(x_i) − I(x)‖) g_s(‖x_i − x‖),    (12)

where

w := Σ_{x_i ∈ Ω} f_r(‖I(x_i) − I(x)‖) g_s(‖x_i − x‖),    (13)

and
• I_f is the filtered image;
• I is the original input image to be filtered;
• x are the coordinates of the current pixel to be filtered;
• Ω is the window centered in x, so that x_i ∈ Ω is another pixel;
• f_r is the range kernel for smoothing differences in intensities (this function can be a Gaussian function);
• g_s is the spatial (or domain) kernel for smoothing differences in coordinates (this function can be a Gaussian function).

The filter averages pixels based on their spatial closeness and on their radiometric similarity. Spatial closeness is measured by the Gaussian function of the Euclidean distance between two pixels and a certain standard deviation (sigma_spatial). Radiometric similarity is measured by the Gaussian function of the Euclidean distance between two color values and a certain standard deviation (sigma_color).

Code information
We used the Python routine "denoise_bilateral" available in the Python package scikit-image. We noticed that, on our dataset, variations of sigma_spatial were less effective than variations of sigma_color; we therefore fixed sigma_spatial and tuned only sigma_color.

2.4. Total Variation denoising

Total-variation denoising (also known as total-variation regularization) is based on the principle that images with excessive and possibly spurious detail have a high total variation, defined as

TV(u, Ω) := ∫_Ω |∇u(x)| dx    (14)

for a function u ∈ C¹(Ω) (note that a similar definition can also be given for L¹ functions; Kolmogorov & Fomin 1957). According to this principle, TV denoising tries to find an image with lower TV under the constraint of being similar to the input image, a trade-off controlled by the regularization parameter; that is, it tries to minimize TV(I, Ω). This minimization problem leads to the Euler-Lagrange equation, which can be solved via the following evolutive problem:

u_t = ∂/∂x ( u_x / √(u_x² + u_y²) ) + ∂/∂y ( u_y / √(u_x² + u_y²) ) − λ(u − u_0),    (15)

for t > 0 and (x, y) ∈ Ω, with homogeneous Neumann boundary conditions and a given initial condition. TV denoising tends to produce "cartoon-like", that is, piecewise-constant, images. The concept was pioneered by Rudin, Osher, and Fatemi in Rudin et al. (1992) and is today known as the ROF model. TV denoising is remarkably effective at preserving edges whilst simultaneously smoothing away noise in flat regions, even at low SNRs.

Code information
We test the ROF method as proposed by Chambolle (Chambolle 2004), and TV denoising using split-Bregman optimization (Goldstein & Osher 2009; Getreuer 2012; Bush 2011). For the implementation of the two aforementioned methods we have used the Python routines "denoise_tv_chambolle" and "denoise_tv_bregman" belonging to the Python package scikit-image (van der Walt et al. 2014).

2.5. Structure-texture image decomposition

A general approach to the denoising problem is based on the assumption that an image I can be regarded as composed of a structural part u (i.e. the objects in the image) and a textural part v, which corresponds to the finest details plus the noise. Following the approach described in Aujol et al. (2006), such an image decomposition technique is based on the minimization of a functional with two terms, one based on the total variation and a second one on a different norm adapted to the texture component. Given an image I defined in a set Ω, and letting BV(Ω) be the space of functions with bounded total variation in Ω, we can decompose I into its two components by minimizing:

inf ( ∫_Ω |∇u(x)| dx + λ ‖v‖^p_X ),    (16)

where ‖·‖^p_X denotes the norm of a given space X and the infimum is taken among all functions (u, v) ∈ BV(Ω) × X such that u + v = I. The parameter p is a natural exponent, and λ is the so-called splitting parameter, which modifies the relative weights. The best decomposition is found at the λ for which the correlation between u and v reaches a minimum.

Code information
In Castellano et al. (2015), the authors proposed a C++ code named Astro-Total Variation Denoiser (ATVD), which implements three versions of the technique, based respectively on the TV-L1 (X = L¹(Ω)), TV-L2 (X = L²(Ω)), and TVG (X being a Banach space as defined in Aujol et al. 2006) norms. Two thresholds, called ε_corr and ε_sol, are defined and used in the stopping criteria of the algorithms: ε_corr defines the precision of the correlation algorithm, whereas ε_sol defines the precision of the method itself (e.g. TVL2, TVG, TVL1). For all our tests, we use the values of ε_corr and ε_sol suggested by the authors of Castellano et al. (2015).

https://scikit-image.org/docs/dev/api/skimage.restoration.html

Remark 1.
In the definition of the functional to be minimized one can also add a term taking into account some properties of the unknown image f corresponding to the available image g. This is a typical situation when the physical imaging process is modeled as a linear operator A acting from a Hilbert space X to a Hilbert space Y. In this approach X contains all the functions characterizing the unknown objects and Y contains the functions describing the corresponding measurable images, so that

g = A f.    (17)

A typical choice is X = Y = L²(Ω), and the inverse problem is then to minimize the functional

∫_Ω ‖A f − g‖² dx    (18)

over f ∈ X. This problem is often ill-posed, so a popular Tikhonov regularization is obtained by adding another term R(f) to the functional, getting

∫_Ω ‖A f − g‖² dx + µ R(f),    (19)

where µ is a positive parameter to be tuned carefully. The term R(f) can also be used to introduce a prior, e.g. the regularity of f (based on Schwartz's theorem) or the sparsity of f (choosing R(f) = ‖f‖₁). Imposing a morphological prior on the shapes, such as penalizing shapes different from ellipses, would require an enormous number of parameters in the case of astronomical images, which usually include several sky objects, and is a very challenging problem that goes beyond the scope of this paper. For the use of priors in other areas (e.g. in biomedical imaging), we refer the interested reader to Bertero & Piana (2006), and for a general introduction to inverse problems in imaging to the book by Bertero & Boccacci (1998).

2.6. Wavelets

The wavelet transform is the counterpart for images of the Fourier transform, and the wavelet domain is a sparse representation of the image that can be thought of similarly to the frequency domain of the Fourier transform (Valens 1999). Being a sparse representation means that most values are zero or near-zero, and truly random noise is represented by many small values in the wavelet domain.
Setting all values below some threshold to 0 reduces the noise in the image, but larger thresholds also decrease the detail present in the image.

Let us recall the relation introduced in Sect. 2.1,

I = I_clean + η,    (20)

where η is the noise and I_clean is the clean image (signal). The components of η are independent and identically distributed (iid) as N(0, σ²) and independent of I_clean. The goal is again to remove the noise, obtaining an approximation Î of I_clean that minimizes the mean square error (MSE),

MSE(Î) := (1/N) Σ_{j=1}^{N} (Î_j − I_clean,j)²,    (21)

where N is the number of pixels. Let us denote by Y = W I the matrix of wavelet coefficients of the image I, where W is the orthogonal wavelet transform operator; similarly, F = W I_clean and E = W η (see Vetterli & Kovačević (1995) and Mallat (2001) for more details on W). The wavelet transform is based on subbands (called details) at different scales, usually indexed by k ∈ K, K ⊂ N. The wavelet-thresholding method filters each coefficient Y_j from the detail subbands k ∈ K with a threshold function to obtain X̂. The denoised approximation is Î = W⁻¹ X̂, where W⁻¹ is the inverse wavelet transform. Two thresholding techniques are frequently used: the soft-threshold function

φ_T(x) := sgn(x) max(|x| − T, 0),    (22)

which shrinks the argument x towards 0 by the threshold T, and the hard-threshold function

ψ_T(x) := x · 1{|x| > T},    (23)

which sets the input to 0 if it is below (or equal to) the threshold T. Note that the thresholding procedure removes noise by thresholding only the wavelet coefficients of the corresponding detail subbands, keeping the low-resolution coefficients unaltered.

Code information
We consider the two thresholding methods defined in the Python routine "denoise_wavelet" (Chang et al. 2000; Donoho & Johnstone 1994): the first applies BayesShrink, an adaptive thresholding method that computes separate thresholds for each wavelet subband, as described in Chang et al. (2000); the second is VisuShrink, in which a single "universal threshold" is applied to all wavelet detail coefficients, as described in Donoho & Johnstone (1994). This threshold is designed to remove all Gaussian noise at a given σ with high probability, but it tends to produce images that appear overly smooth.

In this work we decided to test several methods based on wavelet transforms. We selected the Meyer wavelet described in Meyer (1990) with the VisuShrink thresholding method since, analyzing its application on our dataset, we found that it provides the best performance based on the analysis described in Sect. 4. The list from which we took the Meyer wavelet can be found in Lee et al. (2019). From now on, we will refer to this method as Orthogonal Wavelets.

We also consider other methods, based on multiscale wavelet decompositions (implemented in the C++ library called Sparse2D). The first is an isotropic UWT based on the à trous algorithm, better known as the "Starlet transform", where 5 wavelet scales and an iterative hard-thresholding technique are set; from now on, we will refer to this method as Starlet. The other two both use a biorthogonal UWT based on a set of filters (introduced for the JPEG 2000 compression standard; Starck et al. 2010) called "7/9": for the first one a σ-based threshold is set, whereas for the second one a multi-resolution Wiener filter is performed. From now on, we will refer to these two methods as "b-UWT(7/9)" and "b-UWT(7/9)+Wiener", respectively. The optimal configurations for the methods implemented in Sparse2D have been kindly suggested by the authors. We used the mr_filter program, setting for each method the options t (the type of multi-resolution transform), f (the type of filtering), and n (the number of scales):

• Starlet: default transform type, iterative hard thresholding, n = 5;
• b-UWT(7/9): default filtering;
• b-UWT(7/9)+Wiener: multi-resolution Wiener filtering.

For further details about these three methods and UWTs in general, see Starck et al. (2010).

2.7. Non-local means

The non-local means algorithm averages the value of a given pixel with the values of other pixels in a limited proximity, under the condition that the patches centered on those other pixels are similar enough to the patch centered on the pixel of interest. The algorithm is defined by the formula (Buades et al. 2005)

NL[I](x) = (1/C(x)) ∫_Ω exp( −g_{h,σ}(x, y) ) I(y) dy,    (24)

where

g_{h,σ}(x, y) := (G_σ ∗ |I(x + ·) − I(y + ·)|²)(0) / h²,    (25)

I is the original image, x ∈ Ω, C(x) is a normalizing constant, G_σ is a Gaussian kernel with σ denoting the standard deviation, and h acts as a filtering parameter. The algorithm has been found to have excellent performance when used to denoise images with specific textures.

We denote by size_I the image size in pixels, by size_p the size of the patch in pixels, by d_p the maximal distance in pixels at which to search for patches, and by n the number of image dimensions (n = 2 here). The computational complexity of the classic algorithm is proportional to size_I · (size_p · d_p)^n (Buades et al. 2005). A new "fast" version is now preferentially used, since its complexity is proportional to size_I · d_p^n (Darbon et al. 2008). Compared to the classic algorithm, in the fast mode the distances are computed in a coarser way: all pixels of a patch contribute to the distance to another patch with the same weight, no matter their distance to the center of the patch. This approach can result in a slightly poorer denoising performance. When the standard deviation σ is given, the method performs a more robust computation of the patch weights.
A moderate improvement in denoising performance can be obtained by subtracting the known noise variance from the computed patch distances, which improves the estimates of patch similarity (Buades et al. 2011).

Code information
In this work both the fast and slow versions of the algorithm are tested. After a first selection of patch sizes and distances through the analysis described in Sect. 4, we decided to fix size_p and to set d_p = 6. For our numerical tests we used the routine "denoise_nl_means", implemented in the Python package scikit-image.

2.8. Block-matching and 3D filtering

"Block-matching and 3D filtering" (BM3D) is a three-dimensional block-matching algorithm which "groups" similar 2D fragments of the image. The matching method finds fragments similar to a reference one, grouping together fragments closer than a defined threshold. The matched fragments are then stored in 3D arrays called groups. A "collaborative filtering" is applied to each group, consisting of a 3D linear transform, a shrinkage to reduce noise, and an inverse linear transform that produces 2D estimates of all the fragments. Once the estimates are obtained, they are aggregated to form an estimate of the whole image. For further details see Dabov et al. (2007).
Code information
In this work we tested a C++ implementation of BM3D available at https://github.com/gfacciol/bm3d. For performance and memory reasons, the input images have been cut into smaller overlapping tiles, re-aggregated into a single output image at the end of the process, as suggested by the authors.
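Several of the denoisers described in this section are taken directly from the scikit-image package named in the text ("denoise_bilateral", "denoise_tv_chambolle", "denoise_tv_bregman", "denoise_wavelet", "denoise_nl_means"). As a minimal sketch of how they can be driven side by side, the following helper runs each on a 2D float image in [0, 1]; the parameter values are illustrative assumptions, not the tuned configurations selected later in the paper.

```python
import numpy as np
from skimage.restoration import (denoise_bilateral, denoise_nl_means,
                                 denoise_tv_bregman, denoise_tv_chambolle,
                                 denoise_wavelet, estimate_sigma)

def run_denoisers(img, sigma_color=0.1, tv_weight=0.1):
    """Apply the scikit-image routines cited in this section to one
    2D float image in [0, 1]. All parameter values here are
    illustrative defaults, not the paper's best configurations."""
    sigma = estimate_sigma(img)  # rough noise-level estimate, reused below
    return {
        "bilateral": denoise_bilateral(img, sigma_color=sigma_color,
                                       sigma_spatial=1),
        "tv_chambolle": denoise_tv_chambolle(img, weight=tv_weight),
        "tv_bregman": denoise_tv_bregman(img, weight=5.0),
        # "dmey" is PyWavelets' discrete Meyer wavelet, matching the
        # "Orthogonal Wavelets" (Meyer + VisuShrink) method of the text
        "wavelet": denoise_wavelet(img, wavelet="dmey", method="VisuShrink",
                                   sigma=sigma, rescale_sigma=True),
        "nl_means": denoise_nl_means(img, patch_size=5, patch_distance=6,
                                     h=0.8 * sigma, sigma=sigma,
                                     fast_mode=True),
    }
```

Passing the estimated sigma to denoise_nl_means triggers the more robust patch-weight computation mentioned in Sect. 2.7, and fast_mode=True selects the coarser but cheaper distance computation.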
3. Test dataset
We first test the denoising algorithms on five different simulated images (Table 1), chosen so as to reproduce the properties of a wide range of typical cases in terms of resolution, depth, pixel scale, and wavelength:
• VIS: Euclid satellite visual band (wavelength: 550-900 nm);
• NIR H: Euclid satellite near-infrared H band (wavelength: 1372-2000 nm);
• EXT G: ground-based optical filter;
• H160: Hubble Space Telescope (HST) near-infrared F160W band (e.g. CANDELS-wide; Guo et al. 2013);
• IRAC: IRAC-Spitzer 3.6 µm channel.

From now on, we refer to the simulated images provided as input to the algorithms as "Original", whereas we refer to the simulated images representing the true sky, without noise included, as "Noiseless".
Table 1. Test images.

Filter        PSF-FWHM   Pixel Scale   Mag Lim (a)
              (arcsec)   (arcsec)
VIS           0.2        0.1           25.25
NIR H         0.3        0.3           24.01
EXT G         0.8        0.2           25.93
H160          0.15       0.06          27.23
IRAC          1.6        0.6           25.40
HUDF (H160)   0.15       0.06          28…/… (b)
Ks (HAWK-I)   0.4        0.1           24…/… (c)

(a): limiting magnitude at the reference SNR; (b): limiting magnitudes at the CANDELS and at the full HUDF depth, respectively; (c): images from Castellano et al. (2010), and from the HUGS survey (Fontana et al. 2014), respectively.

VIS and NIR H reproduce the expected features of the visual and near-infrared bands of the forthcoming ESA satellite Euclid (Laureijs et al. 2011), and EXT G simulates a typical ground-based complementary optical observation for the Euclid Wide Survey. H160 is modeled after the detection band of recent deep surveys such as CANDELS (Grogin et al. 2011; Koekemoer et al. 2011) and 3D-HST (Skelton et al. 2014), whereas IRAC simulates the features of the Spitzer Channel 1 band in the CANDELS GOODS-South field (Guo et al. 2013).

The images have been simulated with SkyMaker (Bertin 2009) on the basis of source catalogs generated by the Empirical Galaxy Generator (EGG) (Schreiber et al. 2017), and they have been perturbed by Gaussian noise in order to reach the limiting magnitudes reported in Table 1. All the PSFs are Gaussian, except for the IRAC case, where a real IRAC 3.6 µm channel PSF has been used. The H160 and HAWK-I images are real observations, whose tests are described in Sects. 6.1-6.2.

We can sort the simulated images in several different ways:
• depth, from the deepest to the shallowest: H160 > EXT G > IRAC > VIS > NIR H;
• PSF, from the sharpest to the coarsest: H160 > VIS > NIR H > EXT G > IRAC;
• pixel scale, from the smallest to the largest: H160 > VIS > EXT G > NIR H > IRAC.

For each simulated image, we cut three independent areas of the sky, which are the same for every band but differ in dimensions due to the different pixel scales. The regions are listed below:
• BG: centered on a big elliptical galaxy (see Fig. 1);
• CL: centered on a cluster of galaxies (see Fig. 2);
• CM: an average portion of the sky (see Fig. 3).

The three regions have dimensions of:
• VIS: 1000x1000 pixels;
• NIR H: 333x333 pixels;
• EXT G: 500x500 pixels;
• H160: 1666x1666 pixels;
• IRAC: 166x166 pixels.

After the analysis described in Sect. 4 and the discussion of the results in Sect. 5, additional tests on real images (see Table 1 for details), both ground-based (HAWK-I) and from space (HST), are reported and analyzed in Sect. 6.
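As a side note, the PSF FWHM and pixel-scale columns of Table 1 can be translated into the Gaussian smoothing σ (in pixels) of Sect. 2.1 via FWHM = 2√(2 ln 2) σ ≈ 2.355 σ. The helper name below is an illustrative assumption, not code from the paper:

```python
import math

def fwhm_to_sigma_pix(fwhm_arcsec, pixscale_arcsec):
    """Convert a PSF FWHM in arcsec into the Gaussian sigma in pixels
    expected by e.g. scipy.ndimage.gaussian_filter, using
    FWHM = 2*sqrt(2*ln 2)*sigma ~ 2.355*sigma."""
    fwhm_pix = fwhm_arcsec / pixscale_arcsec
    return fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))

# PSF FWHM and pixel scale (both in arcsec) from Table 1
bands = {
    "VIS": (0.2, 0.1), "NIR H": (0.3, 0.3), "EXT G": (0.8, 0.2),
    "H160": (0.15, 0.06), "IRAC": (1.6, 0.6),
}
for name, (fwhm, scale) in bands.items():
    print(f"{name}: PSF sigma = {fwhm_to_sigma_pix(fwhm, scale):.2f} pix")
```

For instance, the VIS PSF (0.2 arcsec FWHM on a 0.1 arcsec pixel scale) corresponds to a 2-pixel FWHM, i.e. σ ≈ 0.85 pixels.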
4. Quality tests
The idea at the basis of the analysis is to first evaluate the algorithms through different tests, in order to apply only the most promising ones (with their best configurations) on real images. We organize our analysis of the five simulated images in different levels of testing. A brief description of each step is given below:

1. As a first step, we compare the algorithms' performance through three metrics: the mean square error (MSE), the structural similarity (SSIM, Wang et al. 2004) and the CPU time. The MSE is defined as:

MSE := (1/N) Σ_{i=1}^{N} (x_i − x̂_i)²   (26)

where x_i is the i-th pixel in the denoised image and x̂_i is the i-th pixel in the original image (without noise). The SSIM is defined as:

SSIM := [(2 µ_x µ_y + c_1)(2 σ_xy + c_2)] / [(µ_x² + µ_y² + c_1)(σ_x² + σ_y² + c_2)]   (27)

where µ_x is the average of x, µ_y is the average of y, σ_x² is the variance of x, σ_y² is the variance of y, σ_xy is the covariance of x and y, and c_1 and c_2 are constants proportional to the dynamic range of the pixel values. The CPU time is the computational time required by the algorithms to filter the image. Through these tests we identify the main internal parameters of each algorithm and their ideal values.

2. We test the stability of the algorithms selected in the previous step when faced with non-stationary Gaussian noise. Furthermore, we test their performance as a function of the FWHM of the PSF and as a function of the background noise level.

3. We test the stability of the algorithms selected in the previous steps against variations of the main internal parameter value (identified in Step 1), measuring how the MSE varies as a function of the parameter value.
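Eqs. (26)-(27) translate directly into NumPy. The sketch below is illustrative: it evaluates the SSIM of Eq. (27) globally (single means, variances and covariance over the whole image), whereas library implementations such as scikit-image's structural_similarity evaluate it in local windows and average the result.

```python
import numpy as np

def mse(x, x_true):
    """Eq. (26): mean squared error between the denoised image x and the
    noiseless image x_true."""
    x, x_true = np.asarray(x, float), np.asarray(x_true, float)
    return np.mean((x - x_true) ** 2)

def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Eq. (27) evaluated globally; c1 and c2 are proportional to the
    dynamic range, with the conventional constants of Wang et al. (2004)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

For identical images, mse is exactly 0 and ssim_global is exactly 1; any perturbation lowers the SSIM below 1.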
Article number, page 6 of 30. Roscani et al.: A comparative analysis of denoising algorithms for extragalactic imaging surveys
Fig. 1. From left to right: crops of the BG (Big Galaxy) image central area for VIS, H160, NIR H, EXT G and IRAC.

Fig. 2. From left to right: crops of the CL (Cluster) image central area for VIS, H160, NIR H, EXT G and IRAC.

Fig. 3. From left to right: crops of the CM (Average field) image central area for VIS, H160, NIR H, EXT G and IRAC.
4. We test how the shapes of the objects are affected by the selected denoising algorithms, checking whether they preserve the FWHM of point-like objects, and the ellipticity and FWHM of the galaxy profiles.

5. We test the selected algorithms by studying two diagnostics, completeness and purity, which provide a quality estimate of the catalog produced after an ideal source detection, exploring a combination of SExtractor (Bertin & Arnouts 1996) detection parameters.

6. As a last step, we test whether the denoised images can also be used for photometry measurements, analyzing whether the object fluxes are preserved after denoising.
Finally, we apply the best performing algorithms of our selection on real images acquired from space and from ground-based telescopes, as described in Sect. 6.
We compare the different images, always following the same procedure described here:
• the Original image is scaled to the range [0, 1]
• the Original image is filtered by the denoising algorithm, providing the denoised image
• the denoised image is scaled back to [Original_min, Original_max], where Original_min and Original_max are the minimum and maximum values in the Original image, using the following equation:

x_i^Original = (Original_max − Original_min) ∗ x_i^[0,1] + Original_min   (28)

where x_i^Original is the i-th pixel in the Original image and x_i^[0,1] is the i-th pixel in the denoised image scaled to [0, 1]
• MSE and SSIM are computed by comparing the denoised image to the noiseless one.

In order to choose the best internal parameter for each denoising algorithm (a list of these parameters is in Sect. 5.1), we used different stopping criteria:
• ATVD: in this method the internal parameters are automatically optimized by the algorithm, and a stopping rule is already implemented, through a minimization problem, as described in Sect. 2.5
• Perona-Malik: in the PM code we have a stopping rule composed of three conditions. In the first one we compare at each time step MSE_n with MSE_{n−1}, where MSE_n is the MSE at the current time step and MSE_{n−1} is the MSE at the previous time step; the code continues running as long as MSE_{n−1} − MSE_n > 0. The second condition concerns the number of iterations n: the code continues running as long as the number of iterations does not exceed the maximum number of iterations NMAX. The third condition is |MSE_{n−1} − MSE_n| / MSE_{n−1} ≤ ε
• Sparse2D: Starlet and the two b-UWTs automatically estimate the noise background. For the optimal configurations of these three methods, we used the settings provided by the authors and reported at the end of Sect. 2.6. Further information can be found in the documentation file available with the code on the CosmoStat webpage
• other denoising algorithms: for all the other denoising algorithms we get the optimal values of the main parameter(s) by minimizing the MSE with an iterative process. The stopping rule is reached when |MSE_{n−1} − MSE_n| ≤ ε, as long as the number of iterations is lower than the maximum number of iterations NMAX.

The scaling is required only by the PM methods, which need values between 0 and 1 to work, and by Bilateral, which needs only non-negative values. The scaling step has been introduced for the two methods mentioned above and applied to all the methods for comparison reasons only. We verified that for the other algorithms the results do not change significantly if the scaling is not applied.
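The scaling of Eq. (28) and the generic MSE-based stopping rule can be sketched as follows. This is a minimal illustration in which denoise_step stands for any iterative method, and the values of eps and nmax are placeholders rather than the ones used in the paper.

```python
import numpy as np

def scale01(img):
    """Scale an image to [0, 1]; return the scaled image and the original range."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo), lo, hi

def scale_back(img01, lo, hi):
    # Eq. (28): x_Original = (Original_max - Original_min) * x_[0,1] + Original_min
    return (hi - lo) * img01 + lo

def iterate_until_stable(noisy01, truth01, denoise_step, eps=1e-6, nmax=100):
    """Apply denoise_step repeatedly; stop when the MSE against the noiseless
    image changes by no more than eps, or after nmax iterations."""
    img = noisy01.copy()
    prev_mse = np.mean((img - truth01) ** 2)
    for _ in range(nmax):
        img = denoise_step(img)
        cur_mse = np.mean((img - truth01) ** 2)
        if abs(prev_mse - cur_mse) <= eps:
            break
        prev_mse = cur_mse
    return img
```

The round trip scale01 followed by scale_back recovers the original pixel values exactly (up to floating-point error), so the scaling does not itself alter the images being compared.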
5. Results
In this section we analyze and comment on the results of the quality tests sequentially, following the same order of the steps used in Sect. 4.
In this test we use three metrics to constrain the performance of the denoising methods: MSE, SSIM and CPU time (Sect. 4). We give priority to those algorithms that are able to minimize the MSE as much as possible, preferring the fastest method (in terms of CPU time) and the highest SSIM in case of comparable MSE. Following this criterion, in this step we identify the best configuration and the main parameters for every algorithm. These results are taken into account separately for all the simulated images presented in Sect. 3.

The main internal parameters identified for the different algorithms are listed below:
• Orthogonal Wavelets: sigma - the noise standard deviation used to compute the threshold(s)
• NL means: h - cut-off distance in grey levels
• TV Bregman: weight - denoising weight, efficiency of denoising
• TV Chambolle: weight - denoising weight, efficiency of denoising
• Gaussian: sigma - standard deviation of the Gaussian kernel
• Bilateral: sigma_color - standard deviation for the gray value distance
• Perona-Malik: T - number of iterations of the anisotropic diffusion
• ATVD (TVL1, TVL2, TVG): λ - structure-texture splitting parameter
• BM3D: sigma - the noise standard deviation.

Further details on the algorithms implemented in Python and on the measurement of MSE and SSIM can be found in the scikit-image documentation. The method used to identify the best internal parameter for each algorithm is described in Sect. 4.1. In Appendix A we show the best MSE and CPU time values of every algorithm for the different crops. The tables are organized to record the best MSE and CPU time values obtained with the algorithms. The columns represent the different simulated image filters, and the value indicated in bold is the lowest of the column. Tables A.1-A.2-A.3 contain the MSE values for the crops BG, CM and CL, respectively. Table A.4 contains the CPU time values for the crop CM, after fixing the optimal internal parameters. We recall that in the following "PSF filtering" amounts to filtering with a Gaussian whose FWHM is the same as the PSF-FWHM, whereas in the case of the "Gaussian filtering" the σ (and thus the FWHM) is a free internal parameter. We shortly summarize here the main results:
• TVL2, BM3D, Starlet, the two b-UWTs, PM, NLmeans slow and TV Chambolle always yield good performance, typically providing the lowest values of MSE
• TVL2, BM3D, Starlet, the two b-UWTs, PM, NLmeans slow and TV Chambolle always perform better than Gaussian filtering, with the only exception of the IRAC image (we discuss the IRAC situation below, in Sect. 5.2)
• the MSE of all the methods is proportional to the pixel scale of the image, so that low sampling implies worse results
• in most cases (with the exception of IRAC, which we discuss below), the PSF filtering provides a larger (i.e. worse) value of the MSE compared to the one provided by Gaussian filtering
• in some cases, the MSE of the denoised image is larger (i.e. worse) than the one measured without denoising the image at all. Indeed, in the situations listed below some algorithms tend to over-smooth the image, providing a worse MSE. This occurs:
a) in the VIS (CM) image, in the case of the PSF filtering
b) in all the H160 images, for both the PSF filtering and TV Bregman
c) 2-4 times in the NIR H images, for the methods NLmeans fast, Orthogonal Wavelets, TV Bregman and PSF filtering, and only once for BM3D
d) only once in EXT G (CM), for the PSF filtering
e) 4 to 5 times in the IRAC images, for NLmeans slow, NLmeans fast, TV Bregman, Orthogonal Wavelets, and PSF filtering
• the SSIM ranking typically reflects the MSE ranking, pointing out the same group of best algorithms found in the MSE ranking; even if some positions are swapped in a few cases, the SSIM values provided by the best algorithms are comparable (the ∆SSIM among them is negligible)
• the CPU time table (Table A.4) shows that Gaussian is the fastest algorithm among the ones we tested, followed by PSF and TV Bregman. The CPU times of the other algorithms differ by 1 to 4 orders of magnitude with respect to the Gaussian algorithm. However, the computational times are always manageable, at least for the cases of optimal performance.

If we focus on the algorithms belonging to the same classes of methods, we can note that:
1. all the PM methods yield similar performance (see Fig. A.1), and therefore we choose to only keep g = g1, with the parameter k fixed to its optimal value, in the following steps
2. TVL2 performs clearly better than TVG and TVL1. In fact, e.g. in BG, its 1 − mse/mse_Original value is always within 5% of the value provided by the original image (no noise), with the exception of IRAC, where it drops to 0.2, which is still greater than the values provided by TVG and TVL1, as shown in Fig. A.2
3. NLmeans slow performs slightly better than NLmeans fast for H160, VIS, and EXT G (the 1 − mse/mse_Original differences are within 5% in favor of NLmeans slow) and much better for NIR H and IRAC (where NLmeans fast performs worse than Original). See Fig. A.3
4. TV Chambolle performs better than TV Bregman: in H160, NIR H, and IRAC, TV Bregman performs worse than Original, whereas for VIS and EXT G it performs 14% and 3% worse than TV Chambolle, respectively (see Fig. A.3)
5. Bilateral is always among the best performing techniques (see Fig. A.3)
6. BM3D, the two b-UWTs and Starlet are always among the most efficient algorithms (see Fig. A.4 and Tables A.1-A.2-A.3 in Appendix A).

Nevertheless, we keep Gaussian and PSF filtering for reference, since they are widely used. Hence, at the end of this first step we are left with 11 methods: PM with edge-stopping function g1, TVL2, BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener, Gaussian, PSF, NL means slow, Bilateral, and TV Chambolle.
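The retained Perona-Malik variant uses the exponential edge-stopping function g1. A minimal explicit-scheme sketch of this kind of anisotropic diffusion is given below; the values of k and of the time step dt, and the periodic border handling via np.roll, are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def g1(grad, k):
    # Exponential edge-stopping function: diffusivity drops where |grad| >> k,
    # so strong edges are preserved while flat regions are smoothed.
    return np.exp(-(grad / k) ** 2)

def perona_malik(img, n_iter=20, k=0.1, dt=0.2):
    """Explicit 4-neighbour Perona-Malik diffusion (illustrative sketch;
    periodic borders for brevity)."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        # one-sided differences towards each neighbour
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # weighted flux update: each difference is scaled by g1(|difference|)
        u += dt * (g1(dn, k) * dn + g1(ds, k) * ds +
                   g1(de, k) * de + g1(dw, k) * dw)
    return u
```

Since g1 ≤ 1, this explicit 4-neighbour scheme is stable for dt ≤ 0.25; the number of iterations T plays the role of the main internal parameter listed above.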
Following our experimental analysis, we decide to discard 9 algorithms: 4 PM methods, TVG and TVL1, NL-means fast, TV Bregman, and Orthogonal Wavelets.
We note that the IRAC images do not follow the same trends as the other bands. In fact, whereas for all the other images there is always a small group of algorithms which perform better than all the others, for IRAC nearly all the denoising algorithms tend to have similar performance. We investigated the possibility that the number of pixels was not enough (166x166 pixels) to extract significant conclusions from these images, and we tested the algorithms on an IRAC (CM) simulation with a pixel scale of 0.06 arcsec and a size of 1000x1000 pixels. After this small test we point out that again the IRAC band does not follow the trend defined in the other bands (even if the MSE decreases for all the methods and for the Original image), but with the increased number of pixels TV Chambolle, NL means slow and Gaussian are the algorithms which provide the best performance, followed by BM3D, b-UWT(7/9), PM (g = g1) and PSF, and then by Bilateral, b-UWT(7/9) + Wiener, Starlet and Orthogonal Wavelets. The low resolution of IRAC plays a fundamental role here, impacting the performance of most of the algorithms. This aspect of the algorithms will be described later on, in Sect. 5.4.
In this section we discuss the results of tests on images with varying depths, i.e. images obtained combining regions observed with different exposure times. To build the dataset, we used a noiseless simulated image I_exp, which we mirrored along the x-axis to obtain a new image with two identical vertical halves; then, we added Gaussian noise with σ = σ_VIS on the lower half H_l and with σ = 2σ_VIS on the upper half H_u. In this way, the two halves of the image contain the very same objects with a different amount of observational noise, as if they had been observed with different exposure times. We applied the algorithms in their optimal configuration (see Sect. 5.1) on the mirrored versions of the crops VIS (CM) and VIS (CL), and we calculated the MSE and SSIM in H_l and H_u for both. We then compared these results with the ones obtained by applying the algorithms on the mirrored image with stationary Gaussian noise of σ_VIS and 2σ_VIS; we refer to these images as I_σ and I_2σ, respectively. From the results reported in Tables B.1 and B.2 in Appendix B, obtained on VIS (CM) and VIS (CL) respectively, we notice that:
• all the algorithms applied on H_l produce MSE values of the same order of magnitude as the ones obtained with I_σ
• nearly all of them applied on H_u produce MSE values of the same order of magnitude as the ones obtained with I_2σ, with the exception of Starlet and the two b-UWTs, for which the application on H_u produces an MSE around one order of magnitude larger than the respective MSE obtained with I_2σ
• the MSE relative variation is small for BM3D, PSF, Gaussian, PM and TVL2
• the MSE relative variation is somewhat larger for Bilateral, NL means, and TV Chambolle
• the MSE relative variation is largest for Starlet and the two b-UWTs
• for all the algorithms, the SSIM relative variation is negligible.

We want to point out that the MSE achieved by methods like Starlet and the two b-UWTs on H_l is very small in absolute terms, so that even their larger relative variation still implies small absolute variations. Finally, we can conclude that the algorithms tested are not influenced (or only negligibly influenced in a few cases) by non-stationary Gaussian noise.

In the second part of this test, we compare the performance of the 11 algorithms with respect to the variation of the FWHM and of the depth of the images. We consider two cases:
• a 1000x1000 pixels crop of the simulated VIS image convolved with different kernels, to degrade the resolution by increasing the FWHM without changing the depth of the image
• a 1000x1000 pixels crop of the simulated H160 image whose depth we decreased without changing the FWHM, by adding to the Noiseless image Gaussian noise with standard deviation σ increased by several factors, up to 30 and 40 times the original one.

The plots summarizing the results are shown in Figs. C.1-C.6 in Appendix C. We can note that:
1. the MSE calculated on the original image alone decreases at increasing FWHM, due to the loss of information (i.e. small objects and details). All the algorithms follow this trend while lowering the MSE even more, due to the effect of filtering (see Fig. C.1)
2. the ratio between the MSE obtained by each algorithm and the MSE computed on the original image (mse/mse_Original) increases at increasing FWHM, with the only exception of the Gaussian filtering, which instead follows the opposite trend (see Fig. C.2)
3. mse/mse_PSF is weakly affected by variations of the FWHM for most of the denoising methods, with the exception of Gaussian (see Fig. C.3)
4. as expected, the MSE increases at increasing background level (due to the increase of σ of the Gaussian noise), both in the original image and in the output denoised images, for all the algorithms (see Fig. C.4)
5. mse/mse_PSF increases at increasing background level for all the methods (see Fig. C.5)
6. mse/mse_Original decreases at increasing background level for all the methods (see Fig. C.6).

Summarizing, we conclude that the best performance of any denoising algorithm is obtained on images with low SNR and high resolution (narrow FWHM). The best performance with respect to the PSF method is obtained by applying the denoising methods on images with high SNR, regardless of the PSF-FWHM. These results can be used to estimate the efficiency of the denoising algorithms in different situations, pointing out that when applied to high resolution images they provide the best improvements, whereas if applied to low SNR images (where their performance peaks), the improvements compared to the PSF are slightly less significant. From these results, it would be very interesting to apply these methods, as an alternative to PSF filtering, on images with high resolution and high SNR.

In the next test we analyze the selected methods by varying the values of those internal parameters that had been kept fixed to the optimal ones in the previous tests. The goal is to understand whether
Fig. 4. Step 3: Stability against variations of the parameters. Each curve corresponds to a denoising algorithm. We plot the MSE against the relative variation of the parameters, (par_min − par)/par_min. The absolute minimum of each curve is reached at 0 on the x-axis, corresponding to the ideal value of the parameter. In the upper panel we report the standard deviations σ of the mse_mean − mse distribution for each method.

the performance of the algorithms is stable against sub-optimal parameter settings. We exclude from this analysis the three denoising methods belonging to the Sparse2D package, for which we simply used the configuration provided by the authors, reported at the end of Sect. 2.6, and the PSF filtering, since it is just a particular case of the Gaussian filtering method. We perform the test on the VIS (CM) image, and we change the main parameter value of each technique by several fixed fractions, up to ±50% and ±75%, with respect to the value used for the MSE analysis (see Subsect. 5.1). The results are shown in Fig. 4. We notice that most of the techniques tend to have similar performance when over-estimating the parameters, remaining relatively stable; on the contrary, under-estimating them significantly worsens the performance. However, all the algorithms (with the exception of BM3D) have a lower dispersion in MSE compared to the Gaussian filtering (this is not evident in the plot because of the logarithmic y-axis scale, but we verified it numerically and give the σ values in the upper panel of the plot), meaning that they are generally more stable against variations of the parameters.

The optimal denoising approach should not significantly alter the size and shape of the detected sources, so as to enable a meaningful scientific analysis. We thus tested the selected methods by measuring the FWHM of the detected sources with SExtractor, and comparing the measured values to the ones obtained on the original, unfiltered images. We perform this test on the simulated VIS image described before, which is mainly populated by galaxies, and on a specific rendition of the simulated VIS image populated by stars distributed on a grid. The results are shown in Figs. 5-6. Whereas for the stars in Fig. 5 the PSF filtering tends to broaden all the detected objects, for the galaxies in Fig. 6 it causes again a small offset; most of the other methods tend to better preserve the FWHM.

Fig. 5.
Step 4: FWHM conservation test on stars. On the x-axis we plot FWHM_denoised/FWHM_noiseless, where FWHM_noiseless is the FWHM of the objects measured on the Noiseless image. µ and σ are the mean and the standard deviation of the distribution of FWHM_denoised/FWHM_noiseless.

For a comprehensive comparison, another quantity has been taken into consideration. The ellipticity of the galaxies has been measured before and after the application of the denoising algorithms, using the parameter ELLIPTICITY from SExtractor. Looking at Fig. 7, it is possible to notice that most of the algorithms do not modify the ellipticity at an alarming level (even if in some cases the dispersion is far from optimal, e.g. for Bilateral and TV Chambolle). Nevertheless, the PSF is one of the best performing methods for this test.

From these tests, we can conclude that most of the tested algorithms preserve the shape of the sources similarly (in the case of the ellipticity) or even better (in the case of the FWHM) than the PSF filtering.
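The summary statistics of this shape-preservation test (the mean µ and dispersion σ of the FWHM ratio and of the ellipticity difference over matched sources) can be reproduced schematically as follows; the input arrays are assumed to come from matched catalogs of the same objects, which is not reproduced here.

```python
import numpy as np

def shape_preservation_stats(fwhm_denoised, fwhm_noiseless,
                             e_denoised, e_noiseless):
    """mu and sigma of FWHM_denoised/FWHM_noiseless and of
    e_denoised - e_noiseless, over a matched list of sources
    (illustrative arrays, not actual SExtractor output)."""
    ratio = np.asarray(fwhm_denoised, float) / np.asarray(fwhm_noiseless, float)
    de = np.asarray(e_denoised, float) - np.asarray(e_noiseless, float)
    return {
        "fwhm_ratio_mu": ratio.mean(),
        "fwhm_ratio_sigma": ratio.std(),
        "ellipticity_diff_mu": de.mean(),
        "ellipticity_diff_sigma": de.std(),
    }
```

A perfectly shape-preserving method would give a ratio distribution with µ = 1 and σ = 0, and an ellipticity difference with µ = 0; systematic smoothing shows up as µ > 1 in the FWHM ratio.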
In this test - perhaps the crucial one - we checked the quality of the catalogs of sources extracted from the denoised images. We analyze two diagnostics, both relevant to assess the performance of the detection process: namely, the completeness and the purity, as defined below. We extract the catalogs running SExtractor in dual image mode, using a denoised image as detection band and the original image as measurement band, so as to perform a cross-correlation between the extracted and the true catalogs of sources, both in terms of position and flux.

We used the simulated VIS 5000x5000 pixels image, searching for the best SExtractor parameters configuration for every denoised image. We have thus tested a large number of possible combinations of the two parameters which control the detection, i.e. DETECT_THRESH (from a minimum value of 0.2 to a maximum of 6.0, with steps of 0.1) and DETECT_MINAREA (with values: 3, 6, 9, 12, 15, 30), considering only the combinations for which the quantity DETECT_THRESH ∗ √DETECT_MINAREA is above a fixed threshold in units of the background σ. The best detection configurations found in this way do not differ significantly from the best configurations found in the VIS (CM) image by MSE minimization (Sect. 4).

We introduce some notation:
• n_detected is the total number of detected objects, which includes both real and spurious detections indiscriminately
• n_simulated is the number of simulated objects in the image
• n_spurious is the number of spurious detections, as identified by the spurious sources identification approach described in the following.

The spurious sources identification approach that we define for this work is related to the SExtractor cross-correlation, when an association catalog is provided: we denote by C_Rassoc the circle centered on the original position of the simulated object, with radius R_assoc, which is the maximal distance allowed for the association made by SExtractor. We set it to 6 pixels (i.e. 3 × FWHM). Then, we tag an object as spurious if one of the following two conditions holds:
• it is outside C_Rassoc
• it is inside C_Rassoc, but |mag_measured − mag_simulated| and |mag_aperture − mag_simulated| exceed a fixed magnitude tolerance,
where mag_measured is the SExtractor MAG_AUTO (an estimation of the total magnitude of the source), mag_simulated is the true magnitude of the simulated object, and mag_aperture is the SExtractor
MAG_APER corresponding to the magnitude within a circular aperture with a diameter of 6 pixels.

Finally, we can now define the two diagnostics:

completeness := (n_detected − n_spurious) / n_simulated   (29)

purity := 1 − n_spurious / n_simulated   (30)

where purity = purity_assoc, determined by the association approach defined above. We measure completeness and purity in 0.2 magnitude bins. In Fig. 8 we plot the magnitude at which the completeness drops below 50% against the one at which the purity drops below 90%. Each symbol corresponds to a different denoising technique, and repetitions of the same symbol correspond to different combinations of the detection parameters for the same algorithm. For readability, a maximum of 5 combinations per algorithm (corresponding to the best ones) are shown in the figure.

We note that all the methods improve the detection. BM3D performs like the PSF, whereas methods like Starlet and b-UWT(7/9) + Wiener produce remarkable results, with a 0.2 magnitude increment in completeness with respect to the PSF, but the best performance is reached by TVL2, Perona-Malik, TV Chambolle, and Bilateral. Indeed, these 4 methods reach the completeness threshold 0.6 magnitudes deeper, and the purity threshold 0.8 magnitudes deeper, than the non-denoised run. Moreover, they improve the detection compared to the PSF smoothing, reaching 0.2 magnitudes deeper in both completeness and purity.

It is tempting to compare the MSE and SSIM measured on the VIS images used for the completeness and purity analysis with these diagnostics, searching for a possible correlation between the metrics and the diagnostics. The plots in Fig. 9 are produced using the results shown in Fig. 8. We find no or weak correlation between MSE (SSIM) and purity, whereas a stronger correlation exists between MSE (SSIM) and completeness.
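The diagnostics of Eqs. (29)-(30) and the threshold magnitudes plotted in Fig. 8 can be computed along these lines. This is an illustrative sketch operating on pre-binned counts, with toy numbers in place of the real catalog matching, which is not reproduced here.

```python
import numpy as np

def diagnostics(n_detected, n_simulated, n_spurious):
    """Eqs. (29)-(30) per magnitude bin, with the spurious fraction
    taken relative to the simulated counts as written in the text."""
    n_det = np.asarray(n_detected, float)
    n_sim = np.asarray(n_simulated, float)
    n_spu = np.asarray(n_spurious, float)
    completeness = (n_det - n_spu) / n_sim   # Eq. (29)
    purity = 1.0 - n_spu / n_sim             # Eq. (30)
    return completeness, purity

def threshold_mag(mag_bins, values, floor):
    """Magnitude of the first bin (scanning from bright to faint) where the
    diagnostic drops below `floor`; None if it never does."""
    for m, v in zip(mag_bins, values):
        if v < floor:
            return m
    return None

# Toy example: five 0.2-mag bins whose counts degrade towards faint magnitudes
mags = np.arange(24.0, 25.0, 0.2)
comp, pur = diagnostics(n_detected=[90, 85, 70, 40, 20],
                        n_simulated=[100, 100, 100, 100, 100],
                        n_spurious=[2, 4, 8, 15, 30])
m50 = threshold_mag(mags, comp, 0.5)   # completeness drops below 50%
m90 = threshold_mag(mags, pur, 0.9)    # purity drops below 90%
```

A point in Fig. 8 then corresponds to one (m90, m50) pair for a given denoising method and detection-parameter combination.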
Fig. 6. Step 4: FWHM conservation test on galaxies. On the x-axis we plot the FWHM of the objects measured on the Noiseless image, FWHM_noiseless, whereas on the y-axis we plot the FWHM measured on the Original image after the application of the denoising algorithms, FWHM_denoised. µ and σ are the mean and the standard deviation of the distribution of FWHM_denoised − FWHM_noiseless.

Fig. 7. Step 4: Ellipticity conservation test on galaxies. On the x-axis we plot the ellipticity e of the objects measured on the Noiseless image, e_noiseless, whereas on the y-axis we plot the e measured on the Original image after the application of the denoising algorithms, e_denoised. µ and σ are the mean and the standard deviation of the distribution of e_denoised − e_noiseless.

We show the snapshots of a sample of objects detected by the different methods in the VIS image in Appendix D. These snapshots give a visual match of objects detected in the denoised images. We only show the results of the best performing algorithms compared to the Original, PSF-filtered and Noiseless images. For VIS the algorithms are: PM, TVL2, Bilateral, and TV Chambolle, followed by BM3D, Starlet, b-UWT(7/9) and b-UWT(7/9) + Wiener.
In this final test, we compare the total fluxes (using SExtractor MAG_AUTO) measured on the simulated denoised images with the true input fluxes for objects with magnitudes between 19 and 23. We do not consider the Gaussian filtering here but only the PSF filtering, since the latter is the reference method for this analysis. The results are shown in Figs. 10-11; the mean and standard deviation of the difference between measured and true magnitudes are reported for each method in the figures.
Fig. 8. Step 5: Completeness and purity test. We extracted catalogs from the VIS simulated image processed with the denoising algorithms, using different configurations of SExtractor. We plot the magnitude at which the completeness drops below 50% against the magnitude at which the purity drops below 90%. Each symbol corresponds to a different denoising method, which can be present multiple times in the plot due to different combinations of detection parameters. The positions of the symbols are slightly randomized to improve readability.

Fig. 9.
Step 5: Correlation between MSE or SSIM and purity or completeness. On the x-axis we plot the magnitudes at which completeness (purity) reaches 50% (90%), whereas on the y-axis we plot the metrics MSE or SSIM. Dashed lines are the linear best fits.
6. Test on real images
After having analyzed the performance of the denoising techniques on a series of simulated images, testing their behavior with stationary and non-stationary Gaussian noise, we test the algorithms on real images, using the HST H160 observations of the GOODS-South Field and a crop of the HAWK-I survey. Notably, the analysis of real observations provides us with a straightforward test of more realistic situations, in particular concerning the presence of non-Gaussian and/or correlated noise.

Fig. 10. Step 6: Flux conservation distribution for objects with magnitudes between 19 and 23. On the x-axis, the real object magnitude mag_real; on the y-axis, the difference between the measured magnitude MAG_AUTO and mag_real. Only the detected objects within the purity and completeness thresholds (Sect. 5.7) are considered. µ and σ are the distribution mean and standard deviation values, respectively.

Fig. 11. Step 6: Flux conservation distribution for objects with magnitudes between 19 and 23. On the x-axis, the difference between the measured magnitude MAG_AUTO and the real object magnitude from the catalog (mag_real); on the y-axis, the MAG_AUTO − mag_real probability distribution function. Only the detected objects within the purity and completeness thresholds (Sect. 5.7) are considered. µ and σ are the distribution mean and standard deviation values.

We use two images of the area of the Hubble Ultra Deep Field: one at the full depth, released with the official CANDELS mosaics, which includes all WFC3 observations of that region (HUDF09), and one at the shallower CANDELS depth (GS-DEEP) (Koekemoer et al. 2011; Grogin et al. 2011). We will use the former, deeper image as "true sky", against which we will compare the performance of the denoising techniques on the shallower image. Using an analysis similar to that in Sect. 5.7, we take as reference catalog the one obtained running SExtractor on HUDF09 with conservative detection parameters. The goal is again to check completeness and purity. We use again an association radius R_assoc of 3 × FWHM, now corresponding to 7.5 pixels, with mag_aperture measured within a circular aperture with a diameter of 7.5 pixels. In this test (see Fig. 12), only a few methods, among which b-UWT(7/9), perform slightly worse than the PSF.

As done in Sect. 5.7, we show the snapshots of a sample of objects detected by the different methods in the GS-DEEP image in Appendix E. The algorithms reported are: PM, TVL2, Bilateral, NL means, BM3D, Starlet, b-UWT(7/9) and b-UWT(7/9) + Wiener.
We repeat the same tests described above on two Ks-band observations of the GOODS-South field acquired with the HAWK-I imager at the VLT: a shallower observation of the field presented in Castellano et al. (2010), and the deeper one from the HUGS survey (Fontana et al. 2014).
Fig. 12. Space telescope real images: completeness and purity (GS-DEEP and HUDF09). On the x-axis, the magnitude at which the purity drops below 90%; on the y-axis, the magnitude at which the completeness drops below 50%. Each symbol refers to a different denoising method, which can be present multiple times in the plot due to different combinations of detection parameters; see text for details. The positions of the symbols are slightly randomized to improve readability.

We use again the association radius R_assoc of 3 × FWHM, here corresponding to 11.25 pixels, with the relative mag_aperture (11.25 pixels diameter), identifying an object as spurious using the same criteria already adopted in Sect. 5.7 and Sect. 6.1. The resulting plot (see Fig. 13) shows that these algorithms improve the detection compared to not applying denoising at all (the same result obtained in Sect. 5.7 and Sect. 6.1), whereas they do not provide significant improvements compared to the PSF. Indeed, only PM creates a catalog that is 0.2 magnitudes more pure at the same completeness. These results are in agreement with those in Sect. 5.4, where we noticed that all these methods give their best with high resolution images (see Fig. C.2), such as VIS and H160. Indeed, the lower resolution of the ground-based images impacts the algorithms' performance. In the same way, the methods, and mainly the PSF, perform better for lower SNR images (e.g. HAWK-I compared to VIS and H160), as shown in Figs. C.5-C.6, resulting in less significant improvements from the methods compared to the PSF.
The analysis made in Sect. 5.7 only considers the objects' SNR and magnitude to classify the detections as correct or spurious. The algorithms perform denoising following different strategies, and artifacts characteristic of each family of methods can be created. Several crops denoised by the different tested methods are reported in Appendix D-E. By visually inspecting such snapshots, we can get an idea of the features and the artifacts produced. As a general comment, looking at the figures reported in Appendix D, TVL2 seems to be the best method, closest to the noiseless image reported in the last column of all the figures, followed by Starlet, b-UWT(7/9), and b-UWT(7/9) + Wiener, which also produce good results. PM, TV Chambolle, and BM3D are often very similar from a qualitative point of view. Most of the time the flux of the objects detected by TVL2 is highly concentrated (see e.g. Figs. D.2, D.5), but in some cases the objects' flux is distributed over a larger area, as visible e.g. in Fig. D.6. Starlet produces in
Fig. 13.
Ground-based real images: completeness and purity (HAWK-I and HAWK-I UDS). On the x-axis, the magnitude at which the purity drops below 90%; on the y-axis, the magnitude at which the completeness drops below 50%. Each symbol refers to a different denoising method, which can appear multiple times in the plot due to different combinations of detection parameters (see text for details). The positions of the symbols are slightly randomized to improve readability.

some cases bright isotropic objects, which could lead to misleading morphological information (see e.g. Figs. D.14, D.16). Looking at Fig. D.10, the PSF seems to wrongly detect two objects, differently from the other denoising methods. This qualitative analysis is confirmed by the corresponding quantitative analysis, as visible in Fig. 8. When more than one object is present in the figures, the different schemes are sometimes not able to recognize all the objects (see e.g. Figs. D.3, D.6, D.9, D.11, D.13). Moreover, fluctuations of the background around the objects are not completely removed, and the smoothing can create artifacts (see e.g. Fig. D.13, where an elliptical galaxy seems to appear as a spiral galaxy after denoising). b-UWT(7/9) and b-UWT(7/9) + Wiener are susceptible to background fluctuations and create visual artifacts (see Figs. D.17, D.18). Inspecting the snapshots from real images, a qualitative analysis is inconclusive, so we have to rely on the quantitative analysis in Fig. 12. In fact, looking at the crops reported in Appendix E, PM and Bilateral are often so similar that it is difficult to distinguish them, although from Fig. 12 the differences between them are clear (PM performs worse than Bilateral). An analogous remark can be made for BM3D, which produces images similar to PM and Bilateral, as visible e.g. in Fig. E.12.
TVL2 is again the most promising method, visually close to the HUDF09, also thanks to the automatic optimization of its internal parameters, and this is confirmed by the results reported in Fig. 12. Images from Starlet, b-UWT(7/9), and b-UWT(7/9) + Wiener are quite similar to each other, even though those from Starlet present fewer visual artifacts (see Figs. E.10, E.12, E.13, E.14). Differently from the other methods, NL means produces nearly square patches of uniform flux, which effectively reduce noise fluctuations (see Figs. E.8, E.9), but could also lead to artifacts (see e.g. Fig. E.5, where the galaxy in the bottom-right corner assumes a nearly boxy shape). Even if artifacts are generated, and the morphology of the detected objects could be altered, we tested the efficiency of the algorithms in preserving the FWHM and the ellipticity of the objects in Sect. 5.6, showing that on average the shapes are preserved. A more detailed assessment of the effect on galaxy morphology is beyond the scope of the present work. However, we note that a promising improvement in this direction could be obtained by classifying objects using a convolutional neural network (e.g. one of the classifiers proposed in Tuccillo et al. 2017; Khalifa et al. 2017; Barchi et al. 2020), in order to assess whether their morphological class is preserved after denoising.
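The shape-preservation checks referred to above (FWHM and ellipticity, Sect. 5.6) can be sketched with flux-weighted second moments of a source cutout. The routine below is a simplified illustration of ours, assuming a roughly Gaussian, background-subtracted profile; it is not the measurement pipeline actually used in the paper.

```python
import numpy as np

def shape_from_moments(stamp):
    """Estimate FWHM (pixels) and ellipticity of a source cutout from
    flux-weighted second moments, assuming a roughly Gaussian profile."""
    yy, xx = np.indices(stamp.shape)
    f = np.clip(stamp, 0.0, None)              # ignore negative noise pixels
    total = f.sum()
    xc, yc = (f * xx).sum() / total, (f * yy).sum() / total
    mxx = (f * (xx - xc) ** 2).sum() / total
    myy = (f * (yy - yc) ** 2).sum() / total
    mxy = (f * (xx - xc) * (yy - yc)).sum() / total
    # eigenvalues of the moment matrix give the major/minor variances a2, b2
    tr, det = mxx + myy, mxx * myy - mxy ** 2
    disc = np.sqrt(max(tr * tr / 4.0 - det, 0.0))
    a2, b2 = tr / 2.0 + disc, tr / 2.0 - disc
    # FWHM = 2*sqrt(2*ln 2) * sigma, using the mean variance of the two axes
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * np.sqrt(0.5 * (a2 + b2))
    ellipticity = 1.0 - np.sqrt(b2 / a2)       # 1 - b/a
    return fwhm, ellipticity
```

Comparing the values returned on the original and on the denoised cutout gives a per-object measure of how much a filter broadens or distorts the source.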
7. Summary and conclusions
The goal of this work is to make an extensive comparison between a number of denoising algorithms, aimed at identifying the best choice to improve the detection of faint objects in astronomical extragalactic images (e.g. considering the typical cases of HST and Euclid). To this purpose, we performed a large set of tests on simulated images. We also tested the methods on real images, both space- and ground-based, collecting useful indications for many observational situations. We chose to test a significant number of denoising algorithms based on traditional techniques (mainly PDEs and variational methods), leaving a more complete comparison, including machine-learning techniques, for future works. In particular, we point out that ATVD-TVL2, Bilateral, Perona-Malik, TV Chambolle, Starlet, and b-UWT(7/9) + Wiener are the most interesting among all, since they provide good performance in the different tests proposed, closely followed by BM3D. Even if most of these methods are quite unusual for the astronomical community, they are very well known in many other fields, and they are known to outperform a straightforward PSF/Gaussian filtering, which is the standard choice in astronomy. We therefore considered PSF and Gaussian filtering as the reference techniques, against which we tested all the other methods.

As a first test, we considered the two metrics MSE and SSIM (defined in Sect. 4) and checked which methods yield the best performance with respect to them. We then compared their performance, again through MSE and SSIM, as a function of depth, resolution, and type of image. We tested the algorithms' ability to preserve the FWHM, in order to understand whether they preserve the shape of the objects, which is useful in case photometric measurements on the denoised image are needed. We tested their stability using the MSE, varying the ideal parameter by a fixed percentage, with the goal of gauging their reliability in case the best parameter is chosen wrongly. We also tested possible detection improvements through two diagnostics, completeness and purity, which measure the fraction of real detections over the total number of objects in the image. Finally, we applied these methods to real images (CANDELS GS-DEEP and a crop of HAWK-I). We summarize below the key points of the analysis performed in this paper:

• From MSE and SSIM we noticed that 11 algorithms are always at the top of the rankings, especially for VIS and H160 images, which are of main interest for detection in future surveys. These algorithms are: PM (with its optimal edge-stopping function g and parameter k), TVL2, BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener, Gaussian, PSF, NL means slow, Bilateral, and TV Chambolle.
• From the stability test discussed in Sect. 5.3, the algorithms tested are not influenced (or only negligibly influenced in a few cases) by images with non-stationary Gaussian noise.
• From the PSF and depth variation tests we noticed that most of the methods perform better with narrower FWHM. Whereas all the methods perform better as the noise level increases (in terms of Gaussian noise standard deviation), their improvements over the PSF are more significant for high-SNR images.
• From the FWHM conservation test, we noticed that most of the algorithms tend not to smooth the image, in terms of FWHM increments. On the contrary, PSF smoothing introduces an offset in the FWHM measurement, for both the stars-only and the galaxies image. On the other hand, the ellipticity is well preserved by the PSF and by most of the algorithms.
• From the completeness and purity test we found a small number of algorithms which provide catalogs 0.2 magnitudes more pure and complete than PSF filtering; these are TVL2, Perona-Malik, TV Chambolle, and Bilateral (whereas Starlet and b-UWT(7/9) + Wiener provide a 0.2-magnitude increment only in completeness).
• From the flux conservation test we found that most of the algorithms have performance similar to PSF filtering, preserving the overall calibration of the input images.
• From the real-image test (H160) we found that TVL2 outperforms all the other algorithms, and it is the only one that performs better than the PSF by 0.2 magnitudes in both completeness and purity, whereas Bilateral produces only a 0.2-magnitude more pure catalog.
• From the real-image test (HAWK-I) we found that only Perona-Malik outperforms PSF filtering, by 0.2 magnitudes in purity; the other methods perform worse than or similarly to the PSF.
• From the visual inspection performed in Sect. 6.3 on simulated and real images, the best methods seem to be TVL2, PM, Bilateral, and Starlet, although they also generate artifacts like the other methods tested. The visual inspection confirms the results visible in Fig. 8 and Fig. 12.

The results we obtained demonstrate that denoising algorithms should be considered valuable tools in the presence of Gaussian noise, which is a good approximation of the noise in optical and near-infrared extragalactic surveys, as they clearly improve the detection of faint objects. Structure-texture image decomposition, Total Variation denoising, Perona-Malik, Bilateral filtering, and Undecimated Wavelet Transform methods are of particular interest.
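The MSE/SSIM ranking described above can be reproduced in spirit with the open-source implementations of several of the tested filters available in scikit-image (which the paper relies on for some algorithms). The toy frame, noise level, filter parameters, and the Gaussian-smoothing stand-in for PSF filtering below are illustrative choices of ours, not the tuned values of the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.restoration import denoise_tv_chambolle, denoise_nl_means, denoise_wavelet
from skimage.metrics import mean_squared_error, structural_similarity

rng = np.random.default_rng(0)
# toy "noiseless" frame: a single Gaussian source on an empty background
yy, xx = np.mgrid[0:64, 0:64]
clean = 0.5 * np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2.0 * 3.0 ** 2))
noisy = clean + rng.normal(0.0, 0.05, clean.shape)   # stationary Gaussian noise

candidates = {
    "Gaussian (PSF-like)": gaussian_filter(noisy, sigma=1.5),
    "TV Chambolle": denoise_tv_chambolle(noisy, weight=0.1),
    "NL means": denoise_nl_means(noisy, h=0.1, patch_size=5, patch_distance=6),
    "Wavelet": denoise_wavelet(noisy, sigma=0.05),
}
for name, img in candidates.items():
    mse = mean_squared_error(clean, img)
    ssim = structural_similarity(clean, img, data_range=clean.max() - clean.min())
    print(f"{name:20s} MSE = {mse:.2e}  SSIM = {ssim:.3f}")
```

On survey data the "clean" reference is of course unavailable; in the paper the metrics are computed against the noiseless simulated frames, which is what the `clean` array stands in for here.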
While further specific tests are needed to define, for each survey, the optimal approach and its parameter set among the above-mentioned ones, our investigation on a small but reasoned reference set of simulated and real extragalactic images shows that the scientific return of ongoing and future surveys can be significantly enhanced by the adoption of these denoising methods in place of standard filtering approaches. Moreover, in our opinion the use of the increasingly popular machine-learning techniques, possibly combined with the best methods emerging from our analysis, has the potential to further improve on the performance of the "traditional" denoising techniques described here. Learning approaches for adding morphological priors (see e.g. Peyré et al. 2010) could also be useful, and will be the subject of future works.
Acknowledgements.
We would like to thank the referee, Prof. Jean-Luc Starck, for the many useful comments and suggestions that helped us improve our work. This research has been carried out within the INdAM-INAF project FOE 2015 "OTTICA ADATTIVA". The authors S. Tozza and M. Falcone are members of the INdAM Research Group GNCS, and gratefully acknowledge its financial support of this research.
References
Ackermann, M., Ajello, M., Allafort, A., et al. 2013, ApJS, 209, 34
Aujol, J.-F., Gilboa, G., Chan, T., & Osher, S. 2006, International Journal of Computer Vision, 67, 111
Banterle, F., Corsini, M., Cignoni, P., & Scopigno, R. 2012, Computer Graphics Forum, 31, 19
Barchi, P., de Carvalho, R., Rosa, R., et al. 2020, Astronomy and Computing, 30, 100334
Beckouche, S., Starck, J.-L., & Fadili, J. 2013, A&A, 556, 132
Bertero, M. & Boccacci, P. 1998, Introduction to Inverse Problems in Imaging (CRC Press)
Bertero, M. & Piana, M. 2006, Inverse Problems in Biomedical Imaging: Modeling and Methods of Solution, ed. A. Quarteroni, L. Formaggia, & A. Veneziani (Milano: Springer Milan), 1-33
Bertin, E. 2009, Mem. Soc. Astron. Italiana, 80, 422
Bertin, E. & Arnouts, S. 1996, A&AS, 117, 393
Black, M. J., Sapiro, G., Marimont, D. H., & Heeger, D. 1998, IEEE Transactions on Image Processing, 7, 421
Buades, A., Coll, B., & Morel, J.-M. 2005, in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Vol. 2, 60-65
Buades, A., Coll, B., & Morel, J.-M. 2011, Image Processing On Line, 1, 208
Bush, J. 2011, Master's thesis, University of California, Santa Barbara
Castellano, M., Fontana, A., Boutsia, K., et al. 2010, A&A, 511, A20
Castellano, M., Ottaviani, D., Fontana, A., et al. 2015, in Astronomical Society of the Pacific Conference Series, Vol. 495, Astronomical Data Analysis Software and Systems XXIV (ADASS XXIV), ed. A. R. Taylor & E. Rosolowsky, 257-260
Chambolle, A. 2004, Journal of Mathematical Imaging and Vision, 20, 89
Chang, S. G., Yu, B., & Vetterli, M. 2000, IEEE Transactions on Image Processing, 9, 1532
Dabov, K., Foi, A., Katkovnik, V., & Egiazarian, K. 2007, IEEE Transactions on Image Processing, 16, 2080
Darbon, J., Cunha, A., Chan, T. F., Osher, S., & Jensen, G. J. 2008, in 2008 5th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 1331-1334
Dark Energy Survey Collaboration, Abbott, T., Abdalla, F. B., et al. 2016, MNRAS, 460, 1270
Donoho, D. L. & Johnstone, J. M. 1994, Biometrika, 81, 425
Fontana, A., Dunlop, J. S., Paris, D., et al. 2014, A&A, 570, A11
Frontera-Pons, J., Sureau, F., Bobin, J., & Le Floc'h, E. 2017, A&A, 603, A60
Gabor, D. 1965, Laboratory Investigation, 14, 801
Getreuer, P. 2012, Image Processing On Line, 2, 74
Goldstein, T. & Osher, S. 2009, SIAM Journal on Imaging Sciences, 2, 323
Grogin, N. A., Kocevski, D. D., Faber, S. M., et al. 2011, ApJS, 197, 35
Guo, Y., Ferguson, H. C., Giavalisco, M., et al. 2013, ApJS, 207, 24
Guo, Z., Sun, J., Zhang, D., & Wu, B. 2012, IEEE Transactions on Image Processing, 21, 958
Jones, E., Oliphant, T., Peterson, P., et al. 2001, SciPy: Open Source Scientific Tools for Python
Karmakar, A., Mishra, D., & Tej, A. 2018, arXiv e-prints, arXiv:1809.01434
Khalifa, N. E. M., Taha, M. H. N., Hassanien, A. E., & Selim, I. M. 2017, arXiv e-prints, arXiv:1709.02245
Koekemoer, A. M., Faber, S. M., Ferguson, H. C., et al. 2011, ApJS, 197, 36
Kolmogorov, A. M. & Fomin, S. V. 1957, Elements of the Theory of Functions and Functional Analysis (Dover Publications)
Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, arXiv e-prints, arXiv:1110.3193
Lee, G. R., Gommers, R., Waselewski, F., Wohlfahrt, K., & O'Leary, A. 2019, Journal of Open Source Software, 4, 1237
Lesser, M. 2015, PASP, 127, 1097
Lindenbaum, M., Fischer, M., & Bruckstein, A. 1994, Pattern Recognition, 27, 1
LSST Science Collaboration, Abell, P. A., Allison, J., et al. 2009, arXiv e-prints, arXiv:0912.0201
Mallat, S. 2001, IEEE Transactions on Pattern Analysis & Machine Intelligence, 11, 674
Meyer, Y. 1990, Ondelettes et opérateurs: Ondelettes (Hermann)
Neyman, J. & Pearson, E. S. 1933, Philosophical Transactions of the Royal Society of London A, 231, 289-337
Oke, J. B. & Gunn, J. E. 1983, ApJ, 266, 713
Perona, P. & Malik, J. 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 629
Peyré, G., Fadili, J., & Starck, J.-L. 2010, SIAM Journal on Imaging Sciences, 3, 646
Pierre, M., Valtchanov, I., Altieri, B., et al. 2004, J. Cosmology Astropart. Phys., 2004, 011
Principe, G., Malyshev, D., Ballet, J., & Funk, S. 2018, A&A, 618, A22
Romeo, A. B., Horellou, C., & Bergh, J. 2003, MNRAS, 342, 337
Rudin, L. I., Osher, S., & Fatemi, E. 1992, Physica D: Nonlinear Phenomena, 60, 259
Schreiber, C., Elbaz, D., Pannella, M., et al. 2017, A&A, 602, A96
Shen, H., George, D., Huerta, E. A., & Zhao, Z. 2017, arXiv e-prints, arXiv:1711.09919
Shensa, M. J. 1992, IEEE Transactions on Signal Processing, 40, 2464
Skelton, R. E., Whitaker, K. E., Momcheva, I. G., et al. 2014, ApJS, 214, 24
Spergel, D., Gehrels, N., Baltay, C., et al. 2015, arXiv e-prints, arXiv:1503.03757
Starck, J.-L., Aghanim, N., & Forni, O. 2004, A&A, 416, 9
Starck, J.-L., Moudden, Y., Abrial, P., & Nguyen, M. 2006, A&A, 446, 1191
Starck, J.-L., Murtagh, F., & Bertero, M. 2014, Starlet Transform in Astronomical Data Processing, 1-40
Starck, J.-L., Murtagh, F., & Fadili, J. 2010, Sparse Image and Signal Processing: Wavelets and Related Geometric Multiscale Analysis (Cambridge University Press)
Tomasi, C. & Manduchi, R. 1998, in Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), 839-846
Tuccillo, D., Huertas-Company, M., Decencière, E., & Velasco-Forero, S. 2017, Proceedings of the International Astronomical Union, 12
Valens, C. 1999, A Really Friendly Guide to Wavelets
van der Walt, S., Schönberger, J. L., Nunez-Iglesias, J., et al. 2014, PeerJ, 2, e453
Vetterli, M. & Kovačević, J. 1995, Wavelets and Subband Coding (Upper Saddle River, NJ: Prentice-Hall)
Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P.-A. 2008, in Proceedings of the 25th International Conference on Machine Learning, ICML '08 (New York, NY: Association for Computing Machinery), 1096-1103
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. 2010, J. Mach. Learn. Res., 11, 3371
Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. 2004, IEEE Transactions on Image Processing, 13, 600
Weickert, J. 1998, Anisotropic Diffusion in Image Processing (Teubner)
Winkler, A. M. 2011, Gaussian Kernels: Convert FWHM to Sigma
Xie, J., Xu, L., & Chen, E. 2012, in Advances in Neural Information Processing Systems 25, ed. F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Curran Associates, Inc.), 341-349
Appendix A: MSE comparison tables and plots
Fig. A.1.
Step 1: MSE comparison between Perona-Malik functions on CM. On the x-axis, all the simulated CM crops in the different bands; on the y-axis, 1 − mse/mse_Original. Fig. A.2.
Step 1: MSE comparison between ATVD algorithms on BG. On the x-axis, all the simulated BG crops in the different bands; on the y-axis, 1 − mse/mse_Original. Fig. A.3.
Step 1: MSE comparison between the other algorithms, excluding ATVD and PM, on CL. On the x-axis, all the simulated CL crops in the different bands; on the y-axis, 1 − mse/mse_Original.
Fig. A.4.
Step 1: MSE comparison between BM3D, Starlet, and the two b-UWTs on CM. On the x-axis, all the simulated CM crops in the different bands; on the y-axis, 1 − mse/mse_Original.

[Table A.1 body: MSE of each method (TVL2, TVL1, TVG, the PM variants, PSF, Original, TV Bregman, Gaussian, NL means slow and fast, Bilateral, TV Chambolle, Orthogonal Wavelets, BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener) in the H, VIS, EXTG, NIRH, and IRAC bands; the numerical values did not survive extraction.]

Table A.1. MSE table of BG crops. The lowest MSE value per band is indicated in bold.
[Table A.2 body: MSE of each method in the H, VIS, EXTG, NIRH, and IRAC bands; the numerical values did not survive extraction.]

Table A.2. MSE table of CM crops. The lowest MSE value per band is indicated in bold.
[Table A.3 body: MSE of each method in the H, VIS, EXTG, NIRH, and IRAC bands; the numerical values did not survive extraction.]

Table A.3. MSE table of CL crops. The lowest MSE value per band is indicated in bold.
Name                  T_H [s]   T_VIS [s]  T_EXTG [s]  T_NIRH [s]  T_IRAC [s]
TVL2                  9.047     3.588      0.688       0.132       0.0938
TVL1                  5.408     0.814      0.339       0.020       0.0031
TVG                   8.910     5.177      0.334       0.134       0.0914
PM                    —         —          —           —           —
NL means slow         75.94     27.75      6.863       3.344       1.058
NL means fast         7.514     3.104      1.118       0.448       0.208
Bilateral             37.09     13.46      3.602       1.567       0.489
TV Chambolle          10.61     0.7668     0.109       0.587       0.034
Orthogonal Wavelets   0.827     0.329      0.091       0.038       0.012
BM3D                  25.08     10.499     2.836       1.536       0.655
Starlet               7.671     2.752      0.702       0.314       0.083
b-UWT(7/9)            9.013     3.265      0.812       0.350       0.095
b-UWT(7/9) + Wiener   33.15     12.01      2.983       1.318       0.337
(PM row: values not recovered from the source.)
Table A.4.
CPU time table (in seconds) of CM crops after fixing the optimal internal parameters for each method. The lowest time value per band is indicated in bold.
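Timings like those in Table A.4 can be gathered with a small harness around each denoising call. The sketch below is our own construction (arbitrary image size and parameters, not those of the paper) and uses `time.process_time`, which measures per-process CPU time rather than wall-clock time.

```python
import time
import numpy as np
from skimage.restoration import denoise_tv_chambolle, denoise_wavelet

def cpu_time(fn, image, repeats=3, **kwargs):
    """Best-of-N CPU time (seconds) for a single denoising call."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.process_time()
        fn(image, **kwargs)
        best = min(best, time.process_time() - t0)
    return best

rng = np.random.default_rng(1)
img = rng.normal(0.0, 0.05, (256, 256))   # toy noise-only crop
for name, fn, kw in [("TV Chambolle", denoise_tv_chambolle, {"weight": 0.1}),
                     ("Wavelet", denoise_wavelet, {"sigma": 0.05})]:
    print(f"{name:12s} {cpu_time(fn, img, **kw):.3f} s")
```

Taking the best of several repeats reduces the impact of caching and scheduler jitter on the reported figure.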
Appendix B: Non-stationary Gaussian noise MSE comparison table
[Table B.1 body: MSE of each method in the columns MSE(H_l), MSE(I_σ), MSE(H_u), MSE(I_σ); only a few entries survive extraction: TVL2 7.622e-07; b-UWT(7/9) 8.316e-07, 7.797e-07; b-UWT(7/9) + Wiener 7.954e-07, 1.009e-06, 1.957e-05, 3.191e-06.]
Table B.1.
MSE values for H_l, I_σ, H_u, I_σ related to the VIS (CM) mirrored crop. See Sect. 5.3 for further details. The best results are in bold.

[Table B.2 body: MSE of each method in the columns MSE(H_l), MSE(I_σ), MSE(H_u), MSE(I_σ); only a few entries survive extraction: TVL2 8.105e-07; b-UWT(7/9) 1.048e-06, 1.117e-06; b-UWT(7/9) + Wiener 1.038e-06, 1.588e-06, 3.805e-05, 5.305e-06.]
Table B.2.
MSE values for H_l, I_σ, H_u, I_σ related to the VIS (CL) mirrored crop. See Sect. 5.3 for further details. The best results are in bold.

Appendix C: PSF and depth comparison plots
Fig. C.1.
VIS FWHM variation comparison plot. On the x-axis, the VIS images with FWHM equal to the original value, 0.5, 1.0, 1.5, and 2.0 arcsec; on the y-axis, mse. Fig. C.2.
VIS FWHM variation comparison plot. On the x-axis, the VIS images with FWHM equal to the original value, 0.5, 1.0, 1.5, and 2.0 arcsec; on the y-axis, mse/mse_Original. Fig. C.3.
VIS FWHM variation comparison plot. On the x-axis, the VIS images with FWHM equal to the original value, 0.5, 1.0, 1.5, and 2.0 arcsec; on the y-axis, mse/mse_PSF. Fig. C.4.
H160 depth variation comparison plot. On the x-axis, the H160 images with Gaussian noise standard deviation equal to 1, 10, 20, 30, and 40 times the original value; on the y-axis, mse. Fig. C.5.
H160 depth variation comparison plot. On the x-axis, the H160 images with Gaussian noise standard deviation equal to 1, 10, 20, 30, and 40 times the original value; on the y-axis, mse/mse_PSF. Fig. C.6.
H160 depth variation comparison plot. On the x-axis, the H160 images with Gaussian noise standard deviation equal to 1, 10, 20, 30, and 40 times the original value; on the y-axis, mse/mse_Original.
Appendix D: VIS crops visual comparison
Fig. D.1.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 38.8 and a magnitude of 25.79.
Fig. D.2.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 48.2 and a magnitude of 24.76.
Fig. D.3.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 72.9 and a magnitude of 23.82.
Fig. D.4.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 47.5 and a magnitude of 25.01.
Fig. D.5.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 35.44 and a magnitude of 25.39.
Fig. D.6.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 21.26 and a magnitude of 26.48.
Fig. D.7.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 27.70 and a magnitude of 25.72.
Fig. D.8.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 45.57 and a magnitude of 24.97.
Fig. D.9.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 68.29 and a magnitude of 24.14.
Fig. D.10.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 26.74 and a magnitude of 26.13.
Fig. D.11.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 25.44 and a magnitude of 26.19.
Fig. D.12.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 36.99 and a magnitude of 25.71.
Fig. D.13.
VIS crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 90.35 and a magnitude of 23.34.
Fig. D.14.
VIS crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless; on the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 7.99 and a magnitude of 25.21.
Fig. D.15.
VIS crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless; on the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 3.80 and a magnitude of 26.01.
Fig. D.16.
VIS crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless; on the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 2.23 and a magnitude of 26.59.
Fig. D.17.
VIS crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless; on the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 15.73 and a magnitude of 24.48.
Fig. D.18.
VIS crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, TV Chambolle, Noiseless; on the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 56.23 and a magnitude of 23.09.
Appendix E: GSDEEP crops visual comparison
Fig. E.1.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 6.71 and a magnitude of 27.47.
Fig. E.2.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 7.37 and a magnitude of 27.20.
Fig. E.3.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 8.44 and a magnitude of 26.80.
Fig. E.4.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 6.88 and a magnitude of 27.18.
Fig. E.5.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 5.82 and a magnitude of 27.48.
Fig. E.6.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 5.08 and a magnitude of 27.48.
Fig. E.7.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 6.51 and a magnitude of 27.58.
Fig. E.8.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 12.48 and a magnitude of 27.17.
Fig. E.9.
GSDEEP crops visual comparison: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. The green boxes are the detected objects' regions. The central object has been detected with an SNR of 9.77 and a magnitude of 27.36.
Fig. E.10.
GSDEEP crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. On the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 11.98 and a magnitude of 28.34.
Fig. E.11.
GSDEEP crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. On the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 9.85 and a magnitude of 28.45.
Fig. E.12.
GSDEEP crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. On the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 11.26 and a magnitude of 28.37.
Fig. E.13.
GSDEEP crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. On the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with an SNR of 17.70 and a magnitude of 27.86.
Fig. E.14.
GSDEEP crops visual comparison. On the first row: Original, PSF, Perona-Malik, TVL2, Bilateral, NL means, HUDF09. On the second row: BM3D, Starlet, b-UWT(7/9), b-UWT(7/9) + Wiener. The central object has been detected with a