[PDF] A new approach for the statistical denoising of Planck interstellar dust polarization data

Abstract

Dust emission is the main foreground for cosmic microwave background (CMB) polarization. Its statistical characterization must be derived from the analysis of observational data because the precision required for a reliable component separation is far greater than what is currently achievable with physical models of the turbulent magnetized interstellar medium. This letter takes a significant step toward this goal by proposing a method that retrieves non-Gaussian statistical characteristics of dust emission from noisy Planck polarization observations at 353 GHz. We devised a statistical denoising method based on wavelet phase harmonics (WPH) statistics, which characterize the coherent structures in non-Gaussian random fields and define a generative model of the data. The method was validated on mock data combining a dust map from a magnetohydrodynamic simulation and Planck noise maps. The denoised map reproduces the true power spectrum down to scales where the noise power is an order of magnitude larger than that of the signal. It remains highly correlated to the true emission and retrieves some of its non-Gaussian properties. Applied to Planck data, the method provides a new approach to building a generative model of dust polarization that will characterize the full complexity of the dust emission. We also release PyWPH, a public Python package, to perform GPU-accelerated WPH analyses on images.

Full PDF

AAstronomy & Astrophysics manuscript no. main © ESO 2021February 8, 2021 L etter to the E ditor A new approach for the statistical denoising of

Planck interstellardust polarization data

Bruno Regaldo-Saint Blancard , , Erwan Allys , François Boulanger , François Levrier , and Niall Je ﬀ rey , Laboratoire de Physique de l’École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris,75005 Paris, Francee-mail: [email protected] Observatoire de Paris, PSL University, Sorbonne Université, LERMA, 75014 Paris, France Department of Physics & Astronomy, University College London, Gower Street, London, WC1E 6BT, UKReceived: February 5, 2021 / Accepted: ?

ABSTRACT

Dust emission is the main foreground to Cosmic Microwave Background (CMB) polarization. Its statistical characterization must bederived from the analysis of observational data because the precision required for a reliable component separation is far greater thancurrently achievable with physical models of the turbulent magnetized interstellar medium. This letter takes a signiﬁcant step towardsthis goal by proposing a method that retrieves non-Gaussian statistical characteristics of dust emission from noisy

Planck polarizationobservations at 353 GHz. We devise a statistical denoising method based on the Wavelet Phase Harmonics (WPH) statistics, whichcharacterize the coherent structures in non-Gaussian random ﬁelds and deﬁne a generative model of the data. The method is validatedon mock data combining a dust map from a magnetohydrodynamic simulation and

Planck noise maps. The denoised map reproducesthe true power spectrum down to scales where the noise power is an order of magnitude larger than that of the signal. It remains highlycorrelated to the true emission and retrieves some of its non-Gaussian properties. Applied to

Planck data, the method provides a newapproach towards building a generative model of dust polarization that will characterize the full complexity of the dust emission.

Key words. dust, extinction - polarization - methods: statistical - cosmic background radiation

1. Introduction

The quest of primordial B-modes in the polarization of the Cos-mic Microwave Background (CMB) faces a major challenge,namely the accurate characterization of one of its main fore-grounds: the polarized emission of interstellar dust from ourGalaxy (BICEP2 / Keck and Planck Collaborations 2015). Torightfully claim any detection of the cosmological signal, onehas to fully take into account the complexity of the Galactic con-tribution to the sky emission. The statistical characterization ofGalactic polarized foregrounds is also important to extract theCMB lensing potential from sky observations (Beck et al. 2020).The Galactic contribution is sourced by the thermal emissionof non-spherical dust grains in the di ﬀ use interstellar medium(ISM). The mixture of dust and gas is described as a turbu-lent magnetized ﬂuid, and the grains tend to align statisticallywith their long axes perpendicular to the local magnetic ﬁeld,thus polarizing the emission (Planck Collaboration XII 2020).The physical and chemical processes regulating and structuringthe di ﬀ use ISM are non-linearly coupled (Draine 2011), leadingto the emergence of strong scale couplings, which may be evi-denced by the highly non-Gaussian statistics observed for manytracers (Burkhart et al. 2009).Capturing non-Gaussianity is therefore essential to char-acterize the dust signal. To tackle this issue, several authorshave introduced machine learning algorithms that need to betrained (Aylor et al. 2020; Krachmalnico ﬀ & Puglisi 2020;Petro ﬀ et al. 2020; Thorne et al. 2021). Another approach, thatdoes not involve any learning step, uses statistics deﬁned from nonlinear transforms, namely the Wavelet Scattering Transform,and the Wavelet Phase Harmonics (WPH). These describe thecouplings between scales in non-Gaussian ﬁelds, and have beenapplied to ISM data (Allys et al. 2019; Regaldo-Saint Blancardet al. 2020; Saydjari et al. 2020) as well as to study the large-scale structure of the Universe (Allys et al. 2020; Cheng et al.2020). They have been successfully used to characterize sim-ulated and observational data with a high signal-to-noise ratio(SNR), and to build generative models for these data. However,these approaches face di ﬃ culties with noisy data, which is thecase of Planck polarization observations. In this letter, we intro-duce a statistical denoising method using the WPH statistics toretrieve the statistical properties of the noise-free dust emission.In Sect. 2 we present our statistical denoising method andshow how well it performs on noisy simulated data of thepolarized emission from dust. In Sect. 3, we apply it to theChamaeleon-Musca 353 GHz polarized emission, as observedby the

Planck satellite, and discuss our results. In Sect. 4 weconclude and suggest a few perspectives for follow-up studies.This paper also includes two appendices. Appendix A introducesin detail the WPH statistics and the denoising procedure. Ap-pendix B presents a ﬁrst comparison with other methods.

2. Method and validation

We observe a noisy map d that is modelled as: d = s + n , (1) Article number, page 1 of 9 a r X i v : . [ a s t r o - ph . C O ] F e b & A proofs: manuscript no. main − < ( s ) : T r u t h Q m a p − − − < ( n ) : N o i s e Q m a p − < ( d ) : N o i s y Q m a p − < ( ˜ s ) : D e n o i s e d Q m a p − − − < ( d − ˜ s ) : N o i s y - D e n o i s e d Q m a p − . − . . . . < ( s − ˜ s ) : T r u t h - D e n o i s e d Q m a p Fig. 1.

Top row : simulated Q maps corresponding to the truth s (left), the noise n (middle) and the resulting noisy map d (right). Bottom row :denoised simulated Q map (cid:60) ( ˜ s ) (left) next to the di ﬀ erence noisy-denoised (cid:60) ( d − ˜ s ) (middle) and denoising error (cid:60) ( s − ˜ s ) (right). Units arearbitrary but kept consistent between the maps. with s the target truth map and n an (additive) noise signal. n isan unknown realization of a random ﬁeld N . We assume that weknow N , meaning that we are able to generate as many indepen-dent realizations of it as needed. Let us call { n , . . . , n M } M ofthese.We introduce a method to retrieve the statistical propertiesof s while denoising d . This statistical denoising consists in it-eratively deforming d to build a denoised map ˜ s such that the { ˜ s + n i } i maps and the d map become "close enough" in a givenstatistical space. It employs the possibility introduced in Zhang& Mallat (2019) to build a generative model from WPH statis-tics, i.e., the ability to generate new random realizations basedon statistical constraints. In the following, we call φ the operatorthat computes a set of summary statistics from a given map. Thealgorithm consists in minimizing the following loss function: L M ( u ) = M M (cid:88) i = (cid:107) φ ( u + n i ) − φ ( d ) (cid:107) , (2)where (cid:107) · (cid:107) denotes the Euclidean norm. We choose u = d toinitialize the optimizer. The denoised map ˜ s corresponds to anapproximate minimum obtained by performing this optimizationin pixel space, using a LBFGS optimizer (Byrd et al. 1995).The choice of the operator φ is obviously paramount to thequality of the method and must be tailored to the properties of s and n . In the context of ISM polarization data, we expect s tobe relatively regular and to exhibit non-Gaussian signatures dueto the interactions between scales (e.g., ﬁlamentary structures),while the noise is expected to be highly irregular and close to a(possibly spatially-varying) Gaussian white noise.In this work, the φ operator computes WPH statistics. Thesestatistics are designed to characterize the coherent structures thatappear in non-Gaussian random ﬁelds, by quantifying the phasealignment between di ﬀ erent scales (Zhang & Mallat 2019; Mal-lat et al. 2020). Their ability to deﬁne a generative model has − − k π [px − ] P S [ Q ] ds ˜ sn d − ˜ ss − ˜ s ˜ s × s Fig. 2. Q maps power spectra for d , s , ˜ s , n (all in solid lines), d − ˜ s and s − ˜ s , and cross-spectrum between the Q maps of ˜ s and s (all in dashedlines). been studied in particular by Allys et al. (2020), who demon-strated quantitatively that they include most of the informa-tion captured by various other statistics, such as the powerspectrum, the bispectrum, or the Minkowski functionals. Thesestatistics were applied to real-valued data. Here we extend themto complex-valued polarization maps Q + iU . Statistics derivedfrom this complex variable e ﬃ ciently characterize the full com-plexity of the polarization signal (Regaldo-Saint Blancard et al.2020). We refer to Appendix A for a presentation of the WPHstatistics, and a detailed description of the denoising procedure. Article number, page 2 of 9runo Regaldo-Saint Blancard et al.: A new approach for the statistical denoising of

Planck interstellar dust polarization data − − − − − − − − − − l = 2 px Noisy d Truth s Denoised ˜ s − − l = 8 px − − l = 32 px δQ l /σ l p ( δ Q l ) Fig. 3.

PDFs of the increments of Q for d (noisy), s (truth), and ˜ s (denoised), computed for three logarithmically spaced lags going from 2 to 32pixels. For each lag, the increments are normalized by the standard deviation σ l of the Gaussian that ﬁts the core of the PDF of the noisy map. In this section, we assess the performance of our denoising algo-rithm by applying it to mock data d = s + n emulating a noisy Q + iU Planck polarization signal. We show ﬁgures of meritbased on power spectra and probability distribution functions(PDFs) of the increments of the maps. These are only shownfor Q , but they are similar for U . We postpone ﬁgures of meritbased on other kind of statistics, including the WPH statistics, toa future study.Here s is a simulated Q + iU map representing a typicallinear polarization signal produced by the thermal emission ofdust in the di ﬀ use ISM. It was built from the same magnetohy-drodynamic simulation as the one used in Regaldo-Saint Blan-card et al. (2020), and following the same procedure. We refer tothis paper for more details on the construction of this simulatedmap. We simulate a Planck observation of dust polarization us-ing

Planck instrumental noise maps introduced in Sect. 3 (PlanckCollaboration III 2020). We have at our disposal a total of 300realizations of this noise. We pick one of these, called n , tobuild d , and use the remaining M =

299 noise maps, labelled { n , . . . , n M } , for the denoising algorithm.We deﬁne the SNR of d as the ratio of the standard devia-tions of (cid:60) ( s ) and (cid:60) ( n ) (in the following, the real part (cid:60) of acomplex Q + iU map refers to its Q map). We adjust this SNRby scaling s , so that the impact of the noise "resembles" that onthe Planck map presented in Sect. 3. This is not straightforwardsince the power spectrum of the simulated map s has a di ﬀ erentslope than that of the estimate of the power spectrum of the dustemission from the Planck map (see Figs. 2 and 5). Our choice isto adjust the SNR so that the scale k i at which the power spec-tra of n and s intersect, coincides with the one from the Planck map. Figs. 2 and 5 show that k i / (2 π ) ≈ . − . This procedureleads to SNR ≈ . Q maps correspondingto the truth s (left), the noise n (middle) and the resulting noisymap d (right). (cid:60) ( s ) exhibits coherent structures such as ﬁla-ments and large smooth regions that are characteristic of its non-Gaussian statistical properties. On the other hand, (cid:60) ( n ) seemsclose to an inhomogeneous white Gaussian noise. The spatialinhomogeneity appears as variations of the local standard devi-ation, due to the Planck satellite scanning strategy. Finally, in (cid:60) ( d ), coherent structures at intermediate and small scales arehard or impossible to identify.We apply our denoising method to the simulated noisy map d and show in Fig. 1 (bottom) the resulting denoised Q map (cid:60) ( ˜ s )(left) next to the di ﬀ erence between the noisy and denoised maps (cid:60) ( d − ˜ s ) (middle) and the denoising error (cid:60) ( s − ˜ s ) (right). (cid:60) ( ˜ s )shows that the noise level has been drastically reduced and thatwe are able to recover the ﬁlamentary structure down to a min-imum scale. We can identify the smooth regions of (cid:60) ( s ) evenif there still remains a visible noise. The similarities between (cid:60) ( d − ˜ s ) and (cid:60) ( n ) are striking. The local variations of the stan-dard deviation of the noise are clearly recovered, demonstrat-ing that the inhomogeneity of the noise is not an issue for ourmethod. Finally, (cid:60) ( s − ˜ s ) exhibits some remaining structuresthat match the thinnest ﬁlaments appearing in (cid:60) ( s ), on top of amore di ﬀ use background. This indicates that down to a minimumscale, below which the algorithm struggles to recover features,most of the structures are e ﬃ ciently reconstructed.Figure 2 compares the power spectra of the six maps shownin Fig. 1 plus the cross-spectrum between the denoised and truthmaps , . We ﬁrst point out that the power spectrum of d coin-cides with the sum of those of s and n because of the statisticalindependence between s and n . The power spectra of (cid:60) ( ˜ s ) and (cid:60) ( s ) are in very good agreement with each other up to 0 .

18 px − ,at which scale the noise power is ten times that of the signal. Atsmaller scales, where the noise dominates the signal even more,our algorithm is not able to retrieve the true power spectrum butgets closer to it. The power spectrum of (cid:60) ( d − ˜ s ) coincides withthat of (cid:60) ( n ) at small scales, and this agreement progressivelyworsens towards larger scales. This large-scale behavior showsthat our algorithm does not remove the already negligible noise,as shown by the superposition of the power spectrum of (cid:60) ( s − ˜ s )and that of (cid:60) ( n ) at these scales. The cross-spectrum ˜ s × s isslightly below the power spectrum of s at intermediate scalesand this discrepancy increases towards the smallest scales. At These power spectra are estimated computing the mean square mod-uli of the Fourier components of the maps binned on linearly spacedisotropic wavenumber bins, and the uncertainties correspond to the stan-dard deviations of the means. The cross-spectrum is computed similarlyexcept for the bins above k / (2 π ) = .

14 px − that are logarithmicallyspaced in order to lower the statistical variance. We recall that this cross-spectrum, called CS[ (cid:60) ( s ) , (cid:60) (˜ s )],is related to the power spectra by the following relation:PS[ (cid:60) ( s − ˜ s )] = PS[ (cid:60) ( s )] + PS[ (cid:60) (˜ s )] − (cid:60) ( s ) , (cid:60) (˜ s )].Article number, page 3 of 9 & A proofs: manuscript no. main − − ∆ x [deg] − − ∆ y [ d e g ] Noisy d − − ∆ x [deg]Denoised ˜ s − − ∆ x [deg]GNILC − − Q m a p [ µ K C M B ] Fig. 4.

Noisy (left) and denoised (middle) Q maps of the Chamaeleon-Musca region as observed by the Planck satellite at 353 GHz. The noisymaps were denoised as described in Sect. 2.1. We also show for reference the corresponding GNILC map (right). intermediate scales, where the power spectrum of (cid:60) ( ˜ s ) matchesthat of (cid:60) ( s ), we suspect this discrepancy to stem from di ﬀ er-ences between the phases of the Fourier components of s and˜ s that would deserve a further quantiﬁcation. Nevertheless, theproduction of a denoised map whose power spectrum coincideswith that of s , even though n is ten times more powerful than s , and that retains a signiﬁcant correlation with s , is a strikingsuccess of our method.To better characterize the non-Gaussianity of ˜ s , we com-pute the PDFs of the increments of Q for the noisy, denoised,and truth maps, and plot them in Fig. 3 for three scalar lags.The increment δ Q l ( r ) for a scalar lag l at a position r is theset of di ﬀ erences δ Q l ( r ) = Q ( r ) − Q ( r + l ) with l ≤ | l | < l + (cid:60) ( s ) are far from Gaussian for every lag. This isa clear signature of the non-Gaussianity of the data as we ex-pect Gaussian-distributed increments for homogeneous Gaus-sian data. Our method recovers these statistics with negligibledistortion for each lag, demonstrating its e ﬃ ciency in retrievingnon-Gaussianity in the data. A more thorough characterizationof this non-Gaussianity is postponed to further studies.

3. Application to

Planck polarization data

We now apply our denoising method to a Q + iU polarizationmap of the Chamaeleon-Musca region observed at 353 GHzwith the Planck satellite (PR3 data , Planck Collaboration III2020). At this frequency, the CMB is negligible and the dom-inant components are the dust emission and the noise (PlanckCollaboration XI 2020). We consider the Q and U maps corre-sponding to the full mission, and those corresponding to the twohalf-missions, as well as the 300 end-to-end simulated Q and U maps of the noise and the systematics of the instrument for thefull mission (the ones used in Sect. 2). We project all of thesemaps on 512 ×

512 grids with a pixel size of 2 . (cid:48) , centeredon the region of the Chamaeleon-Musca clouds at Galactic coor-dinates ( l , b ) = (300 . ◦ , − . ◦ ). We used a Gnomonic projec- https://wiki.cosmos.esa.int/planck-legacy-archive/index.php/Main_Page − k π [deg − ] P S [ Q ] d ˜ sn d − ˜ sd × d GNILC

Fig. 5. Q maps power spectra for d , ˜ s , n (all in solid lines), and d − ˜ s ,and cross-spectrum between d and d (both in dashed lines). Thiscross-spectrum estimates the power spectrum of the dust emission. Alsoshown in solid line is the power spectrum of the GNILC Q map corre-sponding to the same Chamaeleon-Musca ﬁeld. tion through the HEALPix/healpy package (Górski et al. 2005;Zonca et al. 2019).We show in Fig. 4 the projected full-mission Q map that weaim to denoise (left), as well as the denoised map discussed inthe next section (middle). The same maps for U are given inFig. B.2 in Appendix B. We apply the denoising method presented in Sect. 2 to the fullmission Q + iU map, using the corresponding 300 noises. Fig-ures 4 and B.2 (middle) show the resulting denoised Q and U maps respectively. The overall noise level has been clearly mit-igated although subtle residuals of the patterns due to the scan-ning strategy remain, and we can now discern a more complexlayout of structures even in the regions where the signal is weak.We show in Fig. 5 a power spectrum analysis of the denois-ing of the Q map, comparing the power spectra of the noisy and http://healpix.sourceforge.net Article number, page 4 of 9runo Regaldo-Saint Blancard et al.: A new approach for the statistical denoising of

Planck interstellar dust polarization data − − − − − − − − − − l = 4 . d Denoised ˜ s GNILC − − l = 18 . − − l = 75 . δQ l /σ l p ( δ Q l ) Fig. 6.

PDFs of the Chamaeleon-Musca Q maps increments for d (noisy), ˜ s (denoised), and GNILC map, computed for three logarithmicallyspaced lags going from 4 . (cid:48) to 75 . (cid:48) . For each lag, the increments are normalized by the standard deviation σ l of the Gaussian that ﬁts the core ofthe PDF of the noisy map. denoised maps (cid:60) ( d ) and (cid:60) (˜ s ), of one noise map (cid:60) ( n ) and ofthe di ﬀ erence between the noisy and denoised maps (cid:60) ( d − ˜ s ), aswell as the cross-spectrum between the two half-missions maps.This cross-spectrum gives an estimate of the power spectrum ofthe truth signal since the noise signals are independent. It is satis-factory to see that the power spectrum of (cid:60) (˜ s ) is consistent withthis cross-spectrum for scales down to k / (2 π ) ∼ . − . Also, (cid:60) ( d − ˜ s ) behaves similarly to what we observed in Sec. 2.2,with a power spectrum consistent with that of (cid:60) ( n ) when thenoise has a larger power than the signal, and falling below whatis expected when the emission of the dust begins to dominate.Similar results are obtained for U .Figure 6 compares the PDFs of the increments of Q for thenoisy and denoised maps, and for the same three scalar lags asin Fig. 3. Based on the results of Sect. 2, we expect the PDFsof the denoised map to be good estimates of the statistics of thetruth map. These results show important di ﬀ erences between thenoisy and denoised maps that are again a signature of the non-Gaussianity of the truth signal. We ﬁnd similar results for theincrements of U .The literature on image denoising is rich and abundant (seefor instance Buades et al. 2010; Dabov et al. 2007; Zhang et al.2017), and a thorough comparison of our method with other de-noising algorithms would be beyond the scope of this paper.Nevertheless, to give some elements of comparison, we compareour method in Appendix B to: (1) Wiener ﬁltering and samplingmethods, which are widely used in astrophysics, (2) the GNILCmethod (Planck Collaboration Int. XLVIII 2016), which wasused on Planck polarization data and provides a local smoothingkernel in order to optimally remove the noise. For the Wienerapproach, we ﬁnd that neither the Wiener ﬁltered image or real-isations drawn from the Wiener posterior distribution are able toretrieve the PDFs of the increments, even when the true powerspectrum was given as an input. The comparison with GNILCshows that our method better recovers the true power spectrumand the PDFs of the increments.

4. Conclusion and prospects

We have introduced a new method for the denoising of

Planck polarization data based on the generation of random syntheticmaps from WPH statistics that characterize the non-Gaussianityof the dust emission. This method takes advantage of the strong statistical di ﬀ erences between the signal of interest (non-Gaussian and regular) and the noise (close to Gaussian and irreg-ular) by performing an optimization that constrains the statisticalproperties of the denoised map plus noise realizations.We have applied our method to mock Q + iU noisy data de-signed to emulate typical Planck polarization maps of dust in thedi ﬀ use ISM. The denoised map has a power spectrum that coin-cides with the true power spectrum down to a minimum scalewhere the power of the noise is ten times that of the signal,while being highly correlated with the truth signal. It recoversthe PDFs of the increments for various isotropic lags, demon-strating that our method is able to retrieve non-Gaussianity in thedata. Finally, we have applied this method to a 353 GHz Planck observation of the Chamaeleon-Musca ﬁeld.Our method is introduced as a statistical denoising method,but it should be more generally applicable to component sepa-ration problems. In particular, we expect this method to be e ﬃ -cient at disentangling dust emission from the CMB, cosmic in-frared background, and noise in Planck maps. Therefore, it couldhopefully enhance the scientiﬁc outcome of

Planck data.One of the main motivations of our work is to build a genera-tive model of the dust polarization signal that takes into accountthe non-Gaussianity of the data. Such a model derived from theanalysis of

Planck data may be used to simulate the dust polar-ization sky. Our denoising algorithm constitutes a step towardsthis modelling, in the sense that WPH statistics of the dust emis-sion, corrected to a ﬁrst approximation from noise contamina-tion, may be computed from the denoised map. A natural ex-tension of this work would be to quantitatively assess how wellwe can recover the WPH statistics of the truth map with this ap-proach. This will be the subject of a future paper.

Acknowledgements.

This work was undertaken in parallel to a similar projectled by J.-M. Delouis on the full sky, and we acknowledge a fruitful interac-tion with his team. We gratefully acknowledge P. Lesa ﬀ re for helping us withthe computation of the PDFs of the increments. We also acknowledge fruit-ful discussions with J.-F. Cardoso, S. Mallat, and T. Marchand. This researchwas supported by the Agence Nationale de la Recherche (project BxB: ANR-17-CE31-0022). B. Regaldo-Saint Blancard acknowledges support from the CentreNational d’Etudes Spatiales (CNES). Article number, page 5 of 9 & A proofs: manuscript no. main

Fig. A.1.

Real part (left) and imaginary part (right) of a 512 ×

512 bump-steerable wavelet ψ ,π/ . The wavelet is centered on the middle of themap for a better visualization. Appendix A: The WPH statistics

We introduce in this appendix the WPH statistics used in thispaper, referring to Allys et al. (2020) and Mallat et al. (2020)for additional details. We also describe the detailed denoisingprocedure introduced in Sect. 2.1. We call X a given randomﬁeld, and x one of its realizations. Appendix A.1: Bump-steerable wavelets

The WPH statistics rely on a bank of wavelets that are dilationsand rotations of a mother bump-steerable wavelet ψ deﬁned inFourier space as (Mallat et al. 2020):ˆ ψ ( k ) = exp  − ( k − ξ ) ξ − ( k − ξ )  · [0 , ξ ] ( k ) × cos L − (arg( k )) · [0 ,π/ ( | arg( k ) | ) , (A.1)with k = (cid:107) k (cid:107) , A ( x ) the indicator function that returns 1 if x ∈ A and 0 otherwise, and ξ = . π the central wavenumber of themother wavelet. The dilated and rotated wavelets are deﬁned as: ψ ξ j ,θ ( r ) = − j ψ (cid:16) − j rot − θ [ r ] (cid:17) , (A.2)with j the index of the dilation by a factor 2 j and θ the angleof rotation. We call ξ j ,θ = − j ξ u θ with u θ = cos θ u x + sin θ u y the central wavevector of the wavelet obtained by such a trans-form. In practice, we consider J dilation indices j going from0 to J −

1, and divide 2 π into 2 L evenly spaced rotation an-gles θ . The bank of wavelets is thus the set of 2 JL wavelets { ψ ξ j ,θ : j = , . . . , J − , θ = , π L , . . . , (2 L − π L } , and these waveletscover most of Fourier space with their respective band-passes.In this paper, we work with 512 ×

512 maps and choose J = L =

8. We show in Fig. A.1 one example wavelet from thebank.

Appendix A.2: WPH moments

The WPH moments of X are covariances of the phase harmon-ics of the wavelet transform of X . With { ψ ξ , . . . , ψ ξ N } our bankof wavelets labelled by their central wavevectors ξ i , the wavelettransform of X corresponds to the set { X (cid:63) ψ ξ , . . . , X (cid:63) ψ ξ N } ,where (cid:63) denotes the convolution operation. These convolutionsamount to a local band-pass ﬁltering of X on the scales probedby each of the wavelets. For an homogeneous X , the WPH mo-ments of X are : C ξ i , p i , ξ j , p j ( τ ) = Cov (cid:16)(cid:104) X (cid:63) ψ ξ i ( r ) (cid:105) p i , (cid:104) X (cid:63) ψ ξ j ( r + τ ) (cid:105) p j (cid:17) , (A.3) with z (cid:55)→ [ z ] p = | z | · e ip arg( z ) the phase harmonic operator .These WPH moments are able to capture interactions be-tween di ﬀ erent scales of X thanks to the phase harmonic op-erator. Indeed, the covariance between X (cid:63) ψ ξ i and X (cid:63) ψ ξ j van-ishes when the wavelets ψ ξ i and ψ ξ j have non-intersecting band-passes, and is otherwise a function of the power spectrum of X only. With proper p i and p j indices, the phase harmonic operatorcan make [ X (cid:63) ψ ξ i ] p i and [ X (cid:63) ψ ξ j ] p j comparable in the sense thatthey share common spatial frequencies, allowing an extraction ofhigh-order information through their covariance.We discretize the τ variable similarly to Allys et al. (2020).This introduces a grid of τ n ,α vectors for each wavelet ψ j ,θ de-ﬁned as: τ n ,α = n j u θ + α , (A.4)with n = , . . . , ∆ n and α ∈ {− π/ , , π/ , π/ } . To avoid a re-dundancy of the information contained in the coe ﬃ cients, wediscard the translations for which n > min( J − − j , ∆ n ).Allys et al. (2020) identiﬁed a relevant set of WPH momentsto build a generative model of real-valued simulated data ofthe large-scale structure of the Universe, which are highly non-Gaussian at late times and at scales smaller than 100 h − Mpc. Weadopt in the present work the same corresponding WPH statis-tics, and apply them similarly on complex-valued data, exceptfor the scaling moments discussed below. In practice we esti-mate a set of moments that can be divided into ﬁve categories: – the S (1 , moments, of the form C ξ , , ξ , , at every τ n ,α with ∆ n =

2. They capture the power spectrum information of the ξ band-pass. – the S (0 , moments, of the form C ξ , , ξ , , at every τ n ,α with ∆ n =

2. They capture an information related to the sparsityof the data in the ξ band-pass. – the S (0 , moments, of the form C ξ , , ξ , , at τ = ξ band-pass. – the C (0 , moments, of the form C ξ , , ξ , , considering0 ≤ j < j ≤ J − | θ − θ | ≤ π , at every τ n ,α with ∆ n = θ = θ and at τ = θ (cid:44) θ . Theycapture an information related to the correlation between lo-cal levels of oscillation for the scales in the ξ and ξ band-passes. – the C phase moments, of the form C ξ , , ξ , p with p = ξ /ξ >

1, considering 0 ≤ j < j ≤ J − θ = θ , at every τ n ,α with ∆ n =

2. They capture an informa-tion related to the statistical phase alignment of oscillationsbetween the scales in the ξ and ξ band-passes.We call ˜ C ξ , p , ξ , p ( τ ) the normalized estimates of the WPHmoments C ξ , p , ξ , p ( τ ). In practice they are deﬁned as followsfor a map x :˜ C ξ , p , ξ , p ( τ ) = (cid:10) x ( ξ , p ) ( r ) x ( ξ , p ) ( r + τ ) (cid:11)(cid:113)(cid:10) | x ( ξ , p )0 | (cid:11)(cid:10) | x ( ξ , p )0 | (cid:11) , (A.5)where the brackets stand for a spatial mean on the r variable, the bar denotes the complex conjugate, and x ( ξ , p ) = [ x (cid:63) ψ ξ ] p − (cid:10) [ x (cid:63) ψ ξ )] p (cid:11) . These estimates depend on areference map x . We found that the choice of x can have asigniﬁcant impact on the results of our denoising method as ex-plained below. These moments do not depend on the r variable because of the ho-mogeneity of X .Article number, page 6 of 9runo Regaldo-Saint Blancard et al.: A new approach for the statistical denoising of Planck interstellar dust polarization data

Appendix A.3: Scaling moments

Similarly to Allys et al. (2020), we complement these statisticsby a small number of coe ﬃ cients, the estimates of the so-called scaling moments L j , p that better constrain the largest scales thatare not probed by the WPH moments. These moments are con-structed from a bank of scaling functions { ϕ j } ≤ j ≤ J − , that cor-respond to dilations of an isotropic Gaussian ﬁlter ϕ deﬁned inFourier space by:ˆ ϕ ( k ) ∝ exp (cid:32) − || k || σ (cid:33) , (A.6)with σ = . × − . ξ . For a real-valued random ﬁeld X , thescaling moments of X are: L j , = Cov (cid:104) | X (cid:63) ϕ j | , | X (cid:63) ϕ j | (cid:105) , (A.7) L j , p = Cov (cid:104)(cid:16) X (cid:63) ϕ j (cid:17) p , (cid:16) X (cid:63) ϕ j (cid:17) p (cid:105) (for p > . (A.8)In practice, we only estimate the scaling moments with p ∈ { , , , } and 2 ≤ j ≤ J −

2. We deﬁne the correspond-ing normalized estimates computed for a map x by:˜ L j , p = (cid:104)| x ( j , p ) | (cid:105)(cid:104)| x ( j , p )0 | (cid:105) , (A.9)with: x ( j , = | ( x − (cid:104) x (cid:105) ) (cid:63) ϕ j | − (cid:104)| ( x − (cid:104) x (cid:105) ) (cid:63) ϕ j |(cid:105) , (A.10) x ( j , p ) = (cid:16) ( x − (cid:104) x (cid:105) ) (cid:63) ϕ j (cid:17) p − (cid:104) (cid:16) ( x − (cid:104) x (cid:105) ) (cid:63) ϕ j (cid:17) p (cid:105) (for p > . (A.11)These normalized estimates depend on the same reference map x introduced for the estimates of the WPH moments of X . Aswe deal with complex maps x in this work, we compute theseestimates for their real and imaginary parts separately. Appendix A.4: Denoising procedure

We apply our denoising method to a simulated noisy map d intwo steps. We perform a ﬁrst denoising of d using a φ operatorthat only takes into account the estimates of the S (1 , moments,related to the power spectrum information, and the estimates ofthe scaling moments, and normalizing the estimates choosing u = d . This ﬁrst step yields a map ˜ s that has a power spectrummuch more consistent with the truth map s . Then, we perform asecond denoising of d using a φ operator that includes the wholeset of estimates of WPH moments and scaling moments, but herenormalizing these estimates with u = ˜ s . We call ˜ s the outputof this second step. This particular choice for the normalizationof the estimates has proven to be more e ﬃ cient at retrieving thepower spectrum and the PDFs of the increments of the truth map s . We show in Table A.1 the resulting number of statisti-cal coe ﬃ cients per class of moments computed for a complex512 ×

512 pixels map x and for the chosen parameters. In thispaper, the φ operator used for the second step of the procedurecomputes a set of 11176 statistical coe ﬃ cients (of which 40 cor-respond to the estimates of the scaling moments), which corre-sponds to approximately 4% of the number of pixels of the inputmap. S (1 , S (0 , S (0 , C (0 , C phase L j , p Total960 960 128 6336 2752 40 11176

Table A.1.

Number of statistical coe ﬃ cients per class of moments forthe parameters of this paper. Appendix B: Comparison to other methods

Appendix B.1: Wiener ﬁltering and posterior sampling

Wiener ﬁltering is a typical approach to signal inference forproblems of the form given by Eq. 1. In this case, the Wienerﬁltered reconstruction ˜ s W (Wiener 1949, Zaroubi et al. 1995) isgiven by˜ s W = Wd = S (cid:0) S + N (cid:1) − d , (B.1)where S = (cid:104) ss † (cid:105) and N = (cid:104) nn † (cid:105) are the signal and noise co-variance matrices, respectively. For brevity we have assumed (cid:104) s (cid:105) = (cid:104) n (cid:105) =

0. This is the linear ﬁlter that gives the expectedminimum variance of residuals ( Wd − s ) for known S and N .It is furthermore both the maximum a posteriori and meanposterior solution if the noise and signal are drawn from Gaus-sian distributions. That is, if the real-valued data are distributedwith a likelihood p ( d | s , N ) = √ (det2 π N ) exp (cid:104) −

12 ( d − s ) † N − ( d − s ) (cid:105) , (B.2)and the prior distribution for the real-valued signal is p ( s | S ) = √ (det2 π S ) exp (cid:104) − s † S − s (cid:105) , (B.3)then the Wiener posterior distribution is given by p ( s | S , N , d ) ∝ p ( d | s , N ) p ( s | S ) ∝ exp (cid:104) −

12 ( s − Wd ) † ( S − + N − )( s − Wd ) (cid:105) . (B.4)For the problem described in this work, the noise covariance N can be estimated from the samples of noise realizations and thesignal covariance S can either be assumed or jointly estimatedfrom the data (Wandelt et al. 2004).We avoid inverting the dense W matrix by implementinga messenger ﬁeld approach ﬁrst described by Elsner & Wan-delt (2013) and now widely used in cosmology (e.g. Jasche &Lavaux 2015; Alsing et al. 2017; Je ﬀ rey et al. 2018a,b; Kodi Ra-manah et al. 2019). We assume the underlying signal is homo-geneous and isotropic, such that the signal covariance S is di-agonal in harmonic space with elements corresponding to theone-dimensional power spectrum. The uncorrelated noise givesa diagonal noise covariance N in pixel space, so that the mes-senger ﬁeld can e ﬃ ciently iterate between harmonic and pixelspace.As we are concerned with polarization data, we use a spin-2harmonic transformation between { s Q , s U } and { (cid:101) s E , (cid:101) s B } , using thecurl-free E-mode and divergence-free B-mode representation.The signal covariance in harmonic space is therefore a concate-nation of { S E , S B } . This formulation preserves the relevant Q - U correlation.Even if the Wiener ﬁltered reconstructed signal ˜ s W is themaximum of the posterior distribution p ( s | d ), functions f (˜ s W )will not correspond to the maximum p ( f ( s ) | d ). For example, thetwo-point statistics (e.g. variance or power spectra) of ˜ s W will Article number, page 7 of 9 & A proofs: manuscript no. main − − k π [px − ] P S [ Q ] ds ˜ s W n d − ˜ s W s − ˜ s W ˜ s W × s − − δQ l /σ l − − − − − − p ( δ Q l ) l = 2 px ds ˜ s W ˜ s W , sample − < ( ˜ s ) : D e n o i s e d W i e n e r Q m a p Fig. B.1.

Left:

Wiener ﬁltered map ˜ s W . Middle:

Same as Fig. 2 but for ˜ s W . The power spectrum is suppressed, as expected; the power spectrum ofa posterior sample s i would be correct by construction. Right:

Same as Fig. 3 for l = − − ∆ x [deg] − − ∆ y [ d e g ] Noisy d − − ∆ x [deg]Denoised ˜ s − − ∆ x [deg]GNILC − − − − U m a p [ µ K C M B ] Fig. B.2.

Same as Fig. 4 for U . not be unbiased estimates of the power spectrum of s . Instead, wecan draw sample images s i from the posterior p ( s | d ) (Eq. B.4),so that the transformed samples f ( s i ) are correctly distributed ac-cording to p ( f ( s ) | d ). These realisations are generated by amend-ing the messenger ﬁeld algorithm (Elsner & Wandelt 2013).Fig. B.1 shows the Wiener ﬁltered reconstruction of the sim-ulated Q map ( left panel ). The low amplitude power spectrumof the residual map (cid:60) ( s − ˜ s W ) demonstrates that the pixel val-ues are close to the truth, whereas the power spectrum of (cid:60) (˜ s W )is biased low ( centre panel ), which is as expected. To comparebetween the methods, we compare with functions of samples s i from the Wiener posterior. Though not plotted here, as we inputthe true power spectrum (which is not done for the WPH denois-ing method), the realisations s i do have the correct power spectrawith relatively small sample variance.As a goal of this denoising work is to retain the statisticalnon-Gaussianity in the signal, which could be represented in theWPH coe ﬃ cients, we again compare the PDF of increments δ Q l .The right panel of B.1 shows the comparison between the data (cid:60) ( d ), the signal (cid:60) ( s ), the Wiener ﬁltered (mean) map (cid:60) (˜ s W ),and a realisation (cid:60) (˜ s W , sample ) drawn from the Wiener posterior.We see that neither the Wiener mean map (cid:60) (˜ s W ) nor the sample (cid:60) (˜ s W , sample ) capture the non-Gaussianity in terms of δ Q l incre-ments as successfully as our WPH denoising method (see Fig. 3).Furthermore, inspection of the WPH coe ﬃ cients of theWiener posterior samples φ ( s i ) shows them to be noticeably fur-ther from those of the truth map than φ (˜ s ). These preliminaryresults are encouraging, but not surprising. The Gaussian prior distribution in the statistical model of the Wiener posterior leadsto a poor recovery of the non-Gaussian structures that are intrin-sic to the polarized dust emission. Appendix B.2: GNILC

The GNILC method is a wavelet-based component separationmethod that is designed to extract the emission of the Galac-tic foregrounds from the

Planck full-sky maps. It has been ap-plied to polarization maps Q and U (Planck Collaboration IV2020; Planck Collaboration XII 2020) to disentangle the thermaldust polarization emission from the CMB polarization and in-strumental noise over the entire sky. We show in Figs. 4 and B.2(right column) the resulting Q and U maps, respectively, for theChamaeleon-Musca ﬁeld observed at 353 GHz. Note that at thisfrequency, the CMB can be safely ignored, so that the main ef-fect of the GNILC algorithm is to denoise the dust emission. Weclearly see on these maps that the smallest scales, most contami-nated by the noise, have been ﬁltered out. Compared to GNILC,our denoised maps thus include a wider variety of structures atintermediate and small scales.A quantitative comparison based on the power spectrum andthe PDFs of the increments for the Q maps is shown in Figs. 5and 6. The power spectrum of the GNILC map plummets atsmall scales and exhibits a lack of power compared to that of˜ s and the cross-spectrum d × d . The PDFs of the incrementsalso show important discrepancies between the ˜ s map and theGNILC map, which suggests a distortion of these statistics by Article number, page 8 of 9runo Regaldo-Saint Blancard et al.: A new approach for the statistical denoising of

Planck interstellar dust polarization data the GNILC method. This quantitative comparison demonstratesthe superiority of our method for recovering the true power spec-trum and the PDFs of the increments.

References

Allys, E., Levrier, F., Zhang, S., et al. 2019, A&A, 629, A115Allys, E., Marchand, T., Cardoso, J. F., et al. 2020, Physical Review D, 102,103506Alsing, J., Heavens, A., & Ja ﬀ e, A. H. 2017, MNRAS, 466, 3272Aylor, K., Haq, M., Knox, L., Hezaveh, Y., & Perreault-Levasseur, L. 2020, MN-RAS, 500, 3889Beck, D., Errard, J., & Stompor, R. 2020, Journal of Cosmology and Astroparti-cle Physics, 2020, 030BICEP2 / Keck and Planck Collaborations. 2015, Physical Review Letters, 114,101301Buades, A., Coll, B., & Morel, J. M. 2010, SIAM Review, 52, 113Burkhart, B., Falceta-Gonçalves, D., Kowal, G., & Lazarian, A. 2009, ApJ, 693,250Byrd, R., Lu, P., Nocedal, J., & Zhu, C. 1995, SIAM Journal on Scientiﬁc Com-puting, 16, 1190Cheng, S., Ting, Y.-S., Ménard, B., & Bruna, J. 2020, MNRAS, 499, 5902Dabov, K., Foi, A., Katkovnik, V., & Egiazarian, K. 2007, IEEE Transactions onImage Processing, 16, 2080Draine, B. T. 2011, Physics of the interstellar and intergalactic medium (Prince-ton University Press)Elsner, F. & Wandelt, B. D. 2013, A&A, 549, A111Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759Hily-Blant, P. & Falgarone, E. 2009, A&A, 500, L29Hily-Blant, P., Falgarone, E., & Pety, J. 2008, A&A, 481, 367Jasche, J. & Lavaux, G. 2015, MNRAS, 447, 1204Je ﬀ rey, N., Abdalla, F. B., Lahav, O., et al. 2018a, MNRAS, 479, 2871Je ﬀ rey, N., Heavens, A. F., & Fortio, P. D. 2018b, Astronomy and Computing,25, 230Kodi Ramanah, D., Lavaux, G., & Wandelt, B. D. 2019, MNRAS, 490, 947Krachmalnico ﬀ , N. & Puglisi, G. 2020 [ arXiv:2011.02221 ]Mallat, S., Zhang, S., & Rochette, G. 2020, Information and Inference: A Journalof the IMA, 9, 721Petro ﬀ , M. A., Addison, G. E., Bennett, C. L., & Weiland, J. L. 2020, ApJ, 903,104Planck Collaboration III. 2020, A&A, 641, A3Planck Collaboration IV. 2020, A&A, 641, A4Planck Collaboration XI. 2020, A&A, 641, A11Planck Collaboration XII. 2020, A&A, 641, A12Planck Collaboration Int. XLVIII. 2016, A&A, 596, A109Regaldo-Saint Blancard, B., Levrier, F., Allys, E., Bellomi, E., & Boulanger, F.2020, A&A, 642, A217Saydjari, A. K., Portillo, S. K. N., Slepian, Z., et al. 2020 [ arXiv:2010.11963 ]Thorne, B., Knox, L., & Prabhu, K. 2021 [ arXiv:2101.11181 ]Wandelt, B. D., Larson, D. L., & Lakshminarayanan, A. 2004, Phys. Rev. D, 70,083511Wiener, N. 1949, Extrapolation, interpolation, and smoothing of stationary timeseries, Vol. 7 (MIT press Cambridge, MA)Zaroubi, S., Ho ﬀ man, Y., Fisher, K. B., & Lahav, O. 1995, ApJ, 449, 446Zhang, K., Zuo, W., Chen, Y., Meng, D., & Zhang, L. 2017, IEEE Transactionson Image Processing, 26, 3142Zhang, S. & Mallat, S. 2019 [ arXiv:1911.10017 ]Zonca, A., Singer, L., Lenz, D., et al. 2019, Journal of Open Source Software, 4,1298]Zonca, A., Singer, L., Lenz, D., et al. 2019, Journal of Open Source Software, 4,1298