Three-Dimensional Reconstruction of Weak Lensing Mass Maps with a Sparsity Prior. I. Cluster Detection

Abstract

We propose a novel method to reconstruct high-resolution three-dimensional mass maps using data from photometric weak-lensing surveys. We apply an adaptive LASSO algorithm to perform a sparsity-based reconstruction on the assumption that the underlying cosmic density field is represented by a sum of Navarro-Frenk-White halos. We generate realistic mock galaxy shape catalogues by considering the shear distortions from isolated halos for the configurations matched to Subaru Hyper Suprime-Cam Survey with its photometric redshift estimates. We show that the adaptive method significantly reduces line-of-sight smearing that is caused by the correlation between the lensing kernels at different redshifts. Lensing clusters with lower mass limits of $10^{14.0}\,h^{-1}M_{\odot}$, $10^{14.7}\,h^{-1}M_{\odot}$, $10^{15.0}\,h^{-1}M_{\odot}$ can be detected with 1.5-$\sigma$ confidence at the low ($z<0.3$), median ($0.3\leq z<0.6$) and high ($0.6\leq z<0.85$) redshifts, respectively, with an average false detection rate of 0.022 deg$^{-2}$. The estimated redshifts of the detected clusters are systematically lower than the true values by $\Delta z \sim 0.03$ for halos at $z\leq 0.4$, but the relative redshift bias is below 0.5% for clusters at $0.4<z\leq 0.85$. The standard deviation of the redshift estimation is 0.092. Our method enables direct three-dimensional cluster detection with accurate redshift estimates.


Draft version February 22, 2021


Xiangchong Li (1, 2), Naoki Yoshida (1, 2, 3), Masamune Oguri (1, 2, 4), Shiro Ikeda (5, 6), and Wentao Luo

(1) Department of Physics, University of Tokyo, Tokyo 113-0033, Japan
(2) Kavli Institute for the Physics and Mathematics of the Universe (WPI), University of Tokyo, Chiba 277-8583, Japan
(3) Institute for Physics of Intelligence, University of Tokyo, Tokyo 113-0033, Japan
(4) Research Center for the Early Universe, University of Tokyo, Tokyo 113-0033, Japan
(5) The Institute of Statistical Mathematics, Tokyo 190-8562, Japan
(6) Department of Statistical Science, Graduate University for Advanced Studies, Tokyo 190-8562, Japan

Keywords: gravitational lensing: weak; galaxies: clusters: general

1. INTRODUCTION

Weak gravitational lensing causes small, coherent distortion of the shapes of distant galaxies. Information on the foreground mass distribution is imprinted in the distorted galaxy images, and thus weak lensing offers a direct, physical probe into the mass distribution in our universe, including both visible matter and invisible dark matter (see Kilbinger 2015; Mandelbaum 2018, for recent reviews). Ongoing and future observational programs such as Subaru HSC (Aihara et al. 2018), KIDS (de Jong et al. 2013), DES (The Dark Energy Survey Collaboration 2005), LSST (LSST Science Collaboration et al. 2009), Euclid (Laureijs et al. 2011), and NGRST (Spergel et al. 2015) are aimed at studying the large-scale mass distribution with high precision.

Statistics of density peaks in two-dimensional and three-dimensional mass maps can be used as a powerful cosmological probe (Jain & Van Waerbeke 2000; Fan et al. 2010; Lin et al. 2016). One can detect massive clusters of galaxies by identifying high signal-to-noise ratio (SNR) peaks in a mass map without any reference to the mass-to-light ratio (Schneider 1996; Hamana et al. 2004).

Two-dimensional (2D) density reconstruction techniques, which recover the mass projected along the line of sight, have been extensively studied (Kaiser & Squires 1993; Lanusse et al. 2016; Price et al. 2020) and applied to large-scale surveys (Oguri et al. 2018; Chang et al. 2018; Jeffrey et al. 2018). Cluster detection and identification from 2D mass maps has been applied to wide-field weak-lensing surveys (Shan et al. 2012; Miyazaki et al. 2018a; Hamana et al. 2020). Cross-matching with another cluster catalogue (e.g., an optically selected one) is usually needed to infer physical quantities such as mass and to estimate the redshift of individual clusters located in 2D mass maps.

In principle, one can directly reconstruct three-dimensional (3D) mass distributions by using photometric redshift (photo-z) information of the source galaxies (Hu & Keeton 2002; Bacon & Taylor 2003; Massey et al. 2007; Simon et al. 2009; VanderPlas et al. 2011). Unfortunately, these methods either do not have enough spatial resolution to identify individual clusters, or suffer from smearing along the line of sight. These are critical obstacles that need to be overcome for practical searches of clusters in 3D mass maps. Alternatively, Hennawi & Spergel (2005) propose to perform a maximum-likelihood detection of clusters, by convolving tomographic shear measurements with 3D filters that match the tangential shears induced by multi-scale NFW halos.
Their method can be used effectively to detect clusters, but does not fully reconstruct wide-field mass distributions.

In the present paper, we develop a novel method for high-resolution 3D reconstruction. We model a given 3D density field as a sum of NFW (Navarro et al. 1997) basis 'atoms'. A basis atom is defined by a 2D NFW surface density profile on the transverse plane and a one-dimensional Dirac delta function in the line-of-sight direction. We apply the adaptive LASSO algorithm (Zou 2006) to find a sparse solution for a pixelized map. We examine the performance of cluster detection using the reconstructed mass maps. To this end, we apply shear distortions generated by isolated halos to realistic HSC-like galaxy shapes with photo-z estimates.

The rest of the paper is organized as follows. In Section 2, we propose the new method for 3D density map reconstruction. In Section 3, we study cluster detection from the reconstructed mass map using isolated halo simulations with the HSC observational condition. In Section 4, we summarize and discuss the future development of the method.

Throughout the present paper, we adopt the ΛCDM cosmology of the final full-mission Planck observation of the cosmic microwave background with $H_0$ = 67.[…] km s$^{-1}$ Mpc$^{-1}$, $\Omega_m$ = 0.[…], $\Omega_\Lambda$ = 0.[…], $\sigma_8$ = 0.[…], and $n_s$ = 0.965 (Planck Collaboration et al. 2020).

2. METHOD

2.1. 3D mass reconstruction

The lensing shear $\gamma$ on galaxy shapes is related to the foreground density contrast field $\delta$ through a linear transformation

$$\gamma = \mathbf{T}\,\delta, \tag{1}$$

where the linear transformation operator $\mathbf{T}$ includes not only the physical lensing effect but also the systematic effects in observations such as pixelization and smoothing of the shear field in the transverse plane.

To reconstruct the 3D mass density distribution $\delta$ from shear measurements with photometric redshifts, we model the density contrast field as a sum of basis atoms in a 'dictionary' as

$$\delta = \mathbf{\Phi}\,x, \tag{2}$$

where $\mathbf{\Phi}$ is the transformation operator from the projection coefficient vector $x$ to the density contrast. There have been a few studies that adopt different atoms and dictionaries. Simon et al. (2009) perform reconstruction in Fourier space, which is equivalent to representing a mass map with sinusoidal functions. Leonard et al. (2014) model the mass field with starlets (Starck et al. 2015).

The projection coefficients can be estimated by optimizing a regularized loss function. An estimator is generally defined as

$$\hat{x} = \arg\min_{x} \left\{ \frac{1}{2}\left\| \Sigma^{-1/2}\left(\gamma - \mathbf{T}\mathbf{\Phi}x\right) \right\|_2^2 + \lambda C(x) \right\}, \tag{3}$$

where $\left\| \Sigma^{-1/2}(\gamma - \mathbf{T}\mathbf{\Phi}x) \right\|_2^2$ is the squared $l_2$ norm measuring the difference between the prediction and the data, and $C(x)$ is the regularization term measuring the deviation of the coefficient estimate $x$ from the prior assumption. The estimation with the 'penalty' term prefers parameters that are able to describe the observation while conforming to the specified prior information. The regularization parameter $\lambda$ adjusts the relative weight between the data and the prior assumption in the optimization process.

Simon et al. (2009) propose to use the Wiener filter, which is also known as $l_2$ ridge regularization ($C = \|x\|_2^2$), to find a regularized solution in Fourier space. Oguri et al. (2018) apply the method of Simon et al. (2009) to the first-year data of the Hyper Suprime-Cam Survey (Aihara et al. 2018).
It is found that the density maps reconstructed by the method suffer from significant line-of-sight smearing with a standard deviation of $\sigma_z$ = 0.[…] to 0.[…]. Leonard et al. (2014) apply $l_1$ LASSO regularization ($C = \|x\|_1$) in their GLIMPSE algorithm to find a sparse solution in the starlet dictionary space (Starck et al. 2015). GLIMPSE reduces the smearing by adopting a coordinate descent algorithm that forces the structure to grow only on the most related lens redshift plane. The starlet dictionary does not account for the angular scale difference at different lens redshifts, and is not specifically designed to model the clumpy mass distribution in the universe. It is worth exploring other dictionaries for our purpose of weak-lensing mass reconstruction.

Footnote 1. The chi-square term in eq. (3) is weighted by the inverse of the diagonal covariance matrix $\Sigma$ of the error on the shear measurements.

[Figure 1: 2-D profiles of basis atoms. Columns: Point Mass, NFW $r_s = 1$, NFW $r_s = 2$, NFW $r_s = 4$. Upper row: $\tilde{\Sigma}(k)$; lower row: $\Sigma(x)$.]

Figure 1. The smoothed basis 'atoms'. The leftmost column is the point mass atom, and the other columns show the NFW atoms with different scale radii as indicated. The upper panels show the basis atoms in Fourier space, whereas the lower panels show realizations in real space. We smooth the two-dimensional distribution with a Gaussian kernel of scale 1.[…].

[Figure 2: 1-D profiles of basis atoms as a function of $x$ (pixel). Curves: Point mass, NFW $r_s = 1$, NFW $r_s = 2$, NFW $r_s = 4$, NFW $r_s = 8$.]

Figure 2. The one-dimensional profile of smoothed basis atoms centered at $x = 0$. The corresponding 2-D profiles are shown in Figure 1.

In the standard cosmological model, dark matter is concentrated in roughly spherical 'halos', which have the so-called NFW density profile (Navarro et al. 1997). Motivated by this fact, we generate a model dictionary with multi-scale NFW 'atoms'. An atom has the NFW surface density profile on the transverse plane (Takada & Jain 2003). Following Leonard et al. (2014), we neglect the size of halos along the line of sight since the resolution scale of the reconstruction is much larger than the halos. Thus we set the profile of an atom in the line-of-sight direction to the Dirac $\delta$ function. We assume that the halos are sparsely distributed in the universe. With the sparsity prior, the adaptive LASSO regularization (Zou 2006) can be used to reconstruct the density field. We expect the adaptive LASSO to reduce the smearing effect, in contrast to the ordinary LASSO estimator, which tends to smear the structure along the line of sight (Leonard et al. 2014).

Our choice of the NFW atoms and dictionary is motivated by physical considerations on the clumpy mass distribution in the universe. Furthermore, the adaptive LASSO loss function is strictly convex and can be directly optimized with the FISTA algorithm (Beck & Teboulle 2009) without relying on any greedy coordinate descent approaches.

2.2. Gravitational Lensing

[Figure 3: photo-z binning. Horizontal axis: $z_{\rm best}$; vertical axis: $P(z_{\rm best})$.]

Figure 3. The source galaxies are binned into 10 redshift bins according to their Machine Learning and photo-Z (MLZ) best photo-z estimation. The blue histogram is the normalized number distribution of the best photo-z estimation. The vertical dashed lines are the boundaries of the redshift bins. The galaxies are equal-number binned.

[Figure 4: panel title "Average Photo-z PDF for 10 lens redshift bins". Horizontal axis: $z$; vertical axis: $P(z|z_{\rm best})$; one curve per bin, labelled by its mean redshift $\langle z \rangle$.]

Figure 4. The averaged PDF of MLZ photo-z in the 10 source redshift bins defined in Figure 3.

The lensing convergence for sources at the comoving distance $\chi_s$ is contributed by the foreground inhomogeneous density distribution as

$$\kappa(\vec{\theta}, \chi_s) = \frac{3H_0^2\Omega_m}{2c^2} \int_0^{\chi_s} d\chi_l\, \frac{\chi_l \chi_{sl}}{\chi_s}\, \frac{\delta(\vec{\theta}, \chi_l)}{a(\chi_l)}, \tag{4}$$

where $\delta = \rho(\vec{\theta}, \chi_l)/\bar{\rho} - 1$ is the density contrast, $c$ is the speed of light, $\chi_{sl}$ is the comoving distance between the source and lens planes, and $a(\chi_l)$ is the scale factor at the lens position.

The corresponding convergence for sources at redshift $z_s$ is

$$\kappa(\vec{\theta}, z_s) = \int_0^{z_s} dz_l\, K(z_l, z_s)\, \delta(\vec{\theta}, z_l). \tag{5}$$

Here $K(z_l, z_s)$ is the so-called lensing kernel given by

$$K(z_l, z_s) = \begin{cases} \dfrac{3H_0\Omega_m}{2c}\, \dfrac{\chi_l \chi_{sl} (1+z_l)}{\chi_s E(z_l)} & (z_s > z_l), \\ 0 & (z_s \leq z_l), \end{cases} \tag{6}$$

where $E(z)$ is the Hubble parameter as a function of redshift, in units of $H_0$.

Following Kaiser & Squires (1993), we relate the shear field to the kappa field at the same source redshift by

$$\gamma_L(\vec{\theta}, z_s) = \int d^2\theta'\, D(\vec{\theta} - \vec{\theta}')\, \kappa(\vec{\theta}', z_s), \tag{7}$$

where

$$D(\vec{\theta}) = D(\theta_1, \theta_2) = -\frac{1}{\pi}\,(\theta_1 - i\theta_2)^{-2}. \tag{8}$$

Here we denote the physical shear distortion as $\gamma_L$, which is not the final shear measurement since shear measurement is influenced by systematic errors in observations. The relevant systematic errors will be discussed in detail in Section 2.4.

Combining eq. (5) with eq. (7), the expectation of the lensing shear signal is

$$\gamma_L(\vec{\theta}, z_s) = \int_0^{z_s} dz_l\, K(z_l, z_s) \int d^2\theta'\, D(\vec{\theta} - \vec{\theta}')\, \delta(\vec{\theta}', z_l). \tag{9}$$

To simplify the expression, we define the lensing transformation operator as

$$\mathbf{Q} = \int_0^{z_s} dz_l\, K(z_l, z_s) \int d^2\theta'\, D(\vec{\theta} - \vec{\theta}'), \tag{10}$$

and then eq. (9) reduces to

$$\gamma_L = \mathbf{Q}\,\delta. \tag{11}$$

2.3. Dictionary

The density contrast field is modeled as a sum of basis atoms $\phi_s$ in the dictionary:

$$\delta(\vec{r}) = \sum_{s=1}^{N} \int d^3r'\, \phi_s(\vec{r} - \vec{r}')\, x_s(\vec{r}'), \tag{12}$$

where $x_s(\vec{r}')$ is the projection coefficient of the density contrast field onto the basis atoms at the comoving coordinate $\vec{r}'$. The basis atoms have $N$ different scale frames, and the atoms in each scale frame are shifted by $\vec{r}'$ to form models at different positions in the comoving coordinates.

We propose to use multi-scale NFW atoms, denoted as $\{\phi_1, ..., \phi_N\}$, as the basis atoms of our dictionary. In the present paper, we adopt a hard truncation on the NFW profile at the virial radius. Studying the influence of different truncation forms (Oguri & Hamana 2011) on the mass map reconstruction is left to our future work. The multi-scale NFW atoms are defined as

$$\phi_s(\vec{r}_\theta, z) = \frac{f}{2\pi r_s^2}\, F(|\vec{r}_\theta|/r_s)\, \delta_D(z), \quad (s = 1, ..., N) \tag{13}$$

where $\vec{r}_\theta$ is the comoving position in the transverse plane,

$$F(x) = \begin{cases} -\dfrac{\sqrt{c^2 - x^2}}{(1 - x^2)(1 + c)} + \dfrac{1}{(1 - x^2)^{3/2}}\, \mathrm{arccosh}\!\left(\dfrac{x^2 + c}{x(1 + c)}\right) & (x < 1), \\ \dfrac{\sqrt{c^2 - 1}}{3(1 + c)} \left(1 + \dfrac{1}{c + 1}\right) & (x = 1), \\ -\dfrac{\sqrt{c^2 - x^2}}{(1 - x^2)(1 + c)} - \dfrac{1}{(x^2 - 1)^{3/2}}\, \arccos\!\left(\dfrac{x^2 + c}{x(1 + c)}\right) & (1 < x \leq c), \\ 0 & (x > c), \end{cases} \tag{14}$$

with $f = 1/[\ln(1 + c) - c/(1 + c)]$. For simplicity, we fix $c = 4$ for the NFW atoms in different scale frames. We then write the basis atoms in the angular separation coordinates as

$$\phi_s(\vec{\theta}, z) = \frac{f \chi^2(z)}{2\pi r_s^2}\, F(|\vec{\theta}|\chi(z)/r_s)\, \delta_D(z), \quad (s = 1, ..., N) \tag{15}$$

where $\chi(z)$ is the comoving distance to redshift $z$.

To simplify the notation, we compress the projection coefficients into a column vector

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \tag{16}$$

and compress the dictionary transform operator into a row vector:

$$\mathbf{\Phi} = \left( \int d^3r\, \phi_1(\vec{r}) \quad \int d^3r\, \phi_2(\vec{r}) \quad \cdots \quad \int d^3r\, \phi_N(\vec{r}) \right). \tag{17}$$

From eqs. (9) to (12), we obtain the following expression in linear operator form:

$$\gamma_L = \mathbf{Q}\mathbf{\Phi}\,x. \tag{18}$$

For an additional test and comparison, we also construct a dictionary with point mass atoms, which are represented by the 3D Dirac function as

$$\phi_{\rm PM}(\vec{\theta}, z) = \delta_D(\theta_1)\, \delta_D(\theta_2)\, \delta_D(z). \tag{19}$$

The 2D profiles of the point mass atom and the NFW atoms on the transverse plane are shown in Figure 1. The corresponding 1D profiles are plotted in Figure 2. The profiles are smoothed with a Gaussian kernel and pixelized on linearly spaced grids. Details on the smoothing and pixelizing operations are described in Sections 2.4.2 and 2.4.4.

[Figure 5: left panel: lensing kernel versus source redshift for lenses at five redshifts $z_l$; middle panel: correlation matrix, spec-z ($z_l$ bin versus $z_l$ bin); right panel: correlation matrix, photo-z ($z_l$ bin versus $z_l$ bin).]

Figure 5. The left panel shows the lensing kernels for five different lens redshifts. The dashed lines are the kernels for source galaxies with precise spectroscopic redshifts, whereas the solid lines are for photometric redshifts, with the uncertainties discussed in Section 2.4.1 incorporated. The right two panels show the correlation between the lensing kernels of different lens redshifts. The middle panel is for spectroscopic redshift, and the right panel is for photometric redshift. The lensing kernels are normalized so that the diagonal elements of the correlation matrices equal unity.

2.4. Systematics

Shear measurement deviates from the true, physical shear because of a variety of systematic errors in actual observations. In this section, we first discuss the influence of several major systematics, and then incorporate them in our forward modeling.

2.4.1. Photo-z Uncertainty

Photometric redshifts are estimated with a limited number of broad optical and near-infrared bands in the current generation of weak-lensing surveys. For example, 9 bands are used for the KIDS+VIKING survey (Hildebrandt et al. 2020), and 5 bands for the DES and HSC surveys. The redshift estimation suffers from considerable uncertainties. In our study, photo-z uncertainties result in smearing the lensing kernels statistically because a galaxy with a best-fit photo-z estimate $z_s$ may actually be located at a different redshift $z$. Let us denote the probability distribution function as $P(z|z_s)$. Then the expected shear distortion of the galaxy is

$$\gamma_{\rm exp} = \int dz_s\, P(z|z_s)\, \gamma_L(\vec{\theta}, z_s). \tag{20}$$

With the photo-z smearing operator defined as

$$\mathbf{P} = \int dz_s\, P(z|z_s), \tag{21}$$

we can estimate the influence of photo-z uncertainty on the shear signal by replacing

$$\gamma_L \to \mathbf{P}\gamma_L. \tag{22}$$

[Figure 6: reconstructed density maps, with the line-of-sight direction ($z$) on the vertical axis. Left panel: LASSO; right panel: adaptive LASSO.]

Figure 6. The density map reconstruction with LASSO (left) and with our adaptive LASSO (right) algorithm. The vertical direction is the line-of-sight direction (redshift), and the lower and upper boundaries of the plotted region are $z$ = 0.[…] and $z$ = 0.85, respectively. Galaxy shape measurement errors and photo-z uncertainties are not considered in the simulation. The mass of the halo is $M_{200} = 10^{[…]}\,h^{-1}M_\odot$, and its redshift is $z$ = 0.[…].

Figure 3 shows the histogram of the best-fit photo-z estimates from the machine learning and photo-Z method (MLZ; Carrasco Kind & Brunner 2013), as computed by Tanaka et al. (2018), for 9347 galaxies of the HSC S16A data release (Aihara et al. 2018). The galaxies are divided into ten bins according to the photo-z best-fit estimates. Figure 4 shows the averaged probability density function (PDF) for galaxies in each redshift bin.

Figure 5 shows the lensing kernels for lenses at five different redshifts. We compare with the lensing kernels for spectroscopic redshifts, which vanish at source redshifts lower than the lens redshift. In contrast, with realistic photo-z uncertainties, the lensing kernels do not vanish at low source redshift. This is simply because galaxies with photo-z estimates lower than the lens redshift may actually be located at higher redshifts.

The middle and right panels of Figure 5 show the correlation between lensing kernels. The middle panel is for spec-z, and the right panel is for photo-z. As demonstrated in the middle panel, the lensing kernels are highly correlated even though the redshift estimation is precise.
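The strong correlation between spectroscopic-redshift lensing kernels can be checked with a short numerical sketch. The code below evaluates eq. (6) on a grid and forms the normalized inner products between kernels at different lens redshifts. It is an illustration under stated assumptions, not the pipeline used in this work: it assumes a flat ΛCDM model with an assumed $\Omega_m = 0.315$, perfect (spec-z) source redshifts, hypothetical lens and source grids, and hypothetical function names (`hubble_ratio`, `comoving_distance`, `lensing_kernel`, `kernel_correlation`).

```python
import numpy as np

OMEGA_M = 0.315  # assumed matter density parameter (flat LambdaCDM)


def hubble_ratio(z, omega_m=OMEGA_M):
    """E(z) = H(z) / H0 for a flat LambdaCDM model."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def comoving_distance(z, omega_m=OMEGA_M, z_max=3.0, n_grid=4096):
    """Comoving distance in units of c / H0 (trapezoidal rule, then interpolation)."""
    grid = np.linspace(0.0, z_max, n_grid)
    integrand = 1.0 / hubble_ratio(grid, omega_m)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)
    chi_grid = np.concatenate(([0.0], np.cumsum(steps)))
    return np.interp(z, grid, chi_grid)


def lensing_kernel(z_lens, z_src, omega_m=OMEGA_M):
    """K(z_l, z_s) of eq. (6) with distances in units of c / H0.

    Returns an array of shape (len(z_lens), len(z_src)); source redshifts
    must be positive so that chi_s > 0.
    """
    z_lens = np.asarray(z_lens, dtype=float)
    z_src = np.asarray(z_src, dtype=float)
    chi_l = comoving_distance(z_lens, omega_m)[:, None]
    chi_s = comoving_distance(z_src, omega_m)[None, :]
    z_l = z_lens[:, None]
    kern = (1.5 * omega_m * chi_l * (chi_s - chi_l) * (1.0 + z_l)
            / (chi_s * hubble_ratio(z_l, omega_m)))
    # The kernel vanishes for sources at or in front of the lens plane.
    return np.where(z_src[None, :] > z_l, kern, 0.0)


def kernel_correlation(kern):
    """Normalized inner products between kernel rows (unit diagonal)."""
    norm = np.linalg.norm(kern, axis=1)
    return (kern @ kern.T) / np.outer(norm, norm)


z_lens = np.linspace(0.05, 0.85, 20)  # hypothetical lens-plane grid
z_src = np.linspace(0.02, 2.5, 125)   # hypothetical spec-z source grid
corr = kernel_correlation(lensing_kernel(z_lens, z_src))
print("correlation, first two lens planes:", corr[0, 1])
print("correlation, first and last lens planes:", corr[0, -1])
```

Neighbouring lens planes give kernels with almost the same shape as a function of source redshift, which is why the off-diagonal elements stay close to unity even without any photo-z smearing.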
Comparing the correlation matrices shown in the middle panel and the right panel of Figure 5, we conclude that the photo-z uncertainty only slightly increases the correlations between lensing kernels at different lens planes.

As will be shown in Section 2.5, the mass map reconstruction with the standard LASSO suffers from significant line-of-sight smearing even in the absence of shape estimation errors and photo-z uncertainties. The smearing intrinsically originates from the correlation between lensing kernels, which hinders the LASSO estimator from precisely locating mass clumps in the line-of-sight direction. Therefore, precise spectroscopic redshift estimation, even if it were available, would not greatly help to mitigate the smearing.

2.4.2. Smoothing

The source galaxies are not uniformly distributed in the sky, with substantial fluctuations in the number density. To boost the computational speed, we first smooth the shear field, and pixelize the smoothed shear field onto a regular grid. After these procedures, the fast Fourier transform (FFT) can be directly conducted on the transverse plane in each redshift bin. Another benefit of smoothing is that it reduces bias arising from the aliasing effect in the pixelization process, because the smoothing kernel reduces the amplitude of the shear signal at high frequency.

We convolve the measured shear field with a smoothing kernel as

$$\gamma_{\rm sm}(\vec{\theta}) = \frac{\sum_i W(\vec{\theta} - \vec{\theta}_i, z - z_i)\, \gamma_i}{\sum_i W(\vec{\theta} - \vec{\theta}_i, z - z_i)}, \tag{23}$$

where $W(\vec{\theta}, z)$ is a 3D smoothing kernel, and $\gamma_i$, $z_i$ and $\vec{\theta}_i$ refer to the shear, photometric redshift, and transverse position of the $i$-th galaxy in the catalog. The kernel $W(\vec{\theta}, z)$ can be decomposed into a transverse component $W_T(\vec{\theta})$ and a line-of-sight component $W_l(z)$ as

$$W(\vec{\theta}, z) = W_T(\vec{\theta})\, W_l(z). \tag{24}$$

We use an isotropic 2D Gaussian kernel and a 1D top-hat kernel to smooth the shear field in the transverse plane and in the line-of-sight direction, respectively. These components are given by

$$W_T(\vec{\theta}) = \frac{1}{2\pi\beta^2} \exp\left(-\frac{|\vec{\theta}|^2}{2\beta^2}\right), \qquad W_l(z) = \begin{cases} 1/\Delta z & (|z| < \Delta z/2), \\ 0 & (\text{otherwise}). \end{cases} \tag{25}$$

By definition, the smoothing kernel is normalized as

$$\int d^3r\, W(\vec{r}) = 1. \tag{26}$$

In this paper, we set $\beta$ = 1.5 arcmin.

The smoothing operator is defined as

$$\mathbf{W} = \int d^3r'\, W(\vec{r} - \vec{r}'), \tag{27}$$

and the smoothing procedure is expressed as an operation on the shear field as

$$\gamma_L \to \mathbf{W}\gamma_L. \tag{28}$$

As we will discuss in Section 2.4.4, the smoothed shear field is pixelized onto an equally spaced grid. We note that another widely used scheme is to average the shear measurements in each pixel.
Such a scheme is equivalent to resampling the shear field smoothed with a 3D top-hat kernel with the same scale as the pixels.

2.4.3. Masking

In real observations, shear measurements can be performed only in a finite region of the sky, and the boundaries often have complicated geometries. Moreover, many isolated sub-regions near bright stars are masked out. We define the masking window function as

$$M(\vec{r}) = \begin{cases} 1 & (n_{\rm sm} \geq [\ldots]), \\ 0 & (\text{otherwise}), \end{cases} \tag{29}$$

where $n_{\rm sm}$ is the smoothed galaxy number density. We define the masking operator as

$$\mathbf{M} = \int d^3r'\, M(\vec{r}')\, \delta_D(\vec{r} - \vec{r}'), \tag{30}$$

where $\delta_D(\vec{r})$ is the 3D Dirac delta function. The masking operation acts as

$$\gamma_L \to \mathbf{M}\gamma_L. \tag{31}$$

The final observed shear field, taking into account all of the systematics introduced above, is

$$\gamma = \mathbf{M}\mathbf{W}\mathbf{P}\mathbf{Q}\mathbf{\Phi}\,x. \tag{32}$$

For simplicity, we denote $\mathbf{A} = \mathbf{M}\mathbf{W}\mathbf{P}\mathbf{Q}\mathbf{\Phi}$ and rewrite eq. (32) as

$$\gamma = \mathbf{A}\,x. \tag{33}$$

2.4.4. Pixelization

We pixelize the smoothed shear field on an $N_\theta \times N_\theta \times N_s$ grid, where $N_\theta$ is the number of pixels along each of the two orthogonal axes of the transverse plane, and $N_s$ is the number of pixels in the line-of-sight direction. We denote by $\gamma_\alpha$ the smoothed shear measurement recorded on the pixel with index $\alpha$, where $\alpha = 1, ..., N_\theta \times N_\theta \times N_s$. The grids on the transverse planes are equally spaced with a pixel size of $1'$. In the line-of-sight direction, we set the binning with equal galaxy number as shown in Figure 3.

Similarly, we pixelize the projection coefficient field $x$ onto an $N_\theta \times N_\theta \times N_l$ grid. The projection coefficient field is pixelized in equal spacing ranging from redshift 0.[…] to 0.85. Here, we use $N_l$ to denote the number of lens planes and $x_\beta$ to denote the projection field element with index $\beta = 1, ..., N_\theta \times N_\theta \times N_l \times N$, where the last $N$ is the number of NFW dictionary frames representing different physical scale radii. The corresponding pixelized elements of the forward transform matrix $\mathbf{A}$ are denoted as $A_{\alpha\beta}$.

We term the column vectors of the forward transform matrix $\mathbf{A}$ the effective basis atoms. The $l_2$ norm of the $i$-th column vector, weighted by the inverse of the diagonal elements of the noise covariance matrix ($\Sigma_{\alpha\alpha}$), is defined as

$$\mathcal{N}_i = \left( \sum_\alpha A^*_{\alpha i} A_{\alpha i} / \Sigma_{\alpha\alpha} \right)^{1/2}. \tag{34}$$

We note that different effective basis atoms have different weighted $l_2$ norms. Before performing the density map reconstruction (solving eq. [33]), we normalize the column vectors of the transform matrix through the following rescaling:

$$A'_{\alpha\beta} = A_{\alpha\beta} / \mathcal{N}_\beta, \qquad x'_\beta = x_\beta\, \mathcal{N}_\beta, \tag{35}$$

so that the rescaled column vectors all have the same weighted $l_2$ norm. Such normalization boosts the speed of the gradient descent iterations we perform.

2.5. Density map reconstruction

2.5.1. Adaptive LASSO

The LASSO algorithm uses the $l_1$ norm of the projection coefficient field to regularize the modeling. The estimator is defined as

$$\hat{x}'_{\rm LASSO} = \arg\min_{x'} \left\{ \frac{1}{2}\left\| \Sigma^{-1/2}\left(\gamma - \mathbf{A}'x'\right) \right\|_2^2 + \lambda \left\| x' \right\|_1 \right\}, \tag{36}$$

where $\|\bullet\|_1$ and $\|\bullet\|_2$ refer to the $l_1$ norm and $l_2$ norm, respectively, and $\lambda$ is the penalty parameter for the LASSO estimation.

The LASSO algorithm searches and selects the parameters that are relevant to the measurements, and simultaneously estimates the values of the selected parameters. It has been shown by Zou (2006) that when the column vectors of the transform matrix $\mathbf{A}'$ are highly correlated, the algorithm cannot consistently determine the relevant parameters. Moreover, the estimated parameters are often biased owing to the shrinkage in the LASSO regression. We note that, for the density map reconstruction problem here, the column vectors are highly correlated even in the absence of photometric redshift uncertainties, because the lensing kernels for lenses at different redshifts overlap significantly, i.e., they are highly correlated, as shown in Figure 5. Therefore, the LASSO algorithm cannot precisely determine the mass distribution in redshift, and the reconstructed map suffers from smearing in the line-of-sight direction even in the absence of noise.

Figure 6 shows an example reconstruction for a single halo with mass $M_{200} = 10^{[…]}\,h^{-1}M_\odot$ at redshift 0.[…]. The shear measurement error and the photo-z uncertainty are not included in this simulation. We find significant smearing of the mass distribution with the LASSO algorithm (left panel of Figure 6).

To overcome the problem, Zou (2006) proposes an adaptive LASSO algorithm, which uses adaptive weights to penalize different projection coefficients in the $l_1$ penalty. The adaptive LASSO algorithm performs a two-step process.
In the first step, the standard (non-adaptive) LASSO is used to estimate the parameters. Let us denote the preliminary estimate as $\hat{x}'_{\rm LASSO}$. In the second step, the preliminary estimate is used to calculate the non-negative weight vector for penalization as

$$\hat{w} = \frac{1}{\left| \hat{x}'_{\rm LASSO} \right|^{\tau}}, \tag{37}$$

where we set the hyper-parameter $\tau$ to 2. The adaptive LASSO estimator is then given by

$$\hat{x}' = \arg\min_{x'} \left\{ \frac{1}{2}\left\| \Sigma^{-1/2}\left(\gamma - \mathbf{A}'x'\right) \right\|_2^2 + \lambda_{\rm ada} \left\| \hat{w} \cdot x' \right\|_1 \right\}. \tag{38}$$

Here $\lambda_{\rm ada}$ is the penalty parameter for the adaptive LASSO, which does not need to be the same as the penalty parameter $\lambda$ for the preliminary LASSO estimation.

We rewrite the loss function with the Einstein notation:

$$L(x') = \frac{1}{2} (\Sigma^{-1})_{\alpha\beta} (\gamma^*_\alpha - A'^*_{\alpha i} x'_i)(\gamma_\beta - A'_{\beta j} x'_j) + \lambda_{\rm ada}\, \hat{w}_\beta \left| x'_\beta \right|. \tag{39}$$

To simplify the equations in the following, we define the quadratic term in the loss function as

$$G(x') = \frac{1}{2} \Sigma^{-1}_{\alpha\beta} (\gamma^*_\alpha - A'^*_{\alpha i} x'_i)(\gamma_\beta - A'_{\beta j} x'_j). \tag{40}$$

Footnote 2. We set the critical over-density to 200, and use $M_{200}$ to denote the halo mass.

2.5.2. FISTA

Beck & Teboulle (2009) propose the Fast Iterative Soft Thresholding Algorithm (FISTA) to solve the LASSO problem. Since the loss functions of the LASSO and the adaptive LASSO differ only in their penalization terms, FISTA is also applicable to the adaptive LASSO problem in a straightforward manner. In this paper, we apply FISTA to solve both the preliminary LASSO estimation and the adaptive LASSO estimation.

We first explain the LASSO preliminary estimation. The coefficients are initialized as $x'^{(1)}_i = 0$. According to FISTA, we iteratively update the projection coefficient field $x'$. Taking the $n$-th iteration as an example, a temporary update is first calculated as

$$x'^{(n+1)}_i = {\rm ST}_\lambda \left( x'^{(n)}_i - \mu\, \partial_i G(x'^{(n)}) \right), \tag{41}$$

where ST is the soft thresholding function defined as

$${\rm ST}_\lambda(x') = {\rm sign}(x') \max\left\{ |x'| - \lambda,\, 0 \right\}. \tag{42}$$

The soft thresholding is a part of the LASSO algorithm, which selects the modes with amplitude greater than $\lambda$, and shrinks the selected estimates by $\lambda$. The coefficient $\mu$ is the step size of the gradient descent iteration, and $\partial_i G(x'^{(n)})$ is the $i$-th element of the gradient vector of $G$ at point $x'^{(n)}$:

$$\partial_i G(x'^{(n)}) = -\Sigma^{-1}_{\alpha\beta}\, {\rm Re}\left\{ A'^*_{\alpha i} (\gamma_\beta - A'_{\beta j} x'^{(n)}_j) \right\}, \tag{43}$$

where ${\rm Re}\{\bullet\}$ returns the real part of its argument; the overall minus sign follows from differentiating eq. (40) and makes the update in eq. (41) a descent step. FISTA requires an additional update using a weighted average between $x'^{(n+1)}$ and $x'^{(n)}$ as

$$t^{(n+1)} = \frac{1 + \sqrt{1 + 4\,(t^{(n)})^2}}{2}, \qquad x'^{(n+1)} \leftarrow x'^{(n+1)} + \frac{t^{(n)} - 1}{t^{(n+1)}} \left( x'^{(n+1)} - x'^{(n)} \right), \tag{44}$$

where the relative weight is initialized as $t^{(1)} = 1$. FISTA converges as long as the gradient descent step size $\mu$ satisfies

$$0 < \mu < \frac{1}{\left\| \mathbf{A}^\dagger \Sigma^{-1} \mathbf{A} \right\|}, \tag{45}$$

where $\left\| \mathbf{A}^\dagger \Sigma^{-1} \mathbf{A} \right\|$ refers to the spectral norm of the matrix $\mathbf{A}^\dagger \Sigma^{-1} \mathbf{A}$.
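A short worked step may help to see where the bound on the step size comes from. This is the standard Lipschitz argument for proximal-gradient methods, given here as a sketch rather than taken from the text; it assumes real-valued coefficients and writes $\mathbf{A}$ for the matrix that enters $G$ (the normalized matrix of eq. 35). The gradient of the quadratic term is affine in $x'$, so for any two coefficient vectors $x'$ and $y'$

$$\nabla G(x') - \nabla G(y') = {\rm Re}\left\{ \mathbf{A}^\dagger \Sigma^{-1} \mathbf{A} \right\} (x' - y'),$$

$$\left\| \nabla G(x') - \nabla G(y') \right\|_2 \leq \left\| \mathbf{A}^\dagger \Sigma^{-1} \mathbf{A} \right\| \left\| x' - y' \right\|_2 .$$

The gradient is therefore Lipschitz continuous with a constant $L$ no larger than the spectral norm, and a step size below $1/L$ is the condition under which the convergence guarantee of Beck & Teboulle (2009) applies. An estimate of the spectral norm that is too small would allow a step size that is too large, so a conservative estimate is the safer choice.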
The spectral norm is estimated by simulating a large number of random vectors, each with $\ell_2$ norm equal to one, in different realizations. The matrix operator $A^{\dagger} \Sigma^{-1} A$ is subsequently applied to each random vector to yield a corresponding transformed vector. The spectral norm of the matrix $A^{\dagger} \Sigma^{-1} A$ is determined as the maximum $\ell_2$ norm of the transformed vectors.

As summarized in the algorithm below, we first initialize the projection coefficients as zero, and use FISTA to find the global minimum of the LASSO loss function. The global minimum thus obtained is the preliminary estimate. We then use the preliminary LASSO estimate to weight the coefficients and calculate the adaptive LASSO loss function. Finally, we set the preliminary LASSO estimate as the 'warm' start of the adaptive LASSO estimation, and run FISTA again to find the global minimum of the adaptive LASSO loss function, which is our final solution.
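The random-vector estimate of the spectral norm described above can be sketched as follows. This is a hedged illustration, not the authors' code: `spectral_norm_random` and `spectral_norm_power` are hypothetical helper names, the operator is a small real matrix built from random numbers, and the noise covariance is assumed diagonal. Because every probe is a unit vector, the maximum over probes can never exceed the true spectral norm, so it approaches the norm from below. Power iteration is included only as a swapped-in point of comparison; it is not mentioned in the text.

```python
import numpy as np

def spectral_norm_random(apply_op, dim, n_probe=1000, seed=0):
    """Largest ||M v||_2 over random unit vectors v (approaches ||M||_2 from below)."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_probe):
        v = rng.normal(size=dim)
        v /= np.linalg.norm(v)          # normalize the probe to unit l2 norm
        best = max(best, float(np.linalg.norm(apply_op(v))))
    return best

def spectral_norm_power(apply_op, dim, n_iter=1000, seed=0):
    """Power iteration for a symmetric positive semi-definite operator (comparison only)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=dim)
    v /= np.linalg.norm(v)
    norm = 0.0
    for _ in range(n_iter):
        u = apply_op(v)
        norm = float(np.linalg.norm(u))
        v = u / norm                    # re-normalize and iterate
    return norm

# Toy operator M = A^T Sigma^{-1} A with an assumed diagonal Sigma.
rng = np.random.default_rng(1)
A = rng.normal(size=(60, 30))
inv_var = rng.uniform(0.5, 2.0, size=60)
M = A.T @ (inv_var[:, None] * A)

def apply_M(v):
    return M @ v

exact = float(np.linalg.norm(M, 2))     # exact spectral norm for reference
est_random = spectral_norm_random(apply_M, 30)
est_power = spectral_norm_power(apply_M, 30)
print(exact, est_random, est_power)
```

Since the random-probe value is a lower bound, a step size derived from it may warrant a safety factor to stay inside the range of Equation (45).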

Our Algorithm

Input: $\gamma$: pixelized complex 3D shear field
Output: $\delta$: 3D array of density contrast
 1: Normalize column vectors of $A$
 2: Estimate step size $\mu$ and $\Sigma$
 3: Initialization: $x'^{(1)} = 0$, $\hat{w} = 1$, $t^{(1)} = 1$, $n = 1$, $j = 1$
 4: while $j \leq 2$ do
 5:     while $n \leq N_{\rm iter}$ do
 6:         $x'^{(n+1)}_i = {\rm ST}_{\hat{w}\lambda}\left( x'^{(n)}_i - \mu\, \partial_i G(x'^{(n)}) \right)$
 7:         $t^{(n+1)} = \left(1 + \sqrt{1 + 4 (t^{(n)})^2}\right)/2$
 8:         $x'^{(n+1)} \leftarrow x'^{(n+1)} + \frac{t^{(n)} - 1}{t^{(n+1)}} \left( x'^{(n+1)} - x'^{(n)} \right)$
 9:         $n = n + 1$
10:     end while
11:     Re-initialization: $\hat{x}'^{\rm LASSO} = x'^{(N_{\rm iter})}$, $\hat{w} = \left|\hat{x}'^{\rm LASSO}\right|^{-\tau}$, $\lambda \leftarrow \lambda_{\rm ada}$, $x'^{(1)} = x'^{(N_{\rm iter})}$, $t^{(1)} = 1$, $n = 1$
12:     $j = j + 1$
13: end while
14: $\delta = \Phi\, \mathcal{N}^{-[\ldots]}\, x'^{(N_{\rm iter})}$

3. CLUSTER DETECTION

We simulate weak-lensing shear induced by NFW halos with various masses and redshifts. The generated shear fields are used to distort the HSC mock galaxy shapes with different realizations of shape measurement error and photo-$z$ uncertainty (Section 3.1).

We then test our algorithm on the mock catalogues while varying the regularization parameter in our model. We also compare our method, which uses the NFW dictionary, with one that uses the point mass dictionary.

3.1. Simulations

We use halos with a variety of masses and redshifts in the ranges $10^{[\ldots]}\,h^{-1}M_\odot < M < 10^{[\ldots]}\,h^{-1}M_\odot$ and $0.[\ldots] < z < 0.85$, respectively. We divide the parameter space into eight redshift bins and eight mass bins with equal separation. We randomly shift the input halo redshift and halo mass from the bin center by a small amount in order to avoid repeatedly sampling the exact same halo mass and redshift.

The concentration parameter $c_h$ of an NFW halo is determined as a function of the halo mass ($M$) and redshift ($z_h$) according to Ragagnin et al. (2019):

$$c_h = 6.[\ldots] \times \left(\frac{M}{10^{[\ldots]}\,M_\odot}\right)^{-0.[\ldots]} \left(\frac{[\ldots]}{[\ldots] + z_h}\right)^{[\ldots]}. \tag{46}$$

The weak-lensing shear fields of the NFW halos are calculated according to Takada & Jain (2003). The shear distortions are applied to one hundred realizations of galaxy catalogs with the HSC-like shape measurement error and photo-$z$ uncertainty.

The mock galaxy catalogs are generated using the HSC S16A shape catalog (Mandelbaum et al. 2018). We use the galaxies in a 1 square degree field at the center of tract 9347 (Aihara et al. 2018). The average galaxy number density in this region is $22.94\,{\rm arcmin}^{-2}$. The positions of galaxies are randomized so that they are distributed homogeneously in the one-square-degree stamp. We randomly assign a redshift to each galaxy following the MLZ photo-$z$ probability distribution function (Tanaka et al. 2018).

We simulate the HSC-like shape estimation errors with different realizations by randomly rotating the galaxies in the shape catalog. The histogram of the first component of the HSC-like shape estimation error is shown in Figure 7. The corresponding histogram of the shape measurement error on the pixel level after the smoothing and pixelization is also shown in Figure 7. The standard deviation map of the noise is shown in Figure 8. As demonstrated in Figure 7, even though the shape measurement error on the galaxy level does not fully follow a Gaussian distribution, the error is well described by a Gaussian distribution after the smoothing and pixelization.

Figure 7. HSC-like shape measurement error, including both shape noise and photon noise, on the first component of shear ($g_1$) for galaxies (blue lines) and for smoothed pixels (orange lines). The dashed lines are the best-fit Gaussian distributions to the corresponding histograms. (Panel title: HSC-like Shear Measurement Error; axes: $\delta g$, $P(\delta g)$.)

Figure 8. The standard deviation pixel map of the HSC-like shape measurement error for the fifth source galaxy bin ($0.[\ldots] \leq z < [\ldots]$). (Panel title: The 5th Source Redshift Bin; color bar: $\sigma_{\rm pix}$.)

3.2. NFW atoms

In this section, we test the performance of our algorithm by adopting models where the matter density field is represented by multi-scale NFW atoms. The dictionary is constructed with three frames of different NFW scale radii in comoving coordinates: $0.[\ldots]\,h^{-1}$Mpc, $0.[\ldots]\,h^{-1}$Mpc, and $0.[\ldots]\,h^{-1}$Mpc. The truncation radii are set to four times the scale radii for the atoms in the dictionary, i.e., we assume $c_h = 4$. Note that each frame of our dictionary fixes the scale radius in comoving coordinates, and thus the NFW atoms have different angular sizes when placed at different redshifts.

We test the algorithm while varying the regularization parameter for the LASSO estimation, with $\lambda = 3.[\ldots]$, $4.0$, and $5.0$. The corresponding regularization parameters for the final adaptive LASSO estimations are set to $\lambda_{\rm ada} = \lambda^{\tau+1}$. Here, both the LASSO estimation and our adaptive LASSO estimation select the pixels with signal-to-noise ratios greater than $\lambda$ in each gradient descent iteration, and the local density is estimated for the selected pixels with a soft shrinkage of the estimated amplitude. The LASSO estimation shrinks the density amplitudes by $\lambda$ for every selected pixel, whereas the adaptive LASSO estimation suppresses the shrinkage if the preliminary estimate for the pixel is greater than $\lambda$, by down-weighting its regularization, and otherwise enhances the shrinkage.

We aim to reconstruct the density field at the resolution limit set by the Gaussian smoothing kernel with a standard deviation of $1.[\ldots]'$ and the $[\ldots]'$ pixel scale, as described in Section 2.4.2 and Section 2.4.4, respectively. We smooth the reconstructed density field with the same Gaussian kernel in each lens redshift plane.

Figure 9 shows the 3D density maps reconstructed with different penalty parameters for a halo with $M = 10^{[\ldots]}\,h^{-1}M_\odot$ at redshift $0.[\ldots]$.
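The effect of choosing $\lambda_{\rm ada} = \lambda^{\tau+1}$ together with the weights $\hat{w} = |\hat{x}'^{\rm LASSO}|^{-\tau}$ can be checked numerically: the per-pixel threshold becomes $\lambda_{\rm ada}\hat{w} = \lambda\,(\lambda/|\hat{x}'^{\rm LASSO}|)^{\tau}$, which is smaller than $\lambda$ when the preliminary estimate exceeds $\lambda$ and larger otherwise. The sketch below is illustrative only; the function name `adaptive_threshold` and the sample amplitudes are assumptions made for this example, and pixels with a zero preliminary estimate are given an infinite threshold.

```python
import numpy as np

def adaptive_threshold(x_prelim, lam, tau=2.0):
    """Per-pixel threshold lam_ada * w, with w = |x_prelim|^-tau and lam_ada = lam^(tau+1)."""
    amp = np.abs(np.asarray(x_prelim, dtype=float))
    lam_ada = lam ** (tau + 1.0)
    with np.errstate(divide="ignore"):
        w = amp ** (-tau)   # infinite weight where the preliminary estimate is zero
    return lam_ada * w

# Preliminary amplitudes below, at, and above lam = 5 (illustrative values).
thr = adaptive_threshold([0.0, 2.5, 5.0, 10.0], lam=5.0)
print(thr)
```

For these inputs the thresholds are infinite, 20, 5, and 1.25: a pixel whose preliminary amplitude equals $\lambda$ keeps the original threshold, a stronger pixel is shrunk less, and a weaker pixel is shrunk more.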

Figure 9. The lower panels show the density maps reconstructed from the mock galaxy shape catalog with the NFW dictionary. The upper panels show the number histograms of pixel values. The penalty parameters are $\lambda = 3.[\ldots]$ and $\lambda = 5.0$. The input halo has mass $M = 10^{[\ldots]}\,h^{-1}M_\odot$ and redshift $z = 0.[\ldots]$. [...] $z = 0.01$ and $z = 0.85$, respectively.

The histograms of pixel values are also shown in Figure 9. We see that the adaptive LASSO algorithm sets a fraction of the reconstructed pixels to zero, and keeps only strong signals. It is important to note that the reconstructed density maps are not compromised by line-of-sight smearing.

Following Lanusse et al. (2016), we normalize the detected peaks in the $l$-th ($l = 1 \ldots 20$) lens redshift plane to account for the peak amplitude difference arising from the difference in the norm of the lensing kernels in different redshift bins:

$$\delta^{n}_{\rm peak}(\vec{\theta}, z_l) = \delta_{\rm peak}(\vec{\theta}, z_l) / R_l, \tag{47}$$

where the normalization factor is defined as

$$R_l = \sum_{s} K(z_l, z_s). \tag{48}$$

In Figure 10, we show the histograms of the normalized peaks with different penalty parameters. There, we stack the histograms from 100 realizations of all halos sampled in the redshift-mass plane. To examine the noise properties, we also generate 1000 realizations of pure noise catalogs and perform the reconstruction on them. The histograms of the normalized peaks detected from the pure noise catalogs are shown in Figure 10 along with the best-fit Gaussian functions of the noise peak histograms.

We find that the number counts, including both true and false peaks, decrease as the penalty parameter $\lambda$ increases. Also, the standard deviation of noise peaks decreases as $\lambda$ increases. As a result, with $\lambda = 5.0$, we find a clearer excess in the positive peak counts compared with the noise peak histograms, especially at high density contrast. This is expected because a higher penalty parameter prefers a sparser solution; more peaks originating from the noise are removed than those from real clusters at high density contrast.

Figure 10. The number of peaks per square degree plotted as histograms. The solid histograms show the results from reconstructions with the NFW dictionary penalized with different regularization parameters: $\lambda = 3.[\ldots]$, $4.0$, $5.0$, from left to right. The dashed blue steps are the corresponding results of the reconstructions from 1000 realizations of pure noise catalogs. The gray lines are the best-fit Gaussian distributions to the histograms of noise peaks.

The 2D histograms stacked from all of our simulations are shown in the left panel of Figure 11. The result includes the offsets of the detected peak positions from the positions of input halos. We see clear clustering of the peaks close to the position of the input halo in the stacked position histogram. For each simulation, we identify the positive peak closest to the input position (in pixel units). If the closest peak is located inside the region marked by the dashed box in the left panel of Figure 11, we regard it as a true peak detection. Other identified peaks, which include both positive and negative peaks, are judged to be false detections.

The right panel of Figure 11 shows the average redshift of true detections for each halo. The estimated redshifts are slightly lower than the ground truth by $\Delta z \sim 0.03$ for $z \leq 0.4$. For $0.4 < z \leq 0.85$, the relative redshift bias is below $0.5\%$.

We use two detection thresholds ($1.5\sigma$ and $3.[\ldots]\sigma$) to detect galaxy clusters from the mass maps reconstructed with $\lambda = 3.[\ldots]$, $4.0$, $5.0$. The left and middle columns of Figure 12 show the detection rates for halos in the redshift-mass plane with detection thresholds set to $1.5\sigma$ and $3.[\ldots]\sigma$, respectively. The right panels of Figure 12 show the corresponding number of false detections per square degree as a function of detection threshold.

Figure 12 shows that the false peak density is successfully reduced for relatively large detection thresholds, but the halo detection rate also decreases. After a few experiments, we have decided to set the detection threshold to $1.5\sigma$ and the penalty parameter $\lambda$ to $5.0$. This choice gives a false detection rate of $0.022\,{\rm deg}^{-2}$ while keeping a high halo detection rate. In summary, our method is able to detect halos with minimal masses of $10^{14.0}\,h^{-1}M_\odot$, $10^{14.7}\,h^{-1}M_\odot$, $10^{15.0}\,h^{-1}M_\odot$ for the low ($z < 0.3$), median ($0.3 \leq z < 0.6$) and high ($0.6 \leq z < 0.85$) redshift ranges, respectively.

Using the detection rate measured from our simulations, we are able to predict the number density of detected clusters by assuming the halo mass function of Tinker et al. (2008). We use HMF (Murray et al. 2013), an open-source package, to calculate the halo mass function. The predicted number density of detected halos for $\lambda = 5$ and the $1.5\sigma$ detection threshold is shown in Figure 13. The resulting cluster number density is $0.49\,{\rm deg}^{-2}$, which is much higher than the false detection rate of $0.022\,{\rm deg}^{-2}$. This cluster number density corresponds to about 78 detected clusters over $\sim 160\,{\rm deg}^{2}$. The expected number of detections is slightly higher than the number of 2D cluster detections (63 detected clusters) for the first-year HSC shape catalog (Miyazaki et al. 2018b). Furthermore, our 3D detection method provides an accurate redshift estimate for individual clusters. In contrast, redshift information is not provided by the 2D mass map reconstruction.

3.3. Point mass atoms

We perform an additional test by substituting the default NFW dictionary with a point mass dictionary. This test may indicate certain limitations of our method when applied to a case with very compact (point) objects, which, however, is unlikely to happen in actual lensing observations.

We set the penalty parameter for the preliminary LASSO to $\lambda = 3.[\ldots]$ and $5.0$. Pramanik & Zhang (2020) propose to incorporate group information into the adaptive LASSO penalization weights by setting the weights for projection coefficients in a group to the average of the adaptive weights in the same group. We assume that neighboring pixels in the same redshift plane belong to the same structural group (e.g., galaxy cluster and void), and smooth the amplitude of the preliminary LASSO estimate in each lens redshift plane with a top-hat filter of comoving diameter $r_c = 0.[\ldots]\,h^{-1}$Mpc. Let us denote the smoothed amplitude of the preliminary LASSO estimate as $\left|\hat{x}^{\rm LASSO}\right|_{\rm sm}$, and adopt the penalization weights given by $\hat{w} = 1 / \left|\hat{x}^{\rm LASSO}\right|^{\tau}_{\rm sm}$.

Figure 14 shows the reconstruction result with the point mass dictionary. Interestingly, several "discrete" masses are assigned to different redshift bins in the neighboring region of the true halo center. In contrast, as we have seen in Figure 9, the NFW dictionary manages to recover a consistent mass distribution. The problem of the point mass dictionary originates from the fact that the profile of the point mass atom in the transverse plane is much more compact than the profile of the input halo, especially when placed at low redshifts.

Figure 11. The left panel shows the stacked 2D distribution of the deviations of detected peak positions from the centers of the corresponding input halos. The $x$-axis is the deviation in the transverse plane ($\Delta R$), and the $y$-axis is the deviation in redshift ($\Delta z$). In each simulation, the positive peak inside the dashed black box with the minimal offset (in pixel units) from the position of the input halo is taken as the "true" detection. The right panel shows the deviation of detected peaks in the line-of-sight direction. The $x$-axis is the input halo redshift, and the $y$-axis is the redshift of the detected peak. The crosses denote the average of the detected peaks for each halo over different noise realizations, and the error bars indicate the uncertainties of the average redshifts. The deep gray area indicates relative redshift bias less than 0.05, and the light gray area indicates relative redshift bias less than 0.5. These results are based on our reconstruction with the NFW dictionary with $\lambda = 3.[\ldots]$.

4. SUMMARY

We have developed a novel method to generate high-resolution three-dimensional density maps from weak-lensing shear measurements with photometric redshift information. A key improvement over previous similar methods is that we represent the 3D density field by a collection of NFW atoms with different physical sizes. With a prior assumption that the clumpy mass distribution is sparse in 3D, we reconstruct the density map using the adaptive LASSO algorithm (Zou 2006). We show that adopting the standard LASSO algorithm results in significant smearing of structure in the line-of-sight direction even in the absence of galaxy shape noise and photometric redshift uncertainties.
Our adaptive LASSO algorithm efficiently reduces the smearing of structure.

We have examined the performance of cluster detection with the reconstructed 3D mass maps using mock catalogues that apply shear distortions from isolated halos to galaxies with HSC-like shapes and photo-$z$ uncertainties. Under these realistic conditions, our method is able to detect halos with minimal mass limits of $10^{14.0}\,h^{-1}M_\odot$, $10^{14.7}\,h^{-1}M_\odot$, $10^{15.0}\,h^{-1}M_\odot$ at low ($z < 0.3$), median ($0.3 \leq z < 0.6$) and high ($0.6 \leq z < 0.85$) redshifts, respectively, with an average false detection rate of $0.022\,{\rm deg}^{-2}$. The estimated redshifts of the clusters detected in the reconstructed mass maps are slightly lower than the true redshifts by $\Delta z \sim 0.03$ for halos at low redshifts ($z \leq 0.4$). The relative redshift bias is below $0.5\%$ for clusters at $0.4 < z \leq 0.85$, and the standard deviation of the redshift estimation is $0.092$.

Figure 12. The detection rates and false peak densities for different penalty parameters and detection thresholds. The first, second, and third rows correspond to the results with $\lambda = 3.[\ldots]$, $4.0$, $5.0$, respectively. The left and middle columns are the halo detection rates for detection thresholds equal to $1.5\sigma$ and $3.[\ldots]\sigma$, respectively. The right column shows the density of false peaks as a function of detection threshold.

ACKNOWLEDGEMENTS

We thank Yin Li and Jiaxin Han for useful discussions. XL was supported by the Global Science Graduate Course (GSGC) program of the University of Tokyo and JSPS KAKENHI (JP19J22222).

This work was supported in part by Japan Science and Technology Agency (JST) CREST JPMHCR1414, by JST AIP Acceleration Research Grant No. JP20317829, by the World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan, and by JSPS KAKENHI Grant Nos. JP18K03693, JP20H00181, JP20H05856.

REFERENCES

Aihara, H., Armstrong, R., Bickerton, S., et al. 2018, PASJ, 70, S8, doi: 10.1093/pasj/psx081
Bacon, D. J., & Taylor, A. N. 2003, MNRAS, 344, 1307, doi: 10.1046/j.1365-8711.2003.06922.x
Beck, A., & Teboulle, M. 2009, SIAM Journal on Imaging Sciences, 2, 183
Carrasco Kind, M., & Brunner, R. J. 2013, MNRAS, 432, 1483, doi: 10.1093/mnras/stt574
Chang, C., Pujol, A., Mawdsley, B., et al. 2018, MNRAS, 475, 3165, doi: 10.1093/mnras/stx3363
de Jong, J. T. A., Verdoes Kleijn, G. A., Kuijken, K. H., & Valentijn, E. A. 2013, Experimental Astronomy, 35, 25, doi: 10.1007/s10686-012-9306-1
Fan, Z., Shan, H., & Liu, J. 2010, ApJ, 719, 1408, doi: 10.1088/0004-637X/719/2/1408
Hamana, T., Shirasaki, M., & Lin, Y.-T. 2020, PASJ, 72, 78, doi: 10.1093/pasj/psaa068
Hamana, T., Takada, M., & Yoshida, N. 2004, MNRAS, 350, 893, doi: 10.1111/j.1365-2966.2004.07691.x
Hennawi, J. F., & Spergel, D. N. 2005, ApJ, 624, 59, doi: 10.1086/428749
Hildebrandt, H., Köhlinger, F., van den Busch, J. L., et al. 2020, A&A, 633, A69, doi: 10.1051/0004-6361/201834878
Hu, W., & Keeton, C. R. 2002, PhRvD, 66, 063506, doi: 10.1103/PhysRevD.66.063506
Jain, B., & Van Waerbeke, L. 2000, ApJL, 530, L1, doi: 10.1086/312480
Jeffrey, N., Abdalla, F. B., Lahav, O., et al. 2018, MNRAS, 479, 2871, doi: 10.1093/mnras/sty1252
Kaiser, N., & Squires, G. 1993, ApJ, 404, 441, doi: 10.1086/172297
Kilbinger, M. 2015, Reports on Progress in Physics, 78, 086901, doi: 10.1088/0034-4885/78/8/086901
Lanusse, F., Starck, J. L., Leonard, A., & Pires, S. 2016, A&A, 591, A2, doi: 10.1051/0004-6361/201628278
Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, ArXiv e-prints. https://arxiv.org/abs/1110.3193
Leonard, A., Lanusse, F., & Starck, J.-L. 2014, MNRAS, 440, 1281, doi: 10.1093/mnras/stu273
Li, X., Oguri, M., Katayama, N., et al. 2020, The Astrophysical Journal Supplement Series, 251, 19, doi: 10.3847/1538-4365/abbad1
Lin, C.-A., Kilbinger, M., & Pires, S. 2016, A&A, 593, A88, doi: 10.1051/0004-6361/201628565
LSST Science Collaboration, Abell, P. A., Allison, J., et al. 2009, ArXiv e-prints. https://arxiv.org/abs/0912.0201
Mandelbaum, R. 2018, ARA&A, 56, 393, doi: 10.1146/annurev-astro-081817-051928
Mandelbaum, R., Miyatake, H., Hamana, T., et al. 2018, PASJ, 70, S25, doi: 10.1093/pasj/psx130
Massey, R., Rhodes, J., Ellis, R., et al. 2007, Nature, 445, 286, doi: 10.1038/nature05497
Miyazaki, S., Oguri, M., Hamana, T., et al. 2018a, PASJ, 70, S27, doi: 10.1093/pasj/psx120
Miyazaki, S., Oguri, M., Hamana, T., et al. 2018b, PASJ, 70, S27, doi: 10.1093/pasj/psx120
Murray, S. G., Power, C., & Robotham, A. S. G. 2013, Astronomy and Computing, 3, 23, doi: 10.1016/j.ascom.2013.11.001
Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493, doi: 10.1086/304888
Oguri, M., & Hamana, T. 2011, MNRAS, 414, 1851, doi: 10.1111/j.1365-2966.2011.18481.x
Oguri, M., Miyazaki, S., Hikage, C., et al. 2018, PASJ, 70, S26, doi: 10.1093/pasj/psx070
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A6, doi: 10.1051/0004-6361/201833910
Pramanik, S., & Zhang, X. 2020, arXiv e-prints, arXiv:2006.02041. https://arxiv.org/abs/2006.02041
Price, M. A., Cai, X., McEwen, J. D., et al. 2020, MNRAS, 492, 394, doi: 10.1093/mnras/stz3453
Ragagnin, A., Dolag, K., Moscardini, L., Biviano, A., & D'Onofrio, M. 2019, MNRAS, 486, 4001, doi: 10.1093/mnras/stz1103
Schneider, P. 1996, MNRAS, 283, 837, doi: 10.1093/mnras/283.3.837
Shan, H., Kneib, J.-P., Tao, C., et al. 2012, ApJ, 748, 56, doi: 10.1088/0004-637X/748/1/56
Simon, P., Taylor, A. N., & Hartlap, J. 2009, MNRAS, 399, 48, doi: 10.1111/j.1365-2966.2009.15246.x
Spergel, D., Gehrels, N., Baltay, C., et al. 2015, ArXiv e-prints. https://arxiv.org/abs/1503.03757
Starck, J., Murtagh, F., & Bertero, M. 2015, Starlet transform in astronomical data processing, Vol. 1 (United States: Springer New York), 2053–2098

Figure 13. The expected number density of detected clusters per square degree as a function of halo mass and redshift. The number density in total is $0.49\,{\rm deg}^{-2}$. (Panel title: Detection Number Density; axes: redshift, $\log\left(M / (h^{-1}M_\odot)\right)$; color bar: number per deg$^{2}$.)

Figure 14. The density maps reconstructed from the mock galaxy shape catalog with the point mass dictionary. The penalty parameters are $\lambda = 3.[\ldots]$ and $\lambda = 5.0$. The input halo has mass $M = 10^{[\ldots]}\,h^{-1}M_\odot$ and redshift $z = 0.[\ldots]$. [...] $z = 0.01$ and $z = 0.[\ldots]$