On Nearest Neighbors in Non Local Means Denoising
I. Frosio
NVIDIA, USA
[email protected]
J. Kautz
NVIDIA, USA
[email protected]
Abstract
To denoise a reference patch, the Non-Local Means denoising filter processes a set of neighbor patches. Few Nearest Neighbors (NN) are used to limit the computational burden of the algorithm. Here we show analytically that the NN approach introduces a bias in the denoised patch, and we propose a different criterion for collecting neighbors, named Statistical NN (SNN), to alleviate this issue. Our approach outperforms the traditional one in the case of both white and colored noise: fewer SNNs generate images of higher quality, at a lower computational cost.
Non-Local Means (NLM) denoising has been widely investigated by researchers [2, 10, 3]. Denoising of a given patch is obtained as a weighted average of the surrounding patches, with weights proportional to the patch similarity. The filtering parameters, such as the number and size of the patches, affect both the quality of the output images [7] and the computational burden of the filter [15]. A widely used practice is to reduce the number of neighbors collected for each reference patch: the 3D Block-Matching (BM3D) filter achieves state-of-the-art results in this way [4]. The neighbors' set is collected through a Nearest-Neighbors (NN) approach. Reducing the set size leads to images with sharp edges [7], but it also introduces low-frequency artifacts (Fig. 1).

Our contribution is to show that this artifact occurs because the estimate of the noise-free patch from the set of NNs is biased. To the best of our knowledge, this is the first time this problem is explicitly investigated, although other authors (e.g., [10, 7, 18]) analyzed other sources of bias in NLM. We propose a strategy to collect neighbors, named Statistical NN (SNN), which reduces the prediction error of the estimate of the noise-free patch. Using fewer neighbors, SNN leads to an improvement in perceived image quality, both in the case of white and of colored Gaussian noise. In the latter case, visual inspection reveals that NLM with SNN achieves an image quality comparable to the state of the art, at a much lower computational cost.
NLM denoising [2, 3] averages similar patches and then aggregates the averages. Given a noisy reference patch with $P$ elements, $\mu_r = [\mu_r^0\ \mu_r^1\ \dots\ \mu_r^{P-1}]$, its squared distance from a noisy neighbor patch $\gamma_k$ is $\delta(\mu_r, \gamma_k) = \frac{1}{P}\sum_{i=0}^{P-1}(\mu_r^i - \gamma_k^i)^2$. Following [3], the weight of $\gamma_k$ in the average is:

$$w(\mu_r, \gamma_k) = \exp\left\{-\max\left[0,\ \delta(\mu_r, \gamma_k) - 2\sigma^2\right]/h^2\right\},\qquad(1)$$

where $h$ is the filtering parameter and $\sigma^2$ is the variance of the zero-mean, white Gaussian noise [2, 3]. The estimate of the noise-free patch, $\hat\mu(\mu_r)$, is then:

$$\hat\mu(\mu_r) = \sum_k w(\mu_r, \gamma_k)\cdot\gamma_k \Big/ \sum_k w(\mu_r, \gamma_k).\qquad(2)$$
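The per-patch weighted average of Eqs. (1)-(2) can be sketched in a few lines of Python; the function name and arguments below are illustrative, not taken from the paper's code:

```python
import numpy as np

def nlm_estimate(mu_r, neighbors, sigma, h):
    """Weighted average of neighbor patches, as in Eqs. (1)-(2).

    mu_r      -- noisy reference patch, flat float array with P elements
    neighbors -- iterable of noisy neighbor patches gamma_k (same shape)
    sigma     -- noise standard deviation
    h         -- filtering parameter
    """
    P = mu_r.size
    acc = np.zeros_like(mu_r, dtype=float)
    wsum = 0.0
    for gamma in neighbors:
        # normalized squared patch distance delta(mu_r, gamma_k)
        delta = np.sum((mu_r - gamma) ** 2) / P
        # Eq. (1): distances within the noise floor 2*sigma^2 get weight 1
        w = np.exp(-max(0.0, delta - 2.0 * sigma ** 2) / h ** 2)
        acc += w * gamma
        wsum += w
    return acc / wsum  # Eq. (2)
```

With identical neighbors the estimate equals the neighbor itself, while patches far beyond the noise floor receive exponentially small weights.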
Figure 1: Noisy patches from the Kodak dataset, corrupted by zero-mean Gaussian noise, $\sigma = 20$. Traditional NLM uses all 361 patches in the search window; it denoises effectively the flat areas (the skin, the wall), but it blurs the small details (texture of the textile, fresco details). Using 16 NNs for each patch improves the small details, but introduces colored noise in the flat areas. The proposed SNN technique uses 16 neighbors and mimics the results of traditional NLM in flat areas (high PSNR), while keeping the visibility of small details (high FSIM$_C$). Best results are achieved when patches are collected through SNN with an intermediate offset $o$. Better seen at 400% zoom.

Using multiple scales in the distance function can improve patch matching [12, 11], leading to a higher quality of the filtered images at the price of a higher computational cost. Wu et al. [18] identify a source of bias in $\hat\mu(\mu_r)$, due to the correlation among partially overlapping patches, and propose a weighting scheme to take this into account. Duval et al. [7] interpret NLM denoising as a bias/variance dilemma and show that $\hat\mu(\mu_r)$ is biased even in the absence of noise; reducing the number of neighbors decreases the cost of the filter and the bias at the same time, but it also increases the variance of $\hat\mu(\mu_r)$, leaving some residual noise in the image. State-of-the-art algorithms like BM3D [4], BM3D-SAPCA [5], and NL-Bayes [10] use a reduced set of neighbors, but they also require additional, computationally intensive processing steps and continue to suffer from visible artifacts on sharp edges and in smooth regions [9].
Machine learning can be used to learn to combine a set of patches for denoising, but the pros and cons of using a set of NN patches have not been discussed [1].

None of the existing methods focus their attention on the fact that the neighbors' selection criterion, combined with the fact that the reference patch $\mu_r$ is noisy, affects the prediction error of $\hat\mu(\mu_r)$. In the Additional Material, we resort to a toy problem to compute the bias in $\hat\mu(\mu_r)$ introduced by the NN search strategy, and we show analytically that collecting neighbors with the SNN approach can dramatically reduce it. For reasons of space, we describe here only the guiding principle behind SNN, and invite the curious reader to consult the Additional Material for the mathematical details.

Let us assume that a set of $N$ neighbors, $\{\gamma_k\}_{k=0..N-1}$, has to be collected for a noisy reference patch $\mu_r$. Fig. 2 shows the case of a two-dimensional patch. When $N$ neighbors are collected through the NN approach (left panel), the average of the set of neighbors is clearly biased toward $\mu_r$. In NLM denoising, this drawback shows up as the noise-to-noise matching issue, i.e., residual noise correlated with the reference patch is still present in the filtered image.

The recipe for collecting neighbors through the SNN approach is different, and illustrated in the right panel. First of all, we notice that, in the presence of white Gaussian noise with standard deviation $\sigma$, the expected squared distance between the noisy reference patch and a noisy neighbor patch is $E[\delta(\mu_r, \gamma_k)] = 2\sigma^2$. The SNN neighbors are then defined as the patches $\{\gamma_k\}_{k=1..N_n}$ that minimize:

$$\left|\delta(\mu_r, \gamma_k) - 2\cdot o\cdot\sigma^2\right|,\qquad(3)$$

where the offset parameter $o$ allows us to move continuously from the traditional NN approach ($o = 0$) to the SNN approach ($o = 1$). The right panel in Fig. 2 shows that the average of the set of SNNs is more likely to lie close to the noise-free patch $\mu$.
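The SNN criterion of Eq. (3) only changes the sorting key used to rank candidate patches. A minimal sketch (the helper name is hypothetical; the distance is the normalized squared distance defined above):

```python
import numpy as np

def select_neighbors(mu_r, candidates, sigma, n, offset):
    """Keep the n candidate patches whose normalized squared distance from
    mu_r is closest to the target 2*offset*sigma^2, as in Eq. (3).
    offset = 0 recovers plain nearest neighbors; offset = 1 targets the
    expected squared distance 2*sigma^2 between two noisy replicas."""
    P = mu_r.size
    deltas = np.array([np.sum((mu_r - g) ** 2) / P for g in candidates])
    target = 2.0 * offset * sigma ** 2
    order = np.argsort(np.abs(deltas - target))
    return [candidates[i] for i in order[:n]]
```

The only extra cost with respect to NN selection is the subtraction of the constant target before sorting, so the search complexity is unchanged.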
In fact, because of the statistical distribution of the noisy patches, the chance to collect a neighbor $\gamma_k$ on the left side of $\mu_r$ (and thus close to $\mu_r$) is high, while neighbors on the right side of $\mu_r$ are unlikely to occur. In practice, the SNN criterion looks for similar patches having orthogonal realizations of the noise. This minimizes the noise-to-noise matching issue and leads to more effective noise cancellation, at the price of a slightly higher variance of $\hat\mu(\mu_r)$ (due to the fact that the neighbors are generally farther from $\mu_r$).

Figure 2: Left: collecting neighbors $\{\gamma_k\}_{k=0..N-1}$ around the reference noisy patch $\mu_r$ through the traditional NN approach generates a neighbor set which is highly biased towards $\mu_r$. The proposed SNN approach (right) collects neighbors close to the distance from $\mu_r$ prescribed by Eq. (3), where $\sigma$ is the noise standard deviation and $o$ an additional offset parameter. The resulting set $\{\gamma_k\}_{k=0..N-1}$ is less biased towards $\mu_r$ and closer to the noise-free value $\mu$.

For reasons of space, a detailed numerical evaluation of the effectiveness of the SNN schema for NLM denoising, obtained through a wide set of image quality metrics (PSNR, SSIM [16], MSSSIM [17], GMSD [19], FSIM and FSIM$_C$ [21]), is reported only in the Additional Material. It is important to notice that this numerical evaluation confirms the superiority of the SNN approach; here we discuss only the qualitative results.

The case of white Gaussian noise is illustrated in Fig. 1. This shows a higher level of detail when reducing the number of NN neighbors (from traditional NLM with 361 patches to NLM with only 16 patches), coherently with the reduction of the NLM bias for a small set of neighbors, already explained in [7]. At the same time, the PSNR decreases, mostly because of the residual noise left in the flat areas (the skin of the girl, the wall with the fresco in Fig. 1); this is how the noise-to-noise matching problem manifests in the NLM-filtered image. Increasing the offset from $o = 0$ (NN) towards $o = 1$ (SNN) increases the PSNR: SNN removes more noise in the flat areas (e.g., the girl's skin, the wall surface) compared to NN, thanks to the reduction of the bias introduced by the noise-to-noise matching issue. On the other hand, low-contrast edges (e.g., the texture of the textile, the sun rays and the small details in the fresco) tend to be blurred by SNN. The best compromise between preserving low-contrast details in the image and effectively smoothing the flat areas is obtained for an intermediate value of $o$, as shown in Fig. 1.

Having in mind denoising in practical situations, we also study the case of colored noise. In fact, images are generally acquired using a Bayer sensor. RGB data are obtained only after demosaicing, which introduces correlations between nearby pixels. In this case the NN approach is even more likely to collect false matches, amplifying low-frequency noise and demosaicing artifacts. Fig. 3 shows the comparison of the state-of-the-art BM3D-CFA [6] to NLM with a NN or SNN approach, in the case of an image corrupted by Gaussian noise ($\sigma = 20$) in the Bayer domain and then demosaiced. Visual inspection (Fig. 3) confirms that NLM with NN continues to suffer from the noise-to-noise matching problem in the case of colored noise. NLM with SNN is comparable in terms of image quality to BM3D-CFA: our approach generates a slightly noisier image, but with better-preserved fine details (e.g., see the eyelids in Fig. 3) and without introducing the grid artifacts that are on the other hand visible for BM3D-CFA. Overall, our approach and BM3D-CFA turn out to be comparable in terms of image quality (as further demonstrated numerically in the Additional Material), but NLM with SNN has a much lower computational cost.

Figure 3: Comparison of different algorithms for colored noise denoising, $\sigma = 20$. The state-of-the-art BM3D-CFA [6] achieves the highest image quality metrics (see Fig. 5 in the Additional Material); a grid artifact is however visible in the white part of the eye. The traditional NLM algorithm, using $N_n = 32$ NNs, does not eliminate the colored noise introduced by demosaicing. Instead, when SNN neighbors are used, the colored noise is efficiently removed and the quality at visual inspection is comparable to that of BM3D-CFA. In fact, our result appears less "splotchy" and a bit sharper. Better seen at 400% zoom.

Further numerical comparisons against SDCT [20], BM3D [4] (in the case of white and colored noise), and BM3D-CFA [6] (in the case of colored noise) are reported in the Additional Material, where we also report and discuss the case of a "real" image.
The NLM algorithm has been analyzed as a bias/variance dilemma: Duval et al. [7] showed that a reduction of the number of neighbors reduces the bias in the estimate $\hat\mu_r$. The price to be paid is the introduction of the noise-to-noise matching issue, which shows up as splotchy, colored artifacts in the filtered images. In fact, since the NNs of any noisy reference patch $\mu_r$ lie close to it, their average is biased towards $\mu_r$ itself. We have shown here that this new source of bias is associated with the strategy adopted to collect the set of neighbors, and demonstrated that the SNN approach can largely mitigate this bias, both in theory and in practice.

When applied to NLM denoising and compared to the traditional NN strategy, the SNN approach produces images of quality similar to the original NLM in the flat areas, while keeping the visibility of the small details. In other words, SNN looks for neighbors whose noise realization is likely to be orthogonal to that of the reference patch. Averaging these patches is consequently more likely to effectively cancel out the noise.

The advantage of using SNNs is even more evident in the case of colored noise, when noise-to-noise matching is more likely to occur. Visual inspection reveals that in this case NLM with SNN achieves an image quality comparable to the state-of-the-art BM3D-CFA [6], but at a lower computational cost (see Fig. 3). This case is indeed of practical importance, since denoising is performed after demosaicing in "real" images. Our experiments (see the Additional Material) suggest that NLM with SNN is close to BM3D-CFA also in the case of "real" images, where white balance, color correction, gamma correction and edge enhancement are applied after denoising, and they can significantly enhance the visibility of artifacts and residual noise.

It is worth mentioning that both the NN and SNN approaches to NLM denoising require knowledge of the noise power, $\sigma^2$, to properly set the filtering parameter $h$ in Eq. (1) and guide the selection of the neighbors (in the case of SNN). In the case of "real" images, several effective noise estimation methods have been proposed [10], but it is also worth mentioning that the noise distribution in real images is far more complicated than zero-mean Gaussian [8]. Even if a comprehensive analysis of the application of SNN to NLM denoising for real images goes beyond the scope of this paper, our approach can be effectively applied to "real" images through the application of a VST [8], and it achieves image quality far superior to traditional NLM and in practice comparable at visual inspection to the state-of-the-art BM3D-CFA approach [6], at a lower computational cost.

References

[1] Ahn, B., and Cho, N. I. Block-matching convolutional neural network for image denoising. CoRR abs/1704.00524 (2017).
[2] Buades, A., Coll, B., and Morel, J. M. A review of image denoising algorithms, with a new one. Multiscale Model. Simul. (2005).
[3] Buades, A., Coll, B., and Morel, J.-M. Non-Local Means Denoising. Image Processing On Line 1 (2011), 208-212.
[4] Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing 16, 8 (2007).
[5] Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K. BM3D image denoising with shape-adaptive principal component analysis. In SPARS (2009).
[6] Danielyan, A., Vehvilainen, M., Foi, A., Katkovnik, V., and Egiazarian, K. Cross-color BM3D filtering of noisy raw data.
[7] Duval, V., Aujol, J.-F., and Gousseau, Y. On the parameter choice for the non-local means. SIAM J. Imag. Sci. (2011).
[8] Foi, A. Clipped noisy images: Heteroskedastic modeling and practical denoising. Signal Processing (2009).
[9] Knaus, C., and Zwicker, M. Progressive image denoising. IEEE TIP (2014).
[10] Lebrun, M., Colom, M., Buades, A., and Morel, J. M. Secrets of image denoising cuisine. Acta Numerica (2012).
[11] Lotan, O., and Irani, M. Needle-match: Reliable patch matching under high uncertainty. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016).
[12] Lou, Y., Favaro, P., Soatto, S., and Bertozzi, A. Nonlocal similarity image filtering, 2009.
[13] Ponomarenko, N., Jin, L., Ieremeiev, O., Lukin, V., Egiazarian, K., Astola, J., Vozel, B., Chehdi, K., Carli, M., Battisti, F., and Kuo, C.-C. J. Image database TID2013: Peculiarities, results and perspectives. Signal Processing: Image Communication 30 (2015).
[14] Malvar, H. S., He, L.-W., and Cutler, R. High-quality linear interpolation for demosaicing of Bayer-patterned color images. In International Conference on Acoustics, Speech and Signal Processing (May 2004), Institute of Electrical and Electronics Engineers, Inc.
[15] Tsai, Y.-T., Steinberger, M., Pajak, D., and Pulli, K. Fast ANN for high-quality collaborative filtering. CGF (2015).
[16] Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P. Image quality assessment: From error visibility to structural similarity. IEEE TIP (2004).
[17] Wang, Z., Simoncelli, E., and Bovik, A. Multiscale structural similarity for image quality assessment. In ACSSC (2003).
[18] Wu, Y., Tracey, B., Natarajan, P., and Noonan, J. P. Probabilistic non-local means. IEEE Sig. Proc. Let. 20 (2013).
[19] Xue, W., Zhang, L., Mou, X., and Bovik, A. C. Gradient magnitude similarity deviation: A highly efficient perceptual image quality index. IEEE TIP (2014).
[20] Yu, G., and Sapiro, G. DCT image denoising: A simple and effective image denoising algorithm. IPOL (2011).
[21] Zhang, L., Zhang, L., Mou, X., and Zhang, D. FSIM: A feature similarity index for image quality assessment. IEEE TIP (2011).

On Nearest Neighbors in Non Local Means Denoising: Additional Material
In this section we resort to a toy problem to compute analytically the bias and variance of the estimate of a noise-free patch, when neighbor patches are collected through the traditional NN approach or through the proposed SNN method. We also demonstrate numerically the advantage of SNN over NN for the toy problem at hand.
NLM denoising is a three-step procedure: i) for each patch, the neighbors are identified; ii) the neighbors are averaged through a weighted-average procedure; iii) the denoised patches are aggregated (since patches are partially overlapping, multiple estimates are averaged for each pixel). Collecting NN neighbors introduces a prediction error in step ii). To demonstrate this, we consider a $1\times 1$ patch with noise-free value $\mu$, and assume Gaussian noise with variance $\sigma^2$.¹ We assume that $N$ noisy replicas of the patch are available (black samples in Fig. 4a). For each reference patch $\mu_r$, we collect $N_n$ neighbors, $\{\gamma_k\}_{k=1..N_n}$; $\gamma_k$ and $\mu_r$ are Gaussian random variables with mean $\mu$ and variance $\sigma^2$, indicated by $G(\mu, \sigma)$, with pdf $\frac{1}{\sigma}\phi[(x-\mu)/\sigma]$, where $\phi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-0.5 x^2}$. We define $d = d(\mu_r)$ as the range in which we are likely to find the $N_n$ nearest neighbors of $\mu_r$ (red stars in Fig. 4a). More specifically, $d(\mu_r)$ satisfies:

$$\frac{1}{\sigma}\int_{\mu_r - d(\mu_r)}^{\mu_r + d(\mu_r)} \phi\!\left(\frac{x-\mu}{\sigma}\right) dx = \Phi\!\left[\frac{\mu_r + d(\mu_r) - \mu}{\sigma}\right] - \Phi\!\left[\frac{\mu_r - d(\mu_r) - \mu}{\sigma}\right] = \frac{N_n}{N},\qquad(4)$$

where $\Phi(x) = \frac{1}{2}[1 + \mathrm{erf}(x/\sqrt{2})]$, and $\Phi[(x-\mu)/\sigma]$ is the cdf of a Gaussian random variable with mean $\mu$ and variance $\sigma^2$. A closed-form solution for $d(\mu_r)$ in Eq. (4) does not exist, but $d(\mu_r)$ can be computed numerically, for instance using a bracketing technique (see Fig. 4c). From $d(\mu_r)$, we compute the expected value of the average of the $N_n$ nearest neighbors around $\mu_r$, which is $E[\hat\mu(\mu_r)]$. This is equivalent to step ii) of NLM, neglecting the weights in the weighted average. Since $\hat\mu(\mu_r)$ is the average of samples of a truncated Gaussian variable, bounded by $\mu_r \pm d(\mu_r)$, its expected value and variance are:

$$\alpha = \frac{\mu_r - d(\mu_r) - \mu}{\sigma}, \qquad \beta = \frac{\mu_r + d(\mu_r) - \mu}{\sigma}$$
$$E[\hat\mu(\mu_r)] = \mu - \sigma\cdot\frac{\phi(\beta) - \phi(\alpha)}{\Phi(\beta) - \Phi(\alpha)}\qquad(5)$$
$$\mathrm{Var}[\hat\mu(\mu_r)] = \sigma^2\cdot\left\{1 - \frac{\beta\,\phi(\beta) - \alpha\,\phi(\alpha)}{\Phi(\beta) - \Phi(\alpha)} - \left[\frac{\phi(\beta) - \phi(\alpha)}{\Phi(\beta) - \Phi(\alpha)}\right]^2\right\}\Big/ N_n.\qquad(6)$$

Fig. 4d shows $E[\hat\mu(\mu_r)]$ and $\mathrm{Std}[\hat\mu(\mu_r)]$ for the case $\mu = 1$, when $N_n = 16$ neighbors are to be collected from a total of $N = 100$ samples. Since the NN neighbors $\{\gamma_k\}_{k=1..N_n}$ lie close to the noisy reference patch $\mu_r$, each estimate $\hat\mu(\mu_r)$ is biased towards $\mu_r$. The bias grows almost linearly with $\mu - \mu_r$ and it saturates at approximately $\sigma$ of distance from $\mu$, as the set of neighbors becomes stable (since $N$ is finite, the same 16 samples are found in the tail of the distribution).

¹ Our reasoning for a single-pixel patch easily generalizes to a larger patch, where each pixel is corrupted by zero-mean Gaussian noise.

Figure 4: Panels (a) and (b) show respectively the NN and SNN strategies to collect $N_n = 16$ neighbors $\{\gamma_k\}_{k=1..N_n}$ of $\mu_r$, for a $1\times 1$ patch with noise-free value $\mu = 1$, corrupted by zero-mean Gaussian noise, from a set of $N = 100$ samples. In panels (a) and (b), $\mu_r$ is far off $\mu = 1$. Its 16 NNs are in its immediate vicinity (red, in a), ultimately leading to a bad estimate. The 16 SNNs (blue, in b) are on average closer to the actual $\mu$, leading to a better estimate. Panel (c) shows the neighbor search intervals; $d(\mu_r)$ is generally smaller for SNN, since the search for neighbors occurs in this case in two ranges, on the left and right of $\mu_r$. Panel (d) shows the expected value of the estimate, $E[\hat\mu(\mu_r)]$, and its standard deviation, $\mathrm{Std}[\hat\mu(\mu_r)]$, as a function of $\mu_r$; the $\hat\mu(\mu_r)$ are specific estimates of $\mu$ obtained from the black samples in panels (a) and (b). SNN yields values closer to the actual $\mu = 1$, even when the noisy patch $\mu_r$ is far off the actual noise-free value.

We compute the expected prediction error for estimating $\mu$ through $\hat\mu(\mu_r)$ by integrating the bias and variance terms as:

$$MSE = \int_{-\infty}^{+\infty} \left\{ \left[E[\hat\mu(\mu_r)] - \mu\right]^2 + \mathrm{Var}[\hat\mu(\mu_r)] \right\} \cdot \frac{1}{\sigma}\,\phi\!\left[\frac{\mu_r - \mu}{\sigma}\right] d\mu_r,\qquad(7)$$

which is equal to 0.180 (bias) + 0.001 (variance) = 0.181 for the case in Fig. 4a.
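The quantities above can be evaluated numerically as described, solving Eq. (4) by bisection (a bracketing technique) and plugging the resulting interval into the truncated-Gaussian mean of Eq. (5). This is a sketch with illustrative parameter values ($\mu = 1$, $\sigma = 0.25$), not necessarily the values used for Fig. 4:

```python
import math

def phi(x):  # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):  # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def nn_range(mu_r, mu, sigma, Nn, N, hi=100.0, iters=80):
    """Solve Eq. (4) for d(mu_r) by bisection: the half-width of the interval
    around mu_r expected to contain the Nn nearest of N samples of G(mu, sigma)."""
    lo_d, hi_d = 0.0, hi
    for _ in range(iters):
        d = 0.5 * (lo_d + hi_d)
        p = Phi((mu_r + d - mu) / sigma) - Phi((mu_r - d - mu) / sigma)
        if p < Nn / N:
            lo_d = d   # interval too small, enlarge it
        else:
            hi_d = d
    return d

def nn_expected_estimate(mu_r, mu, sigma, Nn, N):
    """Eq. (5): mean of the Gaussian truncated to [mu_r - d, mu_r + d]."""
    d = nn_range(mu_r, mu, sigma, Nn, N)
    a = (mu_r - d - mu) / sigma
    b = (mu_r + d - mu) / sigma
    return mu - sigma * (phi(b) - phi(a)) / (Phi(b) - Phi(a))
```

For a reference $\mu_r$ far from $\mu$, `nn_expected_estimate` returns a value pulled towards $\mu_r$, which is exactly the NN bias discussed above; integrating it over $\mu_r$ as in Eq. (7) reproduces the bias/variance decomposition of the prediction error.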
In practice, when NN neighbors are collected, samples with correlated noise are likely to be chosen. We call this the noise-to-noise matching problem: averaging NN patches may amplify small correlations between the noise realizations, without canceling the noise. In the context of image denoising, the noise-to-noise matching problem shows up as residual, colored noise in the filtered image (see the images filtered with few NNs in the main paper).

Before illustrating the SNN method to collect the neighbors of a reference patch, we analyze the statistical distribution of the distance between multi-dimensional patches (similarly to [18]). This shows that, in the presence of noise, the expected distance between the reference patch and its neighbors is not zero, which is the main observation behind the proposed SNN approach.

The search for similar patches is performed by computing the distance between $\mu_r$ and patches of the same size, but in different positions in the image. If, apart from noise, the reference patch $\mu_r$ and its neighbor $\gamma_k$ are two replicas of the same patch, we get:

$$\delta(\mu_r, \gamma_k) = \frac{2\sigma^2}{P}\cdot\sum_{i=0}^{P-1} G(0, 1)^2.\qquad(8)$$

Since the sum of $P$ squared normal variables has a $\chi^2_P$ distribution with $P$ degrees of freedom, we have $\delta(\mu_r, \gamma_k) \sim (2\sigma^2/P)\cdot\chi^2_P$, and therefore:

$$E[\delta(\mu_r, \gamma_k)] = 2\sigma^2.\qquad(9)$$

Thus, for two noisy replicas of the same patch, the expected squared distance is not zero. This has already been noticed and effectively employed since the original NLM paper [2] to compute the weights of the patches in the weighted average (see Eq. (1) in the main paper), giving less importance to patches at a squared distance larger than $2\sigma^2$, or to build a better weighting scheme, as in [18]. Nonetheless, to the best of our knowledge, it has never been employed as a driver for the selection of the neighbor patches, as we do here.

We can also establish a connection between the toy problem at hand and matching patches in general. In fact, we can apply Fisher's approximation ($\sqrt{2\chi^2_P} \cong G(\sqrt{2P-1}, 1)$) to a single-pixel patch ($P = 1$), and get:

$$\sqrt{\delta(\mu_r, \gamma_k)} \cong \sigma\cdot G(1, 1) = \sigma + G(0, \sigma),\qquad(10)$$

showing that the expected distance between two $1\times 1$ noisy patches is approximately $\sigma$.

As an alternative to NN, and mostly inspired by Eq. (9), we propose collecting neighbors whose squared distance from the reference patch is instead close to its expectation. Thus SNNs are the patches $\{\gamma_k\}_{k=1..N_n}$ that minimize:

$$\left|\delta(\mu_r, \gamma_k) - 2\cdot o\cdot\sigma^2\right|,\qquad(11)$$

where we have introduced an additional offset parameter $o$ that allows us to move continuously from the traditional NN approach ($o = 0$) to the SNN approach ($o = 1$). We assume $o = 1$ for now.

To compare the prediction error of NN and SNN in our toy problem, we compute $E[\hat\mu(\mu_r)]$ and $\mathrm{Var}[\hat\mu(\mu_r)]$ when SNN neighbors of $\mu_r$ are collected, i.e., when samples are close to $\mu_r \pm o\cdot\sigma$. Fig. 4b illustrates the sampling of SNN neighbors. Sampling potentially occurs on both sides of $\mu_r$, with a different chance of collecting neighbors on each side. The interval $d(\mu_r)$ where we expect to find the neighbors now has to satisfy:

$$\Phi\{[\mu_r - o\sigma + d(\mu_r) - \mu]/\sigma\} - \Phi\{[\mu_r - o\sigma - d(\mu_r) - \mu]/\sigma\}\ +$$
$$\Phi\{[\mu_r + o\sigma + d(\mu_r) - \mu]/\sigma\} - \Phi\{[\mu_r + o\sigma - d(\mu_r) - \mu]/\sigma\} = N_n/N,\qquad(12)$$

where the first two terms represent the probability of sampling one neighbor in the left interval ($P_L$ in Fig. 4b), whereas the last two terms are for the right interval ($P_R$ in Fig. 4b).
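Eq. (9) is easy to verify numerically. The sketch below draws two independent noisy replicas of an arbitrary patch and checks that the mean normalized squared distance approaches $2\sigma^2$; all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
P, sigma, trials = 8, 0.1, 100_000

# two independent noisy replicas of the same (arbitrary) noise-free patch
mu = rng.random(P)
a = mu + rng.normal(0.0, sigma, size=(trials, P))
b = mu + rng.normal(0.0, sigma, size=(trials, P))

# normalized squared patch distance, as defined in the main paper
delta = np.mean((a - b) ** 2, axis=1)
print(delta.mean())  # close to 2 * sigma**2 = 0.02
```

The sample mean of `delta` concentrates around $2\sigma^2$, while individual distances fluctuate with a spread that shrinks as the patch size $P$ grows, consistently with the $\chi^2_P$ scaling of Eq. (8).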
Since $P_L > P_R$ here, the estimate $\hat\mu(\mu_r)$ will likely be moved towards the left, decreasing the prediction error. As for the NN case, $d(\mu_r)$ can be estimated through Eq. (12) using a bracketing technique. Fig. 4c shows the search intervals $d(\mu_r)$ for a Gaussian random variable with $\mu = 1$, when $N_n = 16$ and $N = 100$. Compared to the NN search range, the SNN one is smaller, as it generally includes two search intervals. These two search ranges may collapse into one when $\mu_r$ belongs to the tails of the distribution, providing the same result as NN in this (rare) case.²

² Notice that in our toy problem we collect the samples based on their expected distance from the noisy reference $\mu_r$, computed through Eq. (10), while in SNN we use the squared distance. This simplification makes the toy problem mathematically manageable, without changing the intuition behind SNN.

The estimate now combines samples collected on both sides of $\mu_r$. After a few simplifications, and using Eqs. (5)-(6) to compute the expected values ($E_L$, $E_R$) and variances ($\mathrm{Var}_L$, $\mathrm{Var}_R$) on the left and right sides of $\mu_r$, we have:

$$\alpha_L = [\mu_r - o\sigma - d(\mu_r) - \mu]/\sigma, \qquad \beta_L = [\mu_r - o\sigma + d(\mu_r) - \mu]/\sigma$$
$$\alpha_R = [\mu_r + o\sigma - d(\mu_r) - \mu]/\sigma, \qquad \beta_R = [\mu_r + o\sigma + d(\mu_r) - \mu]/\sigma$$
$$P_L = \Phi(\beta_L) - \Phi(\alpha_L), \qquad P_R = \Phi(\beta_R) - \Phi(\alpha_R)$$
$$E[\hat\mu(\mu_r)] = (P_L E_L + P_R E_R)/(P_L + P_R)\qquad(13)$$
$$\mathrm{Var}[\hat\mu(\mu_r)] = \left\{ \frac{P_L (E_L^2 + \mathrm{Var}_L) + P_R (E_R^2 + \mathrm{Var}_R)}{P_L + P_R} - E[\hat\mu(\mu_r)]^2 \right\} \Big/ N_n.\qquad(14)$$

Eqs. (12), (13), and (14) boil down to the simple NN case when the two intervals overlap. Fig. 4d shows $E[\hat\mu(\mu_r)]$ and $\mathrm{Std}[\hat\mu(\mu_r)]$ for the SNN case. The overall bias and variance for SNN are computed by integration as in Eq. (7), and they are respectively equal to 0.040 and 0.010, for a total MSE of 0.050. Compared to NN, SNN slightly increases the variance of $\hat\mu(\mu_r)$, but it drastically decreases the bias of the estimate, especially for those points close to $\mu$ (which are, by the way, the most frequent).
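A quick Monte Carlo version of the toy problem illustrates the bias reduction. The parameter values here ($\mu = 1$, $\sigma = 0.25$) are illustrative, and, as in the toy problem of the text, neighbors are ranked by $|{\rm distance} - \sigma|$ rather than by squared distance:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, N, Nn, trials = 1.0, 0.25, 100, 16, 5000  # illustrative values

err_nn, err_snn = [], []
for _ in range(trials):
    x = rng.normal(mu, sigma, N)   # N noisy replicas of the 1x1 patch
    mu_r = x[0]                    # noisy reference
    cand = x[1:]
    d = np.abs(cand - mu_r)
    nn = cand[np.argsort(d)[:Nn]]                   # nearest neighbors
    snn = cand[np.argsort(np.abs(d - sigma))[:Nn]]  # |distance - sigma| small
    err_nn.append(nn.mean() - mu)
    err_snn.append(snn.mean() - mu)

# SNN estimates cluster closer to the true mu: smaller mean squared error
mse_nn = np.mean(np.square(err_nn))
mse_snn = np.mean(np.square(err_snn))
print(mse_nn, mse_snn)
```

The NN averages inherit the randomness of $\mu_r$, while the SNN averages mix samples from both sides of it; the measured MSE gap mirrors the analytic bias/variance trade-off derived above.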
In practice, the SNN criterion looks for similar patches having orthogonal realizations of the noise. This minimizes the noise-to-noise matching issue and leads to more effective noise cancellation, at the price of a slightly higher variance of $\hat\mu(\mu_r)$.

We report here the extensive quantitative evaluation of the results achieved with the proposed SNN approach applied to NLM denoising, in the case of white Gaussian noise, colored noise, and a "real" image. This evaluation was not reported in the main paper for reasons of space.
We test the effectiveness of the SNN schema on the Kodak image dataset [13]. We first consider denoising in the case of white, zero-mean Gaussian noise with several standard deviations $\sigma$, and evaluate the effect of the offset parameter $o$ and of the number of neighbors $N_n$; NLM using all the patches in the search window corresponds to the original NLM. Each image is processed for $o$ ranging from $o = 0$ (i.e., the traditional NN strategy) to $o = 1$. The patch size, the number of neighbors $N_n$, and the filtering parameter $h$ are the optimal ones in [2]. We resort to a large set of image quality metrics to evaluate the filtered images: PSNR, SSIM [16], MSSSIM [17], GMSD [19], FSIM and FSIM$_C$ [21]. PSNR is proportional to the average squared error in the image; SSIM and MSSSIM take inspiration from the human visual system, giving less importance to the residual noise close to edges. GMSD is a perceptually-inspired metric that compares the gradients of the filtered and the reference images. FSIM is the metric that correlates best with the human judgement of image quality, and takes into consideration the structural features of the images; FSIM$_C$ also takes into account the color component.

Table 1 shows the average image quality metrics measured on the Kodak dataset, whereas visual inspection of Fig. 1 in the main paper reveals the correlation between these metrics and the artifacts introduced in the filtered images. The PSNR decreases when reducing the number of NN neighbors (see the first two rows for each noise level in Table 1), mostly because of the residual colored noise left in the flat areas (like the skin of the girl and the wall with the fresco in Fig. 1 in the main paper); SSIM and MSSSIM show a similar trend, while GMSD, FSIM, and FSIM$_C$ improve when the number of NN neighbors is reduced to 16.
The improvement of these three perceptually-based image quality metrics is associated with the reduction of the NLM bias in the case of a small set of neighbors, clearly explained in [7]. At visual inspection, edges and structures are better preserved (see the textile pattern and the sun rays in Fig. 1 in the main paper).

Figure 5: Average PSNR and FSIM$_C$ measured on the Kodak dataset, as a function of the number of neighbors $N_n$, for NLM using the NN and SNN approaches, compared to SDCT [20], BM3D [4], and BM3D-CFA [6] in the case of additive white Gaussian noise (first two rows) and colored noise (bottom rows). Notice that the number of patches used by BM3D and BM3D-CFA is constant and coherent with the optimal implementations described in [4] and [6].

Table 1 shows the PSNR consistently increasing with the offset $o$, coherently with the decrease of the prediction error measured in our toy problem from $o = 0$ (NN) to $o = 1$ (SNN). Visual inspection (Fig. 1 in the main paper) confirms that SNN removes more noise in the flat areas (e.g., the girl's skin, the wall surface) compared to NN. For $o = 1$ the PSNR approaches that achieved by traditional NLM, which requires more neighbor patches and therefore a higher computational time. Even more interestingly, SSIM and MSSSIM, as well as GMSD, FSIM and FSIM$_C$, are maximized for intermediate values of $o$. Even if none of these metrics is capable of perfectly measuring the quality perceived by a human observer, they consistently suggest that filtering with the SNN approach generates images of superior perceptual quality compared to the traditional NLM filtering.

To summarize, sharp edges are reconstructed well by all the approaches. Low-contrast edges (e.g., the texture of the textile, the sun rays and the small details in the fresco) are well preserved by the NN approach, which achieves better GMSD, FSIM and FSIM$_C$ compared to traditional NLM, but also leaves more colored noise in the image (low PSNR, SSIM and MSSSIM). These same low-contrast edges are oversmoothed by traditional NLM and by SNN when $o = 1$, whereas SNN with an intermediate offset achieves the best compromise between preserving low-contrast details in the image and effectively smoothing the flat areas, coherently with the numerical evaluation in Table 1. The proposed SNN strategy always outperforms NN (when using the same number of patches) in the image quality metrics, with the exception of very low noise ($\sigma = 5$), where GMSD, FSIM and FSIM$_C$ are nearly constant over a wide range of $o$. Quite reasonably, the advantage of SNN over NN is larger for a large $\sigma$: notice in fact that, for $\sigma \to 0$, NN and SNN converge to the same algorithm.

The first row of Fig. 5 shows the average PSNR and FSIM$_C$ achieved by NLM using NN ($o = 0$) and SNN on the Kodak dataset, as a function of the number of neighbors $N_n$. Notice that the filter's computational cost scales linearly (apart from the neighbor search part) with $N_n$. Compared to NN, the SNN approach always achieves a higher PSNR, since it smooths the flat areas better, as already shown in Fig. 1 in the main paper. FSIM$_C$ is slightly lower for SNN when $N_n$ is large, because SNN tends to oversmooth the edges, but when few neighbors are used the SNN approach clearly outperforms NN. The state-of-the-art denoising filter BM3D [4] produces images of superior quality, but at a much higher computational cost. We also compare the image quality achieved by our approach with the Sliding Discrete Cosine Transform filter (SDCT [20]), a denoising algorithm of comparable computational complexity, based on signal sparsification and not using a non-local approach. SDCT achieves a comparable PSNR, but generally a lower FSIM$_C$.
Although the case of white Gaussian noise is widely studied, it represents an ideal situation which rarely occurs in practice. With the practical application of denoising in mind, we also study the more general case of colored noise. In fact, images are generally acquired using a Bayer sensor.
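Since the colored-noise experiments below hinge on this acquisition model, here is a minimal sketch of where the correlation comes from. It is illustrative only: it assumes an RGGB pattern and is not the Malvar pipeline [14] used in the experiments.

```python
import numpy as np

def mosaic_rggb(rgb):
    """Sample an RGB image on an RGGB Bayer pattern (single-channel output)."""
    h, w, _ = rgb.shape
    bayer = np.empty((h, w))
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R sites
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G sites (even rows)
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G sites (odd rows)
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B sites
    return bayer
```

White noise added to the output of `mosaic_rggb` becomes spatially correlated (colored) after any linear demosaicing, because each reconstructed RGB pixel mixes several neighboring Bayer samples.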
Table 1: Average image quality indices (PSNR, SSIM, MSSSIM, GMSD, FSIM, FSIM_C) for denoising the Kodak image dataset, corrupted by zero-mean Gaussian noise, for different standard deviations σ and different values of the offset parameter o. Bold numbers indicate the best results.

RGB data are obtained only after demosaicing, which introduces correlations between nearby pixels. A direct consequence of this is that the NN approach is even more likely to collect false matches, amplifying low-frequency noise as well as demosaicing artifacts. To study this, we mosaic the images of the Kodak dataset, add noise in the Bayer domain, and subsequently demosaic them through the widely used Malvar algorithm [14]. Denoising is then performed through NLM with NN and with SNN (for several values of N_n), BM3D [4], and SDCT [20]. All these filters require an estimate of the noise standard deviation in the RGB domain, obtained here by considering that RGB data come from a linear combination (as in [14]) of independent samples in the Bayer domain. To compare with the state of the art, we also filter the images with BM3D-CFA [6], a modified version of BM3D specifically tailored to denoise data in the Bayer domain.

The bottom rows of Fig. 5 show the average PSNR and FSIM_C on the Kodak dataset.
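The noise-level estimate mentioned above follows from a standard property: if an RGB value is a linear combination Σᵢ wᵢ bᵢ of independent Bayer samples, each with standard deviation σ, its standard deviation is σ·√(Σᵢ wᵢ²). A short sketch (the weights below are an illustrative bilinear example, not Malvar's exact coefficients):

```python
import numpy as np

def rgb_noise_std(sigma_bayer, weights):
    """Std of a demosaiced value obtained as a linear combination of
    independent Bayer samples, each with std sigma_bayer."""
    w = np.asarray(weights, dtype=float)
    return sigma_bayer * np.sqrt(np.sum(w ** 2))

# Example: bilinear interpolation of green at a red site averages the
# 4 neighboring green pixels with weight 1/4 each, so sum(w^2) = 1/4
# and the interpolated green has half the Bayer-domain noise std.
sigma_green = rgb_noise_std(10.0, [0.25, 0.25, 0.25, 0.25])
```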
The proposed SNN approach is always more effective than NN in removing colored noise, both in terms of PSNR and FSIM_C. This is clearly related to the higher occurrence of noise-to-noise matching for NN in the case of colored noise. Visual inspection (Fig. 3 in the main paper) further confirms that the NN approach applied to NLM suffers from the noise-to-noise matching problem. NLM with SNN also generally outperforms SDCT and BM3D in terms of PSNR and FSIM_C. Since BM3D adopts a NN selection strategy, it also suffers from the noise-to-noise matching problem in this case. Remarkably, at low noise levels, NLM coupled with SNN is better, in terms of PSNR, than the state-of-the-art BM3D-CFA. It is slightly inferior for higher noise levels, but at a lower computational cost. Visually, the result produced by NLM with SNN is comparable with that obtained by BM3D-CFA: our approach generates a slightly more noisy image, but with better preserved fine details (e.g., see the eyelids in Fig. 3 in the main paper) and without introducing grid artifacts.

With the practical application of SNN in mind, it is worth analyzing the peculiarities of denoising "real" images. We test different filtering techniques (SDCT, NLM with NN, NLM with SNN, and BM3D-CFA) on a high-resolution image, captured at ISO 1200 with an NVIDIA Shield tablet. Notice that the image resolution is much higher in this case, compared to the Kodak dataset. Furthermore, the noise distribution in the Bayer domain is far from the ideal Gaussian distribution with constant variance. Therefore, we compute the sensor noise model and the corresponding Variance Stabilizing Transform (VST) as in [8], and apply the VST in the Bayer domain so that the noise distribution approximately resembles the ideal one. We filter the image through the SDCT, NLM with NN, and NLM with SNN denoising algorithms in the RGB domain, after applying the VST and Malvar demosaicing, which introduces correlations among pixels.
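As a concrete illustration of the VST step, a classic choice for signal-dependent sensor noise is the generalized Anscombe transform; the sketch below uses that textbook form under the assumption of a variance model var(z) = a·z + b, and is not necessarily the exact transform fitted in [8].

```python
import numpy as np

def generalized_anscombe(z, a, b):
    """Generalized Anscombe VST for noise with variance a*z + b.

    After the transform the noise std is approximately 1 wherever the
    argument of the square root is comfortably positive."""
    arg = a * np.asarray(z, dtype=float) + (3.0 / 8.0) * a ** 2 + b
    return (2.0 / a) * np.sqrt(np.maximum(arg, 0.0))
```

After filtering in the stabilized domain, an inverse transform maps the data back, as done in the pipeline described above.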
The inverse VST is then applied to the data in the RGB domain. Differently from the other algorithms, BM3D-CFA is applied directly to the raw data, after the VST and before the inverse VST. This represents an advantage of BM3D-CFA over the other approaches, which have been designed to deal with white noise but are applied in this case to colored noise. Finally, after the denoising step, we apply color correction, white balancing, gamma correction, and unsharp masking, coherently with the typical workflow of an Image Signal Processor; these steps can easily increase the visibility of artifacts and residual noise. A stack of 650 frames is averaged to build a ground-truth image of the same scene.

Table 2: Image quality metrics for the real image in Fig. 6.

          Noisy     SDCT      NLM (NN)  NLM (SNN)  BM3D-CFA
  PSNR    21.2689   23.3715   24.0972   24.5500    25.2580
  SSIM    0.8741    0.8889    0.9135    0.9261     0.9607
  MSSSIM  0.7133    0.7773    0.8124    0.8450     0.8943
  GMSD    0.1921    0.1499    0.1356    0.1145     0.0921
  FSIM    0.9920    0.9927    0.9933    0.9940     0.9952

These metrics confirm the advantage of SNN over SDCT and NN even in the case of a real image. The NLM filter with NN preserves high-frequency details in the image, but it also generates high-frequency colored noise in the flat areas, because of the noise-to-noise matching problem. The SDCT filter produces a more blurry image with middle-frequency colored noise in the flat areas.

Figure 6: Comparison of several patches from the image on the top, acquired with an NVIDIA Shield tablet at ISO 1200, demosaiced and then denoised with the different algorithms. Color correction, white balancing, gamma correction and unsharp masking have also been applied after denoising. The ground-truth image is obtained by averaging 650 frames. Better seen at 400% zoom.

Coherently with the case of colored noise and the Kodak dataset, our approach provides an image of quality comparable to BM3D-CFA; the NLM with SNN
filter is slightly more noisy than BM3D-CFA, as measured by the image quality metrics in Table 2, but it requires a lower computational cost.

References

[1] Ahn, B., and Cho, N. I. Block-matching convolutional neural network for image denoising. CoRR abs/1704.00524 (2017).
[2] Buades, A., Coll, B., and Morel, J. M. A review of image denoising algorithms, with a new one. Multiscale Model. Simul. (2005).
[3] Buades, A., Coll, B., and Morel, J.-M. Non-Local Means Denoising. Image Processing On Line 1 (2011), 208–212.
[4] Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing 16, 8 (2007).
[5] Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K. BM3D image denoising with shape-adaptive principal component analysis. In SPARS (2009).
[6] Danielyan, A., Vehvilainen, M., Foi, A., Katkovnik, V., and Egiazarian, K. Cross-color BM3D filtering of noisy raw data.
[7] Duval, V., Aujol, J.-F., and Gousseau, Y. On the parameter choice for the non-local means. SIAM J. Imag. Sci. (2011).
[8] Foi, A. Clipped noisy images: Heteroskedastic modeling and practical denoising. Signal Processing (2009).
[9] Knaus, C., and Zwicker, M. Progressive image denoising. IEEE TIP (2014).
[10] Lebrun, M., Colom, M., Buades, A., and Morel, J. M. Secrets of image denoising cuisine. Acta Numerica (2012).
[11] Lotan, O., and Irani, M. Needle-match: Reliable patch matching under high uncertainty. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016).
[12] Lou, Y., Favaro, P., Soatto, S., and Bertozzi, A. Nonlocal similarity image filtering, 2009.
[13] Ponomarenko, N., Jin, L., Ieremeiev, O., Lukin, V., Egiazarian, K., Astola, J., Vozel, B., Chehdi, K., Carli, M., Battisti, F., and Kuo, C.-C. J. Image database TID2013: Peculiarities, results and perspectives. Signal Processing: Image Communication 30 (2015).
[14] Malvar, H. S., He, L.-W., and Cutler, R. High-quality linear interpolation for demosaicing of Bayer-patterned color images. In International Conference on Acoustics, Speech and Signal Processing (May 2004).
[15] Tsai, Y.-T., Steinberger, M., Pająk, D., and Pulli, K. Fast ANN for high-quality collaborative filtering. CGF (2015).
[16] Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P. Image quality assessment: From error visibility to structural similarity. IEEE TIP (2004).
[17] Wang, Z., Simoncelli, E., and Bovik, A. Multiscale structural similarity for image quality assessment. In ACSSC (2003).
[18] Wu, Y., Tracey, B., Natarajan, P., and Noonan, J. P. Probabilistic non-local means. IEEE Sig. Proc. Let. 20 (2013).
[19] Xue, W., Zhang, L., Mou, X., and Bovik, A. C. Gradient magnitude similarity deviation: A highly efficient perceptual image quality index. IEEE TIP (2014).
[20] Yu, G., and Sapiro, G. DCT Image Denoising: a Simple and Effective Image Denoising Algorithm. IPOL (2011).
[21] Zhang, L., Zhang, L., Mou, X., and Zhang, D. FSIM: A feature similarity index for image quality assessment. IEEE TIP (2011).