Fast High-Dimensional Bilateral and Nonlocal Means Filtering
IEEE TRANSACTIONS ON IMAGE PROCESSING
Pravin Nair, Student Member, IEEE, and Kunal N. Chaudhury, Senior Member, IEEE
Abstract—Existing fast algorithms for bilateral and nonlocal means filtering mostly work with grayscale images. They cannot easily be extended to high-dimensional data such as color and hyperspectral images, patch-based data, flow-fields, etc. In this paper, we propose a fast algorithm for high-dimensional bilateral and nonlocal means filtering. Unlike existing approaches, where the focus is on approximating the data (using quantization) or the filter kernel (via analytic expansions), we locally approximate the kernel using weighted and shifted copies of a Gaussian, where the weights and shifts are inferred from the data. The algorithm emerging from the proposed approximation essentially involves clustering and fast convolutions, and is easy to implement. Moreover, a variant of our algorithm comes with a guarantee (bound) on the approximation error, which is not enjoyed by existing algorithms. We present some results for high-dimensional bilateral and nonlocal means filtering to demonstrate the speed and accuracy of our proposal. We also show that our algorithm can outperform state-of-the-art fast approximations in terms of accuracy and timing.
Index Terms—High-dimensional filter, bilateral filter, nonlocal means, shiftability, kernel, approximation, fast algorithm.
I. INTRODUCTION
Smoothing images while preserving structures (edges, corners, lines, etc.) is a fundamental task in image processing. A classic example in this regard is the diffusion framework of Perona and Malik [1]. In the last few decades, several filtering-based approaches have been proposed for this task. Prominent examples include bilateral filtering [2], [3], [4], mean shift filtering [5], weighted least squares [6], domain transform [7], guided filtering [8], and nonlocal means [9]. The brute-force implementation of most of these filters is computationally prohibitive and cannot be used for real-time applications. To address this problem, researchers have come up with approximation algorithms that can significantly accelerate the filtering without compromising the quality. Unfortunately, for bilateral and nonlocal means filtering, most of these algorithms can be used only for grayscale images. It is difficult to use them even for color filtering while preserving their efficiency. The situation is more challenging for multispectral and hyperspectral images, where the range dimension is much larger. Of course, any algorithm for grayscale filtering can be used for channel-by-channel processing of high-dimensional images. However, by working in the combined intensity space, we can exploit the strong correlation between channels [4].
The authors are with the Department of Electrical Engineering, Indian Institute of Science, Bangalore 560012, India. Correspondence: [email protected]. K. N. Chaudhury was supported by a Startup Grant from Indian Institute of Science and EMR Grant SERB/F/6047/2016-2017 from the Department of Science and Technology, Government of India.
A. High-Dimensional Filtering
The focus of the present work is on two popular smoothers, the bilateral and the nonlocal means filters, and their application to high-dimensional data. The former is used for edge-preserving smoothing in a variety of applications [10], while the latter is primarily used for denoising. Though nonlocal means has limited denoising capability compared to state-of-the-art denoisers [11], [12], it continues to be of interest due to its simplicity and the availability of low-complexity algorithms [13], [14], [15]. The connection between these filters is that they can be interpreted as a multidimensional Gaussian filter operating in the joint spatio-range space [16]. The term high-dimensional filtering is used when the dimension of the spatio-range space is large [14], [17], [18]. In this paper, we will use this term when the range dimension is greater than one.

A unified formulation of high-dimensional bilateral and nonlocal means filtering is as follows. Suppose that the input data is f : Ω → [0, R]^n, where Ω ⊂ Z^d is the spatial domain, [0, R]^n is the range space, and d and n are the dimensions of the domain and range of f. Let p : Ω → R^ρ be the guide image, which is used to control the filtering [10]. The output g : Ω → [0, R]^n is given by

g(ı) = (1/η(ı)) ∑_{j∈W} ω(j) ϕ(p(ı−j) − p(ı)) f(ı−j),   (1)

where

η(ı) = ∑_{j∈W} ω(j) ϕ(p(ı−j) − p(ı)).   (2)

The aggregations in (1) and (2) are performed over a window around the pixel of interest, i.e., W = [−S, S]^d, where S is the window length. We call ω : Z^d → R the spatial kernel and ϕ : R^ρ → R the range kernel [4], where we denote the dimension of the range space of p by ρ. The spatial kernel is used to measure proximity in the spatial domain Ω (as in classical linear filtering). On the other hand, the range kernel is used to measure proximity in the range space of p. For (joint) bilateral filtering, ω and ϕ are generally Gaussian.
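To make (1) and (2) concrete, here is a direct (brute-force) implementation for Gaussian ω and ϕ. This is our own minimal numpy sketch for reference, not the authors' code; the function name and interface are our own choices:

```python
import numpy as np

def brute_force_filter(f, p, S, sigma_s, sigma_r):
    """Direct evaluation of (1)-(2): f is (H, W, n), p is (H, W, rho)."""
    H, W, n = f.shape
    g = np.zeros((H, W, n))
    for i0 in range(H):
        for i1 in range(W):
            num, den = np.zeros(n), 0.0
            # aggregate over the window [-S, S]^2 around pixel (i0, i1)
            for j0 in range(-S, S + 1):
                for j1 in range(-S, S + 1):
                    k0, k1 = i0 - j0, i1 - j1
                    if 0 <= k0 < H and 0 <= k1 < W:
                        w = np.exp(-(j0**2 + j1**2) / (2 * sigma_s**2))
                        d = p[k0, k1] - p[i0, i1]
                        phi = np.exp(-(d @ d) / (2 * sigma_r**2))
                        num += w * phi * f[k0, k1]
                        den += w * phi
            g[i0, i1] = num / den
    return g
```

As written, this costs O(S²(n + ρ)) per pixel, which is exactly the cost that the fast algorithm proposed in the paper avoids.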
While f and p are identical for the bilateral filter (ρ = n), they are different for the joint bilateral filter [10]. In the original proposal for nonlocal means [9], p(ı) collects the spatial neighbors of pixel ı (extracted from a patch around ı), and hence ρ is n times the size of the image patch; ϕ is a multidimensional Gaussian, and ω is a box filter. Later, it was shown in [21] that by reducing the dimension of each patch (e.g., by applying PCA to the collection of patches), we can improve the speed and denoising performance. Similar to [14], [17], [18], we have also considered PCA-based nonlocal means
Fig. 1. Bilateral filtering of a color image [20], with σ_s = 10 and σ_r = 40. Panels: (a) Lena; (b) Proposed, 274 ms, 55.36 dB; (c) GCS [19]; (d) AM [14]; (e) brute-force; (f)-(h) errors of (b)-(d) with respect to (e), shown for the red channel. The same number of clusters was used for GCS and the proposed method; the number of manifolds in AM was set automatically. PSNR is defined in (21).

in this work. Needless to say, the proposed algorithm can also work with full patches.

The brute-force computation of (1) and (2) clearly requires O(S^d(n + ρ)) operations per pixel. In particular, the complexity scales with the window length as S^d, which can be large for some smoothing applications. In the last decade, several fast algorithms have been proposed for bilateral filtering of grayscale images [20], [22], [23], [24], [25], [26]. The complexity can be cut down from O(S^d) to O(1) using these algorithms. Similarly, fast algorithms have been proposed for nonlocal means [13], [15] that can reduce the complexity from O(S^d ρ) to O(S^d).

B. Previous Work

Quantization methods. Durand et al. [23] proposed a novel framework for approximating (1) and (2) using clustering and interpolation. In terms of our notation, their approximation of (1) and (2) can be expressed as

ĝ(ı) = (1/η̂(ı)) ∑_{k=1}^{K} c_k(ı) ( ∑_{j∈W} ω(j) ϕ(p(ı−j) − µ_k) f(ı−j) ),   (3)

and

η̂(ı) = ∑_{k=1}^{K} c_k(ı) ( ∑_{j∈W} ω(j) ϕ(p(ı−j) − µ_k) ).   (4)

The centers µ_k are obtained by uniformly sampling the range space of p, whereas the interpolation coefficient c_k(ı) is determined from the distance between p(ı) and µ_k.
Notice that the inner summations in (3) and (4) can be expressed as convolutions, which are computed using FFT in [23]. Furthermore, the entire processing is performed on a subsampled image, following which the output image is upsampled. The approximation turns out to be effective for grayscale images. This is because the range space is one-dimensional in this case, and hence a good approximation can be achieved using a small number of samples K. Moreover, it suffices to interpolate between just the two neighboring samples. Paris et al. [20] showed that the accuracy of [23] can be improved by downsampling the intermediate images (instead of the input image) involved in the convolutions. Chen et al. [16] proposed to accelerate [20] by performing convolutions in the higher-dimensional spatio-range space. Yang et al. [26] observed that this framework can also be used with non-Gaussian range kernels, and that O(1) convolutions can be used to improve the computational complexity. As the range dimension ρ increases, however, these methods become computationally inefficient. In particular, an exponentially large K is required to achieve a decent approximation, and linear interpolation requires ρ neighboring samples. In [27], Yang et al. extended this method to perform bilateral filtering of color images, and proposed a fast scheme for linear interpolation. However, [27] is based on uniform quantization, which is not optimal given the strong correlation between the color channels. In fact, a more efficient algorithm for color filtering was later proposed in [19], where clustering is used to find µ_k. Moreover, instead of interpolation, they used a local statistics prior to improve the accuracy. However, the algorithm is used only for color bilateral filtering, and its application to other forms of high-dimensional filtering is not reported in [19].

Splat-blur-slice method. Another line of work is [14], [17], [18], which is based on a slightly different form of approximation.
More precisely, they are based on the “splat-blur-slice” framework, which involves data partitioning (clustering
or tessellation), fast convolutions, and interpolation as the core operations. These are considered to be the state-of-the-art fast algorithms for high-dimensional bilateral and nonlocal means filtering. The idea, originally proposed in [16], is based on the observation that [20] corresponds to convolutions in the joint spatio-range space. The general idea is to sample the input pixels in a different space (splatting), perform Gaussian convolutions (blurring), and resample the result back to the original pixel space (slicing). Adams et al. tessellated the domain and performed blurring using the k-d tree in [17] and the permutohedral lattice in [18]. Gastal et al. [14] divided and resampled the domain into non-linear manifolds, and performed blurring on them. This was shown to be faster than all other methods within the splat-blur-slice framework.

Fig. 2. Flowchart of our algorithm with the main modules: Cluster Range Space → Form Intermediate Images → Calculate Coefficients → Fast Convolutions → Weighted Recombination, taking the input f and guide p to the output ĝ (intermediate quantities: µ_k, c_k, b_k, u_k, ω ∗ u_k, ω ∗ b_k). Also shown are the timings for the bilateral filtering of a megapixel color image with σ_s = 10, σ_r = 75, and K = 4 clusters. The symbols used in the flowchart are defined in Section II (also see Algorithm 1).

Kernel approximation. In an altogether different direction, it was shown in [22], [24], [25], [28] that fast algorithms for bilateral filtering can be obtained by approximating the range kernel ϕ using so-called shiftable functions, which include polynomials and trigonometric functions. The above-mentioned algorithms provide excellent accelerations for grayscale images, and are generally superior to algorithms based on data approximation. Unfortunately, it is difficult to approximate a high-dimensional Gaussian using low-order shiftable functions.
For example, the Fourier expansion in [25] is quite effective in one dimension (grayscale images), but its straightforward extension to even three dimensions (color images) results in exponentially many terms. This is referred to as the "curse of dimensionality".

C. Contributions
Existing algorithms have focused on either approximating the data [14], [17], [18], [19], [26] or the range kernel [22], [24], [25], [29], [30]. In this paper, we combine the ideas of shiftability [22], [24], [28], [31] and data approximation [23], [26] within a single framework to approximate (1) and (2). In particular, we locally approximate the range kernel in (1) and (2) using weighted and shifted copies of a Gaussian, where the weights and shifts are determined from the data. More specifically, we use clustering to determine the shifts, whereby the correlation between data channels is taken into account. Once the shifts have been fixed, we determine the weights (coefficients) using least-squares fitting, where the data is again taken into account. An important technical point is that we are required to solve a least-squares problem at each pixel, which can be expensive. We show how this can be done using just one matrix inversion. In summary, the key distinctions of our method in relation to previous approaches are as follows:

• We use data-dependent methods to calculate both µ_k and c_k in (3) and (4); the latter is done in a heuristic fashion in [14], [17], [18]. We note that our approximation also reduces to the form in (3) and (4). However, the notion of shiftable approximation (in a sense which will be made precise in Section II) plays an important role. Namely, it allows us to interpret the coefficients in (3) and (4) from an approximation-theoretic point of view. This in turn forms the basis of the proposed optimization used to compute the coefficients.

• Unlike [19], [20], [23], [26], [27], our approach is scalable in the dimensions of the input f and the guide p.

• An important difference with [22], [24], [25], [28] is that a different shiftable approximation is used at each pixel in our proposal, while the approximation is global in [22], [24], [25], [28]. This will be made precise in Section II.
• Similar to [14], [17], [18], our method also involves splatting (clustering), blurring (convolutions), and slicing (weighted recombination). However, the important difference is that we perform slicing in an optimal manner, whereas it is done in a heuristic fashion in these methods.

We explain the proposed shiftable approximation in detail in Section II. The algorithm emerging from the approximation is conceptually simple and easy to code. The flowchart of the processing pipeline is shown in Figure 2. The computation-intensive components are "Cluster Range Space" and "Fast Convolutions", for which highly optimized routines are available. Excluding clustering, the complexity is O(K(n + ρ)), where K is the number of clusters. Notice that there is no dependence on S. This is a big advantage compared to fast nonlocal means [13], [15], where the scaling is O(S^d). The highlight of our proposal is that we are able to derive a bound on the approximation error for a particular variant of our algorithm [32]. In particular, the error is guaranteed to vanish if we use more clusters. This kind of guarantee is not offered by state-of-the-art algorithms for high-dimensional filtering [14], [17], [18], partly due to the complex nature of their formulation.

We use the proposed fast algorithm for various applications such as smoothing [4], denoising [9], low-light denoising [33], hyperspectral filtering [34], and flow-field denoising [35]. In particular, we demonstrate that our algorithm is competitive with existing fast algorithms for color bilateral and nonlocal means filtering (cf. Figures 1 and 11). We note that although our original target was high-dimensional filtering, our algorithm outperforms [26], which is considered the state-of-the-art for bilateral filtering of grayscale images (cf. Figure 5).
Finally, we note that we have not used the particular form of the Gaussian kernel in the derivation of the fast algorithm. Therefore, it can also be used for non-Gaussian kernels, such as the exponential and the Lorentz kernel [23], [36].
D. Organization
The rest of the paper is organized as follows. In Section II, we describe the proposed approximation and the resulting fast algorithm. We also derive error bounds for a particular variant of the approximation. In Section III, we apply the fast algorithm to high-dimensional bilateral and nonlocal means filtering, and compare its performance (timing and accuracy) with state-of-the-art algorithms. We conclude the paper with a summary of the results in Section IV.

II. PROPOSED METHOD
Notice that, if the guide p has constant intensity value at each pixel, then (1) and (2) reduce to linear convolutions. This observation is essentially at the core of the fast algorithms in [20], [23], [26]. On the other hand, the fast algorithms in [22], [25], [28] are derived using a completely different idea, where the univariate Gaussian kernel ϕ is approximated using a polynomial or a trigonometric function ψ. These functions are shiftable in the following sense: there exist basis functions ψ_1, . . . , ψ_K such that, for any τ ∈ R, we can write

ψ(x − τ) = ∑_{k=1}^{K} c_k(τ) ψ_k(x),   (5)

where the coefficients c_1, . . . , c_K depend only on τ. One can readily see that polynomials and trigonometric functions,

ψ(x) = ∑_{n=0}^{N} p_n x^n   and   ψ(x) = ∑_{n=0}^{N} q_n cos(nωx),   (6)

are shiftable. The utility of expansion (5) is that it allows us to factor out p(ı) from the kernel. In particular, by replacing ϕ with ψ, it was shown in [22] that we can compute the bilateral filter using fast convolutions. The Taylor polynomial of ϕ was used for ψ in [24], and the Fourier approximation of ϕ was used in [22], [25], [28].

A. Shiftable Approximation
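The cosine case in (6) is the simplest instance of (5): by the addition theorem, cos(ω(x − τ)) = cos(ωτ) cos(ωx) + sin(ωτ) sin(ωx), so K = 2 basis functions suffice, with coefficients that depend only on the shift τ. A quick numerical check (our own illustration; the values of ω and τ are arbitrary):

```python
import numpy as np

# Shiftability of psi(x) = cos(w0 * x): the shifted kernel is an exact
# combination of two fixed basis functions, cos(w0 * x) and sin(w0 * x).
w0, tau = 0.7, 2.3
x = np.linspace(-5.0, 5.0, 201)
lhs = np.cos(w0 * (x - tau))
c1, c2 = np.cos(w0 * tau), np.sin(w0 * tau)   # coefficients depend only on tau
rhs = c1 * np.cos(w0 * x) + c2 * np.sin(w0 * x)
assert np.allclose(lhs, rhs)
```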
As mentioned earlier, it is rather difficult to obtain shiftable approximations of high-dimensional Gaussians that are practically useful. We can use separable extensions of (6) to generate such approximations. However, the difficulty is that the number of terms grows as N^ρ in this case, where N is the order in (6). We address this fundamental problem using a data-centric approach. First, we do not use a shiftable (trigonometric or polynomial) approximation of ϕ [22], [25]. Instead, for some fixed pixel ı ∈ Ω, we consider the multidimensional Gaussian ϕ(x − p(ı)) centered at p(ı) appearing in (1) and (2). Similar to (5), we wish to find ψ_1, . . . , ψ_K such that

ϕ(x − p(ı)) ≈ ∑_{k=1}^{K} c_k(ı) ψ_k(x),   (7)

where c_1, . . . , c_K depend only on p(ı). The key distinction from [22], [24], [25], [28] is that the functional approximation in (7) is defined locally, namely, the approximation is different at each pixel. Moreover, (7) is neither a polynomial nor a trigonometric function. In spite of this, we continue to use the term "shiftable" to highlight the fact that the approximation is based on "shifts" of the basis functions ψ_1, . . . , ψ_K. In this work, we propose to use translates of the original kernel as the basis functions, i.e., we set ψ_k(x) = ϕ(x − µ_k). One can in principle use different basis functions, but we will not investigate this possibility in this paper.

The important consideration is that the shifts {µ_k} are set globally. Since the range kernel is defined via the guide p, we propose to cluster the range space of p and use the cluster centroids for {µ_k}. That is, we partition the range space

Θ = {p(ı) : ı ∈ Ω},   (8)

into clusters C_1, . . . , C_K, and set µ_k to be the centroid of C_k. In summary, for each ı, we require that

ϕ(x − p(ı)) ≈ ∑_{k=1}^{K} c_k(ı) ϕ(x − µ_k).   (9)

At this point, notice the formal resemblance between (5) and (9).
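As a 1-D sanity check of an approximation of the form (9) (our own illustration; the shifts below are arbitrary stand-ins for cluster centroids): a Gaussian centered at an intermediate intensity is reproduced quite accurately, over the whole intensity axis, by a least-squares combination of a few fixed Gaussian translates.

```python
import numpy as np

sigma_r = 40.0
phi = lambda t: np.exp(-t**2 / (2 * sigma_r**2))   # 1-D Gaussian range kernel
mu = np.array([40.0, 80.0, 120.0, 160.0, 200.0])   # K = 5 fixed shifts
x = np.linspace(0.0, 255.0, 256)
p_i = 100.0                                        # kernel center at this pixel
B = np.stack([phi(x - m) for m in mu], axis=1)     # basis translates on the range
c, *_ = np.linalg.lstsq(B, phi(x - p_i), rcond=None)
max_err = np.max(np.abs(B @ c - phi(x - p_i)))     # small: the fit is close
```

This fits over the full intensity grid; the paper instead restricts the fit to the shifts themselves, which reduces the per-pixel cost to a single matrix-vector product with a precomputed pseudo-inverse.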
Hence, we will refer to (9) as a shiftable approximation in the rest of the discussion, even though it is not shiftable in the sense of (5).

Note that we need a good approximation in (9) only for x ∈ Θ_ı = {p(ı − j) : j ∈ W}. This is because the samples appearing in (1) and (2) are ϕ(x − p(ı)), where x takes values in Θ_ı. To determine the coefficients, it is therefore natural to consider the following problem:

min_{c(ı) ∈ R^K} ∑_{x ∈ Θ_ı} ( ϕ(x − p(ı)) − ∑_{k=1}^{K} c_k(ı) ϕ(x − µ_k) )².   (10)

The difficulty is that this requires us to solve an expensive least-squares problem (the matrix to be inverted is large) for each ı. More importantly, we have a different inversion at each pixel. This is time consuming and impractical. On the other hand, notice that Θ_ı is included in Θ. Hence, we could perform the fitting over Θ. Instead, we set Λ = {µ_1, . . . , µ_K} and choose to fit over Λ, which are quantized representatives of the points in Θ. In short, while the cluster centers Λ most likely do not belong to Θ_ı, they are representative of the larger set Θ. Therefore, instead of (10), we consider the surrogate problem:

min_{c(ı) ∈ R^K} ∑_{x ∈ Λ} ( ϕ(x − p(ı)) − ∑_{k=1}^{K} c_k(ı) ϕ(x − µ_k) )².

In matrix notation, we can compactly write this as

min_{c(ı) ∈ R^K} ‖A c(ı) − b(ı)‖²,   (11)

where A ∈ R^{K×K} and b(ı) ∈ R^K are given by

A_{kl} = ϕ(µ_k − µ_l)   and   b_k(ı) = ϕ(µ_k − p(ı)).   (12)

The solution of (11) is c(ı) = A† b(ı), where A† is the pseudo-inverse of A. In particular, we need to compute the pseudo-inverse just once; the coefficients at each pixel are then obtained using a matrix-vector multiplication. In Figure 6, we show that solving (11) is much faster than solving (10), and that the coefficients from (11) are close to those from (10).

Fig. 3. Proposed approximation of the Gaussian range kernel ϕ(x − p(ı)) in (9) for p(ı) = 25 (left) and p(ı) = 100 (right). The shifts {µ_k} are obtained by clustering a grayscale image whose histogram is shown above. As expected, the approximation is better for order K = 8 than for K = 4. In particular, the approximation is quite accurate over the entire dynamic range when K = 8. However, notice that for K = 4, the approximation is better in the interval where the density is higher. As the approximation is data-driven, the error gets distributed in tune with the underlying histogram.

Fig. 4. Error rates for the proposed approximation (see text for the definition of E_K), where ρ is the dimension of the range space of the guide image p. We notice that the approximation error falls off with increasing order K. The error is averaged over the color images (ρ = 3) from the Kodak dataset and hyperspectral images (ρ = 33) from [37].

The approximation results for univariate Gaussians are shown in Figure 3. Notice that our approximation depends on the distribution of p (because of the clustering), and the coefficients are obtained by solving a least-squares problem at each pixel ı. Hence, in Figure 3, a particular distribution is selected and the approximations for p(ı) = 25 and p(ı) = 100 are shown. The error rates (error vs. K) for multidimensional Gaussians corresponding to dimensions ρ = 3 and ρ = 33 are shown in Figure 4. The error E_K for a fixed K is simply (10) averaged over all the pixels.

To summarize, the steps in the proposed shiftable approximation are as follows:

• Cluster the range space Θ and use the centers for µ_k.
• Set the basis functions as ψ_k(x) = ϕ(x − µ_k).
• Set up A using (12) and compute its pseudo-inverse A†.
• At each pixel ı, set b(ı) using (12) and compute c(ı) = A† b(ı).

We note that one can freely choose different shifts (and basis functions) in (9) for different applications. That is, (9) offers a broad approximation framework, where one can consider other rules for fixing the parameters (shifts and coefficients).

B. Fast Algorithm
We now develop a fast algorithm by replacing the kernel in (1) and (2) with the approximation in (9). For 1 ≤ k ≤ K, define v_k : Ω → R^n and r_k : Ω → R to be

v_k(ı) = ∑_{j∈W} ω(j) ϕ(p(ı−j) − µ_k) f(ı−j),   (13)

and

r_k(ı) = ∑_{j∈W} ω(j) ϕ(p(ı−j) − µ_k).   (14)

For x = p(ı − j), j ∈ W, we replace ϕ(x − p(ı)) in (1) and (2) with the approximation in (9). After some manipulations (see Appendix V-A), (1) and (2) are approximated as

ĝ(ı) = (1/η̂(ı)) ∑_{k=1}^{K} c_k(ı) v_k(ı),   (15)

and

η̂(ı) = ∑_{k=1}^{K} c_k(ı) r_k(ı).   (16)

The advantage of (15) and (16) is clear once we notice that (13) and (14) can be expressed as convolutions. Indeed, by defining u_k : Ω → R^n to be u_k(ı) = b_k(ı) f(ı), where b_k is given by (12), we can write

v_k(ı) = (ω ∗ u_k)(ı) = ∑_{j∈W} ω(j) u_k(ı−j),

and

r_k(ı) = (ω ∗ b_k)(ı) = ∑_{j∈W} ω(j) b_k(ı−j),

where ∗ denotes linear convolution. In summary, using (9), we have been able to approximate the high-dimensional filtering using weighted combinations of fast convolutions. The overall process is summarized in Algorithm 1, where we use ⊕, ⊗ and ⊘ to represent pointwise addition, multiplication, and division. In steps 17 and 18, c_k denotes the mapping
c_k : Ω → R.

Algorithm 1: Fast High-Dimensional Filtering
Input: f : Ω → R^n and p : Ω → R^ρ, kernels ω and ϕ;
Parameter: number of clusters K;
Output: approximation in (15);
// CLUSTER RANGE SPACE
{µ_k} ← cluster Θ in (8) using bisecting K-means;
// FORM INTERMEDIATE IMAGES
Set up A in (12) using {µ_k} and ϕ, and compute A†;
for ı ∈ Ω do
  for k = 1, . . . , K do
    b_k(ı) ← ϕ(µ_k − p(ı));
    u_k(ı) ← b_k(ı) f(ı);
  end
end
// CALCULATE COEFFICIENTS
for ı ∈ Ω do
  c(ı) ← A† b(ı);
end
// FAST CONVOLUTIONS
Initialize v : Ω → R^n and r : Ω → R with zeros;
for k = 1, . . . , K do
  v ← v ⊕ [c_k ⊗ (ω ∗ u_k)];
  r ← r ⊕ [c_k ⊗ (ω ∗ b_k)];
end
ĝ ← v ⊘ r.

The dominant computation in Algorithm 1 is the (n + 1)K convolutions in steps 17 and 18. Clustering and the inversion of A are performed just once. Notice that b_k, which is used for computing the coefficients in step 13, is anyway required to form the intermediate images in step 9. The flowchart of our algorithm, along with typical timings for a megapixel image, is shown in Figure 2. Note that the steps in Figure 2 (resp. loops in Algorithm 1) can be parallelized: clustering the range space (loop 6), finding the coefficients (loop 12), and performing the convolutions (loop 16).

C. Implementation
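For concreteness, the pipeline of Algorithm 1 can be sketched in a few lines of Python. This is our own illustrative rendering, not the authors' MATLAB code: it uses scipy's plain k-means and Gaussian filtering as stand-ins for the bisecting K-means and the O(1) convolutions discussed below.

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.ndimage import gaussian_filter

def fast_hd_filter(f, p, sigma_s, sigma_r, K=4):
    """Shiftable approximation of (1)-(2) via (13)-(16)."""
    H, W, n = f.shape
    rho = p.shape[2]
    phi = lambda d2: np.exp(-d2 / (2 * sigma_r**2))
    # Step 4: cluster the range space of the guide (stand-in for bisecting K-means).
    mu, _ = kmeans2(p.reshape(-1, rho).astype(float), K, minit='++', seed=0)
    # Step 5: A_kl = phi(mu_k - mu_l) and its pseudo-inverse, eq. (12).
    A = phi(((mu[:, None, :] - mu[None, :, :])**2).sum(-1))
    A_pinv = np.linalg.pinv(A)
    # Steps 8-9: b_k(i) = phi(mu_k - p(i)); u_k = b_k * f is formed on the fly.
    b = phi(((p[:, :, None, :] - mu[None, None, :, :])**2).sum(-1))  # (H, W, K)
    # Step 13: coefficients c(i) = A^+ b(i) at every pixel, eq. (11).
    c = b @ A_pinv.T                                                  # (H, W, K)
    # Steps 17-18: K Gaussian convolutions and weighted recombination.
    num = np.zeros((H, W, n))
    den = np.zeros((H, W))
    for k in range(K):
        den += c[..., k] * gaussian_filter(b[..., k], sigma_s)
        for ch in range(n):
            num[..., ch] += c[..., k] * gaussian_filter(b[..., k] * f[..., ch], sigma_s)
    return num / den[..., None]
```

The step numbers in the comments refer to Algorithm 1; beyond the K(n + 1) convolutions, the only per-pixel work is the matrix-vector product c(ı) = A† b(ı).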
We have used bisecting K-means [38] for iteratively clustering the range space in step 4. In particular, the cluster with the largest variance is divided into two parts (using 2-means clustering) at every iteration. This ensures that we do not end up with a few large clusters at the end. We initialize the 2-means clustering using a pair of points with the largest separation. We picked bisecting K-means over other clustering algorithms as its cost is linear in K, and it is generally faster than other clustering algorithms [38]. Importantly, its computational overhead is negligible compared to the time required for the convolutions. Needless to say, we can use any efficient clustering method and the fast algorithm would still go through.

We have used the Matlab routine pinv to compute A† in step 5. As is well known, when ω is a box or Gaussian, we can convolve using O(1) operations, i.e., the cost does not depend on the size of the spatial filter [39], [40]. The Gaussian convolutions in steps 17 and 18 are implemented using Young's O(1) algorithm [40]. For box convolutions, we used a standard recursive procedure which requires just four operations per pixel for any box size. Since the leading computations are the convolutions, and there are O(Kn) convolutions, the overall complexity is O(K(n + ρ)); the additional O(Kρ) term is for computing (12). Importantly, we are able to obtain a speedup of S^d/K over the brute-force computation of (1) and (2). The storage complexity of Algorithm 1 is clearly linear in n, K, as well as the image size. The MATLAB implementation of our algorithm is available here: https://github.com/pravin1390/FastHDFilter.

D. Approximation Accuracy
We note that the algorithm in [32] is a variant of our method. Indeed, for 1 ≤ k ≤ K, if we set the coefficients in (9) to be

c_k(ı) = 1 if p(ı) ∈ C_k, and 0 otherwise,   (17)

then we obtain the fast algorithm in [32]. In other words, the filtering is performed on a cluster-by-cluster basis in this case. Correspondingly, (15) reduces to

ĝ(ı) = v_s(ı) / r_s(ı),   (18)

assuming that p(ı) is in cluster C_s.

In (15) and (16), we weight the results from the K clusters, where the weights are obtained using the optimization in (11). We will demonstrate in Section III that the weighting helps in reducing the approximation error. Unfortunately, it also makes the analysis of the algorithm difficult. Intuitively though, one would expect the error from (15) to be less than that from (18). For the latter approximation, we have the following result [32].

Theorem 1 (Error bound):
Suppose that ω and ϕ are non-negative, and that the latter is Lipschitz continuous, i.e., for some L > 0, |ϕ(x) − ϕ(y)| ≤ L ‖x − y‖ for any x and y in R^ρ. Then, for some constant C > 0,

∑_{ı∈Ω} ‖ĝ(ı) − g(ı)‖ ≤ C L n |W| E_K,   (19)

where ĝ is given by (18), and E_K is the clustering error:

E_K = ∑_{k=1}^{K} ∑_{p(ı)∈C_k} ‖p(ı) − µ_k‖.   (20)

We note that the assumptions in Theorem 1 are valid when ω is a box or Gaussian, and ϕ is a multidimensional Gaussian. For completeness, we provide the derivation of Theorem 1 in Appendix V-B. For several clustering methods, the clustering error E_K vanishes as K → ∞ [38]. For any of these methods, we can approximate (1) with arbitrary accuracy provided K is sufficiently large. We note that such a guarantee is not available for existing fast approximation algorithms for high-dimensional filtering [14], [17], [18].

III. NUMERICAL RESULTS
We now report some numerical results for high-dimensional filtering. We have used the Matlab implementation of Algorithm 1. For fair timing comparisons with Adaptive Manifolds
For Adaptive Manifolds (AM) [14] and Global Color Sparseness (GCS) [19], we have used the Matlab code provided by the authors. The timing analysis was performed on an Intel 4-core 3.4 GHz machine with 32 GB memory. The grayscale and color images used for our experiments were obtained from standard databases. The infrared and hyperspectral images are the ones used in [14] and [37], [41]. To compare the filtering accuracy with existing methods, we have fixed the timings by adjusting the respective approximation order. The objective of this section is to demonstrate that, in spite of its simple formulation, our algorithm is competitive with existing fast approximations of bilateral and nonlocal means filtering.

We have used an isotropic Gaussian range kernel for bilateral and nonlocal means filtering: ϕ(x) = exp(−‖x‖²/2σ_r²), where the standard deviation σ_r controls the effective width (influence) of the kernel. We recall from (1) and (2) that the input to ϕ is the difference x = p(ı) − p(ı − j). In one of the experiments, ϕ(x) is an anisotropic Gaussian; we have explicitly mentioned this at the right place. The spatial kernel ω for bilateral filtering is also Gaussian: ω(ı) = exp(−‖ı‖²/2σ_s²), where σ_s is again the standard deviation. For nonlocal means (NLM), ω is a box filter, namely, no spatial weighting is used. The filter width for bilateral filtering is S = 3σ_s, while that for NLM is explicitly mentioned (typically, S = 10).

Following [24], [26], the error between the outputs of the brute-force implementation and the fast algorithm is measured using the peak signal-to-noise ratio (PSNR). This is given by

PSNR = −10 log₁₀(MSE), where MSE = |Ω|⁻¹ Σ_{ı ∈ Ω} ‖ĝ(ı) − g(ı)‖²,   (21)

and |Ω| is the number of pixels. Note that PSNR is also used to measure denoising performance; in this case, ĝ(ı) and g(ı) are the denoised and clean images in (21). It should be evident from the context as to what is being measured using PSNR. We also use SSIM [42] for measuring visual quality.
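As a concrete reference for the metric in (21), the following is a minimal sketch, assuming the images are floating-point H × W × C arrays scaled to [0, 1] (which is what the formula PSNR = −10 log₁₀(MSE) presumes); the function name is ours.

```python
import numpy as np

def approximation_psnr(g_hat, g):
    """PSNR of (21): MSE = |Omega|^{-1} sum_i ||g_hat(i) - g(i)||^2,
    PSNR = -10 log10(MSE). Images are H x W x C arrays in [0, 1]."""
    g_hat = np.asarray(g_hat, dtype=np.float64)
    g = np.asarray(g, dtype=np.float64)
    num_pixels = g.shape[0] * g.shape[1]           # |Omega|
    mse = np.sum((g_hat - g) ** 2) / num_pixels    # squared norms averaged over pixels
    return -10.0 * np.log10(mse)
```

Here the norm in (21) is over the C range channels, so the channel dimension is summed (not averaged) before dividing by the number of pixels.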
A. Grayscale Bilateral Filtering
As mentioned earlier, although the proposed method is targeted at high-dimensional filtering, we can outperform [26], which is considered the state-of-the-art for bilateral filtering of grayscale images. We demonstrate this in Table I and Figure 5 using a couple of examples. In particular, notice in Figure 5 that artifacts are visible in Yang's result (when K is small). Our PSNR is markedly higher, and our output does not show any artifacts. A detailed PSNR comparison is provided in Table I. We notice that, for the same timing, our PSNR is consistently better than [26]. Note that we have set K = 4 as the number of clusters (resp. quantization bins) for our method (resp. [26]).

For completeness, we have reported the timings for different σ_s in Table II, where we have used Young's fast algorithm [40] for performing the Gaussian convolutions. As expected, the timing essentially does not change with σ_s.

(Standard databases: https://goo.gl/821N2G, https://goo.gl/2fcNmu, https://goo.gl/MvxCMX.)

Fig. 5. Visual comparison of bilateral filtering for the House image: (a) input; (b) brute-force filtering; (c) proposed (43.5 dB); (d) Yang (– dB). Filter parameters: σ_s = 10 and σ_r = 30. For a fair comparison, we have used four clusters for the proposed method and four bins for Yang's method [26]. Notice that our result is better than Yang's, both visually (compare boxed areas) and in terms of PSNR. This is because we use non-uniform quantization (clustering) and data-driven approximation (see Figure 3).

TABLE I
Comparison of approximation accuracy (PSNR) for bilateral filtering of Barbara (– × –). Average timing is – ms for Yang and – ms for the proposed method.

σ_s \ σ_r   |  10   |  20   |  30   |  40   |  50   |  60   |  70   |  80   |  90   | 100
Yang [26]:
–           | –.48  | 53.01 | 53.48 | 53.65 | 53.48 | 53.17 | 53.01 | 52.71 | 52.28 | 52.–
–           | –.23  | 53.17 | 53.81 | 53.98 | 53.98 | 53.81 | 53.81 | 53.98 | 53.98 | 53.–
–           | –.86  | 53.32 | 53.81 | 53.98 | 53.82 | 53.65 | 53.65 | 53.82 | 53.98 | 54.–
Proposed:
–           | –.08  | 64.61 | 61.69 | 59.50 | 57.50 | 56.09 | 54.88 | 53.81 | 53.32 | 52.–
–           | –.53  | 68.13 | 64.61 | 62.11 | 60.53 | 59.19 | 58.03 | 57.25 | 56.77 | 56.–
–           | –.01  | 70.07 | 67.30 | 64.61 | 62.56 | 60.89 | 60.17 | 59.19 | 58.88 | 58.–

TABLE II
Timings for bilateral filtering of Barbara for varying σ_s (σ_r = 100, K = 16).

σ_s       |  10 |  20 |  30 |  40 |  50 |  60 |  70 |  80
Time (ms) | 496 | 498 | 475 | 486 | 500 | 489 | 485 | 481
In Figure 6, we have compared the bilateral filtering results obtained using (10) and (11). Since (11) is an approximation of (10), we see a drop in PSNR when using (11); importantly, though, there is a significant speedup. In fact, the time required using (10) is more than that of the brute-force implementation.
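The grayscale pipeline evaluated above can be sketched in a few lines: cluster the intensities, evaluate the range kernel only at the K cluster centers, and reduce the filtering to K spatial Gaussian convolutions. This is a simplified stand-in that uses hard nearest-cluster coefficients (in the spirit of (18)); the optimized coefficients of (15) are not reproduced here, and the function and parameter names are ours.

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.ndimage import gaussian_filter

def fast_bilateral_gray(f, sigma_s, sigma_r, K=4):
    """Cluster-and-convolve bilateral filtering (hard-assignment variant).

    f is a 2-D float array. The range kernel is evaluated only at the K
    cluster centers mu_k, so the per-pixel sums reduce to K Gaussian
    convolutions: v_k = omega * (phi_k f) and r_k = omega * phi_k.
    """
    centers, labels = kmeans2(f.reshape(-1, 1), K, minit='++', seed=0)
    centers = centers.ravel()
    labels = labels.reshape(f.shape)
    out = np.zeros_like(f)
    for k in range(K):
        # range kernel evaluated at the cluster center mu_k
        phi_k = np.exp(-(f - centers[k]) ** 2 / (2 * sigma_r ** 2))
        v_k = gaussian_filter(phi_k * f, sigma_s)  # numerator convolution
        r_k = gaussian_filter(phi_k, sigma_s)      # denominator convolution
        mask = labels == k
        out[mask] = v_k[mask] / r_k[mask]
    return out
```

The cost is K spatial convolutions regardless of σ_s (when a recursive Gaussian filter is used), which is consistent with the flat timings in Table II.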
B. Color Bilateral Filtering
We next perform fast bilateral filtering of color images, for which the state-of-the-art methods are AM [14] and GCS [19]. We have also compared with our previous work [32], which we simply refer to as "Clustering". In Figure 7, we have studied the effect of the number of clusters on the
filtering accuracy, for our method, GCS, AM, and [32]. While GCS performs better for small K (coarse approximation), our method outperforms GCS for larger K. Note that the number of manifolds can only take dyadic (discrete) values [14].

Fig. 6. Bilateral filtering of a grayscale image: (a) input; (b) brute-force (– ms); (c) using (10) (– sec, – dB); (d) using (11) (27 ms, 32.6 dB). Parameters: σ_s = 2, σ_r = 20, and K = 4. The coefficients in (c) and (d) are obtained from (10) and (11), respectively.
Fig. 7. Scaling of filtering accuracy (average PSNR) with the number of clusters (manifolds for AM), for bilateral filtering of color images from the Kodak dataset. Methods: Clustering, GCS, Proposed, AM. The parameters are σ_s = 20 and σ_r = 40.

TABLE III
PSNR for color images (σ_s = 10, σ_r = 50).

size \ K |   –   |   –   |  –
1080P    | –.54  | 48.68 | 53.–
4K       | –.29  | 47.42 | 53.–
8K       | –.25  | 46.90 | 54.–

Fig. 8. Timing (seconds) vs. resolution (MP) for color filtering.
In Table III, we report the filtering accuracy for full HD, 4K ultra HD, and 8K ultra HD images by varying the number of clusters K. For a given resolution and K, we have averaged the PSNR over six high-resolution images of that resolution. Notice that, irrespective of the image resolution, the PSNR values are high with just a few clusters. In Figure 8, the timings for filtering color images are plotted for varying image sizes (from – MP to – MP; images obtained from https://hdqwalls.com). This plot verifies the linear dependence of the timing on the image size, irrespective of the number of clusters used.
Fig. 9. Approximation accuracy (PSNR) for bilateral filtering of color Barbara (– × – × –), at different σ_r values, averaged over σ_s = 10, –, –. Methods: Proposed, GCS, AM, Clustering. The average timings are: Proposed: – ms, Global Color Sparseness: – ms, Adaptive Manifolds: – ms, and Clustering [32]: – ms.

In Figure 9, we have compared the PSNR and timings for bilateral filtering of a color image, at different σ_s and σ_r values. We have used K = 16 clusters for all three methods: proposed, GCS, and [32]. We use the default settings for the number of manifolds in AM. Notice that we are better by – dB, while the timings are roughly the same. In particular, notice the PSNR improvement that we obtain over [32] by optimizing the coefficients, i.e., by using (15) in place of (18). Interestingly, [32] can outperform GCS and AM at large σ_r values.

A visual comparison for color bilateral filtering was already shown in Figure 1, along with the error images, timings, and PSNRs. In particular, notice that the error from our method is smaller than that of AM and GCS. This is interesting, as our method is conceptually simpler than these algorithms: AM has a complex formulation and GCS is based on two-pass filtering. In Figure 10, a visual result is provided to highlight how the approximation accuracy scales with K. Notice that the accuracy improves significantly as K increases. In fact, the approximation already resembles the result of brute-force filtering for moderate K.

C. Color Nonlocal Means
In NLM, f(ı) is the noisy image (corrupted with additive Gaussian noise) and p(ı) is a square patch of pixels around ı. In particular, if the patch is m × m, then the dimension of p(ı) is ρ = 3m² for a color image. A popular trick to accelerate NLM (called PCA-NLM) is to project the patch onto a lower-dimensional space using PCA [21]. We note that PCA was used for reducing the patch dimension (as explained earlier) for both AM and our algorithm.

A visual comparison is provided in Figure 11 for low and high noise, where the superior performance of our method over AM is evident. We have also compared with a state-of-the-art Gaussian denoiser (CPU implementation: https://github.com/cszn/FFDNet) based on deep neural nets [12]. It is not surprising that the result from [12] is better both in terms of PSNR and visual quality. However, we are somewhat faster. Figure 11 also suggests that our method is visually closer to PCA-NLM at both low and high noise levels, albeit with a significant speedup (from minutes down to a few seconds). In Figure 12, we have reported Gaussian denoising results on the Kodak dataset for different noise levels (σ).

Fig. 10. Bilateral filtering of Peppers for different values of K: (a) K = 2; (b) K = 4; (c) K = 8; (d) K = 16; (e) brute-force (σ_s = 10 and σ_r = 40). Notice how the accuracy improves with K.

Fig. 11. Color image denoising (Gaussian noise) at σ = 20 (top) and σ = 63 (bottom): clean/noisy input, PCA-NLM [21], ours (1.7 sec, 31.2 dB, 0.92 at σ = 20; 2.7 sec, 26.8 dB, 0.82 at σ = 63), AM [14], and the deep denoiser [12]. For PCA-NLM, AM, and our method, PCA was used to reduce the range dimension. The number of manifolds was set automatically in the AM code [14]. While the
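The patch construction (ρ = 3m² for an m × m color patch) and the PCA reduction used by PCA-NLM, AM, and our method can be sketched as follows. This is a minimal illustration; the function names, reflection padding, and SVD-based PCA are our choices, not prescribed by the paper.

```python
import numpy as np

def color_patches(img, m):
    """Stack the m x m color patch around every pixel as a row vector,
    giving patches of dimension rho = 3 * m**2 (borders by reflection)."""
    h, w, _ = img.shape
    r = m // 2
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode='reflect')
    rows = [padded[i:i + m, j:j + m, :].ravel()
            for i in range(h) for j in range(w)]
    return np.array(rows)  # shape (h * w, 3 * m**2)

def pca_project(patches, d):
    """Project the patches onto their d leading principal components,
    the dimensionality-reduction trick used by PCA-NLM."""
    centered = patches - patches.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # rows of vt = PCs
    return centered @ vt[:d].T
```

The reduced d-dimensional vectors then play the role of the guide p(ı) in the clustering and filtering steps.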
proposed method and AM perform similarly for small σ, the former is more robust when σ is large. In fact, the denoising performance of AM degrades rapidly with the increase in σ. Another important point is that our result is close to PCA-NLM in terms of PSNR and SSIM for all noise levels.

Fig. 12. Comparison of the denoising performances (PSNR and SSIM [42], averaged over the images from the Kodak dataset) of PCA-NLM [21], Adaptive Manifolds, and the proposed method, at different noise levels σ. The same PCA dimension was used for all methods. The parameters are K = 31, S = 10, and m = 3.

D. Hyperspectral Denoising
We next perform denoising of hyperspectral images using the bilateral filter. For such images, the range dimension (i.e., the number of spectral bands) is high. In Figure 13, denoising results are exclusively shown to highlight the speedup of our approximation over the brute-force implementation for a multiband image. Notice the dramatic reduction in timing compared to the brute-force implementation. Moreover, notice that the denoising quality is in fact quite good for our method (compare the text in the boxed regions).

Fig. 13. Denoising of a hyperspectral image [37] (1340 × – × – bands) using bilateral filtering (σ_s = 5 and σ_r = 100): (a) input; (b) one of the noisy bands (440 nm); the same band after brute-force filtering (– min) in (c) and the proposed method (15 sec) in (d). We have shown one of the bands just for visualization; the filtering was performed on the entire hyperspectral image and not on a band-by-band basis.

Moreover, we also compare with recent optimization-based
denoising methods [44], [45], where the parameters are tuned accordingly. Visual and quantitative comparisons are shown in Figure 15 (Pavia dataset). For quantitative comparisons, we have used MPSNR and MSSIM, which are simply the PSNR and SSIM values averaged over the spectral bands. We notice that the restoration obtained using our method is better than [44], [45] (source code made public by the authors), which is supported by the metrics shown in the figures. The same is visually evident from a comparison of the boxed regions in Figure 15. In particular, the color is not restored properly in [44], and grains can be seen in [45]. As expected, we are much faster than these iterative methods, since we perform the filtering in one shot.

Fig. 14. Nonlocal means denoising of a low-light image using [43] and the proposed method, with and without infrared data [33]: (a) infrared input; (b) noisy low-light input; (c) without infrared; (d) with infrared data; (e) [43]. Notice that we obtain better denoising by taking the infrared information into account (see zoomed sections).

Fig. 15. Hyperspectral denoising of a natural image (– × – × – bands) corrupted with Gaussian noise (σ = 0.–): (a) clean/noisy; (b) proposed; (c) [44]; (d) [45]. For our method, σ_s = 3, σ_r = 100, and K = 32. We used – iterations for [44] and – iteration for [45]. The (timing, MPSNR, MSSIM) for (b), (c), and (d) are (36 sec, 31.29 dB, 0.87), (– sec, – dB, –), and (– sec, – dB, –).

E. Low-light Denoising
Finally, we use our fast algorithm for NLM denoising of low-light images using additional infrared data [33]. In Figure 14, we have shown a visual result which compares our method, both with and without infrared data, and [43] (an optimization method). The patch and search window sizes are – × – and – × –. We used an anisotropic Gaussian kernel, where σ_r = 50 for the low-light data and σ_r = 10 for the infrared data. The patch dimension was reduced to six using PCA, and the infrared data was added as the seventh dimension. We used – clusters for our method. In Figure 14, notice that some salient features are lost if we solely use the low-light input as the guide. Instead, if we add the infrared data to the guide, then the restored image is sharper and the features are more apparent compared to [43] and NLM without infrared data.
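The anisotropic range kernel used above (a different σ_r on the PCA dimensions and on the infrared dimension) can be folded into the guide features themselves: dividing each dimension by its σ turns exp(−Σᵢ xᵢ²/2σᵢ²) into a unit-sigma isotropic Gaussian on the scaled features, after which any isotropic implementation applies. A sketch under our own naming assumptions:

```python
import numpy as np

def scaled_guide(lowlight_feats, infrared, sigma_low=50.0, sigma_ir=10.0):
    """Append infrared as an extra guide dimension and divide each dimension
    by its sigma, so that the anisotropic kernel
    exp(-sum_i x_i^2 / (2 sigma_i^2)) becomes a unit-sigma isotropic
    Gaussian on the scaled features. lowlight_feats: N x d, infrared: N."""
    guide = np.column_stack([lowlight_feats, infrared])
    sigmas = np.concatenate([np.full(lowlight_feats.shape[1], sigma_low),
                             [sigma_ir]])
    return guide / sigmas
```

This whitening trick is a standard identity for Gaussian kernels, not a detail specific to the paper; it simply lets the same clustering-and-convolution machinery handle the anisotropic case.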
Finally, we apply the proposed method to flow-field denoising. In particular, sharp directional changes in the flow can be preserved much better using NLM, while simultaneously removing the noise (with Gaussian filtering as the baseline). An instance of flow-field denoising [35] is shown in Figure 16 and compared with Gaussian smoothing.

Fig. 16. Denoising of a synthetic flow-field using the proposed NLM approximation and Gaussian smoothing: (a) clean flow-field; (b) noisy flow-field (σ = 7.–); (c) NLM output; (d) Gaussian smoothing. The synthetic flow-field used is u(x, y, z) = x − z e^y, v(x, y, z) = y − x z e^y, and w(x, y, z) = z − x e^y. The patch and search window sizes for NLM are – × – × – and – × – × –, and σ_r = 7.–. We have used PCA to reduce the dimension of the patch from – to –. Notice that isotropic Gaussian smoothing fails to preserve sharp changes in the flow direction.

IV. CONCLUSION

We proposed a framework for fast high-dimensional filtering by approximating both the data and the kernel. In particular, we derived an algorithm that fuses the scalability of the former with the approximation capability of the latter. At the core of our algorithm is the concept of shiftable approximation, which allows us to interpret the coefficients in the framework of Durand et al. [23] from an approximation-theoretic point of view. We proposed an efficient method for determining the shifts (centers) and the coefficients using K-means clustering (inspired by [19]) and data-driven optimization. Though the proposed algorithm is conceptually simple and easy to implement (about – lines of code), it was shown to yield promising results for diverse applications. In particular, our algorithm was shown to be competitive with state-of-the-art methods for fast bilateral and nonlocal means filtering of color images.

ACKNOWLEDGEMENTS

The authors thank the editor and the anonymous reviewers for their comments and suggestions. We also thank the authors of [12], [14], [19], [44] for distributing their code. We also thank Jingxiang Yang for helping us debug the code of [45].

V. APPENDIX
A. Derivation of (15) and (16)

We recall that the basis of the approximation is the following: for x = p(ı − j), j ∈ W, we replace ϕ(x − p(ı)) in (1) and (2) with Σ_{k=1}^{K} c_k(ı) ϕ(x − μ_k). That is, we approximate the numerator of (1) with

Σ_{j ∈ W} ω(j) ( Σ_{k=1}^{K} c_k(ı) ϕ(p(ı − j) − μ_k) ) f(ı − j),   (22)

and (2) with

Σ_{j ∈ W} ω(j) ( Σ_{k=1}^{K} c_k(ı) ϕ(p(ı − j) − μ_k) ).   (23)

Exchanging the sums, we can write (22) as

Σ_{k=1}^{K} c_k(ı) Σ_{j ∈ W} ω(j) ϕ(p(ı − j) − μ_k) f(ı − j) = Σ_{k=1}^{K} c_k(ı) v_k(ı),

where v_k is defined in (13). Similarly, we can write (23) as Σ_{k=1}^{K} c_k(ı) r_k(ı), where r_k is defined in (14). This completes the derivation of (15) and (16).

B. Proof of Theorem 1
For notational convenience, let g(ı) = ξ(ı)/η(ı), where

ξ(ı) = Σ_{j ∈ W} ω(j) ϕ(p(ı − j) − p(ı)) f(ı − j),

and η is as defined in (2). For some fixed ı ∈ Ω, assume that p(ı) ∈ C_s, where 1 ≤ s ≤ K. Then, following (18),

ĝ(ı) = v_s(ı)/r_s(ı).   (24)

Note that we can write

ĝ(ı) − g(ı) = (1/η(ı)) ( ĝ(ı) (η(ı) − r_s(ı)) + (v_s(ı) − ξ(ı)) ).

By the triangle inequality, we can bound ‖ĝ(ı) − g(ı)‖ using

(1/|η(ı)|) ( √n R |η(ı) − r_s(ı)| + ‖v_s(ı) − ξ(ı)‖ ),   (25)

where we have used the fact that ĝ(ı) ∈ [0, R]^n. This follows from (13), (14), and (24). Now,

v_s(ı) − ξ(ı) = Σ_{j ∈ W} ω(j) δ(ı, j) f(ı − j),

where δ(ı, j) = ϕ(p(ı − j) − p(ı)) − ϕ(p(ı − j) − μ_s). By the Lipschitz property,

|δ(ı, j)| ≤ L ‖p(ı) − μ_s‖.

Note that we can assume that each ω(j) is in [0, 1]. This is simply because the weights appear both in the numerator and denominator in (1) and (24). Moreover, since the range of f is [0, R]^n, using the triangle inequality again, we obtain

‖v_s(ı) − ξ(ı)‖ ≤ L |W| √n R ‖p(ı) − μ_s‖.   (26)

Similarly,

|η(ı) − r_s(ı)| ≤ L |W| ‖p(ı) − μ_s‖.   (27)

Now, since ω and ϕ are non-negative, η(ı) ≥ ω(0) ϕ(0). Using the fact that 1/|η(ı)| ≤ 1/(ω(0) ϕ(0)) along with (25), (26), and (27), we obtain

‖ĝ(ı) − g(ı)‖ ≤ C |W| √n L ‖p(ı) − μ_s‖,   (28)

where C = 2R/(ω(0) ϕ(0)). On squaring (28) and summing over all pixels, we arrive at (19).

REFERENCES

[1] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629–639, 1990.
[2] V. Aurich and J. Weule, "Non-linear Gaussian filters performing edge preserving diffusion," Mustererkennung, pp. 538–545, 1995.
[3] S. M. Smith and J. M. Brady, "SUSAN: A new approach to low level image processing," International Journal of Computer Vision, vol. 23, no. 1, pp. 45–78, 1997.
[4] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," Proc. IEEE International Conference on Computer Vision, pp. 839–846, 1998.
[5] D. Comaniciu and P. Meer, "Mean shift analysis and applications," Proc. IEEE International Conference on Computer Vision, vol. 2, pp. 1197–1203, 1999.
[6] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, "Edge-preserving decompositions for multi-scale tone and detail manipulation," ACM Transactions on Graphics, vol. 27, no. 3, p. 67, 2008.
[7] E. S. Gastal and M. M. Oliveira, "Domain transform for edge-aware image and video processing," ACM Transactions on Graphics, vol. 30, no. 4, pp. 69:1–69:12, 2011.
[8] K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397–1409, 2013.
[9] A. Buades, B. Coll, and J.-M. Morel, "A non-local algorithm for image denoising," Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65, 2005.
[10] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, "Bilateral filtering: Theory and applications," Foundations and Trends in Computer Graphics and Vision, vol. 4, no. 1, pp. 1–73, 2009.
[11] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian et al., "Image denoising with block-matching and 3D filtering," Proceedings of SPIE, vol. 6064, p. 606414, 2006.
[12] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising," IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, 2017.
[13] J. Darbon, A. Cunha, T. F. Chan, S. Osher, and G. J. Jensen, "Fast nonlocal filtering applied to electron cryomicroscopy," Proc. IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 1331–1334, 2008.
[14] E. S. Gastal and M. M. Oliveira, "Adaptive manifolds for real-time high-dimensional filtering," ACM Transactions on Graphics, vol. 31, no. 4, pp. 33:1–33:13, 2012.
[15] J. Wang, Y. Guo, Y. Ying, Y. Liu, and Q. Peng, "Fast non-local algorithm for image denoising," Proc. IEEE International Conference on Image Processing, pp. 1429–1432, 2006.
[16] J. Chen, S. Paris, and F. Durand, "Real-time edge-aware image processing with the bilateral grid," ACM Transactions on Graphics, vol. 26, no. 3, p. 103, 2007.
[17] A. Adams, J. Baek, and M. A. Davis, "Fast high-dimensional filtering using the permutohedral lattice," Computer Graphics Forum, vol. 29, no. 2, pp. 753–762, 2010.
[18] A. Adams, N. Gelfand, J. Dolson, and M. Levoy, "Gaussian kd-trees for fast high-dimensional filtering," ACM Transactions on Graphics, vol. 28, no. 3, p. 21, 2009.
[19] M. G. Mozerov and J. Van De Weijer, "Global color sparseness and a local statistics prior for fast bilateral filtering," IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5842–5853, 2015.
[20] S. Paris and F. Durand, "A fast approximation of the bilateral filter using a signal processing approach," Proc. European Conference on Computer Vision, pp. 568–580, 2006.
[21] T. Tasdizen, "Principal neighborhood dictionaries for nonlocal means image denoising," IEEE Transactions on Image Processing, vol. 18, no. 12, pp. 2649–2660, 2009.
[22] K. N. Chaudhury, D. Sage, and M. Unser, "Fast O(1) bilateral filtering using trigonometric range kernels," IEEE Transactions on Image Processing, vol. 20, no. 12, pp. 3376–3382, 2011.
[23] F. Durand and J. Dorsey, "Fast bilateral filtering for the display of high-dynamic-range images," ACM Transactions on Graphics, vol. 21, no. 3, pp. 257–266, 2002.
[24] F. Porikli, "Constant time O(1) bilateral filtering," Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, 2008.
[25] K. Sugimoto and S. I. Kamata, "Compressive bilateral filtering," IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3357–3369, 2015.
[26] Q. Yang, K. H. Tan, and N. Ahuja, "Real-time O(1) bilateral filtering," Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 557–564, 2009.
[27] Q. Yang, N. Ahuja, and K.-H. Tan, "Constant time median and bilateral filtering," International Journal of Computer Vision, vol. 112, no. 3, pp. 307–318, 2015.
[28] K. N. Chaudhury and S. D. Dabhade, "Fast and provably accurate bilateral filtering," IEEE Transactions on Image Processing, vol. 25, no. 6, pp. 2519–2528, 2016.
[29] S. Ghosh and K. N. Chaudhury, "Fast bilateral filtering of vector-valued images," Proc. IEEE International Conference on Image Processing, pp. 1823–1827, 2016.
[30] S. Ghosh and K. N. Chaudhury, "Color bilateral filtering using stratified Fourier sampling," Proc. Global Conference on Signal and Information Processing, 2018.
[31] K. N. Chaudhury, "Acceleration of the shiftable O(1) algorithm for bilateral filtering and nonlocal means," IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1291–1300, 2013.
[32] P. Nair and K. N. Chaudhury, "Fast high-dimensional filtering using clustering," Proc. IEEE International Conference on Image Processing, pp. 240–244, 2017.
[33] S. Zhuo, X. Zhang, X. Miao, and T. Sim, "Enhancing low light images using near infrared flash images," Proc. IEEE International Conference on Image Processing, pp. 2537–2540, 2010.
[34] Q. Yuan, L. Zhang, and H. Shen, "Hyperspectral image denoising employing a spectral-spatial adaptive total variation model," IEEE Transactions on Geoscience and Remote Sensing, vol. 50, no. 10, pp. 3660–3677, 2012.
[35] M. A. Westenberg and T. Ertl, "Denoising 2-D vector fields by vector wavelet thresholding," Journal of WSCG, pp. 33–40, 2005.
[36] Q. Yang, "Hardware-efficient bilateral filtering for stereo matching," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 5, pp. 1026–1032, 2014.
[37] D. H. Foster, K. Amano, S. M. C. Nascimento, and M. J. Foster, "Frequency of metamerism in natural scenes," Journal of the Optical Society of America A, vol. 23, pp. 2359–2372, 2006.
[38] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Addison-Wesley Longman Publishing Co., Inc., 2005.
[39] R. Deriche, "Recursively implementating the Gaussian and its derivatives," INRIA, Research Report RR-1893, 1993.
[40] I. T. Young and L. J. Van Vliet, "Recursive implementation of the Gaussian filter," Signal Processing, vol. 44, no. 2, pp. 139–151, 1995.
[41] "Hyperspectral image database." [Online]. Available: http://lesun.weebly.com/hyperspectral-data-set.html
[42] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[43] X. Yang and X. Lin, "Brightening and denoising lowlight images," Proc. International Conference on Information, Communications and Signal Processing, pp. 1–5, 2015.
[44] F. Fan, Y. Ma, C. Li, X. Mei, J. Huang, and J. Ma, "Hyperspectral image denoising with superpixel segmentation and low-rank representation," Information Sciences, vol. 397, pp. 48–68, 2017.
[45] Y.-Q. Zhao and J. Yang, "Hyperspectral image denoising via sparse representation and low-rank constraint,"