Signal reconstruction via operator guiding
Andrew Knyazev
Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, 8th floor, Cambridge, MA 02139-1955. Email: [email protected]
Alexander Malyshev
University of Bergen, Department of Mathematics, Postbox 7803, 5020 Bergen, Norway. Email: [email protected]
Abstract—Signal reconstruction from a sample using an orthogonal projector onto a guiding subspace is theoretically well justified, but may be difficult to implement in practice. We propose more general guiding operators, which increase signal components in the guiding subspace relative to those in a complementary subspace, e.g., iterative low-pass edge-preserving filters for super-resolution of images. Two examples of super-resolution illustrate our technology: a no-flash RGB photo guided using a high resolution flash RGB photo, and a depth image guided using a high resolution RGB photo.
I. INTRODUCTION
Super-resolution (SR) refers to techniques that reconstruct high-resolution (HR) images from low-resolution (LR) images [1], [2], [3], [4], [5], determining high-frequency components and removing degradation caused by image acquisition in LR cameras. Single-image edge-preserving SR by interpolation, called the zooming problem in [6], [7], is not well-posed due to the ambiguity of determining the high-frequency components needed to preserve edges in the reconstructed HR image. To remove the ambiguity and make the SR problem well-posed, one can introduce a guiding subspace, determined by frames, or via the action of an orthogonal projector onto it; see, e.g., [8], [9].

Frame-less guided SR using an orthogonal projector onto the guiding subspace is theoretically well justified, e.g., in [8], [9], but may be difficult in practice, where even the guiding subspace itself may not be explicitly known. We extend the approach of [8], [9] to guiding operators, which increase signal components in the guiding subspace relative to those in a complementary subspace. As noticed in [9], traditional Tikhonov regularization may be substituted with pre- and post-smoothing of the sample-consistent reconstruction in the case of noisy samples. We adopt a similar approach and provide a mathematical justification, alternative to [9], covering general guiding operators, not necessarily orthogonal projectors.

We illustrate our sample-consistent operator-guided signal reconstruction with pre- and post-smoothing for imaging. An HR image is reconstructed from a noisy LR image using an HR image of the same scene, but in a different modality, as guidance in setting up an edge-preserving denoising filter. Several authors study a corresponding application, where the LR image is a depth image and the HR guidance image is an RGB image of the same scene; e.g., see [10], [11], [12], [13].

II. NOTATION AND PRIOR WORK
A sampled (degraded) signal vector y in signal processing is often represented by the linear model

y = Ax + n, (1)

where the vector x is the latent signal (image) and n is a noise vector, e.g., consisting of independent and identically distributed zero-mean noise. The matrix A is typically a product of down-sampling (e.g., decimation) and blurring operators.

The most frequently used restoration method is given by the Tikhonov regularization model, see, e.g., [4], [7], [14],

min_x ||Ax − y||_ω^2 + ρ R(x), (2)

where the functional R(x) is called a regularization term and ρ > 0 is a regularization parameter. The regularization term R(x) aims at bounding and smoothing the solution x of (2). The weighted norm ||·||_ω is defined by ||x||_ω^2 = x^T ω x, where ω is a positive definite matrix. If A = I is the identity matrix, then (2) describes denoising.

Signal (image) processing often uses total variation due to its remarkable ability to preserve contours/edges of signals. The total variation term R(x) approximates ||∇x||_1; see, e.g., [7], [11], [14], [15]. Total variation filtering analogs can be set up in a framework of graph-based signal processing, e.g., by setting R(x) = x^T L(g) x, where L(g) is a graph Laplacian matrix guided by a signal g, e.g., [16], that we define next.

A signal (image) is interpreted as an intensity function on N vertices V of a weighted graph G = (V, E, W) consisting of a finite set V of vertices (e.g., representing image pixels) and a finite set E ⊂ V × V of edges (i, j) with typically nonnegative (cf. [17], [18]) weights W(i, j), which measure similarity between vertices i and j in the graph. The intensity values of the signal form the vector g = [g_1, ..., g_N]^T, where the vertices V are arbitrarily numbered.
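To make the guided regularizer concrete, here is a minimal 1-D sketch (not from the paper) that builds a guided graph Laplacian L(g) = D − W, as defined in the following paragraphs, using Gaussian nearest-neighbor similarity weights, and solves the denoising case A = I of (2) with R(x) = x^T L(g) x, whose normal equation reads (I + ρL)x = y. The weight formula, σ, and ρ below are illustrative assumptions.

```python
import numpy as np

def guided_laplacian(g, sigma=0.1):
    """Graph Laplacian L(g) = D - W for a 1-D signal, with Gaussian similarity
    weights W(i, j) = exp(-(g_i - g_j)^2 / sigma^2) on nearest neighbours."""
    n = len(g)
    W = np.zeros((n, n))
    for i in range(n - 1):
        w = np.exp(-(g[i] - g[i + 1]) ** 2 / sigma ** 2)
        W[i, i + 1] = W[i + 1, i] = w      # symmetric adjacency
    D = np.diag(W.sum(axis=1))             # degree matrix
    return D - W

rng = np.random.default_rng(0)
g = np.repeat([0.0, 1.0], 50)              # clean piecewise-constant guide
y = g + 0.05 * rng.standard_normal(100)    # noisy signal sharing the same edge

# Denoising (A = I): the normal equation of (2) with R(x) = x^T L(g) x
# is (I + rho * L(g)) x = y.
L = guided_laplacian(g)
x = np.linalg.solve(np.eye(100) + 10.0 * L, y)
```

Because the weight across the edge of g is essentially zero, the two segments are smoothed almost independently, which is exactly the edge-preserving behavior the guidance is meant to provide.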
The similarity weights form a symmetric N × N graph adjacency matrix W = W(g). The row sums d = W 1_N, where 1_N is the vector of ones, define the diagonal N × N degree matrix D = diag(d) of the graph. The property d = 1_N, i.e., D = I, desirable for smoothing filters, can be ensured by scaling W with positive diagonal matrix multipliers via Sinkhorn's algorithm; e.g., [19]. Alternatively, W can be substituted by the normalized graph adjacency matrix D^{-1} W, although the latter is technically non-symmetric.

The symmetric positive semidefinite graph Laplacian matrix L = L(g) is then defined as L = D − W. In graph-based signal processing, eigenvectors of L serve as generalizations of the basis functions of the Discrete Cosine Transform. If the guidance image g, used to determine L(g), is aligned with a noisy image x and shares the same edges, edge-preserving denoising of x can be performed using filters based on L(g). For example, the multiplication L(g)x amplifies the spectral components of x corresponding to large eigenvalues of L and thus can be viewed as an approximate high-pass filter; cf. [20].

With R(x) = x^T L(g) x, the normal equations for the optimization problem (2) become

A^T ω (Ax − y) + ρ L(g) x = 0, (3)

where L(g) is the graph Laplacian operator with a guidance signal g. A self-guided choice g = x is nonlinear; e.g., [16].

Graph-based interpretations of denoising filters are common. Classical bilateral and total variation filters for image denoising are usually constructed via explicit formulas for the weights W(i, j) on a priori determined graph edges E; e.g., [12], [16]. The more recent guided filter [20] is defined directly via an explicit description of its action W(g)x on a given image x, while its edges E and formulas for the weights W(i, j) are derived for theoretical purposes only.
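The Sinkhorn-type scaling mentioned above can be sketched as follows; the symmetric fixed-point iteration e ← sqrt(e / (W e)) is one common variant and is our assumption here, not a detail taken from the paper.

```python
import numpy as np

def sinkhorn_symmetric(W, iters=500):
    """Symmetric diagonal scaling diag(e) W diag(e) so that every row sum
    becomes (approximately) one, i.e. d = 1_N and D = I."""
    e = np.ones(W.shape[0])
    for _ in range(iters):
        e = np.sqrt(e / (W @ e))     # fixed point: e_i * (W e)_i = 1
    return W * np.outer(e, e)        # diag(e) @ W @ diag(e)

rng = np.random.default_rng(0)
A = rng.uniform(0.25, 0.75, (8, 8))
W = A + A.T                          # symmetric adjacency with positive weights
Ws = sinkhorn_symmetric(W)           # scaled adjacency with unit row sums
```

The scaled matrix stays symmetric, so the resulting Laplacian is simply L = I − Ws.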
Sophisticated filters, e.g., BM3D [21], are determined exclusively by functions that implement their action, so it may be difficult to obtain explicit formulas for their weights W(i, j). Exact low-pass filters, where orthogonal projectors represent W and W 1_N = 1_N, also fit the graph-based framework, although some weights W(i, j) are negative.

System (3) with the weight matrix ω = I + βL(g), where β ≥ 0 is a parameter, is used in [19]. The resulting system

(A^T [I + βL] A + ρL) x = A^T [I + βL] y

is solved by the conjugate gradient (CG) method; see [19].

Connecting the Tikhonov regularization (2) to frame-based reconstruction, one can set ω = (AA^T)^+, where the operation + denotes the Moore-Penrose pseudoinverse, which gives A^+ = A^T (AA^T)^+ and turns (3) into

A^+ (Ax − y) + ρ L(g) x = 0. (4)

Denoting s = A^+ y and introducing the orthogonal projector S = A^+ A, we equivalently rewrite equation (4) as

Sx − s + ρ L(g) x = 0, (5)

which is the normal equation for the Tikhonov regularization

min_{x ∈ R^N} ||Sx − s||^2 + ρ x^T L x, (6)

a particular case of optimization problem (2).

The authors of [8], [9] investigate the case of problem (6) where L = I − W is an orthogonal projector. In particular, they prove that when the regularization parameter ρ vanishes, the unconstrained minimization (6) reduces to

min_x x^T L x, subject to Sx − s = 0, (7)

which can be interpreted as a graph-harmonic sample-consistent signal extension, since the quadratic form x^T L x can be viewed as the energy of the signal x, defined by the graph Laplacian L. Since S is an orthogonal projector, we have Sx = s = Ss, and the orthogonal decomposition x = s + (I − S)x helps to show that the minimizer x in (7) solves the system

(I − S) L (I − S) x = −(I − S) L s, (8)

where the matrix (I − S) L (I − S) is symmetric positive semidefinite.
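The stated properties of S = A^+ A are easy to check numerically; a small sketch with a random full-row-rank sampling matrix (our illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 40))   # "fat" sampling matrix with full row rank
Ap = np.linalg.pinv(A)              # Moore-Penrose pseudoinverse A^+
S = Ap @ A                          # the orthogonal projector S = A^+ A

y = rng.standard_normal(10)
s = Ap @ y                          # sample-consistent signal: A s = y, S s = s
```

S is symmetric and idempotent, and s = A^+ y is the minimum-norm signal consistent with the sample y, which is what equation (5) builds on.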
The special structure of equation (8) allows applying the CG method to the linear system

(I − S) L u = −(I − S) L s, (9)

with an initial approximation from the null-space Null(S) of S, to iteratively approximate the solution u within the subspace Null(S). The sample-consistent reconstruction x in (7) is then given by the orthogonal sum x = s + u; see [8], [9] for details.

The authors of [21] propose solving a self-guided nonlinear version of (7) via the simple iteration

x_0 = s, x_{i+1} = s + (I − S) BM3D(x_i), (10)

where BM3D(x_i) is an application of the BM3D filter to x_i.

Post-processing by αx + (1 − α)Wx with α = 1/(ρ + 1) is proved in [9] to solve the Tikhonov regularization problem (6). Since α ≈ 1 − ρ for small ρ, the post-processing is approximated by x − ρLx when ρ → 0. The corresponding arguments in [9] rely on the assumptions that L = I − W and that W is an orthogonal projector. We provide an alternative analysis, dropping these assumptions, in the next section.

III. TIKHONOV'S REGULARIZATION DEMYSTIFIED
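A minimal numerical sketch of the sample-consistent reconstruction x = s + u via CG on (9): a path-graph Laplacian stands in for L, and every 5th coordinate is treated as sampled — both illustrative assumptions, not the paper's setup.

```python
import numpy as np

def cg(op, b, m, tol=1e-12):
    """m iterations of plain conjugate gradients for op(u) = b, from u = 0."""
    u = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    for _ in range(m):
        if np.sqrt(rr) < tol:       # stop once converged to avoid 0/0
            break
        q = op(p)
        alpha = rr / (p @ q)
        u += alpha * p
        r -= alpha * q
        rr, rr_old = r @ r, rr
        p = r + (rr / rr_old) * p
    return u

n = 50
# Path-graph Laplacian: a simple symmetric positive semidefinite L.
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0

sampled = np.zeros(n, dtype=bool)
sampled[::5] = True                 # sampled coordinates

def S(v):                           # orthogonal projector onto the samples
    w = np.zeros_like(v)
    w[sampled] = v[sampled]
    return w

def Pc(v):                          # complementary projector I - S
    return v - S(v)

rng = np.random.default_rng(2)
s = S(rng.standard_normal(n))       # sample-consistent data, S s = s

# Solve (9): (I - S) L u = -(I - S) L s for u in Null(S); then x = s + u.
u = cg(lambda v: Pc(L @ v), -Pc(L @ s), m=100)
x = s + u                           # graph-harmonic sample-consistent extension
```

Since the right-hand side lies in Null(S) and the operator maps Null(S) into itself, all CG iterates stay in Null(S), so the final x keeps the samples exactly.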
Equation (5) implies (I − S) L x = 0, i.e.,

(I − S) L (I − S) x = −(I − S) L S x, (11)

which differs from (8) only in the right-hand side, since the sample consistency Sx = s is not enforced in (5).

In an orthonormal basis of R^N, where

S = [0 0; 0 I], L = [L_11 L_12; L_21 L_22], x = [x_1; x_2], s = [0; s_2],

equation (5) takes the following block form,

[ρL_11 ρL_12; ρL_21 I + ρL_22] [x_1; x_2] = [0; s_2]. (12)

Assuming that the block L_11 is invertible, we solve the equation L_11 x_1 = −L_12 x_2, equivalent to (11), for x_1 = −L_11^{-1} L_12 x_2, and then for x_2 we obtain the equation (I + ρ L/L_11) x_2 = s_2, where

L/L_11 = L_22 − L_21 L_11^{-1} L_12

is the Schur complement of the block L_11 in the matrix L.

If ρ → 0, we conclude that x_2 → s_2 and x_1 → −L_11^{-1} L_12 s_2, so that the limit vector x solves (7) and (8), where x_2 = s_2, i.e., Sx = s. For small ρ, we have x_2 ≈ s_2 − ρ (L/L_11) s_2, i.e., the solution x of Tikhonov's regularization equation (5) depends linearly on ρ. The minuend L_22 s_2 in (L/L_11) s_2 corresponds to the vector SLs. Thus, rather than solving (5) directly, for small ρ one can try sample-consistent reconstruction via solving (7) or (8), but with an a priori denoised sampled signal s − ρSLs.

IV. THE PROBLEM AND PROPOSED ALGORITHMS
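The block elimination above can be verified numerically; in this sketch a random symmetric positive definite matrix stands in for L (an assumption for illustration; a graph Laplacian is only semidefinite, and we avoid the singular corner case).

```python
import numpy as np

rng = np.random.default_rng(3)
n1, n2 = 6, 4
M = rng.standard_normal((n1 + n2, n1 + n2))
L = M @ M.T                                  # SPD stand-in for the Laplacian
L11, L12 = L[:n1, :n1], L[:n1, n1:]
L21, L22 = L[n1:, :n1], L[n1:, n1:]

S = np.zeros((n1 + n2, n1 + n2))
S[n1:, n1:] = np.eye(n2)                     # projector onto the sampled block
s2 = rng.standard_normal(n2)
s = np.concatenate([np.zeros(n1), s2])       # s in the range of S

rho = 1e-3
x = np.linalg.solve(S + rho * L, s)          # Tikhonov normal equation (5)
x1, x2 = x[:n1], x[n1:]

schur = L22 - L21 @ np.linalg.solve(L11, L12)   # Schur complement L/L11
```

Here x_2 solves (I + ρ L/L_11) x_2 = s_2 exactly, x_1 = −L_11^{-1} L_12 x_2, and for small ρ the computed x_2 is close to s_2, matching the limit analysis.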
We are given two images of the same scene, but in different modalities. The first image y has low resolution and may be noisy. The second image g is of high resolution and is noise-free. The images are registered (or aligned) by means of a linear downsampling transform, that is, there is a well-conditioned matrix A such that the images Ag and y are aligned. We want to reconstruct an image x of the same resolution as g from the image y, so that the image g serves as a guidance image in the regularization term of the Tikhonov regularization reconstruction model (4).

In contrast to the traditional approach of directly solving (4), we use our arguments above to take the sample-consistent reconstruction approach (7) from [8] as a starting point, but for a more general case where the Laplacian L is an arbitrary positive semi-definite operator, not necessarily an orthogonal projector. We notice that the reduction of (7) to (9) proposed in [8], [9] also works for this general case, so we solve (9).

Noisy LR samples y should evidently be smoothed prior to computing the sample-consistent HR reconstruction x by (9). Dealing with noise, we always pre-process y via denoising, by analogy with y − ρALA^+y proposed in Section III. We also find it helpful in some tests to post-process our sample-consistent reconstruction x by subtracting highly oscillatory contributions, similar to x − ρLx described in Section II.

Let CG_m(𝒜, v) denote a function which implements m iterations of CG to solve the equation 𝒜(u) = v with a given linear operator 𝒜(·). The main cost per iteration is the cost of an evaluation of 𝒜(u). Our reconstruction method is as follows.

Algorithm 1
Operator guided super-resolution by CG
Input: sample y, downsampling operator A, guiding operator L, number of iterations m.
Output: super-resolution reconstruction x of y.
Define the operator 𝒜(u) = (I − A^+A)Lu.
Choose an initial approximation x_0 satisfying Ax_0 = y.
Compute x = x_0 − CG_m(𝒜, 𝒜(x_0)).

The level x^T 1_N of the DC component 1_N in the reconstructed HR signal x can be adjusted during post-processing to match that in the LR sample y, if necessary.

Compared to [8], where W and L = I − W are assumed to be orthogonal projectors, our approach allows choosing more general smoothing filters W, but there are still limitations:
• Negative entries in the filter matrix W are allowed, but the resulting Laplacian L = D − W needs to be symmetric positive semidefinite, in order for the minimization of its quadratic form x^T L x in (7) to make sense, and to satisfy the assumptions for CG convergence in Algorithm 1.
• DC-invariant filters, i.e., those satisfying W 1_N = 1_N, lead to D = I and thus to the trivial construction of L = I − W, such as, e.g., the image guided filter of [20], implemented in the MATLAB Image Processing Toolbox, and the total variation filter presented in [16]. Otherwise, the filter DC-component evaluation d = W 1_N is needed to determine D = diag(d) in L = D − W.
• Iterative filters are allowed: e.g., a DC-invariant smoothing filter W, satisfying W 1_N = 1_N, generates the DC-invariant polynomial filter P_n(W), where P_n(·) is a polynomial of degree n satisfying P_n(1) = 1, for example, P_2(W) = W^2. However, the polynomial P_n(·) needs to remain fixed during the CG iterations.
• Self-guided filters lead to nonlinear operators 𝒜(·), requiring special care, such as proposed in [16].

V. NUMERICAL EXPERIMENTS

a) Super-resolution for flash-no flash images:
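A minimal 1-D sketch of Algorithm 1 (not the paper's MATLAB implementation): the downsampling A keeps every k-th sample, so A^+ = A^T, and a periodic moving average serves as the DC-invariant smoothing filter W with L = I − W; the grid size, k, and the choice of W are illustrative assumptions.

```python
import numpy as np

def cg(op, b, m, tol=1e-12):
    """CG_m(op, b): m conjugate-gradient iterations for op(u) = b, from u = 0."""
    u = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    for _ in range(m):
        if np.sqrt(rr) < tol:         # stop once converged to avoid 0/0
            break
        q = op(p)
        alpha = rr / (p @ q)
        u += alpha * p
        r -= alpha * q
        rr, rr_old = r @ r, rr
        p = r + (rr / rr_old) * p
    return u

n, k = 64, 4
idx = np.arange(0, n, k)              # keep every k-th pixel

def A(v):                             # downsampling operator
    return v[idx]

def Apinv(w):                         # A^+ = A^T for a selection operator
    v = np.zeros(n)
    v[idx] = w
    return v

def Pcomp(v):                         # I - A^+ A: zero out the sampled entries
    w = v.copy()
    w[idx] = 0.0
    return w

def W(v):                             # periodic moving-average smoother, W 1 = 1
    return (np.roll(v, -1) + v + np.roll(v, 1)) / 3.0

Lop = lambda v: v - W(v)              # guiding Laplacian L = I - W (here D = I)

x_true = np.sin(2 * np.pi * np.arange(n) / n)
y = A(x_true)                         # noise-free LR sample

# Algorithm 1: x = x0 - CG_m(calA, calA(x0)) with calA(u) = (I - A^+ A) L u.
calA = lambda u: Pcomp(Lop(u))
x0 = Apinv(y)                         # initial approximation with A x0 = y
x = x0 - cg(calA, calA(x0), m=60)
```

The CG correction lies in Null(A^+A), so the output x reproduces the LR sample exactly while the guiding operator fills in the missing pixels.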
We have carried out numerical tests in MATLAB with two registered color images from MATLAB's own directory, toysflash.png and toysnoflash.png, taken respectively with and without flash. The first image is our HR guidance image. We downsample the second image by a factor of 4 in both dimensions, choosing every 4th pixel with the command image(1:4:end,1:4:end). The result of the downsampling is our LR image. Both images are scaled so that their intensities lie in the range [0, 1]. Figures 1 and 2 display the guidance and the ground truth images. The two images of the same scene have different modalities owing to big differences in the color representations of the objects. For example, the yellow ball looks white in the flash image.

To produce a noisy downsampled LR image, Gaussian noise with the default parameters, zero mean and variance 0.01, has been added to the LR image. The degraded image is displayed in Figure 3. We pre-smooth the noisy LR image by the image guided filter with the default parameters and then apply Algorithm 1 to the pre-smoothed image.

We use a single application of the guided filter function imguidedfilter from the MATLAB Image Processing Toolbox as the smoothing filter W(g) for color images. The relative residual is reached after m = 20 CG iterations, each requiring one application of imguidedfilter. Figure 4 shows the reconstructed image, which has the same desired colors as the ground truth image. The PSNR with respect to the ground truth HR image equals 24.32.

b) Super-resolution of a depth image: Algorithm 1 has also been evaluated on the
Art image set from the simulated Middlebury 2007 data sets, extensively used in [11]. The guidance image is the gray component of the HR RGB image Art in Figure 5. The LR noisy depth image is shown in Figure 6. The upsampling shown in Figure 7 has been computed by the code used in [11], which implements Tikhonov regularization by the generalized total variation.

To produce Figure 8, we use the guided Total Variation (TV) filter described in [16] as the guided smoothing filter in Algorithm 1, as well as during LR pre-smoothing and HR post-smoothing. The same smoothing parameter is used at all three stages, i.e., inside Algorithm 1 as well as for the pre- and post-smoothing, with 1/480/850 function evaluations, correspondingly. The number m of CG iterations is chosen to achieve the prescribed relative residual.

Fig. 1. HR guidance image toysflash.
Fig. 2. HR ground truth image toysnoflash.

The comparison Figure 7 has a smaller PSNR, but is a bit less noisy and has sharper edges than our Figure 8, as the generalized TV filter used in [11] is a more powerful smoother compared to the TV filter from [16] that we use in these tests. However, Figure 7 has noticeable artifacts, clearly coming from sharp edges in the RGB Figure 5, which are absent in the LR Figure 6. In our approach, sharp edges in the guidance image g only affect the filter weights and thus cannot possibly pollute our reconstruction, as confirmed in Figure 8.

VI. CONCLUSION
We propose algorithms for guided iterative reconstruction of signals, illustrated with the reconstruction of a higher resolution image from a lower resolution noisy sample. Guiding is given by a smoothing filter and plays the role of regularization. Least squares minimization of the data term ||Ax − y||^2 is substituted by the equality constraint Ax = y, allowing one to control sample consistency and eliminating the problem of choosing the regularization parameter in Tikhonov's regularization.

Fig. 3. Noisy LR image.
Fig. 4. Our reconstructed image, PSNR = 24.32.
Noisy lower resolution samples are pre-smoothed. The obtained sample-consistent high resolution reconstruction can be post-smoothed, if needed. Iterations are based on conjugate gradients, giving optimal performance. General smoothing guidance filters allow flexibility in designing reconstructions with desirable properties. Initial numerical experiments demonstrate the feasibility of the proposed technology for super-resolution of flash-no flash images and for depth reconstruction guided by RGB images.

Fig. 5. High resolution RGB image Art.
Fig. 6. Low resolution noisy depth image Art.
Fig. 7. Upsampled depth image Art, using [11].
Fig. 8. Upsampled depth image Art. Our result.

REFERENCES

[1] S. C. Park, M. K. Park, and M. G. Kang, "Super-resolution image reconstruction: a technical overview," IEEE Signal Processing Magazine, vol. 20, no. 3, pp. 21–36, 2003.
[2] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, "Advances and challenges in super-resolution," International J. Imaging Systems and Technology, vol. 14, no. 2, pp. 47–57, 2004.
[3] M. Protter, M. Elad, H. Takeda, and P. Milanfar, "Generalizing the non-local means to super-resolution reconstruction," IEEE Trans. Image Process., vol. 18, no. 1, pp. 36–51, 2009.
[4] P. Milanfar, Super-resolution Imaging, CRC Press, USA, 2010.
[5] Y. Romano, J. Isidoro, and P. Milanfar, "RAISR: Rapid and accurate image super resolution," IEEE Trans. Computational Imaging, vol. 3, no. 1, pp. 110–125, 2017.
[6] A. Chambolle, "An algorithm for total variation minimization and applications," J. Math. Imaging and Vision, vol. 20, pp. 89–97, 2004.
[7] A. Chambolle, V. Caselles, D. Cremers, M. Novaga, and T. Pock, "An introduction to total variation for image analysis," in Theoretical Foundations and Numerical Methods for Sparse Recovery, M. Fornasier, Ed., vol. 5, pp. 263–340. Radon Series on Comp. Appl. Math., 2010.
[8] A. Gadde, A. Knyazev, D. Tian, and H. Mansour, "Guided signal reconstruction with application to image magnification," in IEEE Global Conf. Signal and Information Processing, 2015, pp. 938–942.
[9] A. Knyazev, A. Gadde, H. Mansour, and D. Tian, "Guided signal reconstruction theory," arXiv:1702.00852, 2017.
[10] M. Y. Liu, O. Tuzel, and Y. Taguchi, "Joint geodesic upsampling of depth images," in IEEE Conference on Computer Vision and Pattern Recognition, June 2013, pp. 169–176.
[11] D. Ferstl, C. Reinbacher, R. Ranftl, M. Ruether, and H. Bischof, "Image guided depth upsampling using anisotropic total generalized variation," in IEEE Int. Conf. Computer Vision (ICCV), 2013, pp. 993–1000.
[12] Y. Wang, A. Ortega, D. Tian, and A. Vetro, "A graph-based joint bilateral approach for depth enhancement," in IEEE Int. Conf. Acoustic, Speech and Signal Processing (ICASSP), 2014, pp. 885–889.
[13] U. S. Kamilov and P. T. Boufounos, "Motion-adaptive depth super-resolution," IEEE Trans. Image Process., vol. 26, pp. 1723–1731, 2017.
[14] F. Šroubek, J. Kamenický, and P. Milanfar, "Superfast superresolution," in IEEE Int. Conf. Image Processing (ICIP), 2011, pp. 1177–1180.
[15] M. Lebrun, M. Colom, A. Buades, and J. M. Morel, "Secrets of image denoising cuisine," Acta Numerica, vol. 21, pp. 475–576, 2012.
[16] A. Knyazev and A. Malyshev, "Accelerated graph-based nonlinear denoising filters," Procedia Comp. Sci., vol. 80, pp. 607–616, 2016.
[17] A. Knyazev, "Signed Laplacian for spectral clustering revisited," arXiv:1701.01394, 2017.
[18] A. Knyazev, "Edge-enhancing filters with negative weights," in IEEE Global Conf. Signal and Information Processing, 2015, pp. 260–264.
[19] A. Kheradmand and P. Milanfar, "A general framework for regularized, similarity-based image restoration," IEEE Trans. Image Process., vol. 23, no. 12, pp. 5136–5151, 2014.
[20] K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Trans. Pattern Anal. Machine Intel., vol. 35, no. 6, pp. 1397–1409, 2013.
[21] A. Danielyan, A. Foi, V. Katkovnik, and K. Egiazarian, "Spatially adaptive filtering as regularization in inverse imaging: compressive sensing, super-resolution, and upsampling," in