Regularization with Multilevel Non-stationary Tight Framelets for Image Restoration
Yan-ran Li, Raymond H. F. Chan, Lixin Shen, Xiaosheng Zhuang
Yan-Ran Li∗, Raymond H. F. Chan†, Lixin Shen‡, Xiaosheng Zhuang§

February 9, 2021
Abstract
Variational regularization models are one of the popular and efficient approaches for image restoration. The regularization functional in the model carries prior knowledge about the image to be restored. This prior knowledge, in particular for natural images, comprises first-order (i.e., variance in luminance) and second-order (i.e., contrast and texture) information. In this paper, we propose a model for image restoration, using a multilevel non-stationary tight framelet system that can capture the image's first-order and second-order information. We develop an algorithm to solve the proposed model, and the numerical experiments show that the model is effective and efficient as compared to other higher-order models.
The restoration of a degraded image may be modeled as

    z = Ku + ε,  (1)

where u denotes the unknown image to be recovered, K a blurring matrix, z an observed blurred image, and ε the noise. In general, K is a singular or near-singular matrix, and hence the problem of finding the solution u from model (1) is ill-posed. To overcome the difficulties caused by the ill-posedness, regularization techniques such as total-variation regularization and multiscale regularization are often adopted; see [4, 6, 12, 14, 29] and the references therein. The resulting regularized image models have the following generic form:

    min_u { F(u) + α G(u) },  α > 0,  (2)

where α is the regularization parameter, F represents the data fidelity term, and G the regularization term. The fidelity term F measures the closeness of the estimate obtained from (2) to the data z, while the regularization functional G carries prior knowledge about the image to be restored.

∗ College of Computer Science and Software Engineering, Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen, 518060, P.R. China; SZU Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China. Email: [email protected]. Research supported in part by the Shenzhen R&D Program (JCYJ20180305124325555).
† Department of Mathematics, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong. Research supported in part by HKRGC Grants No. CUHK14301718, CityU Grant 9380101, CRF Grant C1007-15G, AoE/M-05/12. Email: [email protected].
‡ Department of Mathematics, Syracuse University, Syracuse, NY 13244, USA. Email: [email protected]. The work of L. Shen was supported in part by the National Science Foundation under grant DMS-1913039.
§ Department of Mathematics, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong. Email: [email protected]. Research was supported in part by the Research Grants Council of Hong Kong (Project no. CityU 11301419) and City University of Hong Kong (Project no. 7005497).

Our main focus in this paper is to choose a proper regularization G in (2) for image restoration. Here, a proper regularization means a regularization functional that encodes prior knowledge about the image to be restored. Prior knowledge about images, in particular for natural images, includes first-order (i.e., variance in luminance) and second-order (i.e., contrast and texture) information [19]. One commonly used regularization term that exploits the first-order information is the bounded variation semi-norm [29]

    G(u) := ∫_Ω |∇u|,  (3)

where the image u is defined on the bounded set Ω ⊂ R^d. The corresponding model (2), referred to as the total variation (TV) based image restoration model, performs remarkably well when the image to be reconstructed is piecewise constant. The total variation functional does not penalize discontinuities in images and thus allows us to recover the edges of the original image. However, it does not distinguish between jumps and smooth transitions, and therefore tends to give piecewise constant images with staircase artifacts. Due to this notable staircase phenomenon, the TV-based model is not suited for reconstructing images that are not nearly piecewise constant. It was pointed out in [13] that whereas the reconstruction generated with the first-order model will display jumps, the basic geometric structure of the original intensity surfaces is missing, even if it appears in the data.
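For a digital image, the continuous functional (3) is discretized with finite differences. A minimal sketch of the discrete anisotropic and isotropic TV (the function name `tv` and the toy images below are our own illustration, not from the paper):

```python
import numpy as np

def tv(u, isotropic=True):
    """Discrete total variation of a 2-D image via forward differences."""
    dx = np.diff(u, axis=1)[:-1, :]   # horizontal differences, cropped to a common shape
    dy = np.diff(u, axis=0)[:, :-1]   # vertical differences
    if isotropic:
        return np.sum(np.sqrt(dx**2 + dy**2))
    return np.sum(np.abs(dx)) + np.sum(np.abs(dy))

# A piecewise-constant image has a small TV relative to a noisy one,
# which is why TV regularization favors piecewise-constant restorations.
flat = np.zeros((32, 32))
flat[:, 16:] = 1.0                     # a single vertical edge
rng = np.random.default_rng(0)
noisy = flat + 0.1 * rng.standard_normal(flat.shape)
```

Because `flat` has one unit jump per row, its TV is small, while the noise in `noisy` inflates every local difference.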
It was further mentioned that, by using higher-order models, these artifacts from the first-order model can be eliminated and some of the fine geometric structures of the original image, particularly planar and quadric patches, can be recovered.

One of the earliest models using higher derivatives was proposed in [5], where the infimal convolution of the first- and second-order derivatives was taken as the regularizer:

    G(u) := inf_v ∫_Ω |∇u − ∇v| + α |∇(∇v)|.  (4)

It locally approximates the gradient of the function u by ∇v, which itself has a low total variation. Different second-order functionals for staircase reduction have been considered in other papers; see, for example, [8, 26]. Based on tensor algebra, a regularizer with derivatives of arbitrary order was introduced in [2]. The corresponding regularizer was called the total generalized variation (TGV). In particular, the TGV of second order is

    G(u) := min_v ∫_Ω ( α1 |∇u − v| + α0 |E(v)| ),  (5)

where the parameters α0, α1 are positive, and

    E(v) = [ ∂1 v1                (∂1 v2 + ∂2 v1)/2
             (∂1 v2 + ∂2 v1)/2    ∂2 v2 ]

with v1 and v2 being the components of v. Note that for twice differentiable u, E(∇u) is the Hessian of u. We note that the second-order TGV (5) is similar to, but structurally different from, the regularizer (4). The use of TGV and its variants in a plethora of applications has been reported in [1, 30] and the references therein.

Motivated by the fact that an image/signal naturally has a hierarchical structure and admits a multiscale representation, we exploit this structure to formulate a regularization term G in (2) that is different from the aforementioned ones. To this end, we first construct a two-level non-stationary tight framelet system that is suitable for representing the images to be restored.
More specifically, the tight framelet system in the first level is the directional Haar framelet (DHF) system introduced in our recent work [21], while the one in the second level is constructed from the discrete cosine transform (DCT). We then use the framelet coefficients of an image under this two-level non-stationary tight framelet system to formulate the regularization term G. More precisely, the framelet coefficients of the image with the DHF consist of the first-order information of the image in the vertical, horizontal, and ±45° directions. As a result, the regularization term G not only contains the TV term but also ameliorates it by including the diagonal information. The coarse approximation to the image resulting from the low-pass filter of the DHF, regarded as a smoothed version of the image, facilitates the extraction of the second-order information of the image. As shown in our previous work [24], the second-order information of the image can be reliably extracted from the DCT-based tight framelet coefficients of this smoothed image. Our proposed regularization term G also includes this second-order information. We remark that tight framelets have proven to be useful in image processing; see, e.g., [3, 7, 22, 23, 24, 31] and the references therein. Not only is our two-level non-stationary tight framelet system new, but the proposed regularization also differs from the existing ones in the following respects:

• Due to the DHF, our regularization assimilates the advantages of both the total variation regularization and other framelet regularizations, and remedies their drawbacks. On the one hand, the filters associated with the DHF have the shortest support among all tight framelet systems, and therefore the DHF can suppress the ringing artifacts arising from other framelet regularizations. In comparison, the filters associated with the 2-dimensional orthogonal Haar wavelet have the shortest support only among all compactly supported orthogonal wavelets. On the other hand, the diagonal first-order information provided by the DHF can reduce the staircase artifacts (or block effects) arising from the classical TV regularization. In comparison, the 2-dimensional orthogonal Haar wavelet only provides first-order information in the vertical and horizontal directions.

• We exploit the second-order information of the underlying image from its smoothed version rather than from the image itself. The main idea behind this is that the high-frequency spatial information of the image is suppressed in its smoothed version, and therefore the second-order information of the image can be computed faithfully, particularly for images with a high level of noise.

• Finally, the properties of the tight framelet can be easily exploited to develop computationally efficient algorithms and to analyze the convergence of the resulting algorithms.

To summarize, the proposed regularizer G contains the first- and second-order information of the image to be reconstructed for (2). The resulting optimization problem (2) can be efficiently solved, and the efficiency and accuracy of this regularizer will be confirmed for image restoration.

The rest of this paper is organized as follows. In Section 2 we first briefly review tight framelet systems; we then propose an image restoration model regularized by a two-level non-stationary tight framelet system and develop an algorithm to solve this model. The performance of the proposed model for image restoration is presented in Section 3.

2 Model and Algorithm with Multi-Level Non-Stationary Tight Framelets
This section consists of three parts. In the first part, we briefly review multi-level non-stationary tight framelet systems. In the second part, we propose our image restoration model using a two-level non-stationary tight framelet system. In the last part, we propose an algorithm to solve the resulting optimization problem.
Tight framelets are closely related to filter banks. A tight framelet filter bank can be used to (sparsely) represent data sequences through its associated discrete framelet transforms as well as its underlying discrete affine system [17]. Before proceeding to their connections, let us first recall some definitions and notation.

By l(Z^d) we denote the set of all sequences on Z^d and by l_0(Z^d) the set of all finitely supported sequences. A filter or mask h = {h(k)}_{k∈Z^d} : Z^d → C on Z^d is a sequence in l(Z^d). For a filter h ∈ l_0(Z^d), its Fourier series is defined to be ĥ(ξ) := Σ_{k∈Z^d} h(k) e^{−ik·ξ} for ξ ∈ R^d, which is a 2πZ^d-periodic trigonometric polynomial. In particular, by δ we denote the Dirac sequence such that δ(0) = 1 and δ(k) = 0 for all k ∈ Z^d \ {0}. Throughout the paper, we assume the tight framelets are dyadically dilated, that is, the dilation matrix is 2I_d with I_d the d × d identity matrix.

For filters τ_0, τ_1, ..., τ_s ∈ l_0(Z^d), we say that the filter bank {τ_0; τ_1, ..., τ_s} is a (d-dimensional dyadic) tight framelet filter bank if

    Σ_{ℓ=0}^{s} τ̂_ℓ(ξ) · conj( τ̂_ℓ(ξ + πω) ) = δ(ω),  ∀ ξ ∈ R^d, ω ∈ {0, 1}^d,  (6)

where for a number x ∈ C, conj(x) denotes its complex conjugate. Equation (6) is equivalent to the perfect reconstruction property of the discrete framelet transforms associated with the filter bank {τ_0; τ_1, ..., τ_s} ([18, Theorems 1.1.1 and 1.1.4]). The filter τ_0 is usually a low-pass filter satisfying τ̂_0(0) = 1, while the τ_ℓ's are high-pass filters satisfying τ̂_ℓ(0) = 0 for ℓ ≥ 1. Discrete framelet transforms associated with tight framelet filter banks are commonly used to exploit the sparsity of the data. Moreover, in signal/image processing, the translation invariance of a discrete framelet transform is desirable, especially in the scenario of signal denoising/inpainting.
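Condition (6) can be checked numerically. A small sketch for the 1-D Haar filter bank {τ0; τ1} = {[1/2, 1/2]; [1/2, −1/2]} (our own illustration, not taken from the paper):

```python
import numpy as np

# Verify the tight framelet filter bank condition (6) in dimension d = 1:
#   sum_l  tau_l^(xi) * conj( tau_l^(xi + pi*w) ) = delta(w),  w in {0, 1},
# for the Haar pair tau0 = [1/2, 1/2] (low-pass), tau1 = [1/2, -1/2] (high-pass).
def fourier(h, xi):
    k = np.arange(len(h))
    return np.sum(h * np.exp(-1j * k * xi))

tau = [np.array([0.5, 0.5]), np.array([0.5, -0.5])]
xis = np.linspace(-np.pi, np.pi, 64)

def lhs(xi, w):
    return sum(fourier(h, xi) * np.conj(fourier(h, xi + np.pi * w)) for h in tau)
```

For w = 0 the sum should be identically 1 (partition of unity); for w = 1 it should vanish, which encodes alias cancellation.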
To preserve the translation invariance property, one usually considers the redundant version of the discrete framelet transform, that is, the undecimated discrete framelet transform (UDFmT). More precisely, denote a filter bank at level j by η_j := {τ_0^j; τ_1^j, ..., τ_{s_j}^j} and consider a sequence {η_j : j = 1, ..., J} = ∪_{j=1}^{J} {τ_0^j; τ_1^j, ..., τ_{s_j}^j} of J filter banks, with j = J being the finest level and j = 1 the coarsest level. Let the convolution operation ∗ be defined by [h ∗ v](γ) := Σ_{k∈Z^d} h(γ − k) v(k) for v ∈ l(Z^d), h ∈ l_0(Z^d), γ ∈ Z^d, and the upsampling operator ↑m with m ∈ N be given by

    [v ↑ m](γ) := v(m^{−1}γ) if m^{−1}γ ∈ Z^d, and 0 otherwise.

For a filter h, let h⋆ be the filter defined by h⋆(k) = conj(h(−k)), k ∈ Z^d. Then, for a given input data sequence v = v^J, the UDFmT includes (i) Decomposition:

    v^{j−1} = v^j ∗ ((τ_0^j)⋆ ↑ 2^{J−j}),  w_ℓ^{j−1} = v^j ∗ ((τ_ℓ^j)⋆ ↑ 2^{J−j}),  ℓ = 1, ..., s_j,  j = J, ..., 1,  (7)

and (ii) Reconstruction:

    v^j = v^{j−1} ∗ (τ_0^j ↑ 2^{J−j}) + Σ_{ℓ=1}^{s_j} w_ℓ^{j−1} ∗ (τ_ℓ^j ↑ 2^{J−j}),  j = 1, ..., J.  (8)

One can show that if each filter bank {τ_0^j; τ_1^j, ..., τ_{s_j}^j} satisfies the partition of unity condition Σ_{ℓ=0}^{s_j} |τ̂_ℓ^j(ξ)|² = 1, ξ ∈ R^d, then any input data sequence v ∈ l(Z^d) can be perfectly reconstructed via (8) from its framelet coefficient sequences {v^0} ∪ {w_ℓ^{j−1} : ℓ = 1, ..., s_j}_{j=1}^{J} decomposed via (7). The framelet system associated with such a sequence {η_j : j = 1, ..., J} is then called a multi-level non-stationary tight framelet system.

In this paper, we consider J = 2, that is, a two-level non-stationary tight framelet system. One can of course consider J > 2. However, in terms of efficiency and simplicity, J = 2 is the best choice for the development of this paper.

In this subsection, we integrate two different tight framelet systems into a two-level non-stationary tight framelet system, which will be exploited in the optimization problem (2). The tight framelet system in the first level is the directional Haar framelet (DHF) system proposed in [21]. The filters associated with this DHF are

    τ_0 = (1/4)[1 1; 1 1],   τ_1 = (1/4)[0 −1; 1 0],   τ_2 = (1/4)[−1 0; 0 1],
    τ_3 = (1/4)[1 −1; 0 0],  τ_4 = (1/4)[1 0; −1 0],   τ_5 = (1/4)[0 0; 1 −1],  τ_6 = (1/4)[0 1; 0 −1].

As two-dimensional filters, the indices of the entries (top-left, top-right, bottom-left, and bottom-right) in each filter are (0,0), (1,0), (0,1), and (1,1). The filter τ_0 is a low-pass filter, and the rest are high-pass filters that provide directional information of an image when applied to it. More precisely, the filters τ_1 and τ_2 act as first-order difference operators in the 45° and 135° directions, respectively. The results of these two filters convolving with an image highlight changes in intensity of the image in these two diagonal directions. The filters τ_3 and τ_5 are first-order difference operators in the horizontal direction, while the filters τ_4 and τ_6 are first-order difference operators in the vertical direction. The convolutions of these filters with the underlying image are the coefficients of the image under the corresponding filters, which are the multiplications of the associated transformation matrices with the image.

Now we propose a generic regularization term based on the DHF. Let u ∈ R^n be the vector representing the column-stacked version of an image. We denote by M_κ the associated matrix representation of the filter τ_κ, κ = 0, 1, ..., 6, under a proper boundary condition. We further denote

    B_ℓ1 := M_0  and  B_h1 := [M_1^⊤, ..., M_6^⊤]^⊤.  (9)

By the tight frame property of {τ_0; τ_1, ..., τ_6}, these two matrices satisfy the following perfect reconstruction condition: B_ℓ1^⊤ B_ℓ1 + B_h1^⊤ B_h1 = I.

Let Φ_1 : R^{6n} → R be defined through a function ϕ : R^6 → R and a non-negative parameter vector Λ = [λ_1, λ_2, ..., λ_n] as follows:

    Φ_1(v) := Σ_{i=1}^{n} λ_i ϕ(v_i, v_{i+n}, ..., v_{i+5n}).  (10)

With this function Φ_1, we propose a functional based on the DHF in the following form:

    G_1(u) := Φ_1(B_h1 u),  (11)

from which the TV regularization and its variants can be derived by properly choosing ϕ in (10). For example, if we choose ϕ(x_1, ..., x_6) = |x_3| + |x_4|, the regularization in (11) reduces to the so-called anisotropic TV; if we choose ϕ(x_1, ..., x_6) = sqrt(|x_3|² + |x_4|²), the regularization in (11) reduces to the so-called isotropic TV. We choose, in this paper,

    ϕ(x_1, ..., x_6) = sqrt(|x_1|² + |x_2|²) + sqrt(|x_3|² + |x_4|²).  (12)

One of the advantages of the regularization G_1 with ϕ given in (12) is that it assimilates the advantages of both total variation and wavelet regularizations and remedies their drawbacks. The way to avoid or suppress the ringing artifacts arising from wavelet regularizations is to choose a wavelet system whose filters have small supports. The filters associated with the 2-dimensional orthogonal Haar wavelet have the shortest support among all compactly supported orthogonal wavelets, but staircase artifacts (or blocky effects) appear in the neighborhoods of edges in the directions about ±45°. Since ϕ in (12) includes the diagonal first-order information from the filters τ_1 and τ_2, the staircase artifacts can be reduced.

The tight framelet in the second level is generated from the standard 3 × 3 discrete cosine transform, whose rows are c_0 = (1/√3)[1, 1, 1], c_1 = (1/√2)[1, 0, −1], and c_2 = (1/√6)[1, −2, 1], via τ_{3i+j} = (1/3) c_i^⊤ c_j with i, j ∈ {0, 1, 2}, where τ_0 is the low-pass filter and the others are high-pass filters.
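As a sanity check, a DHF-style filter bank of one 2×2 averaging filter plus six directional differences satisfies the partition-of-unity condition Σ_ℓ |τ̂_ℓ(ξ)|² = 1, which is what makes the undecimated analysis operator satisfy B_ℓ1^⊤B_ℓ1 + B_h1^⊤B_h1 = I. A sketch under periodic boundary conditions (the filter values are our rendering of the DHF, and the variable names are ours):

```python
import numpy as np

# Seven 2x2 filters: low-pass average, two diagonal differences (45 and 135
# degrees), two horizontal differences, and two vertical differences.
q = 0.25
dhf = [q * np.array([[1, 1], [1, 1]]),    # tau0: low-pass
       q * np.array([[0, -1], [1, 0]]),   # tau1: 45-degree difference
       q * np.array([[-1, 0], [0, 1]]),   # tau2: 135-degree difference
       q * np.array([[1, -1], [0, 0]]),   # tau3: horizontal difference (top row)
       q * np.array([[1, 0], [-1, 0]]),   # tau4: vertical difference (left column)
       q * np.array([[0, 0], [1, -1]]),   # tau5: horizontal difference (bottom row)
       q * np.array([[0, 1], [0, -1]])]   # tau6: vertical difference (right column)

N = 16
# Partition of unity: the squared magnitude responses of all filters sum to 1
# at every discrete frequency.
spectrum = sum(np.abs(np.fft.fft2(f, (N, N)))**2 for f in dhf)
```

With periodic boundary conditions, this pointwise identity in the frequency domain is exactly the matrix identity B_ℓ1^⊤B_ℓ1 + B_h1^⊤B_h1 = I.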
Here, for simplicity of notation, we use τ_κ to denote the filters associated with both the DHF and the DCT-based tight framelet. The expansions of these filters are

    τ_0 = (1/9) [1 1 1; 1 1 1; 1 1 1],
    τ_1 = 1/(3√6) [1 0 −1; 1 0 −1; 1 0 −1],
    τ_2 = 1/(9√2) [1 −2 1; 1 −2 1; 1 −2 1],
    τ_3 = 1/(3√6) [1 1 1; 0 0 0; −1 −1 −1],
    τ_4 = (1/6) [1 0 −1; 0 0 0; −1 0 1],
    τ_5 = 1/(6√3) [1 −2 1; 0 0 0; −1 2 −1],
    τ_6 = 1/(9√2) [1 1 1; −2 −2 −2; 1 1 1],
    τ_7 = 1/(6√3) [1 0 −1; −2 0 2; 1 0 −1],
    τ_8 = (1/18) [1 −2 1; −2 4 −2; 1 −2 1].

The filters τ_1 and τ_3, known as the Prewitt operators in image processing, are used to compute an approximation of the gradient (i.e., the first-order information) of the image intensity function. The convolution of τ_1 (resp. τ_3) with an image gives the horizontal (resp. vertical) changes of the image intensity, and they compute changes of intensity with smoothing due to τ_1 = (1/3) c_0^⊤ c_1 and τ_3 = (1/3) c_1^⊤ c_0. The filter τ_6 (resp. τ_2) computes the discrete second-order difference in the vertical (resp. horizontal) direction with smoothing due to τ_6 = (1/3) c_2^⊤ c_0 and τ_2 = (1/3) c_0^⊤ c_2. The other filters τ_4, τ_5, τ_7, and τ_8 perform like discrete higher-order difference operators.

We should note that first-order derivative operators exaggerate the effects of noise, while second-order derivatives exaggerate noise twice as much [15]. Therefore, the applicability of second-order derivatives is limited to images with low noise levels. Motivated by the Laplacian of Gaussian (LoG) and difference of Gaussians (DoG) operators in computer vision (see, for example, [25, 27]), we propose to take the second-order derivatives on the blurred or smoothed image in order to reduce the effect of the presence of noise in an image. To this end, we denote by P_κ the matrix representation of the filter τ_κ, κ = 0, 1, ..., 8, under a proper boundary condition. Let us define

    B_ℓ2 := P_0  and  B_h2 := [P_1^⊤, ..., P_8^⊤]^⊤.  (13)

We have that B_ℓ2^⊤ B_ℓ2 + B_h2^⊤ B_h2 = I. Let Φ_2 : R^{8n} → R be defined through a parameter sequence Θ = {θ_i = (θ_{i1}, θ_{i2}, ..., θ_{i8}) ∈ R^8 : 1 ≤ i ≤ n} with non-negative elements as follows:

    Φ_2(v) := Σ_{i=1}^{n} ‖[θ_{i1} v_i, θ_{i2} v_{i+n}, ..., θ_{i8} v_{i+7n}]‖_1,  (14)

where ‖·‖_1 denotes the ℓ_1 norm. With this function Φ_2, we propose a functional based on the DCT-based tight framelet system in the following form:

    G_2(u) := Φ_2(B_h2 B_ℓ1 u),  (15)

where B_ℓ1 u is viewed as the smoothed version of u.

All together, our proposed image restoration model is

    min_u { F(u) + G_1(u) + G_2(u) }.  (16)

The efficiency of the regularization functional G_1(u) + G_2(u) in (16) will be presented in Section 3, where it is compared with several possible regularization functionals formulated from the DHF and the DCT-based tight framelet, and with other existing higher-order regularization functionals.

In this subsection, we specify the data fidelity term F in (16). For Gaussian noise, the natural choice for F is F(u) = (1/2)‖Ku − z‖², where ‖·‖ denotes either the vector 2-norm or the matrix 2-norm. That is, the optimization problem we consider here is

    min_{u ∈ [0,1]^n} (1/2)‖Ku − z‖² + Φ_1(B_h1 u) + Φ_2(B_h2 B_ℓ1 u),  (17)

where Φ_1 is given in (10) and Φ_2 is given in (14). Here, we assume that all pixel values of an image are in [0, 1].

The class of all lower semicontinuous convex functions f : R^d → (−∞, +∞] such that dom f := {x ∈ R^d : f(x) < +∞} ≠ ∅ is denoted by Γ_0(R^d). The indicator function of a closed convex set C in R^d is defined, at u ∈ R^d, as

    ι_C(u) := 0 if u ∈ C, and +∞ otherwise.

Clearly, the indicator function ι_C is in Γ_0(R^d) for any closed nonempty convex set C. For a function f ∈ Γ_0(R^d), the proximity operator of f with parameter λ, denoted by prox_{λf}, is a mapping from R^d to itself, defined for a given point x ∈ R^d by

    prox_{λf}(x) := argmin { (1/2)‖u − x‖² + λ f(u) : u ∈ R^d }.
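For intuition, the proximity operator of the absolute value with parameter λ is the classical soft-thresholding operator. A quick check of the definition against brute-force minimization of the objective (1/2)(u − x)² + λ|u| (function names are ours):

```python
import numpy as np

def prox_abs(x, lam):
    """prox of lam*|.| in 1-D: the soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_brute(x, lam):
    """Brute-force minimizer of the defining objective on a fine grid."""
    grid = np.linspace(-5, 5, 200001)
    return grid[np.argmin(0.5 * (grid - x)**2 + lam * np.abs(grid))]
```

The two agree up to the grid resolution, confirming that soft-thresholding solves the prox subproblem exactly.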
We also need the notion of the conjugate. The conjugate of f ∈ Γ_0(R^d) is the function f* ∈ Γ_0(R^d) defined at x ∈ R^d by f*(x) := sup { ⟨u, x⟩ − f(u) : u ∈ R^d }. A key property relating the proximity operators of f and its conjugate is

    prox_{λf}(x) + λ prox_{λ^{−1}f*}(x/λ) = x,  (18)

which holds for all x ∈ R^n and any λ > 0. Set

    f(u) = (1/2)‖Ku − z‖²,  g(u) = ι_{[0,1]^n}(u),  p(s) = Φ_1(s_1) + Φ_2(s_2),  and  A = [B_h1; B_h2 B_ℓ1],  (19)

where u ∈ R^n and s = (s_1, s_2) with s_1 ∈ R^{6n} and s_2 ∈ R^{8n}. Then our optimization problem (17) can be viewed as a special case of the optimization problem whose objective function is the sum of three lower semicontinuous convex functions, in the form of

    min_{u ∈ R^n} f(u) + g(u) + p(Au),  (20)

where A is a d × n matrix, f ∈ Γ_0(R^n) is differentiable, g ∈ Γ_0(R^n), and p ∈ Γ_0(R^d).

Several algorithms have been developed for the optimization problem (20); see, for example, [10, 11, 20, 33]. We adopt the algorithm given in [33] for problem (20) since it converges under a much weaker condition and can take a larger step size, yielding faster convergence. This algorithm, named the Primal-Dual Three-Operator splitting (PD3O), has the following iteration:

    u^k = prox_{γg}(v^k),  (21a)
    s^{k+1} = prox_{δp*}( (I − γδAA^⊤) s^k + δA(2u^k − v^k − γ∇f(u^k)) ),  (21b)
    v^{k+1} = u^k − γ∇f(u^k) − γA^⊤ s^{k+1}.  (21c)

One PD3O iteration can be viewed as an operator T_PD3O such that (v^{k+1}, s^{k+1}) = T_PD3O(v^k, s^k). The convergence analysis of PD3O is given in the following lemma.

Lemma 1 (Sublinear convergence rate [33]). Let f ∈ Γ_0(R^n) with its gradient Lipschitz continuous with constant L, let g ∈ Γ_0(R^n), and p ∈ Γ_0(R^d). Choose γ and δ such that γ < 2/L and M = (γ/δ)(I − γδAA^⊤) is positive definite. Let (v*, s*) be any fixed point of T_PD3O, and {(v^k, s^k)}_{k≥0} be the sequence generated by PD3O. Define ‖(v, s)‖_M := sqrt( ‖v‖² + ⟨s, M s⟩ ). Then, the following statements hold.

(i) The sequence {‖(v^k, s^k) − (v*, s*)‖_M}_{k≥0} is monotonically nonincreasing.

(ii) The sequence {‖(v^{k+1}, s^{k+1}) − (v^k, s^k)‖_M}_{k≥0} is monotonically nonincreasing. Moreover, ‖(v^{k+1}, s^{k+1}) − (v^k, s^k)‖²_M = o(1/(k+1)).
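A minimal PD3O sketch of iteration (21a)-(21c) on a 1-D toy problem (denoising with a TV-like penalty, where a forward-difference matrix D stands in for A; all names and parameter values here are our own illustration, not the paper's setting):

```python
import numpy as np

# Toy problem:  min_u  0.5*||u - z||^2 + i_{[0,1]}(u) + lam*||D u||_1.
rng = np.random.default_rng(1)
n, lam = 64, 0.5
z = np.clip(np.repeat([0.2, 0.8], n // 2) + 0.1 * rng.standard_normal(n), 0, 1)
D = np.diff(np.eye(n), axis=0)            # (n-1) x n forward differences

grad_f = lambda u: u - z                  # f(u) = 0.5*||u - z||^2, L = 1
prox_g = lambda v: np.clip(v, 0, 1)       # g = indicator of [0,1]^n
# p = lam*||.||_1, so prox_{delta p*} is the projection onto the inf-ball of radius lam.
prox_ps = lambda s: np.clip(s, -lam, lam)

gamma, delta = 1.0, 0.2                   # gamma < 2/L and gamma*delta*||D||^2 < 1
v, s = z.copy(), np.zeros(n - 1)
obj = lambda u: 0.5 * np.sum((u - z)**2) + lam * np.sum(np.abs(D @ u))
start = obj(prox_g(v))

for _ in range(300):
    u = prox_g(v)                                         # (21a)
    w = s + delta * (D @ (2*u - v - gamma*grad_f(u))) \
          - gamma * delta * (D @ (D.T @ s))               # argument of (21b)
    s = prox_ps(w)                                        # (21b)
    v = u - gamma * grad_f(u) - gamma * (D.T @ s)         # (21c)
u = prox_g(v)
```

The iterate stays feasible (in [0, 1]) by construction of (21a), and the objective of the final iterate drops well below that of the starting point.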
To adapt PD3O for our optimization problem (17), with f, g, p and the matrix A given in (19), some preparations are provided in the following lemmas.

Lemma 2.
Let δ > 0 and Φ_1 be given in (10). For any v ∈ R^{6n}, if y = prox_{δ^{−1}Φ_1}(v), then

    y^{(i)} = prox_{δ^{−1}λ_i ϕ}(v^{(i)}),  (22)

where y^{(i)} = [y_i, y_{i+n}, ..., y_{i+5n}]^⊤ and v^{(i)} = [v_i, v_{i+n}, ..., v_{i+5n}]^⊤. Furthermore, let y^{(ij)} and v^{(ij)} denote y_{i+(j−1)n} and v_{i+(j−1)n}, respectively, for i = 1, ..., n and j = 1, ..., 6. Then

    [y^{(i1)}; y^{(i2)}] = ( 1 − λ_i δ^{−1} / max{ ‖[v^{(i1)}, v^{(i2)}]‖_2, λ_i δ^{−1} } ) [v^{(i1)}; v^{(i2)}],

where the pair (y^{(i3)}, y^{(i4)}) is obtained by simply replacing (v^{(i1)}, v^{(i2)}) on the right-hand side of the above formula by (v^{(i3)}, v^{(i4)}), and y^{(i5)} = v^{(i5)}, y^{(i6)} = v^{(i6)}.

Proof. The proof is based on the block separable property of Φ_1 in (10). By the definition of the proximity operator and equations (10) and (12),

    prox_{δ^{−1}Φ_1}(v) = argmin { (1/2)‖u − v‖² + δ^{−1}Φ_1(u) : u ∈ R^{6n} }
                        = argmin { Σ_{i=1}^{n} ( (1/2)‖u^{(i)} − v^{(i)}‖² + δ^{−1}λ_i ϕ(u^{(i)}) ) : u^{(i)} ∈ R^6, i = 1, ..., n }.

Hence, equation (22) holds. Notice that ϕ is also a block separable function. By using the definition of the proximity operator again and the proximity operator of the ℓ_2 norm (see, for example, [9, 28]), we obtain the explicit expression for y^{(i)} as given above.

Lemma 3.
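The pairwise formula in Lemma 2 is the block (group) soft-thresholding operator, i.e., the proximity operator of t‖·‖_2 applied to a coefficient pair. A small sketch that also checks the optimality condition of the prox subproblem (our own illustration):

```python
import numpy as np

def block_soft(v, t):
    """Proximity operator of t*||.||_2 applied to the vector v (block soft-thresholding)."""
    scale = 1.0 - t / max(np.linalg.norm(v), t)
    return scale * v

# Optimality check for a nonzero output y of argmin 0.5*||y - v||^2 + t*||y||_2:
# the first-order condition reads  y + t*y/||y|| = v.
v, t = np.array([3.0, 4.0]), 2.0
y = block_soft(v, t)
```

When ‖v‖_2 ≤ t the whole block is set to zero, which is how pairs of small DHF coefficients are suppressed jointly rather than componentwise.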
Let δ > 0 and Φ_2 be given in (14). For any v ∈ R^{8n}, if y = prox_{δ^{−1}Φ_2}(v), then

    y^{(i)} = prox_{δ^{−1}‖·‖_1 ∘ diag(θ_i)}(v^{(i)}),  (23)

where y^{(i)} = [y_i, y_{i+n}, ..., y_{i+7n}]^⊤ and v^{(i)} = [v_i, v_{i+n}, ..., v_{i+7n}]^⊤. Furthermore, let y^{(ij)} and v^{(ij)} denote y_{i+(j−1)n} and v_{i+(j−1)n}, respectively, for i = 1, ..., n and j = 1, ..., 8. Then

    y^{(ij)} = max{ |v^{(ij)}| − δ^{−1}θ_{ij}, 0 } sgn(v^{(ij)}).

Proof.
The proof is based on the block separable property of Φ_2 in (14). By the definition of the proximity operator,

    prox_{δ^{−1}Φ_2}(v) = argmin { (1/2)‖u − v‖² + δ^{−1}Φ_2(u) : u ∈ R^{8n} }
                        = argmin { Σ_{i=1}^{n} ( (1/2)‖u^{(i)} − v^{(i)}‖² + δ^{−1}‖diag(θ_i) u^{(i)}‖_1 ) : u^{(i)} ∈ R^8, i = 1, ..., n }.

Hence, equation (23) holds. Furthermore, noticing that prox_{δ^{−1}‖·‖_1 ∘ diag(θ_i)} is the well-known soft-thresholding operator, the rest of the result holds.

Lemma 4.
Let Φ_1 be given in (10) and Φ_2 be given in (14). For any v ∈ R^{14n}, write v = (v_1, v_2) with v_1 ∈ R^{6n} and v_2 ∈ R^{8n}, and define p(v) = Φ_1(v_1) + Φ_2(v_2). Then, for any δ > 0,

    prox_{δ^{−1}p}(v) = prox_{δ^{−1}Φ_1}(v_1) × prox_{δ^{−1}Φ_2}(v_2).

The proof is similar to those of Lemmas 2 and 3 due to the separability of p; therefore, we omit it here.

To apply Lemma 1 to problem (17), we verify all the requirements listed in Lemma 1. First, for the function f in (19), we have ∇f(u) = K^⊤(Ku − z), so the gradient of f is ‖K‖²-Lipschitz continuous. Next, we discuss the positive definiteness of the matrix I − γδAA^⊤.

Lemma 5.
Let A be given in (19). Then, for positive numbers γ and δ, the matrix I − γδAA^⊤ is positive semidefinite (resp. definite) if and only if γδ ≤ 1 (resp. γδ < 1).

Proof. First, we show that ‖A‖ = 1. For any u ∈ R^n, we have

    u^⊤A^⊤Au = u^⊤ B_h1^⊤ B_h1 u + u^⊤ B_ℓ1^⊤ B_h2^⊤ B_h2 B_ℓ1 u.

Since B_h1^⊤B_h1 + B_ℓ1^⊤B_ℓ1 = I and B_h2^⊤B_h2 + B_ℓ2^⊤B_ℓ2 = I, from the above we have

    u^⊤A^⊤Au ≤ u^⊤ B_h1^⊤ B_h1 u + u^⊤ B_ℓ1^⊤ B_ℓ1 u = u^⊤u.

Hence ‖A‖ ≤ 1. Further, ‖Au‖² = ‖u‖² − ‖B_ℓ2 B_ℓ1 u‖², and since the null space of B_ℓ2 B_ℓ1 is non-trivial, we have ‖A‖ = 1. Next, since AA^⊤ is positive semidefinite and its largest eigenvalue is 1, the matrix I − γδAA^⊤ is positive semidefinite (resp. definite) if and only if γδ ≤ 1 (resp. γδ < 1).

The proximity operator prox_{δ^{−1}p} is given in Lemma 4 with the help of Lemmas 2 and 3. Therefore, the proximity operator prox_{δp*} can be computed via (18). With the above preparation, the complete procedure for solving (17) based on (21a)-(21c) is described in Algorithm 1. This algorithm is referred to as the TNTF (two-level non-stationary tight framelet) algorithm.

Algorithm 1 Two-level Non-stationary Tight Framelet (TNTF) Algorithm

Set parameters γ < 2/‖K‖² and γδ < 1; pre-given parameters Λ and Θ. Initialize v^0 = 0 and s^0 = 0, write s^k = (s_1^k, s_2^k), and let x^k be an auxiliary variable.
for k = 0, 1, ... do
    u^k = Proj_{[0,1]^n}(v^k)  (24a)
    x^k = γ(B_h1^⊤ s_1^k + B_ℓ1^⊤ B_h2^⊤ s_2^k) − (2u^k − v^k) + γK^⊤(Ku^k − z)  (24b)
    s_1^{k+1} = (s_1^k − δ B_h1 x^k) − δ · prox_{δ^{−1}Φ_1}( δ^{−1}(s_1^k − δ B_h1 x^k) )  (24c)
    s_2^{k+1} = (s_2^k − δ B_h2 B_ℓ1 x^k) − δ · prox_{δ^{−1}Φ_2}( δ^{−1}(s_2^k − δ B_h2 B_ℓ1 x^k) )  (24d)
    v^{k+1} = u^k − γK^⊤(Ku^k − z) − γ(B_h1^⊤ s_1^{k+1} + B_ℓ1^⊤ B_h2^⊤ s_2^{k+1})  (24e)
end for

The convergence analysis for Algorithm 1 is as follows.
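Steps (24c) and (24d) are exactly the Moreau identity (18) applied with f = p*: prox_{δp*}(x) = x − δ · prox_{δ^{−1}p}(x/δ). A scalar sketch for p = λ|·|, whose conjugate prox is the projection onto [−λ, λ] (names and values ours):

```python
import numpy as np

lam, delta = 0.8, 2.5
# prox of t*p for p = lam*|.| is soft-thresholding with threshold t*lam.
prox_p = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t * lam, 0)
# prox of delta*p* computed directly: projection onto [-lam, lam].
proj = lambda x: np.clip(x, -lam, lam)

x = np.linspace(-3, 3, 101)
# Same operator obtained through the Moreau identity (18):
via_moreau = x - delta * prox_p(x / delta, 1.0 / delta)
```

The two evaluations coincide, which is why Algorithm 1 never needs the conjugate functions Φ_1* and Φ_2* explicitly.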
Theorem 1.
Let ( v ∗ , s ∗ ) be any fixed point of T P D O with f , g , p and A given in (19) . Let { ( v k , s k ) } k > be the sequence generated by Algorithm 1, where s k = ( s k , s k ) . Choose γ and δ suchthat γ < / k K k and γδ < . Define M = γδ ( I − γδAA ⊤ ) . Then, the following statements hold.(i) The sequence { ( k ( v k , s k ) − ( v ∗ , s ∗ ) k M ) } k > is monotonically nonincreasing. ii) The sequence { ( k ( v k +1 , s k +1 ) − ( v k , s k ) k M ) } k > is monotonically nonincreasing. Moreover, k ( v k +1 , s k +1 ) − ( v k , s k ) k M = o (cid:16) k +1 (cid:17) .Proof. We know that the gradient of f in (19) is k K k -Lipschitz continuous. By Lemma 5,the matrix M is positive definite if γδ <
1, the result of this theorem follows immediately fromLemma 1.Remark: the computational cost of Algorithm 1 depends on mainly two factors: the UDFmTused in steps (24b-e) and the total number of iterations for k . The UDFmT decompositionsinclude B h x k , B h B ℓ x k in (24c-d) while the UDFmT reconstructions include B ⊤ h s k , B T ℓ B ⊤ h s k in (24b) and B ⊤ h s k +11 and B ⊤ ℓ B ⊤ h s k +22 in (24e). Since UDFmT in Algorithm 1 uses convolutionswith 7 DHF filters in the first level and 9 DCT-based filters in the second level, the UDFmT canbe implemented with computational cost O ( n ), where n is the number of pixels in u . For thetotal number of iterations k in Algorithm 1, it depends on when the algorithm converges and themaximum number K of iterations set manually. Consequently, the total computational cost forAlgorithm 1 is O ( K n ).Finally, we discuss how to choose the parameters in the algorithm. In our tests below we choose γ = 1 . , δ = 0 . γδ <
1. Recall from (12) that we only use the first four subbandcoefficients of DHF in the first level. The corresponding regularization parameters λ i (defined in(10)) are chosen to adaptively adjust to local variations. Let I ( i ) be the set containing all indicesin the neighborhood at the i th pixel. Then λ i is set as λ i = λ × |I ( i ) | max { P p ∈I ( i ) k w p k , − } , (25)where w p = [ v ( p , v ( p ] ⊤ or w p = [ v ( p , v ( p ] ⊤ are defined as in (22). In our tests, we choose theneighborhood of window size 3 × λ is set by hand.For the regularization parameters θ i associated with the DCT-based tight framelet coefficients(see (14)), they are all automatically estimated and updated using the approach in our previouswork [24]. More precisely, for the regularization parameters θ iκ , κ = 1 , . . . ,
8, used in (14), they areautomatically estimated according to the local variations of framelet coefficients and noise level.Suppose the ǫ in model (1) is the Gaussian noise with the standard deviation σ . As it was done inour previous work [24], σ κ the noise variance of the framelet coefficients coming from the filter τ κ at the second decomposed level is estimated as σ κ = σ k τ κ k F , where k τ κ k F is the Frobenius normof τ κ ; ( σ κi ) , the local signal variance of the i th framelet coefficients coming from the filter κ th, iscomputed as ( σ κi ) = max { ( P p ∈I κ ( i ) | v p | / |I κ ( i ) | ) − σ κ , − } , where I κ ( i ) is the set containingall indices in the neighborhood at the i th framelet coefficients from the filter κ . With them, theregularization parameters are estimated as θ iκ = √ σ κ σ κi , κ = 1 , . . . , . (26)To save computational cost of estimating parameters λ i in (25) and θ iκ in (26), we only updatethese parameters when the iteration k is a multiple of 30 and fix them after the 200th iteration inour numerical experiments. In this section, we present numerical experiments to illustrate the effectiveness and efficiency ofour proposed model (17) for image restoration. We use the images “Square Circle”, “Cameraman”,11nd “Montage” of size 256 ×
256 as the original images u in our experiments, see Figure 1. Thepixel values of these images are normalized to the interval [0 , e u , is evaluated in terms of the peak-signal-to-noise ratio (PSNR) that is defined byPSNR := 10 log n k e u − u k , where n is the number of pixels in u . To incorporate structural information in image comparisons,the metric of structural similarity (SSIM) [32] of e u to u is reported as well. The higher the PSNRand SSIM, the better the quality of the restored image.(a) (b) (c)Figure 1: Original image: (a) Square Circle; (b) Cameraman; (c) Montage.Two sets of comparisons for image restoration will be conducted in this section. The firstset is to compare with other tight frame regularizers. The second set is to compare with somederivative-based models. Here we compare the proposed regularization functional (16) with two other tight frame regular-ization functionals G DCT and G DHF+DCT while using the classical TV regularizer G TV (see (3)) asa benchmark. The G DCT is defined as G DCT ( u ) = Φ ( B h u ) which only uses the DCT-based tightframelet and takes the first- and second-order information on the image u , where Φ is given in(14). The G DHF+DCT is defined as G DHF+DCT ( u ) = Φ ( B h u ) + Φ ( B h u ), where Φ is givenin (11) with ϕ in (12). The main difference between our proposed regularization functional (16)and G DHF+DCT is that the action B h takes on the smoothed image B ℓ u for our regularizationfunctional while the action B h takes directly on the image u for G DHF+DCT .In our experiment, the image of “Square Circle” in Figure 1(a) (which is the same as Fig-ure 2(a)) is blurred by a 5 × fspecial(’average’,[5:5]) , followed by adding Gaussian noise of mean zero and standard deviation σ = 0 .
04. The values of the pair (PSNR, SSIM) for the images restored by G_TV, G_DCT, G_DHF+DCT, and the proposed regularization functional (16) are (33.66 dB, 0.962), (32.54 dB, 0.970), (33.14 dB, 0.980), and (35.00 dB, 0.980), respectively. To examine the visual quality of the restored images, the square portion marked in Figure 1(a) is displayed in Figure 2. From the regions indicated by the two arrows, we can see that the proposed regularization functional (16) produces restored images of better visual quality than the others. The results clearly show that our combined tight frame model outperforms the other intuitive tight frame models.

Figure 2: (a) A region of the image “Square Circle”; (b) the image blurred by the kernel fspecial('average',[5 5]) with Gaussian noise of mean zero and standard deviation σ = 0.04; the restored images with regularization (c) TV with α = 0.
04 (see (2) and (3)); (d) G_DCT; (e) G_DHF+DCT; and (f) the proposed regularization functional (16) with λ = 0.

Comparison with Derivative-based Regularizers

Now we give a comprehensive comparison between our model (17) and the TV and TGV models. The TV model uses G(u) in (3) as its regularization term, while the TGV model uses G(u) in (5) as its regularization term. The software for the TGV model was provided by the authors of [16]. All algorithms are carried out until the relative change ‖u^(k+1) − u^(k)‖ / ‖u^(k)‖ falls below a prescribed tolerance or the maximal number of iterations, 400, is reached.

In our experiments, the test images in Figure 1 are blurred by a 5 × 5 averaging kernel generated by fspecial('average',[5 5]), followed by adding Gaussian noise of mean zero and standard deviation σ. For different values of σ, the PSNR and SSIM values of the images restored by TV, TGV, and our TNTF are reported in Table 1. The highest values of PSNR and SSIM for each σ and each test image are highlighted. The table clearly shows that our proposed TNTF performs the best in terms of both PSNR and SSIM. We remark that the regularization parameters in the second level of our TNTF are automatically estimated based on the approach in our work [24].

Table 1: The PSNR (dB) and SSIM of the restored results of each algorithm for blurred images contaminated by Gaussian noise. The test images are blurred by the blurring kernel fspecial('average',[5 5]).

STD         Algorithm   Square Circle      Cameraman          Montage
                        PSNR      SSIM     PSNR      SSIM     PSNR      SSIM
σ = 0.02    TV          35.40     0.980    26.43     0.815    28.21     0.907
            TGV         35.58     0.976    26.27     0.811    28.84     0.910
            TNTF
σ = 0.03    TNTF
σ = 0.04    TV          33.66     0.962    25.10     0.774    25.94     0.867
            TGV         33.41     0.955    24.94     0.770    26.17     0.875
            TNTF

In the rest of this section, we provide qualitative results of the restored images from the above three algorithms. We first show the case of the blurred image “Square Circle” with Gaussian noise of STD σ = 0.03 in Figure 3.
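The stopping rule shared by all three algorithms can be sketched as follows; the solver step is abstracted as a callable, and the tolerance value `tol` is an assumed placeholder.

```python
import numpy as np

def restore(u0, step, tol=1e-4, max_iter=400):
    """Iterate u^(k+1) = step(u^(k)) until the relative change is below tol
    or max_iter iterations are reached; returns the iterate and the count."""
    u = u0
    for k in range(max_iter):
        u_next = step(u)
        # stopping condition: ||u^(k+1) - u^(k)|| / ||u^(k)|| < tol
        if np.linalg.norm(u_next - u) < tol * np.linalg.norm(u):
            return u_next, k + 1
        u = u_next
    return u, max_iter
```

Abstracting the update as a callable lets the same driver serve any of the iterative schemes compared here, with only the per-iteration step differing.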
The noisy and blurry image is shown in Figure 3(a). The regularization parameter α = 0.02 (see (2) and (3)) is used for the TV model, (α₀, α₁) = (0. , .) for the TGV model, and λ = 0. for our TNTF model.

For the case with Gaussian noise of STD σ = 0.02, the restored images by TV, TGV, and TNTF are displayed in Figure 5(b), (c), and (d), respectively. The regularization parameters for TV, TGV, and TNTF are α = 0.006, (α₀, α₁) = (0. , .), and λ = 0. , respectively.

Figure 3: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.03; images reconstructed by (b) TV with α = 0.02 (see (2) and (3)), (c) TGV with (α₀, α₁) = (0. , .), and (d) TNTF with λ = 0.

Figure 5: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.
02; images reconstructed by (b) TV with α = 0.006; (c) TGV with (α₀, α₁) = (0. , .); and (d) TNTF with λ = 0.

For the case with Gaussian noise of STD σ = 0.04, the restored images by TV, TGV, and TNTF are displayed in Figure 7(b), (c), and (d), respectively. The regularization parameters for TV, TGV, and TNTF are α = 0. , (α₀, α₁) = (0. , .), and λ = 0. , respectively.

Conclusion

In this paper, we have designed a two-level non-stationary tight framelet system and utilized it in a regularization model for image restoration. This framelet system is able to capture the first- and second-order information of the image to be reconstructed. We developed an algorithm to solve the resulting optimization problem. The numerical experiments show the effectiveness of the proposed image restoration model and the corresponding algorithm.

Figure 6: Zoom-in parts of Figure 5: (a) Original image; images reconstructed by (b) TV, (c) TGV, and (d) TNTF.

Figure 7: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.04; images reconstructed by (b) TV with α = 0. , (c) TGV with (α₀, α₁) = (0. , .), and (d) TNTF with λ = 0.

References

[1]
K. Bredies and M. Holler, A TGV-based framework for variational image decompression, zooming, and reconstruction. Part II: Numerics, SIAM Journal on Imaging Sciences, 8 (2015), pp. 2851–2886.
[2] K. Bredies, K. Kunisch, and T. Pock, Total generalized variation, SIAM Journal on Imaging Sciences, 3 (2010), pp. 492–526.
[3] J.-F. Cai, B. Dong, S. Osher, and Z. Shen, Image restorations: total variation, wavelet frames and beyond, Journal of the American Mathematical Society, 25 (2012), pp. 1033–1089.
[4] A. Chambolle, R. DeVore, N.-Y. Lee, and B. Lucier, Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage, IEEE Transactions on Image Processing, 7 (1998), pp. 319–335.
[5] A. Chambolle and P.-L. Lions, Image recovery via total variation minimization and related problems, Numerische Mathematik, 76 (1997), pp. 167–188.
[6] R. Chan, T. Chan, L. Shen, and Z. Shen, Wavelet algorithms for high-resolution image reconstruction, SIAM Journal on Scientific Computing, 24 (2003), pp. 1408–1432.
[7] R. Chan, S. D. Riemenschneider, L. Shen, and Z. Shen, Tight frame: an efficient way for high-resolution image reconstruction, Applied and Computational Harmonic Analysis, 17 (2004), pp. 91–115.
[8] T. Chan, A. Marquina, and P. Mulet, High-order total variation-based image restoration, SIAM Journal on Scientific Computing, 22 (2000), pp. 503–516.
[9] P. Combettes and V. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal, 4 (2005), pp. 1168–1200.
[10] P. L. Combettes and J.-C. Pesquet, Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators, Set-Valued and Variational Analysis, 20 (2012), pp. 307–330.
[11] L. Condat, A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms, Journal of Optimization Theory and Applications, 158 (2013), pp. 460–479.
[12] M. Figueiredo and R. D. Nowak, An EM algorithm for wavelet-based image restoration, IEEE Transactions on Image Processing, 12 (2003), pp. 906–916.
[13] D. Geman and G. Reynolds, Constrained restoration and the recovery of discontinuities, IEEE Transactions on Pattern Analysis and Machine Intelligence, 14 (1992), pp. 367–383.
[14] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 6 (1984), pp. 721–741.
[15] R. Gonzalez and R. Woods, Digital Image Processing, Addison-Wesley, Boston, MA, 1993.
[16] W. Guo, J. Qin, and W. Yin, A new detail-preserving regularization scheme, SIAM Journal on Imaging Sciences, 7 (2014), pp. 1309–1334.
[17] B. Han, Properties of discrete framelet transforms, Mathematical Modelling of Natural Phenomena, 8 (2013), pp. 18–47.
[18] B. Han, Framelets and Wavelets: Algorithms, Analysis, and Applications, Springer International Publishing, 2018.
[19] A. P. Johnson and C. L. Baker, First- and second-order information in natural images: a filter-based approach to image statistics, Journal of the Optical Society of America A, 21 (2004), pp. 913–925.
[20] Q. Li and N. Zhang, Fast proximity-gradient algorithms for structured convex optimization problems, Applied and Computational Harmonic Analysis, 41 (2016), pp. 491–517.
[21] Y.-R. Li, R. H. Chan, L. Shen, Y.-C. Hsu, and W.-Y. I. Tseng, An adaptive directional Haar framelet-based reconstruction algorithm for parallel magnetic resonance imaging, SIAM Journal on Imaging Sciences, 9 (2016), pp. 794–821.
[22] Y.-R. Li, D. Dai, and L. Shen, Multiframe super-resolution reconstruction using sparse directional regularization, IEEE Transactions on Circuits and Systems for Video Technology, 20 (2010), pp. 945–956.
[23] Y.-R. Li, L. Shen, D. Q. Dai, and B. W. Suter, Framelet algorithms for de-blurring images corrupted by impulse plus Gaussian noise, IEEE Transactions on Image Processing, 20 (2011), pp. 1822–1837.
[24] Y.-R. Li, L. Shen, and B. W. Suter, Adaptive inpainting algorithm based on DCT induced wavelet regularization, IEEE Transactions on Image Processing, 22 (2013), pp. 752–763.
[25] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60 (2004), pp. 91–110.
[26] M. Lysaker, A. Lundervold, and X.-C. Tai, Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time, IEEE Transactions on Image Processing, 12 (2003), pp. 1579–1590.
[27] D. Marr and E. Hildreth, Theory of edge detection, Proceedings of the Royal Society of London, B-207 (1980), pp. 187–217.
[28] C. A. Micchelli, L. Shen, and Y. Xu, Proximity algorithms for image models: denoising, Inverse Problems, 27 (2011), 045009 (30pp).
[29] L. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D, 60 (1992), pp. 259–268.
[30] S. Setzer, G. Steidl, and T. Teuber, Infimal convolution regularizations with discrete ℓ1-type functionals, Communications in Mathematical Sciences, 9 (2011), pp. 797–827.
[31] L. Shen, I. Kakadiaris, M. Papadakis, I. Konstantinidis, D. Kouri, and D. Hoffman, Image denoising using a tight frame, IEEE Transactions on Image Processing, 15 (2006), pp. 1254–1263.
[32] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing, 13 (2004), pp. 600–612.
[33]