Regularization with Multilevel Non-stationary Tight Framelets for Image Restoration
Yan-ran Li, Raymond H. F. Chan, Lixin Shen, Xiaosheng Zhuang
Yan-Ran Li∗, Raymond H. F. Chan†, Lixin Shen‡, Xiaosheng Zhuang§

February 9, 2021
Abstract
Variational regularization models are one of the popular and efficient approaches for image restoration. The regularization functional in the model carries prior knowledge about the image to be restored. This prior knowledge, in particular for natural images, comprises first-order (i.e., variance in luminance) and second-order (i.e., contrast and texture) information. In this paper, we propose a model for image restoration, using a multilevel non-stationary tight framelet system that can capture the image's first-order and second-order information. We develop an algorithm to solve the proposed model, and the numerical experiments show that the model is effective and efficient as compared to other higher-order models.
The restoration of a degraded image may be modeled as

    z = Ku + ε,  (1)

where u denotes the unknown image to be recovered, K a blurring matrix, z an observed blurred image, and ε the noise. In general, K is a singular or near-singular matrix, and hence the problem of finding the solution u from model (1) is ill-posed. To overcome the difficulties caused by the ill-posedness, regularization techniques such as total-variation regularization and multiscale regularization are often adopted; see [4, 6, 12, 14, 29] and the references therein. The resulting regularized image models have the following generic form:

    min_u { F(u) + α G(u) },  α > 0,  (2)

where α is the regularization parameter, F represents the data fidelity term, and G the regularization term. The fidelity term F measures the closeness of the estimate obtained from (2) to the data z, while the regularization functional G carries prior knowledge about the image to be restored.

∗ College of Computer Science and Software Engineering, Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen, 518060, P.R. China; SZU Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China. Email: [email protected]. Research supported in part by the Shenzhen R&D Program (JCYJ20180305124325555).
† Department of Mathematics, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong. Research supported in part by HKRGC Grants No. CUHK14301718, CityU Grant 9380101, CRF Grant C1007-15G, AoE/M-05/12. Email: [email protected].
‡ Department of Mathematics, Syracuse University, Syracuse, NY 13244, USA. Email: [email protected]. The work of L. Shen was supported in part by the National Science Foundation under grant DMS-1913039.
§ Department of Mathematics, City University of Hong Kong, Tat Chee Avenue, Kowloon Tong, Hong Kong. Email: [email protected]. Research was supported in part by the Research Grants Council of Hong Kong (Project no. CityU 11301419) and City University of Hong Kong (Project no. 7005497).

Our main focus in this paper is to choose a proper regularization G in (2) for image restoration. Here, a proper regularization means a regularization functional that encodes prior knowledge about the image to be restored. Prior knowledge about images, in particular for natural images, includes first-order (i.e., variance in luminance) and second-order (i.e., contrast and texture) information [19]. One commonly used regularization term that exploits the first-order information is the bounded variation semi-norm [29]

    G(u) := ∫_Ω |∇u|,  (3)

where the image u is defined on the bounded set Ω ⊂ R^d. The corresponding model (2), referred to as the total variation (TV) based image restoration model, performs remarkably well when the image to be reconstructed is piecewise constant. The total variation functional does not penalize discontinuities in images and thus allows us to recover the edges of the original image. However, it does not distinguish between jumps and smooth transitions, and therefore tends to give piecewise constant images with staircase artifacts. Due to this notable staircase phenomenon, the TV-based model is not suited for reconstructing images that are not nearly piecewise constant. It was pointed out in [13] that whereas the reconstruction generated with the first-order model will display jumps, the basic geometric structure of the original intensity surfaces is missing, even if it appears in the data.
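For a digital image, the continuous functional (3) is discretized with finite differences. A minimal sketch of the discrete anisotropic and isotropic TV (the function name `tv` and the toy images below are our own illustration, not from the paper):

```python
import numpy as np

def tv(u, isotropic=True):
    """Discrete total variation of a 2-D image via forward differences."""
    dx = np.diff(u, axis=1)[:-1, :]   # horizontal differences, cropped to a common shape
    dy = np.diff(u, axis=0)[:, :-1]   # vertical differences
    if isotropic:
        return np.sum(np.sqrt(dx**2 + dy**2))
    return np.sum(np.abs(dx)) + np.sum(np.abs(dy))

# A piecewise-constant image has a small TV relative to a noisy one,
# which is why TV regularization favors piecewise-constant restorations.
flat = np.zeros((32, 32))
flat[:, 16:] = 1.0                     # a single vertical edge
rng = np.random.default_rng(0)
noisy = flat + 0.1 * rng.standard_normal(flat.shape)
```

Because `flat` has one unit jump per row, its TV is small, while the noise in `noisy` inflates every local difference.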
It was further mentioned that, by using higher-order models, these artifacts from the first-order model can be eliminated and some of the fine geometric structures of the original image, particularly planar and quadric patches, can be recovered.

One of the earliest models using higher derivatives was proposed in [5], where the infimal convolution of the first- and second-order derivatives was taken as the regularizer:

    G(u) := inf_v ∫_Ω |∇u − ∇v| + α |∇(∇v)|.  (4)

It locally approximates the gradient of the function u by ∇v, which itself has a low total variation. Different second-order functionals for staircase reduction have been considered in other papers; see, for example, [8, 26]. Based on tensor algebra, a regularizer with derivatives of arbitrary order was introduced in [2]. The corresponding regularizer was called the total generalized variation (TGV). In particular, the TGV of second order is

    G(u) := min_v ∫_Ω ( α1 |∇u − v| + α0 |E(v)| ),  (5)

where the parameters α0, α1 are positive, and

    E(v) = [ ∂1 v1                (∂1 v2 + ∂2 v1)/2
             (∂1 v2 + ∂2 v1)/2    ∂2 v2 ]

with v1 and v2 being the components of v. Note that for twice differentiable u, E(∇u) is the Hessian of u. We note that the second-order TGV (5) is similar to, but structurally different from, the regularizer (4). The use of TGV and its variants in a plethora of applications has been reported in [1, 30] and the references therein.

Motivated by the fact that an image/signal naturally has a hierarchical structure and admits a multiscale representation, we exploit this structure to formulate a regularization term G in (2) that is different from the aforementioned ones. To this end, we first construct a two-level non-stationary tight framelet system that is suitable for representing the images to be restored.
More specifically, the tight framelet system in the first level is the directional Haar framelet (DHF) system introduced in our recent work [21], while the one in the second level is constructed from the discrete cosine transform (DCT). We then use the framelet coefficients of an image under this two-level non-stationary tight framelet system to formulate the regularization term G. More precisely, the framelet coefficients of the image with the DHF consist of the first-order information of the image in the vertical, horizontal, and ±45° directions. As a result, the regularization term G not only contains the TV term but also ameliorates it by including the diagonal information. The coarse approximation to the image resulting from the low-pass filter of the DHF, regarded as a smoothed version of the image, facilitates the extraction of the second-order information of the image. As shown in our previous work [24], the second-order information of the image can be reliably extracted from the DCT-based tight framelet coefficients of this smoothed image. Our proposed regularization term G also includes this second-order information. We remark that tight framelets have proven to be useful in image processing; see, e.g., [3, 7, 22, 23, 24, 31] and the references therein. Not only is our two-level non-stationary tight framelet system new, but the proposed regularization also differs from the existing ones in the following respects:

• Due to the DHF, our regularization assimilates the advantages of both the total variation regularization and other framelet regularizations, and remedies their drawbacks. On the one hand, the filters associated with the DHF have the shortest support among all tight framelet systems, and therefore the DHF can suppress the ringing artifacts arising from other framelet regularizations. In comparison, the filters associated with the 2-dimensional orthogonal Haar wavelet have the shortest support only among all compactly supported orthogonal wavelets. On the other hand, the diagonal first-order information provided by the DHF can reduce the staircase artifacts (or block effects) arising from the classical TV regularization. In comparison, the 2-dimensional orthogonal Haar wavelet only provides first-order information in the vertical and horizontal directions.

• We exploit the second-order information of the underlying image from its smoothed version rather than from the image itself. The main idea behind this is that the high-frequency spatial information of the image is suppressed in its smoothed version, and therefore the second-order information of the image can be computed faithfully, particularly for images with a high level of noise.

• Finally, the properties of the tight framelet can be easily exploited to develop computationally efficient algorithms and to analyze the convergence of the resulting algorithms.

To summarize, the proposed regularizer G contains the first- and second-order information of the image to be reconstructed for (2). The resulting optimization problem (2) can be efficiently solved, and the efficiency and accuracy of this regularizer will be confirmed for image restoration.

The rest of this paper is organized as follows. In Section 2 we first briefly review tight framelet systems; we then propose an image restoration model regularized by a two-level non-stationary tight framelet system and develop an algorithm to solve this model. The performance of the proposed model for image restoration is presented in Section 3.

2 Model and Algorithm with Multi-Level Non-Stationary Tight Framelets
This section consists of three parts. In the first part, we briefly review multi-level non-stationary tight framelet systems. In the second part, we propose our image restoration model using a two-level non-stationary tight framelet system. In the last part, we propose an algorithm to solve the resulting optimization problem.
Tight framelets are closely related to filter banks. A tight framelet filter bank can be used to (sparsely) represent data sequences through its associated discrete framelet transforms as well as its underlying discrete affine system [17]. Before proceeding to their connections, let us first recall some definitions and notation.

By l(Z^d) we denote the set of all sequences on Z^d and by l_0(Z^d) the set of all finitely supported sequences. A filter or mask h = {h(k)}_{k∈Z^d} : Z^d → C on Z^d is a sequence in l(Z^d). For a filter h ∈ l_0(Z^d), its Fourier series is defined to be ĥ(ξ) := Σ_{k∈Z^d} h(k) e^{−ik·ξ} for ξ ∈ R^d, which is a 2πZ^d-periodic trigonometric polynomial. In particular, by δ we denote the Dirac sequence such that δ(0) = 1 and δ(k) = 0 for all k ∈ Z^d \ {0}. Throughout the paper, we assume the tight framelets are dyadically dilated, that is, the dilation matrix is 2I_d with I_d the d × d identity matrix.

For filters τ_0, τ_1, ..., τ_s ∈ l_0(Z^d), we say that the filter bank {τ_0; τ_1, ..., τ_s} is a (d-dimensional dyadic) tight framelet filter bank if

    Σ_{ℓ=0}^{s} τ̂_ℓ(ξ) · conj( τ̂_ℓ(ξ + πω) ) = δ(ω),  ∀ ξ ∈ R^d, ω ∈ {0, 1}^d,  (6)

where for a number x ∈ C, conj(x) denotes its complex conjugate. Equation (6) is equivalent to the perfect reconstruction property of the discrete framelet transforms associated with the filter bank {τ_0; τ_1, ..., τ_s} ([18, Theorems 1.1.1 and 1.1.4]). The filter τ_0 is usually a low-pass filter satisfying τ̂_0(0) = 1, while the τ_ℓ's are high-pass filters satisfying τ̂_ℓ(0) = 0 for ℓ ≥ 1. Discrete framelet transforms associated with tight framelet filter banks are commonly used to exploit the sparsity of the data. Moreover, in signal/image processing, the translation invariance of a discrete framelet transform is desirable, especially in the scenario of signal denoising/inpainting.
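Condition (6) can be checked numerically. A small sketch for the 1-D Haar filter bank {τ0; τ1} = {[1/2, 1/2]; [1/2, −1/2]} (our own illustration, not taken from the paper):

```python
import numpy as np

# Verify the tight framelet filter bank condition (6) in dimension d = 1:
#   sum_l  tau_l^(xi) * conj( tau_l^(xi + pi*w) ) = delta(w),  w in {0, 1},
# for the Haar pair tau0 = [1/2, 1/2] (low-pass), tau1 = [1/2, -1/2] (high-pass).
def fourier(h, xi):
    k = np.arange(len(h))
    return np.sum(h * np.exp(-1j * k * xi))

tau = [np.array([0.5, 0.5]), np.array([0.5, -0.5])]
xis = np.linspace(-np.pi, np.pi, 64)

def lhs(xi, w):
    return sum(fourier(h, xi) * np.conj(fourier(h, xi + np.pi * w)) for h in tau)
```

For w = 0 the sum should be identically 1 (partition of unity); for w = 1 it should vanish, which encodes alias cancellation.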
To preserve the translation invariance property, one usually considers the redundant version of the discrete framelet transform, that is, the undecimated discrete framelet transform (UDFmT). More precisely, denote a filter bank at level j by η_j := {τ_0^j; τ_1^j, ..., τ_{s_j}^j} and consider a sequence {η_j : j = 1, ..., J} = ∪_{j=1}^{J} {τ_0^j; τ_1^j, ..., τ_{s_j}^j} of J filter banks, with j = J being the finest level and j = 1 the coarsest level. Let the convolution operation ∗ be defined by [h ∗ v](γ) := Σ_{k∈Z^d} h(γ − k) v(k) for v ∈ l(Z^d), h ∈ l_0(Z^d), γ ∈ Z^d, and the upsampling operator ↑m with m ∈ N be given by

    [v ↑ m](γ) := v(m^{−1}γ) if m^{−1}γ ∈ Z^d, and 0 otherwise.

For a filter h, let h⋆ be the filter defined by h⋆(k) = conj(h(−k)), k ∈ Z^d. Then, for a given input data sequence v = v^J, the UDFmT includes (i) Decomposition:

    v^{j−1} = v^j ∗ ((τ_0^j)⋆ ↑ 2^{J−j}),  w_ℓ^{j−1} = v^j ∗ ((τ_ℓ^j)⋆ ↑ 2^{J−j}),  ℓ = 1, ..., s_j,  j = J, ..., 1,  (7)

and (ii) Reconstruction:

    v^j = v^{j−1} ∗ (τ_0^j ↑ 2^{J−j}) + Σ_{ℓ=1}^{s_j} w_ℓ^{j−1} ∗ (τ_ℓ^j ↑ 2^{J−j}),  j = 1, ..., J.  (8)

One can show that if each filter bank {τ_0^j; τ_1^j, ..., τ_{s_j}^j} satisfies the partition of unity condition Σ_{ℓ=0}^{s_j} |τ̂_ℓ^j(ξ)|² = 1, ξ ∈ R^d, then any input data sequence v ∈ l(Z^d) can be perfectly reconstructed via (8) from its framelet coefficient sequences {v^0} ∪ {w_ℓ^{j−1} : ℓ = 1, ..., s_j}_{j=1}^{J} decomposed via (7). The framelet system associated with such a sequence {η_j : j = 1, ..., J} is then called a multi-level non-stationary tight framelet system.

In this paper, we consider J = 2, that is, a two-level non-stationary tight framelet system. One can of course consider J > 2. However, in terms of efficiency and simplicity, J = 2 is the best choice for the development of this paper.

In this subsection, we integrate two different tight framelet systems into a two-level non-stationary tight framelet system, which will be exploited in the optimization problem (2). The tight framelet system in the first level is the directional Haar framelet (DHF) system proposed in [21]. The filters associated with this DHF are

    τ_0 = (1/4)[1 1; 1 1],   τ_1 = (1/4)[0 −1; 1 0],   τ_2 = (1/4)[−1 0; 0 1],
    τ_3 = (1/4)[1 −1; 0 0],  τ_4 = (1/4)[1 0; −1 0],   τ_5 = (1/4)[0 0; 1 −1],  τ_6 = (1/4)[0 1; 0 −1].

As two-dimensional filters, the indices of the entries (top-left, top-right, bottom-left, and bottom-right) in each filter are (0,0), (1,0), (0,1), and (1,1). The filter τ_0 is a low-pass filter, and the rest are high-pass filters that provide directional information of an image when applied to it. More precisely, the filters τ_1 and τ_2 act as first-order difference operators in the 45° and 135° directions, respectively. The results of these two filters convolving with an image highlight changes in intensity of the image in these two diagonal directions. The filters τ_3 and τ_5 are first-order difference operators in the horizontal direction, while the filters τ_4 and τ_6 are first-order difference operators in the vertical direction. The convolutions of these filters with the underlying image are the coefficients of the image under the corresponding filters, which are the multiplications of the associated transformation matrices with the image.

Now we propose a generic regularization term based on the DHF. Let u ∈ R^n be the vector representing the column-stacked version of an image. We denote by M_κ the associated matrix representation of the filter τ_κ, κ = 0, 1, ..., 6, under a proper boundary condition. We further denote

    B_ℓ1 := M_0  and  B_h1 := [M_1^⊤, ..., M_6^⊤]^⊤.  (9)

By the tight frame property of {τ_0; τ_1, ..., τ_6}, these two matrices satisfy the following perfect reconstruction condition: B_ℓ1^⊤ B_ℓ1 + B_h1^⊤ B_h1 = I.

Let Φ_1 : R^{6n} → R be defined through a function ϕ : R^6 → R and a non-negative parameter vector Λ = [λ_1, λ_2, ..., λ_n] as follows:

    Φ_1(v) := Σ_{i=1}^{n} λ_i ϕ(v_i, v_{i+n}, ..., v_{i+5n}).  (10)

With this function Φ_1, we propose a functional based on the DHF in the following form:

    G_1(u) := Φ_1(B_h1 u),  (11)

from which the TV regularization and its variants can be derived by properly choosing ϕ in (10). For example, if we choose ϕ(x_1, ..., x_6) = |x_3| + |x_4|, the regularization in (11) reduces to the so-called anisotropic TV; if we choose ϕ(x_1, ..., x_6) = sqrt(|x_3|² + |x_4|²), the regularization in (11) reduces to the so-called isotropic TV. We choose, in this paper,

    ϕ(x_1, ..., x_6) = sqrt(|x_1|² + |x_2|²) + sqrt(|x_3|² + |x_4|²).  (12)

One of the advantages of the regularization G_1 with ϕ given in (12) is that it assimilates the advantages of both total variation and wavelet regularizations and remedies their drawbacks. The way to avoid or suppress the ringing artifacts arising from wavelet regularizations is to choose a wavelet system whose filters have small supports. The filters associated with the 2-dimensional orthogonal Haar wavelet have the shortest support among all compactly supported orthogonal wavelets, but staircase artifacts (or blocky effects) appear in the neighborhoods of edges in the directions about ±45°. Since ϕ in (12) includes the diagonal first-order information from the filters τ_1 and τ_2, the staircase artifacts can be reduced.

The tight framelet in the second level is generated from the standard 3 × 3 discrete cosine transform, whose rows are c_0 = (1/√3)[1, 1, 1], c_1 = (1/√2)[1, 0, −1], and c_2 = (1/√6)[1, −2, 1], via τ_{3i+j} = (1/3) c_i^⊤ c_j with i, j ∈ {0, 1, 2}, where τ_0 is the low-pass filter and the others are high-pass filters.
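As a sanity check, a DHF-style filter bank of one 2×2 averaging filter plus six directional differences satisfies the partition-of-unity condition Σ_ℓ |τ̂_ℓ(ξ)|² = 1, which is what makes the undecimated analysis operator satisfy B_ℓ1^⊤B_ℓ1 + B_h1^⊤B_h1 = I. A sketch under periodic boundary conditions (the filter values are our rendering of the DHF, and the variable names are ours):

```python
import numpy as np

# Seven 2x2 filters: low-pass average, two diagonal differences (45 and 135
# degrees), two horizontal differences, and two vertical differences.
q = 0.25
dhf = [q * np.array([[1, 1], [1, 1]]),    # tau0: low-pass
       q * np.array([[0, -1], [1, 0]]),   # tau1: 45-degree difference
       q * np.array([[-1, 0], [0, 1]]),   # tau2: 135-degree difference
       q * np.array([[1, -1], [0, 0]]),   # tau3: horizontal difference (top row)
       q * np.array([[1, 0], [-1, 0]]),   # tau4: vertical difference (left column)
       q * np.array([[0, 0], [1, -1]]),   # tau5: horizontal difference (bottom row)
       q * np.array([[0, 1], [0, -1]])]   # tau6: vertical difference (right column)

N = 16
# Partition of unity: the squared magnitude responses of all filters sum to 1
# at every discrete frequency.
spectrum = sum(np.abs(np.fft.fft2(f, (N, N)))**2 for f in dhf)
```

With periodic boundary conditions, this pointwise identity in the frequency domain is exactly the matrix identity B_ℓ1^⊤B_ℓ1 + B_h1^⊤B_h1 = I.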
Here, for simplicity of notation, we use τ_κ to denote the filters associated with both the DHF and the DCT-based tight framelet. The expansions of these filters are

    τ_0 = (1/9) [1 1 1; 1 1 1; 1 1 1],
    τ_1 = 1/(3√6) [1 0 −1; 1 0 −1; 1 0 −1],
    τ_2 = 1/(9√2) [1 −2 1; 1 −2 1; 1 −2 1],
    τ_3 = 1/(3√6) [1 1 1; 0 0 0; −1 −1 −1],
    τ_4 = (1/6) [1 0 −1; 0 0 0; −1 0 1],
    τ_5 = 1/(6√3) [1 −2 1; 0 0 0; −1 2 −1],
    τ_6 = 1/(9√2) [1 1 1; −2 −2 −2; 1 1 1],
    τ_7 = 1/(6√3) [1 0 −1; −2 0 2; 1 0 −1],
    τ_8 = (1/18) [1 −2 1; −2 4 −2; 1 −2 1].

The filters τ_1 and τ_3, known as the Prewitt operators in image processing, are used to compute an approximation of the gradient (i.e., the first-order information) of the image intensity function. The convolution of τ_1 (resp. τ_3) with an image gives the horizontal (resp. vertical) changes of the image intensity, and they compute changes of intensity with smoothing due to τ_1 = (1/3) c_0^⊤ c_1 and τ_3 = (1/3) c_1^⊤ c_0. The filter τ_6 (resp. τ_2) computes the discrete second-order difference in the vertical (resp. horizontal) direction with smoothing due to τ_6 = (1/3) c_2^⊤ c_0 and τ_2 = (1/3) c_0^⊤ c_2. The other filters τ_4, τ_5, τ_7, and τ_8 perform like discrete higher-order difference operators.

We should note that first-order derivative operators exaggerate the effects of noise, while second-order derivatives exaggerate noise twice as much [15]. Therefore, the applicability of second-order derivatives is limited to images with low noise levels. Motivated by the Laplacian of Gaussian (LoG) and difference of Gaussians (DoG) operators in computer vision (see, for example, [25, 27]), we propose to take the second-order derivatives on the blurred or smoothed image in order to reduce the effect of the presence of noise in an image. To this end, we denote by P_κ the matrix representation of the filter τ_κ, κ = 0, 1, ..., 8, under a proper boundary condition. Let us define

    B_ℓ2 := P_0  and  B_h2 := [P_1^⊤, ..., P_8^⊤]^⊤.  (13)

We have that B_ℓ2^⊤ B_ℓ2 + B_h2^⊤ B_h2 = I. Let Φ_2 : R^{8n} → R be defined through a parameter sequence Θ = {θ_i = (θ_{i1}, θ_{i2}, ..., θ_{i8}) ∈ R^8 : 1 ≤ i ≤ n} with non-negative elements as follows:

    Φ_2(v) := Σ_{i=1}^{n} ‖[θ_{i1} v_i, θ_{i2} v_{i+n}, ..., θ_{i8} v_{i+7n}]‖_1,  (14)

where ‖·‖_1 denotes the ℓ_1 norm. With this function Φ_2, we propose a functional based on the DCT-based tight framelet system in the following form:

    G_2(u) := Φ_2(B_h2 B_ℓ1 u),  (15)

where B_ℓ1 u is viewed as the smoothed version of u.

All together, our proposed image restoration model is

    min_u { F(u) + G_1(u) + G_2(u) }.  (16)

The efficiency of the regularization functional G_1(u) + G_2(u) in (16) will be presented in Section 3, where it is compared with several possible regularization functionals formulated from the DHF and the DCT-based tight framelet, and with other existing higher-order regularization functionals.

In this subsection, we specify the data fidelity term F in (16). For Gaussian noise, the natural choice for F is F(u) = (1/2)‖Ku − z‖², where ‖·‖ denotes either the vector 2-norm or the matrix 2-norm. That is, the optimization problem we consider here is

    min_{u ∈ [0,1]^n} (1/2)‖Ku − z‖² + Φ_1(B_h1 u) + Φ_2(B_h2 B_ℓ1 u),  (17)

where Φ_1 is given in (10) and Φ_2 is given in (14). Here, we assume that all pixel values of an image are in [0, 1].

The class of all lower semicontinuous convex functions f : R^d → (−∞, +∞] such that dom f := {x ∈ R^d : f(x) < +∞} ≠ ∅ is denoted by Γ_0(R^d). The indicator function of a closed convex set C in R^d is defined, at u ∈ R^d, as

    ι_C(u) := 0 if u ∈ C, and +∞ otherwise.

Clearly, the indicator function ι_C is in Γ_0(R^d) for any closed nonempty convex set C. For a function f ∈ Γ_0(R^d), the proximity operator of f with parameter λ, denoted by prox_{λf}, is a mapping from R^d to itself, defined for a given point x ∈ R^d by

    prox_{λf}(x) := argmin { (1/2)‖u − x‖² + λ f(u) : u ∈ R^d }.
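For intuition, the proximity operator of the absolute value with parameter λ is the classical soft-thresholding operator. A quick check of the definition against brute-force minimization of the objective (1/2)(u − x)² + λ|u| (function names are ours):

```python
import numpy as np

def prox_abs(x, lam):
    """prox of lam*|.| in 1-D: the soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_brute(x, lam):
    """Brute-force minimizer of the defining objective on a fine grid."""
    grid = np.linspace(-5, 5, 200001)
    return grid[np.argmin(0.5 * (grid - x)**2 + lam * np.abs(grid))]
```

The two agree up to the grid resolution, confirming that soft-thresholding solves the prox subproblem exactly.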
We also need the notion of the conjugate. The conjugate of f ∈ Γ_0(R^d) is the function f* ∈ Γ_0(R^d) defined at x ∈ R^d by f*(x) := sup { ⟨u, x⟩ − f(u) : u ∈ R^d }. A key property relating the proximity operators of f and its conjugate is

    prox_{λf}(x) + λ prox_{λ^{−1}f*}(x/λ) = x,  (18)

which holds for all x ∈ R^n and any λ > 0. Set

    f(u) = (1/2)‖Ku − z‖²,  g(u) = ι_{[0,1]^n}(u),  p(s) = Φ_1(s_1) + Φ_2(s_2),  and  A = [B_h1; B_h2 B_ℓ1],  (19)

where u ∈ R^n and s = (s_1, s_2) with s_1 ∈ R^{6n} and s_2 ∈ R^{8n}. Then our optimization problem (17) can be viewed as a special case of the optimization problem whose objective function is the sum of three lower semicontinuous convex functions, in the form of

    min_{u ∈ R^n} f(u) + g(u) + p(Au),  (20)

where A is a d × n matrix, f ∈ Γ_0(R^n) is differentiable, g ∈ Γ_0(R^n), and p ∈ Γ_0(R^d).

Several algorithms have been developed for the optimization problem (20); see, for example, [10, 11, 20, 33]. We adopt the algorithm given in [33] for problem (20) since it converges under a much weaker condition and can take a larger step size, yielding faster convergence. This algorithm, named the Primal-Dual Three-Operator splitting (PD3O), has the following iteration:

    u^k = prox_{γg}(v^k),  (21a)
    s^{k+1} = prox_{δp*}( (I − γδAA^⊤) s^k + δA(2u^k − v^k − γ∇f(u^k)) ),  (21b)
    v^{k+1} = u^k − γ∇f(u^k) − γA^⊤ s^{k+1}.  (21c)

One PD3O iteration can be viewed as an operator T_PD3O such that (v^{k+1}, s^{k+1}) = T_PD3O(v^k, s^k). The convergence analysis of PD3O is given in the following lemma.

Lemma 1 (Sublinear convergence rate [33]). Let f ∈ Γ_0(R^n) with its gradient Lipschitz continuous with constant L, let g ∈ Γ_0(R^n), and p ∈ Γ_0(R^d). Choose γ and δ such that γ < 2/L and M = (γ/δ)(I − γδAA^⊤) is positive definite. Let (v*, s*) be any fixed point of T_PD3O, and {(v^k, s^k)}_{k≥0} be the sequence generated by PD3O. Define ‖(v, s)‖_M := sqrt( ‖v‖² + ⟨s, M s⟩ ). Then, the following statements hold.

(i) The sequence {‖(v^k, s^k) − (v*, s*)‖_M}_{k≥0} is monotonically nonincreasing.

(ii) The sequence {‖(v^{k+1}, s^{k+1}) − (v^k, s^k)‖_M}_{k≥0} is monotonically nonincreasing. Moreover, ‖(v^{k+1}, s^{k+1}) − (v^k, s^k)‖²_M = o(1/(k+1)).
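A minimal PD3O sketch of iteration (21a)-(21c) on a 1-D toy problem (denoising with a TV-like penalty, where a forward-difference matrix D stands in for A; all names and parameter values here are our own illustration, not the paper's setting):

```python
import numpy as np

# Toy problem:  min_u  0.5*||u - z||^2 + i_{[0,1]}(u) + lam*||D u||_1.
rng = np.random.default_rng(1)
n, lam = 64, 0.5
z = np.clip(np.repeat([0.2, 0.8], n // 2) + 0.1 * rng.standard_normal(n), 0, 1)
D = np.diff(np.eye(n), axis=0)            # (n-1) x n forward differences

grad_f = lambda u: u - z                  # f(u) = 0.5*||u - z||^2, L = 1
prox_g = lambda v: np.clip(v, 0, 1)       # g = indicator of [0,1]^n
# p = lam*||.||_1, so prox_{delta p*} is the projection onto the inf-ball of radius lam.
prox_ps = lambda s: np.clip(s, -lam, lam)

gamma, delta = 1.0, 0.2                   # gamma < 2/L and gamma*delta*||D||^2 < 1
v, s = z.copy(), np.zeros(n - 1)
obj = lambda u: 0.5 * np.sum((u - z)**2) + lam * np.sum(np.abs(D @ u))
start = obj(prox_g(v))

for _ in range(300):
    u = prox_g(v)                                         # (21a)
    w = s + delta * (D @ (2*u - v - gamma*grad_f(u))) \
          - gamma * delta * (D @ (D.T @ s))               # argument of (21b)
    s = prox_ps(w)                                        # (21b)
    v = u - gamma * grad_f(u) - gamma * (D.T @ s)         # (21c)
u = prox_g(v)
```

The iterate stays feasible (in [0, 1]) by construction of (21a), and the objective of the final iterate drops well below that of the starting point.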
To adapt PD3O for our optimization problem (17), with f, g, p and the matrix A given in (19), some preparations are provided in the following lemmas.

Lemma 2.
Let δ > 0 and Φ_1 be given in (10). For any v ∈ R^{6n}, if y = prox_{δ^{−1}Φ_1}(v), then

    y^{(i)} = prox_{δ^{−1}λ_i ϕ}(v^{(i)}),  (22)

where y^{(i)} = [y_i, y_{i+n}, ..., y_{i+5n}]^⊤ and v^{(i)} = [v_i, v_{i+n}, ..., v_{i+5n}]^⊤. Furthermore, let y^{(ij)} and v^{(ij)} denote y_{i+(j−1)n} and v_{i+(j−1)n}, respectively, for i = 1, ..., n and j = 1, ..., 6. Then

    [y^{(i1)}; y^{(i2)}] = ( 1 − λ_i δ^{−1} / max{ ‖[v^{(i1)}, v^{(i2)}]‖_2, λ_i δ^{−1} } ) [v^{(i1)}; v^{(i2)}],

where the pair (y^{(i3)}, y^{(i4)}) is obtained by simply replacing (v^{(i1)}, v^{(i2)}) on the right-hand side of the above formula by (v^{(i3)}, v^{(i4)}), and y^{(i5)} = v^{(i5)}, y^{(i6)} = v^{(i6)}.

Proof. The proof is based on the block separable property of Φ_1 in (10). By the definition of the proximity operator and equations (10) and (12),

    prox_{δ^{−1}Φ_1}(v) = argmin { (1/2)‖u − v‖² + δ^{−1}Φ_1(u) : u ∈ R^{6n} }
                        = argmin { Σ_{i=1}^{n} ( (1/2)‖u^{(i)} − v^{(i)}‖² + δ^{−1}λ_i ϕ(u^{(i)}) ) : u^{(i)} ∈ R^6, i = 1, ..., n }.

Hence, equation (22) holds. Notice that ϕ is also a block separable function. By using the definition of the proximity operator again and the proximity operator of the ℓ_2 norm (see, for example, [9, 28]), we obtain the explicit expression for y^{(i)} as given above.

Lemma 3.
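The pairwise formula in Lemma 2 is the block (group) soft-thresholding operator, i.e., the proximity operator of t‖·‖_2 applied to a coefficient pair. A small sketch that also checks the optimality condition of the prox subproblem (our own illustration):

```python
import numpy as np

def block_soft(v, t):
    """Proximity operator of t*||.||_2 applied to the vector v (block soft-thresholding)."""
    scale = 1.0 - t / max(np.linalg.norm(v), t)
    return scale * v

# Optimality check for a nonzero output y of argmin 0.5*||y - v||^2 + t*||y||_2:
# the first-order condition reads  y + t*y/||y|| = v.
v, t = np.array([3.0, 4.0]), 2.0
y = block_soft(v, t)
```

When ‖v‖_2 ≤ t the whole block is set to zero, which is how pairs of small DHF coefficients are suppressed jointly rather than componentwise.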
Let δ > 0 and Φ_2 be given in (14). For any v ∈ R^{8n}, if y = prox_{δ^{−1}Φ_2}(v), then

    y^{(i)} = prox_{δ^{−1}‖·‖_1 ∘ diag(θ_i)}(v^{(i)}),  (23)

where y^{(i)} = [y_i, y_{i+n}, ..., y_{i+7n}]^⊤ and v^{(i)} = [v_i, v_{i+n}, ..., v_{i+7n}]^⊤. Furthermore, let y^{(ij)} and v^{(ij)} denote y_{i+(j−1)n} and v_{i+(j−1)n}, respectively, for i = 1, ..., n and j = 1, ..., 8. Then

    y^{(ij)} = max{ |v^{(ij)}| − δ^{−1}θ_{ij}, 0 } sgn(v^{(ij)}).

Proof.
The proof is based on the block separable property of Φ_2 in (14). By the definition of the proximity operator,

    prox_{δ^{−1}Φ_2}(v) = argmin { (1/2)‖u − v‖² + δ^{−1}Φ_2(u) : u ∈ R^{8n} }
                        = argmin { Σ_{i=1}^{n} ( (1/2)‖u^{(i)} − v^{(i)}‖² + δ^{−1}‖diag(θ_i) u^{(i)}‖_1 ) : u^{(i)} ∈ R^8, i = 1, ..., n }.

Hence, equation (23) holds. Furthermore, noticing that prox_{δ^{−1}‖·‖_1 ∘ diag(θ_i)} is the well-known soft-thresholding operator, the rest of the result holds.

Lemma 4.
Let Φ_1 be given in (10) and Φ_2 be given in (14). For any v ∈ R^{14n}, write v = (v_1, v_2) with v_1 ∈ R^{6n} and v_2 ∈ R^{8n}, and define p(v) = Φ_1(v_1) + Φ_2(v_2). Then, for any δ > 0,

    prox_{δ^{−1}p}(v) = prox_{δ^{−1}Φ_1}(v_1) × prox_{δ^{−1}Φ_2}(v_2).

The proof is similar to those of Lemmas 2 and 3 due to the separability of p; therefore, we omit it here.

To apply Lemma 1 to problem (17), we verify all the requirements listed in Lemma 1. First, for the function f in (19), we have ∇f(u) = K^⊤(Ku − z), so the gradient of f is ‖K‖²-Lipschitz continuous. Next, we discuss the positive definiteness of the matrix I − γδAA^⊤.

Lemma 5.
Let A be given in (19). Then, for positive numbers γ and δ, the matrix I − γδAA^⊤ is positive semidefinite (resp. definite) if and only if γδ ≤ 1 (resp. γδ < 1).

Proof. First, we show that ‖A‖ = 1. For any u ∈ R^n, we have

    u^⊤A^⊤Au = u^⊤ B_h1^⊤ B_h1 u + u^⊤ B_ℓ1^⊤ B_h2^⊤ B_h2 B_ℓ1 u.

Since B_h1^⊤B_h1 + B_ℓ1^⊤B_ℓ1 = I and B_h2^⊤B_h2 + B_ℓ2^⊤B_ℓ2 = I, from the above we have

    u^⊤A^⊤Au ≤ u^⊤ B_h1^⊤ B_h1 u + u^⊤ B_ℓ1^⊤ B_ℓ1 u = u^⊤u.

Hence ‖A‖ ≤ 1. Further, ‖Au‖² = ‖u‖² − ‖B_ℓ2 B_ℓ1 u‖², and since the null space of B_ℓ2 B_ℓ1 is non-trivial, we have ‖A‖ = 1. Next, since AA^⊤ is positive semidefinite and its largest eigenvalue is 1, the matrix I − γδAA^⊤ is positive semidefinite (resp. definite) if and only if γδ ≤ 1 (resp. γδ < 1).

The proximity operator prox_{δ^{−1}p} is given in Lemma 4 with the help of Lemmas 2 and 3. Therefore, the proximity operator prox_{δp*} can be computed via (18). With the above preparation, the complete procedure for solving (17) based on (21a)-(21c) is described in Algorithm 1. This algorithm is referred to as the TNTF (two-level non-stationary tight framelet) algorithm.

Algorithm 1 Two-level Non-stationary Tight Framelet (TNTF) Algorithm

Set parameters γ < 2/‖K‖² and γδ < 1; pre-given parameters Λ and Θ. Initialize v^0 = 0 and s^0 = 0, write s^k = (s_1^k, s_2^k), and let x^k be an auxiliary variable.
for k = 0, 1, ... do
    u^k = Proj_{[0,1]^n}(v^k)  (24a)
    x^k = γ(B_h1^⊤ s_1^k + B_ℓ1^⊤ B_h2^⊤ s_2^k) − (2u^k − v^k) + γK^⊤(Ku^k − z)  (24b)
    s_1^{k+1} = (s_1^k − δ B_h1 x^k) − δ · prox_{δ^{−1}Φ_1}( δ^{−1}(s_1^k − δ B_h1 x^k) )  (24c)
    s_2^{k+1} = (s_2^k − δ B_h2 B_ℓ1 x^k) − δ · prox_{δ^{−1}Φ_2}( δ^{−1}(s_2^k − δ B_h2 B_ℓ1 x^k) )  (24d)
    v^{k+1} = u^k − γK^⊤(Ku^k − z) − γ(B_h1^⊤ s_1^{k+1} + B_ℓ1^⊤ B_h2^⊤ s_2^{k+1})  (24e)
end for

The convergence analysis for Algorithm 1 is as follows.
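Steps (24c) and (24d) are exactly the Moreau identity (18) applied with f = p*: prox_{δp*}(x) = x − δ · prox_{δ^{−1}p}(x/δ). A scalar sketch for p = λ|·|, whose conjugate prox is the projection onto [−λ, λ] (names and values ours):

```python
import numpy as np

lam, delta = 0.8, 2.5
# prox of t*p for p = lam*|.| is soft-thresholding with threshold t*lam.
prox_p = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t * lam, 0)
# prox of delta*p* computed directly: projection onto [-lam, lam].
proj = lambda x: np.clip(x, -lam, lam)

x = np.linspace(-3, 3, 101)
# Same operator obtained through the Moreau identity (18):
via_moreau = x - delta * prox_p(x / delta, 1.0 / delta)
```

The two evaluations coincide, which is why Algorithm 1 never needs the conjugate functions Φ_1* and Φ_2* explicitly.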
Theorem 1.
Let ( v ∗ , s ∗ ) be any fixed point of T P D O with f , g , p and A given in (19) . Let { ( v k , s k ) } k > be the sequence generated by Algorithm 1, where s k = ( s k , s k ) . Choose γ and δ suchthat γ < / k K k and γδ < . Define M = γδ ( I − γδAA ⊤ ) . Then, the following statements hold.(i) The sequence { ( k ( v k , s k ) − ( v ∗ , s ∗ ) k M ) } k > is monotonically nonincreasing. ii) The sequence { ( k ( v k +1 , s k +1 ) − ( v k , s k ) k M ) } k > is monotonically nonincreasing. Moreover, k ( v k +1 , s k +1 ) − ( v k , s k ) k M = o (cid:16) k +1 (cid:17) .Proof. We know that the gradient of f in (19) is k K k -Lipschitz continuous. By Lemma 5,the matrix M is positive definite if γδ <
1, the result of this theorem follows immediately fromLemma 1.Remark: the computational cost of Algorithm 1 depends on mainly two factors: the UDFmTused in steps (24b-e) and the total number of iterations for k . The UDFmT decompositionsinclude B h x k , B h B ℓ x k in (24c-d) while the UDFmT reconstructions include B ⊤ h s k , B T ℓ B ⊤ h s k in (24b) and B ⊤ h s k +11 and B ⊤ ℓ B ⊤ h s k +22 in (24e). Since UDFmT in Algorithm 1 uses convolutionswith 7 DHF filters in the first level and 9 DCT-based filters in the second level, the UDFmT canbe implemented with computational cost O ( n ), where n is the number of pixels in u . For thetotal number of iterations k in Algorithm 1, it depends on when the algorithm converges and themaximum number K of iterations set manually. Consequently, the total computational cost forAlgorithm 1 is O ( K n ).Finally, we discuss how to choose the parameters in the algorithm. In our tests below we choose γ = 1 . , δ = 0 . γδ <
1. Recall from (12) that we only use the first four subbandcoefficients of DHF in the first level. The corresponding regularization parameters λ i (defined in(10)) are chosen to adaptively adjust to local variations. Let I ( i ) be the set containing all indicesin the neighborhood at the i th pixel. Then λ i is set as λ i = λ × |I ( i ) | max { P p ∈I ( i ) k w p k , − } , (25)where w p = [ v ( p , v ( p ] ⊤ or w p = [ v ( p , v ( p ] ⊤ are defined as in (22). In our tests, we choose theneighborhood of window size 3 × λ is set by hand.For the regularization parameters θ i associated with the DCT-based tight framelet coefficients(see (14)), they are all automatically estimated and updated using the approach in our previouswork [24]. More precisely, for the regularization parameters θ iκ , κ = 1 , . . . ,
8, used in (14), they areautomatically estimated according to the local variations of framelet coefficients and noise level.Suppose the ǫ in model (1) is the Gaussian noise with the standard deviation σ . As it was done inour previous work [24], σ κ the noise variance of the framelet coefficients coming from the filter τ κ at the second decomposed level is estimated as σ κ = σ k τ κ k F , where k τ κ k F is the Frobenius normof τ κ ; ( σ κi ) , the local signal variance of the i th framelet coefficients coming from the filter κ th, iscomputed as ( σ κi ) = max { ( P p ∈I κ ( i ) | v p | / |I κ ( i ) | ) − σ κ , − } , where I κ ( i ) is the set containingall indices in the neighborhood at the i th framelet coefficients from the filter κ . With them, theregularization parameters are estimated as θ iκ = √ σ κ σ κi , κ = 1 , . . . , . (26)To save computational cost of estimating parameters λ i in (25) and θ iκ in (26), we only updatethese parameters when the iteration k is a multiple of 30 and fix them after the 200th iteration inour numerical experiments. In this section, we present numerical experiments to illustrate the effectiveness and efficiency ofour proposed model (17) for image restoration. We use the images “Square Circle”, “Cameraman”,11nd “Montage” of size 256 ×
256 as the original images u in our experiments, see Figure 1. Thepixel values of these images are normalized to the interval [0 , e u , is evaluated in terms of the peak-signal-to-noise ratio (PSNR) that is defined byPSNR := 10 log n k e u − u k , where n is the number of pixels in u . To incorporate structural information in image comparisons,the metric of structural similarity (SSIM) [32] of e u to u is reported as well. The higher the PSNRand SSIM, the better the quality of the restored image.(a) (b) (c)Figure 1: Original image: (a) Square Circle; (b) Cameraman; (c) Montage.Two sets of comparisons for image restoration will be conducted in this section. The firstset is to compare with other tight frame regularizers. The second set is to compare with somederivative-based models. Here we compare the proposed regularization functional (16) with two other tight frame regular-ization functionals G DCT and G DHF+DCT while using the classical TV regularizer G TV (see (3)) asa benchmark. The G DCT is defined as G DCT ( u ) = Φ ( B h u ) which only uses the DCT-based tightframelet and takes the first- and second-order information on the image u , where Φ is given in(14). The G DHF+DCT is defined as G DHF+DCT ( u ) = Φ ( B h u ) + Φ ( B h u ), where Φ is givenin (11) with ϕ in (12). The main difference between our proposed regularization functional (16)and G DHF+DCT is that the action B h takes on the smoothed image B ℓ u for our regularizationfunctional while the action B h takes directly on the image u for G DHF+DCT .In our experiment, the image of “Square Circle” in Figure 1(a) (which is the same as Fig-ure 2(a)) is blurred by a 5 × fspecial(’average’,[5:5]) , followed by adding Gaussian noise of mean zero and standard deviation σ = 0 .
04. The values of the pair (PSNR, SSIM) for the images restored by G_TV, G_DCT, G_DHF+DCT, and the proposed regularization functional (16) are (33.66 dB, 0.962), (32.54 dB, 0.970), (33.14 dB, 0.980), and (35.00 dB, 0.980), respectively. To examine the visual quality of the restored images, the square portion marked in Figure 1(a) is displayed in Figure 2. From the regions indicated by the two arrows, we can see that the proposed regularization functional (16) produces restored images of better visual quality than the others. The results clearly show that our combined tight frame model outperforms the other intuitive tight frame models.

Figure 2: (a) A region of the image “Square Circle”; (b) the image blurred by the kernel fspecial('average',[5 5]) with Gaussian noise of mean zero and standard deviation σ = 0.04; the restored images with regularization (c) TV with α = 0.
04 (see (2) and (3)); (d) G_DCT; (e) G_DHF+DCT; and (f) the proposed regularization functional (16) with λ = 0.

Comparison with Derivative-based Regularizers

Now we give a comprehensive comparison between our model (17) and the TV and TGV models. The TV model uses G(u) in (3) as its regularization term, while the TGV model uses G(u) in (5) as its regularization term. The software for the TGV model was provided by the authors of [16]. All algorithms are carried out until the relative change ‖u^(k+1) − u^(k)‖ / ‖u^(k)‖ falls below a prescribed tolerance or the maximal number of iterations, 400, is reached.

In our experiments, the test images in Figure 1 are blurred by a 5 × 5 averaging kernel generated by fspecial('average',[5 5]), followed by adding Gaussian noise of mean zero and standard deviation σ. For different values of σ, the PSNR and SSIM values of the images restored by TV, TGV, and our TNTF are reported in Table 1. The highest values of PSNR and SSIM for each σ and each test image are highlighted. The table clearly shows that our proposed TNTF performs the best in terms of both PSNR and SSIM. We remark that the regularization parameters in the second level of our TNTF are automatically estimated based on the approach in our work [24].

Table 1: The PSNR (dB) and SSIM of the restored results of each algorithm for blurred images contaminated by Gaussian noise. The test images are blurred by the blurring kernel fspecial('average',[5 5]).

STD         Algorithm   Square Circle      Cameraman          Montage
                        PSNR      SSIM     PSNR      SSIM     PSNR      SSIM
σ = 0.02    TV          35.40     0.980    26.43     0.815    28.21     0.907
            TGV         35.58     0.976    26.27     0.811    28.84     0.910
            TNTF
σ = 0.03    TNTF
σ = 0.04    TV          33.66     0.962    25.10     0.774    25.94     0.867
            TGV         33.41     0.955    24.94     0.770    26.17     0.875
            TNTF

In the rest of this section, we provide qualitative results of the restored images from the above three algorithms. We first show the case of the blurred image “Square Circle” with Gaussian noise of STD σ = 0.03 in Figure 3.
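The stopping rule shared by all three algorithms can be sketched as follows; the solver step is abstracted as a callable, and the tolerance value `tol` is an assumed placeholder.

```python
import numpy as np

def restore(u0, step, tol=1e-4, max_iter=400):
    """Iterate u^(k+1) = step(u^(k)) until the relative change is below tol
    or max_iter iterations are reached; returns the iterate and the count."""
    u = u0
    for k in range(max_iter):
        u_next = step(u)
        # stopping condition: ||u^(k+1) - u^(k)|| / ||u^(k)|| < tol
        if np.linalg.norm(u_next - u) < tol * np.linalg.norm(u):
            return u_next, k + 1
        u = u_next
    return u, max_iter
```

Abstracting the update as a callable lets the same driver serve any of the iterative schemes compared here, with only the per-iteration step differing.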
The noisy and blurry image is shown in Figure 3(a). The regularization parameter α = 0.02 (see (2) and (3)) is used for the TV model, (α₀, α₁) = (0. , .) for the TGV model, and λ = 0. for our TNTF model.

For the case with Gaussian noise of STD σ = 0.02, the restored images by TV, TGV, and TNTF are displayed in Figure 5(b), (c), and (d), respectively. The regularization parameters for TV, TGV, and TNTF are α = 0.006, (α₀, α₁) = (0. , .), and λ = 0. , respectively.

Figure 3: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.03; images reconstructed by (b) TV with α = 0.02 (see (2) and (3)), (c) TGV with (α₀, α₁) = (0. , .), and (d) TNTF with λ = 0.

Figure 5: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.
02; images reconstructed by (b) TV with α = 0.006; (c) TGV with (α₀, α₁) = (0. , .); and (d) TNTF with λ = 0.

For the case with Gaussian noise of STD σ = 0.04, the restored images by TV, TGV, and TNTF are displayed in Figure 7(b), (c), and (d), respectively. The regularization parameters for TV, TGV, and TNTF are α = 0. , (α₀, α₁) = (0. , .), and λ = 0. , respectively.

Conclusion

In this paper, we have designed a two-level non-stationary tight framelet system and utilized it in a regularization model for image restoration. This framelet system is able to capture the first- and second-order information of the image to be reconstructed. We developed an algorithm to solve the resulting optimization problem. The numerical experiments show the effectiveness of the proposed image restoration model and the corresponding algorithm.

Figure 6: Zoom-in parts of Figure 5: (a) Original image; images reconstructed by (b) TV, (c) TGV, and (d) TNTF.

Figure 7: (a) Image blurred by the kernel fspecial('average',[5 5]) with added Gaussian noise of σ = 0.04; images reconstructed by (b) TV with α = 0. , (c) TGV with (α₀, α₁) = (0. , .), and (d) TNTF with λ = 0.

References

[1]
K. Bredies and M. Holler, A TGV-based framework for variational image decompression, zooming, and reconstruction. Part II: Numerics, SIAM Journal on Imaging Sciences, 8 (2015), pp. 2851–2886.
[2] K. Bredies, K. Kunisch, and T. Pock, Total generalized variation, SIAM Journal on Imaging Sciences, 3 (2010), pp. 492–526.
[3] J.-F. Cai, B. Dong, S. Osher, and Z. Shen, Image restorations: total variation, wavelet frames and beyond, Journal of the American Mathematical Society, 25 (2012), pp. 1033–1089.
[4] A. Chambolle, R. DeVore, N.-Y. Lee, and B. Lucier, Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage, IEEE Transactions on Image Processing, 7 (1998), pp. 319–335.
[5] A. Chambolle and P.-L. Lions, Image recovery via total variation minimization and related problems, Numerische Mathematik, 76 (1997), pp. 167–188.
[6] R. Chan, T. Chan, L. Shen, and Z. Shen, Wavelet algorithms for high-resolution image reconstruction, SIAM Journal on Scientific Computing, 24 (2003), pp. 1408–1432.
[7] R. Chan, S. D. Riemenschneider, L. Shen, and Z. Shen, Tight frame: an efficient way for high-resolution image reconstruction, Applied and Computational Harmonic Analysis, 17 (2004), pp. 91–115.
[8] T. Chan, A. Marquina, and P. Mulet, High-order total variation-based image restoration, SIAM Journal on Scientific Computing, 22 (2000), pp. 503–516.
[9] P. Combettes and V. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal, 4 (2005), pp. 1168–1200.
[10] P. L. Combettes and J.-C. Pesquet, Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators, Set-Valued and Variational Analysis, 20 (2012), pp. 307–330.
[11] L. Condat, A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms, Journal of Optimization Theory and Applications, 158 (2013), pp. 460–479.
[12] M. Figueiredo and R. D. Nowak, An EM algorithm for wavelet-based image restoration, IEEE Transactions on Image Processing, 12 (2003), pp. 906–916.
[13] D. Geman and G. Reynolds, Constrained restoration and the recovery of discontinuities, IEEE Transactions on Pattern Analysis and Machine Intelligence, 14 (1992), pp. 367–383.
[14] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 6 (1984), pp. 721–741.
[15] R. Gonzalez and R. Woods, Digital Image Processing, Addison-Wesley, Boston, MA, 1993.
[16] W. Guo, J. Qin, and W. Yin, A new detail-preserving regularization scheme, SIAM Journal on Imaging Sciences, 7 (2014), pp. 1309–1334.
[17] B. Han, Properties of discrete framelet transforms, Mathematical Modelling of Natural Phenomena, 8 (2013), pp. 18–47.
[18] B. Han, Framelets and Wavelets: Algorithms, Analysis, and Applications, Springer International Publishing, 2018.
[19] A. P. Johnson and C. L. Baker, First- and second-order information in natural images: a filter-based approach to image statistics, Journal of the Optical Society of America A, 21 (2004), pp. 913–925.
[20] Q. Li and N. Zhang, Fast proximity-gradient algorithms for structured convex optimization problems, Applied and Computational Harmonic Analysis, 41 (2016), pp. 491–517.
[21] Y.-R. Li, R. H. Chan, L. Shen, Y.-C. Hsu, and W.-Y. I. Tseng, An adaptive directional Haar framelet-based reconstruction algorithm for parallel magnetic resonance imaging, SIAM Journal on Imaging Sciences, 9 (2016), pp. 794–821.
[22] Y.-R. Li, D. Dai, and L. Shen, Multiframe super-resolution reconstruction using sparse directional regularization, IEEE Transactions on Circuits and Systems for Video Technology, 20 (2010), pp. 945–956.
[23] Y.-R. Li, L. Shen, D. Q. Dai, and B. W. Suter, Framelet algorithms for de-blurring images corrupted by impulse plus Gaussian noise, IEEE Transactions on Image Processing, 20 (2011), pp. 1822–1837.
[24] Y.-R. Li, L. Shen, and B. W. Suter, Adaptive inpainting algorithm based on DCT induced wavelet regularization, IEEE Transactions on Image Processing, 22 (2013), pp. 752–763.
[25] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60 (2004), pp. 91–110.
[26] M. Lysaker, A. Lundervold, and X.-C. Tai, Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time, IEEE Transactions on Image Processing, 12 (2003), pp. 1579–1590.
[27] D. Marr and E. Hildreth, Theory of edge detection, Proceedings of the Royal Society of London, B-207 (1980), pp. 187–217.
[28] C. A. Micchelli, L. Shen, and Y. Xu, Proximity algorithms for image models: denoising, Inverse Problems, 27 (2011), 045009 (30pp).
[29] L. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D, 60 (1992), pp. 259–268.
[30] S. Setzer, G. Steidl, and T. Teuber, Infimal convolution regularizations with discrete ℓ1-type functionals, Communications in Mathematical Sciences, 9 (2011), pp. 797–827.
[31] L. Shen, I. Kakadiaris, M. Papadakis, I. Konstantinidis, D. Kouri, and D. Hoffman, Image denoising using a tight frame, IEEE Transactions on Image Processing, 15 (2006), pp. 1254–1263.
[32] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing, 13 (2004), pp. 600–612.
[33]