[PDF] Automated Parameter Selection for Total Variation Minimization in Image Restoration

Abstract

Algorithms for automatically selecting a scalar or locally varying regularization parameter for total variation models with an L τ -data fidelity term, τ∈{1,2} , are presented. The automated selection of the regularization parameter is based on the discrepancy principle, whereby in each iteration a total variation model has to be minimized. In the case of a locally varying parameter this amounts to solve a multi-scale total variation minimization problem. For solving the constituted multi-scale total variation model convergent first and second order methods are introduced and analyzed. Numerical experiments for image denoising and image deblurring show the efficiency, the competitiveness, and the performance of the proposed fully automated scalar and locally varying parameter selection algorithms.

Full PDF

aa r X i v : . [ m a t h . NA ] D ec Automated Parameter Selection for Total VariationMinimization in Image Restoration

Andreas LangerAbstract

Algorithms for automatically selecting a scalaror locally varying regularization parameter for totalvariation models with an L τ -data ﬁdelity term, τ ∈{ , } , are presented. The automated selection of theregularization parameter is based on the discrepancyprinciple, whereby in each iteration a total variationmodel has to be minimized. In the case of a locally vary-ing parameter this amounts to solve a multi-scale totalvariation minimization problem. For solving the consti-tuted multi-scale total variation model convergent ﬁrstand second order methods are introduced and analyzed.Numerical experiments for image denoising and imagedeblurring show the eﬃciency, the competitiveness, andthe performance of the proposed fully automated scalarand locally varying parameter selection algorithms. Keywords

Total variation minimization · Locallydependent regularization parameter · Automatedparameter selection · L -data ﬁdelity · L -data ﬁdelity · Discrepancy principle · Constrained/unconstrainedproblem · Gaussian noise · Impulse noise

Observed images are often contaminated by noise andmay be additionally distorted by some measurementdevice. Then the obtained data g can be described as g = N ( T ˆ u ) , where ˆ u is the unknown original image, T is a linearbounded operator modeling the image-formation de-vice, and N represents noise. In this paper, we consider A. LangerInstitute for Applied Analysis and Numerical Simulation,University of Stuttgart, Pfaﬀenwaldring 57, 70569 Stuttgart,GermanyE-mail: [email protected] images which are contaminated either by white Gaus-sian noise or impulse noise. While for white Gaussiannoise the degraded image g is obtained as g = T ˆ u + η, where the noise η is oscillatory with zero mean and stan-dard deviation σ , there are two main models for impulsenoise, that are widely used in a variety of applications,namely salt-and-pepper noise and random-valued im-pulse noise. We assume that T ˆ u is in the dynamic range[0 , ≤ T ˆ u ≤

1, then in the presence of salt-and-pepper noise the observation g is given by g ( x ) =  r ∈ [0 , , r ∈ [0 , ,T ˆ u ( x ) with probability 1 − r − r , (1.1)with 1 − r − r >

0. If the image is contaminated byrandom-valued impulse noise, then g is described as g ( x ) = ( ρ with probability r ∈ [0 , ,T ˆ u ( x ) with probability 1 − r, (1.2)where ρ is a uniformly distributed random variable inthe image intensity range [0 , u from the given degraded image g is an ill-posed inverse problem and thus regularizationtechniques are required to restore the unknown image[41]. A good approximation of ˆ u may be obtained bysolving a minimization problem of the typemin u H ( u ; g ) + α R ( u ) , (1.3)where H ( . ; g ) represents a data ﬁdelity term, which en-forces the consistency between the recovered and mea-sured image, R is an appropriate ﬁlter or regulariza-tion term, which prevents over-ﬁtting, and α > Andreas Langer of the two terms. We aim at reconstructions in whichedges and discontinuities are preserved. For this pur-pose we use the total variation as a regularization term,ﬁrst proposed in [81] for image denoising. Hence, hereand in the remaining of the paper we choose R ( u ) = R Ω | Du | , where R Ω | Du | denotes the total variation of u in Ω ; see [3,46] for more details. However, we notethat other regularization terms, such as the total gen-eralized variation [13], the non-local total variation [60],the Mumford-Shah regularizer [71], or higher order reg-ularizers (see e.g. [77] and references therein) might beused as well.1.1 Choice of the ﬁdelity termThe choice of H typically depends on the type of noisecontamination. For images corrupted by Gaussian noisea quadratic L -data ﬁdelity term is typically chosen andhas been successfully used; see for example [18,19,20,24,28,29,30,32,35,47,72,76,91,93]. In this approach, whichwe refer to as the L -TV model, the image ˆ u is recov-ered from the observed data g by solvingmin u ∈ BV ( Ω ) k T u − g k L ( Ω ) + α Z Ω | Du | , (1.4)where BV ( Ω ) denotes the space of functions with boundedvariation, i.e., u ∈ BV ( Ω ) if and only if u ∈ L ( Ω ) and R Ω | Du | < ∞ . In the presence of impulse noise, e.g.,salt-and-pepper noise or random-valued impulse noise,the above model usually does not yield a satisfactoryrestoration. In this context, a more successful approach,suggested in [1,74,75], uses a non-smooth L -data ﬁ-delity term instead of the L -data ﬁdelity term in (1.4),i.e., one considersmin u ∈ BV ( Ω ) k T u − g k L ( Ω ) + α Z Ω | Du | , (1.5)which we call the L -TV model. In this paper, we areinterested in both models, i.e., the L -TV and the L -TV model, and condense them intomin u ∈ BV ( Ω ) (cid:26) J τ ( u ; g ) := H τ ( u ; g ) + α Z Ω | Du | (cid:27) (1.6)to obtain a combined model for removing Gaussian orimpulsive noise, where H τ ( u ; g ) := τ k T u − g k τL τ ( Ω ) for τ = 1 ,

2. Note, that instead of (1.6) one can considerthe equivalent problemmin u ∈ BV ( Ω ) λ H τ ( u ; g ) + Z Ω | Du | , (1.7)where λ = α >

0. Other and diﬀerent ﬁdelity termshave been considered in connection with other type of noise models, as Poisson noise [64], multiplicative noise[4], Rician noise [43]. For images which are simultane-ously contaminated by Gaussian and impulse noise [15]a combined L - L -data ﬁdelity term has been recentlysuggested and demonstrated to work satisfactory [53].However, in this paper, we concentrate on images de-graded by only one type of noise, i.e., either Gaussiannoise or one type of impulse noise, and perhaps addi-tionally corrupted by some measurement device.1.2 Choice of the scalar regularization parameterFor the reconstruction of such images the proper choiceof α in (1.6) and λ in (1.7) is delicate; cf. Fig. 1.1. Inparticular, large α and small λ , which lead to an over-smoothed reconstruction, not only remove noise butalso eliminate details in images. On the other hand,small α and large λ lead to solutions which ﬁt thegiven data properly but therefore retain noise in ho-mogeneous regions. Hence a good reconstruction canbe obtained by choosing α and respectively λ such thata good compromise of the aforementioned eﬀects aremade. There are several ways of how to select α in(1.6) and equivalently λ in (1.7), such as manually bythe trial-and-error method, the unbiased predictive riskestimator method (UPRE) [69,67], the Stein unbiasedrisk estimator method (SURE) [82,38,10] and its gen-eralizations [34,40,45], the generalized cross-validationmethod (GCV) [48,66,67,78], the L-curve method [49,50], the discrepancy principle [70], and the variationalBayes’ approach [6]. Further parameter selection meth-ods for general inverse problems can be found for ex-ample in [41,42,88,89].Based on a training set of pairs ( g k , ˆ u k ), for k =1 , , . . . , N ∈ N , where g k is the noisy observation andˆ u k represents the original image, for example in [16,33,61] bilevel optimization approaches have been presentedto compute suitable scalar regularization parameters ofthe corresponding image model. Since in our setting wedo not have a training set given, these approaches arenot applicable here.Applying the discrepancy principle to estimate α in(1.6) or λ in (1.7), the image restoration problem canbe formulated as a constrained optimization problem ofthe formmin u ∈ BV ( Ω ) Z Ω | Du | subject to (s.t.) H τ ( u ; g ) = B τ (1.8)where B τ := ν τ τ | Ω | with ν τ > τ = 1 ,

2, and | Ω | utomated Parameter Selection for Total Variation Minimization in Image Restoration 3(a) noisy image (b) over-smoothed reconstruction (c) over-ﬁtted reconstruction Fig. 1.1

Reconstruction of an image corrupted by Gaussian white noise with a “relatively” large parameter α in (b) and a“relatively” small parameter α in (c). denoting the volume of Ω ; see Section 3 for more de-tails. Note, that here we assume to know a-priori thenoise level. In real applications this means that possiblyin a ﬁrst step a noise estimation has to be performedbefore the discrepancy principle may be used. However,in general it is easier to estimate the noise level thanthe regularization parameter [18].The constrained minimization problem (1.8) is nat-urally linked to the unconstrained minimization prob-lem (1.7) and accordingly to (1.6). In particular, thereexists a constant λ ≥ T does not annihilate constant functions, i.e., T ∈ L ( L ( Ω )) is such that T · τ = 2 have beenproposed in the literature, see for example [9,18,51,92]and references therein, while not so much attention hasbeen given to the case τ = 1, see for example [73,91].1.3 Spatially adaptive parameterNote, that a scalar regularization parameter might notbe the best choice for every image restoration problem,since images usually have large homogeneous regions aswell as parts with a lot of details. Actually it seems ob-vious that α should be small, or λ should be large, inparts with small features in order to preserve the de-tails. On the contrary α should be large, or λ should besmall, in homogeneous parts to remove noise consider-able. With such a choice of a spatially varying weight weexpect better reconstructions than with a globally con-stant parameter, as demonstrated for example in [37,58]. This motivated to consider multi-scale total varia-tion models with spatially varying parameters initiallysuggested in [80]. The multi-scale version of (1.6) reads asmin u ∈ BV ( Ω ) H τ ( u ; g ) + Z Ω α ( x ) | Du | (1.9)while for (1.7) one writesmin u ∈ BV ( Ω ) τ Z Ω λ ( x ) | T u − g | τ dx + Z Ω | Du | , (1.10)and in the sequel we refer to (1.9) and (1.10) as themulti-scale L τ -TV model.In [84] the inﬂuence of the scale of an image fea-ture on the choice of α is studied and the obtainedobservations were later used in [83] to construct an up-dating scheme of α . Based on (1.10) in [8] a piecewiseconstant function λ , where the pieces are deﬁned bya partitioning of the image due to a pre-segmentation,is determined. In particular, for each segment a scalar λ i , i = 1 , . . . , λ respec-tively α should incorporate statistical properties of thenoise. In this vein, in [2,37,44] for the problem (1.10)automated update rules for λ based on statistics of lo-cal constraints were proposed. In [44] a two level ap-proach for variational denoising is considered, where inthe ﬁrst level noise and relevant texture are isolated inorder to compute local constraints based on local vari-ance estimation. In the second level a gradient descentmethod and an update formula for λ ( x ) derived fromthe Euler-Lagrange equation is utilized. An adaptationof this approach to multiplicative noise can be found in[65]. For convolution type of problems in [2] based onan estimate of the noise variance for each pixel an au-tomatic updating scheme of λ using Uzawa’s method iscreated. This approach is improved in [37] by determin-ing the ﬁdelity weights due to the Gumbel statistic forthe maximum of a ﬁnite number of random variablesassociated with localized image residuals and by incor-porating hierarchical image decompositions, proposed Andreas Langer in [86,87], to speed up the iterative parameter adjust-ment process. An adaptation of this approach to a totalvariation model with L local constraints is studied in[58]. A diﬀerent approach has been proposed in [85] forimage denoising only, where non-local means [14] areused to create a non-local data ﬁdelity term. While inall these approaches the adjustment of λ relies on theoutput of T being a deteriorated image again, in [54]the method of [37] is adjusted to the situation where T is an orthogonal wavelet transform or Fourier trans-form. Very recently also bilevel optimisation approachesare considered for computing spatially adaptive weights[26,56,57].1.4 ContributionOur ﬁrst contribution of this paper is to present a meth-od which automatically computes the regularization pa-rameter α in (1.6) based on (1.8) for τ = 1 as well asfor τ = 2. Our approach is motivated by the parameterselection algorithm presented in [18], which was origi-nally introduced for L -TV image denoising only, i.e.,when T = I , where I denotes the identity operator. Inthis setting the algorithm in [18] is shown to convergeto a parameter α ∗ such that the corresponding mini-mizer u α ∗ of (1.6) is also a solution of (1.8). The proofrelies on the non-increase of the function α H ( u α ; g ) B .However, this important property does not hold for op-erators T = I in general. Nevertheless, we generalizethe algorithm from [18] to problems of the type (1.6)for τ = 1 , T ,e.g., T might be a convolution type of operator. Utiliz-ing an appropriate update of α , which is diﬀerent thanthe one used in [18], we are able to show analyticallyand numerically that our approach indeed converges tothe desired regularization parameter. Further, besidesthe general applicability of our proposed method it evenpossesses advantages for the case τ = 2 and T = I overthe algorithm from [18] with respect to convergence.More precisely, in our numerics it turned out that ourproposed method always needs less or at least the samenumber of iterations as the algorithm from [18] till ter-mination.Motivated by multi-scale total variation minimiza-tion, the second contribution of this paper is concernedwith the automated selection of a suitable spatiallyvarying α for the optimization problem in (1.9) for τ = 1 ,

2. Based on our considerations for an automaticscalar regularization parameter selection, we present al-gorithms where the adjustment of a locally varying α is fully automatic. Diﬀerently to the scalar case the ad-justment of α is now based on local constraints, sim- ilarly as already considered for example in [2,37,58].However, our approach diﬀers signiﬁcantly from theseprevious works, where problem (1.10) is considered andUzawa’s method or an Uzawa-like method is utilized forthe update of the spatially varying parameter. Note,that in Uzawa’s method an additional parameter hasto be introduced and chosen accordingly. We proposean update-scheme of α which does not need any addi-tional parameter and hence is not similar to Uzawa’smethod. Moreover, diﬀerently to the approaches in [37,58] where the initial regularization parameter λ > α > α are presented in Section 5 where also theirconvergence properties are studied. To demonstrate theperformance of the new algorithms we present in Sec-tion 6 numerical experiments for image denoising andimage deblurring. Finally, in Section 7 conclusions aredrawn. In this section we discuss the connection between theunconstrained minimization problem (1.6) and the con-strained optimization problem (1.8). For this purposewe introduce the following basic terminology. Let V bea locally convex space, V ′ its topological dual, and h· , ·i the bilinear canonical pairing over V × V ′ . The domainof a functional J : V → R ∪ { + ∞} is deﬁned as the setDom( J ) := { v ∈ V : J ( v ) < ∞} . A functional J is called lower semicontinuous (l.s.c) iffor every weakly convergent subsequence v ( n ) ⇀ ˆ v wehave lim inf v ( n ) ⇀ ˆ v J ( v ( n ) ) ≥ J (ˆ v ) . utomated Parameter Selection for Total Variation Minimization in Image Restoration 5 For a convex functional J : V → R ∪ { + ∞} , we deﬁnethe subdiﬀerential of J at v ∈ V , as the set valuedfunction ∂ J ( v ) = ∅ if J ( v ) = ∞ , and otherwise as ∂ J ( v ) = { v ∗ ∈ V ′ : h v ∗ , u − v i + J ( v ) ≤ J ( u ) ∀ u ∈ V} . For any operator T we denote by T ∗ its adjoint andby L ( L ( Ω )) we denote the space of linear and con-tinuous operators from L ( Ω ) to L ( Ω ). Moreover, g Ω describes the average value of the function g ∈ L ( Ω )in Ω deﬁned by g Ω := | Ω | R Ω g ( x ) d x . Theorem 2.1

Assume that T ∈ L ( L ( Ω )) does notannihilate constant functions, i.e., T Ω = 0 , where Ω ( x ) = 1 for x ∈ Ω . Then the problem min u ∈ BV ( Ω ) Z Ω | Du | s.t. H τ ( u ; g ) ≤ B τ (2.1) has a solution for τ = 1 , .Proof For a proof we refer the reader to [20] and [58]. ⊓⊔ Moreover, we have the following statement.

Proposition 2.1

Assume that T ∈ L ( L ( Ω )) is suchthat T · and ν τ | Ω | ≤ k g − g Ω k τL τ ( Ω ) . Then prob-lem (2.1) is equivalent to the constrained minimizationproblem (1.8) for τ = 1 , .Proof For τ = 2 the statement is shown in [20]. Westate the proof for τ = 1 by noting it follows similararguments as for τ = 2. Let ˜ u be a solution of (2.1).Note, that there exists u ∈ BV ( Ω ) such that ˜ u = u + g Ω . We consider now the continuous function f ( s ) = k T ( su + g Ω ) − g k L ( Ω ) for s ∈ [0 , f (1) = k T ˜ u − g k L ( Ω ) ≤ ν | Ω | and f (0) = k T g Ω − g k L ( Ω ) = k g − g Ω k L ( Ω ) ≥ ν | Ω | , since T · s ∈ [0 ,

1] such that f ( s ) = ν | Ω | .Set u ′ = su which satisﬁes k T u ′ − g k L ( Ω ) = ν | Ω | and Z Ω | Du ′ | = s Z Ω | Du | ≤ lim inf n →∞ Z Ω | Du n | , where ( u n ) n is a minimizing sequence of (1.8). Hence u ′ is a solution of (1.8). ⊓⊔ Now we are able to argue the equivalence of theproblems (1.6) and (1.8).

Theorem 2.2

Let T ∈ L ( L ( Ω )) be such that T · and ν τ | Ω | ≤ k g − g Ω k τL τ ( Ω ) . Then there exists α ≥ such that the constrained minimization problem (1.8) isequivalent to the unconstrained problem (1.6) , i.e., u isa solution of (1.8) if and only if u solves (1.6) . Proof For τ = 2 the proof can be found in [20, Prop.2.1]. By similar arguments one can show the statementfor τ = 1, which we state here.Set R ( u ) := R Ω | Du | and G ( u ) = ( + ∞ if k u − g k L ( Ω ) > ν | Ω | , k u − g k L ( Ω ) ≤ ν | Ω | . Notice, that R and G are convex l.s.c functions andproblem (2.1) is equivalent to min u R ( u ) + G ( T u ). Wehave Dom( R ) = BV ( Ω ) ∩ L ( Ω ) and Dom( G ) = { u ∈ L ( Ω ) : G ( u ) < + ∞} . Since g ∈ T Dom( R ), there ex-ists ˜ u ∈ Dom( R ) with k T ˜ u − g k L ( Ω ) ≤ ν | Ω | /

2. As T ∈ L ( L ( Ω )) is continuous, G ◦ T is continuous at ˜ u .Hence, by [39, Prop. 5.6, p. 26] we obtain ∂ ( R + G ◦ T )( u ) = ∂ R ( u ) + ∂ ( G ◦ T )( u )for all u . Further, G is continuous at T ˜ u , and hence by[39, Prop. 5.7, p. 27] we have for all u , ∂ ( G ◦ T )( u ) = T ∗ ∂ G ( T u )where ∂ G ( u ) = { } if k u − g k L ( Ω ) < ν | Ω | and ∂ G ( u ) = { α∂ ( k u − g k L ( Ω ) ) , α ≥ } if k u − g k L ( Ω ) = ν | Ω | .If u is a solution of (2.1) and hence of (1.8), then0 ∈ ∂ ( R + G ◦ T )( u ) = ∂ R ( u ) + T ∗ ∂ G ( T u ) . Since any solution of (1.8) satisﬁes k T u − g k L ( Ω ) = ν | Ω | , this shows that there exists an α ≥ ∈ ∂ R ( u ) + αT ∗ ∂ ( k T u − g k L ( Ω ) ) . Hence for this α ≥ u is a minimizer of the problemin (1.6).Conversely, a minimizer u of (1.6) with the above α is obviously a solution of (1.8) with k T u − g k L ( Ω ) = ν | Ω | . This concludes the proof. ⊓⊔ Note, that k T u α − g k L ( Ω ) is (only) convex withrespect to T u α , and hence a minimizer of (1.5) is ingeneral not unique even in the simple case when T = I ,i.e., for two minimizers u α and u α in general we have T u α = T u α . On the contrary, k T u α − g k L ( Ω ) is strictlyconvex with respect to T u α , i.e., for two minimizers u α and u α of (1.4) we have T u α = T u α . Moreover, thefunction α

7→ H ( u α ; g ) is in general not continuous[23], while α

7→ H ( u α ; g ) indeed is continuous [20],where u α is a respective minimizer of (1.6). Hence wehave the following further properties: Lemma 2.1

Let u α be a minimizer of (1.6) then α τ ( u α ; g ) is non-decreasing for τ = 1 , . Moreover, α

7→ H ( u α ; g ) maps R + onto [0 , k g − g Ω k L ( Ω ) ] .Proof For a proof see [20,23]. ⊓⊔ Andreas Langer

Proposition 2.2 If u α i is a minimizer of E ( u, α i ) := k u − g k L ( Ω ) + α i Z Ω | Du | for i = 1 , , then we have k u α − u α k L ( Ω ) ≤ C k g − g Ω k L ( Ω ) . with C := min (cid:26) (cid:12)(cid:12)(cid:12) α − α α + α (cid:12)(cid:12)(cid:12) , (cid:12)(cid:12)(cid:12) α − α α + α (cid:12)(cid:12)(cid:12) (cid:27) Proof

By [7, Lemma 10.2] we have1 α k u α − u α k L ( Ω ) ≤ α ( E ( u α , α ) − E ( u α , α ))1 α k u α − u α k L ( Ω ) ≤ α ( E ( u α , α ) − E ( u α , α )) . Summing up these two inequalities yields (cid:18) α + 1 α (cid:19) k u α − u α k L ( Ω ) ≤ (cid:18) α − α (cid:19) (cid:16) k u α − g k L ( Ω ) − k u α − g k L ( Ω ) (cid:17) which implies k u α − u α k L ( Ω ) ≤ α − α α + α (cid:16) k u α − g k L ( Ω ) − k u α − g k L ( Ω ) (cid:17) . (2.2)By the non-decrease and boundedness of the function α

7→ H ( u α ; g ), see Lemma 2.1, it follows k u α − u α k L ( Ω ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) α − α α + α (cid:12)(cid:12)(cid:12)(cid:12) k g − g Ω k L ( Ω ) . (2.3)On the other hand inequality (2.2) implies k u α − u α k L ( Ω ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) α − α α + α (cid:12)(cid:12)(cid:12)(cid:12) (cid:0) k u α − g k L ( Ω ) + k u α − g k L ( Ω ) (cid:1) , where we used the binomial formula a − b = ( a + b )( a − b ) for a, b ∈ R and the triangle inequality. UsingLemma 2.1 yields k u α − u α k L ( Ω ) ≤ (cid:12)(cid:12)(cid:12)(cid:12) α − α α + α (cid:12)(cid:12)(cid:12)(cid:12) k g − g Ω k L ( Ω ) . (2.4)The assertion follows then from (2.3) and (2.4). ⊓⊔ Remark 2.1

Without loss of generality let α ≥ α inProposition 2.2, then we easily check that C = (cid:16) α − α α + α (cid:17) if α > α , if α = α , α − α α + α otherwise . In order to ﬁnd a suitable regularization parameter α > T does not anni-hilate constant function, which guarantees the existenceof a minimizer of the considered optimization problems;see Section 2. We recall, that in the constraint of (1.8)the value B τ is deﬁned as B τ = ν τ τ | Ω | , where ν τ ∈ R is a statistical value depending on the underlying noiseand possibly on the original image.3.1 Statistical characterization of the noiseLet us characterize the noise corrupting the image inmore details by making similar considerations as in [58,Section 2]. Note, that at any point x ∈ Ω the contami-nated image g ( x ) = N ( T ˆ u )( x ) is a stochastic observa-tion, which depends on the underlying noise. Two im-portant measures to characterize noise are the expectedabsolute value and the variance, which we denote by ν and ν respectively. For images contaminated by Gaus-sian white noise with standard deviation σ , we typicallyset τ = 2 and ν = σ . If the image is instead corruptedby impulse noise, then we set τ = 1 and we have tochoose ν properly. In particular, for salt-and-peppernoise ν ∈ [min { r , r } , max { r , r } ], while for random-valued impulse noise ν should be a value in the interval[ r , r ], where we used that for any point x ∈ Ω we have T ˆ u ( x ) ∈ [0 , ν seems to be ﬁxed, whileactually ν depends on the true (unknown) image ˆ u . Inparticular, for salt-and-pepper noise the expected ab-solute value is given by ν (ˆ u ) := r − ( r − r ) 1 | Ω | Z Ω ( T ˆ u )( x ) d x (3.1)and for random-valued impulse noise we have ν (ˆ u ) := 1 | Ω | Z Ω r (cid:18) ( T ˆ u )( x ) − ( T ˆ u )( x ) + 12 (cid:19) d x. (3.2)However, instead of considering the constraint H τ ( u ; g ) = B τ ( u ) in (1.8), which results in a quite nonlinear prob-lem, in our numerics we choose a reference image andcompute an approximate value B τ . Since our proposedalgorithms are of iterative nature (see APS- and pAPS-algorithm below), it makes sense to choose the currentapproximation as the reference image, i.e., the referenceimage changes during the iterations. Note, that for salt-and-pepper noise with r = r the expected absolutevalue becomes independent of ˆ u and hence ν = r . utomated Parameter Selection for Total Variation Minimization in Image Restoration 7 In case of Gaussian noise ν τ and B τ are independentof ˆ u too. Nevertheless, in order to keep the paper con-cise, in the sequel instead of ν τ and B τ we often write ν τ (˜ u ) and B τ (˜ u ), where ˜ u represents a reference imageapproximating ˆ u , even if the values may actually beindependent from the image.3.2 Automated parameter selection strategyIn order to determine a suitable regularization param-eter α in [18] an algorithm for solving the constrainedminimization problem (1.8) for T = I and τ = 2 isproposed, i.e., in the presence of Gaussian noise withzero mean and standard deviation σ . This algorithmrelies on the fact that α

7→ H ( u α ; g ) is non-decreasing,which leads to the following iterative procedure. Chambolle’s parameter selection (CPS):

Choose α > n := 0.1) Compute u α n ∈ arg min u ∈ BV ( Ω ) k u − g k L ( Ω ) + 2 α n Z Ω | Du |

2) Update α n +1 := σ √ | Ω |k u αn − g k L Ω ) α n .3) Stop or set n := n + 1 and return to step 1).For the minimization of the optimization problem instep 1) in [18] a method based on the dual formulationof the total variation is used. However, we note thatany other algorithm for total variation minimizationmight be used for solving this minimization problem.The CPS-algorithm generates a sequence ( u α n ) n suchthat for n → ∞ , k u α n − g k L ( Ω ) → σ p | Ω | and u α n converges to the unique solution of (1.8) with T = I and τ = 2 [18]. The proof relies on the fact that thefunction α → k u α − g k L Ω ) α is non-increasing. Note, thatthis property does not hold in general for operators T = I . We generalize now the CPS-algorithm to optimizationproblems of the type (1.8) for τ = 1 , T . In order to keep or obtain appropri-ate convergence properties, we need the following twoconditions to be satisﬁed. Firstly, the function α τ ( u α ; g ) has to be monotonic, which is the case dueto Lemma 2.1. Secondly, in each iteration n the pa-rameter α n has to be updated such that ( H τ ( u α n ; g )) n is monotonic and bounded by ( B τ ( u α n )) n . More pre-cisely, if H τ ( u α ; g ) ≤ B τ ( u α ) then there has to existan α n +1 ≥ α n such that H τ ( u α n +1 ; g ) ≤ B τ ( u α n +1 ), while if H τ ( u α ; g ) > B τ ( u α ) then there has to existan α n +1 ≤ α n such that H τ ( u α n +1 ; g ) ≥ B τ ( u α n +1 ).This holds true by setting in every iteration α n +1 := (cid:18) B τ ( u α n ) H τ ( u α n ; g ) (cid:19) p α n (3.3)together with an appropriate choice of p ≥

0. In partic-ular, there exists always a p ≥ Proposition 3.1

Assume k g − g Ω k τL τ ( Ω ) ≥ ν τ | Ω | and α n +1 is deﬁned as in (3.3) .(i) If α n > such that H τ ( u α n ; g ) = B τ ( u α n ) , then forall p ∈ R we have that H τ ( u α n +1 ; g ) ≤ B τ ( u α n +1 ) .(ii) If α n > such that < H τ ( u α n ; g ) < B τ ( u α n ) , thenthere exist p ≥ with H τ ( u α n +1 ; g ) ≤ B τ ( u α n +1 ) .(iii) If α n > such that H τ ( u α n ; g ) > B τ ( u α n ) , thenthere exist p ≥ with H τ ( u α n +1 ; g ) ≥ B τ ( u α n +1 ) .Proof The assertion immediately follows by noting thatfor p = 0 we have α n +1 = α n . ⊓⊔ Taking these considerations into account, a general-ization of the CPS-algorithm can be formulated as thefollowing p -adaptive automated parameter selection al-gorithm: pAPS-algorithm: Choose α > p := p > n := 0.1) Compute u α n ∈ arg min u ∈ BV ( Ω ) J τ ( u ; g )2) Update α n +1 := (cid:16) B τ ( u αn ) H τ ( u αn ; g ) (cid:17) p α n if H τ ( u α n ; g ) > α n , e.g., α n := 10 α n , and goto step 1).3) Compute u α n +1 ∈ arg min u ∈ BV ( Ω ) J τ ( u ; g )4) a) if H τ ( u α ; g ) ≤ B τ ( u α )(i) if H τ ( u α n +1 ; g ) ≤ B τ ( u α n +1 ) go to step5)(ii) if H τ ( u α n +1 ; g ) > B τ ( u α n +1 ), decrease p ,e.g., set p := p/

2, and go to step 2)b) if H τ ( u α ; g ) > B τ ( u α )(i) if H τ ( u α n +1 ; g ) ≥ B τ ( u α n +1 ) go to step5)(ii) if H τ ( u α n +1 ; g ) < B τ ( u α n +1 ), decrease p ,e.g., set p := p/

2, and go to step 2)5) Stop or set n := n + 1 and return to step 2).Note, that due to the dependency of α n +1 on p aproper p cannot be explicitly computed, but only iter-atively, as in the pAPS-algorithm.The initial p > p = 32,which seems large enough to us. Andreas Langer

Proposition 3.2

The pAPS-algorithm generates mono-tone sequences ( α n ) n and ( H τ ( u α n ; g )) n such that ( H τ ( u α n ; g )) n is bounded. Moreover, if H τ ( u α ; g ) > B τ ( u α ) or B τ ( u α ) ≤ τ k g − g Ω k ττ for all α > , then ( α n ) n is also bounded.Proof If H τ ( u α ; g ) > B τ ( u α ), then by induction andLemma 2.1 one shows that 0 < α n +1 ≤ α n and 0 ≤H τ ( u α n +1 ; g ) ≤ H τ ( u α n ; g ) for all n ∈ N . Consequently( α n ) n and ( H τ ( u α n ; g )) n are monotonically decreasingand bounded.If H τ ( u α ; g ) ≤ B τ ( u α ), due to Lemma 2.1 we havethat 0 < α n ≤ α n +1 and H τ ( u α n ; g ) ≤ H τ ( u α n +1 ; g ) forall n ∈ N and hence ( α n ) n and ( H τ ( u α n ; g )) n are mono-tonically increasing. Since there exists B ∗ τ > H τ ( u α n ; g ) ≤ B τ ( u α n ) ≤ B ∗ τ for all n ∈ N , see Section3.1, ( H τ ( u α n ; g )) n is also bounded. If we additionallyassume that B τ ( u α ) ≤ τ k g − g Ω k ττ for all α > B ∗ τ := max α B τ ( u α ), then Theorem 2.2 ensures theexistence of an α ∗ ≥ H τ ( u α ∗ ; g ) = B ∗ τ . ByLemma 2.1 it follows that α n ≤ α ∗ for all n ∈ N , since H τ ( u α n ; g ) ≤ B τ ( u α n ) ≤ B ∗ τ . Hence, ( α n ) n is bounded,which ﬁnishes the proof. ⊓⊔ Since any monotone and bounded sequence convergesto a ﬁnite limit, also ( α n ) n converges to a ﬁnite valueif one of the assumptions in Proposition 3.2 holds. Forconstant B τ we are even able to argue the convergenceof the pAPS-algorithm to a solution of the constrainedminimization problem (1.8). Theorem 3.1

Assume that B τ ( u ) ≡ B τ is a constantindependent of u and k g − g Ω k ττ ≥ ν τ | Ω | . Then thepAPS-algorithm generates a sequence ( α n ) n such that lim n →∞ α n = ¯ α > with H τ ( u ¯ α ; g ) = lim n →∞ H τ ( u α n ; g )= B τ and u α n → u ¯ α ∈ arg min u ∈ X J τ ( u, ¯ α ) for n → ∞ .Proof Let us start with assuming that H τ ( u α ; g ) ≤ B τ . By induction, we show that α n ≤ α n +1 and H τ ( u α n ; g ) ≤ H τ ( u α n +1 ; g ) ≤ B τ . In particular, if H τ ( u α n ; g ) ≤ B τ then α n +1 = (cid:16) B τ H τ ( u αn ; g ) (cid:17) p α n > α n , where p > H τ ( u α n +1 ; g ) ≤ B τ ; cf. pAPS-algorithm. Then byLemma 2.1 it follows that H τ ( u α n ; g ) ≤ H τ ( u α n +1 ; g ) ≤ B τ . Note, that there exists an α ∗ > H τ ( u α ∗ ; g ) = B τ ,see Theorem 2.2, such that for any α ≥ α ∗ , H τ ( u α ; g ) ≥B τ ; cf. Lemma 2.1. If α n ≥ α ∗ , then H τ ( u α n ; g ) ≥ B τ .Hence H τ ( u α n ; g ) = B τ and α n +1 = α n . Thus we de-duce that the sequences ( H τ ( u α n ; g )) n and ( α n ) n arenon-decreasing and bounded. Consequently, there ex-ists an ¯ α such that lim n →∞ α n = ¯ α with H τ ( u ¯ α ; g ) = B τ . Let ¯ H = lim n →∞ H τ ( u α n ; g ), then ¯ H = H τ ( u ¯ α ; g ) = B τ . By the optimality of u α n we have that 0 ∈ ∂ J τ ( u α n , α n ) = ∂ H τ ( u α n ; g ) + α n ∂ R Ω | Du α n | ; see [39, Prop. 5.6 + Eq.(5.21), p.26]. Consequently there exist v α n ∈ ∂ R Ω | Du α n | such that − α n v α n ∈ ∂ H τ ( u α n ; g ) with lim n →∞ v α n = v ¯ α . By [79, Thm. 24.4, p. 233] we obtain that − ¯ αv ¯ α ∈ ∂ H τ ( u ¯ α ; g ) with v ¯ α ∈ ∂ R Ω | Du ¯ α | and hence 0 ∈ ∂ J τ ( u ¯ α , ¯ α )for n → ∞ .If H τ ( u α ; g ) > B τ , then as above we can show byinduction that α n ≥ α n +1 and H τ ( u α n ; g ) ≥ H τ ( u α n +1 ; g ) ≥ B τ . Thus we deduce that ( H τ ( u α n ; g )) n and ( α n ) n arenon-increasing and bounded. Note, that there existsan α ∗ > H τ ( u α ∗ ; g ) = B τ such that for any α ≤ α ∗ , H τ ( u α ; g ) ≤ B τ . Hence if α n ≤ α ∗ , then H τ ( u α n ; g ) ≤ B τ . This implies, that H τ ( u α n ; g ) = B τ and α n +1 = α n . The rest of the proof is identical toabove. ⊓⊔ Remark 3.1

The adaptive choice of the value p in thepAPS-algorithm is fundamental for proving convergencein Theorem 3.1. In particular, the value p is chosen independency of α , i.e., actually p = p ( α ), such that inthe case of a constant B τ the function α H τ ( u α ; g ) p ( α ) α is non-increasing; cf. Fig. 6.2(a). A special case of the pAPS-algorithm accrues when thevalue p is not adapted in each iteration but set ﬁxed.For the case p = 1 (ﬁxed) we obtain the following au-tomated parameter selection algorithm. APS-algorithm:

Choose α > n := 0.1) Compute u α n ∈ arg min u ∈ BV ( Ω ) J τ ( u, α n )2) Update α n +1 := B τ ( u αn ) H τ ( u αn ; g ) α n if H τ ( u α n ; g ) > α n , e.g., α n := 10 α n , and go to step 1).3) Stop or set n := n + 1 and return to step 1).Even in this case, although under certain assump-tions, we can immediately argue the convergence of thisalgorithm. Theorem 3.2

For α > let u α be a minimizer of J τ ( u, α ) . Assume that B τ ( u ) ≡ B τ is a constant in-dependent of u , the function α H τ ( u α ; g ) α is non-increasing, and k g − g ω k ττ ≥ ν τ | Ω | . Then the APS-algorithm generates a sequence ( α n ) n ⊂ R + such that lim n →∞ α n = ¯ α > , lim n →∞ H τ ( u α n ; g ) = B τ and u α n converges to u ¯ α ∈ arg min u ∈ BV ( Ω ) J ( u, ¯ α ) for n →∞ .Proof We only consider the case when H τ ( u α ; g ) ≤B τ by noting that the case H τ ( u α ; g ) > B τ can beshown analogous. By induction, we can show that α n ≤ α n +1 and H τ ( u α n ; g ) ≤ H τ ( u α n +1 ; g ) ≤ B τ . More pre-cisely, if H τ ( u α n ; g ) ≤ B τ then α n +1 = B τ H τ ( u αn ; g ) α n ≥ utomated Parameter Selection for Total Variation Minimization in Image Restoration 9 α n and by Lemma 2.1 it follows that H τ ( u α n ; g ) ≤H τ ( u α n +1 ; g ). Moreover, by the assumption that α H τ ( u α ; g ) α is non-increasing we obtain H τ ( u α n +1 ; g ) ≤ α n +1 α n H τ ( u α n ; g ) = B τ . That is, H τ ( u α n ; g ) ≤ H τ ( u α n +1 ; g ) ≤ B τ . The rest of the proof is analog to the one of Theorem3.1. ⊓⊔ Nothing is known about the convergence of the APS-algorithm, if B τ ( · ) indeed depends on u and B τ ( u α n ) isused instead of a ﬁxed constant. In particular, in our nu-merics for some examples, in particular for the applica-tion of removing random-valued impulse noise with r =0 .

05, we even observe that starting from a certain iter-ation the sequence ( α n ) n oscillates between two states,see Fig. 6.7(c). This behavior can be attributed to thefact that, for example, if H τ ( u α n ) ≤ B τ ( u α n ), then itis not guaranteed that also H τ ( u α n +1 ) ≤ B τ ( u α n +1 ),which is essential for the convergence.The second assumption in the previous theorem,i.e., the non-increase of the function α H τ ( u α ; g ) α ,can be slightly loosened, since for the convergence ofthe APS-algorithm it is enough to demand the non-increase starting from a certain iteration ˜ n ≥

0. Thatis, if there exists a region U ⊂ R + where α H τ ( u α ; g ) α is non-increasing and ( α n ) n ≥ ˜ n ⊂ U , then the algorithmconverges; see Fig. 6.2. Analytically, this can be eas-ily shown via Theorem 3.2 by just considering α ˜ n asthe initial value of the algorithm. If τ = 2, similar tothe CPS-algorithm, we are able to show the followingmonotonicity property. Proposition 3.3

If there exists a constant c > suchthat k T ∗ ( T u − g ) k L ( Ω ) = c k T u − g k L ( Ω ) for all u ∈ L ( Ω ) , then the function α

7→ √ H ( u α ; g ) α is non-increasing,where u α is a minimizer of J ( u, α ) .Proof We start by replacing the functional J by a fam-ily of surrogate functionals denoted by ¯ S and deﬁnedfor u, a ∈ X as¯ S ( u, a ) := J ( u, α ) + δ k u − a k L ( Ω ) − k T ( u − a ) k L ( Ω ) = δ k u − z ( a ) k L ( Ω ) + 2 α Z Ω | Du | + ψ ( a, g, T )where δ > k T k , z ( a ) := a − δ T ∗ ( T a − g ), and ψ isa function independent of u . It can be shown that theiteration u α, ∈ X, u α,k +1 = arg min u ¯ S ( u, u α,k ) , k ≥ u α,k ) k which converges weakly for k → ∞ to a minimizer u α of J ( u, α ), see for example [32]. The unique minimizer u α,k +1 is given by u α,k +1 =( I − P αδ K )( z ( u α,k )), where K is the closure of the set { div ξ : ξ ∈ C c ( Ω, R ) , | ξ ( x ) | ≤ ∀ x ∈ Ω } and P K ( u ) := arg min v ∈ K k u − v k L ( Ω ) ; see [18]. Thenfor k → ∞ , let us deﬁne˜ f ( α ) := (cid:13)(cid:13) P αδ K ( z ( u α )) (cid:13)(cid:13) L ( Ω ) = (cid:13)(cid:13)(cid:13)(cid:13) δ T ∗ ( T u α − g ) (cid:13)(cid:13)(cid:13)(cid:13) L ( Ω ) . Since k T ∗ ( T u − g ) k L ( Ω ) = c k T u − g k L ( Ω ) , it followsthat ˜ f ( α ) = cδ p H ( u α ; g ). The assertion follows byapplying [18, Lemma 4.1], which is extendable to in-ﬁnite dimensions, to ˜ f and by noting that the non-increase of α ˜ f ( α ) α implies the non-increase of α H ( u α ; g ) α . ⊓⊔ We remark that for convolution type of operatorsthe assumption of Proposition 3.3 does not hold ingeneral. However, there exist several operators T , rele-vant in image processing, with the property k T ∗ ( T u − g ) k L ( Ω ) = k T u − g k L ( Ω ) . Such operators include T = I for image denoising, T = 1 D for image inpainting,where 1 D denotes the characteristic function of the do-main D ⊂ Ω , and T = S ◦ A , where S is a sub-sampling operator and A is an analysis operator ofa Fourier or orthogonal wavelet transform. The lat-ter type of operator is used for reconstructing signalsfrom partial Fourier data [17] or in wavelet inpainting[25], respectively. For all such operators the function α

7→ √ H ( u α ; g ) α is non-increasing and hence by setting p = ﬁxed in the pAPS-algorithm or changing theupdate of α in the APS-algorithm to α n +1 := s B H ( u α n ; g ) α n , where B is a ﬁxed constant, chosen according to (1.6),we obtain in these situations a convergent algorithm.We emphasize once more, that in general the non-increase of the function α H τ ( u α ; g ) α is not guaran-teed. Nevertheless, there exists always a constant p ≥ α ( H τ ( u α ; g )) p α is indeed non-increasing.For example, p = for operators T with the property k T ∗ ( T u − g ) k L ( Ω ) = k T u − g k L ( Ω ) ; cf. Propsition 3.3.In particular, one easily checks the following result. Proposition 3.4

Let < α ≤ β , and u α and u β min-imizers of J τ ( · , α ) and J τ ( · , β ) , respectively, for τ =1 , . Then ( H τ ( u β ; g )) p β ≤ ( H τ ( u α ; g )) p α if and only if p ≤ ln β − ln α ln H τ ( u β ; g ) − ln H τ ( u α ; g ) . In order to enhance image details, while preserving ho-mogeneous regions, we formulate, as in [37,58], a locallyconstrained optimization problem. That is, instead ofconsidering (1.6) we formulatemin u ∈ BV ( Ω ) Z Ω | Du | s.t. Z Ω w ( x, y ) | T u − g | τ ( y ) dy ≤ ν τ (4.1)for almost every x ∈ Ω , where w is a normalized ﬁlter,i.e., w ∈ L ∞ ( Ω × Ω ), and w ≥ Ω × Ω with Z Ω Z Ω w ( x, y ) dydx = 1and Z Ω Z Ω w ( x, y ) | φ ( y ) | τ dydx ≥ ǫ k φ k τL τ ( Ω ) (4.2)for all φ ∈ L τ ( Ω ) and for some ǫ > φ ;cf. [37,58].4.1 Local ﬁlteringIn practice for w we may use the mean ﬁlter togetherwith a windowing technique, see for example [37,58]. Inorder to explain the main idea we continue in a discretesetting. Let Ω h be a discrete image domain contain-ing N × N pixels, N , N ∈ N , and by | Ω h | = N N we denote the size of the discrete image (number ofpixels). We approximate functions u by discrete func-tions, denoted by u h . The considered functions spacesare X = R N × N and Y = X × X . In what follows forall u h ∈ X we use the following norms k u h k τ := k u h k ℓ τ ( Ω h ) =  X x ∈ Ω h | u h ( x ) | τ  /τ for 1 ≤ τ < + ∞ . Moreover we denote by u hΩ the averagevalue of u h ∈ X , i.e, u hΩ := | Ω h | P x ∈ Ω h u h ( x ). The dis-crete gradient ∇ h : X → Y and the discrete divergencediv h : Y → X are deﬁned in a standard-way by forwardand backward diﬀerences such that div h := − ( ∇ h ) ∗ ;see for example [18,22,55,63]. With the above nota-tions and deﬁnitions the discretization of the generalfunction in (1.9) is given by J τ ( u h , α ) := H τ ( u h ) + R α ( u h ) (4.3)where H τ ( u h ) = τ k T h u h − g h k ττ , τ ∈ { , } , T h : X → X is a bounded linear operator, α ∈ ( R + ) N × N , and R α ( u h ) := X x ∈ Ω h α ( x ) |∇ h u h ( x ) | l (4.4) with | y | l = p y + y for every y = ( y , y ) ∈ R . Inthe sequel if α is a scalar or α ≡ R α or R just αR or R , respectively, i.e., R ( u h ) = X x ∈ Ω h |∇ h u h ( x ) | l is the discrete total variation of u in Ω h , and we write¯ E τ instead of E τ to indicate that α is constant. Intro-ducing some step-size h , then for h → N N goes to inﬁnity) one can show, similaras for the case α ≡

1, that R α Γ -converges to R Ω α | Du | ;see [12,62].We turn now to the locally constrained minimiza-tion problem, which is given in the discrete setting asmin u h ∈ X R ( u h ) s.t. S τi,j ( u h ) ≤ ν τ τ for all x i,j ∈ Ω h . (4.5)Here ν τ is a ﬁxed constant and S τi,j ( u h ) := 1 M i,j X x s,t ∈I i,j τ | ( T h u h )( x s,t ) − g h ( x s,t ) | τ denotes the local residual at x i,j ∈ Ω h with I i,j beingsome suitable set of pixels around x i,j of size M i,j , i.e., M i,j = |I i,j | . For example, in [37,54,58] for I i,j the set Ω ωi,j = (cid:26) x s + i,t + j ∈ Ω h : − ω − ≤ s, t ≤ ω − (cid:27) with a symmetric extension at the boundary and with ω being odd is used. That is, Ω ωi,j is a set of pixels ina ω -by- ω window centered at x i,j , i.e., M i,j = ω forall i, j , such that Ω ωi,j Ω h for x i,j suﬃciently close to ∂Ω . Additionally we denote by ˜ Ω ωi,j a set of pixels in awindow centered at x i,j without any extension at theboundary, i.e.,˜ Ω ωi,j = ( x s + i,t + j : max (cid:26) − ( i, j ) , − ω − (cid:27) ≤ ( s, t ) ≤ min (cid:26) ω − , ( N − i, N − j ) (cid:27) ) . Hence ˜ Ω ωi,j ⊂ Ω h for all x i,j ∈ Ω h . Before we analyzethe diﬀerence between Ω ωi,j and ˜ Ω ωi,j with respect to theconstrained minimization problem (4.5), we note that,since T h does not annihilate constant functions, theexistence of a solution of (4.5) is guaranteed; see [37,Theorem 2][58, Theorem 2].In the following we set B τ := ν τ τ | Ω h | . Proposition 4.1 (i) If u h is a solution of (4.5) with I i,j = Ω ωi,j , then H τ ( u h ) < B τ . utomated Parameter Selection for Total Variation Minimization in Image Restoration 11 (ii) If u is a solution of (4.5) with I i,j = ˜ Ω ωi,j , then H τ ( u h ) ≤ B τ . Proof (i) Since u h is a solution of (4.5) and Ω ωi,j is aset of pixels in a ω -by- ω window, we have B τ ≥ X i,j S τi,j ( u h )= X i,j τ ω X x s,t ∈ Ω ωi,j | g h ( x s,t ) − T h u h ( x s,t ) | τ > τ X i,j | g h ( x i,j ) − T h u h ( x i,j ) | τ = H τ ( u h ) . Here we used that due to the sum over i, j eachelement (pixel) in Ω ωi,j appears at most ω times.More precisely, any pixel-coordinate in the set Λ ω := { ( i, j ) : min { i − , j − , N − i, N − j } ≥ ω − } occursexactly ω -times, while any other pixel-coordinateappears strictly less than ω -times. This shows theﬁrst statement.(ii) For a minimizer u h of (4.5) we obtain B τ ≥ X i,j S τi,j ( u h )= X i,j τ M i,j X x s,t ∈ Ω ωi,j | g h ( x s,t ) − T h u h ( x s,t ) | τ = 1 τ X i,j | g h ( x i,j ) − T h u h ( x i,j ) | τ = H τ ( u h ) , which concludes the proof. ⊓⊔ Note, that if I i,j = ˜ Ω ωi,j then by Proposition 4.1 a mini-mizer of (4.5) also satisﬁes the constraint of the problemmin u h ∈ X R ( u h ) s.t. H τ ( u h ) ≤ B τ (4.6)(discrete version of (2.1)) but is in general of course nota solution of (4.6). Proposition 4.2

Let I i,j = ˜ Ω ωi,j , u hs be a minimizerof (4.6) and u hl be a minimizer of (4.5) , then R ( u hs ) ≤ R ( u hl ) .Proof Assume that R ( u hs ) > R ( u hl ). Since u hs is a so-lution of (4.6) it satisﬁes the constraint H τ ( u hs ) ≤ B τ .By Proposition 4.1 we also have H τ ( u hl ) ≤ B τ . Since R ( u hs ) > R ( u hl ), u hs is not the solution of (4.6) which isa contradiction. Hence, R ( u hs ) ≤ R ( u hl ). ⊓⊔ Remark 4.1

Proposition 4.1 and its consequence arenot special properties of the discrete setting. Let theﬁlter w in (4.1) be such that the inequality in (4.2)becomes an equality with ǫ = 1 / | Ω | , as it is the case in Proposition 4.1(ii), then a solution u l of the locallyconstrained minimization problem (4.1) satisﬁes H τ ( u l ; g ) ≤ ν τ τ | Ω | and Z Ω | Du l | ≥ Z Ω | Du s | where u s is a solution of (2.1).From Proposition 4.2 and Remark 4.1 we conclude,since R ( u hs ) ≤ R ( u hl ) and R Ω | Du s | ≤ R Ω | Du l | , that u hs and u s are smoother than u hl and u l , respectively.Hence the solution of the locally constrained minimiza-tion problem is expected to preserve details better thanthe minimizer of the globally constrained optimizationproblem. Since noise can be interpreted as ﬁne details,which we actually want to eliminate, this could alsomean, that noise is possibly left in the image.4.2 Locally adaptive total variation algorithmWhenever ν τ depends on ˆ u problem (4.5) results in aquite nonlinear problem. Instead of considering nonlin-ear constraints we choose as in Section 3 a reference im-age ˜ u and compute an approximate ν τ = ν τ (˜ u ). Note,that in our discrete setting for salt-and-pepper noise wehave now ν ( u h ) := r − ( r − r ) 1 | Ω | X x ∈ Ω h ( T h u h )( x )and for random-valued impulse noise we have ν ( u h ) := 1 | Ω h | X x ∈ Ω h r (cid:18) ( T h u h )( x ) − ( T h u h )( x ) + 12 (cid:19) . In our below proposed locally adaptive algorithms wechoose as a reference image the current approximation(see LATV- and pLATV-algorithm below), as also donein the pAPS- and APS-algorithm above. Then we areseeking for a solution u h such that S τi,j ( u h ) is close to ν τ τ . We note, that for large α > u hα and theresidual contains details, i.e., we expect H τ ( u hα ) > B τ .Hence, if S τi,j ( u hα ) > ν τ τ we suppose that this is dueto image details contained in the local residual image.In this situation we intend to decrease α in the localregions I i,j . In particular, we deﬁne, similar as in [37,58], the local quantity f ωi,j by f ωi,j := ( S τi,j ( u hα ) if S τi,j ( u hα ) > ν τ τ , ν τ τ otherwise . Note, that ν τ τf ωi,j ≤ i, j and hence we set α ( x i,j ) := 1 M i,j X x s,t ∈I i,j (cid:18) ν τ τ f ωs,t (cid:19) p α ( x s,t ) . (4.7) On the other hand, for small α > u hα , which still contains noise, i.e., weexpect H τ ( u hα ) < B τ . Analogously, if S τi,j ( u hα ) ≤ ν τ τ , wesuppose that there is still noise left outside the residualimage in I i,j . Hence we intend to increase α in the localregions I i,j by deﬁning f ωi,j := ( S τi,j ( u hα ) if S τi,j ( u hα ) < ν τ τ , ν τ τ otherwise , and setting α as in (4.7). Notice, that now ν τ τf ωi,j ≥

1. These considerations lead to the following locallyadapted total variation algorithm.

LATV-algorithm:

Choose α > p := p > n := 0.1) Compute u hα n ∈ arg min u h ∈ X J τ ( u h , α n )2) (a) If H τ ( u hα ) > B τ ( u hα ), then set f ωi,j := max (cid:26) S τi,j ( u hα n ) , ν τ ( u hαn ) τ (cid:27) (b) If H τ ( u hα ) ≤ B τ ( u hα ), then set f ωi,j := max (cid:26) min (cid:26) S τi,j ( u hα n ) , ν τ ( u hαn ) τ (cid:27) , ε (cid:27)

3) Update α n +1 ( x i,j ) := M i,j X x s,t ∈I i,j (cid:18) ν τ ( u hαn ) τf ωs,t (cid:19) p α n ( x s,t ) .

4) Stop or set n := n + 1 and return to step 1).Here and below ε > ε = 10 − ) to ensure that f ωi,j >

0, since it may happen that S τi,j ( u hα n ) = 0.If H τ ( u hα ) > B τ ( u hα ), we stop the algorithm assoon as the residual H τ ( u hα n ) < B τ ( u hα n ) for the ﬁrsttime and set the desired locally varying α ∗ = α n . If H τ ( u hα ) ≤ B τ ( u hα ), we stop the algorithm as soonas the residual H τ ( u hα n ) > B τ ( u hα n ) for the ﬁrst timeand set the desired locally varying α ∗ = α n − , since H τ ( u hα n − ) ≤ ν τ ( u hαn − ) τ .The LATV-algorithm has the following monotonic-ity properties with respect to ( α n ) n . Proposition 4.3

Assume I i,j = Ω ωi,j and let ε > be suﬃciently small. If α > such that H τ ( u hα ) ≤ B τ ( u hα ) , then the LATV-algorithm generates a sequence ( α n ) n such that X i,j α n +1 ( x i,j ) > X i,j α n ( x i,j ) . Proof

By the same argument as in the proof of Propo-sition 4.1 we obtain X i,j α n +1 ( x i,j ) = X i,j  ( ν τ ( u hα n )) p τ p ω X ( s,t ) ∈ Ω ωi,j α n ( x s,t )( f ωs,t ) p  > X i,j ( ν τ ( u hα n )) p ω τ p ω α n ( x i,j )( f ωi,j ) p ! . Note that ν τ ( · ) is bounded from below, see Section 3.1.Consequently there exists an ε > ν τ ( u h ) τ ≥ ε for any u h . Then, since H τ ( u hα ) ≤ B τ ( u hα ) we have bythe LATV-algorithm that f ωi,j := max ( min ( S τi,j ( u hα n ) , ν τ ( u hα n ) τ ) , ε ) ≤ ν τ ( u hα n ) τ and hence P i,j ( α n +1 )( x i,j ) > P i,j ( α n )( x i,j ) . ⊓⊔ Proposition 4.4

Let I i,j = ˜ Ω ωi,j and ε > be suﬃ-ciently small.(i) If α > such that H τ ( u hα ) > B τ ( u hα ) , then theLATV-algorithm generates a sequence ( α n ) n suchthat X i,j ( α n +1 )( x i,j ) ≤ X i,j ( α n )( x i,j ) . (ii) If α > such that H τ ( u hα ) ≤ B τ ( u hα ) , then theLATV-algorithm generates a sequence ( α n ) n suchthat X i,j ( α n +1 )( x i,j ) ≥ X i,j ( α n )( x i,j ) . Proof (i) By the same argument as in the proof of Propo-sition 4.1 and since f ωi,j := max (cid:26) S τi,j ( u hα n ) , ν τ ( u hαn ) τ (cid:27) ≥ ν τ ( u hαn ) τ we obtain X i,j α n +1 ( x i,j ) = X i,j  ν τ ( u hα n ) p τ p M i,j X x s,t ∈ ˜ Ω ωi,j α n ( x s,t )( f ωs,t ) p  = X i,j ν τ ( u hα n ) τ ! p α n ( x i,j )( f ωi,j ) p ! ≤ X i,j α n ( x i,j ) . (ii) Since ν τ ( · ) is bounded from below, see Section 3.1,there exists an ε > ν τ ( u h ) τ ≥ ε for any u h . Hence f ωi,j := max ( min ( S τi,j ( u hα n ) , ν τ ( u hα n ) τ ) , ε ) ≤ ν τ ( u hα n ) τ utomated Parameter Selection for Total Variation Minimization in Image Restoration 13 and by the same arguments as above we get X i,j α n +1 ( x i,j ) = X i,j ν τ ( u hα n ) τ ! p α n ( x i,j )( f ωi,j ) p ! ≥ X i,j α n ( x i,j ) . ⊓⊔ In contrast to the pAPS-algorithm in the LATV-algorithm the power p > p = in our experiments. Note, that small p onlyallow small changes of α in each iteration. In this waythe algorithm is able the generate a function α ∗ suchthat H τ ( u hα ∗ ) is very close to ν τ ( u hα ∗ ) τ . On the contrary,small p have the drawback that the number of itera-tions till termination are kept large. Since the parame-ter p has to be chosen manually, the LATV-algorithm,at least in the spirit, seems to be similar to Uzawa’smethod, where also a parameter has to be chosen. Theproper choice of such a parameter might be complicatedand hence we are desiring for an algorithm where we donot have to tune parameters manually. Because of thisand motivated by the pAPS-algorithm we propose thefollowing p adaptive algorithm: pLATV-algorithm: Choose α > p := p > n := 0.0) Compute u α n ∈ arg min u h ∈ X J τ ( u h , α n )1) (a) If H τ ( u hα ) > B τ ( u hα ), then set f ωi,j := max (cid:26) S τi,j ( u hα n ) , ν τ ( u hαn ) τ (cid:27) (b) If H τ ( u hα ) ≤ B τ ( u hα ), then set f ωi,j := max (cid:26) min (cid:26) S τi,j ( u hα n ) , ν τ ( u hαn ) τ (cid:27) , ε (cid:27)

2) Update α n +1 ( x i,j ) := α n ( x i,j ) M i,j X x s,t ∈I i,j (cid:18) ν τ ( u hαn ) τf ωs,t (cid:19) p .

3) Compute u hα n +1 ∈ arg min u h ∈ X J τ ( u h , α n +1 )4) (a) if H τ ( u hα ) ≤ B τ ( u hα )(i) if H τ ( u hα n +1 ) ≤ B τ ( u hα n +1 ), go to step 5)(ii) if H τ ( u hα n +1 ) > B τ ( u hα n +1 ), decrease p ,e.g., set p = p/

10, and go to step 2)(b) if H τ ( u hα ) > B τ ( u hα )(i) if H τ ( u hα n +1 ) ≥ B τ ( u hα n +1 ), go to step 5)(ii) if H τ ( u hα n +1 ) < B τ ( u hα n +1 ), decrease p ,e.g., set p = p/

10, and go to step 2)5) Stop or set n := n + 1 and return to step 1). In our numerical experiments this algorithm is ter-minated as soon as | H τ ( u hα n ) − B τ ( u hα n ) | ≤ − and H τ ( u hα n ) ≤ B τ ( u hα n ). Additionally we stop iteratingwhen p is less than machine precision, since then any-way no progress is to expect. Due to the adaptive choiceof p we obtain a monotonic behavior of the sequence( α n ) n . Proposition 4.5

The sequence ( α n ) n generated by thepLATV-algorithm is for any point x ∈ Ω monotone. Inparticular, it is monotonically decreasing for α suchthat H τ ( u hα ) > B τ ( u hα ) , and monotonically increasingfor α such that H τ ( u hα ) ≤ B τ ( u hα ) .Proof For H τ ( u hα ) > B τ ( u hα ) we can show by induc-tion that by the pLATV-algorithm f ωi,j ≥ ν τ ( u hαn ) τ andhence 1 ≥ ν τ ( u hαn ) τf ωi,j for all n . Then by the deﬁnition of α n +1 it follows α n +1 ( x i,j ) := α n ( x i,j ) M i,j X x s,t ∈I i,j ν τ ( u hα n ) τ f ωs,t ! p ≤ α n ( x i,j ) . By similar arguments we obtain for α with H τ ( u hα ) ≤ B τ ( u hα ) that α n +1 ( x i,j ) ≥ α n ( x i,j ) for all x i,j ∈ Ω . ⊓⊔ We are aware of the fact that using u hα n as a ref-erence image in the LATV- and pLATV-algorithm tocompute ν τ may commit errors. However, we recall thatfor Gaussian noise we set ν = σ and for salt-and-pepper noise with r = r we have ν = r . In thesecases ν τ does not depend on the original image andhence we do not commit any error by computing ν τ .For random-valued impulse noise corrupted images thesituation is diﬀerent and ν indeed depends on the trueimage. In this situation errors may be committed when u hα n is used as a reference image for calculating ν τ ; seeFigs. 4.1 and 4.2. Hence, in order to improve the pro-posed algorithm, for such cases for future research itmight be of interest to ﬁnd the optimal reference imageto obtain a good approximation of the real value ν τ .In contrast to the SA-TV algorithm presented in[37,58], where the initial regularization parameter hasto be chosen suﬃciently small, in the LATV-algorithmas well as in the pLATV-algorithm the initial value α can be chosen arbitrarily positive. However, in the case H τ ( u hα ) > B τ ( u hα ) we cannot guarantee in generalthat the solution u α obtained by the pLATV-algorithmfulﬁlls H τ ( u hα ) ≤ B τ ( u hα ), not even if B τ ( · ) is con-stant, due to the stopping criterion with respect to thepower p . On the contrary, if H τ ( u hα ) ≤ B τ ( u hα ), thenthe pLATV-algorithm generates a sequence ( u hα n ) n suchthat H τ ( u hα n ) ≤ B τ ( u hα n ) for all n and hence also for the Iteration2 4 6 8 10 12 14 16 18 20 22 24 26 28 300.0910.09150.0920.09250.0930.09350.0940.09450.0950.0955 ν ( u h α n ) ν (ˆ u ) Iteration1 2 3 4 5 6 7 8 9 10 110.03100.0310.03110.03110.03110.03120.03120.03130.03130.0314 ν ( u h α n ) ν (ˆ u ) Iteration1 2 3 4 5 6 7 8 90.01550.01550.01560.01560.01560.01560.0156 ν ( u h α n ) ν (ˆ u ) Fig. 4.1

Progress of ν τ ( u hα n ) of the LATV-algorithm with p = and α = 10 − for removing random-valued impulse noisewith r = 0 . (left), r = 0 . (middle), r = 0 . (right) from the cameraman-image (cf. Fig. 6.1(b)). Iteration1 6 11 16 21 26 31 36 41 46 51 56 61 66 710.0910.09150.0920.09250.0930.09350.0940.09450.0950.0955 ν ( u h α n ) ν (ˆ u ) Iteration1 2 3 40.0310.03110.03110.03110.03120.03130.03130.03130.0314 ν ( u h α n ) ν (ˆ u ) Iteration1 2 3 4 5 6 7 8 9 10 11 12 13 140.01550.01550.01560.01560.01560.01560.01560.01560.01560.01560.0156 ν ( u h α n ) ν (ˆ u ) Fig. 4.2

Progress of ν τ ( u hα n ) of the pLATV-algorithm with p = and α = 10 − for removing random-valued impulsenoise with r = 0 . (left), r = 0 . (middle), r = 0 . (right) from the cameraman-image (cf. Fig. 6.1(b)). solution of the algorithm. As a consequence we wouldwish to choose α > H τ ( u hα ) ≤ B τ ( u hα ),which may be realized by the following simple auto-mated procedure: Algorithm 1:

Input: α > u hα ∈ arg min u h ∈ X J τ ( u h , α ).2) If H τ ( u hα ) > B τ ( u hα ) decrease α by setting α = c α α , with c α ∈ (0 , α . In this section we are concerned with developing numer-ical methods for computing a minimizer of the discretemulti-scale L τ -TV model, i.e.,min u h ∈ X { J τ ( u h , α ) := H τ ( u h ) + R α ( u h ) } . (5.1)5.1 L -TV minimizationHere we consider the case τ = 2, i.e., the minimizationproblemmin u h ∈ X k T h u h − g h k + R α ( u h ) , (5.2) and present solution methods, ﬁrst for the case T h = I and then for general linear bounded operators T h . If T h = I , then (5.2) becomes an image denoising prob-lem, i.e., the minimization problemmin u h ∈ X k u h − g h k + 2 R α ( u h ) . (5.3)For solving this problem we use the algorithm of Cham-bolle and Pock [22], which leads to the following itera-tive scheme: Chambolle-Pock algorithm:

Initialize τ, σ > θ ∈ [0 , p h , u h ) ∈ Y × X , set ¯ u h = u h , and set n = 0.1. Compute p hn +1 ( x ) = p hn ( x ) + σ ∇ h ¯ u hn ( x )max n α ( x ) | p hn ( x ) + σ ∇ h ¯ u hn ( x ) | , o , for all x ∈ Ω h .2. Compute u hn +1 = u hn + τ div h p hn +1 + τg h τ .3. Set ¯ u hn +1 = u hn +1 + θ ( u hn +1 − u hn ).4. Stop or set n := n + 1 and return to step 1). utomated Parameter Selection for Total Variation Minimization in Image Restoration 15 In our numerical experiments we choose θ = 1.In particular, in [22] it is shown that for θ = 1 and τ σ k∇ h k < Assume, that T h is a linear bounded operator from X to X , diﬀerent to the identity I . Then instead of minimiz-ing (5.2) directly, we introduce the surrogate functional S ( u h , a h ) : = 12 k T h u h − g h k + R α ( u h ) + δ k u h − a h k − k T h ( u h − a h ) k = δ k u h − z ( a h ) k + R α ( u h ) + ψ ( a h , g h , T h ) , (5.4)with a h , u h ∈ X , z ( a h ) = a h − δ ( T h ) ∗ ( T h a h − g h ), ψ a function independent of u h , and where we assume δ > k T h k ; see [31,32]. Note thatmin u h ∈ X S ( u h , a h ) ⇔ min u ∈ X k u h − z ( a h ) k + 2 R αδ ( u h )and hence to obtain a minimizer amounts to solve aminimization problem of the type (5.3) and can besolved as described in Section 5.1.1. Then an approxi-mate solution of (5.2) can be computed by the follow-ing iterative algorithm: Choose u h ∈ X and iterate for n ≥ u hn +1 = arg min u h ∈ X S ( u h , u hn ) . (5.5)For scalar α it is shown in [28,31,32] that this iter-ative procedure generates a sequence ( u hn ) n which con-verges to a minimizer of (5.2). This convergence prop-erty can be easily extended to our non-scalar case yield-ing the following result. Theorem 5.1

For α : Ω → R + the scheme in (5.5) generates a sequence ( u hn ) n , which converges to a solu-tion of (5.2) for any initial choice of u h ∈ X .Proof A proof can be accomplished analogue to [32]. ⊓⊔ L -TV minimizationThe computation of a minimizer ofmin u h ∈ X k T h u h − g h k + R α ( u h ) , (5.6)is due to the non-smooth ℓ -term in general more com-plicated than obtaining a solution of the L -TV model. Here we suggest to employ a trick, proposed in [5] for L -TV minimization problems with a scalar regulariza-tion parameter, to solve (5.6) in two steps. In partic-ular, we substitute the argument of the ℓ -norm by anew variable v , penalize the functional by an L -term,which should keep the diﬀerence between v and T u − g small, and minimize with respect to v and u . That is,we replace the original minimization (5.6) bymin v h ,u h ∈ X k v h k + 12 γ k T h u h − g h − v h k + R α ( u h ) , (5.7)where γ > g h ≈ T h u h − v h .Actually, it can be shown that (5.7) converges to (5.6)as γ →

0. In our experiments we actually choose γ =10 − . This leads to the following alternating algorithm. L -TV α algorithm: Initialize α > u h ∈ X andset n := 0.1) Compute v hn +1 = arg min v h ∈ X k v h k + 12 γ k T h u hn − g h − v h k

2) Compute u hn +1 ∈ arg min u h ∈ X γ k T h u h − g h − v hn +1 k + R α ( u h )3) Stop or set n := n + 1 and return to step 1).The minimizer v hn +1 in step 1) of the L -TV α algo-rithm can be easily computed via a soft-thresholding,i.e., v hn +1 = ST( T h u hn − g h , γ ), whereST( g h , γ )( x ) =  g h ( x ) − γ if g h ( x ) > γ, | g h ( x ) | ≤ γ,g h ( x ) + γ if g h ( x ) < − γ for all x ∈ Ω h . The minimization problem in step 2) isequivalent toarg min u h ∈ X k T h u h − g h − v hn +1 k + R γα ( u h ) (5.8)and hence is of the type (5.2). Thus an approximatesolution of (5.8) can be computed as described above;see Section 5.1. Theorem 5.2

The sequence ( u hn , v hn ) n generated by the L -TV α algorithm converges to a minimizer of (5.7) .Proof The statement can be shown analogue to [5]. ⊓⊔ L -TV minimizationFor solving (1.6) with τ = 1 we suggest, alternativelyto the above method, to use the primal-dual methodof [58] adapted to our setting, where a Huber regular-isation of the gradient of u is considered; see [58] formore details. Denoting by ¯ u a corresponding solutionof the primal problem and ¯ p the solution of the asso-ciated dual problem, the optimality conditions due tothe Fenchel theorem [39] are given by − div ¯ p ( x ) = − κ∆ ¯ u ( x ) + 1 β + µ T ∗ ( T ¯ u ( x ) − g ( x ))+ µβ + µ T ∗ T ¯ u ( x ) − g ( x )max { β, | T ¯ u ( x ) − g ( x ) |}− ¯ p ( x ) = 1 γ ∇ ¯ u ( x ) if | ¯ p ( x ) | l < α ( x ) − ¯ p ( x ) = α ( x ) ∇ ¯ u ( x ) |∇ ¯ u ( x ) | l if | ¯ p ( x ) | l = α ( x ) , for all x ∈ Ω , where κ, β, µ , and γ are ﬁxed positiveconstants. The latter two conditions can be summarizedto − ¯ p ( x ) = α ( x ) ∇ ¯ u ( x )max { γα ( x ) , |∇ ¯ u ( x ) | l } . Then setting ¯ q = − ¯ p and ¯ v = T ¯ u − g max { β, | T ¯ u − g |} leads to the following system ofequation:0 = − max { β, | T ¯ u ( x ) − g ( x ) |} ¯ v + T ¯ u ( x ) − g ( x )0 = div ¯ q ( x ) + κ∆ ¯ u ( x ) − β + µ T ∗ ( T ¯ u ( x ) − g ( x )) − µβ + µ T ∗ ¯ v ( x )0 = max { γα ( x ) , |∇ ¯ u ( x ) | l } ¯ q ( x ) − α ( x ) ∇ ¯ u ( x ) (5.9)for all x ∈ Ω . This system can be solved eﬃciently bya semi-smooth Newton algorithm; see Appendix A fora description of the method and for the choice of theparameters κ, β, µ , and γ .Note, that diﬀerent algorithms presented in the lit-erature can also be adjusted to the case of a locallyvarying regularization parameter, such as [18,21,68].However, it is not the scope of this paper to comparediﬀerent algorithms in order to detect the most eﬃcientone, although this is an interesting research topic in itsown right. In the following we present numerical experiments forstudying the behavior of the proposed algorithms (i,e.,APS-, pAPS-, LATV-, pLATV-algorithm) with respectto its image restoration capabilities and its stabilityconcerning the choice of the initial value α . The per-formance of these methods is compared quantitatively by means of the peak signal-to-noise-ratio (PSNR) [11],which is widely used as an image quality assessmentmeasure, and the structural similarity measure (MSSIM)[90], which relates to perceived visual quality betterthan PSNR. When an approximate solution of the L -TV model is calculated, we also compare the restora-tions by the mean absolute error (MAE), which is an L -based measure deﬁned asMAE = k u − ˆ u k L ( Ω ) , where ˆ u denotes the true image and u represents the ob-tained restoration. In general, when comparing PSNRand MSSIM, large values indicate better reconstructionthan smaller values, while the smaller MAE becomesthe better the reconstruction results are.Whenever an image is corrupted by Gaussian noisewe compute a solution by means of the (multi-scale) L -TV model, while for images containing impulsive noisethe (multi-scale) L -TV model is always considered.In our numerical experiments the CPS-, APS-, andpAPS-algorithm are terminated as soon as |H τ ( u α n ; g ) − B τ ( u α n ) |B τ ( u α n ) ≤ ǫ B = 10 − or the norm of the diﬀerence of two successive iterates α n and α n +1 drops below the threshold ǫ α = 10 − ,i.e., k α n − α n +1 k < ǫ α . The latter stopping criterionis used to terminate the algorithms if ( α n ) n stagnatesand only very little progress is to expect. In fact, if ouralgorithm converges at least linearly, i.e., if there existsan ε α ∈ (0 ,

1) and an m > n ≥ m wehave k α n +1 − α ∞ k < ε α k α n − α ∞ k , the second stoppingcriterion at least ensures that the distance between ourobtained result α and α ∞ is k α − α ∞ k ≤ ǫ α ε α − ε α .6.1 Automatic scalar parameter selectionFor automatically selecting the scalar parameter α in(1.6) we presented in Section 3 the APS- and pAPS-algorithm. Here we compare their performance for im-age denoising and image deblurring. For recovering images corrupted by Gaussian noise withmean zero and standard deviation σ we minimize thefunctional in (1.6) by setting τ = 2 and T = I . Then B = σ | Ω | is a constant independent of u . The auto-matic selection of a suitable regularization parameter α is here performed by the CPS-, APS-, and pAPS-algorithm, where the contained minimization problemis solved by the method presented in Section 5.1.1. We utomated Parameter Selection for Total Variation Minimization in Image Restoration 17 recall, that by [18, Theorem 4] and Theorem 3.1 it isensured that the CPS- and the pAPS-algorithm gener-ate sequences ( α n ) n which converge to ¯ α such that u ¯ α solves (1.8). In particular, in the pAPS-algorithm thevalue p is chosen in dependency of α , i.e., p = p ( α ),such that α → ( H τ ( u α ; g )) p ( α ) α is non-increasing, see Fig.6.2(a). This property is fundamental for obtaining con-vergence of this algorithm; see Theorem 3.1. For theAPS-algorithm such a monotonic behavior is not guar-anteed and hence we cannot ensure its convergence.Nevertheless, if the APS-algorithm generates α ’s suchthat the function α → H τ ( u α ; g ) α is non-increasing, thenit indeed converges to the desired solution, see Theo-rem 3.2. Unfortunately, the non-increase of the function α → H τ ( u α ; g ) α does not hold always, see Fig. 6.2(b).For a performance-comparison here we consider thephantom-image of size 256 ×

256 pixels, see Fig. 6.1(a),corrupted only by Gaussian white noise with σ = 0 . p = 32. The behaviorof the sequence ( α n ) n for diﬀerent initial α , i.e., α ∈{ , − , − } is depicted in Fig. 6.3. We observe thatall three methods for arbitrary α converge to the sameregularization parameter α and hence generate resultswith the same PSNR and MSSIM (i.e., PSNR= 19 . . σ as well.Looking at the number of iterations needed till ter-mination, we observe from Table 6.1 that the APS-algorithm always needs signiﬁcantly less iterations thanthe CPS-algorithm till termination. This is attributedto the diﬀerent updates of α . Recall, that for a ﬁxed α n in the CPS-algorithm we set α CP Sn +1 := q ν | Ω | τ H ( u αn ; g ) α n ,while in the APS-algorithm the update is performedas α AP Sn +1 := ν | Ω | τ H ( u αn ; g ) α n . Note, that q ν | Ω | τ H ( u αn ; g ) ≤ ν | Ω | τ H ( u αn ; g ) , if ν | Ω | τ H ( u αn ; g ) ≥ q ν | Ω | τ H ( u αn ; g ) ≥ ν | Ω | τ H ( u αn ; g ) ,if ν | Ω | τ H ( u αn ; g ) ≤

1. Hence, we obtain α n ≤ α CP Sn +1 ≤ α AP Sn +1 if ν | Ω | τ H ( u αn ; g ) ≥ α n ≥ α CP Sn +1 ≥ α AP Sn +1 if ν | Ω | τ H ( u αn ; g ) ≤

1. That is, in the APS-algorithm α changes more signiﬁcantly in each iteration than in theCPS-algorithm, which leads to a faster convergence withrespect to the number of iterations. Nevertheless, ex-actly this behavior allows the function α → H ( u α ; g ) α to increase which is responsible that the convergence ofthe APS-algorithm is not guaranteed in general. How-ever, in our experiments we observed that the function α → H ( u α ; g ) α only increases in the ﬁrst iterations, butnon-increases (actually even decreases) afterwards, seeFig. 6.2(b). This is actually enough to guarantee conver-gence, as discussed in Section 3, since we can consider (a) phantom(b) cameraman(c) barbara(d) lena Fig. 6.1

Original images. the solution of the last step in which the desired mono-tonic behavior is not fulﬁlled as a “new” initial value.Since from this point on the non-increase holds, we getconvergence of the algorithm.The pAPS-algorithm is designed to ensure the non-increase of the function α → ( H τ ( u α ; g )) p ( α ) α by choos- Iteration10 20 30 40 5000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (a) CPS-algorithm Iteration5 10 15 20 25 30 3500.20.40.60.811.21.4 α = 1 α = 10 − α = 10 − (b) APS-algorithm Iteration5 10 15 20 25 30 35 4000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (c) pPAS-algorithm Fig. 6.3

Denoising of the phantom-image corrupted with Gaussian white noise with σ = 0 . . Table 6.1

Number of iterations needed for the reconstruction of the phantom-image corrupted by Gaussian white noise withdiﬀerent standard deviations σ . In the pAPS-algorithm we set p = 32 . σ = 0 . σ = 0 . σ = 0 . σ = 0 . α CPS APS pAPS CPS APS pAPS CPS APS pAPS CPS APS pAPS1 55 25 8 47 21 21 46 20 20 47 17 4710 −

42 18 18 38 17 6 44 20 20 46 18 4610 −

43 27 43 40 20 40 40 18 40 39 17 610 −

43 31 43 41 22 41 41 19 41 41 20 4110 −

43 35 43 41 23 41 41 22 41 41 19 41 ing p ( α ) in each iteration accordingly, which is doneby the algorithm automatically. If p ( α ) = p = 1 / p ≥ /

2. In particular,we observe that if the starting value α is larger thanthe requested regularization parameter α , less itera-tion till termination are needed than with the CPS-algorithm. On the contrary, if α is smaller than thedesired α , p = 1 / p as visiblefrom Fig. 6.4. In this plot we also specify the numberof iterations needed till termination. On the optimalchoice of p with respect to the number of iterations,we conclude from Fig. 6.4 that p = 32 seems to do agood job, although the optimal value may depend onthe noise-level.Similar behaviors as described above are also ob-served for denoising other and real images as well. Now, we consider the situation when an image is cor-rupted by some additive Gaussian noise and addition-ally blurred. Then the operator T is chosen according to the blurring kernel, which we assume here to be known.For testing the APS- and pAPS-algorithm in this casewe take the cameraman-image of Fig. 6.1(b), which isof size 256 ×

256 pixels, blur it by a Gaussian blurringkernel of size 5 × σ . The minimization problem in the APS- andpAPS-algorithm is solved approximately by the algo-rithm in (5.5). In Fig. 6.5 the progress of α n for diﬀer-ent σ ’s, i.e., σ ∈ { . , . , . } , and diﬀerent α ’s, i.e., α ∈ { , − , − } are presented. In these tests bothalgorithms converge to the same regularization param-eter and minimizer. From the ﬁgure we observe, thatthe pAPS-algorithm needs much less iterations than theAPS-algorithm till termination. This behavior might beattributed to the choice of the power p in the pAPS-algorithm, since we observe in all our experiments that p > It has been demonstrated that for removing impulsivenoise in images one should minimize the L -TV modelrather than the L -TV model. Then for calculating asuitable regularization parameter α in the L -TV modelwe use the APS- and pAPS-algorithm, in which theminimization problems are solved approximately by the L -TV α -algorithm. Here, we consider the cameraman- utomated Parameter Selection for Total Variation Minimization in Image Restoration 19 Iteration5 10 15 20 2500.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (a) σ = 0 .

3; pAPS-algorithm

Iteration5 10 15 20 25 3000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (b) σ = 0 .

1; pAPS-algorithm

Iteration10 20 30 40 5000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (c) σ = 0 .

05; pAPS-algorithm

Iteration50 100 150 200 25000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (d) σ = 0 .

3; APS-algorithm

Iteration20 40 60 80 100 120 14000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (e) σ = 0 .

1; APS-algorithm

Iteration20 40 60 80 10000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (f) σ = 0 .

05; APS-algorithm

Fig. 6.5

Reconstruction of the cameraman-image corrupted by Gaussian blurring kernel of size × and standard deviation . In the pAPS-algorithm we set p = 32 . image corrupted by salt-and-pepper noise or random-valued impulse noise with diﬀerent noise-levels, i.e., r = r ∈ { . , . , . } and r ∈ { . , . , . } respectively.The obtained results for diﬀerent α ’s are depicted inFig. 6.6 and Fig. 6.7. For the removal of salt-and-peppernoise we observe from Fig. 6.6 similar behaviors of theAPS- and pAPS-algorithm as above for removing Gaus-sian noise. In particular, both algorithms converge tothe same regularization parameter. However, in manycases the APS-algorithm needs signiﬁcantly less iter-ations than the pAPS-algorithm. These behaviors arealso observed in Fig. 6.7 for removing random-valuedimpulse noise as long as the APS-algorithm ﬁnds a so-lution. In fact, for r = 0 .

05 it actually does not convergebut oscillates as depicted in Fig. 6.7(c).6.2 Locally adaptive total variation minimizationIn this section various experiments are presented toevaluate the performance of the LATV- and pLATV-algorithm presented in Section 4. Their performance iscompared with the proposed pAPS-algorithm as wellas with the SA-TV-algorithm introduced in [37] for L -TV minimization and in [58] for L -TV minimization.We recall that the SA-TV methods perform an approxi-mate solution for the optimization problem in (1.10), re- spectively, and compute automatically a spatially vary-ing λ based on a local variance estimation. However,as pointed out in [37,58], they only perform eﬃcientlywhen the initial λ is chosen suﬃciently small, as we willdo in our numerics. On the contrary, for the LATV- andpLATV-algorithm any positive initial α is suﬃcient.For the comparison we consider four diﬀerent im-ages, shown in Fig. 6.1, which are all of size 256 × I i,j = Ω ωi,j , see [37], and we set the window-sizeto 11 ×

11 pixels in the case of Gaussian noise and to21 ×

21 pixels in case of impulse noise. For the LATV-and pLATV-algorithm we use the window-size ω = 11,if not otherwise speciﬁed, and choose p = .6.3 Gaussian noise removal We start this section by investigating the stability ofthe SA-TV-, LATV-, and pLATV-algorithm with re-spect to the initial regularization parameter, i.e., λ for the SA-TV-algorithm and α for the other algo-rithms, by denoising the cameraman-image corruptedby Gaussian white noise with standard deviation σ = Iteration50 100 150 200 25000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (a) r = r = 0 .

3; pAPS-algorithm

Iteration10 20 30 40 50 60 7000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (b) r = r = 0 .

1; pAPS-algorithm

Iteration20 40 60 8000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (c) r = r = 0 .

05; pAPS-algorithm

Iteration20 40 60 80 10002468101214 α = 1 α = 10 − α = 10 − (d) r = r = 0 .

3; APS-algorithm

Iteration2 4 6 8 10 12 14 16 180123456 α = 1 α = 10 − α = 10 − (e) r = r = 0 .

1; APS-algorithm

Iteration2 4 6 8 10 12 1400.511.522.53 α = 1 α = 10 − α = 10 − (f) r = r = 0 .

05; APS-algorithm

Fig. 6.6

Denoising of the cameraman-image corrupted by salt-and-pepper noise. In the pAPS-algorithm we set p = 32 . Iteration10 20 30 40 50 6000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (a) r = 0 .

3; pAPS-algorithm

Iteration10 20 30 40 50 6000.10.20.30.40.50.60.70.80.91 α = 1 α = 10 − α = 10 − (b) r = 0 .

1; pAPS-algorithm

Iteration5 10 15 20 25 30 35 4000.10.20.30.40.50.60.70.80.91 pAPS: α = 1pAPS: α = 10 − pAPS: α = 10 − APS: α = 1 (c) r = 0 .

05; pAPS-algorithm

Iteration2 4 6 8 10 12 14 16 1800.511.522.533.544.55 α = 1 α = 10 − α = 10 − (d) r = 0 .

3; APS-algorithm

Iteration2 4 6 8 1000.20.40.60.811.21.41.61.8 α = 1 α = 10 − α = 10 − (e) r = 0 .

1; APS-algorithm

Fig. 6.7

Denoising of the cameraman-image corrupted by random-valued impulse noise. In the pAPS-algorithm we set p =32 . utomated Parameter Selection for Total Variation Minimization in Image Restoration 21 α (a) α (b) Fig. 6.2

Denoising of the phantom-image corrupted withGaussian white noise with σ = 0 . . (a) Plot of the func-tion α → H τ ( u α ; g ) p α of the pAPS-algorithm with α = 10 − .(b) Plot of the function α → H τ ( u α ; g ) α of the APS-algorithmwith α = 10 − . The desired monotone behavior is observedafter the second iteration (red part of the curve). p α

42 18 18 2538 17 6 944 20 20 10 σ = 0 . σ = 0 . σ = 0 . Fig. 6.4

Regularization parameter α obtained by the pAPS-algorithm with diﬀerent p for denoising the phantom-imagecorrupted with Gaussian white noise for diﬀerent σ . .

1. In this context we also compare the diﬀerence ofthe pLATV-algorithm with and without using Algo-rithm 1 for computing automatically an initial param-eter, where we set c α = . The minimization prob-lems contained in the LATV- and pLATV-algorithmare solved as described in Section 5.1.1. For compar-ison reasons we deﬁne the values PSNR diﬀ := max α PSNR( α ) − min α PSNR( α ) and MSSIM diﬀ := max α MSSIM( α ) − min α MSSIM( α ) to measure the varia-tion of the considered quality measures. Here PSNR( α )and MSSIM( α ) are the PSNR and MSSIM values ofthe reconstructions, which are obtained from the con-sidered algorithms when the initial regularization pa-rameter is set to α . From Table 6.2 we observe thatthe pLATV-algorithm with and without Algorithm 1are more stable with respect to the initial regulariza-tion parameter than the LATV-algorithm and the SA-TV-algorithm. This stable performance of the pLATV-algorithm is reasoned by the adaptivity of the value p ,which allows the algorithm to reach the desired resid-ual (at least very closely) for any α . As expected, thepLATV-algorithm with Algorithm 1 is even more stablewith respect to α than the pLATV-algorithm alone,since, due to Algorithm 1, the diﬀerence of the actu-ally used initial parameters in the pLATV-algorithm israther small leading to very similar results. Note, thatif α is suﬃciently small, then the pLATV-algorithmwith and without Algorithm 1 coincide, see Table 6.2for α ∈ { − , − , − } . Actually in the rest of ourexperiments we choose α always so small that Algo-rithm 1 returns the inputted α . In Table 6.3 we report on the performance-tests of thepLATV-algorithm with respect to the chosen type ofwindow, i.e., I i,j = ˜ Ω ωi,j and I i,j = Ω ωi,j . We observethat independently which type of window is used thealgorithm ﬁnds nearly the same reconstruction. Thismay be attributed to the fact that the windows in theinterior are the same for both types of window. Nev-ertheless, the boundaries are treated diﬀerently, whichleads to diﬀerent theoretical results, but seems not tohave signiﬁcant inﬂuence on the practical behavior. Asimilar behavior is observed for the LATV-algorithm,as the LATV- and pLATV-algorithm return nearly thesame reconstructions as observed below in Table 6.4.Since for both types of windows nearly the same re-sults are obtained, in the rest of our experiments welimit ourselves to always set I i,j = ˜ Ω ωi,j in the LATV-and pLATV-algorithm.Next, we test the pLATV-algorithm for diﬀerent val-ues of the window-size varying from 3 to 15. Fig. 6.8 Table 6.2

PSNR and MSSIM of the reconstruction of the cameraman-image corrupted by Gaussian white noise with standarddeviation σ = 0 . via the LATV- and pLATV-algorithm with diﬀerent α and via the SA-TV-algorithm with diﬀerent λ . Inthe LATV- and pLATV-algorithm we use I i,j = ˜ Ω i,j with window-size × pixels in the interior and we set p = . SA-TV LATV pLATV pLATV with Algorithm 1 α / λ PSNR MSSIM PSNR MSSIM PSNR MSSIM PSNR MSSIM1 27.82 0.8155 27.44 0.8258 27.37 0.8260 27.37 0.816810 − − − − diff . . . . diff . . . . Table 6.3

PSNR and MSSIM of the reconstruction of diﬀerent images corrupted by Gaussian white noise with standarddeviation σ via the pLATV-algorithm with α = 10 − . pLATV with I = ˜ Ω pLATV with I = Ω Image σ PSNR MSSIM PSNR MSSIMcameraman 0 . . .

05 30.91 0.8875 30.92 0.88750 .

01 40.69 0.9735 40.68 0.9735lena 0 . . .

05 30.15 0.8301 30.15 0.83000 .

01 39.69 0.9699 39.68 0.9699 shows the PSNR and MSSIM of the restoration of thecameraman-image degraded by diﬀerent types of noise(i.e., Gaussian noise with σ = 0 . σ = 0 .

1, salt-and-pepper noise with r = r = 0 . r = r = 0 .

1, orrandom-valued impulse noise with r = 0 . r = 0 . α = 10 − and p =1 / Now we test the algorithms for diﬀerent images cor-rupted by Gaussian noise with zero mean and diﬀer-ent standard deviations σ , i.e., σ ∈ { . , . , . , . } .The initial regularization parameter α is set to 10 − inthe pAPS-, LATV-, and pLATV-algorithm. In the SA-TV-algorithm we choose λ = 10 − , which seems suﬃ-ciently small. From Table 6.4 we observe that all consid-ered algorithms behave very similar. However, for σ ∈ { . , . , . } the SA-TV-algorithm most of the timesperforms best with respect to PSNR and MSSIM, whilesometimes the LATV- and pLATV-algorithm have largerPSNR and MSSIM. That is, looking at these qualitymeasures a locally varying regularization weight is pre-ferred to a scalar one, as long as σ is suﬃciently small.In Fig. 6.9 we present the reconstructions obtained viathe considered algorithms and we observe that the LATV-and pLATV-algorithm generate visually the best re-sults, while the result of the SA-TV-algorithm seems insome parts over-smoothed. For example, the very lefttower in the SA-TV-reconstruction is completely van-ished. This object is in the other restorations still visi-ble. For large standard deviations, i.e. σ = 0 .

3, we ob-serve from Table 6.4 that the SA-TV method performsclearly worse than the other methods, while the pAPS-algorithm usually has larger PSNR and the LATV- andpLATV-algorithm have larger MSSIM. Hence, when-ever the noise-level is too large and details are consid-erably lost due to noise, the locally adaptive methodsare not able to improve the restoration quality.

For this experiment we consider the cameraman-imagedegraded by Gaussian white noise with variance σ =0 . Ω except a rather small utomated Parameter Selection for Total Variation Minimization in Image Restoration 23 Table 6.4

PSNR- and MSSIM-values of the reconstruction of diﬀerent images corrupted by Gaussian white noise withstandard deviation σ via pAPS-, LATV-, pLATV-algorithm with α = 10 − and SA-TV-algorithm with λ = 10 − . In theLATV- and pLATV-algorithm we use I i,j = ˜ Ω i,j with window-size × pixels in the interior and we set p = . pAPS (scalar α ) SA-TV LATV pLATVImage σ PSNR MSSIM PSNR MSSIM PSNR MSSIM PSNR MSSIMphantom 0 . .

84 0 . .

83 0 . . . . . . .

97 0 . .

50 0 . .

05 34 .

97 0 . .

77 0 . .

51 0 . .

01 48 . . .

38 0 . .

46 0 . . . . .

62 0 . .

03 0 . .

47 0 . .

47 0 . . .

31 0 . . . .

40 0 . . . .

05 30 .

75 0 . .

60 0 . .

95 0 . .

91 0 . .

01 40 .

51 0 . . . . . .

69 0 . . . . .

78 0 . . . . . . .

70 0 . .

53 0 . .

93 0 . .

05 28 .

22 0 . .

94 0 . .

49 0 . .

01 38 .

91 0 . .

56 0 . .

08 0 . .

08 0 . . . . .

09 0 . . . .

31 0 . . .

84 0 . .

31 0 . .

85 0 . .

05 30 .

06 0 . .

92 0 . .

16 0 . .

15 0 . .

01 39 .

62 0 . . . . . .

69 0 . area (highlighted in red in Fig. 6.10(a)), denoted by ˜ Ω ,where the variance is 6 times larger, i.e., σ = 0 .

015 inthis part. Since the noise-level is in this application nothomogeneous, the pLATV-algorithm presented in Sec-tion 4 has to be adjusted to this situation accordingly.This can be done by making ν τ (here τ = 2) locallydependent and we write ν τ = ν τ (ˆ u )( x ) to stress thedependency on the true image ˆ u and on the location x ∈ Ω in the image. In particular, for our experimentwe set ν = 0 .

015 in ˜ Ω , while ν = 0 . Ω \ ˜ Ω . Since ν τ is now varying, we also have to adjust the deﬁnitionof B τ and B τ to B τ ( u ) := Z Ω ν τ ( u )( x ) d x and B τ ( u h ) := X x ∈ Ω h ν τ ( u h )( x ) , respectively for the continuous and discrete setting. Mak-ing these adaptations allows us to apply the pLATV-algorithm as well as the pAPS-algorithm to the appli-cation of removing non-uniform noise.The reconstructions obtained by the pAPS-algorithm(with p = 32 and α = 10 − ) and by the pLATV-algorithm (with p = 1 / α = 10 − ) are shownin Figs. 6.10(b) and 6.10(c) respectively. Due to theadaptive choice of α , see Fig. 6.10(d) where light col-ors indicate a large value, the pLATV-algorithm is ableto remove all the noise considerably, while the pAPS-algorithm returns a restoration, which still retains noisein ˜ Ω .6.4 Deblurring and Gaussian noise removalThe performance of the algorithms for restoring im-ages corrupted by Gaussian blur with blurring kernel of size 5 × σ is re-ported in Table 6.5. Here we observe that the LATV-as well as the pLATV-algorithm outperform the SA-TV-algorithm for nearly any example. This observationis also clearly visible in Fig. 6.11, where the SA-TV-algorithm produces a still blurred output. The pAPS-algorithm generates very similar reconstructions as theLATV- and pLATV-algorithm, which is also reﬂectedby similar PSNR and MSSIM. Similarly as before, thepAPS-algorithm performs best when σ = 0 .

3, whilefor smaller σ the LATV-algorithm has always the bestPSNR.6.5 Impulse noise removalSince it turns out that the LATV- and pLATV-algorithmproduce nearly the same output, here, we compare onlyour pAPS- and pLATV-algorithm for L -TV minimiza-tion, with the SA-TV method introduced in [58], wherea semi-smooth Newton method is used to generate anestimate of the minimizer of (1.10). For the sake of a faircomparison an approximate solution of the minimiza-tion problem in the pLATV-algorithm is solved by thesemi-smooth Newton method described in Appendix A.For the SA-TV method we use the parameters sug-gested in [58] and hence I i,j = Ω ωi,j . Moreover, we set λ = 0 . Table 6.5

PSNR- and MSSIM-values of the reconstruction of diﬀerent images corrupted by Gaussian blur (blurring kernelof size × pixels with standard deviation 10) and additive Gaussian noise with standard deviation σ via pAPS-, LATV-,pLATV-algorithm with α = 10 − and SA-TV-algorithm with λ = 10 − . In the LATV- and pLATV-algorithm we use I i,j = ˜ Ω i,j with window-size × pixels in the interior and set p = . pAPS (scalar α ) SA-TV LATV pLATVImage σ PSNR MSSIM PSNR MSSIM PSNR MSSIM PSNR MSSIMphantom 0 . .

23 0 . . .

05 18.89 0.7784 19.20 0.8193 . . .

05 24.14 0.7562 23.75 0.7393 . . .

05 22.87 0.6245 22.88 0.6268 . . .

05 25.83 0.7047 25.81

Table 6.6

PSNR- and MSSIM-values of the reconstruction of diﬀerent images corrupted by salt-and-pepper noise with r = r via pAPS-, pLATV-algorithm with α = 10 − and SA-TV-algorithm with λ = 0 . and window-size × . In the pLATV-algorithm we use I i,j = ˜ Ω i,j with window-size × pixels in the interior and we set p = . pAPS (scalar α ) SA-TV pLATVImage r = r PSNR MSSIM MAE PSNR MSSIM MAE PSNR MSSIM MAEphantom 0 . . .

05 21.61 0.9257 . . . . . . . . .

05 29.45

SA-TV algorithm seems to be outperformed in mostexamples. For example, in Fig. 6.12 we observe that thepAPS- and pLATV-algorithm remove the noise consid-erable while the solution of the SA-TV method still con-tains noise. On the contrary for the removal of random-valued impulse noise in Fig. 6.13 we see that all threemethods produce similar restorations.

For L -TV and L -TV minimization including convo-lution type of problems automatic parameter selectionalgorithms for scalar and locally dependent weights α are presented. In particular, we introduce the APS-and pAPS-algorithm for automatically determining asuitable scalar regularization parameter. While for theAPS-algorithm its convergence only under some assump-tions is shown, the pAPS-algorithm is guaranteed toconverge always. Besides the general applicability ofthese two algorithms they also possess a fast numeri-cal convergence in practice. In order to treat homogeneous regions diﬀerentlythan ﬁne features in images, which promises a betterreconstruction, cf. Proposition 4.2 and Remark 4.1, al-gorithms for automatically computing locally adaptedweights α are proposed. These methods are much morestable with respect to the initial α than the state-of-the-art SA-TV method. Moreover, while in the SA-TV-algorithm the initial λ > α > α or λ is in general not always improving therestoration quality. utomated Parameter Selection for Total Variation Minimization in Image Restoration 25 Table 6.7

PSNR- and MSSIM-values of the reconstruction of diﬀerent images corrupted by random-valued impulse noisevia pAPS-, pLATV-algorithm with α = 10 − and SA-TV-algorithm with λ = 0 . and window-size × . In the pLATV-algorithm we use I i,j = ˜ Ω i,j with window-size × pixels in the interior and we set p = in the pLATV-algorithm. pAPS (scalar α ) SA-TV pLATVImage r PSNR MSSIM MAE PSNR MSSIM MAE PSNR MSSIM MAEphantom 0 . . .

05 25.55 0.9636 0.0058 . . . . . . . . . For computing a minimizer of the respective multi-scale total variation model we present ﬁrst and secondorder methods and show their convergence to a respec-tive minimizer.Although the proposed parameter selection algo-rithms are constructed to estimate the parameter α in(1.6) and (1.9), they can be easily adjusted to ﬁnd agood candidate λ in (1.7) and (1.10), respectively, aswell.Note, that the proposed parameter selection algo-rithms are not restricted to total variation minimiza-tion, but may be extended to other type of regularizersas well by imposing respective assumptions that guar-antee a minimizer of the considered optimization prob-lem. In order to obtain similar (convergence) resultsas presented in Section 3 and Section 4 the consideredregularizer should be convex, lower semi-continuous andone-homogeneous. In particular for proving convergenceresults as in Section 3 (cf. Theorem 3.1 and Theorem3.2) an equivalence of the penalized and correspond-ing constrained minimization problem, as in Theorem2.2, is needed. An example of such a regularizer forwhich the presented algorithms are extendable is thetotal generalized variation [13]. Acknowledgements

The author would like to thank M.Monserrat Rincon-Camacho for providing the spatially adap-tive parameter selection code for the L -TV model of [58]. A Semi-smooth Newton method for solving (5.9)

A semi-smooth Newton algorithm for solving (5.9) can bederived similar as in [58] by means of vector-valued variables.Therefore let u h ∈ R N , q h ∈ R N , α h ∈ R N , g h ∈ R N where N = N N , denote the discrete image intensity, the dualvariable, the spatially dependent regularization parameter,and the observed data vector, respectively. Correspondinglywe deﬁne ∇ h ∈ R N × N as the discrete gradient operator, ∆ h ∈ R N × N as the discrete Laplace operator, T h ∈ R N × N as the discrete operator, and ( T h ) t as the transpose of T h .Here | · | , max {· , ·} , and sign( · ) are understood for vectors ina componentwise sense. Moreover, we use the function [ | · | ] : R N → R N with [ | v h | ] i = [ | v h | ] i + N = q ( v hi ) + ( v hi + N ) for 1 ≤ i ≤ N .For solving (5.9) in every step of our Newton method weneed to solve  A hk − D h ( m β k ) 0 − β + µ ( T h ) t T h + κ∆ h − µµ + β ( T h ) t − ( ∇ h ) t B hk D h ( m γ k )   δ u δ v δ q  =  − F k − F k − F k  (A.1)where A hk = [ D h ( e N ) − D h ( v hk ) χ A βk D h (sign( T h u hk − g h ))] T h ,B hk = [ D h ( q hk ) χ A γk D h ( m γ k ) − M h ( ∇ h u hk ) − D h (( α h , α h ) t )] ∇ h , F k = T u hk − g h − D h ( m β k ) v hk , F k = − ( ∇ h ) t q hk + κ∆ h u hk − β + µ ( T h ) t ( T h u hk − g h ) − µµ + β ( T h ) t v hk , F k = − D h (( α h , α h ) t ) ∇ h u hk + D h ( m γ k ) q hk ,e N ∈ R N is the identity vector, D h ( v ) is a diagonal matrixwith the vector v in its diagonal, m β k = max { β, | T h u hk − g h |} , m γ k = max { γα h , [ |∇ h u hk | ] } , χ A βk = D h ( t β k ) with ( t β k ) i = ( m β k ) i = β, χ A γk = D h ( t γ k ) with ( t γ k ) i = ( m γ k ) i = γ ( α h ) i , M h ( v ) = (cid:18) D h ( v x ) D h ( v y ) D h ( v x ) D h ( v y ) (cid:19) with v = ( v x , v y ) t ∈ R N . Since the diagonal matrices D h ( m β k ) and D h ( m γ k ) are in-vertible, we eliminate δ v and δ q from (A.1), which leads to6 Andreas Langer window-size3 5 7 9 11 13 15 PS NR σ = 0 . r = r = 0 . r = 0 . window-size3 5 7 9 11 13 15 M SS I M σ = 0 . r = r = 0 . r = 0 . window-size3 5 7 9 11 13 15 PS NR σ = 0 . r = r = 0 . r = 0 . window-size3 5 7 9 11 13 15 M SS I M σ = 0 . r = r = 0 . r = 0 . Fig. 6.8

Restoration of the cameraman-image corrupted bydiﬀerent types of noise via the pLATV-method with diﬀerentwindow-sizes. the following resulting system H k δ u = f k (a) pAPS (PSNR: 27 .

31; MSSIM:0 . .

56; MSSIM:0 . .

40; MSSIM:0 . .

38; MSSIM:0 . Fig. 6.9

Reconstruction of the cameraman-image corruptedby Gaussian white noise with σ = 0 . . utomated Parameter Selection for Total Variation Minimization in Image Restoration 27(a) noisy observation(b) pAPS (PSNR: 30 .

01; MSSIM:0 . .

85; MSSIM:0 . α Fig. 6.10

Reconstruction of the cameraman-image cor-rupted by Gaussian white noise with σ = 0 . except inthe in (a) highlighted area where σ = 0 . . (a) pAPS (PSNR: 23 .

11; MSSIM:0 . .

64; MSSIM:0 . .

17; MSSIM:0 . .

15; MSSIM:0 . Fig. 6.11

Reconstruction of the cameraman-image cor-rupted by Gaussian white noise with σ = 0 . and Gaussianblur. .

56; MSSIM:0 . .

54; MSSIM:0 . .

51; MSSIM:0 . Fig. 6.12

Reconstruction of the barbara-image corrupted bysalt-and-pepper noise with r = r = 0 . . (a) noisy image(b) pAPS (PSNR: 24 .

24; MSSIM:0 . .

96; MSSIM:0 . .

24; MSSIM:0 . Fig. 6.13

Reconstruction of the barbara-image corrupted byrandom-valued impulse noise with r = 0 . . utomated Parameter Selection for Total Variation Minimization in Image Restoration 29where H k := 1 β + µ ( T h ) t T h − κ∆ h + µµ + β ( T h ) t D h ( m β k ) − A hk + ( ∇ h ) t D h ( m γ k ) − ( − B hk ) ,f k := F k − µµ + β ( T h ) t D h ( m β k ) − F k + ( ∇ h ) ∗ D h ( m γ k ) − F k . In general B hk and hence H k is not symmetric. In [59] it isshown that the matrix H hk at the solution ( u hk , v hk , q hk ) =(¯ u, ¯ v, ¯ q ) is positive deﬁnite whenever[ | q hk | ] i ≤ ( α h ) i and ( | v hk | ) i ≤ i = 1 , . . . , N .In case these two inequalities are not satisﬁed we project q hk and v hk onto their feasible set, i.e., (( q hk ) i , ( q hk ) i + N ) isset to ( α h ) i max { ( α h ) i , [ | q hk | ] i } − (( q hk ) i , ( q hk ) i + N ) and ( v hk ) i is replaced by max { , ( | v hk | ) i } ( v hk ) i . Then the modiﬁed sys-tem matrix, denoted by H + k is positive deﬁnite; see [36]. Aspointed out in [58] we may use H + k + ε k D h ( e N ) with κ = 0, ε k > ε k ↓ k → ∞ instead of κ > Semi-smooth Newton method:

Initialize ( u h , q h ) ∈ R N × R N and set k := 0.1. Determine the active sets χ A βk ∈ R N × N and χ A γk ∈ R N × N

2. If (A.2) is not satisﬁed, then compute H + k ; otherwiseset H + k := H k .3. Solve H + k δ u = f k for δ u .4. Compute δ q by using δ u .5. Update u hk +1 := u hk + δ u and q hk +1 := q hk + δ q .6. Stop or set k := k + 1 and continue with step 1).This algorithm converges at a superlinear rate, which fol-lows from standard theory; see [52,59].In our experiments we always choose κ = 0, β = 10 − , γ = 10 − , and µ = 10 . References

1. Alliney, S.: A property of the minimum vectors of a regu-larizing functional deﬁned by means of the absolute norm.Signal Processing, IEEE Transactions on (4), 913–917(1997)2. Almansa, A., Ballester, C., Caselles, V., Haro, G.: A TVbased restoration model with local constraints. J. Sci.Comput. (3), 209–236 (2008)3. Ambrosio, L., Fusco, N., Pallara, D.: Functions ofBounded Variation and Free Discontinuity Problems.Oxford Mathematical Monographs. The ClarendonPress, Oxford University Press, New York (2000)4. Aubert, G., Aujol, J.F.: A variational approach to re-moving multiplicative noise. SIAM Journal on AppliedMathematics (4), 925–946 (2008)5. Aujol, J.F., Gilboa, G., Chan, T., Osher, S.: Structure-texture image decomposition – modeling, algorithms,and parameter selection. International Journal of Com-puter Vision (1), 111–136 (2006). DOI 10.1007/s11263-006-4331-z 6. Babacan, S.D., Molina, R., Katsaggelos, A.K.: Parameterestimation in TV image restoration using variational dis-tribution approximation. Image Processing, IEEE Trans-actions on (3), 326–339 (2008)7. Bartels, S.: Numerical Methods for Nonlinear Partial Dif-ferential Equations, vol. 14. Springer (2015)8. Bertalm´ıo, M., Caselles, V., Roug´e, B., Sol´e, A.: TVbased image restoration with local constraints. Journalof scientiﬁc computing (1-3), 95–122 (2003)9. Blomgren, P., Chan, T.F.: Modular solvers for imagerestoration problems using the discrepancy principle. Nu-merical linear algebra with applications (5), 347–358(2002)10. Blu, T., Luisier, F.: The SURE-LET approach to im-age denoising. Image Processing, IEEE Transactions on (11), 2778–2786 (2007)11. Bovik, A.C.: Handbook of Image and Video Processing.Academic press (2010)12. Braides, A.: Γ -Convergence for Beginners, Oxford Lec-ture Series in Mathematics and its Applications , vol. 22.Oxford University Press, Oxford (2002)13. Bredies, K., Kunisch, K., Pock, T.: Total generalized vari-ation. SIAM J. Imaging Sci. (3), 492–526 (2010)14. Buades, A., Coll, B., Morel, J.M.: A review of image de-noising algorithms, with a new one. Multiscale Model.Simul. (2), 490–530 (2005)15. Cai, J.F., Chan, R.H., Nikolova, M.: Two-phase approachfor deblurring images corrupted by impulse plus Gaus-sian noise. Inverse Problems and Imaging (2), 187–204(2008)16. Calatroni, L., Chung, C., De Los Reyes, J.C., Sch¨onlieb,C.B., Valkonen, T.: Bilevel approaches for learn-ing of variational imaging models. arXiv preprintarXiv:1505.02120 (2015)17. Cand`es, E.J., Romberg, J., Tao, T.: Robust uncertaintyprinciples: Exact signal reconstruction from highly in-complete frequency information. Information Theory,IEEE Transactions on (2), 489–509 (2006)18. Chambolle, A.: An algorithm for total variation mini-mization and applications. J. Math. Imaging Vision (1-2), 89–97 (2004). Special issue on mathematics and imageanalysis19. Chambolle, A., Darbon, J.: On total variation minimiza-tion and surface evolution using parametric maximumﬂows. International journal of computer vision (3),288–307 (2009)20. Chambolle, A., Lions, P.L.: Image recovery via total vari-ation minimization and related problems. Numer. Math. (2), 167–188 (1997)21. Chambolle, A., Pock, T.: On the ergodic convergencerates of a ﬁrst-order primal–dual algorithm. Mathemati-cal Programming pp. 1–3522. Chambolle, A., Pock, T.: A ﬁrst-order primal-dual algo-rithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision (1), 120–145 (2011)23. Chan, T.F., Esedo¯glu, S.: Aspects of total variation regu-larized L function approximation. SIAM J. Appl. Math. (5), 1817–1837 (2005)24. Chan, T.F., Golub, G.H., Mulet, P.: A nonlinear primal-dual method for total variation-based image restoration.SIAM J. Sci. Comput. (6), 1964–1977 (1999)25. Chan, T.F., Shen, J., Zhou, H.M.: Total variation waveletinpainting. Journal of Mathematical imaging and Vision (1), 107–125 (2006)0 Andreas Langer26. Chung, V.C., De Los Reyes, J.C., Sch¨onlieb, C.B.: Learn-ing optimal spatially-dependent regularization parame-ters in total variation image restoration. arXiv preprintarXiv:1603.09155 (2016)27. Ciarlet, P.G.: Introduction to Numerical Linear Algebraand Optimisation. Cambridge Texts in Applied Mathe-matics. Cambridge University Press, Cambridge (1989).With the assistance of Bernadette Miara and Jean-MarieThomas, Translated from the French by A. Buttigieg28. Combettes, P.L., Wajs, V.R.: Signal recovery by proximalforward-backward splitting. Multiscale Model. Simul. (4), 1168–1200 (electronic) (2005)29. Darbon, J., Sigelle, M.: A fast and exact algorithm fortotal variation minimization. In: Pattern recognition andimage analysis, pp. 351–359. Springer (2005)30. Darbon, J., Sigelle, M.: Image restoration with discreteconstrained total variation. I. Fast and exact optimiza-tion. J. Math. Imaging Vision (3), 261–276 (2006)31. Daubechies, I., Defrise, M., De Mol, C.: An iterativethresholding algorithm for linear inverse problems witha sparsity constraint. Comm. Pure Appl. Math. (11),1413–1457 (2004)32. Daubechies, I., Teschke, G., Vese, L.: Iteratively solvinglinear inverse problems under general convex constraints.Inverse Probl. Imaging (1), 29–46 (2007)33. De Los Reyes, J.C., Sch¨onlieb, C.B.: Image denois-ing: Learning the noise model via nonsmooth PDE-constrained optimization. Inverse Probl. Imaging (4)(2013)34. Deledalle, C.A., Vaiter, S., Fadili, J., Peyr´e, G.: Stein un-biased gradient estimator of the risk (sugar) for multipleparameter selection. SIAM Journal on Imaging Sciences (4), 2448–2487 (2014)35. Dobson, D.C., Vogel, C.R.: Convergence of an iterativemethod for total variation denoising. SIAM J. Numer.Anal. (5), 1779–1791 (1997)36. Dong, Y., Hinterm¨uller, M., Neri, M.: An eﬃcient primal-dual method for L TV image restoration. SIAM Journalon Imaging Sciences (4), 1168–1189 (2009)37. Dong, Y., Hinterm¨uller, M., Rincon-Camacho, M.M.: Au-tomated regularization parameter selection in multi-scaletotal variation models for image restoration. J. Math.Imaging Vision (1), 82–104 (2011)38. Donoho, D.L., Johnstone, I.M.: Adapting to unknownsmoothness via wavelet shrinkage. Journal of the ameri-can statistical association (432), 1200–1224 (1995)39. Ekeland, I., T´emam, R.: Convex Analysis and VariationalProblems, Classics in Applied Mathematics , vol. 28, en-glish edn. Society for Industrial and Applied Mathemat-ics (SIAM), Philadelphia, PA (1999). Translated fromthe French40. Eldar, Y.C.: Generalized sure for exponential families:Applications to regularization. Signal Processing, IEEETransactions on (2), 471–481 (2009)41. Engl, H.W., Hanke, M., Neubauer, A.: Regularizationof Inverse Problems, Mathematics and its Applications ,vol. 375. Kluwer Academic Publishers Group, Dordrecht(1996)42. Fornasier, M., Naumova, V., Pereverzyev, S.V.: Param-eter choice strategies for multipenalty regularization.SIAM Journal on Numerical Analysis (4), 1770–1794(2014)43. Getreuer, P., Tong, M., Vese, L.A.: A variational modelfor the restoration of MR images corrupted by blur andRician noise. In: Advances in Visual Computing, pp.686–698. Springer (2011) 44. Gilboa, G., Sochen, N., Zeevi, Y.Y.: Texture preservingvariational denoising using an adaptive ﬁdelity term. In:Proc. VLsM, vol. 3 (2003)45. Giryes, R., Elad, M., Eldar, Y.C.: The projected GSUREfor automatic parameter tuning in iterative shrinkagemethods. Appl. Comput. Harmon. Anal. (3), 407–422(2011)46. Giusti, E.: Minimal Surfaces and Functions of BoundedVariation, Monographs in Mathematics , vol. 80.Birkh¨auser Verlag, Basel (1984)47. Goldstein, T., Osher, S.: The split Bregman method for L (2),323–343 (2009)48. Golub, G.H., Heath, M., Wahba, G.: Generalized cross-validation as a method for choosing a good ridge param-eter. Technometrics (2), 215–223 (1979)49. Hansen, P.C.: Analysis of discrete ill-posed problems bymeans of the L -curve. SIAM Rev. (4), 561–580 (1992)50. Hansen, P.C., O’Leary, D.P.: The use of the L -curve inthe regularization of discrete ill-posed problems. SIAMJ. Sci. Comput. (6), 1487–1503 (1993)51. He, C., Hu, C., Zhang, W., Shi, B.: A fast adaptive pa-rameter estimation for total variation image restoration.Image Processing, IEEE Transactions on (12), 4954–4967 (2014)52. Hinterm¨uller, M., Kunisch, K.: Total bounded variationregularization as a bilaterally constrained optimizationproblem. SIAM Journal on Applied Mathematics (4),1311–1333 (2004)53. Hinterm¨uller, M., Langer, A.: Subspace correction meth-ods for a class of nonsmooth and nonadditive convex vari-ational problems with mixed L /L data-ﬁdelity in imageprocessing. SIAM J. Imaging Sci. (4), 2134–2173 (2013)54. Hinterm¨uller, M., Langer, A.: Adaptive regularization forParseval frames in image processing. SFB-Report No.2014-014 p. 12 (2014)55. Hinterm¨uller, M., Langer, A.: Non-overlapping domaindecomposition methods for dual total variation based im-age denoising. Journal of Scientiﬁc Computing (2),456–481 (2015)56. Hinterm¨uller, M., Rautenberg, C.: Optimal selection ofthe regularization function in a generalized total variationmodel. Part I: Modelling and theory. WIAS preprint 2235(2016)57. Hinterm¨uller, M., Rautenberg, C., Wu, T., Langer, A.:Optimal selection of the regularization function in a gen-eralized total variation model. Part II: Algorithm, itsanalysis and numerical tests. WIAS preprint 2236 (2016)58. Hinterm¨uller, M., Rincon-Camacho, M.M.: Expected ab-solute value estimators for a spatially adapted regulariza-tion parameter choice rule in L -TV-based image restora-tion. Inverse Problems (8), 085,005, 30 (2010)59. Hinterm¨uller, M., Stadler, G.: An infeasible primal-dual algorithm for total bounded variation-based inf-convolution-type image restoration. SIAM J. Sci. Com-put. (1), 1–23 (2006)60. Kindermann, S., Osher, S., Jones, P.W.: Deblurring anddenoising of images by nonlocal functionals. MultiscaleModel. Simul. (4), 1091–1115 (electronic) (2005)61. Kunisch, K., Pock, T.: A bilevel optimization approachfor parameter learning in variational models. SIAM Jour-nal on Imaging Sciences (2), 938–983 (2013)62. Langer, A.: Subspace correction and domain decompo-sition methods for total variation minimization. Ph.D.thesis, Johannes Kepler Universt¨at Linz (2011)63. Langer, A., Osher, S., Sch¨onlieb, C.B.: Bregmanized do-main decomposition for image restoration. Journal ofScientiﬁc Computing (2-3), 549–576 (2013)utomated Parameter Selection for Total Variation Minimization in Image Restoration 3164. Le, T., Chartrand, R., Asaki, T.J.: A variational ap-proach to reconstructing images corrupted by Poissonnoise. Journal of mathematical imaging and vision (3),257–263 (2007)65. Li, F., Ng, M.K., Shen, C.: Multiplicative noise removalwith spatially varying regularization parameters. SIAMJ. Imaging Sci. (1), 1–20 (2010)66. Liao, H., Li, F., Ng, M.K.: Selection of regularization pa-rameter in total variation image restoration. J. Opt. Soc.Amer. A (11), 2311–2320 (2009)67. Lin, Y., Wohlberg, B., Guo, H.: UPRE method for totalvariation parameter selection. Signal Processing (8),2546–2551 (2010)68. Lorenz, D.A., Pock, T.: An inertial forward-backward al-gorithm for monotone inclusions. Journal of Mathemat-ical Imaging and Vision (2), 311–325 (2015)69. Mallows, C.L.: Some comments on C P . Technometrics (4), 661–675 (1973)70. Morozov, V.A.: Methods for Solving Incorrectly PosedProblems. Springer-Verlag, New York (1984). Translatedfrom the Russian by A. B. Aries, Translation edited byZ. Nashed71. Mumford, D., Shah, J.: Optimal approximations by piece-wise smooth functions and associated variational prob-lems. Comm. Pure Appl. Math. (5), 577–685 (1989)72. Nesterov, Y.: Smooth minimization of non-smooth func-tions. Math. Program. (1, Ser. A), 127–152 (2005)73. Ng, M.K., Weiss, P., Yuan, X.: Solving constrained total-variation image restoration and reconstruction problemsvia alternating direction methods. SIAM journal on Sci-entiﬁc Computing (5), 2710–2736 (2010)74. Nikolova, M.: Minimizers of cost-functions involving non-smooth data-ﬁdelity terms. Application to the processingof outliers. SIAM J. Numer. Anal. (3), 965–994 (elec-tronic) (2002)75. Nikolova, M.: A variational approach to remove outliersand impulse noise. Journal of Mathematical Imaging andVision (1-2), 99–120 (2004)76. Osher, S., Burger, M., Goldfarb, D., Xu, J., Yin, W.: Aniterative regularization method for total variation-basedimage restoration. Multiscale Model. Simul. (2), 460–489 (electronic) (2005)77. Papaﬁtsoros, K., Sch¨onlieb, C.B.: A combined ﬁrst andsecond order variational approach for image reconstruc-tion. Journal of mathematical imaging and vision (2),308–338 (2014)78. Ramani, S., Liu, Z., Rosen, J., Nielsen, J., Fessler, J.A.:Regularization parameter selection for nonlinear iterativeimage restoration and MRI reconstruction using GCVand SURE-based methods. IEEE Transactions on ImageProcessing (8), 3659–3672 (2012)79. Rockafellar, R.T.: Convex Analysis. Princeton Math-ematical Series, No. 28. Princeton University Press,Princeton, N.J. (1970)80. Rudin, L.I., Osher, S.: Total variation based imagerestoration with free local constraints. In: Image Pro-cessing, 1994. Proceedings. ICIP-94., IEEE InternationalConference, vol. 1, pp. 31–35. IEEE (1994)81. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total varia-tion based noise removal algorithms. Physica D: Nonlin-ear Phenomena (1), 259–268 (1992)82. Stein, C.M.: Estimation of the mean of a multivariatenormal distribution. The annals of Statistics pp. 1135–1151 (1981)83. Strong, D.M., Blomgren, P., Chan, T.F.: Spatially adap-tive local-feature-driven total variation minimizing image restoration. In: Optical Science, Engineering and Instru-mentation’97, pp. 222–233. International Society for Op-tics and Photonics (1997)84. Strong, D.M., Chan, T.F.: Spatially and scale adaptivetotal variation based regularization and anisotropic diﬀu-sion in image processing. In: Diusion in Image Processing,UCLA Math Department CAM Report. Citeseer (1996)85. Sutour, C., Deledalle, C.A., Aujol, J.F.: Adaptive reg-ularization of the NL-means: Application to image andvideo denoising. Image Processing, IEEE Transactionson (8), 3506–3521 (2014)86. Tadmor, E., Nezzar, S., Vese, L.: A multiscale imagerepresentation using hierarchical ( BV, L ) decomposi-tions. Multiscale Model. Simul. (4), 554–579 (electronic)(2004)87. Tadmor, E., Nezzar, S., Vese, L.: Multiscale hierarchicaldecomposition of images with applications to deblurring,denoising and segmentation. Commun. Math. Sci. (2),281–307 (2008)88. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-PosedProblems. Vh Winston (1977)89. Vogel, C.R.: Non-convergence of the L -curve regular-ization parameter selection method. Inverse Problems (4), 535–547 (1996)90. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.:Image quality assessment: from error visibility to struc-tural similarity. Image Processing, IEEE Transactions on (4), 600–612 (2004)91. Weiss, P., Blanc-F´eraud, L., Aubert, G.: Eﬃcient schemesfor total variation minimization under constraints in im-age processing. SIAM J. Sci. Comput. (3), 2047–2080(2009)92. Wen, Y.W., Chan, R.H.: Parameter selection for total-variation-based image restoration using discrepancy prin-ciple. IEEE Transactions on Image Processing21