Variational Depth from Focus Reconstruction
Michael Moeller, Martin Benning, Carola Schönlieb, Daniel Cremers ∗
November 6, 2014
Abstract
This paper deals with the problem of reconstructing a depth map from a sequence of differently focused images, also known as depth from focus or shape from focus. We propose to state the depth from focus problem as a variational problem including a smooth but nonconvex data fidelity term and a convex nonsmooth regularization, which makes the method robust to noise and leads to more realistic depth maps. Additionally, we propose to solve the nonconvex minimization problem with a linearized alternating directions method of multipliers (ADMM), allowing the energy to be minimized very efficiently. A numerical comparison to classical methods on simulated as well as on real data is presented.

The basic idea of depth from focus (DFF) approaches is to assume that the distance of an object to the camera at a certain pixel corresponds to the focal setting at which the pixel is maximally sharp. Thus, for a given data set of differently focused images, one typically first computes a suitable contrast measure at each pixel. Subsequently, the depth at each pixel is determined by finding the focal distance at which the contrast measure is maximal. Figure 1 illustrates this approach. DFF differs from depth from defocus (cf. [1-3]) in the sense that many images are given and depth clues are obtained from the sharpness at each pixel. Depth from defocus, on the other hand, tries to estimate the variance of a spatially varying blur based on a physical model and uses only very few images. Generally, the measurements of differently focused images do not necessarily determine the depth of a scene uniquely, such that the estimation of a depth map is an ill-posed problem. The ambiguity of the depth map is particularly strong in textureless areas.

The literature dealing with the shape from focus problem contains very different contributions. A lot of work has targeted the development of different measures for sharpness and focus (cf. [4] for an overview).
Other works deal with different methods of filtering the contrast coefficients before determining the maximum, i.e. the depth. The ideas range from windowed averaging (e.g. [5]), over differently shaped kernels, to nonlinear filtering as proposed in [6]. In [7,8], the authors proposed to detect regions with a high variance in the depth map and suggested to smooth the depth in these parts by interpolation. Pertuz, Puig, and Garcia analyzed the behavior of the contrast curve in order to identify and remove low-reliability regions of the depth map in [9]. Ideas for using different focus measures and fusing the resulting depth estimates in a variational approach using total variation (TV) regularization have been proposed in [10]. However, very little work has been done on finding a noise-free depth map in a single variational framework.

Formulating the shape from focus problem in a variational framework has the advantage that one clearly defines a cost function and tries to find a solution which is optimal with respect to the costs. More importantly, regularity can be imposed on the depth estimate itself, e.g. by favoring piecewise constant solutions. Additionally, our framework is robust to noise and controls the ill-posedness of the shape from focus problem.

∗ M.M. (corresponding author, email [email protected]) is with the Department of Mathematics, Technische Universität München, Boltzmannstrasse 3, 85748 Garching. M.B. is with the Institute of Mathematics and Image Computing (MIC), Universität zu Lübeck, Maria-Goeppert-Str. 3, 23562 Lübeck, Germany. C.S. is with the Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WA, United Kingdom. D.C. is with the Department of Computer Science, Technische Universität München, Boltzmannstrasse 3, 85748 Garching.

(a) Visualization of the data as a cube. (b) Depth map reconstruction by finding the maximal contrast.
Figure 1: Example of a simple DFF reconstruction: The data set of images can be visualized as a data cube (a), where the z-direction corresponds to a change of focus from the front to the back. In order to determine the focal setting at which the contrast is maximal, one picks a contrast measure, applies a windowed filtering to the contrast coefficients, and selects the focal setting for which the coefficients are maximal. By knowing the distance at which a region appears maximally sharp in a focal setting, one reconstructs the depth. Figure (b) shows an example of the depth map, where red values indicate being far away from the camera and blue values correspond to pixels close to the camera. The result was obtained by using the modified Laplacian contrast measure with a 9 × 9 window.

We propose to define a functional E : R^{n×m} → R that maps a depth map d ∈ R^{n×m} to an energy, where a low energy corresponds to a good depth estimate. The dimension of the depth map, n × m, coincides with the dimension of each image. The energy consists of two terms, E = D + αR. The data fidelity term D takes the dependence on the measured data into account and the regularization term R imposes the desired smoothness. The parameter α determines the trade-off between regularity and fidelity. We find the final depth estimate as the argument that minimizes our energy:

    \hat{d} = \arg\min_d D(d) + \alpha R(d).   (1)

Typical approaches from the literature find the depth estimate by maximizing some contrast measure, and we refer the reader to [4] for an overview of different contrast measures and their performance. Since we want to reconstruct the depth map by an energy minimization problem (rather than maximization), it seems natural to choose the negative contrast at each pixel as the data fidelity term,

    D(d) = -\sum_i \sum_j c_{i,j}(d_{i,j}),   (2)

where c_{i,j} denotes the (precomputed) function that maps a depth at pixel (i, j) to the corresponding contrast value.
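As a concrete illustration of the energy (1)-(2), the following sketch (not the authors' Matlab/CUDA code; the array layout and the isotropic TV discretization by forward differences are illustrative assumptions) evaluates E(d) = D(d) + αR(d) from precomputed per-pixel contrast polynomials:

```python
import numpy as np

def data_fidelity(d, coeffs):
    """D(d) = -sum_ij c_ij(d_ij); coeffs[i, j] holds polynomial coefficients
    of c_ij (highest power first, as np.polyval expects)."""
    n, m, _ = coeffs.shape
    D = 0.0
    for i in range(n):
        for j in range(m):
            D -= np.polyval(coeffs[i, j], d[i, j])
    return D

def isotropic_tv(d):
    """R(d) = ||Kd||_{2,1}: forward differences, replicated last row/column."""
    dx = np.diff(d, axis=1, append=d[:, -1:])  # x-derivative
    dy = np.diff(d, axis=0, append=d[-1:, :])  # y-derivative
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def energy(d, coeffs, alpha):
    """E(d) = D(d) + alpha * R(d)."""
    return data_fidelity(d, coeffs) + alpha * isotropic_tv(d)
```

For a constant depth map the TV term vanishes, so the energy reduces to the negative summed contrast, recovering the α = 0 case discussed below.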
With this choice our method is a generalization of methods that maximize the contrast at each pixel separately, since they are recovered by choosing α = 0.

The regularization term R imposes some smoothness on the recovered depth map and should therefore depend on the prior knowledge we have about the expected depth maps. In this paper we use the discrete isotropic TV, R(d) = \|Kd\|_{2,1}, where K is the linear operator such that Kd is a matrix with an approximation to the x-derivative in the first column and to the y-derivative in the second column, and \|g\|_{2,1} := \sum_i \sqrt{\sum_j (g_{i,j})^2}.

In this work, we choose the well-known modified Laplacian (MLAP) function [16] as a measure of contrast. Note that under good imaging conditions the thorough study in [4] found that Laplacian-based operators like the modified Laplacian were consistently among the contrast measures yielding the best performance. We determine

    MLAP(i, j, k) = \sum_l |(\partial_{xx} I)(i, j, l, k)| + |(\partial_{yy} I)(i, j, l, k)|,

with (\partial_{xx} I)(i, j, l, k) denoting the approximate second derivative of the l-th color channel of the k-th image in the focus sequence in x-direction at pixel (i, j), using the filter kernel [1, -2, 1]. For each pixel (i, j) we determine the continuous contrast function c_{i,j}(x), depending on a depth x, by an eighth-order polynomial approximation to the MLAP(i, j, k) data. Polynomial approximations for determining the contrast curve have previously been proposed in [17] and offer a good and computationally inexpensive continuous representation of the contrast, which we need for the variational formulation (2). Different from [17], we use a higher-order polynomial to approximate a noise-free contrast curve using all data points; see Figure 2 for an example. Obviously, areas with lots of texture are well suited for determining the depth. In areas with no texture (Figure 2 (b)) the contrast measure consists mostly of noise.
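The MLAP computation and the eighth-order polynomial fit described above can be sketched as follows (a minimal NumPy version; the circular boundary handling via np.roll and the focus positions being indexed 0..K-1 are assumptions of this sketch, and the paper's actual pipeline is a Matlab/CUDA implementation):

```python
import numpy as np

def mlap(stack):
    """stack: (K, H, W, L) focus sequence with L color channels.
    Returns (H, W, K) contrast values using the [1, -2, 1] second-derivative
    kernel in x and y, as in the text."""
    K, H, W, L = stack.shape
    out = np.zeros((H, W, K))
    for k in range(K):
        for l in range(L):
            I = stack[k, :, :, l]
            dxx = np.abs(np.roll(I, -1, 1) - 2 * I + np.roll(I, 1, 1))
            dyy = np.abs(np.roll(I, -1, 0) - 2 * I + np.roll(I, 1, 0))
            out[:, :, k] += dxx + dyy
    return out

def fit_contrast_polynomials(contrast, degree=8):
    """Least-squares polynomial fit of each pixel's contrast curve over the
    focus positions 0..K-1 (np.polyfit fits every column of a 2-D array at
    once; coefficients come back highest power first)."""
    H, W, K = contrast.shape
    x = np.arange(K)
    flat = contrast.reshape(H * W, K).T      # (K, H*W): one column per pixel
    co = np.polyfit(x, flat, deg=degree)     # (degree+1, H*W)
    return co.T.reshape(H, W, degree + 1)
```

The fitted coefficient array can then be fed directly into the evaluation of the data term (2) via np.polyval.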
The magnitude of the contrast values is very low in this case, such that in a variational framework its influence will be small. We can see that the contrast values can be well approximated by smooth curves such as splines or, for the sake of simplicity in the minimization method, by a higher-order polynomial. However, we can also see that even the smoothed contrast curves will not be convex. Thus, our data fidelity does not couple the pixels and is smooth, but nonconvex. The regularization term, on the other hand, is convex, but couples the pixels and is nonsmooth, i.e. not differentiable in the classical sense. In the next section we will propose an efficient algorithm that exploits the structure of the problem to quickly determine depth maps with low energies.

Figure 2: Examples of how the contrast curves look at different pixels (negative contrast plotted over distance): (a) contrast curve at the green point, (b) contrast curve at the cyan point, (c) example image illustrating the positions of the example points. While pixels with a lot of surrounding structure (green point) result in a strong and clear contrast curve with an easy-to-identify minimum, textureless areas (cyan point) cause problems (as expected).
In the literature on nonconvex optimization, particularly the part related to image processing, there exist approaches for problems of the form "smooth nonconvex plus nonsmooth convex" that provably converge to critical points under mild conditions. Such approaches include, for instance, forward-backward splittings (FBS), see [18] and the references therein, and methods based on the difference of convex functions (DC), e.g. [19]. The drawback of these methods is that they require solving a problem like TV denoising at each iteration, which makes these schemes computationally expensive. A simple FBS approach that provably converges to a critical point would be

    d^{k+1} = \arg\min_d \frac{1}{2}\|d - d^k + \tau \nabla D(d^k)\|^2 + \tau \alpha R(d).

To avoid solving such a subproblem to full accuracy at each iteration, we introduce an auxiliary variable g along with the constraint g = Kd, such that we can rewrite the energy minimization problem (1) as

    (\hat{d}, \hat{g}) = \arg\min_{d,g} D(d) + \alpha \|g\|_{2,1} \quad \text{such that } g = Kd.
The constraint g = Kd is enforced iteratively by using the augmented Lagrangian method. The minimization for g and d is done in an alternating fashion, such that a straightforward application of the ADMM would yield

    d^{k+1} = \arg\min_d \frac{\lambda}{2}\|Kd - g^k + b^k\|^2 + D(d),   (3a)
    g^{k+1} = \arg\min_g \frac{\lambda}{2}\|Kd^{k+1} - g + b^k\|^2 + \alpha \|g\|_{2,1},   (3b)
    b^{k+1} = b^k + (Kd^{k+1} - g^{k+1}).   (3c)

Due to the nonconvexity of D(d), subproblem (3a) is still difficult to solve, which is the reason why we additionally incorporate the FBS idea of linearizing the nonconvex term. The final computational scheme becomes

    d^{k+1} = \arg\min_d \frac{\lambda}{2}\|Kd - g^k + b^k\|^2 + \frac{1}{2}\|d - d^k + \tau \nabla D(d^k)\|^2,
    g^{k+1} = \arg\min_g \frac{\lambda}{2}\|Kd^{k+1} - g + b^k\|^2 + \alpha\tau \|g\|_{2,1},
    b^{k+1} = b^k + (Kd^{k+1} - g^{k+1}).   (4)

Note that now each subproblem can be solved very efficiently. The update for d involves the inversion of the operator (\lambda K^T K + I) (a discretization of I - \lambda\Delta), which can be done using a discrete cosine transform. The update for g has a closed-form solution known as soft shrinkage: with z = Kd^{k+1} + b^k,

    g_{i,j} = \frac{z_{i,j}}{\|z_{i,:}\|} \cdot \max(\|z_{i,:}\| - \alpha\tau/\lambda, 0).

In this sense, the proposed algorithm can also be interpreted as solving each FBS subproblem with a single ADMM iteration (and keeping the primal and dual variables). We simply omit converging to the minimum of the convex energy and instead compute a new linearization of the data fidelity term at each iteration. From a different point of view, the above could also be interpreted as a generalization of Bregmanized operator splitting [21]. In the context of the literature on inverse problems, the linearization is related to the Levenberg-Marquardt method.
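A minimal sketch of scheme (4) on a tiny 1-D signal, so that every step is explicit. Two deliberate simplifications, both assumptions of this sketch rather than the paper's implementation: the quadratic data term behind grad_D is a hypothetical stand-in for the negative-contrast fidelity, and the d-update solves the small linear system directly with np.linalg.solve instead of the DCT-based inversion of (λK^T K + I) mentioned above. In 1-D the (2,1)-norm reduces to the ℓ1 norm, so the g-update is scalar soft shrinkage.

```python
import numpy as np

def forward_diff_matrix(n):
    """Explicit 1-D forward-difference operator with Neumann boundary."""
    K = np.eye(n, k=1) - np.eye(n)
    K[-1, :] = 0.0
    return K

def shrink(z, thresh):
    """Soft shrinkage: the closed-form g-update."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

def linearized_admm(grad_D, n, alpha, tau, lam, iters=200):
    """Iterates of scheme (4) for a 1-D depth signal of length n."""
    K = forward_diff_matrix(n)
    d = np.zeros(n); g = np.zeros(n); b = np.zeros(n)
    A = lam * K.T @ K + np.eye(n)            # (lam K^T K + I), factor once
    for _ in range(iters):
        # d-update: (lam K^T K + I) d = d^k - tau grad_D(d^k) + lam K^T (g - b)
        rhs = (d - tau * grad_D(d)) + lam * K.T @ (g - b)
        d = np.linalg.solve(A, rhs)
        # g-update: soft shrinkage of z = K d^{k+1} + b^k, threshold alpha*tau/lam
        g = shrink(K @ d + b, alpha * tau / lam)
        # scaled dual update
        b = b + K @ d - g
    return d
```

With alpha = 0 the shrinkage is the identity and the scheme reduces to a gradient-type iteration on the data term alone, which is a convenient sanity check.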
Although the techniques contributing to the algorithm are all known, the authors are not aware of any literature using the method as stated above, even in the convex setting. Hence, let us briefly state a convergence result in the convex setting:
Theorem 3.1. If D and R are convex and the symmetric Bregman distance with respect to D, S_D(d, v) = \langle d - v, p_d - p_v \rangle with p_d \in \partial D(d), p_v \in \partial D(v), can be bounded by \frac{1}{2\tau}\|d - v\|^2, then the iterates d^k produced by (4) converge to a solution \hat{d} of (1) with

    \tau \left( S_D(d^k, \hat{d}) + \alpha S_R(g^k, K\hat{d}) \right) \le C/k \quad \text{and} \quad \|Kd^{k+1} - g^k\|^2 \le C/k

for some constant C.

Related convergence results for ADMM-type methods in the nonconvex setting typically require that the linear operator (K in our notation) is surjective. Unfortunately, the latter is not the case for our problem. However, the proposed method numerically results in a stable optimization scheme that is more efficient than methods that solve a full convex minimization problem at each iteration.

Another class of methods related to our approach are quadratic penalty methods, cf. [26], which we would obtain by keeping b^k fixed at zero and steadily increasing λ. Variations exist where the alternating minimizations are repeated several times before increasing the penalty parameter, see e.g. [27]. We refer the reader to [28] for a discussion of the convergence analysis for this class of methods. While we find a steadily increasing parameter λ to improve the convergence speed, we observe that including the update of b^k, i.e. doing ADMM instead of a quadratic penalty, leads to a much faster algorithm.

The numerical experiments are divided into three parts. In the first part, we test the proposed framework on simulated data with known ground truth and compare it to classical depth from focus methods. We use the Matlab code provided by the authors of [9] and [4] for the simulation of the defocused data as well as for computing depth maps. The simulation can take an arbitrary input image (typically some texture) and creates a sequence of images (15 in our examples) that are focus blurred as if the texture had a certain shape. A mixture of intensity-dependent and intensity-independent Gaussian white noise is added to the images to simulate realistic camera noise.
We refer to [4] as well as to the accompanying implementation and demonstration for details. The classical depth from focus code allows choosing different contrast measures and different filter sizes, i.e. different mean filters applied to the contrast values before the actual depth is determined. The depth can have subpixel accuracy by using the default three-point Gaussian interpolation. The code additionally allows applying a median filter to the final depth map to further average out errors. Since the objective of this paper is the introduction of variational methods to the depth from focus problem, we limit our comparison to the modified Laplacian (MLAP) contrast measure for the variational as well as for the classical approach and just vary the filtering strategies.

We set up the variational model as proposed in Section 2 and minimize it computationally as described in Section 3. Figure 3 shows a comparison of our framework with differently filtered depth from focus methods, along with the mean squared error (MSE) to the ground truth, for a stack of differently focused images using the texture in the upper left of Figure 3 for the simulation.

As we can see, the variational approach is superior to the classical ones. The image used to simulate the results shown in Figure 3 is textured almost everywhere, such that parts with high contrast can often be identified accurately. However, the simulated depth map changes quickly, which is the reason why there exists a region which does not appear perfectly sharp in any of the 15 simulated images. As we can see in Figure 3, this region causes big problems, and even a 41 × 41 contrast filtering followed by an 11 × 11 median filtering cannot remove the erroneous parts completely. Moreover, strong filtering can only be applied without visible loss of details if the underlying depth map is smooth. The latter holds for the simulated depth maps used for the comparison, but is unrealistic for real data. The problem of unreliable regions has been addressed in [9] before, however, by making a binary decision for or against the reliability of each pixel. Our variational approach handles unreliable regions automatically, in the sense that they do not show large contrast maxima such that their influence on the total energy is small. This allows the method to change the depth at these pixels at low cost and, in this sense, leads to a fuzzy instead of a binary decision on the reliability of pixels.

Figure 3: Comparison of different depth from focus methods for simulated data. Top row from left to right: texture image used for the simulation, true simulated depth map, 3D view of the simulated scene. Bottom row from left to right: MLAP depth estimate with 9 × 9 windowed filtering, MLAP estimate with 41 × 41 windowed filtering and an additional 11 × 11 median filter on the resulting depth image, and our proposed variational scheme with regularization parameter α = 1/2. The mean squared error (MSE) is given for each of the depth map reconstructions.

In addition to the simulation in Figure 3, we chose three more types of depth maps for our simulated data. The root mean square error (RMSE) values in Table 4.1 show that the variational approach gave superior results for all simulated shapes.

Method \ Shape   Cone   Plane   Cosine   Sphere
MLAP 1           2.51   9.87    4.07     8.64
MLAP 2           1.41   5.14    1.83     5.06
TV, α = 1        1.45

Table 4.1: RMSE values of the depth map reconstructions for different simulated shapes. MLAP 1 denotes 9 × 9 windowed filtering of the MLAP contrast coefficients; MLAP 2 denotes 41 × 41 windowed filtering of the MLAP contrast coefficients followed by an 11 × 11 median filter on the depth map. TV denotes our variational approach using TV regularization for different parameters.

Smooth surfaces are not very realistic scenarios for depth maps in practice, because real depth maps contain jumps and discontinuities. Therefore, we compare the different methods on real data in the next section.

4.2 Experiments on Real Data
We test the proposed scheme on two data sets recorded with different cameras. One was downloaded from Said Pertuz' website and consists of 33 color images with 640 × 480 pixels resolution, captured with a Sony SNC RZ50P camera at f = 91mm.

The results obtained by two differently filtered classical approaches as well as by the variational method with three different regularization parameters are shown in Figure 4. As we can see, the variational approaches yield smoother and more realistic depth maps. Even with very large filterings the MLAP depth estimate still contains areas that are erroneously determined to be in the image foreground. Depending on the choice of regularization parameter, the results of our variational approach show different levels of detail and different amounts of errors. However, even for the smallest regularization parameter no parts of the image are estimated to be entirely in the fore- or background. Generally, due to the rather low quality of the data and large textureless parts, the estimation of the depth is very difficult on these images.

Figure 4: Results on the first data set. (a) Example image. (b) 9 × 9 windowed MLAP maximization. (c) 41 × 41 windowed MLAP maximization with an 11 × 11 median filter. (d) Variational approach, α = 1/5. (e) Variational approach, α = 1/7. (f) Variational approach with the smallest regularization parameter.

The latter data set consists of 373 images of a tabletop scene recorded with an ARRI camera at an original vertical resolution of 1080 pixels. Figure 5 compares classical MLAP estimates obtained with a 21 × 21 filtering window and an additional 5 × 5 filter on the depth map to the variational approach for several regularization parameters (e.g. α = 1/4).

For the experiments shown in the previous subsection we used the minimization scheme described in Section 3, which we initialized with the 15 × 15 filtered MLAP depth estimate, additionally blurred with a 21 × 21 mean filter. The latter was done due to the observation that the minimization performs much worse with false initializations that attain extreme values than with moderately blurred depth values. As for the ADMM parameter, we start with λ = 1 and increase the parameter at each iteration by 2%. Since we are using the scaled form of ADMM, we consequently have to divide b^k by 1.02 at each iteration (see [20] for details).

Figure 6: Convergence plots for the linearized ADMM algorithm applied to the variational DFF problem (from left to right: log(energy - min(energy)), log(||d^{k+1} - d^k||^2 + ||g^{k+1} - g^k||^2), and log(||Kd^k - g^k||^2)). As we can see, the total energy as well as the difference of successive iterates show a nice and monotonic decay. The squared ℓ2 norm of the difference between successive iterates decays to a very small order of magnitude after the 1000 iterations. The decay of ||Kd^k - g^k||^2 shows some small oscillations, which, however, we only see once this quantity is already very small. It is remarkable that all three plots are almost linear (in the logarithmic domain), indicating a linear convergence of the displayed quantities.

Starting with a small λ gives the two variables d and g the freedom to behave more independently of one another. Experimentally, this helps to avoid bad local minima. Increasing λ seems to lead to a stabilization as well as to an acceleration of the algorithm. Note that similar effects have previously been observed in the related primal-dual hybrid gradient algorithm in the nonconvex setting in [30]. Throughout our experiments the time step was fixed at τ = 8 and we ran our algorithm for 400 iterations.

To give some numerical evidence of the convergence, we ran the algorithm for 1000 iterations on the ARRI data with a regularization parameter of α = 1/12. We plotted the decay of the energy in a logarithmic fashion, i.e. log(E(d^k) - E(\hat{d})), in Figure 6 on the left hand side. As we can see, the algorithm shows nice decay properties. Also included in Figure 6 are the decays of ||g^{k+1} - g^k||^2 + ||d^{k+1} - d^k||^2 (middle) and ||Kd^k - g^k||^2 (right), both plotted in a logarithmic fashion. Although ||Kd^k - g^k||^2 shows some oscillations in the logarithmic plot, it decayed nicely to a very small value. One can verify from the optimality conditions arising from the updates in (4) that both residuals converging to zero provably leads to convergence to a critical point of our nonconvex energy (along subsequences).

The total computation time for determining the sharpness, determining the polynomial fitting, computing an initial depth estimate, and performing 400 iterations of the minimization algorithm was less than three seconds on a sequence of 33 images with 640 × 480 pixels resolution (corresponding to the image used in Figure 4) using a CUDA GPU implementation. The source code is freely available at https://github.com/adrelino/variational-depth-from-focus . The authors would like to thank Adrian Haarbach, Dennis Mack, and Markus Schlaffer, who ported our Matlab code to the GPU. We would like to point out that the main computational cost of the algorithm currently consists of determining the sharpness values (about 2.38 seconds in the aforementioned example), a task which has to be done even for the classical DFF approaches which do not use regularization. The actual minimization using the ADMM algorithm takes less than 0.2 seconds.
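The bookkeeping behind the scaled-ADMM parameter schedule described above (λ increased by 2% per iteration, b^k divided by 1.02) can be sketched with a toy scalar stand-in; the invariant is that the unscaled multiplier y = λb is unchanged by the rescaling:

```python
# Illustrative sketch (not the authors' code): lambda grows by 2% per
# iteration, and since b is the *scaled* dual variable (b = y / lambda),
# it is divided by the same factor so that y = lambda * b stays put.
lam = 1.0
b = 0.5                      # scaled dual variable (hypothetical value)
y = lam * b                  # unscaled multiplier, should stay invariant
for _ in range(400):
    lam *= 1.02              # increase the penalty parameter by 2%
    b /= 1.02                # rescale the scaled dual variable accordingly
```

Only the quadratic coupling gets stronger over the iterations; the dual information accumulated in b is preserved.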
In this paper we proposed a variational approach to the shape from focus problem, which can be seen as a generalization of current common shape from focus approaches. It uses an efficient nonconvex minimization scheme to determine depth maps based on prior knowledge, such as a realistic depth map often being piecewise constant. We showed in several numerical experiments that the proposed approach often yields results superior to classical depth from focus techniques. In future work we will incorporate more sophisticated regularizations in our studies. Additionally, we will work on correcting the inherent loss of contrast caused by the total variation regularization by applying nonlinear Bregman iterations in the fashion of [31] to our nonconvex optimization problem.
Acknowledgements
M.B. and C.S. were supported by EPSRC Grant EP/F047991/1 and the Microsoft Research Connections. C.S. additionally acknowledges the support of the KAUST Award No. KUK-I1-007-43. M.M. and D.C. were supported by the ERC Starting Grant "ConvexVision".
References

[1] P. Favaro and S. Soatto, 3-D Shape Estimation and Image Restoration: Exploiting Defocus and Motion Blur. Secaucus, NJ, USA: Springer New York, Inc., 2006.
[2] P. Favaro, S. Soatto, M. Burger, and S. Osher, "Shape from defocus via diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 3, pp. 518-531, 2008.
[3] X. Lin, J. Suo, X. Cao, and Q. Dai, "Iterative feedback estimation of depth and radiance from defocused images," in Computer Vision - ACCV 2012, ser. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2013, vol. 7727, pp. 95-109.
[4] S. Pertuz, D. Puig, and M. A. Garcia, "Analysis of focus measure operators for shape-from-focus," Pattern Recognition, vol. 46, no. 5, pp. 1415-1432, 2013.
[5] A. Thelen, S. Frey, S. Hirsch, and P. Hering, "Improvements in shape-from-focus for holographic reconstructions with regard to focus operators, neighborhood-size, and height value interpolation," IEEE Transactions on Image Processing, vol. 18, no. 1, pp. 151-157, 2009.
[6] M. Mahmood and T.-S. Choi, "Nonlinear approach for enhancement of image focus volume in shape from focus," IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2866-2873, 2012.
[7] M. Muhammad, H. Mutahira, A. Majid, and T. Choi, "Recovering 3D shape of weak textured surfaces," in Proceedings of the 2009 International Conference on Computational Science and Its Applications, ser. ICCSA '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 191-197.
[8] M. Muhammad and T. Choi, "An unorthodox approach towards shape from focus," in 2011 18th IEEE International Conference on Image Processing (ICIP), 2011, pp. 2965-2968.
[9] S. Pertuz, D. Puig, and M. Garcia, "Reliability measure for shape-from-focus," Image and Vision Computing, vol. 31, no. 10, pp. 725-734, 2013.
[10] M. Mahmood, "Shape from focus by total variation," 2013, pp. 1-4.
[11] P. Favaro, "Recovering thin structures via nonlocal-means regularization with application to depth from defocus," in 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1133-1140.
[12] H. Liu, Y. Jia, H. Cheng, and S. Wei, "Depth recovery from defocus images using total variation," in 2010 Second International Conference on Computer Modeling and Simulation (ICCMS '10), vol. 2, 2010, pp. 146-150.
[13] V. Namboodiri, S. Chaudhuri, and S. Hadap, "Regularized depth from defocus," in 2008 15th IEEE International Conference on Image Processing (ICIP), 2008, pp. 1520-1523.
[14] V. Gaganov and A. Ignatenko, "Robust shape from focus via Markov random fields," in GraphiCon'2009, 2009, pp. 74-80.
[15] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259-268, 1992.
[16] S. Nayar and Y. Nakagawa, "Shape from focus," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 8, pp. 824-831, 1994.
[17] M. Subbarao and T. Choi, "Accurate recovery of three-dimensional shape from image focus," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 3, pp. 266-274, 1995.
[18] H. Attouch, J. Bolte, and B. Svaiter, "Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods," Mathematical Programming, vol. 137, no. 1-2, pp. 91-129, 2013.
[19] T. P. Dinh, H. Le, H. L. Thi, and F. Lauer, "A difference of convex functions algorithm for switched linear regression," to appear in IEEE Transactions on Automatic Control.
[20] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-122, 2011.
[21] X. Zhang, M. Burger, X. Bresson, and S. Osher, "Bregmanized nonlocal regularization for deconvolution and sparse reconstruction," SIAM Journal on Imaging Sciences, vol. 3, no. 3, pp. 253-276, 2010. [Online]. Available: http://dx.doi.org/10.1137/090746379
[22] J. Cai, S. Osher, and Z. Shen, "Split Bregman methods and frame based image restoration," Multiscale Modeling & Simulation, vol. 8, no. 2, pp. 337-369, 2010.
[23] T. Valkonen, "A primal-dual hybrid gradient method for non-linear operators with applications to MRI," Inverse Problems, vol. 30, no. 5, p. 055012, 2014.
[24] T. Moellenhoff, E. Strekalovskiy, M. Moeller, and D. Cremers, "The primal-dual hybrid gradient method for semiconvex splittings," 2014, preprint, arXiv: http://arxiv.org/pdf/1407.1723.pdf.
[25] G. Li and T. Pong, "Splitting method for nonconvex composite optimization," 2014, preprint, arXiv: http://arxiv.org/pdf/1407.0753.pdf.
[26] M. Nikolova, M. K. Ng, and C.-P. Tam, "Fast nonconvex nonsmooth minimization methods for image restoration and reconstruction," IEEE Transactions on Image Processing, vol. 19, no. 12, pp. 3073-3088, 2010.
[27] D. Krishnan and R. Fergus, "Fast image deconvolution using hyper-Laplacian priors," in Advances in Neural Information Processing Systems (NIPS), 2009, pp. 1033-1041.
[28] H. Attouch, J. Bolte, P. Redont, and A. Soubeyran, "Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Lojasiewicz inequality," Mathematics of Operations Research, vol. 35, no. 2, pp. 438-457, May 2010.
[29] S. Andriani, H. Brendel, T. Seybold, and J. Goldstone, "Beyond the Kodak image set: a new reference set of color image sequences," in IEEE International Conference on Image Processing (ICIP), 2013, pp. 2289-2293.
[30] E. Strekalovskiy and D. Cremers, "Real-time minimization of the piecewise smooth Mumford-Shah functional," in Proceedings of the European Conference on Computer Vision (ECCV), 2014, to appear.
[31] M. Bachmayr and M. Burger, "Iterative total variation schemes for nonlinear inverse problems," Inverse Problems, vol. 25, 2009.

Appendix: Proof of Theorem 3.1
Proof.
Let \hat{d} be a critical point and denote \hat{g} = K\hat{d}, \hat{b} \in \frac{\alpha\tau}{\lambda} \partial\|\hat{g}\|_{2,1}. Furthermore, denote d_e^k = d^k - \hat{d}, g_e^k = g^k - \hat{g}, b_e^k = b^k - \hat{b}, p_e^k = \nabla D(d^k) - \nabla D(\hat{d}), and q_e^k \in \partial\|g^k\|_{2,1} - \partial\|\hat{g}\|_{2,1}. Then the optimality conditions arising from algorithm (4) yield

    0 = \lambda K^T (K d_e^{k+1} - g_e^k + b_e^k) + d_e^{k+1} - d_e^k + \tau p_e^k,
    0 = -\lambda (K d_e^{k+1} - g_e^{k+1} + b_e^k) + \alpha\tau q_e^{k+1}.

The inner product of the first equation with d_e^{k+1} yields

    0 = \lambda \langle K d_e^{k+1} - g_e^k + b_e^k, K d_e^{k+1} \rangle + \|d_e^{k+1}\|^2 - \langle d_e^k, d_e^{k+1} \rangle + \tau \langle p_e^k, d_e^{k+1} \rangle
      = \frac{\lambda}{2} \left( \|K d_e^{k+1} - g_e^k\|^2 + \|K d_e^{k+1}\|^2 - \|g_e^k\|^2 \right) + \lambda \langle b_e^k, K d_e^{k+1} \rangle + \tau \langle p_e^k, d_e^{k+1} \rangle + \frac{1}{2} \left( \|d_e^{k+1}\|^2 + \|d_e^k - d_e^{k+1}\|^2 - \|d_e^k\|^2 \right).

The inner product of the second equation with g_e^{k+1} results in

    0 = -\lambda \langle K d_e^{k+1} - g_e^{k+1} + b_e^k, g_e^{k+1} \rangle + \alpha\tau \langle q_e^{k+1}, g_e^{k+1} \rangle
      = \frac{\lambda}{2} \left( \|K d_e^{k+1} - g_e^{k+1}\|^2 + \|g_e^{k+1}\|^2 - \|K d_e^{k+1}\|^2 \right) - \lambda \langle b_e^k, g_e^{k+1} \rangle + \alpha\tau \langle q_e^{k+1}, g_e^{k+1} \rangle.

Adding the two equations above leads to

    0 = \frac{\lambda}{2} \left( \|K d_e^{k+1} - g_e^{k+1}\|^2 + \|g_e^{k+1}\|^2 - \|g_e^k\|^2 + \|K d_e^{k+1} - g_e^k\|^2 \right) + \lambda \langle b_e^k, K d_e^{k+1} - g_e^{k+1} \rangle
        + \frac{1}{2} \left( \|d_e^{k+1}\|^2 + \|d_e^k - d_e^{k+1}\|^2 - \|d_e^k\|^2 \right) + \tau \langle p_e^k, d_e^{k+1} \rangle + \alpha\tau S_R(g^{k+1}, \hat{g}),   (5)

where we used the fact that \langle q_e^{k+1}, g_e^{k+1} \rangle = S_R(g^{k+1}, \hat{g}). We use the update formula for b^{k+1} to obtain

    \langle b_e^k, K d_e^{k+1} - g_e^{k+1} \rangle = \frac{1}{2} \left( \|b_e^{k+1}\|^2 - \|b_e^k\|^2 - \|K d_e^{k+1} - g_e^{k+1}\|^2 \right).   (6)

Due to the linearization we have the term \langle p_e^k, d_e^{k+1} \rangle in our current estimate instead of the symmetric Bregman distance. Thus, we estimate

    \langle p_e^k, d_e^{k+1} \rangle = S_D(d^k, \hat{d}) + \langle p_e^k, d_e^{k+1} - d_e^k \rangle
      = S_D(d^k, \hat{d}) - S_D(d^k, d^{k+1}) + \langle p^{k+1}, d^{k+1} - d^k \rangle - \langle \hat{p}, d^{k+1} - d^k \rangle
      \ge S_D(d^k, \hat{d}) - S_D(d^k, d^{k+1}) + D(d^{k+1}) - D(d^k) - \langle \hat{p}, d^{k+1} - d^k \rangle,   (7)

where we used p^{k+1} \in \partial D(d^{k+1}) along with the convexity of D for the last inequality. Inserting (6) and (7) into (5) yields

    0 \ge \frac{\lambda}{2} \left( \|K d_e^{k+1} - g_e^{k+1}\|^2 + \|g_e^{k+1}\|^2 - \|g_e^k\|^2 + \|K d_e^{k+1} - g_e^k\|^2 \right) + \frac{\lambda}{2} \left( \|b_e^{k+1}\|^2 - \|b_e^k\|^2 - \|K d_e^{k+1} - g_e^{k+1}\|^2 \right)
        + \frac{1}{2} \left( \|d_e^{k+1}\|^2 - \|d_e^k\|^2 \right) + \frac{1}{2} \|d_e^k - d_e^{k+1}\|^2 - \tau S_D(d^k, d^{k+1})
        + \tau S_D(d^k, \hat{d}) + \alpha\tau S_R(g^{k+1}, \hat{g}) + \tau \left( D(d^{k+1}) - D(d^k) - \langle \hat{p}, d^{k+1} - d^k \rangle \right)
      \ge \frac{\lambda}{2} \left( \|g_e^{k+1}\|^2 - \|g_e^k\|^2 + \|b_e^{k+1}\|^2 - \|b_e^k\|^2 \right) + \frac{\lambda}{2} \|K d_e^{k+1} - g_e^k\|^2 + \frac{1}{2} \left( \|d_e^{k+1}\|^2 - \|d_e^k\|^2 \right)
        + \tau S_D(d^k, \hat{d}) + \alpha\tau S_R(g^{k+1}, \hat{g}) + \tau \left( D(d^{k+1}) - D(d^k) - \langle \hat{p}, d^{k+1} - d^k \rangle \right),

where we used that \frac{1}{2} \|d_e^{k+1} - d_e^k\|^2 - \tau S_D(d^k, d^{k+1}) \ge 0 by the assumption of the theorem. We sum from k = 0 to n to obtain

    0 \ge \frac{\lambda}{2} \left( \|g_e^{n+1}\|^2 - \|g_e^0\|^2 + \|b_e^{n+1}\|^2 - \|b_e^0\|^2 \right) + \frac{1}{2} \left( \|d_e^{n+1}\|^2 - \|d_e^0\|^2 \right) + \frac{\lambda}{2} \sum_{k=0}^n \|K d_e^{k+1} - g_e^k\|^2
        + \tau \sum_{k=0}^n \left( S_D(d^k, \hat{d}) + \alpha S_R(g^{k+1}, \hat{g}) \right) + \tau \underbrace{\left( D(d^{n+1}) - D(\hat{d}) - \langle \hat{p}, d^{n+1} - \hat{d} \rangle \right)}_{\ge 0} - \tau \left( D(d^0) - D(\hat{d}) - \langle \hat{p}, d^0 - \hat{d} \rangle \right),

such that we finally obtain

    \frac{\lambda}{2} \left( \|g_e^0\|^2 + \|b_e^0\|^2 \right) + \frac{1}{2} \|d_e^0\|^2 + \tau \left( D(d^0) - D(\hat{d}) - \langle \hat{p}, d^0 - \hat{d} \rangle \right) \ge \frac{\lambda}{2} \sum_{k=0}^n \|K d_e^{k+1} - g_e^k\|^2 + \tau \sum_{k=0}^n \left( S_D(d^k, \hat{d}) + \alpha S_R(g^{k+1}, \hat{g}) \right).

Since the sums are bounded for all n and the summands are nonnegative, we can conclude their convergence to zero at a rate at least as fast as 1/k.