Combining Weighted Total Variation and Deep Image Prior for natural and medical image restoration via ADMM
Pasquale Cascarano, Andrea Sebastiani, Maria Colomba Comes, Giorgia Franchini, Federica Porta
ADMM-DIPTV: combining Total Variation and Deep Image Prior for image restoration
Pasquale Cascarano · Andrea Sebastiani · Maria Colomba Comes

Pasquale Cascarano. Address: University of Bologna, Piazza di Porta San Donato, 5, Bologna. E-mail: [email protected]
Andrea Sebastiani. Address: University of Bologna, Piazza di Porta San Donato, 5, Bologna. E-mail: [email protected]
Maria Colomba Comes. Address: University of Rome Tor Vergata. E-mail: [email protected]

Abstract
In the last decades, unsupervised deep learning based methods have caught researchers' attention, since in many applications collecting a large amount of training examples is not always feasible. Moreover, the construction of a good training set is time consuming and hard because the selected data have to be representative enough for the task. In this paper, we mainly focus on the Deep Image Prior (DIP) framework, powered by the addition of the Total Variation regularizer which promotes gradient-sparsity of the solution. Differently from other existing approaches, we solve the arising minimization problem by using the well-known Alternating Direction Method of Multipliers (ADMM) framework, decoupling the contributions of the DIP L2-norm term and of the Total Variation term. The promising performances of the proposed approach, in terms of PSNR and SSIM values, are assessed by means of experiments for different image restoration tasks on synthetic as well as on real data.

Keywords
ADMM · Deep Image Prior · Total Variation · Image Restoration

1 Introduction
The task of image restoration aims to recover a visually pleasant image, that is, a clean and sharp one, from a blurred and noisy observation. Mathematically, for a given blurred and noisy image g ∈ Rⁿ, the problem can be written as an inverse problem of the following form:

Find u ∈ Rⁿ  s.t.  Hu + η = g,   (1)

where H ∈ R^{n×n} is a known operator which models the blur and η ∈ Rⁿ is a realization of the random white Gaussian noise affecting g. Problems of the form (1) are well known to be ill-posed [1]. Therefore, it is impossible to invert the operator H to find u from (1), due to the lack of stability and/or uniqueness properties. In the field of image restoration, many methods based on different approaches have been proposed in order to provide an estimate u* of the desired solution. The most famous and promising methods can be mainly divided into two categories: regularized reconstruction based methods [2] and learning based methods [3; 4]. The regularized reconstruction based approaches convert the problem into an optimization problem whose objective function has the following form:

u* ∈ arg min_u ½‖Hu − g‖₂² + λ R(u),   (2)

where the first and the second term are referred to as the fidelity term and the regularization term, respectively. The fidelity term models the noise affecting g; in this work we use the squared L2-norm, since we suppose that the noise comes from a zero-mean Gaussian distribution with standard deviation σ. The regularization term encodes prior information on the solution, such as its sparsity or regularity [5]. The positive scalar parameter λ balances the trade-off between the two terms of the sum. A popular choice for R is the Total Variation [6], which in the discrete setting is defined as follows:

TV(u) = Σ_{i=1}^{n} √((D_h u)_i² + (D_v u)_i²),   (3)

where D_h and D_v are the first order finite difference discrete operators along the horizontal and vertical axes, respectively.

Recently, learning based approaches have become popular due to their outstanding performances [3; 4]. The supervised learning based methods [7] make use of Deep Neural Network (DNN) architectures to learn the correlation between degraded images and their clean counterparts from a set of example pairs. In mathematical terms, they attempt to solve the following minimization problem:

θ* ∈ arg min_θ L(f_θ(G), U),   (4)

where f_θ is a fixed Deep Neural Network architecture with weights θ, L is a fixed loss function and (G, U) is a training set of degraded-clean example pairs (with G and U we mean the set of degraded input images and the set of targets, respectively). Once (4) is solved by means of standard stochastic optimization algorithms, e.g., ADAM [8] or Stochastic Gradient Descent (SGD) [9], for a given degraded image g an approximation u* of the desired solution u is obtained as u* = f_{θ*}(g). However, the success of this supervised framework is strictly related to the fixed training set, whose elements are hard to collect in some practical applications.
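For illustration, the operators D_h, D_v and the discrete TV in (3) can be realized with forward finite differences. The following is a minimal PyTorch sketch; the helper names are ours and the boundary handling (zero difference at the border) is an assumption, not taken from the official implementation:

```python
import torch

def forward_diff(u):
    """Forward finite differences D_h u and D_v u of Eq. (3).
    Assumption: zero difference at the last row/column (Neumann-type boundary)."""
    dh = torch.zeros_like(u)
    dv = torch.zeros_like(u)
    dh[..., :, :-1] = u[..., :, 1:] - u[..., :, :-1]   # horizontal differences
    dv[..., :-1, :] = u[..., 1:, :] - u[..., :-1, :]   # vertical differences
    return dh, dv

def isotropic_tv(u):
    """Discrete isotropic TV of Eq. (3): sum of pixelwise gradient magnitudes."""
    dh, dv = forward_diff(u)
    return torch.sqrt(dh**2 + dv**2).sum()
```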
Since collecting and curating such training sets is often impractical, we focus on unsupervised deep learning based methods. One of the most famous unsupervised approaches is the Deep Image Prior (DIP), introduced by Ulyanov et al. in [10] and studied in [11; 12; 13], which aims to solve the following minimization problem:

θ* ∈ arg min_θ ½‖H f_θ(z) − g‖₂²,  s.t.  u* = f_{θ*}(z),   (5)

where f_θ is a fixed Convolutional Neural Network (CNN) architecture and z is a random input vector. The CNN generator is initialized with random weights, and these weights are iteratively optimized so that the output of the network f_{θ*}(z) is as close to the target u as possible. Ulyanov et al. showed empirically how the architecture of a deep CNN, used as a generative network, is able to represent natural images, but not random noise, without a fixed set of training examples. In this work, we show how the performances of DIP can be powered by introducing a further regularizer into the objective function in (5), and how the resulting optimization problem can be solved in an Alternating Direction Method of Multipliers (ADMM) framework [14]. In the second section we introduce our ADMM-DIPTV method and we show how the resulting ADMM substeps can be easily solved. In the third section we present some numerical experiments on test problems and we compare the results with the standard DIP [10] and DIPTV [15].

2 The ADMM-DIPTV method

The main goal of ADMM-DIPTV is to increase the performances of the Deep Image Prior framework by adding the isotropic Total Variation regularizer (3). Our proposal attempts to solve the unconstrained and nonconvex minimization problem

arg min_θ ½‖H f_θ(z) − g‖₂² + λ Σ_{i=1}^{n} √((D_h f_θ(z))_i² + (D_v f_θ(z))_i²),   (6)

which is equivalent to the following constrained optimization problem:

arg min_{θ,t} ½‖H f_θ(z) − g‖₂² + λ Σ_{i=1}^{n} ‖t_i‖₂,  s.t.  D f_θ(z) = t,   (7)

where D stacks the two discrete gradient operators, D = (D_h; D_v), so that t_i ∈ R² collects the horizontal and vertical differences at pixel i. We attempt to solve the minimization problem (7) by means of the ADMM algorithm, which has recently been deeply investigated and applied in nonconvex image restoration frameworks [16; 17; 18]. The augmented Lagrangian function associated with problem (7) reads:

L(θ, t, λ_t) = ½‖H f_θ(z) − g‖₂² + λ Σ_{i=1}^{n} ‖t_i‖₂ + (β_t/2)‖D f_θ(z) − t‖₂² + ⟨λ_t, D f_θ(z) − t⟩,   (8)

where β_t is a positive scalar, called the penalty parameter, and λ_t is the Lagrangian multiplier associated with the constraint D f_θ(z) = t. According to the ADMM framework, we seek its saddle point by minimizing with respect to the primal variables θ and t, alternatively, and by maximizing with respect to the dual variable λ_t. Upon suitable initialization of the variables involved, the k-th iteration of the ADMM algorithm reads as follows:

θ^{k+1} ∈ arg min_θ ½‖H f_θ(z) − g‖₂² + (β_t/2)‖D f_θ(z) − t^k + λ_t^k/β_t‖₂²,   (9)

t^{k+1} = arg min_t λ Σ_{i=1}^{n} ‖t_i‖₂ + (β_t/2)‖t − (D f_{θ^{k+1}}(z) + λ_t^k/β_t)‖₂²,   (10)

λ_t^{k+1} = λ_t^k + β_t (D f_{θ^{k+1}}(z) − t^{k+1}).   (11)

The first problem (9) is solved inexactly by applying one iterate of the gradient-based ADAM method with respect to the variable θ; the numerical gradient is computed by means of back-propagation [19; 20]. We observe that this optimization problem is very close in spirit to the one solved in the classical DIP framework: in this particular case, we force D f_θ(z) to be close to t^k − λ_t^k/β_t.
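The second subproblem (10) is separable over pixels, and its solution is the pixelwise 2D soft-thresholding of the point D f_{θ^{k+1}}(z) + λ_t^k/β_t with threshold λ/β_t. A minimal sketch, under the same illustrative conventions as above (the two gradient components are passed as separate arrays):

```python
import torch

def prox_l2_2d(qh, qv, thresh):
    """Pixelwise 2D soft-thresholding: the proximity operator of
    thresh * sum_i ||t_i||_2, evaluated at the point q = (qh, qv)."""
    mag = torch.sqrt(qh**2 + qv**2)                        # |q_i| at each pixel
    scale = torch.clamp(mag - thresh, min=0.0) / mag.clamp(min=1e-12)
    return scale * qh, scale * qv                          # shrink each 2D vector q_i
```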
From a numerical point of view, the squared L2-norm penalty appearing in (9) provides a stabilizing and robustifying effect on the DIP minimization, while the second problem (10) is solved exactly, in closed form, by applying the 2D L2-norm proximity operator sketched above to the point D f_{θ^{k+1}}(z) + λ_t^k/β_t.

Our method is inspired by the DIPTV method introduced in [15]. However, we point out that, differently from our ADMM-DIPTV, DIPTV makes use of the anisotropic Total Variation [21], which decouples the contributions of the horizontal and vertical gradient components, while our method exploits the isotropic Total Variation, in which the gradient components are jointly considered. Moreover, DIPTV directly minimizes the objective function using a gradient-based optimization method (e.g., ADAM or SGD), whereas in this work we make use of the ADMM procedure described previously. In this paper, we have used the CNN architecture represented in Figure 1, which is an adaptation of the U-net architecture proposed in [10].

Fig. 1: The CNN architecture (based on the U-net) used in our method [10; 15]: an encoder-decoder with skip connections, 3×3 convolutions, Leaky ReLU activations, max pooling and upsampling layers.
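Putting the three updates (9)-(11) together, one possible outer loop reads as follows. This is a sketch for the denoising case H = I, reusing the forward_diff and prox_l2_2d helpers above, where net is a DIP-style generator (e.g., the U-net of Figure 1) and z a fixed random input; all names and default values are illustrative assumptions, not the official code:

```python
import torch

def admm_diptv(net, z, g, lam=0.01, beta=1.0, n_iters=3000, lr=1e-3):
    """Sketch of the ADMM-DIPTV iteration (9)-(11) for denoising (H = I)."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    with torch.no_grad():                       # initialize t^0 = D f_theta(z)
        th, tv = forward_diff(net(z))
    lh = torch.zeros_like(th)                   # dual variable, horizontal part
    lv = torch.zeros_like(tv)                   # dual variable, vertical part
    for _ in range(n_iters):
        # (9): one ADAM iterate on the theta-subproblem
        opt.zero_grad()
        u = net(z)
        dh, dv = forward_diff(u)
        loss = 0.5 * ((u - g) ** 2).sum() \
             + 0.5 * beta * (((dh - th + lh / beta) ** 2).sum()
                             + ((dv - tv + lv / beta) ** 2).sum())
        loss.backward()
        opt.step()
        with torch.no_grad():
            # (10): closed-form t-update by 2D soft-thresholding
            dh, dv = forward_diff(net(z))
            th, tv = prox_l2_2d(dh + lh / beta, dv + lv / beta, lam / beta)
            # (11): dual ascent step on lambda_t
            lh += beta * (dh - th)
            lv += beta * (dv - tv)
    return net(z).detach()
```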
3 Numerical experiments

In this section, we present experimental results on the image denoising problem, which means that H is set equal to the identity operator in (1). For our ADMM-DIPTV and for the other competing methods (i.e., DIP and DIPTV), the algorithmic parameters and the number of iterations to be performed are optimized in order to obtain the best trade-off between the PSNR metric and the visual quality expressed by the SSIM metric [22]. We consider a set of 6 images, composed by 4 RGB and 2 grayscale images; these images are shown in Figure 2. The starting degraded grayscale and RGB images are created by applying the image formation model (1) to the images in Figure 2. The codes and the images used for these numerical experiments are available online at https://github.com/sedaboni/ADMM-DIPTV.

First, we test our algorithm on the standard butterfly image corrupted with different levels of Gaussian noise, and we compare the reconstructions obtained with our method to those of the standard DIP algorithm. These results, represented in Figure 3, show that our method preserves the image structure, outperforming DIP also at high noise levels, both in terms of PSNR and SSIM. In particular, ADMM-DIPTV recovers smooth regions without introducing visual artifacts.

As a second test, we compare the standard DIP method with DIPTV and our ADMM-DIPTV on the geometric grayscale tomo image depicted in Figure 2. The image has some low contrast and high contrast patches with big and small objects. We corrupt the ground truth by adding white Gaussian noise with standard deviation equal to 30. The starting noisy image is reported in Figure 4, with 4 different close-ups highlighting low and high contrast as well as small and big details. We point out that the small and low contrast circles are completely covered by the added Gaussian noise (yellow arrows in Figure 4).

Fig. 2: The set of images used for the numerical experiments. Top row, from left to right: butterfly, lighthouse and house images. Bottom row, from left to right: hill, tomo and cameraman images.
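For reference, the degraded data of model (1) with H = I, as well as the PSNR and SSIM metrics, can be obtained along the following lines (a sketch based on NumPy and scikit-image; σ is the noise standard deviation on the [0, 255] intensity scale, and the helper name is ours):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def degrade(u, sigma, seed=0):
    """Apply the image formation model (1) with H = I: g = u + eta,
    with eta white Gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return u + sigma * rng.standard_normal(u.shape)

# example usage on a [0, 255] grayscale image u:
# g = degrade(u, sigma=30)
# psnr = peak_signal_noise_ratio(u, np.clip(g, 0, 255), data_range=255)
# ssim = structural_similarity(u, np.clip(g, 0, 255), data_range=255)
```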
Fig. 3: ADMM-DIPTV (top row) and DIP (bottom row) reconstructions for the RGB butterfly test image with different levels of degradation. In the green and red boxes we depict the reconstructed images and the starting noisy images, respectively. From left to right, the standard deviation of the noise is equal to 10, 25 and 50, respectively. The PSNR and SSIM values achieved by each reconstruction are reported over the corresponding panels.

The image obtained by applying the DIP algorithm shows some issues in reconstructing the low contrast patches of the image: the edges are not sharp and look out of focus, and the small details are not perfectly retrieved. Moreover, the close-up showing high contrast crossed and circular objects reveals the presence of artifacts over the edges, and the noise seems not to be perfectly removed. The addition of TV to the standard DIP framework seems to solve all the aforementioned issues. For both DIPTV and ADMM-DIPTV, the low contrast small circles are perfectly retrieved, while the higher contrast details show sharp edges.

In order to analyse in detail the results on this synthetic image over low and high contrast details, we represent in Figure 5 the line profiles of rows 90 (low contrast line profile) and 370 (high contrast line profile). The first and the second rows of Figure 5 show the low and high contrast line profiles, respectively. We show by red lines the starting corrupted line profiles and their reconstructions by DIP, DIPTV and our ADMM-DIPTV, all of them superimposed onto the ground-truth line profiles (blue lines). It is evident that the standard DIP is insufficient to get rid of the noise, since both the low and high contrast line profiles look oscillating. The low contrast line profile obtained by DIP misses the small low contrast peak, which corresponds to the small low contrast circle in the DIP solution reported in Figure 4.
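Line profiles such as those of Figure 5 can be reproduced by simply slicing the image arrays at the chosen rows; a small illustrative helper (names and plotting choices are ours):

```python
import matplotlib.pyplot as plt

def plot_line_profiles(ground_truth, recon, rows=(90, 370)):
    """Plot reconstructed line profiles (red) over ground-truth ones (blue),
    in the spirit of Figure 5. Inputs are 2D grayscale arrays."""
    fig, axes = plt.subplots(len(rows), 1, figsize=(8, 6))
    for ax, r in zip(axes, rows):
        ax.plot(ground_truth[r, :], color="blue", label="ground truth")
        ax.plot(recon[r, :], color="red", label="reconstruction")
        ax.set_title(f"row {r}")
        ax.legend()
    plt.tight_layout()
    plt.show()
```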
Fig. 4: DIP, DIPTV and ADMM-DIPTV reconstructions for the grayscale test image tomo degraded with noise level σ = 30 (NOISY). The red squares highlight different close-ups of various patches. The yellow arrows highlight the two small low contrast circles.
This simple test shows how the use of TV promotes the reconstruction of piecewise constant regions with respect to the standard DIP. Moreover, we observe that, thanks to the variable splitting, our ADMM-DIPTV does not suffer from the loss-of-contrast issue which affects DIPTV.

For the case of RGB image denoising, we compare the performances of the three methods on the hill image degraded with additive white Gaussian noise with standard deviation σ = 30. In Figure 6 we depict these results with 4 different close-ups highlighting some main details. Differently from DIP, the introduction of the TV regularization filters the noise and preserves the details, without the presence of artifacts. We can notice again that DIPTV and ADMM-DIPTV are more consistent in presence of noise. Moreover, with respect to DIPTV, ADMM-DIPTV improves the sharpness of the edges without loss of focus on small details.

Finally, we perform some tests on all the 6 images in Figure 2 with different noise levels, for the purpose of determining which algorithm performs better. In Table 1 we report the PSNR and SSIM metrics of the aforementioned results. In general, our method outperforms the others, especially at high noise levels: the gain ranges from 0.1 dB to 1 dB in the PSNR metric.

4 Conclusions

In this paper we have presented a new algorithm which extends the classical DIP framework by adding an isotropic Total Variation term and solving the arising optimization problem in an ADMM framework, differently from DIPTV [15], which makes use of an anisotropic Total Variation term and solves the arising optimization problem by a standard gradient-based method. The usage of the ADMM splitting ensures stability to the algorithm as the noise increases, and allows the addition of different handcrafted regularizers to the standard DIP framework. Our approach reaches comparable and often better performances than the standard DIP [10] and DIPTV [15].
Fig. 5: Line profiles of the starting tomo grayscale test image, degraded with noise level σ = 30, and of the DIP, DIPTV and ADMM-DIPTV reconstructions. The rows considered are the 90-th (top) and the 370-th (bottom).
Fig. 6: DIP, DIPTV and ADMM-DIPTV reconstructions for the RGB test image hill degraded with noise level σ = 30 (NOISY). The blue squares highlight different close-ups of various patches.

            cameraman       lighthouse      hill            house           tomo
            PSNR    SSIM    PSNR    SSIM    PSNR    SSIM    PSNR    SSIM    PSNR    SSIM
σ = 20
DIP         29.087  0.908   30.670  0.933   31.585  0.920   24.920  0.831   36.371  0.944
DIPTV       29.169  0.908   30.708  0.939   31.727  0.924   23.968  0.791   35.760  0.944
ADMM-DIPTV  29.882  0.920   31.015  0.942   31.926  0.925   24.989  0.832   36.333  0.942
σ = 30
DIP         27.576  0.881   28.425  0.907   29.337  0.885   24.456  0.825   31.840  0.919
DIPTV       28.008  0.890   29.062  0.925   30.535  0.915   23.691  0.788   30.999  0.917
ADMM-DIPTV  28.098  0.895   29.142  0.921   30.641  0.914   24.825  0.833   32.983  0.926
σ = 40
DIP         25.441  0.820   26.706  0.875   27.382  0.847   23.794  0.808   29.764  0.910
DIPTV       26.044  0.858   27.426  0.901   29.121  0.901   23.550  0.786   29.525  0.911
ADMM-DIPTV  26.495  0.861   27.568  0.905   29.307  0.902   23.659  0.804   30.203  0.917
σ = 50
DIP         23.510  0.728   25.925  0.860   27.424  0.862   22.717  0.780   27.463  0.894
DIPTV       24.988  0.831   26.252  0.883   28.165  0.892   22.751  0.773   27.333  0.903
ADMM-DIPTV  25.583  0.846   26.474  0.891   27.549  0.884   22.850  0.782   29.996  0.915

Table 1: PSNR and SSIM values for DIP, DIPTV and ADMM-DIPTV on test problems with noise levels σ = 20, 30, 40 and 50. In blue we highlight the best results.
Acknowledgements
This work has been partially supported by GNCS-INDAM.
References
1. Miguel Moscoso. Introduction to image reconstruction. In Inverse Problems and Imaging, pages 1–16. Springer, 2008.
2. KB Amur. Some regularization strategies for an ill-posed denoising problem. International Journal of Tomography and Statistics, 19(1):46–59, 2012.
3. Harold C Burger, Christian J Schuler, and Stefan Harmeling. Image denoising: Can plain neural networks compete with BM3D? In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 2392–2399. IEEE, 2012.
4. Stamatios Lefkimmiatis. Non-local color image denoising with Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3587–3596, 2017.
5. W Clem Karl. Regularization in image restoration and reconstruction. In Handbook of Image and Video Processing, pages 183–202. Elsevier, 2005.
6. Leonid I Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1-4):259–268, 1992.
7. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning, volume 1. MIT Press, Cambridge, 2016.
8. Diederik P Kingma and Jimmy Ba. ADAM: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
9. Léon Bottou. Stochastic gradient descent tricks. In Neural Networks: Tricks of the Trade, pages 421–436. Springer, 2012.
10. Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep Image Prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9446–9454, 2018.
11. Dongwei Ren, Kai Zhang, Qilong Wang, Qinghua Hu, and Wangmeng Zuo. Neural blind deconvolution using deep priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3341–3350, 2020.
12. Sören Dittmer, Tobias Kluth, Peter Maass, and Daniel Otero Baguer. Regularization by architecture: A deep prior approach for inverse problems. Journal of Mathematical Imaging and Vision, 62(3):456–470, 2020.
13. Daniel Otero Baguer, Johannes Leuschner, and Maximilian Schmidt. Computed tomography reconstruction using Deep Image Prior and learned reconstruction methods. Inverse Problems, 36(9):094004, 2020. doi: 10.1088/1361-6420/aba415.
14. Mingyi Hong and Zhi-Quan Luo. On the linear convergence of the alternating direction method of multipliers. Mathematical Programming, 162(1-2):165–199, 2017.
15. Jiaming Liu, Yu Sun, Xiaojian Xu, and Ulugbek S Kamilov. Image restoration using Total Variation regularized Deep Image Prior. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 7715–7719. IEEE, 2019.
16. Mingyi Hong, Zhi-Quan Luo, and Meisam Razaviyayn. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM Journal on Optimization, 26(1):337–364, 2016.
17. Yu Wang, Wotao Yin, and Jinshan Zeng. Global convergence of ADMM in nonconvex nonsmooth optimization. Journal of Scientific Computing, 78(1):29–63, 2019.
18. Pasquale Cascarano, Luca Calatroni, and Elena Loli Piccolomini. On the inverse Potts functional for single-image super-resolution problems. arXiv preprint arXiv:2008.08470, 2020.
19. Yves Chauvin and David E Rumelhart. Backpropagation: Theory, Architectures, and Applications. Psychology Press, 1995.
20. Dougal Maclaurin, David Duvenaud, and Ryan P Adams. Autograd: Effortless gradients in NumPy. In ICML 2015 AutoML Workshop, volume 238, page 5, 2015.
21. JS Moll. The anisotropic total variation flow. Mathematische Annalen, 332(1):177–218, 2005.
22. Alain Hore and Djemel Ziou. Image quality metrics: PSNR vs. SSIM. In 2010 20th International Conference on Pattern Recognition, pages 2366–2369. IEEE, 2010.