Image Restoration by Solving IVP
Seobin Park and Tae Hyun Kim
Hanyang University, Seoul, South Korea
{seobinpark, taehyunkim}@hanyang.ac.kr

Abstract
Recent research on image restoration has achieved great success with the aid of deep learning technologies, but many existing methods are limited in dealing with super-resolution (SR) under realistic settings. To alleviate this problem, we introduce a new formulation of image super-resolution that can handle arbitrary scale factors. Based on the proposed SR formulation, we can not only super-resolve images at multiple scales, but also find a new way to analyze the performance of the super-resolving process. We demonstrate that the proposed method can generate high-quality images where conventional SR methods cannot.
The super-resolution (SR) task increases image resolution by estimating the underlying high-frequency details. However, SR is a highly ill-posed problem: for any low-resolution input, there are multiple high-resolution solutions, which makes it challenging. Consequently, numerous methods have been studied to solve the SR problem. First, interpolation-based methods are simple and efficient, and thus widely used in many applications; however, these naive approaches have a clear performance limitation. Deep-learning-based methods have been very successful in generating high-quality images from low-resolution inputs and provide quantitatively promising results. Moreover, several generative SR networks, composed of a high-resolution image generator and a discriminator, can produce visually more plausible results with the help of perceptual losses (e.g., VGG and adversarial losses). However, these previous learning-based SR approaches are limited to fixed scaling factors (e.g., x2, x3, and x4) to allow quick inference. Therefore, several works have studied the SR problem with an arbitrary scaling factor. Recently, there have been increasing attempts to model differential equations with neural networks. In particular, neural ordinary differential equations (Neural ODEs) allow formulating a differential path from a low-resolution input to a high-resolution output with a neural network. However, the existing Neural-ODE-based SR approach uses the Neural ODE as an intermediate layer of the HR generator, without clear reasoning in the process of modeling a differential equation.
Therefore, in this work, we tackle this problem and present a new SR approach with Neural ODEs. Our contributions can be summarized as follows:
• We propose a new formulation to perform super-resolution at arbitrary scales.
• We find a way to analyze the differential path from low resolution to high resolution.
• Our method achieves performance close to existing state-of-the-art methods.
Recent works on single image restoration focus on learning mapping functions between degraded and original images. SRCNN [Dong et al., 2015] was the first to learn the nonlinear mapping from LR to HR images with a CNN model. VDSR [Kim et al., 2016b] increased the depth of the CNN to model more complex LR-HR mappings. Recent studies have applied different kinds of skip connections to ease the optimization process [Lim et al., 2017; Zhang et al., 2018]. Parallel to devising improved feed-forward CNN architectures, many attempts have been made to develop SISR methods that can be better applied to real-world situations. RealSR [Cai et al., 2019] proposed a more realistic degradation model to build a more natural training dataset and presented a new method to cope with the given degradation settings. Meta-SR [Hu et al., 2019] proposed a Meta-Upscale module to handle arbitrary scale factors for SISR. Gradual approaches have also proven to perform well in low-level vision tasks. In addition to learning a complex nonlinear mapping from a low-quality image to a high-quality one, these approaches decompose this process into multiple steps and iteratively refine the output images [Kim et al., 2016a; Tai et al., 2017; Haris et al., 2018; Li et al., 2019]. DBPN [Haris et al., 2018] proposed a dedicated neural network design that provides an iterative error-correcting mechanism to address the mutual dependencies of LR and HR images. SRFBN [Li et al., 2019] proposed an efficient recurrent neural network that employs a feedback mechanism, which iteratively improves the input of the network in each step. However, these methods do not have a clear underlying formulation or theoretical analysis of the gradual image restoration process, which consequently requires a large amount of engineering to develop the neural architectures [Li et al., 2019; Haris et al., 2018] and specialized training strategies [Li et al., 2019; Kim et al.
, 2016a]. LapSRN [Lai et al., 2017] can produce large SR results (e.g., x8) with intermediate SR results (e.g., x2, x4), but it can only handle predetermined discrete scale factors such as x2, x4, and x8. Recently, many attempts have been made to integrate differential formulations with deep learning methodologies. These attempts have led to neural ordinary differential equations (NODE) [Chen et al., 2018]. NODE is a new family of deep neural network models that parameterizes a differential form with a neural network and produces the output by using an ODE solver. Meanwhile, differential equations have often been involved in image restoration by modeling nonlinear reaction-diffusion and total variation schemes [Chen and Pock, 2016; Rudin et al., 1992]. Such is also the case in deep-learning-based image restoration: [He et al., 2019] designed new neural network architectures inspired by differential equation solvers, such as the Leapfrog and Runge-Kutta approaches. In particular, to solve the SR problem, [Scao, 2020] utilized a NODE as part of a neural architecture with an integration. However, the interval of the integral is not fixed (open), and they did not present any concrete formulation of the intermediate images involved in the gradual SR process. Consequently, they need to empirically speculate on the optimal neural architecture and integration interval.
In this section, we present a new neural approach for gradual SR reconstruction. We first formulate the SR problem as a gradual SR process governed by an ODE. We then elaborate on how to perform SR with the proposed formulation and how to train it.
Existing SR methods utilizing a gradual SR process [Lai et al., 2017; Haris et al., 2018; Li et al., 2019] are based on iterative multi-stage approaches and can be viewed as variants of the following:

$I_n = g_{n-1}(I_{n-1}) \quad (1 \le n \le N),$ (1)

where $n$ denotes the iteration step, $I_0$ denotes the given initial input LR image, and $I_n$ is the iteratively refined image from its previous state $I_{n-1}$. These approaches typically produce multiple intermediate HR images during the refinement, and the rendered image at the last $N$-th iteration [Tai et al., 2017; Li et al., 2019] or a combined version of the multiple intermediate images ($\{I_n\}_{1 \le n \le N}$) [Kim et al., 2016a; Haris et al., 2018] becomes the final SR result. Although these previous gradual methods show promising SR results, they still have some limitations. First, these methods need plenty of time and effort in determining the network configurations, including the number of gradual updates $N$ and the hyper-parameter settings, and in designing cost functions to train the SR networks $g$. In addition, a well-engineered and dedicated learning strategy, such as curriculum learning [Li et al., 2019] or recursive supervision [Kim et al., 2016a], is required for each method. This complication comes from the lack of a clear understanding of their intermediate image states $\{I_n\}$. To alleviate these problems, we formulate the gradual SR process with a differential equation. This allows us to implement and train the SR networks in an established way while outperforming conventional gradual SR processes. Assume that $(I_{HR})_{\downarrow t}$ is a downscaled version of a ground-truth clean image $I_{HR}$, obtained with a traditional SR kernel (e.g., bicubic) and a scaling factor $t$. We then define $I(t)$ by upscaling $(I_{HR})_{\downarrow t}$ with the same kernel and scaling factor $t$, so that $I_{HR}$ and $I(t)$ have the same spatial resolution (see the illustration of “Generating LR image” in (a)).
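As a concrete sketch of the "Generating LR image" step, the snippet below builds $I(t)$ for an integer factor $t$ by downscaling the ground truth and upscaling the result back to the original resolution. It is a minimal stand-in: box-average downscaling and nearest-neighbor upscaling replace the bicubic kernel used in the paper, and all helper names are hypothetical.

```python
import numpy as np

def downscale_box(img, t):
    """Box-average downscale by an integer factor t (stand-in for bicubic)."""
    h, w = img.shape[:2]
    img = img[:h - h % t, :w - w % t]  # crop so dimensions divide evenly
    return img.reshape(img.shape[0] // t, t, img.shape[1] // t, t, -1).mean(axis=(1, 3))

def upscale_nearest(img, t):
    """Nearest-neighbor upscale by an integer factor t (stand-in for bicubic)."""
    return img.repeat(t, axis=0).repeat(t, axis=1)

def make_I_t(I_HR, t):
    """I(t): downscale the ground truth by t, then upscale back to the original size."""
    return upscale_nearest(downscale_box(I_HR, t), t)

I_HR = np.random.rand(128, 128, 3)  # ground-truth image, i.e., I(1)
I_2 = make_I_t(I_HR, 2)             # blurry version I(2), same spatial size as I_HR
assert I_2.shape == I_HR.shape
```

Note that $I(t)$ keeps the spatial resolution of $I_{HR}$; only its high-frequency content is reduced as $t$ grows.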
Note that $t \ge 1$, and $I(1)$ denotes the ground-truth clean image $I_{HR}$. To model a gradual SR process, we first estimate the high-frequency image residual with a neural network. Specifically, when $t$ is a conventional discrete scaling factor (e.g., x2, x3, and x4), the image residual between $I(t)$ and $I(t-1)$ can be modeled with a neural network $f_{\text{discrete}}$ as:

$I(t-1) - I(t) = f_{\text{discrete}}(I(t), t).$ (2)

Notably, $I(t-1)$ includes more high-frequency details than $I(t)$, without loss of generality. In our method, we model the slightest image difference to formulate a continuously gradual SR process. Therefore, we take the scale factor $t$ to the continuous domain and reformulate (2) as an ODE with a neural network $f$ as:

$\frac{dI(t)}{dt} = f(I(t), t, \theta),$

where $\theta$ denotes the trainable parameters of the network $f$. Using this formulation, we can predict the high-frequency image detail required to slightly enhance $I(t)$ with the network $f$. (Note that we can obtain $I(t)$ for any rational number $t$ by adding padding to the border of the image before resizing and then center-cropping the image.) As existing SR neural networks have proven successful at predicting the high-frequency residual image [Kim et al., 2016b], we can use conventional SR architectures as our network $f$ without major changes.

In this section, we explain how to super-resolve a given LR image with a continuous scaling factor using our ODE-based SR formulation. First, we obtain $I(t_0)$ by upscaling the given LR input image (“Test time LR image” in (a)) with the bicubic SR kernel to the desired output resolution with a scaling factor $t_0$. Next, we solve the ODE initial value problem with the initial condition $I(t_0)$ by integrating the neural network $f$ from $t_0$ to $1$ to acquire the high-quality image $I(1)$ as follows:

$I(1) = I(t_0) + \int_{t_0}^{1} f(I(t), t, \theta)\, dt.$

As the neural network $f$ is modeled to predict the desired high-frequency details, our formulation gradually adds the predicted fine details to the input LR image $I(t_0)$ through the integration, shown as the solid orange line in (a).
Thus, our SR approach becomes a gradual SR process that adds the high-frequency details little by little. To compute the integration with $f$ in the proposed formulation, we use conventional ODE solvers to numerically calculate the output image $I(1)$. Specifically, we approximate the high-quality image $I(1)$, given a fully trained neural network $f$, network parameters $\theta$, initial condition $I(t_0)$, and integral interval $[t_0, 1]$, using an ODE solver (ODESolve()) as:

$I(1) = \text{ODESolve}(I(t_0), f, \theta, [t_0, 1]) \approx I_{HR}.$ (3)

Thus, our method does not need to consider the stop condition (i.e., the number of feedback iterations) of the gradual SR process, unlike conventional approaches [Tai et al., 2017; Li et al., 2019]. Notably, we can use conventional ODE solvers to render the desired outputs at inference time, but the solutions should be differentiable to train the network $f$ through backpropagation. We compare the SR performance of different ODE solvers (e.g., the Runge-Kutta and Euler methods) in our experiments. Our formulation is built on a continuous context and allows a continuous scale factor $t$ with $t \ge 1$. This makes our method able to handle the arbitrary-scale SR problem. However, unlike conventional multi-scale SR methods [Kim et al., 2016b; Lim et al., 2017; Hu et al., 2019], which successfully learn multi-scale SR tasks by sharing common features across various scales, we explicitly learn the relationship between images at different scales in the image domain itself rather than in the feature space. To train the deep neural network $f$ and learn the parameters $\theta$ in (3), we minimize the L1 loss summed over scale factors $t$:

$\mathcal{L}(\theta) = \sum_{t} \left\| I_{HR} - \text{ODESolve}(I(t), f, \theta, [t, 1]) \right\|_1.$ (4)

By minimizing the proposed loss function, our network parameters $\theta$ are trained to estimate the image detail to be added to the network input. Notably, during the training phase, we need to employ an ODE solver that allows end-to-end training through backpropagation together with the other components, such as the neural network $f$. Unlike other gradual SR methods [Kim et al., 2016a; Li et al., 2019], we do not require any additional learning strategies, such as curriculum learning, during the training phase.

In this section, we carry out extensive experiments to demonstrate the superiority of the proposed method, and we report various quantitative and qualitative comparison results. We also provide a detailed analysis of our experimental results. We will release our source code upon acceptance.
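The inference step in (3) can be sketched with a generic fixed-step solver and a toy stand-in for the trained network $f$. All names below are hypothetical, and simple Euler integration replaces torchdiffeq's solvers; the toy $f$ merely pushes the image toward a fixed target, mimicking residual prediction.

```python
import numpy as np

def odesolve_euler(f, I0, t0, t1, steps=40):
    """Integrate dI/dt = f(I, t) from t0 to t1 with fixed-step Euler.
    Here t1 < t0: we integrate from the LR scale t0 down to 1, so h < 0."""
    h = (t1 - t0) / steps
    I, t = I0.copy(), t0
    for _ in range(steps):
        I = I + h * f(I, t)
        t += h
    return I

# Toy stand-in for the trained network f(I, t, theta): dI/dt = I - I_HR,
# so integrating t from 4 down to 1 contracts the gap to I_HR.
I_HR = np.full((16, 16, 3), 0.8)   # pretend ground truth I(1)
def f_toy(I, t):
    return I - I_HR

I_t0 = np.full((16, 16, 3), 0.2)   # blurry initial condition I(t0 = 4)
I_1 = odesolve_euler(f_toy, I_t0, t0=4.0, t1=1.0)
assert np.abs(I_1 - I_HR).mean() < np.abs(I_t0 - I_HR).mean()
```

The same call structure mirrors ODESolve$(I(t_0), f, \theta, [t_0, 1])$ in (3): the solver only needs the initial condition, the derivative network, and the integration interval.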
Network configuration.
We use VDSR [Kim et al., 2016b] and RDN [Zhang et al., 2018] as backbone CNN architectures for our network f, with slight modifications. For each CNN architecture, we change the first layer to feed the scale factor t as an additional input. To be specific, we extend the input channel from 3 to 4, and the pixel locations of the newly concatenated channel (the 4th channel) are filled with the scalar value t, as shown in (b). In addition, for RDN, we remove the last upsampling layer so that the input and output resolutions are the same in our work. Note that no extra parameters are added except for the first layers of the networks. To train and run the proposed SR process, we use the Python torchdiffeq library [Chen et al., 2018] to employ the Runge-Kutta (RK4) method as our ODE solver in (4), which requires only 6 additional lines of code with PyTorch. For simplicity, our approaches with VDSR and RDN backbones are called NODE-VDSR and NODE-RDN in the remainder of the experiments, respectively.

Dataset and evaluation.
We use the DIV2K [Agustsson and Timofte, 2017] dataset to train NODE-VDSR and NODE-RDN. During the training phase, we augment the dataset using random cropping, rotation, and flipping. During the test phase, we evaluate the SR results in terms of PSNR and SSIM on the standard benchmark datasets (Set14 [Zeyde et al., 2010], B100 (BSD100) [Martin et al., 2001], and Urban100 [Huang et al., 2015]). To be consistent with previous works, quantitative results are evaluated on the Y (luminance) channel in the YCbCr color space.
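The scale-factor conditioning described above under "Network configuration" (extending the input from 3 to 4 channels, with the fourth channel filled by the scalar t) can be sketched as follows; the helper name is hypothetical.

```python
import numpy as np

def add_scale_channel(img, t):
    """Concatenate a constant channel holding the scale factor t,
    turning an H x W x 3 input into the H x W x 4 tensor fed to f."""
    h, w, _ = img.shape
    t_channel = np.full((h, w, 1), t, dtype=img.dtype)
    return np.concatenate([img, t_channel], axis=2)

x = np.random.rand(32, 32, 3).astype(np.float32)
x4 = add_scale_channel(x, 2.5)
assert x4.shape == (32, 32, 4)
```

Only the first convolution of the backbone needs to change (3 input channels become 4), which is why no other extra parameters are introduced.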
Training setting.
We train the network by minimizing the L1 loss in (4) with the Adam optimizer (β1 = 0.9, β2 = 0.999, ε = 10^-8) [Kingma and Ba, 2015]. The learning rate is decreased by half every 100k gradient update steps, and the networks are trained for 600k iterations in total. The mini-batch size of NODE-VDSR is 16 (200x200 patches), but NODE-RDN takes 8 patches per mini-batch (130x130 patches) owing to the memory limit of our graphics units. Similar to the training settings in Meta-SR [Hu et al., 2019], we train the network f by randomly changing the scale factor t in (4) from 1 to 4 with a stride of 0.1 (i.e., t ∈ {1.0, 1.1, 1.2, ..., 4.0}).

First, we compare our NODE-RDN with several state-of-the-art gradual SR methods: DRCN [Kim et al., 2016a], LapSRN [Lai et al., 2017], DRRN [Tai et al., 2017], D-DBPN [Haris et al., 2018], and SRFBN [Li et al., 2019]. As in [Lim et al., 2017], a self-ensemble method is used to further improve NODE-RDN (denoted as NODE-RDN+). Note that our NODE-RDN and NODE-RDN+ can handle multiple scale factors t, including non-integer scale factors (e.g., x1.5), with a single set of network parameters. In contrast, the other approaches must be trained for each discrete integer scale factor (x2, x3, and x4) separately, resulting in a distinct parameter set per scale factor. Nevertheless, the quantitative restoration results show that our NODE-RDN and NODE-RDN+ consistently outperform conventional gradual SR methods for the discrete integer scaling factors (x2, x3, and x4) in terms of PSNR.

We investigate the intermediate images produced during the gradual SR process with scale factors x2 and x4. Final results by DRRN are obtained after 25 iterations, and final results by SRFBN are obtained with 4 iterations, as in their original settings. We provide 4 intermediate HR images during the updates for visual comparison. For our NODE-RDN, intermediate image states are represented as $\hat{I}(t_i)$, where $1 \le t_i \le t_0$ and $\hat{I}(t_i) = \text{ODESolve}(I(t_0), f, \theta, [t_0, t_i])$. We observe that DRRN and SRFBN fail to gradually refine patches with high-frequency details, while our NODE-RDN gradually improves the intermediate images and renders promising results at the final states.

Our approach can handle a continuous scale factor for the SR task, so we compare it with existing multi-scale SR methods that can handle continuous scale factors: VDSR [Kim et al., 2016b] and Meta-SR [Hu et al., 2019]. Notably, Meta-SR implemented with RDN (i.e., Meta-RDN) is the current state-of-the-art SR approach. In Table 1, we show quantitative results compared to existing SR methods (VDSR, RDN, and Meta-RDN). Note that VDSR+t and RDN+t are modified versions of VDSR and RDN that take the scale factor t as an additional network input and have the same input and output resolutions as our network f. We also compare our method with these new baselines (VDSR+t and RDN+t) for fair comparison. We evaluate the SR performance on the B100 benchmark dataset by increasing the scaling factor from 1.1 to 4. Interestingly, we observe that NODE-VDSR outperforms VDSR and VDSR+t at every scale by a large margin, although VDSR and VDSR+t have network architectures similar to our NODE-VDSR. Similarly, NODE-RDN shows better performance than Meta-RDN and RDN+t. We also provide qualitative comparison results with Meta-SR in Figure 1, where our NODE-RDN recovers much clearer edges than Meta-RDN.

Methods           x1.1   x1.2   x1.3   x1.4   x1.5   x1.6   x1.7   x1.8   x1.9   x2.0   x2.1   x2.2   x2.3   x2.4   x2.5
bicubic           36.56  35.01  33.84  32.93  32.14  31.49  30.90  30.38  29.97  29.55  29.18  28.87  28.57  28.31  28.13
VDSR              -      -      -      -      -      -      -      -      -      31.90  -      -      -      -      -
VDSR+t            39.51  38.44  37.15  36.04  34.98  34.15  33.39  32.78  32.22  31.70  31.27  30.86  30.53  30.20  29.91
NODE-VDSR (ours)
RDN               -      -      -      -      -      -      -      -      -      32.34  -      -      -      -      -
RDN+t             42.83  39.92  38.18  36.87  35.71  34.80  33.99  33.34  32.77  32.22  31.76  31.33  30.99  30.64  30.34
Meta-RDN          42.82  40.04  38.28  36.95  35.86  34.90  34.13  33.45  32.86  32.35  31.82  31.41  31.06  30.62  30.45
NODE-RDN (ours)   43.22  40.06  38.35  37.02  35.86  34.95  34.14  33.47  32.89  32.34  31.89  31.46  31.12  30.76  30.46
NODE-RDN+ (ours)

Methods           x2.6   x2.7   x2.8   x2.9   x3.0   x3.1   x3.2   x3.3   x3.4   x3.5   x3.6   x3.7   x3.8   x3.9   x4.0
bicubic           27.89  27.66  27.51  27.31  27.19  26.98  26.89  26.59  26.60  26.42  26.35  26.15  26.07  26.01  25.96
VDSR              -      -      -      -      28.83  -      -      -      -      -      -      -      -      -      27.29
VDSR+t            29.64  29.39  29.15  28.93  28.74  28.55  28.38  28.22  28.05  27.89  27.76  27.58  27.47  27.34  27.20
NODE-VDSR (ours)
RDN               -      -      -      -      29.26  -      -      -      -      -      -      -      -      -      27.72
RDN+t             30.06  29.80  29.55  29.33  29.12  28.92  28.76  28.59  28.43  28.26  28.13  27.95  27.84  27.71  27.58
Meta-RDN          30.13  29.82  29.67  29.40
NODE-RDN (ours)
NODE-RDN+ (ours)

Table 1: Average PSNR values on the B100 dataset evaluated with different scale factors. The best performance is shown in bold.

Figure 1: Visual comparison of NODE-RDN (ours) with Meta-RDN on scales x2.5 and x4 (patches '196073' and '48026' from B100).

Interpolation and extrapolation.
We evaluate our method on various scale factors that are not seen during the training phase. In Figure 2, we plot PSNR values of NODE-VDSR and NODE-RDN on the B100 dataset while varying the scale factor. We see that our method learns an interpolation ability and can successfully deal with unseen scales between 1 and 4 (e.g., 1.15, 1.25, ..., 3.95). Moreover, it also learns an extrapolation ability and handles unseen scale factors larger than 4 (e.g., 4.1, 4.2, ..., 4.5). To sum up, the proposed SR process generalizes (i.e., it interpolates and extrapolates), even though the network is trained with only a limited number of scale factors.
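The seen/unseen scale sets used in this evaluation can be enumerated as follows; this is an illustration of the protocol, not the authors' evaluation script.

```python
# Scale factors seen during training: 1.0, 1.1, ..., 4.0 (stride 0.1).
train_scales = [round(1.0 + 0.1 * i, 1) for i in range(31)]

# Unseen interpolation scales (between trained values): 1.15, 1.25, ..., 3.95.
interp_scales = [round(1.15 + 0.1 * i, 2) for i in range(29)]

# Unseen extrapolation scales (beyond the trained range): 4.1, ..., 4.5.
extrap_scales = [round(4.1 + 0.1 * i, 1) for i in range(5)]

# None of the evaluation scales appear in the training set.
assert all(s not in train_scales for s in interp_scales + extrap_scales)
```

Because the ODE is defined over a continuous t, any of these values is a valid integration endpoint; no retraining or architectural change is needed per scale.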
SR performance with different ODE solvers.
We test our method with different ODE solvers (e.g., the Euler and RK4 methods). Note that the Euler method is computationally cheaper than RK4, but RK4 generally provides more accurate approximations. Accordingly, in Table 2, we see that NODE-VDSR trained with RK4 shows slightly better SR performance than NODE-VDSR trained with the Euler method on the B100 and Set5 datasets. This result suggests that we can employ conventional ODE solvers for our SR problem, but the quality of the predicted HR images depends on the accuracy of the employed ODE solver.
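The accuracy gap between the two solvers can be illustrated on a toy ODE with a known closed-form solution. The fixed-step implementations below are generic sketches, not the torchdiffeq routines.

```python
import math

def euler_step(f, y, t, h):
    # First-order Euler: one derivative evaluation per step.
    return y + h * f(y, t)

def rk4_step(f, y, t, h):
    # Classical fourth-order Runge-Kutta: four evaluations per step.
    k1 = f(y, t)
    k2 = f(y + h * k1 / 2, t + h / 2)
    k3 = f(y + h * k2 / 2, t + h / 2)
    k4 = f(y + h * k3, t + h)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def solve(step, f, y0, t0, t1, n):
    h = (t1 - t0) / n
    y, t = y0, t0
    for _ in range(n):
        y = step(f, y, t, h)
        t += h
    return y

# dy/dt = y with y(0) = 1 has the exact solution y(1) = e.
f = lambda y, t: y
err_euler = abs(solve(euler_step, f, 1.0, 0.0, 1.0, 20) - math.e)
err_rk4 = abs(solve(rk4_step, f, 1.0, 0.0, 1.0, 20) - math.e)
assert err_rk4 < err_euler  # RK4 is far more accurate at the same step count
```

The same trade-off carries over to the SR IVP: RK4 costs four network evaluations per step but tracks the differential path more faithfully than Euler.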
Methods     Euler method            Runge-Kutta method
            x2     x3     x4       x2     x3     x4
B100        31.92  28.89  27.33
Set5        37.57  33.92  31.50

Table 2: Benchmark results of NODE-VDSR trained with the Euler and Runge-Kutta methods at different scale factors. Bold numbers indicate better SR performance.

Figure 2: PSNR evaluations of bicubic upscaling, NODE-VDSR, and NODE-RDN on the B100 dataset, with the scale factor varying from 1.1 to 4.5 with stride 0.05. Dotted marks correspond to scale factors seen during training (e.g., 1.1, 1.2, ..., 4.0), and cross marks correspond to unseen scale factors (e.g., 1.15, 1.25, ..., 4.5).

Visual output of the network f. In Figure 3, to inspect the intermediate results produced by the network f during the gradual SR procedure at test time, we compute the absolute value of $f(\hat{I}(t), t, \theta)$, where $t$ decreases from 4 to 1 and the initial condition is $I(t = 4)$. Interestingly, on the sharp patch corresponding to the eye (red box), the absolute values are higher when $t$ is small, while on the homogeneous patch corresponding to the cheek (green box), the absolute values are higher when $t$ is large. Recall that $I(t)$ approaches the ground-truth image as $t$ gets small, and the image difference $\frac{dI(t)}{dt} (\approx f(\hat{I}(t), t, \theta))$ then includes more high-frequency components. Therefore, the absolute values of the network output at the eye region, which contains high-frequency detail, become large when $t$ is small, while the absolute values at the homogeneous region, which does not require high-frequency detail, become small when $t$ is small.

In this work, we proposed a novel differential equation for the SR task that gradually enhances a given input LR image and allows continuous-valued scale factors. The image difference between images at different scale factors is physically modeled with a neural network and formulated as a NODE. To restore a high-quality image, we solve the ODE initial value problem whose initial condition is given by the input LR image. The main difference from existing gradual SR methods is that our formulation is based on the physical modeling of
the intermediate images and adds fine high-frequency details gradually. The analysis of the intermediate states during the SR process gives us more insight into the gradual SR reconstruction. Detailed experimental results show that our method achieves superior performance compared to state-of-the-art SR approaches.

Figure 3: Intensity of intermediate image derivatives when changing the scale factor t at two different locations.

References

[Agustsson and Timofte, 2017] Eirikur Agustsson and Radu Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017.
[Cai et al., 2019] Jianrui Cai, Hui Zeng, Hongwei Yong, Zisheng Cao, and Lei Zhang. Toward real-world single image super-resolution: A new benchmark and a new model. In Proceedings of the IEEE International Conference on Computer Vision, 2019.
[Chen and Pock, 2016] Yunjin Chen and Thomas Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[Chen et al., 2018] Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David K. Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems, 2018.
[Dong et al., 2015] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2), 2015.
[Haris et al., 2018] Muhammad Haris, Gregory Shakhnarovich, and Norimichi Ukita. Deep back-projection networks for super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[He et al., 2019] Xiangyu He, Zitao Mo, Peisong Wang, Yang Liu, Mingyuan Yang, and Jian Cheng. ODE-inspired network design for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[Hu et al., 2019] Xuecai Hu, Haoyuan Mu, Xiangyu Zhang, Zilei Wang, Tieniu Tan, and Jian Sun. Meta-SR: A magnification-arbitrary network for super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[Huang et al., 2015] Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[Kim et al., 2016a] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Deeply-recursive convolutional network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[Kim et al., 2016b] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[Kingma and Ba, 2015] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2015.
[Lai et al., 2017] Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Deep Laplacian pyramid networks for fast and accurate super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[Li et al., 2019] Zhen Li, Jinglei Yang, Zheng Liu, Xiaomin Yang, Gwanggil Jeon, and Wei Wu. Feedback network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[Lim et al., 2017] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017.
[Martin et al., 2001] David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the IEEE International Conference on Computer Vision, 2001.
[Rudin et al., 1992] Leonid I. Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 1992.
[Scao, 2020] Teven Le Scao. Neural differential equations for single image super-resolution. In ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations, 2020.
[Tai et al., 2017] Ying Tai, Jian Yang, and Xiaoming Liu. Image super-resolution via deep recursive residual network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[Zeyde et al., 2010] Roman Zeyde, Michael Elad, and Matan Protter. On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces. Springer, 2010.
[Zhang et al., 2018] Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, and Yun Fu. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.