SRPGAN: Perceptual Generative Adversarial Network for Single Image Super Resolution
Bingzhe Wu, Haodong Duan, Zhichao Liu, Guangyu Sun
Peking University, Beijing, 100871, China
{wubingzhe, duanhaodong, liuzceecs, gsun}@pku.edu.cn

Abstract
Single image super resolution (SISR) aims to reconstruct a high resolution image from a single low resolution image. The SISR task has been a very attractive research topic over the last two decades. In recent years, convolutional neural network (CNN) based models have achieved great performance on the SISR task. Despite the breakthroughs achieved by using CNN models, some problems remain unsolved, such as how to recover the high frequency details of high resolution images. Previous CNN based models typically use a pixel-wise loss, such as the l2 loss. Although the high resolution images constructed by these models have high peak signal-to-noise ratio (PSNR), they often tend to be blurry and lack high-frequency details, especially at large scaling factors. In this paper, we build a super resolution perceptual generative adversarial network (SRPGAN) framework for SISR tasks. In this framework, we propose a robust perceptual loss based on the discriminator of the built SRPGAN model. We use the Charbonnier loss function to build the content loss and combine it with the proposed perceptual loss and the adversarial loss. Compared with other state-of-the-art methods, our method demonstrates a strong ability to construct images with sharp edges and rich details. We also evaluate our method on different benchmarks and compare it with previous CNN based methods. The results show that our method achieves much higher structural similarity index (SSIM) scores on most of the benchmarks than previous state-of-the-art methods.
1. Introduction
Single image super resolution (SISR) is a well defined problem in the computer vision area. It tries to reconstruct a high resolution image from a single low resolution image. It has been a very attractive research topic over the last two decades [7] [1] [8]. Since SISR can restore some high frequency details, it has been applied to many practical applications such as medical imaging [25], satellite imaging [27], and face identification [3], where rich details are greatly desired.

In recent years, CNNs have shown a powerful ability to learn highly non-linear transformations. Due to this learning ability, CNN based methods are widely used for SISR tasks and have achieved remarkable progress. Despite the breakthroughs brought by the CNN based methods, some critical problems remain largely unsolved, such as how to recover a high resolution image with high perceptual quality and high frequency details. A common objective function of previous CNN based methods is a pixel-wise loss between the reconstructed and the ground truth high resolution images. The most commonly used pixel-wise loss is the l2 loss.
Figure 1. The structure of the SISR framework (SRPGAN). Our framework consists of a generator G (left) and a discriminator D (right). The generator part generates a high resolution image from a low resolution one. The discriminator takes the generated image and the ground truth image as inputs and extracts features from them. Based on these extracted features, the discriminator can build the objective functions for D and G.

With respect to the network architecture, we propose to replace the batch normalization layer with the instance normalization layer. We evaluate our method on the most commonly used benchmarks with a large upscaling factor. Our method outperforms previous methods in SSIM score on most of the benchmarks. Besides the quantitative evaluation, our method has also demonstrated a strong ability to reconstruct images with rich details and high perceptual quality.

The rest of this paper is organized as follows. The related work section briefly summarizes previous related works. The methods section describes the framework details and the proposed individual loss functions. The quantitative evaluation and result visualizations can be found in the experiments section.
2. Related Works
Following the great performance achieved by deep convolutional neural networks at the ImageNet challenge, various CNN based methods have been applied to the super-resolution problem. SRCNN is the first work that applies a CNN to the single image super-resolution problem [5]. SRCNN is a simple model with three convolutional layers responsible for feature extraction, non-linear mapping, and image reconstruction. This method learns the end-to-end transformation between low and high resolution images.

Based on the results of SRCNN, the authors accelerated the previous model by using an hour-glass shaped CNN structure, and achieved better SR performance with a real-time SR model running at more than 24 fps on a generic CPU [6]. To achieve a real-time model for SR, they replaced the bicubic interpolation part of the previous model with deconvolution layers. After removing the bicubic interpolation, this model can learn directly from the low resolution image. The FSRCNN model includes only convolution layers and deconvolution layers. The convolution layers share their weights across different upscaling factors. With weight sharing, FSRCNN is able to deal with various scales using a single model.

Because of its success at the ImageNet challenge, a deep architecture similar to the VGG net was proposed to obtain large receptive fields in VDSR (Very Deep network for Super-Resolution) [14]. The convergence rate is the main limitation of the deeper model. VDSR uses a higher learning rate of $10^{-1}$, while SRCNN used a learning rate of $10^{-4}$. With such a high learning rate, training can easily diverge; by using gradient clipping, VDSR can be controlled strictly. VDSR introduced the concept of residual learning to generate the final output. Due to the nature of the SISR problem, the output image is quite similar to the input image. With residual learning, the input image is added to the output of the model before producing the final output image. As a result, the model can focus on the high-frequency details. In addition, DRCN (Deeply-Recursive Convolutional Network), based on the VDSR model, added recursive connections for weight sharing and model compression with only 5 layers [15]. LapSRN is one of the most recent frameworks for the SISR problem [17]. The model cascades feature extraction and image reconstruction parts using Laplacian pyramids, and the Charbonnier loss function is used instead of the l2 loss.
3. Methods
We build our single image super resolution framework on the Image-to-Image model [12]. Our framework consists of an image generator G and a discriminator D. The generator tries to transform images in the domain generated by bicubic upsampling into the ground truth high resolution image domain. The discriminator tries to extract features from the input high resolution images and the constructed images. Based on the features obtained by the discriminator, we can compute the adversarial loss and the perceptual loss. Finally, we combine these two loss functions with the content loss to build the total objective functions of D and G. The overall framework is illustrated in Figure 1.

Given a single low-resolution image, we first upscale it by the specified factor using bicubic interpolation. Then, the generator network takes the interpolated image as input and maps it to a high resolution image. Our final goal is to train a generator network G that generates high resolution images that are as similar as possible to the ground truth high resolution images. To achieve this, we construct a robust loss using the output and the intermediate features obtained by the discriminator D. Additionally, we design a content loss which can be used to evaluate the similarity between the generated image and the ground truth image. The individual loss functions are described in more detail in the following subsections. Our starting point is the Image-to-Image model [12], which we further tailor to the SISR task. To the best of our knowledge, this is the first paper that attempts to apply the image-to-image model to the SISR task.

The generator G is the core of the whole framework. The structure of G, illustrated in Figure 1 (left), has an encoder-decoder shape. We add skip connections following the general shape of the U-Net [22] to combine local and global information. Specifically, in our generator, a skip connection is implemented by concatenating the features obtained by layer $i$ and layer $n-i$, where $n$ is the total number of convolution layers. The encoder part of G consists of a stack of convolution layers. More specifically, we use convolution layers with small $3 \times 3$ kernels, as sketched below.
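As an illustration, a minimal PyTorch sketch of such an encoder-decoder generator with U-Net-style skip connections might look as follows. The depth, channel widths, and decoder kernel sizes are our own assumptions for illustration; the paper does not specify the exact configuration.

```python
import torch
import torch.nn as nn

def down_block(in_ch, out_ch):
    # Encoder block in the style of Figure 2 (right):
    # stride-2 convolution for downsampling + instance norm + LeakyReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )

def up_block(in_ch, out_ch):
    # Decoder block: transposed convolution for 2x upsampling.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class Generator(nn.Module):
    """U-Net-shaped generator: features of layer i are concatenated
    with features of layer n - i via skip connections."""
    def __init__(self, base=64):
        super().__init__()
        self.e1 = down_block(3, base)
        self.e2 = down_block(base, base * 2)
        self.e3 = down_block(base * 2, base * 4)
        self.d3 = up_block(base * 4, base * 2)
        self.d2 = up_block(base * 4, base)   # input: d3 output + e2 skip
        self.d1 = up_block(base * 2, base)   # input: d2 output + e1 skip
        self.out = nn.Conv2d(base, 3, kernel_size=3, padding=1)

    def forward(self, x):
        # x is the bicubic-upsampled image, so output size equals input size.
        f1 = self.e1(x)
        f2 = self.e2(f1)
        f3 = self.e3(f2)
        u3 = self.d3(f3)
        u2 = self.d2(torch.cat([u3, f2], dim=1))  # skip connection
        u1 = self.d1(torch.cat([u2, f1], dim=1))  # skip connection
        return self.out(u1)
```

Because the generator operates on the bicubic-interpolated input, the output resolution automatically matches the target high resolution image.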
Figure 2. Comparison of a conv block in a traditional GAN model (Conv + BN + ReLU + Pool) and that in our model (Stride-Conv + IN + LReLU). Our model replaces the pooling layer with a stride-convolution layer, and we use instance normalization (IN) rather than batch normalization.

For the discriminator D, we adopt a patch discriminator following the Image-to-Image setting [12]. Compared with a traditional discriminator, the patch discriminator tries to classify whether each patch in an image is real or fake, instead of classifying the whole image. Such a discriminator restricts the GAN model to focus on the high frequency details, while the content loss ensures the correctness of the low frequency part. The details of the loss functions can be found in the following subsection. In the convolution blocks of the discriminator, we remove the batch normalization layers directly, as in the sketch below.
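The following sketch shows one plausible form of such a conditional patch discriminator, again under assumed layer counts and widths. Consistent with the text above, the convolution blocks contain no normalization layers, and the final convolution produces a grid of per-patch real/fake logits rather than a single score. The `intermediate_features` helper is our own addition, used later for the perceptual loss.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Conditional patch discriminator: receives the bicubic input z and a
    high resolution image (ground truth y or generated G(z)) and classifies
    each spatial patch as real or fake."""
    def __init__(self, base=64):
        super().__init__()
        layers, in_ch = [], 6  # z and the HR image, concatenated channel-wise
        for out_ch in (base, base * 2, base * 4):
            layers += [
                nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True),  # no batch norm (see text)
            ]
            in_ch = out_ch
        self.features = nn.ModuleList(layers)
        self.head = nn.Conv2d(in_ch, 1, kernel_size=4, padding=1)

    def forward(self, z, img):
        x = torch.cat([z, img], dim=1)
        for layer in self.features:
            x = layer(x)
        return self.head(x)  # (N, 1, H', W') grid of patch logits

    def intermediate_features(self, z, img):
        # Feature maps phi_i taken after each activation, reused for Eq. (5).
        x, feats = torch.cat([z, img], dim=1), []
        for layer in self.features:
            x = layer(x)
            if isinstance(layer, nn.LeakyReLU):
                feats.append(x)
        return feats
```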
Figure 3. Reconstruction results [4× upscaling] of ours and recent state-of-the-art methods (LapSRN and VDSR). We use colored boxes to highlight sub-regions which contain rich details, and magnify the sub-regions in the boxes below to show more detail. From the sub-region images, we can see that our method has a stronger ability to recover high frequency details and sharp edges.
Although batch normalization [11] has been proved to be effective on many image classification tasks, recent works [20] [29] have pointed out that batch normalization decreases the performance of image generation tasks. In [29], the authors proposed to use instance normalization instead of batch normalization for the image style transfer task. Following this, we replace the batch normalization layers with instance normalization layers to get better performance on SISR tasks. Instance normalization applies the normalization to a single image instead of the whole batch of images. To introduce the formulation, we denote $x \in \mathbb{R}^{T \times C \times W \times H}$ as an input batch which contains $T$ images. Let $x_{tkij}$ denote the $tkij$-th element, where $i$ and $j$ are the spatial dimensions, $k$ is the feature channel, and $t$ is the index of the image in the batch. Then the formulation of instance normalization is given by:

$$y_{tkij} = \frac{x_{tkij} - \mu_{tk}}{\sqrt{\sigma_{tk}^2 + \varepsilon}} \quad (1)$$

where $\mu_{tk} = \frac{1}{HW}\sum_{l=1}^{W}\sum_{m=1}^{H} x_{tklm}$ and $\sigma_{tk}^2 = \frac{1}{HW}\sum_{l=1}^{W}\sum_{m=1}^{H} (x_{tklm} - \mu_{tk})^2$.

We replace batch normalization with instance normalization in every layer of the generator G. The instance normalization layer achieves better performance than batch normalization, and it also helps to prevent divergence during training.

In this section, we introduce the formulas of the loss functions. To obtain the objective functions of the discriminator and the generator, we design an adversarial loss, a content loss, and a perceptual loss, respectively. These individual loss functions are built from the outputs and intermediate features obtained by the discriminator D.

Our generator G tries to learn a mapping from the bicubic-interpolated image $z$ to the ground truth high resolution image $y$. We design our discriminator D in a conditional GAN fashion. The adversarial loss function of our GAN model can be expressed as:

$$\ell_a(G, D) = \mathbb{E}_{z,y \sim p_{data}(z,y)}[\log D(z, y)] + \mathbb{E}_{z \sim p_{data}(z)}[\log(1 - D(z, G(z)))] \quad (2)$$

In the training phase, the discriminator D tries to maximize this objective function and the generator G tries to minimize it. In comparison, the adversarial loss of an unconditional GAN can be expressed as:

$$\ell(G, D) = \mathbb{E}_{y \sim p_{data}(y)}[\log D(y)] + \mathbb{E}_{z \sim p_{data}(z)}[\log(1 - D(G(z)))] \quad (3)$$

In contrast to our formulation, the discriminator of an unconditional GAN cannot observe the bicubic input image $z$.

By trying to fool the discriminator, the adversarial loss encourages our generator to generate solutions that reside on the manifold of the ground truth high resolution images. The adversarial loss is thus helpful for recovering the high frequency details. Beyond the high frequency part, we also need a content loss to ensure the correctness of the low frequency part of the constructed image. The commonly used content loss is the mean square loss. In this paper, we propose to use the Charbonnier loss [2] to achieve better performance on the SISR task. We denote $y$ as the ground truth high resolution image and $G(z)$ as the constructed image. The Charbonnier loss can be expressed as:

$$\ell_y(y, \hat{y}) = \mathbb{E}_{z,y \sim p_{data}(z,y)}[\rho(y - G(z))] \quad (4)$$

where $\rho(x) = \sqrt{x^2 + \varepsilon^2}$ is the Charbonnier penalty function. For comparison, we also try the l1 loss and the l2 loss in the experiments.
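As a minimal sketch, the Charbonnier content loss of Eq. (4) can be written directly; the ε value below is an assumed placeholder, since the paper does not state it.

```python
import torch

def charbonnier_loss(y, y_hat, eps=1e-6):
    # rho(x) = sqrt(x^2 + eps^2), averaged over all pixels (Eq. 4).
    # eps=1e-6 is an assumption, not a value taken from the paper.
    diff = y - y_hat
    return torch.mean(torch.sqrt(diff * diff + eps * eps))
```

For small residuals the penalty behaves like the l2 loss, while for large residuals it grows roughly linearly like the l1 loss, which makes it less sensitive to outliers.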
Previous methods based on a pixel-wise loss tend to generate images that lack high frequency details. Perceptual losses based on the VGG16 network have been proposed to deal with this issue. Instead of using a perceptual loss based on the VGG16 classification network, we use the intermediate features of the discriminator to build the perceptual loss. In this way, we obtain a perceptual loss that is more robust for image super resolution. Additionally, we reduce the computation budget of the perceptual loss by reusing the features already extracted by the discriminator. To introduce the formula of the perceptual loss, we denote $\phi_i$ as the feature map computed by the $i$-th convolution layer (after the activation function) of the discriminator. Then, we can define the perceptual loss as:

$$\ell_p = \sum_{i=1}^{L} \mathbb{E}_{z,y \sim p_{data}(z,y)}[\| \phi_i(y) - \phi_i(G(z)) \|_1] \quad (5)$$

In this formula, each term measures the l1 loss between the features extracted by the $i$-th layer of the discriminator D for the ground truth image $y$ and for the image $G(z)$ constructed by the generator G.

We use an alternating optimization scheme for the generator G and the discriminator D. Based on the individual loss functions presented above, we define the objective functions of the discriminator D and the generator G as:

$$\ell_d = -\ell_a(G, D) + \lambda_1 \ell_p \quad (6)$$

$$\ell_g = \ell_a(G, D) + \lambda_2 \ell_p + \lambda_3 \ell_y \quad (7)$$

In the training phase, we minimize $\ell_d$ with respect to the parameters of the discriminator D and minimize $\ell_g$ with respect to the parameters of the generator G. To optimize our networks, we alternate between one gradient descent step on D and one step on G, as sketched below. As the solver, we use the ADAM algorithm for both G and D. The details can be found in the experiments section.
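The sketch below ties Eqs. (5), (6), and (7) together into one alternating update step, reusing the `intermediate_features` helper and `charbonnier_loss` from the earlier sketches, with `loader` assumed to yield (bicubic, ground truth) pairs. The λ₂ and λ₃ values, the learning rate, and the use of the non-saturating −log D surrogate for the generator's adversarial term are all assumptions, not settings confirmed by the paper.

```python
import torch
import torch.nn.functional as F

def perceptual_loss(D, z, y, fake):
    # Eq. (5): l1 distance between D's intermediate features of the ground
    # truth y and of the generated image, summed over layers.
    real_feats = D.intermediate_features(z, y)
    fake_feats = D.intermediate_features(z, fake)
    return sum(F.l1_loss(f, r) for f, r in zip(fake_feats, real_feats))

G, D = Generator(), PatchDiscriminator()
lam1, lam2, lam3 = 0.01, 1.0, 1.0   # lam1 per Eq. (6); lam2/lam3 assumed
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.9, 0.999))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.9, 0.999))

for z, y in loader:                 # z: bicubic input, y: ground truth HR
    fake = G(z)

    # One step on D, minimizing l_d = -l_a + lam1 * l_p (Eq. 6).
    # For logit outputs, the two BCE terms below equal -l_a.
    d_real, d_fake = D(z, y), D(z, fake.detach())
    neg_l_a = (
        F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
        + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    )
    loss_d = neg_l_a + lam1 * perceptual_loss(D, z, y, fake.detach())
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # One step on G, minimizing l_g = l_a + lam2 * l_p + lam3 * l_y (Eq. 7),
    # using the common non-saturating surrogate -log D(z, G(z)) for l_a.
    d_fake = D(z, fake)
    adv_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_g = (adv_g
              + lam2 * perceptual_loss(D, z, y, fake)
              + lam3 * charbonnier_loss(y, fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```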
4. Experiments
For the training dataset, we use images from the T91, BSDS200 [19] and General100 datasets. In each training batch, we randomly select 64 image patches as the high resolution patches, each of size $128 \times 128$. For the ADAM optimizer we set $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\varepsilon = 10^{-8}$. The learning rate is initialized at a fixed value and decreased by a factor of ten after a fixed number of iterations. We set the weight term in equation (6) to $\lambda_1 = 0.01$, and fix the weights $\lambda_2$ and $\lambda_3$ in equation (7).

We evaluate the performance of our method on five benchmarks: Set5, Set14, BSDS100, Urban100, and Manga109. The metrics we use are PSNR and SSIM [30]. We compare the proposed method with previous state-of-the-art SISR methods. For scaling factors, we test our model on 2×, 4× and 8×. Table 1 shows the overall quantitative comparisons for 2×, 4× and 8×. Most of the results of the other methods are cited from [17] and [15]. Because our method does not optimize the mean square loss directly, the PSNR score of our method is much lower than those of other methods. On the other hand, our method tends to generate more realistic images and recover more details than the state-of-the-art methods, as shown in Figure 3, and it is competitive on the mean SSIM score, which has been shown to correlate with human perception, across different benchmarks. Our SRPGAN method performs favorably against existing methods on the most commonly used benchmarks at different scaling factors (2×, 4× and 8×). From the results, our method performs relatively poorly on the Manga109 dataset; the main reason is that our training dataset consists of real life images and our GAN model tends to reconstruct realistic images, whereas Manga109 consists of Japanese comics. For the other benchmarks, our method achieves obvious improvements in SSIM score.
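For reference, PSNR and SSIM [30] can be computed with standard tooling such as scikit-image; below is a minimal sketch assuming 8-bit RGB numpy arrays and scikit-image ≥ 0.19. Many SR papers evaluate on the luminance (Y) channel only, and the exact evaluation protocol used here is not specified, so treat this purely as an illustration.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(sr, hr):
    """sr, hr: uint8 HxWx3 arrays (super-resolved and ground truth images)."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, channel_axis=-1, data_range=255)
    return psnr, ssim
```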
Figure 4. Reconstruction results [4× upscaling] of ours and recent state-of-the-art methods (Ground-truth HR, Bicubic, VDSR, LapSRN, ours).

[Table 1: average PSNR/SSIM per algorithm and scale on Set5, Set14, BSDS100, Urban100 and Manga109, including rows for SRPGAN (Ours); numeric entries omitted.]
Table 1. Quantitative evaluation of state-of-the-art SR algorithms: average PSNR/SSIM for scale factors 2×, 4× and 8×. Red text indicates the best SSIM score and blue text indicates the second best.
In terms of perceptual quality, our method can recover realistic textures from heavily down-sampled images on the public benchmarks. We have selected some images from the benchmarks to visualize the effectiveness of our method in Figure 3 and Figure 4. From the results, the images constructed by our method show significant gains in perceptual quality.

We have conducted a series of experiments to show the effectiveness of our proposed SISR framework and loss functions. In Figure 3, we compare our method with the previous state-of-the-art methods LapSRN [17] and VDSR [14]. Besides these two methods, we have also compared our method with other CNN based methods, but due to the page limitation we only list the results of these two. To better visualize the effectiveness of our method, we selected and magnified small regions which contain rich details. As Figure 3 shows, our method successfully reconstructs the stripes on the zebras' bodies (shown in the red and orange boxes), whereas LapSRN and VDSR, which are based on a pixel-wise loss, generate blurry images without stripes in that area. In the leg area (yellow and green boxes), the perceptual quality of our method is not good enough, but our method tries to recover the stripe details on the legs, while the other methods construct leg images without any details. We also compared with other CNN based methods such as SRCNN [5]. In contrast, our approach surpasses other state-of-the-art methods in generating rich texture details. Further examples of perceptual improvements can be found in Figure 4. In these images, we can see that our method constructs high resolution images with good perceptual quality, while the methods based on a pixel-wise loss generate blurry and over-smoothed images.
Besides the comparison with other methods, we also evaluate the performance of the generator network without our perceptual loss, and with the Charbonnier loss replaced by other pixel-wise losses such as the l2 loss. We first train a model without the perceptual loss. The quantitative results are summarized in Table 2 and the visualization results in Figure 5. From Table 2, the proposed framework with the perceptual loss achieves much better performance than the framework trained without it. In Figure 5, the model trained with the perceptual loss also shows better perceptual quality than the model trained without it. We can observe from the sub-images that the model trained with the perceptual loss accurately reconstructs the beard of the baboon (in the red box).

Dataset  | Perceptual loss | SSIM
Set14    | Yes             | 0.786
Set14    | No              | 0.754
BSDS100  | Yes             | 0.749
BSDS100  | No              | 0.716
Table 2. Quantitative comparison of different perceptual loss strategies. We get better performance with our robust perceptual loss.
We also compare our perceptual loss with that of SRGAN, which is based on the VGG perceptual loss (see more details in the following subsection).

Dataset | Content loss | SSIM  | Training epochs
Set14   | Charbonnier  | 0.786 | 100
Set14   | l1           | 0.782 | 100
Set14   | l2           | 0.763 | 100
Table 3. Quantitative comparison of different content losses.
We have conducted further experiments to explore the content loss. To validate the effect of the Charbonnier loss function, we also trained models with the l1 and l2 content losses, respectively. In these experiments, the model with the l2 content loss requires more training epochs to achieve performance comparable to that of the model trained with the Charbonnier content loss. The results are shown in Table 3.
We conduct experiments to compare our method with SRGAN [18], which is based on the VGG perceptual loss. We can see that our perceptual loss is more robust than the VGG perceptual loss (see Figure 5). Note that we trained an SRGAN model using the open source TensorFlow code on GitHub (https://github.com/zsdonghao/SRGAN). For a fair comparison, we trained the SRGAN model using the same training dataset as our own method, at a scaling factor of 4× for 100 epochs, the same as for our method. As we can see in Figure 5, our SRPGAN method does a better job at reconstructing fine details, such as the beard of the baboon (in the red boxes of the second and fourth columns), leading to more pleasing visual results. On the other hand, in the training phase our SRPGAN does not need an extra VGG classification network to build the perceptual loss, unlike the SRGAN method; this helps to reduce the computation budget during training.

While our model is capable of constructing realistic images with sharp edges and rich details at large scale factors (4×, 8×), there are still some limitations. One limitation of our GAN based model is that the constructed images have checkerboard artifacts at the pixel level; the artifacts are visible in Figure 5 upon magnification of the sub-image regions. This phenomenon has also been noted in previous literature [21]. To solve this issue, one can replace the transposed convolution with a resize convolution [21] (sketched below) or a sub-pixel convolution [24]. We mark this as part of our future work.
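As a sketch of the remedy suggested in [21], a transposed convolution can be replaced by nearest-neighbor upsampling followed by an ordinary convolution (a "resize convolution"); the layer sizes here are illustrative.

```python
import torch.nn as nn

def resize_conv(in_ch, out_ch, scale=2):
    # Resize convolution [21]: upsample first, then convolve. Every output
    # pixel then sees an evenly overlapping kernel window, avoiding the
    # uneven-overlap pattern of strided transposed convolutions that
    # produces checkerboard artifacts.
    return nn.Sequential(
        nn.Upsample(scale_factor=scale, mode='nearest'),
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
    )
```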
5. Conclusion
In this paper, we have highlighted some limitations of methods based on a pixel-wise loss. To address these issues, we propose a general framework based on a generative adversarial network (GAN) for single image super resolution.
Figure 5. Reconstruction results [4× upscaling] of different perceptual loss strategies: our framework with the proposed perceptual loss (second column), our framework without the perceptual loss (third column), and SRGAN with the VGG perceptual loss (fourth column).
Based on this framework, we design individual loss functions and combine them to form the objective functions for the discriminator and the generator, respectively. Our method achieves the highest SSIM score on most of the commonly used benchmarks, and also constructs images with better perceptual quality than previous methods, especially at large upscaling factors (4× and 8×). The quantitative and visualization results show that SRPGAN surpasses previous methods in detail recovery and perceptual quality.
References

[1] H. K. Aghajan and T. Kailath. Sensor array processing techniques for super resolution multi-line-fitting and straight edge detection. IEEE Transactions on Image Processing, 2(4):454–465, 1993.
[2] J. T. Barron. A more general robust loss function. CoRR, abs/1701.03077, 2017.
[3] E. Bilgazyev, B. Efraty, S. K. Shah, and I. A. Kakadiaris. Improved face recognition using super-resolution. In Biometrics (IJCB), 2011 International Joint Conference on, pages 1–7. IEEE, 2011.
[4] R. Dahl, M. Norouzi, and J. Shlens. Pixel recursive super resolution. arXiv preprint arXiv:1702.00783, 2017.
[5] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. arXiv.org, Dec. 2014.
[6] C. Dong, C. C. Loy, and X. Tang. Accelerating the super-resolution convolutional neural network. arXiv.org, Aug. 2016.
[7] R. Gerchberg. Super-resolution through error energy reduction. Journal of Modern Optics, 21(9):709–720, 1974.
[8] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In Computer Vision, 2009 IEEE 12th International Conference on, pages 349–356. IEEE, 2009.
[9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[10] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5197–5206, 2015.
[11] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456, 2015.
[12] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.
[13] J. Johnson, A. Alahi, and F.-F. Li. Perceptual losses for real-time style transfer and super-resolution. CoRR, 2016.
[14] J. Kim, J. K. Lee, and K. M. Lee. Accurate image super-resolution using very deep convolutional networks. arXiv.org, Nov. 2015.
[15] J. Kim, J. K. Lee, and K. M. Lee. Deeply-recursive convolutional network for image super-resolution. CoRR, 2015.
[16] Y. Kim, H. Jung, D. Min, and K. Sohn. Deeply aggregated alternating minimization for image restoration. arXiv.org, Dec. 2016.
[17] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang. Deep Laplacian pyramid networks for fast and accurate super-resolution. arXiv preprint arXiv:1704.03915, 2017.
[18] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi. Photo-realistic single image super-resolution using a generative adversarial network. arXiv.org, Sept. 2016.
[19] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th Int'l Conf. Computer Vision, volume 2, pages 416–423, July 2001.
[20] S. Nah, T. H. Kim, and K. M. Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. CoRR, abs/1612.02177, 2016.
[21] A. Odena, V. Dumoulin, and C. Olah. Deconvolution and checkerboard artifacts. Distill, 2016.
[22] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
[23] S. Schulter, C. Leistner, and H. Bischof. Fast and accurate image upscaling with super-resolution forests. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3791–3799, 2015.
[24] W. Shi, J. Caballero, F. Huszar, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. CoRR, 2016.
[25] W. Shi, J. Caballero, C. Ledig, X. Zhuang, W. Bai, K. Bhatia, A. M. S. M. de Marvao, T. Dawes, D. O'Regan, and D. Rueckert. Cardiac image super-resolution with global correspondence using multi-atlas patchmatch. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 9–16. Springer, 2013.
[26] C. K. Sønderby, J. Caballero, L. Theis, W. Shi, and F. Huszar. Amortised MAP inference for image super-resolution. arXiv.org, Oct. 2016.
[27] M. Thornton, P. M. Atkinson, and D. Holland. Sub-pixel mapping of rural land cover objects from fine spatial resolution satellite sensor imagery using super-resolution pixel-swapping. International Journal of Remote Sensing, 27(3):473–491, 2006.
[28] R. Timofte, V. De Smet, and L. Van Gool. A+: Adjusted anchored neighborhood regression for fast super-resolution. In Asian Conference on Computer Vision, pages 111–126. Springer, 2014.
[29] D. Ulyanov, A. Vedaldi, and V. Lempitsky. Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022, 2016.
[30] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[31] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang. Deep networks for image super-resolution with sparse prior. arXiv.org, July 2015.
[32] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11):2861–2873, 2010.