Jose Caballero
Imperial College London
Publications
Featured research published by Jose Caballero.
Computer Vision and Pattern Recognition | 2017
Christian Ledig; Lucas Theis; Ferenc Huszár; Jose Caballero; Andrew Cunningham; Alejandro Acosta; Andrew P. Aitken; Alykhan Tejani; Johannes Totz; Zehan Wang; Wenzhe Shi
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
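The central idea above, a perceptual loss combining a VGG-feature content term with an adversarial term, can be sketched as follows. This is a minimal PyTorch illustration rather than the authors' code: the VGG layer cut-off, the 1e-3 adversarial weight and the assumption of 3-channel, ImageNet-normalised inputs are all illustrative choices.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

class PerceptualLoss(nn.Module):
    """Content loss in VGG feature space plus an adversarial loss (sketch)."""
    def __init__(self, adv_weight=1e-3):
        super().__init__()
        # Truncated VGG19 used as a fixed feature extractor for the content loss.
        features = vgg19(weights="IMAGENET1K_V1").features[:36].eval()
        for p in features.parameters():
            p.requires_grad = False
        self.features = features
        self.mse = nn.MSELoss()
        self.bce = nn.BCEWithLogitsLoss()
        self.adv_weight = adv_weight

    def forward(self, sr, hr, disc_logits_sr):
        # Content term: feature-space similarity rather than pixel-wise MSE.
        content = self.mse(self.features(sr), self.features(hr))
        # Adversarial term: push the discriminator to label SR outputs as real.
        adversarial = self.bce(disc_logits_sr, torch.ones_like(disc_logits_sr))
        return content + self.adv_weight * adversarial
```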
Computer Vision and Pattern Recognition | 2016
Wenzhe Shi; Jose Caballero; Ferenc Huszár; Johannes Totz; Andrew P. Aitken; Rob Bishop; Daniel Rueckert; Zehan Wang
Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods.
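The sub-pixel convolution layer described above has a direct PyTorch analogue in nn.PixelShuffle; the toy network below keeps every convolution in LR space and only rearranges channels onto the HR grid at the very end. Layer widths and the three-layer depth are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SubPixelSR(nn.Module):
    """Toy ESPCN-style network: all convolutions run in LR space (sketch)."""
    def __init__(self, upscale=3, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 5, padding=2), nn.Tanh(),
            nn.Conv2d(64, 32, 3, padding=1), nn.Tanh(),
            # The final layer outputs upscale^2 channels per image channel.
            nn.Conv2d(32, channels * upscale ** 2, 3, padding=1),
        )
        # PixelShuffle rearranges the LR feature maps into the HR output.
        self.shuffle = nn.PixelShuffle(upscale)

    def forward(self, lr):
        return self.shuffle(self.body(lr))

# A 1x1x64x64 LR input becomes a 1x1x192x192 HR output for upscale=3.
hr = SubPixelSR()(torch.randn(1, 1, 64, 64))
```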
IEEE Transactions on Medical Imaging | 2014
Jose Caballero; Anthony N. Price; Daniel Rueckert; Joseph V. Hajnal
The reconstruction of dynamic magnetic resonance data from an undersampled k-space has been shown to have a huge potential in accelerating the acquisition process of this imaging modality. With the introduction of compressed sensing (CS) theory, solutions for undersampled data have arisen which reconstruct images consistent with the acquired samples and compliant with a sparsity model in some transform domain. Fixed basis transforms have been extensively used as sparsifying transforms in the past, but recent developments in dictionary learning (DL) have been shown to outperform them by training an overcomplete basis that is optimal for a particular dataset. We present here an iterative algorithm that enables the application of DL for the reconstruction of cardiac cine data with Cartesian undersampling. This is achieved with local processing of spatio-temporal 3D patches and by independent treatment of the real and imaginary parts of the dataset. The enforcement of temporal gradients is also proposed as an additional constraint that can greatly accelerate the convergence rate and improve the reconstruction for high acceleration rates. The method is compared to and shown to systematically outperform k-t FOCUSS, a successful CS method that uses a fixed basis transform.
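One building block common to reconstructions of this kind is the data-consistency step, in which the current image estimate is forced to agree with the acquired Cartesian k-space samples. The NumPy sketch below shows only that step; the dictionary learning and temporal-gradient terms of the algorithm above are not reproduced.

```python
import numpy as np

def data_consistency(image_estimate, kspace_measured, mask):
    """Replace reconstructed k-space values with the acquired ones.

    image_estimate: complex 2D image; kspace_measured and mask have the same
    2D shape, with mask a boolean array marking acquired Cartesian samples.
    """
    k_est = np.fft.fft2(image_estimate)
    k_est[mask] = kspace_measured[mask]   # keep measured samples unchanged
    return np.fft.ifft2(k_est)
```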
IEEE Transactions on Medical Imaging | 2018
Jo Schlemper; Jose Caballero; Joseph V. Hajnal; Anthony N. Price; Daniel Rueckert
Inspired by recent advances in deep learning, we propose a framework for reconstructing dynamic sequences of 2-D cardiac magnetic resonance (MR) images from undersampled data using a deep cascade of convolutional neural networks (CNNs) to accelerate the data acquisition process. In particular, we address the case where data are acquired using aggressive Cartesian undersampling. First, we show that when each 2-D image frame is reconstructed independently, the proposed method outperforms state-of-the-art 2-D compressed sensing approaches, such as dictionary learning-based MR image reconstruction, in terms of reconstruction error and reconstruction speed. Second, when reconstructing the frames of the sequences jointly, we demonstrate that CNNs can learn spatio-temporal correlations efficiently by combining convolution and data sharing approaches. We show that the proposed method consistently outperforms state-of-the-art methods and is capable of preserving anatomical structure more faithfully up to 11-fold undersampling. Moreover, reconstruction is very fast: each complete dynamic sequence can be reconstructed in less than 10 s and, for the 2-D case, each image frame can be reconstructed in 23 ms, enabling real-time applications.
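A cascade of this kind alternates CNN de-aliasing blocks with data-consistency layers. The sketch below captures that structure under simplifying assumptions (real/imaginary channels, a fixed block depth, five cascades); it is an illustration of the idea rather than the published architecture.

```python
import torch
import torch.nn as nn

class DCLayer(nn.Module):
    """Data consistency: keep acquired k-space samples, reuse CNN output elsewhere."""
    def forward(self, x, k0, mask):
        # x: (B, 2, H, W) real/imag channels; k0: (B, H, W) complex k-space;
        # mask: (B, H, W) float tensor with values in {0, 1}.
        k = torch.fft.fft2(torch.complex(x[:, 0], x[:, 1]))
        k = mask * k0 + (1 - mask) * k
        img = torch.fft.ifft2(k)
        return torch.stack([img.real, img.imag], dim=1)

class CascadeNet(nn.Module):
    def __init__(self, n_cascades=5):
        super().__init__()
        def cnn_block():
            return nn.Sequential(
                nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 2, 3, padding=1),
            )
        self.blocks = nn.ModuleList([cnn_block() for _ in range(n_cascades)])
        self.dc = DCLayer()

    def forward(self, x, k0, mask):
        for block in self.blocks:
            x = x + block(x)           # residual CNN de-aliasing
            x = self.dc(x, k0, mask)   # enforce agreement with measured data
        return x
```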
Medical Image Computing and Computer-Assisted Intervention | 2013
Wenzhe Shi; Jose Caballero; Christian Ledig; Xiahai Zhuang; Wenjia Bai; Kanwal K. Bhatia; Antonio de Marvao; Tim Dawes; Declan P. O’Regan; Daniel Rueckert
The accurate measurement of 3D cardiac function is an important task in the analysis of cardiac magnetic resonance (MR) images. However, short-axis image acquisitions with thick slices are commonly used in clinical practice due to constraints of acquisition time, signal-to-noise ratio and patient compliance. In this situation, the estimation of a high-resolution image can provide an approximation of the underlying 3D measurements. In this paper, we develop a novel algorithm for the estimation of high-resolution cardiac MR images from single short-axis cardiac MR image stacks. First, we propose to use a novel approximate global search approach to find patch correspondence between the short-axis MR image and a set of atlases. Then, we propose an innovative super-resolution model which does not require explicit motion estimation. Finally, we build an expectation-maximization framework to optimize the model. We validate the proposed approach using images from 19 subjects with 200 atlases and show that the proposed algorithm significantly outperforms conventional interpolation such as linear or B-spline interpolation. In addition, we show that the super-resolved images can be used for the reproducible estimation of 3D cardiac functional indices.
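The patch correspondence step can be illustrated with a brute-force nearest-neighbour search: for each patch of the short-axis slice, pick the most similar atlas patch by L2 distance. This stands in for the approximate global search used in the paper; the patch size and distance metric are illustrative assumptions.

```python
import numpy as np

def best_atlas_patch(query_patch, atlas_patches):
    """query_patch: (p, p) array; atlas_patches: (N, p, p) stack of candidates."""
    # Sum-of-squared-differences between the query and every atlas patch.
    d = np.sum((atlas_patches - query_patch) ** 2, axis=(1, 2))
    return atlas_patches[np.argmin(d)]
```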
Medical Image Computing and Computer-Assisted Intervention | 2016
Ozan Oktay; Wenjia Bai; Matthew C. H. Lee; Ricardo Guerrero; Konstantinos Kamnitsas; Jose Caballero; Antonio de Marvao; Stuart A. Cook; Declan P. O’Regan; Daniel Rueckert
3D cardiac MR imaging enables accurate analysis of cardiac morphology and physiology. However, due to the requirements for long acquisition and breath-hold, the clinical routine is still dominated by multi-slice 2D imaging, which hampers the visualization of anatomy and quantitative measurements because relatively thick slices are acquired. As a solution, we propose a novel image super-resolution (SR) approach that is based on a residual convolutional neural network (CNN) model. It reconstructs high-resolution 3D volumes from 2D image stacks for more accurate image analysis. The proposed model allows the use of multiple input data acquired from different viewing planes for improved performance. Experimental results on 1233 cardiac short- and long-axis MR image stacks show that the CNN model outperforms state-of-the-art SR methods in terms of image quality while being computationally efficient. Also, we show that image segmentation and motion tracking benefit more when SR-CNN, rather than conventional interpolation, is used as the initial upscaling method for the subsequent analysis.
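A residual CNN for this task can be sketched as a network that predicts only the detail missing from an initial interpolation of the thick-slice stack. The 3D layers, widths and through-plane upsampling factor below are illustrative assumptions, not the exact model evaluated above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualSR3D(nn.Module):
    """Predict a residual correction on top of an interpolated volume (sketch)."""
    def __init__(self, scale=(1, 1, 5)):   # e.g. upsample only through-plane
        super().__init__()
        self.scale = scale
        self.body = nn.Sequential(
            nn.Conv3d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 1, 3, padding=1),
        )

    def forward(self, lr_stack):
        # Initial interpolation to the HR grid, then learn only the residual detail.
        up = F.interpolate(lr_stack, scale_factor=self.scale,
                           mode="trilinear", align_corners=False)
        return up + self.body(up)
```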
IEEE Transactions on Medical Imaging | 2018
Ozan Oktay; Enzo Ferrante; Konstantinos Kamnitsas; Mattias P. Heinrich; Wenjia Bai; Jose Caballero; Stuart A. Cook; Antonio de Marvao; Timothy Dawes; Declan O'Regan; Bernhard Kainz; Ben Glocker; Daniel Rueckert
Incorporation of prior knowledge about organ shape and location is key to improve performance of image analysis approaches. In particular, priors can be useful in cases where images are corrupted and contain artefacts due to limitations in image acquisition. The highly constrained nature of anatomical objects can be well captured with learning-based techniques. However, in most recent and promising techniques such as CNN-based segmentation it is not obvious how to incorporate such prior knowledge. State-of-the-art methods operate as pixel-wise classifiers where the training objectives do not incorporate the structure and inter-dependencies of the output. To overcome this limitation, we propose a generic training strategy that incorporates anatomical prior knowledge into CNNs through a new regularisation model, which is trained end-to-end. The new framework encourages models to follow the global anatomical properties of the underlying anatomy (e.g. shape, label structure) via learnt non-linear representations of the shape. We show that the proposed approach can be easily adapted to different analysis tasks (e.g. image enhancement, segmentation) and improve the prediction accuracy of the state-of-the-art models. The applicability of our approach is shown on multi-modal cardiac data sets and public benchmarks. In addition, we demonstrate how the learnt deep models of 3-D shapes can be interpreted and used as biomarkers for classification of cardiac pathologies.
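At a high level, the training strategy above can be approximated by adding a penalty that compares predictions and ground-truth label maps in the latent space of a pre-trained shape autoencoder. The sketch below assumes a 2D segmentation setting, an externally supplied shape_encoder and a weight lam of 0.1, all of which are illustrative; it is not the published regularisation model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapeRegularisedLoss(nn.Module):
    """Segmentation loss plus a latent-space shape penalty (sketch)."""
    def __init__(self, shape_encoder, lam=0.1):
        super().__init__()
        self.seg_loss = nn.CrossEntropyLoss()
        self.encoder = shape_encoder          # maps label maps to a compact latent code
        for p in self.encoder.parameters():   # keep the learnt shape prior fixed
            p.requires_grad = False
        self.lam = lam

    def forward(self, logits, target):
        # logits: (B, C, H, W); target: (B, H, W) integer label map.
        seg = self.seg_loss(logits, target)
        probs = torch.softmax(logits, dim=1)
        onehot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
        # Penalise disagreement between prediction and ground truth in latent space.
        shape = torch.mean((self.encoder(probs) - self.encoder(onehot)) ** 2)
        return seg + self.lam * shape
```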
Medical Image Computing and Computer-Assisted Intervention | 2012
Jose Caballero; Daniel Rueckert; Joseph V. Hajnal
Sparse representation methods have been shown to tackle adequately the inherent speed limits of magnetic resonance imaging (MRI) acquisition. Recently, learning-based techniques have been used to further accelerate the acquisition of 2D MRI. The extension of such algorithms to dynamic MRI (dMRI) requires careful examination of the signal sparsity distribution among the different dimensions of the data. Notably, the potential of temporal gradient (TG) sparsity in dMRI has not yet been explored. In this paper, a novel method for the acceleration of cardiac dMRI is presented which investigates the potential benefits of enforcing sparsity constraints on patch-based learned dictionaries and TG at the same time. We show that an algorithm exploiting sparsity on these two domains can outperform previous sparse reconstruction techniques.
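The temporal-gradient constraint can be illustrated in isolation: differences between consecutive frames are soft-thresholded, which suppresses incoherent undersampling artefacts while keeping genuine motion. The threshold and the simple cumulative-sum reconstruction below are illustrative; the paper combines this idea with patch-based dictionary sparsity and data consistency, which are not shown.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def enforce_tg_sparsity(frames, t=0.05):
    """frames: real (T, H, W) array, e.g. the real or imaginary part of a cine series."""
    tg = np.diff(frames, axis=0)        # frame-to-frame (temporal gradient) differences
    tg = soft_threshold(tg, t)          # promote sparsity along the temporal axis
    # Rebuild the series from the first frame and the thresholded gradients.
    return np.concatenate([frames[:1], frames[:1] + np.cumsum(tg, axis=0)], axis=0)
```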
Information Processing in Medical Imaging | 2017
Jo Schlemper; Jose Caballero; Joseph V. Hajnal; Anthony N. Price; Daniel Rueckert
The acquisition of Magnetic Resonance Imaging (MRI) is inherently slow. Inspired by recent advances in deep learning, we propose a framework for reconstructing MR images from undersampled data using a deep cascade of convolutional neural networks to accelerate the data acquisition process. We show that for Cartesian undersampling of 2D cardiac MR images, the proposed method outperforms the state-of-the-art compressed sensing approaches, such as dictionary learning-based MRI (DLMRI) reconstruction, in terms of reconstruction error, perceptual quality and reconstruction speed for both 3-fold and 6-fold undersampling. Compared to DLMRI, the proposed method produces approximately half the reconstruction error, allowing anatomical structures to be preserved more faithfully. Using our method, each image can be reconstructed in 23 ms, which is fast enough to enable real-time applications.
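The Cartesian undersampling used in the evaluation can be sketched as a line-based sampling mask: a fully sampled central band of phase-encode lines plus randomly chosen outer lines to reach the target acceleration. The centre width and the uniform random scheme below are illustrative assumptions, not the exact sampling pattern used in the paper.

```python
import numpy as np

def cartesian_mask(shape, acceleration=6, centre_lines=8, seed=0):
    """shape: (n_pe, n_fe). Returns a boolean mask keeping whole phase-encode lines."""
    rng = np.random.default_rng(seed)
    n_pe, n_fe = shape
    keep = np.zeros(n_pe, dtype=bool)
    # Always keep a fully sampled band around the centre of k-space.
    keep[n_pe // 2 - centre_lines // 2 : n_pe // 2 + centre_lines // 2] = True
    # Add random outer lines until the target acceleration is reached.
    extra = max(n_pe // acceleration - centre_lines, 0)
    keep[rng.choice(np.flatnonzero(~keep), size=extra, replace=False)] = True
    return np.repeat(keep[:, None], n_fe, axis=1)
```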
Computer Vision and Pattern Recognition | 2017
Jose Caballero; Christian Ledig; Andrew P. Aitken; Alejandro Acosta; Johannes Totz; Zehan Wang; Wenzhe Shi
Convolutional neural networks have enabled accurate image super-resolution in real-time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
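Early fusion, the simplest of the joint-processing schemes mentioned above, can be sketched by concatenating consecutive LR frames along the channel axis before a sub-pixel network. Motion compensation and the slow-fusion and 3D-convolution variants are not shown, and the frame count, widths and depth are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EarlyFusionVSR(nn.Module):
    """Early fusion: consecutive frames become input channels of a sub-pixel network (sketch)."""
    def __init__(self, n_frames=3, upscale=4, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_frames * channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, channels * upscale ** 2, 3, padding=1),
            nn.PixelShuffle(upscale),
        )

    def forward(self, frames):
        # frames: (B, T, C, H, W) -> fuse the temporal axis into channels.
        b, t, c, h, w = frames.shape
        return self.net(frames.reshape(b, t * c, h, w))

# Three consecutive 64x64 LR frames -> one 256x256 HR estimate of the centre frame.
hr = EarlyFusionVSR()(torch.randn(1, 3, 1, 64, 64))
```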