Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks
Reuben A. Farrugia, Senior Member, IEEE, and Christine Guillemot, Fellow, IEEE
Abstract—Light field imaging has recently known a regain of interest due to the availability of practical light field capturing systems that offer a wide range of applications in the field of computer vision. However, capturing high-resolution light fields remains technologically challenging since the increase in angular resolution is often accompanied by a significant reduction in spatial resolution. This paper describes a learning-based spatial light field super-resolution method that allows the restoration of the entire light field with consistency across all sub-aperture images. The algorithm first uses optical flow to align the light field and then reduces its angular dimension using low-rank approximation. We then consider the linearly independent columns of the resulting low-rank model as an embedding, which is restored using a deep convolutional neural network (DCNN). The super-resolved embedding is then used to reconstruct the remaining sub-aperture images. The original disparities are restored using inverse warping, where missing pixels are approximated using a novel light field inpainting algorithm. Experimental results show that the proposed method outperforms existing light field super-resolution algorithms, achieving PSNR gains of 0.23 dB over the second best performing method. This performance can be further improved using iterative back-projection as a post-processing step.
Index Terms—Deep Convolutional Neural Networks, Light Field, Low-Rank Matrix Approximation, Super-Resolution.
1 INTRODUCTION

Light field imaging has emerged as a promising technology for a variety of applications going from photo-realistic image-based rendering to computer vision applications such as 3D modeling, object detection, classification and recognition. As opposed to traditional photography, which captures a 2D projection of the light in the scene, light fields collect the radiance of light rays along different directions [1], [2]. This rich visual description of the scene offers powerful capabilities for scene understanding and for improving the performance of traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization and material classification, to mention a few.

Light field acquisition devices have recently been designed, going from rigs of cameras [2] capturing the scene from slightly different viewpoints to plenoptic cameras using micro-lens arrays placed in front of the photo-sensor [1]. These acquisition devices offer different trade-offs between angular and spatial resolution. Rigs of cameras capture views with a high spatial resolution but in general with a limited angular sampling, hence large disparities between views. On the other hand, plenoptic cameras capture views with a high angular sampling, but at the expense of a limited spatial resolution. In plenoptic cameras, the angular sampling is related to the number of sensor pixels located behind each microlens, while the spatial sampling is related to the number of microlenses.

• R. A. Farrugia is with the University of Malta, Malta.
• C. Guillemot is with the Institut National de Recherche en Informatique et en Automatique (INRIA), Rennes 35042, France, e-mail: [email protected].

Manuscript received January, 2018; revised xxx.
Light fields represent very large volumes of high dimensional data, bringing new challenges in terms of capture, compression, editing and display. The design of efficient light field image processing algorithms, going from analysis and compression to super-resolution and editing, has thus recently attracted interest from the research community. A comprehensive overview of light field image processing techniques can be found in [3].

This paper addresses the problem of light field spatial super-resolution. Single image super-resolution has been an active field of research in the past years, leading to quite mature solutions. However, super-resolving each view separately using state of the art single-image super-resolution techniques would not take advantage of light field properties, in particular of angular redundancy, which depends on scene geometry [4]. Moreover, considering each sub-aperture image as a separate entity may reconstruct light fields which are angularly incoherent [5].

Assuming that the low-resolution light field captures different views of the same scene taken with sub-pixel misalignment, the problem can be posed as the one of recovering the high-resolution (HR) views from multiple low-resolution images with unknown non-integer translational misalignment. A number of methods hence proceed in two steps. A first step consists in estimating the translational misalignment using depth or disparity estimation techniques. The HR light field views are then found using Bayesian or variational optimization frameworks with different priors. This is the case in [6] and [7], where the authors first recover a depth map and formulate the spatial light field super-resolution problem either as a simple linear problem [6] or as a Bayesian inference problem [7], assuming an image formation model with Lambertian reflectance priors and a depth-dependent blurring kernel. A Gaussian mixture model (GMM) is proposed instead in [8] to address denoising, spatial and angular super-resolution of light fields. The reconstructed 4D patches are estimated using a linear minimum mean square error (LMMSE) estimator, assuming a disparity-dependent GMM for the patch structure. In [9], the geometry is estimated by computing structure tensors in the Epipolar Plane Images (EPI). A variational optimization framework is then used to spatially super-resolve the different views given their estimated depth maps and to increase the angular resolution.

Another category of methods is based on machine learning techniques which learn a model of correspondences between low- and high-resolution data. In [5], the authors learn projections between low-dimensional subspaces of 3D patch-volumes of low- and high-resolution, using ridge regression. Data-driven learning methods based on deep neural network models have recently been shown to be quite promising for light field super-resolution. Stacked input images are up-scaled to a target resolution using bicubic interpolation and super-resolved using a spatial convolutional neural network (CNN) in [10], which learns a non-linear mapping between low- and high-resolution views. The output of the spatial CNN is then fed into a second CNN to perform angular super-resolution. While the approach in [10] takes at the input of the spatial CNN pairs or tuples of neighboring views, leading to three spatial CNNs to be learned, a single CNN is proposed by the same authors in [11] to process each view independently.
The problem of angular super-resolution of light fields is also addressed in [12] using an architecture based on two CNNs, one CNN being used to estimate disparity maps and the second CNN being used to synthesize intermediate views. The authors in [13] define a CNN architecture in the EPI to increase the angular resolution.

One can also cite some related work using a hybrid light field imaging system coupling a high-resolution camera with either a light field camera [14] or with multiple low-resolution cameras [15]. In [14], the HR image captured by the DSLR camera is used to super-resolve the low-resolution images captured by an Illum light field camera. The authors in [15] describe an acquisition device formed by eight low-resolution side cameras arranged around a central high-quality SLR lens. A super-resolution method, called iterative patch- and depth-based synthesis (iPADS), is then proposed to reconstruct a light field with the spatial resolution of the SLR camera and an increased number of views.

In this paper, we propose a spatial light field super-resolution method using a deep CNN (DCNN) with ten convolutional layers. Instead of using a DCNN to restore each sub-aperture image independently, as done in [10], [11], we restore all sub-aperture images within a light field simultaneously. This allows us to exploit both spatial and angular information to restore the light field and thus generate light fields which are angularly coherent. A naïve approach is to train a DCNN with n = P × Q inputs, where P and Q represent the number of vertical and horizontal angular views respectively. However, this would significantly increase the complexity of the DCNN, making it harder to train and more prone to over-fitting. Instead, given that each sub-aperture image captures the same scene from a different view point, we align all sub-aperture images to the centre view using optical flow and then reduce the angular dimension of the aligned light field using a low-rank model of rank k, where k ≪ n. Results in section 3.1 show that the alignment allows us, with the considered low-rank model, to significantly reduce the angular dimension of the light field. The linearly independent column-vectors of the low-rank representation of the aligned light field, which constitute an embedding of the light field views in a lower-dimensional space, are then considered as a volume and simultaneously restored using a DCNN with k input channels. This allows us to significantly reduce the complexity of the network, which is easier to train while still preserving angular consistency. The restored column-vectors are then combined to reconstruct the aligned high-resolution light field. In the final stage we use inverse warping to restore the original disparities of the light field and fill the cracks caused by occlusion using a novel diffusion-based inpainting strategy that propagates the restored pixels along the dominant orientation of the EPI.

Simulation results demonstrate that the proposed method outperforms all other schemes considered here when tested on 13 different light fields from two different datasets. It is important to mention that our method was not trained on the Stanford light fields, and these results clearly show that our proposed method generalizes well even when considering light field structures whose disparities are significantly larger than those used for training. Further analysis in section 4 shows that an additional gain in performance can be achieved using iterative back-projection (IBP) as a post-processing step.
These results show that our method can significantly outperform existing light field super-resolution methods, including the deep learning-based light field super-resolution method presented in [11].

This paper is organized as follows. After introducing the notation in section 2, we describe the proposed method in section 3. Section 4 discusses the simulation results with different types of light fields, and section 5 provides the final concluding remarks.

2 NOTATION AND PROBLEM FORMULATION
We consider here the simplified 4D representation of light fields, called 4D light field in [16] and lumigraph in [17], describing the radiance along rays by a function I(x, y, s, t), where the pairs (x, y) and (s, t) respectively represent spatial and angular coordinates. The light field can be seen as capturing an array of viewpoints (called sub-aperture images) of the scene with varying angular coordinates (s, t). The different views will be denoted here by I_{s,t} ∈ R^{X,Y}, where X and Y represent the vertical and horizontal dimension of each sub-aperture image.

In the following, the notation I_{s,t} for the different sub-aperture images will be simplified as I_i, with a bijection between (s, t) and i. The complete light field can hence be represented by a matrix I ∈ R^{m,n}:

I = [vec(I_1) | vec(I_2) | · · · | vec(I_n)]   (1)

with vec(I_i) being the vectorized representation of the i-th sub-aperture image, m the number of pixels in each view (m = X × Y) and n the number of views in the light field (n = P × Q), where P and Q represent the number of vertical and horizontal angular views respectively.

Let I^H and I^L denote the high- and low-resolution light fields, respectively. The super-resolution problem can be formulated in Banach space as

I^L = ↓_α B I^H + η   (2)

where η is an additive noise matrix, ↓_α is a downsampling operator applied on each sub-aperture image, α is the magnification factor and B is the blurring kernel. There are many possible high-resolution light fields I^H which can produce the input low-resolution light field I^L via the acquisition model defined in (2). Hence, solving this ill-posed inverse problem requires introducing some priors on I^H, which can be a statistical prior such as a GMM model [8], or priors learned from training data as in [5], [10], [11].

Another way to visualize a light field is to consider the EPI representation. An EPI is a spatio-angular slice from the light field, obtained by fixing one of the spatial coordinates and one of the angular coordinates. If we fix y := y* and t := t*, an EPI is an image defined as ε_{y*,t*} := I(x, y*, s, t*). Alternatively, the vertical EPI is obtained by fixing x := x* and s := s*. Figure 5b shows a typical EPI structure, where the slopes of the isophote lines in the EPI are related to the disparity between the views [9]. Isophote lines with a slope of π/2 indicate that there is no disparity across the views, while the larger the difference between the slope and π/2, the larger the disparity across the views.
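To make this concrete, the following minimal Python sketch (with hypothetical function names) builds the matrix I of Eq. (1) from a list of views and applies the acquisition model of Eq. (2) to a single view; a Gaussian kernel stands in for B, and σ = 1.6 matches the degradation used in the experiments of section 4.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def lightfield_to_matrix(views):
    """Stack the n sub-aperture images (each X x Y) into the m x n matrix
    I of Eq. (1), one vectorized view per column (m = X * Y)."""
    return np.stack([v.reshape(-1) for v in views], axis=1)

def degrade(view_hr, alpha=2, sigma=1.6):
    """Acquisition model of Eq. (2) for one view (noise term omitted):
    blur with the kernel B, then decimate by the magnification factor alpha."""
    blurred = gaussian_filter(view_hr, sigma=sigma)   # B I^H
    return blurred[::alpha, ::alpha]                  # downsampling operator
```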
3 PROPOSED METHOD

Figure 1 depicts the block diagram of the proposed spatial light field super-resolution algorithm. Each sub-aperture image of the low-resolution light field I^L is first bicubic interpolated so that both I^H and I^L have the same resolution. While a light field consists of a very large volume of high-dimensional data, it also contains a lot of redundant information, since every sub-aperture image captures the same scene from a different viewpoint. Moreover, different light field capturing devices have different spatial and angular specifications, which makes it very hard for a learning-based algorithm to learn a generalized model suitable to restore all kinds of light fields irrespective of the capturing device. The Dimensionality Reduction module tackles both problems simultaneously: it uses optical flow to align the light field and a low-rank matrix approximation to reduce the dimension of the light field. Results in section 3.1 show that we can reduce the dimensionality of the light field from R^{m,n} to R^{m,k}, where k ≪ n is the rank of the matrix, while preserving most of the information contained in the light field.

The Light Field Restoration module then considers the k linearly independent column-vectors of the rank-k representation of the low-resolution light field as an embedding of the light field. We then use a DCNN to recover the texture details of the light field embedding in the lower dimensional space. The super-resolved embedding gives an estimate of the aligned high-resolution light field. The Light Field Reconstruction module then warps the estimated aligned high-resolution light field to restore the original disparities. Holes corresponding to cracks or occlusions are then filled in by diffusing information in the Epipolar Plane Images (EPI) along directions of isophote lines computed, for the positions of missing pixels, in the EPI of the low-resolution light field. Iterative back-projection can further be used as a post-process to refine the super-resolved light field and ensure that it is consistent with the low-resolution light field. More information about each module is provided in the following subsections.

Fig. 1: Block diagram of the proposed light field super-resolution method.
3.1 Light Field Dimensionality Reduction

Many acquisition devices have been recently designed to capture light fields, including multi-sensor approaches [2], time-sequential capture methods [18], [19] and plenoptic cameras [1], [20]. All these light field cameras have diverse spatial and angular specifications, which makes it very hard for a learning-based algorithm to learn a generalized model suitable to restore any kind of light field independently of the source. Moreover, a light field contains a huge amount of redundant information, since every view captures the same scene from a different viewpoint.

The authors in [21] hypothesized that this redundancy can be suppressed by jointly aligning the sub-aperture images in the light field and estimating a low-rank approximation (LRA) of the light field. This approach has shown very promising results in the field of light field compression. In the same spirit, the RASL algorithm [22] was used to find the homographies that globally align a batch of linearly correlated images. Both methods seek an optimal set of homographies such that the matrix of aligned images can be decomposed into a low-rank matrix of aligned images, with the latter constraining the error matrix to be sparse. The results in Figure 2 show that while both the RASL and LRA methods manage to globally align the sub-aperture images, the mean sub-aperture image is still very blurred, indicating that the light field is not suitably aligned.

The authors in [5] have used the block matching algorithm (BMA) to align patch volumes. The results in Figure 2 show that BMA aligns the sub-aperture images better, where the average variance across the n sub-aperture images is significantly reduced. This result suggests that local methods can improve the alignment of the sub-aperture images, which, as we will see in the sequel, will allow us to significantly reduce the dimensionality of the light field.

In this paper, we formulate the light field dimensionality reduction problem as

min_{u,v,A} ‖Γ_{u,v}(I^L) − A‖²_F   s.t.   rank(A) = k   (3)

where u ∈ R^{m,n} and v ∈ R^{m,n} are flow vectors that specify the displacement of each pixel needed to align each sub-aperture image with the centre view, A is a rank-k matrix which approximates the aligned light field, and Γ_{u,v}(·) is a forward warping operator (which here performs a disparity compensation where the disparity maps (u, v) are estimated with an optical flow estimator). This optimization problem is computationally intractable. Instead, we decompose this problem into two sub-problems: i) use optical flow to find the flow matrices u and v that best align each sub-aperture image with the centre view, and ii) use low-rank approximation to derive the rank-k matrix that minimizes the error with respect to the aligned light field.
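Sub-problem i) warps every view towards the centre view given its estimated flow field. A minimal sketch follows, assuming per-pixel flow arrays u, v of the same shape as the view and using simple backward bilinear sampling; the paper's forward warping operator additionally leaves holes at occlusions, which are handled in section 3.3.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def align_to_centre(view, u, v):
    """Warp one sub-aperture image towards the centre view given its
    per-pixel flow (u, v) estimated by an optical flow algorithm.
    Bilinear backward sampling is used here for simplicity."""
    X, Y = view.shape
    xs, ys = np.meshgrid(np.arange(X), np.arange(Y), indexing="ij")
    # sample the view at (x + u, y + v) so that the result matches the centre view
    return map_coordinates(view, [xs + u, ys + v], order=1, mode="nearest")
```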
[Data accompanying Fig. 2: average variance across the n sub-aperture images for each alignment method; one row per light field shown, smaller values indicating better alignment.]

Method:        No Align | RASL [22] | LRMA [21] | BMA    | Horn-Schunck [23] | SIFT Flow [24] | CPM [25] | SPM-BP [26]
Light field 1:  44.714  |  43.896   |  44.787   |  4.434 |      16.466       |     1.796      |  7.809   |   5.817
Light field 2: 306.960  | 286.078   | 306.166   | 24.519 |      45.588       |     8.560      | 68.067   |  22.313
Light field 3:  99.092  |  98.261   |  99.080   | 46.374 |      36.388       |    19.164      | 51.073   |  71.652
Fig. 2: Cropped regions of the mean sub-aperture images when using different disparity compensation methods. Underneath each image we provide the average variance across the n sub-aperture images, which was used in [5] to characterize the performance of the alignment algorithm, where smaller values indicate better alignment.

The problem of aligning all the sub-aperture images with the centre view can be formulated as

I_j(x, y) = I_i(x + u_i, y + v_i),   i ∈ [1, n], i ≠ j   (4)

where j corresponds to the index of the centre view, and (u_i, v_i) are the flow vectors optimal to align the i-th sub-aperture image with the centre view. There are several optical flow algorithms intended to solve this problem [23], [24], [25], [26], and Figure 2 shows the performance of some of these methods. It can be seen that the mean aligned sub-aperture image computed using [23] is generally blurred, while those aligned using the methods in [25], [26] generally exhibit ghosting artefacts at the edges. Moreover, it can be seen that SIFT Flow [24] generally provides very good alignment and manages to attain the smallest variation across the sub-aperture images. While the SIFT Flow algorithm will be used in this paper to compute the flow vectors, any other optical flow method can be used.

Given that the flow vectors (u_i, v_i) for the i-th sub-aperture image are already available, the minimization problem in Eq. (3) can now be reduced to

min_{B^L, C^L} ‖I^L_Γ − B^L C^L‖²_F   s.t.   rank(B^L) = k   (5)

where I^L_Γ = Γ_{u,v}(I^L), B^L ∈ R^{m,k} is a rank-k matrix and C^L ∈ R^{k,n} is the combination weight matrix. These matrices can be found using the singular value decomposition (SVD) I^L_Γ = UΣV^T, where B^L is set as the k first columns of UΣ and C^L is set as the k first rows of V^T, so that B^L C^L is the closest rank-k approximation of the aligned light field I^L_Γ. The error matrix E^L is simply computed as E^L = I^L_Γ − B^L C^L.

Figure 3 depicts the performance of three different dimensionality reduction techniques at different ranks. To measure the dimensionality reduction ability of these methods we compute the root mean square error (RMSE) between the aligned original and the rank-k representation of the aligned light field. It can be seen that the RASL algorithm has the largest distortions at almost all ranks when compared to the other two approaches. On the other hand, HLRA manages to significantly outperform the RASL method, a gain attained thanks to the improved alignment. Nevertheless, it can be clearly observed that the proposed SIFT Flow + LRA method gives the best performance, especially at lower ranks, indicating that more information is captured within the low-rank matrix. To emphasize this point we show in Figure 3 the principal basis of PCA, HLRA and SIFT Flow + LRA.¹ PCA is computed on the light field without disparity estimation and can therefore be considered here as a baseline showing that alignment allows us to pack more information into the principal basis. Moreover, it can be seen that the principal basis derived using our SIFT Flow + LRA captures more texture detail than the other methods, which confirms the benefit that local alignment has on the energy compaction ability of the proposed dimensionality reduction method.

1. Note that the RASL method does not decompose the matrix into a combination of basis elements, and therefore the principal basis of RASL could not be shown here.
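Sub-problem ii) is a plain truncated SVD. A minimal sketch, with assumed names:

```python
import numpy as np

def low_rank_factors(I_aligned, k):
    """Rank-k factorization of the aligned light field (Eq. (5)) via SVD:
    B = first k columns of U*Sigma, C = first k rows of V^T, so that B @ C
    is the closest rank-k approximation of I_aligned in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(I_aligned, full_matrices=False)
    B = U[:, :k] * s[:k]          # m x k; broadcasts s over the k columns
    C = Vt[:k, :]                 # k x n
    E = I_aligned - B @ C         # error matrix E^L
    return B, C, E
```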
3.2 Light Field Restoration

We consider a low-rank representation of the aligned low-resolution light field A^L = B^L C^L, where A^L ∈ R^{m,n} is a rank-k matrix with k ≪ n. Similarly, A^H = B^H C^H is a rank-k representation of the aligned high-resolution light field. The rank of a matrix is defined as the maximum number of linearly independent column vectors in the matrix. Moreover, the linearly dependent column vectors of a matrix can be reconstructed using a weighted summation of the linearly independent column vectors of the same matrix. This leads us to decompose A^L into two sub-matrices: Ă^L ∈ R^{m,k}, which groups the linearly independent column vectors of A^L, and Â^L ∈ R^{m,n−k}, which groups the linearly dependent column vectors of A^L. In practice, we decompose the rank-k matrix A^L using QR decomposition (i.e. A^L = QR). The indices of the linearly independent components of A^L then correspond to the indices of the non-zero diagonal elements of the upper-triangular matrix R. We then use the same indices to decompose A^H into sub-matrices Ă^H and Â^H. The matrix Â^L can be reconstructed as a linear combination of Ă^L, where the weight matrix W is computed using

W = (Ă^L⊤ Ă^L)† Ă^L⊤ Â^L   (6)

where (·)† stands for the pseudo-inverse operator. We assume here that the weight matrix W, which is optimal in terms of least squares to reconstruct Â^L, is also suitable to reconstruct Â^H.

Driven by the recent success of deep learning in the field of single-image [27], [28] and light field super-resolution [10], [11], we use a DCNN to model the upscaling function that minimizes the following objective function

(1/2) ‖Ă^H − f(Ă^L)‖²   (7)

where f(·) is a function modelled by the DCNN illustrated in Figure 4, which has ten convolutional layers. The linearly independent sub-matrix Ă^L is passed through a stack of convolutional and rectified linear unit (ReLU) layers. We use a convolution stride of 1 pixel with no padding nor spatial pooling. The first convolutional layer has 64 filters spanning all k input channels, while the last layer, which is used to reconstruct the high-resolution light field, employs k filters. All the other layers use 64 filters, which are initialized using the method in [29]. The DCNN was trained using a total of 200,000 random patch-volumes with k channels, extracted from the low- and high-resolution rank-k approximations of the 98 light fields from the EPFL, INRIA and HCI datasets.² The Titan GTX1080Ti Graphical Processing Unit (GPU) was used to speed up the training process.

During the evaluation phase, we estimate the super-resolved linearly independent representation of the light field Ă^H = f(Ă^L). We then estimate the super-resolved linearly dependent part of the light field using

Â^H = Ă^H W   (8)

The super-resolved low-rank representation of the aligned light field is then derived by the union of the two matrices Â^H and Ă^H, i.e. A^H = Â^H ∪ Ă^H. The super-resolved light field is then reconstructed using

Î^H_Γ = A^H + E^L   (9)

2. It must be noted that none of the light fields used for validation were used for training.
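A sketch of the independent/dependent column split and of the weight matrix in Eq. (6). Column-pivoted QR is used here as a numerically robust stand-in for reading the non-zero diagonal of R; the rank tolerance and all names are assumptions.

```python
import numpy as np
from scipy.linalg import qr, pinv

def split_and_weights(A):
    """Split the rank-k matrix A into its linearly independent columns (A_ind)
    and the remaining ones (A_dep), then compute the least-squares weight
    matrix W of Eq. (6) so that A_dep is approximated by A_ind @ W."""
    _, R, piv = qr(A, pivoting=True)
    k = np.sum(np.abs(np.diag(R)) > 1e-10 * np.abs(R[0, 0]))   # numerical rank
    ind, dep = piv[:k], piv[k:]
    A_ind, A_dep = A[:, ind], A[:, dep]
    W = pinv(A_ind.T @ A_ind) @ A_ind.T @ A_dep                # Eq. (6)
    return A_ind, A_dep, W, ind, dep
```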
[Fig. 3, top row: plots of RMSE versus rank for RASL, HLRA and SIFT Flow + LRA. Bottom row: principal basis images for PCA, HLRA and SIFT Flow + LRA.]

Fig. 3: These figures show how the error between the low-rank and full-rank representations varies at different ranks. It can be seen that using optical flow to align the light field followed by low-rank approximation attains the best performance. The images in the second row show the principal basis derived using the different methods; the sharper the principal basis, the more information is being captured in it.

Fig. 4: The proposed network structure, which receives a low-resolution light field and restores it using the proposed DCNN.
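A PyTorch sketch of the ten-layer network of Fig. 4. The 3 × 3 spatial filter size is an illustrative assumption (the exact sizes are not stated above); the channel counts, stride and the absence of padding and pooling follow the description in section 3.2.

```python
import torch
import torch.nn as nn

class LFEmbeddingNet(nn.Module):
    """Ten-layer DCNN restoring the k-channel light field embedding.
    Stride 1, no padding, no spatial pooling; 3x3 kernels are assumed."""
    def __init__(self, k, features=64, depth=10, ksize=3):
        super().__init__()
        layers = [nn.Conv2d(k, features, ksize), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, ksize), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, k, ksize)]   # reconstructs the k channels
        self.body = nn.Sequential(*layers)

    def forward(self, x):                           # x: (batch, k, H, W)
        return self.body(x)

# Usage sketch: net = LFEmbeddingNet(k=4)
# Training loss per Eq. (7): 0.5 * ((target - net(inp)) ** 2).sum()
```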
3.3 Light Field Reconstruction

The restored aligned light field Î^H_Γ has all sub-aperture images aligned with the centre view. A naïve approach to recover the original disparities of the restored sub-aperture images is to use forward warping. However, as can be seen in the first column of Figure 5a, forward warping is not able to restore all pixels and results in a number of cracks or holes corresponding to occlusions (marked in green). One can use either bicubic interpolation or inverse warping to fill the holes. However, in case of occlusions, the neighbouring pixels may not be well correlated with the missing information, which often results in inaccurate estimations (see Figure 5a, second column). More advanced inpainting algorithms [30], [31] can be used to restore each hole separately. However, these methods do not exploit the light field structure and therefore provide inconsistent reconstructions of the same spatial region at different angular views.
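The sketch below illustrates why forward warping produces such cracks: once target coordinates are rounded to the pixel grid, some positions receive several source pixels while others receive none. All names are hypothetical.

```python
import numpy as np

def forward_warp(view, u, v):
    """Forward-warp a view by the per-pixel flow (u, v), keeping a mask of
    pixels that received no value (the green 'cracks' of Fig. 5a)."""
    X, Y = view.shape
    out = np.zeros_like(view)
    hit = np.zeros((X, Y), dtype=bool)
    xs, ys = np.meshgrid(np.arange(X), np.arange(Y), indexing="ij")
    xt = np.clip(np.round(xs + u).astype(int), 0, X - 1)
    yt = np.clip(np.round(ys + v).astype(int), 0, Y - 1)
    out[xt, yt] = view          # later writes win where targets collide
    hit[xt, yt] = True
    return out, ~hit            # ~hit marks the holes to be inpainted
```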
Fig. 5: Filling the missing information caused by occlusion. (a) Inpainting the cracks marked in green; the columns show, from left to right, forward warping, bicubic interpolation and EPI diffusion. (b) Diffusion-based inpainting.

In this work, we use a diffusion-based inpainting algorithm that estimates the missing pixels by diffusing information available in other views. Similar to the work in [32], we exploit the EPI structure to diffuse information along the dominant orientation of the EPI. However, instead of predicting the orientation of unknown pixels from their spatial neighborhood as done in [32], we exploit the similarity between the low-resolution and super-resolved EPIs and use the structure tensor computed on the low-resolution EPI to guide the inpainting process in the high-resolution EPI.

Without loss of generality we consider the EPI where the dimensions y* and t* are fixed; the case of vertical slices is analogous. We first compute the structure tensor of the low-resolution EPI ε^L_{y*,t*} at coordinates (x, s) using

T(x, s) = ∇ε^L_{y*,t*}(x, s) ∇ε^L_{y*,t*}(x, s)⊤   (10)

where ∇ stands for the first-order gradient computed using the Sobel kernel. The authors in [32] compute an average weighting of the columns of T(x, s) to derive the dominant orientation, where the weights are given by an anisotropy measure. Nevertheless, the anisotropy measure may fail in smooth regions, and there the weighted average may fail to estimate the dominant direction of the EPI. This becomes an issue when computing these orientations on low-resolution versions of the light fields. Instead, we estimate the orientation at every pixel in the EPI by computing the eigen-decomposition of T(x, s) and choosing the direction θ(x, s) which corresponds to the eigenvector with the smallest eigenvalue. Moreover, driven by the observation that the disparities in a light field are typically small, and considering that the local slope in the EPI is proportional to the disparity, it is reasonable to assume that slopes which correspond to large disparities are less probable to occur. Therefore, to ensure that the tensor-driven diffusion is performed along a single coherent direction per column of the EPI and to reduce noise, the dominant orientation θ(x) is computed as the column-wise median of the values θ(x, s) whose orientation is in the range [π − α, π + α] radians. In all our experiments we set α = π/. While the dominant orientation vectors are less noisy, we further reduce the noise by applying Total Variation (TV-L1) denoising [33] on the orientation field θ(x), which minimizes

θ̂ = arg min_{θ̂} ‖∇θ̂‖₁ + λ‖θ̂ − θ‖²₂   (11)

where λ is the total variation regularization parameter, set to 0.5 in our experiments. Figure 5b (top) shows the EPI of the low-resolution light field ε^L_{y*,t*} and the dominant orientations θ̂(x) marked by blue arrows.
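A sketch of the orientation estimation of Eq. (10), using the closed-form eigen-decomposition of the 2 × 2 structure tensor; the angle-range filtering and the TV denoising of Eq. (11) are omitted here, and all names are assumptions.

```python
import numpy as np
from scipy.ndimage import sobel

def epi_orientations(epi):
    """Per-pixel isophote orientation of an EPI from the structure tensor of
    Eq. (10): the eigenvector with the smallest eigenvalue follows the
    isophote lines. No tensor smoothing is applied in this sketch."""
    gx, gs = sobel(epi, axis=0), sobel(epi, axis=1)   # Sobel gradients
    txx, tss, txs = gx * gx, gs * gs, gx * gs         # entries of T(x, s)
    # dominant eigenvector angle of [[txx, txs], [txs, tss]], rotated by
    # pi/2 to obtain the smaller-eigenvalue (isophote) direction
    return 0.5 * np.arctan2(2 * txs, txx - tss) + np.pi / 2

def dominant_per_column(theta):
    """Column-wise median of the per-pixel orientations (one angle per x)."""
    return np.median(theta, axis=1)

# Usage sketch: theta_dom = dominant_per_column(epi_orientations(epi_lr))
```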
The restored EPI ε^H_{y*,t*} := Î^H(x, y*, s, t*) (see Figure 5b, bottom) has a number of missing pixels (marked in green). Consider that the hole we want to inpaint has coordinates (x_p, s_p). The aim of the proposed diffusion-based inpainting algorithm is to propagate known pixels along the orientation θ̂(x_p) to fill the missing pixels. The diffusion over the EPI ε^H_{y*,t*} evolves as

∂ε^H_{y*,t*}/∂s = Tr(θ̂(x_p) θ̂(x_p)⊤ H(x_p, s_p))   (12)

where Tr(·) stands for the trace operator and H(x, s) denotes the Hessian of ε^H_{y*,t*} at coordinates (x, s). The term θ̂(x_p)θ̂(x_p)⊤ is used to enforce the diffusion to occur only in the direction of the isophote eigenvector. The missing pixels are restored iteratively by finding the solution to (12) which is closest to zero. Figure 5a (third column) shows the results attained using the proposed inpainting strategy.

In order to restore all the cracks in the light field, we first fix t* to that of the centre view and iteratively restore all the horizontal EPIs for all y* ∈ [1, Y] by solving Eq. (12). This corresponds to filling the cracks for the centre row of the matrix of sub-aperture images. We then fix s* to that of the centre view and iteratively restore all the vertical EPIs for all x* ∈ [1, X], which effectively restores all cracks for the centre column of the matrix of sub-aperture images. The remaining propagations are performed row-by-row, where each time we restore all pixels within t* ∈ [1, Q].

3.4 Iterative Back-Projection

One problem with the method proposed in this paper is that after restoring the aligned light field Î^H_Γ we have to apply inverse warping to restore the original disparities. However, inverse warping is not able to recover occluded regions, and some pixels are displaced by ±1 pixel due to rounding errors. While the former problem is solved using the method described in Section 3.3, the second problem is not addressed there. Nevertheless, the results illustrated in Tables 1, 2 and Figure 6 indicate that significant performance gains can be achieved even if we do not explicitly cater for distortions caused by rounding errors in the inverse warping process. These distortions can be corrected using the classical method of iterative back-projection [34], which is adopted by several single-image super-resolution methods (see [35]) to ensure that the downsampled version of the super-resolved light field is consistent with the observed low-resolution light field. The IBP algorithm iteratively refines the estimated high-resolution light field Ī^H_κ at iteration κ by first back-projecting it onto an estimated low-resolution light field Ī^L_κ using

Ī^L_κ = ↑_α(↓_α B Ī^H_κ)   (13)

where ↓_α is a downsampling operator, ↑_α is the bicubic upscaling operation, α is the magnification factor and B is the blurring kernel. The deviation between the LR views found by back-projection and the original LR views is then used to further correct each estimated HR view of the light field as

Ī^H_{κ+1} = Ī^H_κ + (I^L − Ī^L_κ)   (14)

The IBP algorithm is initialized by setting Ī^H_0 = Î^H, and the iterative procedure terminates when κ = K. It was observed that significant improvements are achieved in the first few iterations, and we therefore set K = 10 in our experiments. The restored light field following iterative back-projection is therefore set to Ī^H = Ī^H_K.
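A minimal sketch of Eqs. (13)-(14) for a single view, assuming the view dimensions are divisible by α and that the observed LR view has already been bicubic-upscaled to the target resolution, as assumed throughout the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def ibp_refine(sr_view, lr_view_up, alpha=2, sigma=1.6, iters=10):
    """Iterative back-projection for one view: re-degrade the current
    estimate, upscale it, and add the residual back (Eqs. (13)-(14)).
    K = 10 iterations as in the paper; blur parameters follow section 4."""
    hr = sr_view.copy()
    for _ in range(iters):
        lr_sim = gaussian_filter(hr, sigma)[::alpha, ::alpha]   # down-project
        lr_sim_up = zoom(lr_sim, alpha, order=3)                # Eq. (13)
        hr = hr + (lr_view_up - lr_sim_up)                      # Eq. (14)
    return hr
```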
4 EXPERIMENTAL RESULTS

The experiments conducted in this paper use both synthetic and real-world light fields from publicly available datasets. We use 98 light fields from the EPFL [36], INRIA³ and HCI⁴ datasets for training. We conducted the tests using light fields from the INRIA and Stanford⁵ datasets. We use the Stanford dataset in this evaluation since it has disparities significantly larger than both the INRIA and EPFL light fields, which were captured using plenoptic cameras. Moreover, unlike the HCI dataset, the Stanford light fields capture real-world objects. We therefore use this dataset to assess the generalization ability of the algorithms considered in this experiment to light fields captured using camera sensors which differ from the ones used for training. While the sub-aperture images of the EPFL, HCI and Stanford datasets are directly available, the light fields in the INRIA dataset were decoded using the method in [37], as mentioned on their website. In all our experiments we consider a × array of sub-aperture images. For computational purposes, the high-resolution images of the Stanford dataset were downscaled such that the lowest dimension is set to 400 pixels. The high-resolution images of the other datasets were kept unchanged, i.e. × for the HCI light fields and × for both the EPFL and INRIA light fields. Unless otherwise specified, the low-resolution light fields were generated by blurring each high-resolution sub-aperture image with a Gaussian filter using a window size of 7 and standard deviation of 1.6, down-sampling to the desired resolution and up-scaling back to the target resolution using bicubic interpolation. Unless otherwise specified, the iterative back-projection refinement strategy was disabled to permit a fair comparison with the other state-of-the-art super-resolution methods considered.

3. INRIA dataset: https://goo.gl/st8xRt
4. HCI dataset: http://hci-lightfield.iwr.uni-heidelberg.de/
5. Stanford dataset: http://lightfield.stanford.edu/
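For reference, a sketch of the PSNR figure reported in Tables 1-4, averaged over the sub-aperture images (the uniform averaging convention is our assumption):

```python
import numpy as np

def lf_psnr(lf_ref, lf_rec, peak=255.0):
    """Mean PSNR over all sub-aperture images of a restored light field."""
    psnrs = []
    for ref, rec in zip(lf_ref, lf_rec):
        mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
        psnrs.append(10 * np.log10(peak ** 2 / mse))
    return float(np.mean(psnrs))
```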
We compare the performance of our system against the best performing methods found in our recent work [5], namely the CNN-based light field super-resolution algorithm (LF-SRCNN) [11] and both linear subspace projection based methods, PCA+RR and BM+PCA+RR [5]. These methods were retrained on samples from the 98 training light fields mentioned above using the training procedures explained in their respective papers. Moreover, given that the very deep super-resolution (VDSR) method [28] achieved state-of-the-art performance on single image super-resolution, we apply this method to restore every sub-aperture image independently. It is important to mention here that in our previous work we found that BM+PCA+RR significantly outperforms several other light field and single-image super-resolution algorithms, including [8], [10], [27], [38], [39], [40], [41]. Due to space constraints we did not provide comparisons against the latter approaches.

The results in Tables 1 and 2 compare these super-resolution methods in terms of PSNR for magnification factors of × and × respectively. Moreover, the results in Figure 6 show the centre views of light fields restored using these methods when considering a magnification factor of ×. The VDSR algorithm [28] achieves on average a PSNR gain of 0.33 dB over bicubic interpolation. One major limitation of VDSR is that it does not exploit the light field structure, since each sub-aperture image is restored independently. The PCA+RR algorithm [5] manages to restore more texture detail and is particularly effective at restoring light fields with small disparities, which is the case for the INRIA light fields. This can be attributed to the fact that PCA+RR does not compensate for disparities and is therefore not able to generalize to light fields containing disparities which were not considered during training, which is the case for the Stanford light fields. In fact, the Stanford light fields restored using PCA+RR contain blocking artefacts. The BM+PCA+RR method [5] extends this method by aligning the patch-volumes using block-matching. This results in a more generalized method that achieves average PSNR gains of 2.52 dB and 1.76 dB over bicubic on the INRIA and Stanford light fields respectively. Nevertheless, the light fields restored using this method may contain some artefacts, especially near the edges.

The LF-SRCNN method [11], which uses deep learning to restore each sub-aperture image independently, was found to achieve a marginal gain over BM+PCA+RR (0.05 dB). While light fields restored using LF-SRCNN generally contain fewer artefacts compared to BM+PCA+RR, they risk being angularly incoherent. Our method achieves the best performance, with an overall gain of 0.2 dB over the second-best performing algorithm LF-SRCNN. This performance gain is more evident in the subjective results illustrated in Figure 6, where the restored light fields are sharper and visually more pleasing.

As mentioned in section 3.4, one problem with the proposed method is that the inverse warping is unable to perfectly restore the original disparities of the light field. Nevertheless, the results in Tables 1, 2 and Figure 6 clearly show that our proposed method outperforms the other schemes even without the use of the iterative back-projection refinement strategy.

In order to fairly assess the contribution of iterative back-projection, we apply it as a post-process to the two best performing methods, namely LF-SRCNN and our proposed scheme LR-LFSR. Tables 3 and 4 show the performance of the two best performing methods with and without iterative back-projection as a post-processing step at magnification factors of × and × respectively. It can be immediately noticed that IBP significantly improves the quality of both methods. Nevertheless, our method with iterative back-projection as a post-processing step achieves the best performance, with PSNR gains of 0.41 dB and 0.31 dB at magnification factors of × and × respectively over LF-SRCNN followed by iterative back-projection. It is important to notice that while the performance gain of our method over LF-SRCNN without IBP is around 0.23 dB and 0.12 dB at magnification factors of × and × respectively, this gain roughly doubles when both use IBP as a post-processing step. This indicates that, since LF-SRCNN processes each view independently, IBP only corrects inconsistencies between the low-resolution and restored light fields. Apart from this distortion, LR-LFSR-IBP corrects the distortions caused by the inverse warping process, which provides light fields that are more visually pleasing and with smoother transitions across views. Supplementary multimedia files uploaded on ScholarOne show some sample restored light fields. Supplementary material is available on the project's website⁶ while the code of LR-LFSR will be made available upon publication.

6. https://goo.gl/8DDsDi

REFERENCES
[1] R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, "Light field photography with a hand-held plenoptic camera," Tech. Rep., Apr. 2005. [Online]. Available: http://graphics.stanford.edu/papers/lfcamera/
[2] B. Wilburn, N. Joshi, V. Vaish, E.-V. Talvala, E. Antunez, A. Barth, A. Adams, M. Horowitz, and M. Levoy, "High performance imaging using large camera arrays," ACM Trans. Graph., vol. 24, no. 3, pp. 765–776, Jul. 2005.
[3] G. Wu, B. Masia, A. Jarabo, Y. Zhang, L. Wang, Q. Dai, T. Chai, and Y. Liu, "Light field image processing: An overview," IEEE Journal of Selected Topics in Signal Processing, vol. PP, no. 99, pp. 1–1, 2017.
[4] C.-K. Liang and R. Ramamoorthi, "A light transport framework for lenslet light field cameras," ACM Trans. Graph., vol. 34, no. 2, pp. 16:1–16:19, Mar. 2015.
[5] R. A. Farrugia, C. Galea, and C. Guillemot, "Super resolution of light field images using linear subspace projection of patch-volumes," IEEE Journal of Selected Topics in Signal Processing, vol. PP, no. 99, pp. 1–1, 2017.
[6] A. Levin, W. Freeman, and F. Durand, "Understanding camera trade-offs through a Bayesian analysis of light field projections," in European Conference on Computer Vision (ECCV), Oct. 2008.
[7] T. E. Bishop and P. Favaro, "The light field camera: Extended depth of field, aliasing, and superresolution," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 5, pp. 972–986, May 2012.
[8] K. Mitra and A. Veeraraghavan, "Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior," in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2012, pp. 22–28.
[9] S. Wanner and B. Goldluecke, "Variational light field analysis for disparity estimation and super-resolution," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 606–619, March 2014.
[10] Y. Yoon, H.-G. Jeon, D. Yoo, J.-Y. Lee, and I. S. Kweon, "Learning a deep convolutional network for light-field image super-resolution," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. 57–65.
[11] Y. Yoon, H. G. Jeon, D. Yoo, J. Y. Lee, and I. S. Kweon, "Light-field image super-resolution using convolutional neural network," IEEE Signal Processing Letters, vol. 24, no. 6, pp. 848–852, June 2017.
[12] N. K. Kalantari, T.-C. Wang, and R. Ramamoorthi, "Learning-based view synthesis for light field cameras," ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2016), vol. 35, no. 6, 2016.
[13] G. Wu, M. Zhao, L. Wang, Q. Dai, T. Chai, and Y. Liu, "Light field reconstruction using deep convolutional network on EPI," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[14] T.-C. Wang, J.-Y. Zhu, N. K. Kalantari, A. A. Efros, and R. Ramamoorthi, "Light field video capture using a learning-based hybrid imaging system," ACM Transactions on Graphics (Proceedings of SIGGRAPH), vol. 36, no. 4, 2017.
[15] Y. Wang, Y. Liu, W. Heidrich, and Q. Dai, "The light field attachment: Turning a DSLR into a light field camera using a low budget camera ring," IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 10, pp. 2357–2364, Oct. 2017.
[16] M. Levoy and P. Hanrahan, "Light field rendering," in Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, ser. SIGGRAPH '96. New York, NY, USA: ACM, 1996, pp. 31–42.
[17] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen, "The lumigraph," in Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, ser. SIGGRAPH '96. New York, NY, USA: ACM, 1996, pp. 43–54.
[18] C. Kim, H. Zimmer, Y. Pritch, A. Sorkine-Hornung, and M. Gross, "Scene reconstruction from high spatio-angular resolution light fields," ACM Trans. Graph., vol. 32, no. 4, pp. 73:1–73:12, Jul. 2013.
TABLE 1: PSNR using different light field super-resolution algorithms when considering a magnification factor of ×. For clarity, in the original bold blue marks the highest and bold red the second highest score.

Light Field Name        | Bicubic | PCA+RR [5] | BM+PCA+RR [5] | LF-SRCNN [11] | VDSR [28] | Proposed
Bee 2 (INRIA)           | 30.1673 | —          | —             | —             | —         | —
Duck (INRIA)            | 23.5394 | —          | —             | —             | —         | —
Framed (INRIA)          | 27.5974 | —          | —             | —             | —         | —
Fruits (INRIA)          | 28.4907 | 31.7884    | —             | —             | —         | —
Mini (INRIA)            | 27.6332 | —          | —             | —             | —         | —
Rose (INRIA)            | 33.5436 | —          | —             | —             | —         | —
Chess (STANFORD)        | 30.2895 | 31.9292    | —             | —             | —         | —
Eucalyptus (STANFORD)   | 30.7865 | 32.4162    | —             | —             | —         | —
Lego Gantry (STANFORD)  | 27.6235 | 28.0729    | 28.7230       | —             | —         | —
Lego Knights (STANFORD) | 27.3794 | 27.8457    | —             | —             | —         | —
TABLE 2: PSNR using different light field super-resolution algorithms when considering a magnification factor of ×. For clarity, in the original bold blue marks the highest and bold red the second highest score.

Light Field Name        | Bicubic | PCA+RR [5] | BM+PCA+RR [5] | LF-SRCNN [11] | VDSR [28] | Proposed
Bee 2 (INRIA)           | 27.8623 | 31.2436    | 31.1886       | —             | —         | —
Dist. Church (INRIA)    | 23.3138 | —          | —             | —             | —         | —
Duck (INRIA)            | 22.0702 | 23.9844    | 23.8083       | —             | —         | —
Framed (INRIA)          | 26.1627 | 28.0771    | —             | —             | —         | —
Fruits (INRIA)          | 26.5269 | 29.0908    | 29.1969       | —             | —         | —
Mini (INRIA)            | 26.3035 | —          | —             | —             | —         | —
Rose (INRIA)            | 31.7687 | —          | —             | —             | —         | —
Chess (STANFORD)        | 27.3679 | 29.5207    | 29.6773       | —             | —         | —
Eucalyptus (STANFORD)   | 29.2136 | —          | —             | —             | —         | —
Lego Knights (STANFORD) | 24.0182 | 24.4944    | 25.5768       | —             | —         | —
TABLE 3: PSNR obtained with the best two methods at a magnification of × with and without iterative back-projection as a post-process.

Light Field Name        | Bicubic | LF-SRCNN | LF-SRCNN-IBP | LR-LFSR | LR-LFSR-IBP
Bee 2 (INRIA)           | 30.1673 | 33.8268  | 34.7257      | 33.4915 | —
Dist. Church (INRIA)    | 24.3059 | 25.9930  | 26.7415      | 26.7502 | —
Duck (INRIA)            | 23.5394 | 26.0713  | 27.293       | 26.3777 | —
Framed (INRIA)          | 27.5974 | 30.0697  | 31.0592      | 30.5365 | —
Fruits (INRIA)          | 28.4907 | 31.6820  | 32.6581      | 32.0789 | —
Mini (INRIA)            | 27.6332 | 29.8941  | 30.5193      | 30.1601 | —
Rose (INRIA)            | 33.5436 | 36.8245  | 37.4991      | 36.8416 | —
Amethyst (STANFORD)     | 30.5227 | 32.2953  | 33.5662      | 32.2737 | —
Bracelet (STANFORD)     | 26.4662 | 28.8858  | 30.6367      | 29.4046 | —
Chess (STANFORD)        | 30.2895 | 32.1922  | 33.6715      | 32.6123 | —
Eucalyptus (STANFORD)   | 30.7865 | 32.1989  | 33.0700      | 32.6205 | —
Lego Gantry (STANFORD)  | 27.6235 | 29.8086  | 31.5071      | 29.8112 | —
Lego Knights (STANFORD) | 27.3794 | 29.5354  | 31.2148      | 29.3177 | —

TABLE 4: PSNR obtained with the best two methods at a magnification of × with and without iterative back-projection as a post-process.

Light Field Name        | Bicubic | LF-SRCNN | LF-SRCNN-IBP | LR-LFSR | LR-LFSR-IBP
Bee 2 (INRIA)           | 27.8623 | 31.3945  | 32.4662      | 31.2545 | —
Dist. Church (INRIA)    | 23.3138 | 24.5874  | —            | —       | —
Framed (INRIA)          | 26.1627 | 27.9157  | 28.9111      | 28.2954 | —
Fruits (INRIA)          | 26.5269 | 29.2100  | 30.1651      | 29.5297 | —
Mini (INRIA)            | 26.3035 | 28.1731  | 28.6895      | 28.4009 | —
Rose (INRIA)            | 31.7687 | 34.3064  | 34.9356      | 34.3392 | —
Amethyst (STANFORD)     | 29.0665 | 30.5971  | 31.5991      | 30.4628 | —
Bracelet (STANFORD)     | 24.1221 | 26.1013  | 27.0010      | 26.2712 | —
Chess (STANFORD)        | 27.3679 | 30.0279  | 31.1502      | 30.1485 | —
Eucalyptus (STANFORD)   | 29.2136 | 30.9433  | 31.5263      | 31.1772 | —
Lego Gantry (STANFORD)  | 24.6721 | 26.8054  | 27.8395      | 26.9466 | —
Lego Knights (STANFORD) | 24.0182 | 26.0771  | 27.6550      | 26.1358 | —

[19] Y. Taguchi, A. Agrawal, S. Ramalingam, and A. Veeraraghavan, "Axial light field for curved mirrors: Reflect your perspective, widen your view," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010, pp. 499–506.
[20] A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, and J. Tumblin, "Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing," ACM Trans. Graph., vol. 26, no. 3, Jul. 2007.
[21] X. Jiang, M. L. Pendu, R. A. Farrugia, and C. Guillemot, "Light field compression with homography-based low rank approximation," IEEE Journal of Selected Topics in Signal Processing, vol. PP, no. 99, pp. 1–1, 2017.
[22] Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, "RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010, pp. 763–770.
[23] B. K. Horn and B. G. Schunck, "Determining optical flow," Cambridge, MA, USA, Tech. Rep., 1980.
[24] C. Liu, J. Yuen, and A. Torralba, "SIFT Flow: Dense correspondence across scenes and its applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 978–994, May 2011.
[25] Y. Hu, R. Song, and Y. Li, "Efficient coarse-to-fine PatchMatch for large displacement optical flow," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. 5704–5712.
[26] Y. Li, D. Min, M. S. Brown, M. N. Do, and J. Lu, "SPM-BP: Sped-up PatchMatch belief propagation for continuous MRFs," in IEEE International Conference on Computer Vision (ICCV), Dec. 2015.

Fig. 6: Restored center view using different light field super-resolution algorithms (columns from left to right: Original, Bicubic, PCA+RR [5], BM+PCA+RR [5], LF-SRCNN [11], VDSR [28], Proposed). Best viewed in color and by zooming on the views.
[27] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 295–307, Feb. 2016.
[28] J. Kim, J. K. Lee, and K. M. Lee, "Accurate image super-resolution using very deep convolutional networks," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. 1646–1654.
[29] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS-10).
[30] A. Criminisi, P. Pérez, and K. Toyama, "Region filling and object removal by exemplar-based image inpainting," IEEE Trans. Image Process., vol. 13, no. 9, pp. 1200–1212, Sep. 2004.
[31] Z. Xu and J. Sun, "Image inpainting by patch propagation using patch sparsity," IEEE Transactions on Image Processing, vol. 19, no. 5, pp. 1153–1165, May 2010.
[32] O. Frigo and C. Guillemot, "Epipolar plane diffusion: An efficient approach for light field editing," in British Machine Vision Conference (BMVC), London, U.K., Sep. 2017.
[33] L. I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D: Nonlinear Phenomena, vol. 60, no. 1–4, pp. 259–268, 1992.
[34] M. Irani and S. Peleg, "Improving resolution by image registration," CVGIP: Graphical Models and Image Processing, vol. 53, no. 3, pp. 231–239, 1991.
[35] D. Glasner, S. Bagon, and M. Irani, "Super-resolution from a single image," in IEEE International Conference on Computer Vision (ICCV), Sept. 2009, pp. 349–356.
[36] M. Rerabek and T. Ebrahimi, "New light field image dataset," in Int. Conf. on Quality of Multimedia Experience (QoMEX), 2016.
[37] D. G. Dansereau, O. Pizarro, and S. B. Williams, "Decoding, calibration and rectification for lenselet-based plenoptic cameras," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013, pp. 1027–1034.
[38] T. Peleg and M. Elad, "A statistical prediction model based on sparse representations for single image super-resolution," IEEE Transactions on Image Processing, vol. 23, no. 6, pp. 2569–2582, June 2014.
[39] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, Feb. 2016.
[40] K. Zhang, B. Wang, W. Zuo, H. Zhang, and L. Zhang, "Joint learning of multiple regressors for single image super-resolution," IEEE Signal Processing Letters, vol. 23, no. 1, pp. 102–106, Jan. 2016.
[41] R. Timofte, V. De, and L. V. Gool, "Anchored neighborhood regression for fast example-based super-resolution," in IEEE International Conference on Computer Vision (ICCV), Dec. 2013, pp. 1920–1927.
5 CONCLUSION

In this paper, we have proposed a novel spatial light field super-resolution algorithm able to reconstruct high quality coherent light fields. We have shown that the information in a light field can be efficiently compacted by aligning the sub-aperture images using optical flow followed by low-rank matrix approximation. The low-rank approximation of the aligned light field gives an embedding in a lower dimensional space which is super-resolved using deep learning. All aligned views of the high-resolution light field can be reconstructed from the super-resolved embedding by simple linear combinations. These views are then inverse warped to restore the disparities of the original light field. Holes corresponding to dis-occlusions or cracks resulting from the inverse warping are filled in using a novel diffusion-based inpainting algorithm which diffuses known pixels in the EPI along dominant orientations computed in the low-resolution EPI.

Extensive simulations show that the proposed method manages to generalize well, i.e. it successfully restores light fields whose disparities are considerably different from those used during training. These results also show that our proposed method is competitive with, and most of the time superior to, existing state-of-the-art light field super-resolution algorithms, including a recent approach which adopts deep learning to restore each sub-aperture image independently. One major limitation of the proposed scheme is that the inverse warping process is not able to perfectly restore the original disparities and produces some distortion caused by rounding errors. We proposed here to use classical iterative back-projection as a post-processing step. Simulation results clearly show the benefit of applying IBP to the super-resolved light field and demonstrate that the proposed method with IBP achieves the best performance, outperforming LF-SRCNN followed by IBP by 0.4 dB.

ACKNOWLEDGMENTS
This project has been supported in part by the EU H2020 Re-search and Innovation Programme under grant agreementNo 694122 (ERC advanced grant CLIM).
Reuben A. Farrugia (S'04, M'09, SM'17) received the first degree in Electrical Engineering from the University of Malta, Malta, in 2004, and the Ph.D. degree from the University of Malta, Malta, in 2009. In January 2008 he was appointed Assistant Lecturer with the same department and is now a Senior Lecturer. He has served on technical and organizational committees of several national and international conferences. In particular, he served as General Chair of the IEEE Int. Workshop on Biometrics and Forensics (IWBF) and as Technical Programme Co-Chair of the IEEE Visual Communications and Image Processing (VCIP) conference in 2014. He has been contributing as a reviewer for several journals and conferences, including IEEE Transactions on Image Processing, IEEE Transactions on Circuits and Systems for Video Technology and IEEE Transactions on Multimedia. In September 2013 he was appointed National Contact Point of the European Association of Biometrics (EAB).