PlenoptiCam v1.0: A light-field imaging framework
11 PlenoptiCam v1.0: A light-field imaging framework
Christopher Hahne and Amar AggounUniversity of Wolverhampton, School of Mathematics and Computer ScienceWulfruna Street, WV1 1LY Wolverhampton, United Kingdom { c.hahne, a.aggoun } @wlv.ac.uk Abstract —Light-field cameras play a vital role for rich 3-Dinformation retrieval in narrow range depth sensing applications.The key obstacle in composing light-fields from exposures takenby a plenoptic camera is to computationally calibrate, re-alignand rearrange four-dimensional image data. Several attemptshave been proposed to enhance the overall image quality bytailoring pipelines dedicated to particular plenoptic cameras andimproving the color consistency across viewpoints at the expenseof high computational loads. The framework presented hereinadvances prior outcomes thanks to its cost-effective color equal-ization from parallax-invariant probability distribution transfersand a novel micro image scale-space analysis for generic cameracalibration independent of the lens specifications. Our frameworkcompensates for hot-pixels, resampling artifacts, micro imagegrid rotations just as vignetting in an innovative way to enablesuperior quality in sub-aperture image extraction, computationalrefocusing and Scheimpflug rendering with sub-sampling capa-bilities. Benchmark comparisons using established image metricssuggest that our proposed pipeline outperforms state-of-the-arttool chains in the majority of cases. The software described inthis paper is released under an open-source license offering cross-platform compatibility, few dependencies and a lean graphicaluser interface to make the reproduction of results and theexperimentation with plenoptic camera technology convenient forpeer researchers, developers, photographers, data scientists andeveryone else working in this field.
Index Terms —plenoptic, light-field, calibration, color transfer
I. I
NTRODUCTION T HERE is a growing body of literature in the fields ofexperimental photography [1], [2], medical imaging [3],[4], [5], [6] and machine learning [7] recognizing capabilitiesoffered by light-fields. The probably most influential light-field model in computer graphics was devised by Levoy andHanrahan [8] who described a light-field to be a collection ofray vectors piercing through two planes stacked behind oneanother. The intersections of ray vectors at the two planesmake up four Cartesian coordinates, which is why light-fieldsare often referred to as four-dimensional (4-D). In their muchcelebrated paper, Levoy and Hanrahan demonstrate that thetwo planes serve as the spatial and angular image domainproviding two different light-field representations that can betransferred into each other. Such light-field transformationcorresponds to changing the sequential order of the two planes.Capturing a light-field from a monocular lens is achievedwith a single sensor stacked behind an aperture grid [9] suchas a Micro Lens Array (MLA). This optical setup is known asa plenoptic camera and can be thought of as accommodatingan array of consistently spaced virtual cameras located at the aperture plane [10]. As opposed to light-fields obtained froman array of multiple cameras, captures from the plenopticcamera need to undergo additional processing to be repre-sented as such [11], [12]. The software described in this paperconsolidates preliminary state-of-the-art decoding procedures,while introducing innovations with regards to calibration,resampling, color management and refocusing.Until today, the landscape of software tools treating plenop-tic content has been characterized by heterogeneity. One ofthe most mature and influential software applications wasreleased by camera manufacturer Lytro in 2012. Lytro’s imageprocessing pipeline remains closed-source and is not publiclymaintained since the company shut down business in 2018.In the earlier days of Lytro’s lifetime, independent program-mers developed binary file decoders by reverse-engineeringLytro’s file formats [13], [14]. These tools, however, areunable to perform light-field rendering functionalities suchas refocusing. Several scientists published methods on howto algorithmically calibrate Lytro’s plenoptic camera withinMatlab [11], [12], [15]. These methods are based on thegiven metadata while concentrating on the micro image centerdetection, 4-D rearrangement and rectification of radial lensdistortions. More recently, research has taken the directionto successfully recover physical information at light-fieldboundaries [16], [17]. While these studies revealed convincingresults, obtaining light-field color consistency as proposed byMatysiak et al. [17] requires high computational resources.Unlike afore-mentioned attempts, P
LENOPTI C AM is notlimited to Lytro file data, but is also capable of handlingfootage from custom-built plenoptic cameras through its microimage scale estimation. Other innovative features of P LENOP - TI C AM include a novel and rapid color correction across sub-aperture images, de-vignetting based on least-squares fitting, afirst hexagonal artifact removal, and alternatives to resamplingand the much popularized refocusing technique, which enablesfocus manipulation and Scheimpflug rendering of a pictureafter the fact. These innovations help leverage P LENOPTI C AM to accomplish outstanding results, which are validated in thisstudy with standard image metrics. Equipped with generic cali-bration, decoding and rendering routines, P LENOPTI C AM maybuild the foundation for future research on light-field imagealgorithms. Novel research goals can be achieved more easilyas image scientists can adapt code and focus on their individualidea. The absence of powerful and heavy-weight librariesis intended to keep the dependency list short. Nonetheless,additional libraries can be used as an extension, e.g. to rectifylens distortions and extract depth at a post-processing stage. a r X i v : . [ ee ss . I V ] O c t Fig. 1:
Package architecture with gray blocks representing modules whereas orange blocks indicate essential top-level classes.Note that P
LENOPTI C AM can be used in conjunction with thegeometry tool P LENOPTI S IGN [18] to pinpoint metric objectpositions in a light-field captured by a plenoptic camera.The organisation of this paper begins with an introductionto the architectural design of the presented software in Sec-tion II, which can be regarded as a roadmap for Section III.Subsections contained in Section III then give insightful detailson algorithmic aspects with respect to plenoptic image cali-bration, sub-aperture extraction and refocusing. A benchmarkcomparison of results rendered by P
LENOPTI C AM and othertool chains is carried out in Section IV. Section V drawsconclusions, while reflecting on the framework’s potentialimpact and sketching out ideas for future work.II. S OFTWARE D ESCRIPTION
Usability and reproducibility of the proposed pipeline havebeen given the same priority as the enhancement of theimage quality. Access is provided for a variety of experi-ence levels. P
LENOPTI C AM offers the possibility to use itsApplication-Programming-Interface (API), a Graphical-User-Interface (GUI) just as the classical Command-Line-Interface(CLI) for sake of comfort when processing a batch of images.Quick API tests can be made in Jupyter Notebook usingBinder [19]. For CLI usage, installation is made possible viathe package installer pip . At this point, we may refer theinterested reader to use the help option via plenopticam -h for more information on CLI commands. The GUI is madeof TKINTER as part of the Python distribution. Configurationsettings with several options are documented in the “Help”section in the GUI menu. When feeding Lytro image data,the white image calibration file is selected automaticallyfrom tar-archives that correspond to the provided camera file.Instructions on how to generate a tar-archive are found inthe documentation. Until now, all code of P
LENOPTI C AM is written in Python and structured in an object-orientedmanner. An architectural scheme of the processing pipelineis sketched in Fig. 1. Objects of a P LENOPTICAM C ONFIG and P
LENOPTICAM S TATUS class are thought to be singletons and shared across modules. The processing direction of inputimage data is from left to right (on module-level) and topto bottom (on class-level) as indicated by the arrows. Notethat the diagram in Fig. 1 can be regarded as a roadmap forthis section in which the core functionality of each module ispresented. III. M
ODULE F UNCTIONALITIES
A. LfpReader
The L FP R EADER module supports standard image file types( tiff, bmp, jpg, png ) and the more specific raw Lytro file typedecoding ( lfp, lfr, raw ) for Bayer image composition accordingto the findings by Nirav Patel [13]. For raw data from a Lytrocamera, this module exports an image as a tiff file in Bayerrepresentation as well as a json file containing correspondingmetadata which is used for gamma correction. Other image filetypes are expected to be in sRGB space for gamma handling.
B. LfpCalibrator
The fundamental problem we aim to solve when calibratinga plenoptic camera is to register geometric properties of the4-D micro image representation. This enables a light-field tobe rearranged as if captured by an array of multiple cameraswith consistent spacing [10]. P
LENOPTI C AM introduces anovel and generic plenoptic image calibration pipeline witha sequence of steps shown in Fig. 2.Fig. 2: Calibration pipeline
The entire procedure can be sub-divided into pitch estima-tion, centroid extraction along with optional refinements andcentroid sorting of present micro images. To let the pipelinecover multiple types of plenoptic cameras with varying lensand sensor specifications, a blob detector is employed as a firststep for analysis of the micro image size.
1) PitchEstimator:
Based on a white calibration image I w ( x ) with arbitrary micro images, we adapt the classical scalespace theory [20] for optimized performance using half-octavepyramids [21] denoted by P ( ν, x ) and built via ∀ ν, P ( ν + 1 , x ) = D ( P ( ν, x ))) ∗ ∇ G ( σ, x ) (1)where { ν ∈ N | ν < log (min( K, L )) } and K, L are spatialresolutions of I w ( x ) . Downsampling is represented by D ( · ) considering the half-octave requirement and ∇ G ( σ, x ) is theLaplacian of Gaussian (LoG) convolution kernel, known asthe Mexican hat, which is approximated by the Difference ofGaussians. Each pyramid level at index ν is then a down-scaled LoG-filtered version of the initial white image where P (0 , x ) = I w ( x ) ∗∇ G ( σ, x ) serves as the first level. The level ν (cid:63) exhibits the intensity maximum across all image scales andrelates to its σ (cid:63) and respective micro image size M given by M ∝ σ (cid:63) = 2 ν(cid:63) / (cid:112) mod( ν (cid:63) , , where ν (cid:63) = arg max ν ( P ( ν, x )) (2)and ∝ is the equality up to scale for which the proof is foundin Appendix A. The base-2 terms account for halved resolu-tions and the half-octave representation across pyramid levels,respectively. Section IV-A demonstrates that our automaticdetection of the micro image size M is an important feature formicro image registration and subsequent light-field alignmentprocesses, which enable P LENOPTI C AM to cope with a varietyof different MLA and objective lens specifications.
2) CentroidExtractor:
Initial approximation of Micro Im-age Centers (MICs) is a key task in plenoptic image registra-tion and has been subject of existing research. An early methoddeveloped by Dansereau et al. [12] convolves a white imagefrom a Lytro camera with a dedicated kernel of constant sizeand subsequently analyzes local peaks. Cho et al. [11] andsimilarly Liang et al. [22] iteratively apply a morphologicalerosion operator to white images until a pattern of isolatedmicro image regions is obtained. Although image erosionfacilitates rough centroid detection from intensity maxima, theiterative nature of this approach leaves space for optimizationswith regards to speed and robustness. Unlike the previousdetection methods, we identify MICs as local extrema in aLoG-convoluted image I c ( x ) = I w ( x ) ∗ ∇ G ( σ (cid:63) , x ) by usingNon-Maximum-Suppression (NMS) in the form of the × neighborhood scan. Here the kernel ∇ G ( σ (cid:63) , x ) scales with M and helps carve out micro image peaks as it is responsiveto the detected feature size. The vector x = (cid:8) k, l (cid:9) ∈ Z consists of image coordinates k, l for the horizontal andvertical direction, respectively. We obtain each MIC denotedas c = (cid:8) k, l (cid:9) ∈ Z by employing Pham’s NMS method [23].
3) CentroidRefiner:
It is important to note that the pre-liminary C
ENTROID E XTRACTOR yields centroids located atinteger sensor coordinates. Several studies have shown that integer coordinate centroids induce artifacts when decom-posing a raw plenoptic image to the sub-aperture imagerepresentation of a light-field [11], [12], [22]. For accuratelight-field decomposition, we refine centroids with sub-pixelprecision by taking pixel intensities into account that belongto the same corresponding micro image region R , which istypically in a range of M × M size. To obtain sub-pixel precisecentroids ¯ c = (cid:8) ¯ k, ¯ l (cid:9) ∈ R , we perform coordinate refinementby ¯ k = (cid:88) ( k,l ) ∈ R I c ( k, l ) · k and ¯ l = (cid:88) ( k,l ) ∈ R I c ( k, l ) · l (3)where I c ( k, l ) represents the LoG-convoluted white image.As an alternative to this, we compute area centroids of abinary micro image after applying some threshold such that I c ( k, l ) ∈ { , } which then simplifies Eq. (3) to ¯ k = 1 |R| (cid:88) ( k,l ) ∈ R k and ¯ l = 1 |R| (cid:88) ( k,l ) ∈ R l (4)where | · | denotes the cardinality providing the total numberof elements within a micro image region R above a certainthreshold value. By default, P LENOPTI C AM uses the latterapproach given in Eq. (4) as it proves to be the more genericsolution for white images suffering from noise or saturation.Figure 14(a) shows an example of detected MICs using thismethod.
4) CentroidSorter:
Let C = { ¯ c p | p ∈ N } be a finiteand unordered set of MIC coordinates ¯ c p = { ¯ k, ¯ l } , onlylittle is known about the dimensions and geometric arrange-ment of the micro lenses. To enable generic calibration fordifferent plenoptic camera models, it is thus necessary toexamine fundamental properties of a custom MLA with ar-bitrary dimensions. Assuming that the ratio of image sensordimensions K and L will match the aspect ratio of theMLA, we approximate J as the horizontal number of microlenses via J = (cid:112) | C | × K / L with | C | as the total numberof centroids and similarly the vertical number of micro lenses H = (cid:112) | C | × L / K . From these MLA dimensions, it is possibleto estimate average distances between micro image centers via K / J and L / H which are used to detect whether the geometricarrangement of micro lenses is hexagonal or rectangular. Fora rectangular micro lens packing, these ratios tend to beequal whereas the spacings deviate by a factor of √ / in thehexagonal case as demonstrated in Fig. 3(b).Rearranging plenoptic micro images to sub-aperture repre-sentation requires MICs to be indexed since their relativepositions act as spatial coordinates in the sub-aperture imagedomain. At this point, however, the order of MICs in the arrayremains ambiguous. Thus, we seek a procedure that assignsvertical and horizontal micro image indices j ∈ { , , . . . , J } and h ∈ { , , . . . , H } to each centroid ¯ c j,h . The proposedsort procedure begins with the search for the most upper leftcentroid ¯ c , with j = 1 , h = 1 which is found by theminimum Euclidean distance to the image origin as given by ¯ c , = ¯ c p (cid:63) , where p (cid:63) ∈ arg min ≤ p ≤| C | (cid:0) (cid:107) ¯ c p (cid:107) (cid:1) (5) and (cid:107)·(cid:107) denotes the (cid:96) norm. On the basis of an indexedcentroid ¯ c j,h , we search for its spatial neighbor by iteratingthrough C until finding a candidate ¯ c p = (cid:8) ¯ k, ¯ l (cid:9) that satisfiesa window boundary condition ϕ ( · ) ∈ { , } as seen in ¯ c j +1 ,h = { ¯ c p | ¯ c p ∈ C ∧ ϕ (¯ c j,h , ¯ c p ) = 1 } (6)which is given by ϕ (¯ c a , ¯ c b ) = ¯ k a + Υ < ¯ k b < ¯ k a + ε Υ ∧ ¯ l a − Υ < ¯ l b < ¯ l a + Υ (7)where Υ = M / and ε is a boundary scale parameter setto 3 in the rectangular case and used to cover a hexagonalarrangement otherwise. Boundary parameters thus rely on apreviously determined MLA structure and its micro imagesize M . Note that ¯ k and ¯ l may be swapped in Eq. (7) toswitch between horizontal and vertical search directions. Anexemplary result showing indexed centroids is depicted inFig. 5.
5) GridFitter:
At this stage of the calibration pipeline,MICs rely on intensity distributions, which may be affectedby a broad range of irregularities arising from the MLA-sensor compound. Based on provided metadata, Dansereau etal. [12] compress centroid information by forming a con-sistently spaced grid, which neglects deviations caused byoptical aberrations such as vignetting. Usage of Delaunaytriangulation as proposed by Cho et al. [11] suits hexago-nal arrangements, but becomes impractical when applied torectangular grids. Instead, we elaborate on a least-squares(LSQ) regression mentioned in a Lytro patent [22]. The spatialcoordinates ¯ k j,h and ¯ l j,h along rows and columns of MICs areregarded as a function given by ¯ l j,h = β + β ¯ k j,h + β ¯ k j,h (8)which can be represented as matrix X and vectors βββ, y by X = k ,h ¯ k ,h k ,h ¯ k ,h ... ... ... k J,h ¯ k J,h , βββ = β β β , y = ¯ l ,h ¯ l ,h ... ¯ l J,h (9)Since X is not invertible, the equation system is solved viathe Moore-Penrose pseudo-inverse denoted as † and given by X † = ( X (cid:124) X ) − X (cid:124) (10)with (cid:124) as the matrix transpose so that βββ = X † y (11)yields function fit coefficients in βββ for a row or column ofMICs. Centroids are then re-calculated via intersections of twofunction fits βββ j and βββ h and re-assigned to spatial coordinates ¯ c j,h = { ¯ k, ¯ l } . Enhancements of the herein proposed LSQ gridfit are examined in Section IV-A. C. LfpAligner1) CfaOutliers:
In contrast to other pipelines [12], [17],we detect and rectify hot and dead pixels prior to Bayerdemosaicing. The reasoning behind our decision is that falseintensities are propagated from pixel outliers to adjacent pixelsduring demosaicing, which may pass unnoticed by a detectionmethod at a later processing stage. To identify pixel candidates,we regard each of the four Bayer channels as a conventionaltwo-dimensional grey scale image I B ( x ) and analyze thedifference from its Median filtered version as seen in I R ( x ) = I B ( x ) − M (cid:0) I B ( x ) (cid:1) (12)where I R ( x ) is the reference image that is used for furtheranalysis and M ( · ) denotes the Median filter operator. From I R ( x ) , we obtain the arithmetic mean ¯ I R and the standarddeviation σ as local statistical measures from a sliding windowof n + 1 size to replace potential outliers I B ( x ) as follows I B ( x ) = (cid:40) M (cid:0) I B ( x ) (cid:1) , if I B ( x ) > ¯ I R + 4 σI B ( x ) , otherwise (13)where the if clause statement helps detect intensity outliers.Once a condition is fulfilled, nearby intensities in the range of n + 1 are used to replace an outlier. The selection is furtherconstrained by only accepting a small number n of candidatesin a n × n window. Without this constrain, many pixels ofa saturated image area would be falsely detected as potentialoutliers.
2) LfpDevignetter:
In optical imaging, vignetting occursat non-paraxial image areas as a result of either mechanicalblocking of light rays or illumination fall-off from the lawof four cosines, which arise from the Lambertian reflectance,pupil size reduction and the distance-dependant inverse squarelaw [24]. Images from a plenoptic camera suffer from vi-gnetting in similar ways whereas the appearance is given in4-D representation. This image artifact may be treated througha pixel-wise division of the normalized white calibrationimage [12], [17]. P
LENOPTI C AM uses this method by default,as it works sufficiently well for white images not severelysuffering from noise.To reduce potential sensor noise, L FP D EVIGNETTER offersan alternative approach to counteract vignetting based on LSQfitting which is traditionally used in 2-D image processing.The LSQ de-vignetting principle, however, has not yet beenapplied to raw images of a plenoptic camera. We adapt theclassical procedure by iterating through each micro image andperforming rectification via division of normalized LSQ fitvalues to prevent further propagation of noise to the light-field.The intensity surface of a white micro image I m ( u, v ) isapproximated by a multivariate polynomial regression. For 2 nd order polynomials, this fit function may be given as I m ( u, v ) = w + w u + w v + . . . + w u v (14)with w as the regression coefficients. Provided the 2-D microimage coordinate indices u and v , this is translated to matrix form by A = u v · · · u v u v · · · u v ... ... ... . . . ... u M v M · · · u M v M , b = I m ( u , v ) I m ( u , v ) ... I m ( u M , v M ) (15)with A as Vandermonde matrix and vector b containingmicro image intensities. Similar to Eq. (10), we use theMoore-Penrose pseudo-inverse A † to obtain fit values w = (cid:2) w w ... w (cid:3) (cid:124) for each micro image by w = A † b (16)After an estimation of weight coefficients, we divide eachmicro image of the light field capture by its fitted counterpartfrom the white calibration image. The herein described methodis left optional due to the comparably extensive computationalrequirements. The effectiveness of the proposed method isdemonstrated in Section IV-C.
3) CfaProcessor:
The C FA P ROCESSOR class is dedicatedto raw sensor images and can be useful for processing Lytrobinary file data for which the featured metadata is considered.Regardless of the file type, the C FA P ROCESSOR takes careof debayering, white balancing just as color correction. Bydefault, Menon’s algorithm [25] is employed for debayeringfrom the available
COLOUR DEMOSAICING implementation.Findings made by CLIM-VSENSE with regards to highlightprocessing were adopted by this class as they yield enhancedsub-aperture image quality.
4) LfpRotator:
At the assembling stage of a plenopticcamera, a micro lens grid is ideally placed so that its tiltangles are in line with that of a sensor array. In the realworld, however, this grid may likely be rotated or tilted withrespect to the image pixel grid. We suppose that the sensorand MLA planes can be considered parallel by mechanicaldesign, where the back focal plane of a plano-convex MLAcoincides with the sensor’s image plane once its cover glasstouches the planar back vertex of an MLA. To counteractrotations about the z axis, our approach exploits the fact thatimages exposing aberrations tend to be aberration-free alongtheir central axes. This suggests that a row or column of MICscrossing the image center forms a line that may be a reliableindicator for the MLA rotation angle. Such a central row ofMICs ¯ c j,δ = { ¯ k j,δ , ¯ l j,δ } is obtained by δ := (cid:98) ( H − / (cid:101) where (cid:98)·(cid:101) is the nearest integer operation and H the total number ofmicro lenses in vertical direction. Applying LSQ regression onMICs as proposed in Eqs. (8) to (11) yields θ = arctan ( β ) asthe angle from a slope coefficient β . Rotation of MICs and therespective image is accomplished by multiplying coordinateswith a counter-clockwise rotation matrix R z about axis z and atranslation matrix T . Transformation matrices are successivelymultiplied with one another to obtain a rotated centroid set ¯ c (cid:48) j,h = T − R z T ¯ c j,h . An example showing results of theproposed rotation procedure is provided in Fig. 14.
5) LfpResampler:
Section III-B reveals that micro imagecentroids ¯ c j,h ∈ R are likely to be found at sub-pixel posi-tions. Taking pixel intensities from nearest integer coordinates leads to errors and thus image artifacts [11]. It is mandatoryto preserve fractional digits of MIC coordinates to accuratelydecompose a light-field from a plenoptic camera. Therefore,we introduce an interpolation scheme, which retains accuracyby resampling each micro image I m ( u, v ) individually.Figure 3(a) depicts an MIC ¯ c j,h and the correspondingbi-cubic coefficients γ based on surrounding pixel centersindexed by ( i, j ) . Given γ i,j as pre-determined weightingcoefficients, the sub-pixel precise illuminance surface A ( · ) ateach centroid ¯ c j,h = (cid:8) ¯ u j,h , ¯ v j,h (cid:9) ∈ R is computed via A ( u, v ) = (cid:88) i =1 4 (cid:88) j =1 γ i,j · u i − · v j − (17)where u, v are indices in the × image pixel patch and ¯ c j,h = (cid:8) ¯ u j,h , ¯ v j,h (cid:9) ∈ R denote the coordinates of the centroid atwhich the intensity is interpolated. c j,h (1,3)(0,0) (0,1) (0,2) (0,3)(1,0) (1,1) (1,2) (2,3)(2,2)(2,1)(2,0)(3,0) (3,1) (3,2) (3,3) γ (a) d M tb (b) Fig. 3:
Interpolation schemes with (a) micro image centeringand (b) MIC geometry in a hexagonal grid with t as the heightand d M as the side length of an isosceles triangle.Apart from handling rectangular MLA grids, L FP R ESAM - PLER supports rectification of hexagonal micro lens arrange-ments which are detected prior to the centroid-sort procedure.Depending on the hexagonal orientation, the alignment isaccomplished by de-interleaving and elongating one dimensionat the resampling stage. This shift and stretch is required as aconsequence of the hexagonal geometry depicted in Fig. 3(b).We obtain the height of a triangle spanned by three MICswith t = b / and its side length by d M = b √ whichrepresents the spacing of two MICs as in a rectangular grid.From this, it follows that the sampling density along t is / √ times higher in relation to d M . One may note that d M / t = √ / / becomes / √ after rearranging. Further, itis seen that hexagonal packings imply a displacement of d M / for every second MIC coordinate vector. Thus, compensationfor hexagonal arrangements may be achieved by translatingevery other coordinate vector by d M / and simultaneouslyinterpolating the number of micro images by factor / √ along the direction orthogonal to t , which corresponds to thehorizontal dimension in case of Lytro cameras. D. LfpExtractor1) LfpRearranger:
Based on aligned micro image inten-sities E f s [ s j , u c + i ] captured at the image sensor plane, the E i [ s j ] isachieved by assigning pixels according to E i [ s j ] = E f s [ s j , u c + i ] (18)in the horizontal dimension, which collects micro image pixelsat u c + i and relocates them consecutively in a new image array E i [ s j ] . Centroid c = ( M − / is subject to mod( M,
2) = 1 for each micro image s j after the alignment. A change inindex i ∈ {− c, . . . , c } then controls the relative view positionin a light-field and jumps to an adjacent viewpoint whereasindex j ∈ { , , . . . , J } iterates through the micro lensesrepresenting the spatial domain. Note that Eq. (18) is equallyapplied in the vertical direction for complete rendering.
2) HexCorrector:
While a hexagonal micro lens latticeincreases the packing density and thus the spatial resolution,it causes a shortcoming at the rectangular resampling stage,becoming visible as fringe patterns along straight object edgesin sub-aperture images (see Fig. 10 for details). This is aresult of hexagonal resampling, shifting every other row byhalf the spacing while actual object edge transitions may moveto incorrect positions causing a visual zipper-like appearance.To the best of our knowledge, we are the first to recognizeand tackle that issue. To achieve this, it is essential to identifypixel candidates affected by this hexagonal shift artifact.For an analytical detection, we compose two auxiliary im-ages E [ˆ s j ] and E [ˇ s j ] by de-interlacing, i.e. taking every otherrow, of a sub-aperture image E i [ s j ] where E [ˆ s j ] representsunshifted (ground-truth) pixel vectors and E i [ˇ s j ] the shiftedones. Then we deduct unshifted pixels from their shiftedneighbors via ˜ V i [ˇ s j ] = E i [ˇ s j ] − E i [ˆ s j ] (19)to extract local variances in an image vector ˜ V i [ˇ s j ] containingstrong responses at edges parallel to the shift and stretchdirection. To remove edges of real objects, we further subtractthe magnitude of the partial derivative ∂ ˆ s E i [ˆ s j ] by ¯ V i [ˇ s j ] = | ˜ V i [ˇ s j ] | − | ∂ ˆ s E i [ˆ s j ] | (20)to leave real edges out of consideration. At this stage, weassume ¯ V i [ˇ s j ] to exhibit peaks for potential candidates whichwe isolate from noisy responses via thresholding with τ by V i [ˇ s j ] = (cid:40) , if ¯ V i [ˇ s j ] > τ , otherwise (21)To further exclude false positives along the shift and stretchdirection, we reject candidates not being part of a consecutivesequence of minimum length 4 in an unraveled vector V i [ˆ s j ] .This way it is ensured that treated areas have sufficient lengthto be visually recognized. Finally, the remaining candidates in V i [ˆ s j ] are used to make substitutions in E i [ s j ] by E i [ s j ] = (cid:40) |R| (cid:80) h ∈R E i [ t h ] , if V i [ˇ s j ] = 1 E i [ s j ] , otherwise (22)for each j ∈ { , , . . . , J } and where t h denotes the verticaldimension (orthogonal to s j ) in a surrounding region R of a2-D sub-aperture image. The effectiveness of this solution isdemonstrated in Fig. 10.
3) LfpColorEqualizer:
Lens components generally exposea gradual intensity decline toward image edges caused by fourcosine terms [24]. With a micro image fully covered by thesensor, this illumination fall-off is spread across a relativelysmall group of Bayer pattern pixels that are merged during de-mosaicing. Visible intensity variances in off-axis sub-apertureviews are thus a result of vignetting at the micro imagelevel that we aim to rectify with the L FP C OLOR E QUALIZER module. Recent research dealt with enhancing the consistencyof light intensities across light-field views [17]. The ideabehind this concept is to propagate trusted intensities fromparaxial image areas that expose no severe image aberrationsto sub-aperture images located at the edge of a light-field.It can thus be used in addition to de-vignetting by utilizingthe information redundancy within the light-field. In a recentstudy, Gaussian Mixture Models (GMMs) were employed inconjunction with disparity correspondences computed at a pre-ceding stage to reduce color inconsistencies [17]. However, theauthors note that their procedure requires a high computationalload (240 mins per light-field) due to the many computationalsteps involved.To overcome this limitation, we introduce linear mappingsbased on probability distributions. A well known classicalmethod in image processing to achieve this is formally knownas Histogram Matching (HM). HM proves to be advantageouswhen applied to light-fields as it is invariant of the texture sothat it preserves parallax information while requiring relativelylittle computational effort as opposed to iterative methods likek-means. Besides this method, we employ a recent advance inimage color transfer that is based on the Monge-Kantorovich-Linearization (MKL). MKL has been introduced to the fieldof image processing by Piti´e and Kokaram [26] to facilitateautomatic color grading in media production. As of now, theMonge-Kantorovich principle has not been applied to light-field image processing, especially in the context of colorequalization among light-field views acquired from a plenopticcamera. Due to its linearization of Multi-Variate Gaussian Dis-tribution (MVGD) modeling, we expect MKL to outperformHM in terms of accuracy while reducing the computationalcomplexity imposed by the GMM method [17]. Mathematicaldetails of our implemented HM-MKL compound are presentedhereafter.Similar to other literature [27], we denote f ( r k ) to be aProbability Density Function (PDF) with r k as color channelintensities of a sub-aperture image E i [ s j ] given by f ( r k ) = η ( r k ) JH , ∀ k ∈ { , , . . . , L} . (23)Here η ( · ) yields the number of pixels with intensity level k where L is the level maximum. The histogram is normalizedby JH as the total pixel count of the sub-aperture image. Fromthis we compute F ( r k ) as the Cumulative Density Function(CDF) which is obtained using an auxiliary index (cid:15) via F ( r k ) = k (cid:88) (cid:15) =1 f ( r (cid:15) ) , ∀ k ∈ { , , . . . , L} . (24) z MainLens s s s s s s s MicroLenses E fs ImageSensor E' [ s ] u u u ObjectPlaneImagePlane E' a Fig. 4:
Refocusing concept with chief ray intersections indicating object and image plane for a recovered point E (cid:48) [ s ] .To match F ( r k ) as a source CDF with that of a target T ( z q ) ,we perform a pixel mapping via the inverse T − that satisfies F ( r k ) = T ( z q ) and reads z q = T − ( F ( r k )) (25)where each value r k gets assigned a new value z q from thecorresponding probability map T − which is implemented asa discrete lookup table. While this channel-wise mapping isstraightforward, it fails to sufficiently transfer color intensitiesat the level satisfying our human visual perception.It is therefore our motivation to determine an optimal PDFtransport plan using linear algebra. Due to the statisticalnature of ( r, g, b ) intensity distributions, we regard the source r = (cid:104) r ( r ) k , r ( g ) k , r ( b ) k (cid:105) (cid:124) and target z = (cid:104) z ( r ) q , z ( g ) q , z ( b ) q (cid:105) (cid:124) ascorrelated color channel MVGDs given by N ( r ; µ r , Σ r ) = exp (cid:0) − ( r − µ r ) (cid:124) Σ − r ( r − µ r ) (cid:1)(cid:112) (2 π ) rank ( Σ r ) | Σ r | (26)where Σ r and Σ z denote covariance matrices while µ r and µ z are the mean vectors of r and z , respectively. A desired transferrequires the PDFs to be equal and thus fulfill the condition N ( z ; µ z , Σ z ) ∝ N ( r ; µ r , Σ r ) , which yields ( z − µ z ) (cid:124) Σ − z ( z − µ z ) = ( r − µ r ) (cid:124) Σ − r ( r − µ r ) (27)after dropping the constant leading terms. Using the pseudo-inverse † from Eq. (10), we arrange the above equation to ( z − µ z ) = (cid:0) ( z − µ z ) (cid:124) Σ − z (cid:1) † ( r − µ r ) (cid:124) Σ − r ( r − µ r ) (28)As we seek a compact transfer matrix, we define M to be M = (cid:0) ( z − µ z ) (cid:124) Σ − z (cid:1) † ( r − µ r ) (cid:124) Σ − r (29)so that after substitution and rearranging, Eq. (28) becomes z = M ( r − µ r ) + µ z (30)for the forward transfer. The determination of M is key foran optimal transport. We utilize Piti´e’s findings who employedMKL with M (cid:124) Σ − z M = Σ − r and which is given by M = Σ − / r (cid:0) Σ / r Σ z Σ / r (cid:1) / Σ − / r (31)It is worth noting that Piti´e elaborated on a variety of othersolutions for M , e.g. Cholesky decomposition, among which MKL proved to be the most successful in terms of accu-racy [28]. An experimental evaluation of the proposed HM,MKL and HM-MKL methods is carried out in Section IV. E. LfpRefocuser
The L FP R EFOCUSER module covers algorithms concernedwith the computational change of image focus based onlight-field images. To achieve this, L FP R EFOCUSER offers 3different mechanisms on how to conduct refocusing. The fun-damental technique is the conventional shift and sum methodas originally presented by Isaksen et al. [29] who employeda light-field taken by an array of cameras. Transferring thisconcept to sub-aperture images E i [ s j ] , a refocused image E (cid:48) a [ s j ] with synthetic focus parameter a is given by E (cid:48) a [ s j ] = c (cid:88) i = − c E i (cid:2) s j − a ( c + i ) (cid:3) , a ∈ Q (32)for the 1-D case leaving out variables for the vertical do-main. A refinement option in P LENOPTI C AM enables sub-pixel precision in refocusing via upsampling each sub-apertureimage before integration. This not only increases the spatialresolution, but also allows for sub-sampling the refocuseddepth range along the optical axis of the main lens.As an alternative to Eq. (32), shift and sum can beaccomplished from an aligned micro image representation E f s [ s j , u c + i ] as a starting point, which then writes E (cid:48) a [ s j ] = c (cid:88) i = − c E f s (cid:2) s j + a ( c − i ) , u c + i (cid:3) , a ∈ Q (33)and can be thought of as employing an interleaved convolutionkernel as shown in our previous work examining FPGA-basedrefocusing practicability [30]. The concept behind Eq. (33) isillustrated in Fig. 4 with the aid of chief rays and paraxialoptics for a more intuitive understanding of computational re-focusing. Interested readers may note that refocused distancescan be pinpointed in space, using this model [2], [18].The refocuser module features the L FP S CHEIMPFLUG classwhich mimics the identically named principle on a computa-tional level. Tilting the refocused plane is achieved through afusion of spatial image areas from monotonically varying syn-thetic focus parameter a . This enables the L FP S CHEIMPFLUG module to render tilted focus along horizontal, vertical andboth diagonal image corner-to-corner directions, where focalstart and end points rely on provided a . Figure 18 depicts aplenoptic-based photograph exhibiting the Scheimpflug effect.IV. R ESULTS
A. Calibration P LENOPTI C AM ’s generic calibration capability is testedunder different scenarios with micro images varying in size,number, rotation angle, packing geometry and pixel noise. Ourproposed calibration pipeline is validated through synthesizeddata featuring respective ground-truth references. Results aredepicted in Fig. 5 for visual inspection. Ground Truth ˚ c j,h C ENTROID E XTRACTOR C ENTROID R EFINER C ENTROID S ORTER G RID F ITTER (a) M = 143 , J = H = 5 , θ = 0 ◦ Ground Truth ˚ c j,h C ENTROID E XTRACTOR C ENTROID R EFINER C ENTROID S ORTER G RID F ITTER (b) M = 65 , J = H = 13 , θ = 0 ◦ Ground Truth ˚ c j,h C ENTROID E XTRACTOR C ENTROID R EFINER C ENTROID S ORTER G RID F ITTER (c) M = 11 , J = H = 64 , θ = 1 ◦ Ground Truth ˚ c j,h C ENTROID E XTRACTOR C ENTROID R EFINER C ENTROID S ORTER G RID F ITTER (d) M = 8 , J = H = 90 , θ = − ◦ Fig. 5:
Auto-calibration results from synthetic data showingnoisy micro images varying in size with square pupil and rect-angular packing in (a); circular pupil and hexagonal packingin (b); dense, tilted hexagonal packing in (c); and dense, tiltedrectangular packing in (d).To analyze the P
ITCH E STIMATOR as an early stage of thepipeline, scale space maxima along ν are presented from theexamples shown in Fig. 5(a) to (d). ν . . . m a x ( P ( ν , x )) (a)(b)(c)(d) Fig. 6:
Scale maxima analysis from P
ITCH E STIMATOR withpyramid data from Fig. 5 where crosses signify respective ν (cid:63) For a quantitative examination of the centroid accuracy, weuse C as a deviation metric in pixel unit given by C = 1 JH J (cid:88) j =1 H (cid:88) h =1 (cid:107) ˚ c j,h − ¯ c j,h (cid:107) (34)for each of our centroid detection methods. Here ˚ c j,h rep-resents actual coordinates from the image synthesis that actas a ground-truth. Numerical results from Eq. (34) in Fig. 5are provided in Table I. Figure 5(a) suggests that approxi-mates from C ENTROID E XTRACTOR appear significantly offwith regards to the ground-truth while subsequent refinementstages enhance the accuracy. Closer inspection reveals thatthe C
ENTROID E XTRACTOR may yield several centroids for amicro image suffering from noise. This is a side effect of NMSwhich would have been of greater concern if white imageswere not convolved with ∇ G ( σ (cid:63) , x ) prior to NMS, as thisproves to cancel out false maxima. Candidates passing throughL FP E XTRACTOR thus tend to be close to the ground-truth.TABLE I: Centroid deviation C in [px] Fig. 5
Stage 1 Stage 2 Stage 3 Stage 4 C ENTROID E XTRACTOR C ENTROID R EFINER C ENTROID S ORTER G RID F ITTER peak area (a) 6.171 2.984 1.941 (d) 0.896 n.a. 0.291 0.288
From the results in Table I, it follows that the C EN - TROID R EFINER uses Eq. (4) by default as opposed to Eq. (3)which propagates more imprecise coordinates to subsequentprocedures. The reason behind C
ENTROID S ORTER yieldingbetter results in Table I is that it merges centroids belonging tothe same micro image. Usage of G
RID F ITTER is left optionalas it proves to reduce deviations only in certain cases, e.g.tilted MLAs or for a large number of micro images. Theseimprovements arise from larger data sets resulting in lowerresiduals as well as slight MLA rotations that help stabilize thefits. One may note that vignetting causes off-center micro im-ages to be of non-radially symmetric shape, shifting detectedcenters away from their actual physical counterparts. To workagainst this displacement, interested readers are referred to themethods proposed by peers [31], [22], [32], [33].
B. Sub-aperture images
The central view of a light-field generally suffers the leastof aberrations and thus, exposes best image quality. Figure 8depicts decoded central views of light-field photographs fromthe available IRISA dataset [34] rendered by state-of-the-artplenoptic imaging pipelines for comparison. Closer inspectionreveals that central views from LFT
OOLBOX V
Blind Reference-less Image Spatial QualityEvaluator (BRISQUE) [35] from the pybrisque implementa-tion [36]. Scores from the BRISQUE metric are presented in B ee_1 B ee_2 B u il d i ng B u m b l ebee C a c t u s C he ck e r boa r d C he z E dga r C o rr i do r D i s t an t C hu r c h D o v e s D u ck F i e l d F r a m ed F r u i t s Len s F l a r e F l o w e r s H ou s e R oundabou t S t r ee t M e ssy D e sk M i n i M o t i on B l u r P e r s pe c t i v e P l u s h i e s P o s t s R ond R o s e S c u l p t u r e S i gn S t a t ue S t ep s S unn y R o s e S y mm e t r i c T e x t u r e T i n y M oon T r an s l u c en t B R I S Q U E [ S c o r e ] LFToolbox v0.5CLIM-VSENSEPlenoptiCam v1.0
Fig. 7:
Analysis of central sub-aperture images from [34] along the horizontal axis and BRISQUE metric scores along thevertical axis where lower values signify higher quality. Our tool chain outperforms others in 24 out of 36 images in total.Fig. 7. Light-field denoising as borrowed by [17] was left outin the evaluation as it is considered equally effective for eachpipeline and thus, not crucial for the decomposition. For ourbenchmark comparison, all pipelines are set to render withde-vignetting, color and sRGB options.The BRISQUE metric is known to be sensitive to noiseand blur by local statistical analysis so that an informationloss caused by image processing modules would deterioratethe score. Low scores of our pipeline in Fig. 7 justify that onaverage our decomposition retains the physical informationoptimally. Our central sub-aperture images outperform theother two pipelines in 24 out of 36 cases of the dataset sam-ples [34]. We achieve a total score of 1567 whereas CLIM-VSENSE yields 1595 and the LFT
OOLBOX V FP R ESAMPLER contributes to lower scores by conductingmicro image alignment in an element-wise manner insteadof re-mapping the entire plenoptic image as a whole [37].Another reason for the improvements is that P
LENOPTI C AM benefits from insights previously made available by peerssuch as the de-saturation approach [17]. Apart from that,our automatic histogram alignment based on percentiles ofdifferent color spaces tends to be a more generic solutionfor images of various exposure, which becomes apparent bycomparing (h) with (i) in Fig. 8.For comparisons with the Lytro engine, results were com-puted with L YTRO D ESKTOP V
YTRO P OWER T OOLS V λ for focus), making experimentalreproduction easier. All Lytro results presented hereafter were therefore generated using L YTRO P OWER T OOLS V
OOLBOXV
LENOPTI C AMV (a) (b) (c)(d) (e) (f)(g) (h) (i)(j) (k) (l)
Fig. 8:
Central sub-aperture images from the IRISAdataset [34] rendered by (left) LFT
OOLBOX V L YTRO P OWER T OOLS V
LENOPTI C AM V (a) (b)(c) (d)
Fig. 9:
View comparison with Lytro engine revealing dif-ferences between classical sub-aperture images (right column)and Lytro’s results (left column), a consequence of all-in-focusrendering from depth-based segmentation [38], [39].Figure 9 provides a comparison of light-field images ren-dered by L
YTRO P OWER T OOLS V
LENOPTI C AMV
LENOPTI C AM ’s sub-aperture images with the scientific tool-box [12] and its extensions [17]. Nonetheless, we then contrastour refocusing results against those of the Lytro engine inSection III-E.For further sub-aperture image analysis, we focus on theresults of the H EX C ORRECTOR where artifacts caused by hexagonal sampling are subject to removal. Figure 10 shows (a) Original E i [ s j ] → (b) Mask V i [ s j ] → (c) Rectified E i [ s j ] (d) Original E i [ s j ] → (e) Mask V i [ s j ] → (f) Rectified E i [ s j ] Fig. 10:
Fringe artifact reduction results showing untreatedsub-aperture images in the left column, exhibiting fringesalong vertical lines seen upon closer inspection. The binarymasks V i [ s j ] indicate detected pixels being sufficiently af-fected whereas the right column depicts results with smoothedges.magnified portions of sub-aperture images with apparent fringeartifacts in Figs. 10(a) and 10(d) as well as rectified counter-parts exposing smooth edges in Figs. 10(c) and 10(f). Althoughthe efficacy of our treatment is visually notable, this improve-ment does not present a huge impact on BRISQUE scores.This is likely due to the blind character of the metric andits unawareness of the hexagonal sampling artifact, which canbe interpreted as reasonably high spatial frequencies. Anotherobservation made in Fig. 10 is that organic object structuresintentionally receive no treatment as they lack straight edgesand are thus less affected. (a) Without treatment (b) LFT OOLBOX V et al. [17] (d) P
LENOPTI C AM V
Fig. 11:
Color equalization results with × stitchedsub-aperture images indicating illumination variances acrossa decomposed light-field. (a) our result without treatment, (b)Dansereau et al. [12], (c) GMM-based method [17] with 240mins computation time and (d) our HM-MKL compound with81 secs computation time. Central sub-aperture images are not the only criterion todescribe the quality of a light-field obtained from a plenopticcamera. For instance, rays arriving from non-paraxial anglessuffer from natural vignetting causing color inconsistencieswithin a light-field which we aim to compensate for withthe L FP C OLOR E QUALIZER module. Figure 11 presents sub-aperture images stitched together for visual inspection of thelight-field illumination variances before and after equalizationtreatment. For better visual analysis, sub-aperture images at amarginal position ( i = 7 , g = 2 ) are provided in Fig. 12. (a) Ours untreated (b) Ours by HM (c) Ours by HM-MKL(d) Dansereau et al. [12] (e)
Matysiak et al. [17] (f) Central view
Fig. 12:
Marginal view comparison of methods combatingillumination fall-off at off-axis positions using an exemplarylight-field image with M = 15 . The marginal view locationis i = 7 , g = 2 with (a) the untreated image, (b) afterHistogram Matching (HM), (c) from the HM-MKL compound,(d) LFT OOLBOX V et al. [17] and (f)the central sub-aperture image ( i = 7 , g = 7 ) as a reference.For quantitative assessment of the color transfer we computethe average histogram distance denoted by D using the (cid:96) norm as given by D = (cid:107) f ( E c [ s j ]) − f ( E g [ s j ]) (cid:107) (35)where E g [ s j ] represents the marginal and E c [ s j ] the centralsub-aperture image serving as the reference while f ( · ) isthe PDF from Eq. (23). Results of our color consistencymeasurements from Eq. (35) are depicted in Fig. 13.Computation times of the different sub-aperture decomposi-tion stages are given in Table II. The extensive computationalload imposed by Matysiak’s recoloring procedure [17] makesit impractical for us to iterate through an entire dataset.TABLE II: Computation time comparison where M = 15 Process Dansereau et al. Matysiak et al. Ours
De-Multiplexing 70 secs 125 secs 325 secsHot Pixel Removal not supported 100 secs 52 secsColor Equalization not supported 234 mins 81 secsTotal 70 secs 237 mins 458 secs
Using the afore-mentioned image metrics, no measurableimprovements could be observed from rotating a plenopticimage. Nonetheless, rotation of the 4-D micro image represen-tation proved to be more robust when processing large microimages (e.g. M = 15 ) located at the image border. This isresolved by rotation as it shifts marginal micro images away from the image border and enables consideration of marginalimage pixels. Horizontal dimension [px] V e r t i c a l d i m e n s i o n [ p x ] (a) Horizontal dimension [px] V e r t i c a l d i m e n s i o n [ p x ] (b) Fig. 14:
Micro image array rotation with cropped views froma Lytro white image being untreated in (a) and rotated in (b).Transparent lines embrace MICs of the same row near imageborders to demonstrate the effectiveness of the L FP R OTATOR . C. De-Vignetting (a)
74 dB
SNR from division (b)
85 dB
SNR from LSQ fit
Fig. 15:
De-vignetting from a white image with Gaussiannoise σ = 0 . . The noise was absent when de-vignettingthe ground-truth. Noise propagates to the light-field duringdivision as seen in (a) and significantly suppressed in (b) viapatch-wise micro image fitting based on Eqs. (14) to (16). Ourapproach gains ∼
10 dB of SNR compared to the ordinarydivision as used by [12], [17].Experimental validation of our LSQ-based de-vignetting inEqs. (14) to (16) is assessed using the Signal-to-Noise-Ratio(SNR) given by
SN R = 20 · log − (cid:113) J (cid:80) Jj =1 ( E T [ s j ] − E c [ s j ]) (36)as a metric to analyse its effectiveness. For the ground-truthdata E T [ s j ] , we use a photograph divided by a white image B ee_1 B ee_2 B u il d i ng B u m b l ebee C a c t u s C he ck e r boa r d C he z E dga r C o rr i do r D i s t an t C hu r c h D o v e s D u ck F i e l d F r a m ed F r u i t s Len s F l a r e F l o w e r s H ou s e R oundabou t S t r ee t M e ssy D e sk M i n i M o t i on B l u r P e r s pe c t i v e P l u s h i e s P o s t s R ond R o s e S c u l p t u r e S i gn S t a t ue S t ep s S unn y R o s e S y mm e t r i c T e x t u r e T i n y M oon T r an s l u c en t H i s t og r a m d i s t an c e D [ a . u .] LFToolbox v0.5CLIM-VSENSEPlenoptiCam v1.0
Fig. 13:
Light-field color consistency metric from histogram distance D with low values indicating higher similarity betweenmarginal views E g [ s j ] at position i = 7 , g = 2 and corresponding central views E c [ s j ] in light-fields of size M = 15 .considered noiseless. This white image is then exposed to ad-ditive Gaussian noise with σ = 0 . for de-vignetting by eitherclassical division [12], [17] and our proposed fitting scheme.In doing so, we expect the synthetic noise to propagate duringde-vignetting, however, with sufficient suppression (i.e. higherSNR) for our LSQ approach. The results in Fig. 15 show thatour method retains the image quality by gaining ∼
10 dB ofdynamic range with respect to noise. This is a consequence ofthe least-squares fit treating pixel noise as residuals. Althoughour evaluation contains highly amplified noise, this method isregarded equally beneficial for low-noise images.
D. Refocusing
For the refocusing assessment, we compare the proposedL FP R EFOCUSER module against L
YTRO P OWER T OOLS V
Bumblebee.lfp image from the IRISA dataset [34] as it exhibitsorganic object structures spread through a wide range of depth. (a) a = − . (b) a = 0 . (c) a = 2 . (d) λ = 27 (e) λ = 1 (f) λ = − Fig. 16:
Refocused photographs from P
LENOPTI C AM V a in comparisonto results processed by L YTRO P OWER T OOLS V λ . Quantitative comparison is achieved by analysis of thesharpness and BRISQUE score of local image details. For thesharpness analysis, we follow the Mavridaki and Mezaris [40]approach by transforming cropped versions of a refocused im-age slice E (cid:48)(cid:48) a [ s j , t h ] to the Fourier domain using the DiscreteFourier Transformation (DFT) and extracting the magnitudesignal X [ σ ω , ρ ψ ] by X [ σ ω , ρ ψ ] = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Ξ (cid:88) j = ξ Π (cid:88) h = (cid:36) E (cid:48)(cid:48) a [ s j , t h ] (37) exp ( − πκ ( jω/ (Ξ − ξ ) + hψ/ (Π − (cid:36) )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) where κ = √− and | · | yields absolute values. Given the 2-Dmagnitude signal X [ σ ω , ρ ψ ] , the total energy T E reads
T E = Ω (cid:88) ω =1 Ψ (cid:88) ψ =1 X [ σ ω , ρ ψ ] (38)where Ω = (cid:100) (Ξ − ξ ) / (cid:101) and Ψ = (cid:100) (Π − (cid:36) ) / (cid:101) are borderscropping the first quarter of the unshifted magnitude signal.To isolate the energy of high frequency elements HE , wecompute the power of low frequencies and subtract them from T E which reads HE = T E − Q H (cid:88) ω =1 Q V (cid:88) ψ =1 X [ σ ω , ρ ψ ] (39)with Q H and Q V as scalar limits in the range of { , , . . . , Ω } and { , , . . . , Ψ } separating low from high frequencies. Inour experiments we set the limits to a five hundredths ofthe cropped image resolution. Finally, the sharpness S is afrequency ratio of refocused image portions obtained by S = HET E (40)which serves as our blur metric. From the cropped view of the Lytro image in Fig. 17(l),it is seen that there is no gradual decrease in blur betweenobjects at different depth leading to an unnatural appearanceof image blur. Images rendered with L FP R EFOCUSER do notexpose such sudden change in sharpness. It is only up tospeculation on how Lytro’s refocusing algorithm works. Thehigh resolution and occasional blur artifacts around objectedges suggest that Lytro employs a super-resolution methodat the first stage and then re-blurs spatial areas guided bya previously computed depth map. This may explain blurartifacts at object boundaries, which rely on the quality ofa depth map.P
LENOPTI C AMV
LENOPTI C AMV
YTRO P OWER T OOLS V (a) a = − (b) a = − . (c) λ = 27 (d) a = 1 (e) a = 0 . (f) λ = 1 (g) a = 1 (h) a = 0 . (i) λ = 1 (j) a = 2 (k) a = 2 . (l) λ = − Fig. 17:
Magnified refocused image tiles showing texture de-tails which are quantified by our blur metric S in Table III. Theleft and middle column contain results from P LENOPTI C AMV
YTRO P OWER T OOLS V λ .It is therefore questionable whether the measured sharp-ness in Lytro’s refocus algorithm arises from high frequencyartifacts. By visual inspection, however, it becomes apparentthat the magnified views from Lytro expose more image detailat small λ value. This observation is backed by respectivescores from the BRISQUE metric in Table III. On the contrary,occasional artifact patterns as in Fig. 17(l) may still appealunpleasant to the human visual perception. TABLE III: Metric assessment of refocused images Metrics Fig. 17 (a) (b) (c) (d) (e) (f)Sharpness S BRISQUE score 84.29 83.51
Metrics Fig. 17 (g) (h) (i) (j) (k) (l)Sharpness S BRISQUE score 92.43 81.29
An exemplary result of the L FP S CHEIMPFLUG renderingengine is depicted in Fig. 18. Note how image areas in theupper left background expose sharp details, whereas the focusgradually moves toward the lower right foreground. This isa consequence of the synthetic focus plane being tilted withrespect to the image sensor.Fig. 18:
Scheimpflug rendered photograph from L FP -S CHEIMPFLUG class with a = { , } and sub-pixel precision.V. C ONCLUSIONS
Plenoptic image decomposition is a crucial task in light-fieldrendering for which we propose a pipeline with outstandingimage quality. To the best of our knowledge, P
LENOPTI C AM is the only framework that allows users to automaticallycalibrate footage from different types of plenoptic camerasregardless of the micro lens specification. We achieve thisby employing blob detection, non-maximum suppression anda generic MLA grid geometry recognition. The light-fieldcolor equalization stage is taken to a new level by usinghistogram matching in conjunction with a Monge-Kantorovichsolver to yield convincing quantitative results at reasonablecomputation times, which accelerates a previous method basedon Gaussian Mixture Models (GMM) by several orders ofmagnitude. For plenoptic image alignment, we created analternative element-wise micro image resampling procedure.Moreover, we are first to address fringe artifacts caused byhexagonal micro lens arrangements, which are successfullyreduced by our novel identification scheme. An extensivequantitative assessment of state-of-the-art sub-aperture imagedecomposition pipelines has been carried out in this paper withthe result of outperforming other tool chains in two thirdsof the cases, using a wide range of image metrics such as BRISQUE, average histogram distance via (cid:96) norm and a blurmetric.These findings and respective implementations are releasedas an open-source software framework made available as arepository on GitHub [41] to allow others to participate andcontribute. The herein presented framework is intended todisburden newcomers and facilitate first steps in the field ofplenoptic imaging. Peers are encouraged to report bugs andactively expand this software. P LENOPTI C AM may lay thegroundwork for future algorithm development and testing ofplenoptic data.Such extensions may include, but are not limited to, an algo-rithm development for the focused plenoptic camera as initiallydiscovered by Lumsdaine and Georgiev [42]. An implemen-tation of a super-resolution technique may be accomplishedon the basis of early pioneering work [43], [38]. It shouldalso be feasible to enhance the image quality by combatingoptical aberrations as shown in previous studies [44], [45],[46], [39], [47] or by making use of the Bayer demosaicingsimilar to what Yu and Yu have revealed [48]. Until now,rectification for lens distortions exceeded the scope as this canbe accomplished using traditional computer vision libraries.However, correct disparity estimation requires counteractingdistortions in sub-aperture images at a preceding stage.We believe that our work will be a substantial contributionto the field of plenoptic cameras, which is not limited to theimage processing community. As Dansereau’s LFT OOLBOX has shown, there is a large group of researchers experiment-ing with plenoptic cameras demonstrating the demand foran easy-to-use software addressing the special requirementsfor plenoptic imaging. P
LENOPTI C AM may serve this broaduser group including image scientists, programmers in thefields of data science, medical and industrial engineering orphotography independent of the user’s operating system. Inparticular, P LENOPTI C AM facilitates light-field data prepara-tion for visual machine learning systems such as convolutionalneural networks, which currently receive an increasing interestfrom a variety of scientific and industrial communities. Thechosen open source license model enables cost-free usageand code republication with modifications, giving users theopportunity to extend the presented software.R EFERENCES[1] R. Ng, M. Levoy, M. Br`edif, G. Duval, M. Horowitz, and P. Hanrahan,“Light field photography with a hand-held plenoptic camera,” StanfordUniversity, Tech. Rep. CTSR 2005-02, 2005.[2] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch,“Refocusing distance of a standard plenoptic camera,”
Opt. Express et al. , “Simulta-neous whole-animal 3d imaging of neuronal activity using light-fieldmicroscopy,”
Nature methods , vol. 11, no. 7, p. 727, 2014.[4] H. Li, C. Guo, D. Kim-Holzapfel, W. Li, Y. Altshuller, B. Schroeder,W. Liu, Y. Meng, J. French, K.-I. Takamaru, M. Frohman, andS. Jia, “Fast, volumetric live-cell imaging using high-resolutionlight-field microscopy,” bioRxiv
Biomedical Optics Express , vol. 8,p. 260, 01 2017. [6] D. W. Palmer, T. Coppin, K. Rana, D. G. Dansereau, M. Suheimat,M. Maynard, D. A. Atchison, J. Roberts, R. Crawford, and A. Jaiprakash,“Glare-free retinal imaging using a portable light field fundus camera,”
Biomedical Optics Express , vol. 9, p. 3178, 07 2018.[7] P. P. Srinivasan, T. Wang, A. Sreelal, R. Ramamoorthi, and R. Ng,“Learning to synthesize a 4d rgbd light field from a single image,” in
The IEEE International Conference on Computer Vision (ICCV) , Oct2017.[8] M. Levoy and P. Hanrahan, “Light field rendering,” Stanford University,Tech. Rep., 1996.[9] E. H. Adelson and J. Y. Wang, “Single lens stereo with a plenoptic cam-era,”
IEEE Transactions on Pattern Analysis and Machine Intelligence ,vol. 14, no. 2, pp. 99–106, February 1992.[10] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch,“Baseline and triangulation geometry in a standard plenoptic camera,”
International Journal of Computer Vision , vol. 126, no. 1, pp. 21–35, Jan2018. [Online]. Available: https://doi.org/10.1007/s11263-017-1036-4[11] D. Cho, M. Lee, S. Kim, and Y.-W. Tai, “Modeling the calibrationpipeline of the lytro camera for high quality light-field image reconstruc-tion,” in
IEEE International Conference on Computer Vision (ICCV) ,December 2013, pp. 3280–3287.[12] D. G. Dansereau, O. Pizarro, and S. B. Williams, “Decoding, calibrationand rectification for lenselet-based plenoptic cameras,” in
IEEE Interna-tional Conference on Computer Vision and Pattern Recognition (CVPR) ,June 2013, pp. 1027–1034.[13] N. Patel, “lfptools,” https://github.com/nrpatel/lfptools, 2011.[14] B. Esfahbod, “python-lfp-reader,” https://github.com/behnam/python-lfp-reader, 2013.[15] Y. Bok, H.-G. Jeon, and I. S. Kweon, “Geometric calibration of micro-lens-based light-field cameras using line features,” in
Proceedings ofEuropean Conference on Computer Vision (ECCV) , 2014.[16] P. David, M. Le Pendu, and C. Guillemot, “White lenslet image guideddemosaicing for plenoptic cameras,” 10 2017, pp. 1–6.[17] P. Matysiak, M. Grogan, M. L. Pendu, M. Alain, and A. Smolic,“A pipeline for lenslet light field quality enhancement,” , Oct 2018.[Online]. Available: http://dx.doi.org/10.1109/ICIP.2018.8451544[18] C. Hahne and A. Aggoun, “Plenoptisign: An optical design toolfor plenoptic imaging,”
SoftwareX
Proceedings of the17th Python in Science Conference , Fatih Akici, David Lippa, DillonNiederhut, and M. Pacer, Eds., 2018, pp. 113 – 120.[20] T. Lindeberg,
Scale-Space Theory in Computer Vision , 01 1994.[21] J. L. Crowley, O. Riff, and J. H. Piater, “Fast computation of charac-teristic scale using a half-octave pyramid,” in
In: Scale Space 03: 4thInternational Conference on Scale-Space theories in Computer Vision,Isle of Skye , 2002.[22] C.-K. Liang and Z. Wang, “Calibration of light-field camera geometryvia robust fitting,” U.S. Patent 9 420 276 B2, Aug. 2016.[23] T. Q. Pham, “Non-maximum suppression using fewer than two compar-isons per pixel,” in
Advanced Concepts for Intelligent Vision Systems ,J. Blanc-Talon, D. Bone, W. Philips, D. Popescu, and P. Scheunders, Eds.Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 438–451.[24] A. Kordecki, H. Palus, and A. Bal, “Practical vignetting correctionmethod for digital camera with measurement of surface luminancedistribution,”
Signal, Image and Video Processing , vol. 10, pp. 1417—-1424, 2016.[25] D. Menon, S. Andriani, and G. Calvagno, “Demosaicing WithDirectional Filtering and a posteriori Decision,”
IEEE Transactions onImage Processing , vol. 16, no. 1, pp. 132–141, jan 2007. [Online].Available: http://ieeexplore.ieee.org/document/4032820/[26] F. Piti´e and A. Kokaram, “The linear monge-kantorovitch colour map-ping for example-based colour transfer,”
IET Conference Proceedings ,pp. 23–23(1), January 2007.[27] R. Gonzalez and R. Woods,
Digital Image Processing, Global Edition .Pearson, 2017.[28] F. Piti´e, A. Kokaram, and R. Dahyot, ser. Image Processing Series.CRC Press, Sep 2008, ch. Enhancement of Digital PhotographsUsing Color Transfer Techniques, pp. 295–321, 0. [Online]. Available:http://dx.doi.org/10.1201/9781420054538.ch11 [29] A. Isaksen, “Dynamically reparameterized light fields,” Master’s thesis,Electrical Engineering and Computer Science, Massachusetts Instituteof Technology, November 2000.[30] C. Hahne, A. Lumsdaine, A. Aggoun, and V. Velisavljevic, “Real-time refocusing using an fpga-based standard plenoptic camera,” IEEETransactions on Industrial Electronics
IEEETransactions on Computational Imaging , p. 10, Apr. 2019. [Online].Available: https://hal.archives-ouvertes.fr/hal-02263377[33] M. Schambach and F. Puente Le´on, “Microlens array grid estimation,light field decoding, and calibration,”
IEEE transactions on computa-tional imaging
IEEE Transactions on Image Pro-cessing , vol. 21, pp. 4695–4708, 2012.[36] Bukalapak, “Pybrisque,” https://github.com/bukalapak/pybrisque, 2020.[37] D. G. Dansereau, O. Pizarro, and S. B. Williams, “Linear volumetricfocus for light field cameras,”
ACM Trans. Graph. , vol. 34,no. 2, pp. 15:1–15:20, Mar. 2015. [Online]. Available: http://doi.acm.org/10.1145/2665074[38] C.-K. Liang and R. Ramamoorthi, “A light transport framework forlenslet light field cameras,”
ACM Trans. Graph. , vol. 34, no. 2, Mar.2015. [Online]. Available: https://doi.org/10.1145/2665075[39] L.-Y. Wei, C.-K. Liang, G. Myhre, C. Pitts, and K. Akeley, “Improvinglight field camera sample design with irregularity and aberration,”
ACM Trans. Graph. , vol. 34, no. 4, Jul. 2015. [Online]. Available:https://doi.org/10.1145/2766885[40] E. Mavridaki and V. Mezaris, “No-reference blur assessment in naturalimages using fourier transform and spatial pyramids,” in , October 2014,pp. 566–570.[41] C. Hahne, “Plenopticam,” https://github.com/hahnec/plenopticam, 2020.[42] A. Lumsdaine and T. Georgiev, “Full resolution lightfield rendering,”Adobe Systems, Inc., Tech. Rep., January 2008.[43] T. E. Bishop, S. Zanetti, and P. Favaro, “Light field superresolution,” in
IEEE International Conference on Computational Photography (ICCP) ,2009.[44] R. Ng, “Digital light field photography,” Ph.D. dissertation, StanfordUniversity, July 2006.[45] R. Ng and P. Hanrahan, “Digital correction of lens aberrations in lightfield photography,” in
International Optical Design Conference , vol.Proc. SPIE 6342. Stanford University, July 2006. [Online]. Available:http://dx.doi.org/10.1117/12.692290[46] N. Cohen, S. Yang, A. Andalman, M. Broxton, L. Grosenick,K. Deisseroth, M. Horowitz, and M. Levoy, “Enhancing the performanceof the light field microscope using wavefront coding,”
Opt. Express
IEEE International Conferenceon Computer Vision and Pattern Recognition (CVPR) , 2012, pp. 901–908. A PPENDIX AS CALE S PACE T HEOREM
Proof.
Let G ( σ, x ) be a Gaussian function of x = { x, y } ∈ R and σ the scale of interest in a Laplacian pyramid, so that G ( σ, x, y ) = e − x y σ ∂ x G ( σ, x, y ) = − xσ e − x y σ ∂ xx G ( σ, x, y ) = − xσ e − x y σ + xσ e − x y σ ∇ G ( σ, x, y ) = ∂ xx G ( σ, x, y ) + ∂ yy G ( σ, x, y )= (cid:18) x + y σ − σ (cid:19) e − x y σ r = x + y r σ − σ r = 2 σ r = √ σM ≈ r = 2 √ σ Christopher Hahne is a visiting scholar affiliatedwith the School of Mathematics and Computer Sci-ence at the University of Wolverhampton. He re-ceived the BSc degree in 2012 from the University ofApplied Sciences Hamburg while working at R&Ddepartments of Rohde & Schwarz GmbH & Co. KGas well as ARRI Cinetechnik GmbH & Co. KG.After becoming a visiting student at Brunel Univer-sity in 2012, he transferred to a bursary-funded PhDprogramme and was awarded the doctoral degree inComputer Science from the University of Bedford-shire in 2016. His research interests include algorithm development for audio-visual content.