HMLFC: Hierarchical Motion-Compensated Light Field Compression for Interactive Rendering
SRIHARI PRATAPA,
University of North Carolina at Chapel Hill
DINESH MANOCHA,
University of Maryland at College Park
We present a new motion-compensated hierarchical compression scheme (HMLFC) for encoding light field images (LFI) that is suitable for interactive rendering. Our method combines two different approaches, motion compensation schemes and hierarchical compression methods, to exploit redundancies in LFI. The motion compensation schemes capture the redundancies in local regions of the LFI efficiently (local coherence) and the hierarchical schemes capture the redundancies present across the entire LFI (global coherence). Our hybrid approach combines the two schemes, effectively capturing both local and global coherence to improve the overall compression rate. We compute a tree from the LFI using a hierarchical scheme and use phase-shifted motion compensation techniques at each level of the hierarchy. Our representation provides random access to the pixel values of the light field, which makes it suitable for interactive rendering applications using a small run-time memory footprint. Our approach is GPU friendly and allows parallel decoding of LF pixel values. We highlight the performance on two-plane parameterized light fields and obtain a compression ratio of 30–800× with a PSNR of 40–45 dB. Overall, we observe a ~2–5× improvement in compression rates using HMLFC over prior light field compression schemes that provide random access capability. In practice, our algorithm can render new views of resolution 512×512 on an NVIDIA GTX-980 at ~200 fps.
Virtual reality (VR) is being increasingly used for immersive multimedia experiences and telepresence applications. To achieve a high degree of presence in VR, we need to generate high-fidelity renderings of real-world scenes at interactive rates. Photo-realistic renderings increase the sense of immersion in real-world scenes and provide artistic, life-like experiences in VR. The plenoptic function (7D) describes the total flow of light through all the points in space [Adelson and Bergen 1991]. Light Fields (LF) are a low-dimensional (4D or 5D) form of the plenoptic function that captures the radiance of the light rays over a specific region of space. Yu [Yu 2017] outlines the emergence of light fields and lists the advantages of using LF technology to generate high-quality content for VR applications. Levoy & Hanrahan [Levoy and Hanrahan 1996] and Gortler et al. [Gortler et al. 1996] describe a 4D parameterized LF and practical approaches for capturing and rendering static scenes using 2D image samples. To generate photo-realistic renderings from different viewpoints, such image-based-rendering (IBR) techniques need large amounts of data to be captured, which is a major issue for interactive applications. The number of image samples required for a good-quality rendering using LF is generally in the order of tens of thousands [Chai et al. 2000]. The data sizes of the sampled LF vary from hundreds of MB [Levoy and Hanrahan 1996] to hundreds of GB [Levoy et al. 2000; Lin and Shum 2000] depending on the scene complexity, sampling rate, and sampling resolution.

MIT Technology Review: VR is still a novelty, but Google's light-field technology could make it serious art. https://goo.gl/F79udn

Authors' addresses: Srihari Pratapa, Department of Computer Science, University of North Carolina at Chapel Hill, [email protected]; Dinesh Manocha, Department of Computer Science and Electrical & Computer Engineering, University of Maryland at College Park, [email protected].
For 360° panoramic light fields [Overbeck et al. 2018] the LF data sizes are close to 4–6 GB. Therefore, compressing the LF is necessary for storing, transmitting, and interactive rendering.

LF-based rendering algorithms involve retrieving pixel values from the LFI and interpolating them to compute a new view. The pixels required for computing the new view may be located in different regions of different LFI. Therefore, for real-time LF rendering, the relevant portions of uncompressed LFI should be present in local memory. To render a new view, only a portion of the data is required from the entire LFI. As a result, we do not need the entire uncompressed LFI in memory, which may result in memory bottlenecks. During rendering, pixel data is continuously fetched from memory, which, in real-time systems with limited bandwidth (mobile and untethered AR/VR), can cause a significant performance bottleneck [Fenney 2003]. Random access compression schemes help in mitigating the memory and bandwidth bottlenecks. Random access compression schemes for LFI have two main properties: (1) selective decoding of only the required data; (2) fast hardware decompression. To enable interactive LF rendering applications, it is necessary to develop LFI compression schemes that maintain the properties of random access compression schemes and also provide good compression rates.

Prior LFI compression schemes can be broadly categorized into hierarchical schemes and motion-compensated schemes. Hierarchical compression approaches for LFI compute a tree (parent-child dependencies) from the LFI using image transformations and create levels of hierarchy [Peter and Straßer 2001; Pratapa and Manocha 2018]. Motion compensation methods capture redundancies in nearby LFI by using a pair of motion vectors or disparity values [Overbeck et al. 2018; Zhang and Li 2000].
The hierarchical schemes and motion compensation schemes exhibit different characteristics in terms of capturing the redundancies across the LFI.
Main Results:
We present a new motion-compensated hierarchical compression scheme (HMLFC) for encoding LFI for interactive rendering. Ours is a hybrid method that combines two different random access compression approaches to maximize the redundancies captured across the LFI. The first class of methods is motion compensation schemes, in which the redundancies present in small regions of the LFI are efficiently captured using extensive search-based techniques. The other class of methods is hierarchical compression approaches, in which image manipulation and transformation techniques are applied to the entire LFI to capture redundancies across the LFI in a global manner. We use a hierarchical light field compression approach to capture the redundancies in a global fashion and then apply phase-shifted motion compensation to the various levels of the hierarchy. We apply motion compensation to all the levels of the hierarchy by selecting a set of reference frames at each level, creating a new motion-compensated hierarchy. The tree structure computed in the underlying hierarchical scheme is maintained after applying motion compensation at each level.
After motion compensation, the amount of data in the levels of the hierarchy is reduced by a significant factor, leading to a higher compression rate. We also present a simple and fast scheme to decompress the light fields and use them for interactive rendering on commodity GPUs. The main contributions of our approach include:

(1) A novel compression approach combining two different schemes (motion-compensation and hierarchical schemes) for LFI compression to achieve better compression performance (Section 3);
(2) A new phase-shifted motion-compensation technique suitable for the properties of the images computed in the hierarchy (Section 4);
(3) A hybrid compression scheme (HMLFC) that provides many benefits, including random access, progressive decoding, and parallel decompression on commodity hardware (Section 4).

Our compression algorithm, HMLFC, provides a 2–5× improvement in compression rate for similar compression quality compared to prior hierarchical schemes as well as motion compensation schemes that provide random access capability (Section 5). The decompression memory overhead and decompression time overhead due to our hybrid combination are minimal. We can render new views at a resolution of 512×512 using an NVIDIA GTX-980 at ~200 fps (Section 5).
In this section, we give a brief overview of prior work on light fieldrendering and compression algorithms.
The plenoptic function describes the flow of light in space. Adelson & Bergen [Adelson and Bergen 1991] use the term plenoptic function (7D) to describe the light intensity (L) at any point (x, y, z) and orientation (θ, ϕ) in free space at any given time (t), and over a range of wavelengths (λ) in the visible spectrum: L = P(x, y, z, θ, ϕ, t, λ). Levoy & Hanrahan [Levoy and Hanrahan 1996] and Gortler et al. [Gortler et al. 1996] describe a low-dimensional (4D) form of the plenoptic function, called the light field, as a set of outgoing light rays from a static object or scene. Levoy & Hanrahan [Levoy and Hanrahan 1996] use two parallel planes described by (u, v) and (s, t), a two-plane parameterization, to describe the 4D function. Several other low-dimensional parameterizations, such as spherical [Ihm et al. 1997; Overbeck et al. 2018] or unstructured [Davis et al. 2012], have been proposed to describe and capture the plenoptic function. In all the parameterizations, the light rays are captured by densely sampling 2D camera images from multiple viewpoints around the scene. New views from arbitrary positions in space are generated by interpolating the captured light rays (pixel values). High sampling rates (on the order of tens of thousands) are required to achieve photo-realistic reconstructions, which need huge amounts of data to store the captured images [Chai et al. 2000].

A large amount of image data is needed for LF rendering, and it creates a bottleneck for interactive applications. LF compression schemes are used to transmit and store LFI for rendering. JPEG Pleno [Ebrahimi et al. 2016] has been launched by the JPEG standards committee with the goal of establishing standards for the broader adaptability of 4D LF applications. Several compression schemes have been proposed to handle the image data problem in LF rendering.
We categorize the existing schemes into two types: hierarchical compression schemes, which apply image transformations and image manipulations (wavelet, image warping, and arithmetic manipulations) to the original LFI and build hierarchical structures that exploit redundancies; and motion-compensated compression schemes, which use standard techniques similar to MPEG video compression (motion vectors) or disparity compensation to capture redundancies. In addition to these two categories, Levoy & Hanrahan [Levoy and Hanrahan 1996] use vector quantization (VQ) to compress the LFI using a 4D dictionary. The compression rates attained using VQ are around 10:1 to 20:1. A survey of compression schemes for LFI is presented in Viola et al. [Viola et al. 2017].
We further categorize motion-compensated schemes into two sub-categories: high-efficiency schemes, which provide very high compression ratios without random access properties, and random access schemes, which enable interactive rendering by allowing random access to the pixel values of the LFI.
High-efficiency schemes:
The primary approaches in LFI compression adopt methods similar to image and video compression methods. These approaches apply techniques such as motion-vector compensation, domain transforms (DCT, wavelet), and image warping to exploit redundancies among the LFI. In the case of two-plane 4D parameterized LF, the light rays are sampled using uniform camera motion between adjacent samples. Using this observation, Girod et al. [Girod et al. 2003], Jagmohan et al. [Jagmohan et al. 2003], and Magnor & Girod [Magnor and Girod 2000] describe methods that use a single disparity value instead of a pair of motion-vector values to encode the LFI predictively. The compression ratios achieved are close to 100:1 to 200:1. However, these methods do not provide random access capabilities. Very high compression efficiency schemes that provide compression ratios of 100:1 to 1000:1 have been proposed [Chen et al. 2018; Liu et al. 2016; Perra and Assuncao 2016]. In Liu et al. [Liu et al. 2016], the grid of LFI is first processed to arrange the images in sequential order, producing an optimal pseudo-temporal ordering that maximizes inter-frame coherence when compressed using HEVC encoding. Chen et al. [Chen et al. 2018] process the LFI using predictive and image-warping methods, from which a small set of key-views is selected and the rest of the LFI are predicted using the key-views. After the pre-processing, the images are temporally ordered using the method in Liu et al. [Liu et al. 2016] and then compressed using HEVC. Techniques that use additional information about scene geometry and characteristics in addition to image-warping techniques are presented in Chang et al. [Chang et al. 2006]. Image homography is used to warp the LFI onto a fixed set of reference images to find redundancies in Kundu [Kundu 2012], yielding compression rates of 10:1 to 50:1. Although some of the above methods provide large compression ratios, they fail to address the problems of random access and heavy memory consumption for interactive rendering. These methods are efficient for transmitting and streaming LF data over the internet but require the entire LF to be decoded for rendering.
Random Access schemes:
The method of Zhang & Li [Zhang and Li 2000] is the first approach that uses motion compensation and provides random access capabilities for LFI compression. They describe a multi-reference, frame-based motion compensation approach that provides compression ratios of 80:1. Overbeck et al. [Overbeck et al. 2018] present a scheme for compressing 360° panoramic light fields captured using their LF capturing system. They achieve compression ratios of 40:1 to 200:1 on complex panoramic LFI datasets.
Peter & Straßer [Peter and Straßer 2001] present a 4D wavelet hierarchical scheme for compressing LF that provides random access. This method uses 4D Haar wavelets to transform the LFI into wavelet coefficients and organizes the coefficients into a tree structure. They attain compression rates of 20:1 to 40:1, and their method makes assumptions about the scene captured in the light field. Pratapa & Manocha [Pratapa and Manocha 2018] present a hierarchical compression scheme that is based on computing representative and residual views at each level of the hierarchy to exploit redundancies across the LFI. The top-level images of the hierarchical tree capture the redundant common details among the LFI, and the other levels of the tree store the low-level, high-frequency details of the LFI. Their method obtains compression rates of 20:1 to 200:1, provides random access to the compressed stream, enables progressive decompression, and supports fast hardware decoding. Magnor & Girod [Magnor and Girod 1999] describe a hierarchical predictive encoding scheme using disparity maps. In Magnor & Girod [Magnor and Girod 1999], an explicit hierarchical tree is not constructed, but a hierarchical relationship among the LFI is established by iteratively dividing the LFI into sub-quadrants. This method provides compression rates of around 400:1, but it does not provide random access capability for interactive rendering.
In this section, we present high-level descriptions of motion compensation and hierarchical compression methods for LFI that provide random access. We discuss the advantages and limitations of each of these approaches and motivate the design of our hybrid compression scheme. In the following discussion, we assume a 4D two-plane parameterization, though our approach can be extended to other LF parameterizations.

For a set of light field images captured using the two-plane parameterization, redundancies are present across all the captured LFI. The amount of coherence between two captured light field images varies based on the distance between the actual image capture points in space. Adjacent light field images exhibit higher coherence, while far-off images exhibit less coherence. We refer to the coherencies that are commonly present across the entire LFI as global coherencies. We refer to the coherencies present across adjacent images of the LFI as local coherencies. An example of LFI highlighting local and global coherencies is shown in the suppl. material (Section 3).
We use the following terminology to give an overview of priormotion compensation schemes used for LF compression.
Reference Images: The set of images selected from the original LFI that are encoded independently using standard compression schemes and used as the reference set for encoding the rest of the images in the LFI in the motion compensation schemes.
Predictive Images: The set of images from the original LFI that are associated with a reference image and are encoded using motion compensation techniques.

At a high level, motion compensation schemes start by selecting a subset of frames from the LFI as reference images. The process of selecting the reference images varies depending on the exact compression scheme, as discussed in Section 2. Once the reference images are selected, the rest of the LFI are marked as predictive images. Each of the predictive images is associated with a reference image and encoded using motion compensation. The predictive images are divided into non-overlapping rectangular blocks, and each block of pixels is predicted (computed as a difference) from the reference image using a pair of motion vectors. Motion compensation schemes typically use an exhaustive search over a large region in the reference image to minimize the residual difference for each block in the predictive images. Therefore, they compute the redundancies between a given reference image and the associated predictive images very efficiently using an exhaustive search.

The LFI exhibit a large amount of coherence across all the captured images. Motion compensation schemes capture local coherencies effectively using exhaustive search. Although motion compensation schemes exploit the local coherencies (reference and predictive image sets) of the LFI efficiently using an exhaustive search, they fail to capture the redundancies present across the entire LFI in a global fashion (e.g., coherencies across grids in Fig. 2).
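The exhaustive block search described above can be sketched as follows. This is a generic sum-of-absolute-differences (SAD) search, not the exact cost metric of any particular LF codec; all function and parameter names are illustrative:

```python
import numpy as np

def motion_search(ref, block, bx, by, window):
    """Exhaustive search for the motion vector (dx, dy) that minimizes the
    sum-of-absolute-differences (SAD) between a predictive-image block located
    at (bx, by) and a shifted block in the reference image."""
    h, w = block.shape
    best = (0, 0, np.inf)  # (dx, dy, SAD)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block falls outside the reference image
            cand = ref[y:y + h, x:x + w].astype(np.int32)
            sad = np.abs(block.astype(np.int32) - cand).sum()
            if sad < best[2]:
                best = (dx, dy, sad)
    return best  # motion vector and residual error for this block
```

The residual that is actually encoded is the difference between the predictive block and the reference block displaced by the returned motion vector.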
We use the following terms to present an overview of the hierarchi-cal approach:
Parent Images & Child Images: The new sets of transformed images computed from the original LFI using image manipulations and transformations in hierarchical compression schemes. The transformed images form parent image sets and child image sets, with the parent-child relationships between the sets creating a hierarchy.

Hierarchical schemes use image manipulation and transformation techniques on the LFI to compute a new set of images capturing the redundancies across the entire LFI. The new set of transformed images is partitioned into two subsets, parent images and child images, creating a hierarchy. The image manipulations and transformations (wavelet transforms, image warping, image filtering, and arithmetic manipulations) used to compute the new set of images and the exact parent-child relationships depend on the particular compression scheme. Typically, the parent images capture the common redundant details across the LFI and the children contain image-specific low-level details of the LFI. The parent subset is further processed recursively to compute the next level of the hierarchy.

The primary advantage of hierarchical methods is that they capture the global coherencies across distant images of the LFI, which the motion compensation schemes fail to capture. Due to the lack of an exhaustive search for redundancies, the global redundancies that are encapsulated in the parent images are limited by the image transformation and manipulation techniques employed in the compression scheme. Figure 2 (b) shows a high-level overview of a hierarchical LFI compression scheme.

Fig. 1. Overview of our HMLFC compression pipeline: The compression pipeline consists of different stages. In the first stage, a hierarchical compression scheme is applied to the LFI to compute levels of new, transformed parent and child images. In the next step, all the levels of the computed hierarchy are processed using a motion compensation scheme to compute a new motion-compensated hierarchy. In the final stage, the motion-compensated hierarchy is further processed and encoded to generate the compressed bit stream.
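The parent/child recursion described above can be illustrated with a minimal sketch. The per-pixel mean is used here only as a stand-in for the scheme-specific transformation, and all names are illustrative:

```python
import numpy as np

def build_hierarchy(images, group_size=4, height=2):
    """Generic hierarchical decomposition sketch: each group of images is
    reduced to one parent (here, the per-pixel mean) plus per-image child
    residuals; the parents are recursed on to form the next level.
    Returns the child-residual levels followed by the final parent set."""
    levels = []
    current = images
    for _ in range(height):
        parents, children = [], []
        for i in range(0, len(current), group_size):
            group = current[i:i + group_size]
            parent = np.mean(group, axis=0)             # parent: shared detail
            parents.append(parent)
            children.extend(g - parent for g in group)  # children: residual detail
        levels.append(children)
        current = parents
    levels.append(current)  # top-level parents are stored directly
    return levels
```

Any image is recovered exactly by summing its chain of residuals with the corresponding top-level parent, which is what gives such schemes their random-access decoding structure.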
Our goal is to develop a hybrid scheme that captures the benefits of motion compensation schemes in terms of local coherency and hierarchical schemes in terms of global coherency. We design a hybrid approach that is based on applying an additional layer of motion compensation to a hierarchical representation. To obtain good compression rates and provide random access capabilities, we need to address these issues:

(1) Once the redundancies across the LFI are captured globally, the properties of the parent and child images computed using a hierarchical scheme differ significantly from the properties of typical images or the original LFI. To effectively capture the local coherency, we need new motion compensation schemes that account for the properties of the transformed images in the hierarchy to achieve further compression.
(2) The resulting motion compensation scheme should conserve all the properties and benefits of the underlying hierarchical scheme, including the hierarchy structure and random access capability.
(3) The overhead of the additional costs of decompression after an additional layer of motion compensation should be minimal.
In this section, we describe our novel hybrid compression algorithm that captures local and global coherency and addresses the challenges highlighted above.
To tackle the limitations of the motion compensation methods and hierarchical methods, we combine both approaches to capture redundancies in both a global and a local fashion, resulting in better compression rates. In other words, we first apply a hierarchical approach to the LFI, gathering all the global coherencies, and then apply an additional layer of compression to the images at each level to capture the remaining redundancies using a motion-compensated search.
Our approach (HMLFC) uses a hierarchical compression scheme to capture the redundancies present across the entire set of LFI in a global fashion (global coherence). Next, we treat the parent images and each of the child subsets at all levels computed from the hierarchical scheme as separate subsets of images. Our goal is to apply motion compensation methods to each of the subsets independently and design a new scheme that exploits the properties of these subsets. The application of motion compensation to the child and parent subsets further exploits the local redundancies (local coherence) efficiently by using the exhaustive search that the hierarchical methods fail to exploit.

The decoding properties (random access, progressive decoding, and hardware decoding) of our hybrid approach depend mainly on the underlying hierarchical scheme used for computing the hierarchy and the motion compensation scheme. In the following sections we present an overview of the underlying hierarchical scheme and the details of the compression and decompression of our hybrid approach.
We choose the RLFC hierarchical compression algorithm described in Pratapa & Manocha [Pratapa and Manocha 2018] because it provides random access to the compressed data and allows hardware decompression. In addition to RLFC, we use a novel motion compensation scheme on the levels of the hierarchy. We maintain the random access property of RLFC after this motion compensation step.
Fig. 2. An example overview of our hybrid compression scheme. (a) Motion Compensation Scheme: The local coherencies are effectively captured in small regions (grids marked in green) of the LFI using an exhaustive search. A set of images from the LFI is selected as the reference images, highlighted in red (R_i, R_j). The rest of the images are marked as predictive images and are compressed from the reference images using motion compensation. (b) Hierarchical Scheme: The coherencies present across the entire LF are captured using image manipulations and transformations applied to the LFI in a global fashion. The levels of the hierarchy are marked, and arrows indicate parent-child relationships in the hierarchy. (c) Our Hybrid Approach: Once the global coherencies are encapsulated using the hierarchical scheme, the additional redundancies at each level of the hierarchy are captured as local coherencies by applying motion compensation at every level. At each level of the hierarchy, the reference images (R_i, R_j) are indicated using a bounding box. The predictive images associated with the reference images are indicated with arrows pointing towards the reference images.

In RLFC, the LFI are clustered based on the spatial locations of the samples. For each of the clusters, a new image referred to as the representative key view (RKV) is computed by filtering all the image samples in the cluster. The RKV images encapsulate the common details among all the images in a given cluster, and the set of RKVs from all the clusters forms a new level (parent images) in the hierarchy. After computing the new level of RKVs, the differences between the RKV and the images in the corresponding cluster are computed. The difference images are referred to as sparse residual views (SRV), and the new set of SRVs are the child images in the hierarchy. The SRVs are high-frequency images that contain the specific low-level details of the images that are not captured in the RKVs.
This process is recursively applied to the new RKVs until the tree height reaches a user-set level. We refer the readers to Pratapa & Manocha [Pratapa and Manocha 2018] for the exact details of the hierarchy and tree structure computed in RLFC.
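A single cluster step of this RLFC-style decomposition can be sketched as follows. RLFC's actual filter is not specified here, so a per-pixel mean stands in for it; note that the SRVs come out as signed images, which is what motivates the phase-shifted motion compensation described in the next section:

```python
import numpy as np

def encode_cluster(cluster):
    """Sketch of one RLFC-style cluster step: the representative key view
    (RKV) is a filtered combination of the cluster images (a per-pixel mean
    here, standing in for RLFC's actual filter), and each sparse residual
    view (SRV) is the signed difference image w.r.t. the RKV."""
    rkv = np.rint(np.mean(cluster, axis=0)).astype(np.int16)
    srvs = [img.astype(np.int16) - rkv for img in cluster]  # signed residuals
    return rkv, srvs

def decode_image(rkv, srv):
    """Exact reconstruction of one cluster image: image = RKV + SRV."""
    return (rkv + srv).astype(np.uint8)
```

Because the SRVs hold both negative and positive values, they need a wider integer type than the 8-bit input images, which is the dynamic-range adjustment mentioned in the implementation section.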
Notation:
We use the following notation for explaining the approaches: R_i denotes the i-th reference image in the motion compensation methods; P_i^k denotes the k-th predictive image associated with R_i; (x, y) denotes the motion vector pair; B_{P_i^k} denotes a block of pixels in the k-th predictive image; B_{R_i}^{xy} denotes the block of pixels at offset (x, y) in the reference image R_i; Δ represents the prediction residual error computed between the reference block and the predictive block.

As shown in Fig. 3 (left), the SRV images exhibit significant local coherency at each level of the hierarchy. The SRV images are computed as the difference between the RKV images at a given level and the images in the level below. Due to the difference computation, the pixels in the SRVs have both negative and positive intensity values. The negative and positive pixel intensity values correspond to inversions of pixel intensity values across the SRV image signals at a given level. We refer to these inversions as phase shifts in the SRV image signals. We present a new phase-shifted motion compensation to capture local coherencies in the levels of the hierarchy. These phase shifts in the SRV image signals need to be accounted for when applying motion compensation to the SRV images. More details about the phase shifts that occur in the SRV images are presented in the suppl. material, Sec. 1.

For a selected SRV reference image R_i at any given level, let {P_i^0, P_i^1, ..., P_i^n} denote the set of predictive SRV images associated with R_i. Each block B_{P_i^k} in the predictive image P_i^k is motion compensated by searching over a large search window W in the reference SRV image R_i using a pair of motion vectors (x, y).
We include the phase shifts in our motion prediction scheme by computing two residual errors for each block (number of pixels in a block: N) as follows:

\Delta_{xy}^{-} = \sum_{l=1}^{N} \left| B_{P_i^k}(l) - B_{R_i}^{xy}(l) \right|, \quad (1)

\Delta_{xy}^{+} = \sum_{l=1}^{N} \left| B_{P_i^k}(l) + B_{R_i}^{xy}(l) \right|. \quad (2)

For a given pair of motion vectors (x, y), we compute a subtractive prediction residual error \Delta_{xy}^{-} and an additive prediction residual error \Delta_{xy}^{+} to include possible phase shifts between the reference image and the predictive images in a given region.

\Delta^{-} = \min_{(x^-, y^-)} \Delta_{xy}^{-} \quad \forall x, y \in [-W, W],

\Delta^{+} = \min_{(x^+, y^+)} \Delta_{xy}^{+} \quad \forall x, y \in [-W, W].

The minimum subtractive prediction residual error \Delta^{-} and the minimum additive prediction residual error \Delta^{+} are computed for each block B_{P_i^k}; (x^-, y^-) and (x^+, y^+) are the motion vectors corresponding to the minimum prediction residuals. The final motion prediction residual error \Delta and the corresponding motion vectors are computed as follows:

\Delta = \min(\Delta^{-}, \Delta^{+}), \quad (x, y) = \begin{cases} (x^-, y^-) & \text{if } \Delta = \Delta^{-}, \\ (x^+, y^+) & \text{if } \Delta = \Delta^{+}. \end{cases}

Next, we perform a replacement step in which the original pixel values of the block B_{P_i^k} in the predictive SRV image P_i^k are replaced with the prediction residuals of the block. The replacement step modifies the SRV images in the original RLFC tree and computes a new HMLFC tree, but the tree structure and the hierarchy remain exactly the same as in the original RLFC tree. Figure 3 shows the predictive residual SRV images after applying our phase-inclusive motion compensation. The predictive residual SRV images computed after motion compensation are much sparser than the original SRV images. Therefore, the predictive residual SRV images can be compressed more aggressively without quality loss, resulting in better compression rates. A zoomed-in (16X) visual comparison between an original SRV image and the corresponding predictive residual SRV image after motion compensation is shown in the suppl. material, Sec. 5.

Fig. 3. (left) An SRV image cluster from level 0 of the RLFC hierarchy. The SRV images are visually similar to each other and exhibit a lot of coherency between them. We exploit this redundancy by applying motion compensation to achieve further compression. (right) The same SRV image cluster after applying motion compensation. The reference image is highlighted in red, and the rest of the images are residual difference SRV images after motion compensation. Compared to the original SRV images, the residual difference SRV images are sparser, leading to significantly better compression rates.
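The phase-shifted search defined by Eqs. (1) and (2) can be sketched as follows, evaluating both the subtractive and the additive residual for every candidate motion vector. Names, tie-breaking, and search order are illustrative assumptions, since the text does not fix them:

```python
import numpy as np

def phase_shifted_search(ref_srv, block, bx, by, window):
    """Phase-shifted motion search over signed SRV blocks: for each motion
    vector (x, y), both a subtractive residual |B_P - B_R| (Eq. 1) and an
    additive residual |B_P + B_R| (Eq. 2) are evaluated, so that
    sign-inverted (phase-shifted) matches in the reference SRV are found."""
    h, w = block.shape
    best = (0, 0, +1, np.inf)  # (x, y, sign, residual error)
    for y in range(-window, window + 1):
        for x in range(-window, window + 1):
            ry, rx = by + y, bx + x
            if ry < 0 or rx < 0 or ry + h > ref_srv.shape[0] or rx + w > ref_srv.shape[1]:
                continue  # candidate block outside the reference SRV
            ref_block = ref_srv[ry:ry + h, rx:rx + w]
            d_sub = np.abs(block - ref_block).sum()  # Eq. (1)
            d_add = np.abs(block + ref_block).sum()  # Eq. (2)
            for sign, err in ((+1, d_sub), (-1, d_add)):
                if err < best[3]:
                    best = (x, y, sign, err)
    return best  # the stored residual block would be block - sign * ref_block
```

With sign = -1, the reference block is negated before prediction, which is exactly the additive case of Eq. (2); the decoder only needs the motion vector plus this one sign bit to undo the step.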
The YCoCg [Malvar et al. 2008] color space is used in our implementation to decorrelate the RGB color channels, and the chroma channels are sub-sampled. The dynamic range (number of bits to store pixels) of the pixel values is adjusted accordingly to avoid any loss of information due to the transformations. The hierarchy computation and motion compensation are performed separately on all the channels. After the motion compensation, the top-level RKV images of the hierarchy are similar to standard (RGB) images and are compressed using JPEG2000 in lossless mode.

The compression rate and compression quality of the scheme are controlled by encoding parameters set as user input to the compression method. The main encoding parameters in the RLFC scheme are tree height, block size, and block thresholds. In addition to these three encoding parameters, another important parameter is the search window size used in the phase-shifted motion compensation to perform the exhaustive search. The SRV images at all levels of the hierarchy are divided into non-overlapping rectangular blocks of size block size, and motion compensation is applied to all the blocks as described in the previous section.

After applying the motion compensation to the hierarchy and computing the HMLFC tree, the SRV images (both residual difference SRVs and reference SRVs) are thresholded to discard insignificant data based on the two block thresholds set as encoding parameters. For each block in the SRV images, an energy value is computed by summing up the absolute values of the pixels in the block. If the energy is less than the user-set threshold, the block is marked as insignificant and not stored in the final compressed representation. Due to motion compensation, any losses introduced in the reference SRVs by thresholding propagate to the associated predictive SRVs. To avoid this, we use two independent block thresholds for the motion-compensated residual SRVs and the reference SRVs.
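The block-energy thresholding step can be sketched as follows. This is a simplified, illustrative version operating on one SRV image at a time; the two independent thresholds mentioned above would be supplied by the caller depending on whether the SRV is a reference or a residual:

```python
import numpy as np

def significant_blocks(srv, block_size, threshold):
    """Sketch of the SRV thresholding step: each block's energy is the sum
    of the absolute pixel values; blocks below the threshold are dropped,
    and only significant blocks (keyed by position) are kept for encoding."""
    kept = {}
    h, w = srv.shape
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            block = srv[by:by + block_size, bx:bx + block_size]
            if np.abs(block).sum() >= threshold:  # energy test
                kept[(by, bx)] = block
    return kept  # blocks missing from the map decode as all-zero blocks
```

Since most SRV blocks after motion compensation are near zero, discarding insignificant blocks is where much of the compression gain comes from.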
In the construction of the hierarchy, we need to perform lossless integer computations, and the final pixel values in all the images of the hierarchy are integers. BISE [Nystad et al. 2012] presents an efficient way of encoding a sequence of integer values within a fixed range [0, N − 1] that allows fast random-access decoding in constant time with minimal hardware. The straightforward solution for allowing fast hardware random access is to store the sequence of integers as bit strings of their binary representations. However, this solution is optimal only when N is a power of two, because it then uses log2 N bits per value, equal to the information present in each integer. Beyond the simple case when N is a power of two, BISE provides an efficient encoding that is close to the information-theoretic bound for other ranges of N. The significant blocks in the SRV images after thresholding in the HMLFC tree are encoded using BISE. The resulting formulation is easily supported by hardware and provides lossless computations.

We further process and compress the HMLFC tree and the additional motion vector values computed in the motion compensation step. The HMLFC tree is linearized using a breadth-first search (BFS) traversal, indexing all the SRVs in traversal order starting from the top of the tree. The BISE-compressed blocks of the SRV images in the tree are arranged in the same BFS-linearized order and appended to the compressed stream. To maintain the fast random-access property, we extend the application of BISE to encode the motion vector values. Compressing the motion vectors using BISE also preserves the fast hardware-decompressible property of our stream.
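The straightforward fixed-width scheme mentioned above can be sketched as follows. BISE itself additionally groups values into trits and quints for non-power-of-two N to approach the information-theoretic bound; that packing is omitted here.

```python
from math import ceil, log2

def pack(values, n):
    """Pack integers in [0, n-1] at a fixed width of ceil(log2(n)) bits.
    Optimal only when n is a power of two; for other ranges some bits
    are wasted, which is the gap BISE narrows."""
    width = max(1, ceil(log2(n)))
    bits = 0
    for i, v in enumerate(values):
        bits |= v << (i * width)
    return bits, width

def unpack_at(bits, width, i):
    """Constant-time random access: one shift and one mask,
    no sequential scan of earlier values."""
    return (bits >> (i * width)) & ((1 << width) - 1)

packed, w = pack([3, 0, 2, 1], n=4)   # n = 4 -> 2 bits per value
```

The constant-time `unpack_at` is what makes per-block random access cheap on the GPU.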
Decoding a block of pixel values at a particular location in the LFI consists of two main steps: (1) decoding the blocks from the HMLFC hierarchy using tree traversal; (2) applying motion re-compensation to the motion-compensated blocks. First, the top-level RKV images are decoded and stored in memory. To decode a block of pixels, we use the tree traversal procedure and collect the required BISE-compressed SRV blocks from the top level down to the bottom level. Using the block indices, we infer whether a block belongs to the predictive SRV images in the hierarchy. The original SRV blocks are computed from the predictive residuals and reference SRV images using motion re-compensation. The motion vector values for the corresponding block are decoded from the randomly accessible BISE-compressed motion vector stream. The final pixel values are computed by combining the SRV pixel values with the corresponding top-level RKV pixel values.
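The two-step decode above can be sketched at a high level. The data layout here (dicts and lists) is a toy stand-in for the actual HMLFC stream, and the helper names are assumptions for illustration only.

```python
import numpy as np

def motion_recompensate(residual, reference, y, x, mv):
    """Step 2: add back the reference SRV block at (y, x) offset by the
    decoded motion vector `mv` to recover the original SRV block."""
    dy, dx = mv
    h, w = residual.shape
    return residual + reference[y + dy:y + dy + h, x + dx:x + dx + w]

def decode_block(top_rkv, srv_path, y, x, bs, mv_stream, ref_srv):
    """Step 1: traverse from the top-level RKV down the SRV path,
    summing each level's contribution for the block at (y, x).
    Predictive residual blocks are re-compensated first."""
    block = top_rkv[y:y + bs, x:x + bs].astype(np.int32)
    for level, srv in enumerate(srv_path):
        contrib = srv[y:y + bs, x:x + bs]
        mv = mv_stream.get((level, y, x))
        if mv is not None:                    # predictive residual block
            contrib = motion_recompensate(contrib, ref_srv[level], y, x, mv)
        block = block + contrib
    return block

# Toy usage: one hierarchy level whose residual is empty; the block is
# fully predicted from the reference SRV with a zero motion vector.
top = np.full((4, 4), 100, dtype=np.int32)
srv0 = np.zeros((4, 4), dtype=np.int32)
refs = [np.ones((4, 4), dtype=np.int32)]
out = decode_block(top, [srv0], 0, 0, 2, {(0, 0, 0): (0, 0)}, refs)
```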
The pixels from the reference SRV images are necessary during decoding to perform the additional motion re-compensation step. To avoid the additional time required for decoding the reference SRV pixels, the reference SRV images at all levels are decoded and loaded into memory at the start of decompression. The SRV images in the hierarchy are highly sparse in terms of the pixel distribution present in the images (Fig. 3). We use a sparse matrix representation [Neelima and Raghavendra 2012] to store the decompressed reference images in memory while rendering. The sparse matrix representation reduces the additional run-time memory overhead required for the motion re-compensation step to decode a given block of pixels. Although there is a minor time overhead in reading pixels from sparse matrices, it is much smaller than the time required to decode the reference SRV pixels for motion re-compensation. The size of the additional memory overhead depends on two factors: the first comprises user-set encoding parameters, namely the number of levels in the hierarchy and the number of reference images in each level; the second is the sparsity of the SRV images, which depends on the details of the scene captured in the light field images.

, Vol. 1, No. 1, Article . Publication date: April 2017.
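A minimal CSR-style sketch of such a sparse representation is shown below; the class and its interface are illustrative, not the cited GPU representation itself.

```python
import numpy as np

class SparseImage:
    """Minimal CSR-style storage for a decoded reference SRV image.
    Only nonzero pixels are kept, trading a small per-lookup cost for
    a much smaller resident memory footprint."""
    def __init__(self, img):
        self.shape = img.shape
        self.row_ptr = [0]                    # prefix offsets per row
        self.cols, self.vals = [], []
        for r in range(img.shape[0]):
            nz = np.nonzero(img[r])[0]
            self.cols.extend(nz.tolist())
            self.vals.extend(img[r, nz].tolist())
            self.row_ptr.append(len(self.cols))

    def at(self, r, c):
        lo, hi = self.row_ptr[r], self.row_ptr[r + 1]
        for k in range(lo, hi):               # rows are short: linear scan
            if self.cols[k] == c:
                return self.vals[k]
        return 0                              # implicit zero

img = np.zeros((4, 4), dtype=np.int32)
img[1, 2] = 7
sp = SparseImage(img)
```

The sparser the reference SRVs, the smaller `cols`/`vals` become, which is why the overhead in the text depends on scene detail.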
The additional layer of motion compensation on top of the hierarchy results in additional decompression overhead beyond the tree traversal decoding operations needed to retrieve a block of pixels. Specifically, the HMLFC algorithm performs three additional basic operations: (1) bit manipulations required to decode the corresponding motion vectors; (2) loading the bytes of data (pixels) from the reference image in memory into registers; (3) arithmetic operations to compute the motion re-compensated block. Of these additional operations, only the memory load operations are slightly more expensive. In our parallel GPU decoding implementation and experiments (Sec. 5), we observed this overhead to be minimal.
Random access to the pixel values in the HMLFC tree is guaranteed by the tree traversal decoding operations described in Section 4.8.1. To decode a required pixel value, the block of pixel values containing it is decoded. Following that, only the motion vectors corresponding to the predictive residual blocks are retrieved from the BISE-compressed motion vector stream. Only a part of the compressed stream is decoded to retrieve the required blocks of pixels and the corresponding motion vector values, while the rest of the compressed stream remains intact. Our method also supports parallel decompression of different pixel values from the compressed stream, enabling fast GPU decoding. To retrieve a single pixel value of the LFI, a block of pixel values is decompressed. Because a whole block of pixels is decoded, our method benefits any LF rendering scheme by providing fast access to neighboring pixels for interpolation when computing new views. A set of new views computed for different camera positions and a given LF geometry is shown in suppl. material, Sec-8.

We identify two primary properties of the LFI that affect the final compression rate of our method, and we briefly discuss their relationship with the encoding parameters used in our approach: (1) the distance between the captured light field image samples; (2) the details of the scene captured in the light field. Supplementary material link: https://bit.ly/2K2b1Ba
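The random-access property boils down to locating the compressed bytes for one block without scanning the stream. A sketch under the assumption of a hypothetical per-block offset table built at load time:

```python
def block_offset(pixel, block_size, blocks_per_row, offsets):
    """Locate the compressed data for the block containing `pixel`.
    `offsets` is a hypothetical per-block offset table (not the paper's
    actual layout); random access means we touch only this one entry,
    leaving the rest of the compressed stream untouched."""
    y, x = pixel
    b = (y // block_size) * blocks_per_row + (x // block_size)
    return offsets[b]

# Zero-length ranges (repeated offsets) correspond to blocks that were
# thresholded away as insignificant and never stored.
offsets = [0, 40, 40, 96]
loc = block_offset((5, 2), block_size=4, blocks_per_row=2, offsets=offsets)
```

Because each lookup is independent, many GPU threads can decode different blocks in parallel.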
Table 1. Compression and quality computed using our HMLFC algorithm for several LF datasets from the Stanford light field archive. All the image samples are 24-bit color RGB images. For a similar PSNR quality, the compression rate varies for each LF depending on the details of the scene recorded in the LF.

LF Dataset (Resolution)     Size (MB)   Compression rate (bpp)   PSNR (dB)
Amethyst ( × × × )          576         0.045                    40.7
Bracelet ( × × × )          480         0.143                    40.1
Bunny ( × × × )             768         0.027                    41
Jelly Beans ( × × × )       384         0.029                    40.5
Lego Knights ( × × × )      768         0.157                    41
Lego Gallantry ( × × × )    480         0.155                    40.1
Tarot Cards ( × × × )       768         0.68                     40.3

Fig. 4. We compare HMLFC with RLFC and motion compensation schemes in terms of compression rates (bpp) for several datasets. The datasets are compressed to have a similar compression quality for each of the methods in comparison. Overall, HMLFC improves the compression rate by a factor of ∼ – × compared to prior schemes.

As the distance between the light field samples increases, the disparity of a real-world scene point in the pixel space of adjacent light field images also increases. The RKVs are computed as weighted (pixel-wise) filtering of nearby light field images; as the disparity gets higher, the correlation between corresponding pixels decreases. As a consequence, the redundancies captured in the RKVs decrease, leading to a decrease in the sparsity of the SRV images and more redundancy across the SRV images in a given level of the hierarchy. As a result, for a fixed search window size, the resulting bit rate increases as the sampling distance between LFI increases. For a scene with extensive details, the sampled light field images contain many high-frequency components. In this case, even for a small capture distance between light field images, the vast regions of high-frequency components mean the resulting SRV images have low-sparsity regions with large intensity values.
For a given block threshold, as the complexity of the scene increases, the resulting compression rate also increases, because the number of significant blocks in the SRV images grows. However, the redundancies in the high-frequency components of the SRV images in a level can be captured using a motion-compensated search.
We present the results from the evaluation of our hybrid approach and analyze its performance on the Stanford LF archives [Levoy and Hanrahan 1996; Wilburn et al. 2005]. We use peak signal-to-noise ratio (PSNR) [Ohm et al. 2012] for quality comparison (suppl. material, Sec-5) and bits per pixel (bpp) to present the compression rates. We compare our hybrid method with RLFC [Pratapa and Manocha 2018] and with a motion compensation scheme that enables random access, in terms of compression rates and compression quality. The motion compensation scheme is implemented based on Zhang & Li [Zhang and Li 2000].

In Table 1, the compression rates and PSNR values are shown for different datasets from the Stanford LF archive. For a similar PSNR quality, the compression rates vary from 0.029 bpp to 0.68 bpp due to the variation of the details captured in the scenes of the datasets. The encoding parameters are varied across the datasets to achieve a similar compression quality.

Figure 4 shows the comparison of compression rates for several datasets at similar compression quality (variation in PSNR quality 0. – 0. ). HMLFC improves the compression by ∼ – × compared to RLFC for datasets where RLFC provides better compression rates. For other datasets with complex and high-frequency details, HMLFC improves the compression by ∼ – × compared to RLFC and by ∼ – × over motion compensation schemes.

We analyze the rate-distortion properties of HMLFC by varying the following encoding parameters: block size, block threshold, and search window size. Table 2 shows the variation of the compression rate and resulting quality with a change in the block size. Increasing the block size with a fixed block threshold decreases the thresholding errors, which results in an increase of PSNR and bpp. Figure 5 shows the effect of varying the window size on the compression rates and compression quality. As the search window size increases, the predictive blocks find better matching blocks in the reference images, resulting in sparser predictive residuals. The increase in sparsity of the predictive residuals leads to a reduction in the compression rate. Better matching blocks in the reference images also lead to better compression quality and an increase in PSNR. The coherency between the predictive blocks and reference images is limited to a certain local region and diminishes beyond a certain search window size. As the window size grows past that range, we notice that the compression rate and compression quality saturate. If the spatial distance between the sampled light field images is large, we notice a large benefit in compression (Tarot, Bracelet) as the search window size increases.
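The exhaustive search within a window can be sketched as follows; this is a simplified illustration of the idea (plain SAD matching), not the paper's phase-shifted implementation.

```python
import numpy as np

def motion_search(block, reference, y, x, window):
    """Exhaustively test every offset within +/- `window` pixels of
    (y, x), keeping the candidate that minimizes the sum of absolute
    differences (SAD). Returns the motion vector and the residual.
    Larger windows can find better matches but cost O(window^2) SADs."""
    bs = block.shape[0]
    h, w = reference.shape
    best_mv, best_res, best_cost = (0, 0), None, None
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + bs > h or rx + bs > w:
                continue                      # candidate falls outside image
            cand = reference[ry:ry + bs, rx:rx + bs]
            cost = int(np.abs(block - cand).sum())
            if best_cost is None or cost < best_cost:
                best_mv, best_res, best_cost = (dy, dx), block - cand, cost
    return best_mv, best_res

# Toy usage: the block is an exact shifted copy of the reference, so the
# search recovers the shift and an all-zero (maximally sparse) residual.
ref = np.arange(64, dtype=np.int32).reshape(8, 8)
blk = ref[2:6, 3:7].copy()
mv, res = motion_search(blk, ref, 2, 2, window=2)
```

A sparser residual survives thresholding with fewer significant blocks, which is why a larger window tends to lower the bit rate until the local coherence runs out.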
The results of varying the search window size agree with the compression analysis presented in Section 4.8. As estimated in that analysis, and as presented in Table 1, the bit rate increases as the detail of the contents captured in the scene increases (example images of the datasets are shown in suppl. material, Sec-2). In the new Stanford LF archive, the Tarot Cards scene is captured with two different sampling distances (small and large) between the light field images. The compression rate on the dataset with the larger sampling distance is 0.68 bpp for a PSNR of 40.3 dB; on the dataset with the smaller sampling distance it is 0.47 bpp for a PSNR of 40.2 dB.

Table 2. The effect of varying the block size on the compression rate and quality. Increasing the block size for a fixed block threshold reduces the total number of thresholding errors, resulting in an increase of bit rate and PSNR. The block threshold is set to 75, the search window size is set to 16, and the tree height is set to 3.

LF Dataset   Metric   Block Size: 2   Block Size: 4   Block Size: 8
Amethyst     PSNR     38.7            43.69           48.35
             bpp      0.0592          0.106           0.707
Bunny        PSNR     40.52           43.35           47.52
             bpp      0.0173          0.0411          0.548
Bracelet     PSNR     37.03           44.15           48.79
             bpp      0.033           0.35            1.108
Knight       PSNR     38.27           43.046          47.89
             bpp      0.096           0.243           1.15

The variation of compression quality with compression rate for both HMLFC and RLFC is shown in Figure 6. The rate-distortion curves for both methods are computed by varying the block threshold. We notice that across different ranges of PSNR, HMLFC achieves better compression than RLFC. A visual quality comparison between RLFC and HMLFC is shown in Figure 7. The encoding time using our current single-threaded implementation varies from 30–90 minutes depending on the input size and resolution of the LFI. More compression evaluations of HMLFC (in comparison with RLFC) on the Heidelberg LF benchmark datasets [Honauer et al. 2016] are presented in suppl. material, Sec-6. Novel views not present in the original LFI, computed for new camera viewpoints, are presented in suppl. material, Sec-8.
Decompression Analysis:
We have implemented a GPU LF renderer (more details in suppl. material, Sec-7) using a basic ray-tracing method to test the implementation of our decompression scheme on an NVIDIA GTX-980. We tested the decompression scheme on the Lego Knights dataset compressed using the following encoding parameters: block size , tree height , and search window size 16. Our method renders new views of resolution 512 × 512 at ∼200 fps. The average frame rate to render new views at resolution 1024 × is ∼160 fps. Although HMLFC involves a few additional steps in decoding a block of pixels, it achieves frame rates similar to RLFC. We speculate that the decompression of RLFC is bottlenecked by the number of memory operations required to decode a block; the few additional memory operations needed for the extra motion re-compensation step in HMLFC have a negligible effect on the overall rendering performance. The decompression memory overhead to store the sparse matrix representation of the reference images while rendering is ∼800 KB for the Lego Knights dataset.
Fig. 5. We highlight the variation in the compression rates and compression quality of HMLFC with the change in the window size. (left) The increase in the search window size leads to better matching blocks, resulting in smaller prediction residual errors and better compression rates. (right) The prediction residual errors are reduced with an increase in the search window size, and the resulting compression quality increases. The block size is set to 4 and the tree height is set to 3. The block threshold is varied across the datasets to keep the PSNR within a certain range.
Fig. 6. The variation of compression quality with bit rate is highlighted for HMLFC and RLFC. HMLFC provides better compression rates for all the datasets over a range of PSNR values. The block size is set to 4, the tree height is set to 3, and the search window size is set to 16.
Conclusions:
We present a novel hybrid compression scheme that combines two prior compression methods, hierarchical schemes and motion compensation schemes, to encode LFI. Our approach captures the local and global coherencies in the LFI and improves the compression rate by a factor of ∼ – × without any significant loss in compression quality. Our scheme provides random access capability and can be used for interactive rendering on current GPUs. We have highlighted its benefits on standard benchmarks and observe compression rates of 30–800× with a PSNR of 40–45 dB.
Limitations:
Our approach has some limitations. The primary limitation of the hybrid approach is in designing a motion compensation scheme suitable for the transformed images in the hierarchy. Without motion compensation suited to the underlying hierarchy, the benefits of the hybrid combination might be limited. Another limitation of our method, as pointed out in the results (Sec. 5), is that the compression rate depends on the distance between the light field images in the light field samples. For light fields captured with a sparse sampling rate, the performance of our compression scheme is reduced. The current GPU decoder for our compression scheme is not optimized in terms of the memory operations required for decoding.
Future Work:
In the current implementation, we use per-pel motion compensation, i.e., the search for a matching block is performed at pixel-level precision. Using sub-pel motion compensation, i.e., searching for a matching block at sub-pixel precision using bilinear interpolation, could provide better compression rates. We have implemented the hybrid approach for one specific hierarchical scheme (RLFC), and we would like to extend and test our hybrid approach with other hierarchical compression schemes that allow random access (e.g., Peter & Straßer [Peter and Straßer 2001]). Adding depth information to the light field pixel data improves the rendering quality by a significant factor and provides more parallax, so extending our approach to compress depth information alongside image data is also a good direction for future work. Using the motion compensation vectors for parallax correction to reduce artifacts during LF rendering may also be possible. Our current implementation focuses on the 4D two-plane parameterization of light fields; in the future, we would like to extend our compression approach to more complex parameterizations such as spherical [Ihm et al. 1997], panoramic [Overbeck et al. 2018], and unstructured LF [Davis et al. 2012]. Our current implementation for encoding LFI is single-threaded and slow; the encoding speed can be improved by a large factor with a multi-threaded, parallelized implementation on the CPU or the GPU.
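Sub-pel matching needs reference samples at fractional coordinates; a standard bilinear fetch, sketched below, would suffice for half- or quarter-pixel offsets. This is an illustration of the idea, not part of the current implementation.

```python
import numpy as np

def bilinear(img, y, x):
    """Sample `img` at fractional (y, x) by bilinear interpolation,
    as a sub-pel motion search would when testing half-pixel offsets."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    y1 = min(y0 + 1, img.shape[0] - 1)        # clamp at the border
    x1 = min(x0 + 1, img.shape[1] - 1)
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bot = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bot

img = np.array([[0.0, 2.0], [4.0, 6.0]])
val = bilinear(img, 0.5, 0.5)                 # centre of the 2x2 patch
```

Interpolated candidates make the search space denser, so residuals can only get sparser at the cost of extra arithmetic per candidate.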
Fig. 7. We show a zoomed-in comparison between decoded images from the different schemes. A small region of size × marked in a red box is selected and scaled up to × for visual quality comparison. The PSNR and bpp values for each of the methods are mentioned in the figure. For the same PSNR values, we find no additional visual degradation in HMLFC compared to RLFC and motion compensation. The factor of improvement of HMLFC over RLFC is highlighted in the bracket.

REFERENCES
Edward H. Adelson and James R. Bergen. 1991. The Plenoptic Function and the Elements of Early Vision. In Computational Models of Visual Processing. MIT Press, 3–20.
Jin-Xiang Chai, Xin Tong, Shing-Chow Chan, and Heung-Yeung Shum. 2000. Plenoptic sampling. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co., 307–318.
Chuo-Ling Chang, Xiaoqing Zhu, Prashant Ramanathan, and Bernd Girod. 2006. Light field compression using disparity-compensated lifting and shape adaptation. IEEE Transactions on Image Processing 15, 4 (2006), 793–806.
Jie Chen, Junhui Hou, and Lap-Pui Chau. 2018. Light Field Compression With Disparity-Guided Sparse Coding Based on Structural Key Views. IEEE Transactions on Image Processing 27, 1 (2018), 314–324.
Abe Davis, Marc Levoy, and Fredo Durand. 2012. Unstructured light fields. In Computer Graphics Forum, Vol. 31. Wiley Online Library, 305–314.
T. Ebrahimi, S. Foessel, F. Pereira, and P. Schelkens. 2016. JPEG Pleno: Toward an Efficient Representation of Visual Reality. IEEE MultiMedia 23, 4 (Oct 2016), 14–20. https://doi.org/10.1109/MMUL.2016.64
Simon Fenney. 2003. Texture compression using low-frequency signal modulation. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware (HWWS '03). Eurographics Association, 84–91. http://dl.acm.org/citation.cfm?id=844174.844187
Bernd Girod, Chuo-Ling Chang, Prashant Ramanathan, and Xiaoqing Zhu. 2003. Light field compression using disparity-compensated lifting. In Acoustics, Speech, and Signal Processing (ICASSP '03), 2003 IEEE International Conference on, Vol. 4. IEEE, IV–760.
Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F. Cohen. 1996. The Lumigraph. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96). ACM, New York, NY, USA, 43–54. https://doi.org/10.1145/237170.237200
Katrin Honauer, Ole Johannsen, Daniel Kondermann, and Bastian Goldluecke. 2016. A dataset and evaluation methodology for depth estimation on 4D light fields. In Asian Conference on Computer Vision. Springer, 19–34.
Insung Ihm, Sanghoon Park, and Rae Kyoung Lee. 1997. Rendering of spherical light fields. In Computer Graphics and Applications, 1997. Proceedings., The Fifth Pacific Conference on. IEEE, 59–68.
A. Jagmohan, A. Sehgal, and N. Ahuja. 2003. Compression of light field rendered images using coset codes. In Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference on, Vol. 1. IEEE, 830–834.
Shinjini Kundu. 2012. Light field compression using homography and 2D warping. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 1349–1352.
Marc Levoy and Pat Hanrahan. 1996. Light Field Rendering. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96). ACM, New York, NY, USA, 31–42. https://doi.org/10.1145/237170.237199
Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, et al. 2000. The digital Michelangelo project: 3D scanning of large statues. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co., 131–144.
Zhouchen Lin and Heung-Yeung Shum. 2000. On the number of samples needed in light field rendering with constant-depth assumption. In Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on, Vol. 1. IEEE, 588–595.
Dong Liu, Lizhi Wang, Li Li, Zhiwei Xiong, Feng Wu, and Wenjun Zeng. 2016. Pseudo-sequence-based light field image compression. In Multimedia & Expo Workshops (ICMEW), 2016 IEEE International Conference on. IEEE, 1–4.
M. Magnor and B. Girod. 1999. Hierarchical coding of light fields with disparity maps. In Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348), Vol. 3. 334–338. https://doi.org/10.1109/ICIP.1999.817130
Marcus Magnor and Bernd Girod. 2000. Data compression for light-field rendering. IEEE Transactions on Circuits and Systems for Video Technology 10, 3 (2000), 338–343.
H. S. Malvar, G. J. Sullivan, and S. Srinivasan. 2008. Lifting-Based Reversible Color Transformations for Image Compression. In SPIE Applications of Digital Image Processing. International Society for Optical Engineering. http://research.microsoft.com/apps/pubs/default.aspx?id=102040
B. Neelima and Prakash S. Raghavendra. 2012. Effective sparse matrix representation for the GPU architectures. International Journal of Computer Science, Engineering and Applications 2, 2 (2012), 151.
Jorn Nystad, Anders Lassen, Andy Pomianowski, Sean Ellis, and Tom Olson. 2012. Adaptive scalable texture compression. In Proceedings of the Fourth ACM SIGGRAPH/Eurographics Conference on High-Performance Graphics. Eurographics Association, 105–114.
J. R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand. 2012. Comparison of the Coding Efficiency of Video Coding Standards, Including High Efficiency Video Coding (HEVC). IEEE Transactions on Circuits and Systems for Video Technology.
Overbeck et al. 2018. In ACM SIGGRAPH 2018 Virtual, Augmented, and Mixed Reality (SIGGRAPH '18). ACM, New York, NY, USA, Article 32, 1 pages. https://doi.org/10.1145/3226552.3226557
Cristian Perra and Pedro Assuncao. 2016. High efficiency coding of light field images based on tiling and pseudo-temporal data arrangement. In Multimedia & Expo Workshops (ICMEW), 2016 IEEE International Conference on. IEEE, 1–4.
Ingmar Peter and Wolfgang Straßer. 2001. The wavelet stream: Interactive multiresolution light field rendering. In Rendering Techniques 2001. Springer, 127–138.
Srihari Pratapa and Dinesh Manocha. 2018. RLFC: Random Access Light Field Compression using Key Views. CoRR abs/1805.06019 (2018). arXiv:1805.06019 http://arxiv.org/abs/1805.06019
I. Viola, M. Řeřábek, and T. Ebrahimi. 2017. Comparison and Evaluation of Light Field Image Coding Approaches. IEEE Journal of Selected Topics in Signal Processing 11, 7 (Oct 2017), 1092–1106. https://doi.org/10.1109/JSTSP.2017.2740167
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville Talvala, Emilio Antunez, Adam Barth, Andrew Adams, Mark Horowitz, and Marc Levoy. 2005. High performance imaging using large camera arrays. In ACM Transactions on Graphics (TOG), Vol. 24. ACM, 765–776.
J. Yu. 2017. A Light-Field Journey to Virtual Reality. IEEE MultiMedia 24, 2 (Apr 2017), 104–112. https://doi.org/10.1109/MMUL.2017.24
Cha Zhang and Jin Li. 2000. Compression of lumigraph with multiple reference frame (MRF) prediction and just-in-time rendering. In