Deep Anti-aliasing of Whole Focal Stack Using its Slice Spectrum
Yaning Li, Xue Wang, Guoqing Zhou, and Qing Wang
Northwestern Polytechnical University
[email protected], [email protected]
Abstract
The paper aims at removing the aliasing effects for the whole focal stack generated from a sparse 3D light field, while keeping the consistency across all the focal layers. We first explore the structural characteristics embedded in the focal stack slice and its corresponding frequency-domain representation, i.e., the focal stack spectrum (FSS). We observe that the energy distribution of the FSS always locates within the same triangular area under different angular sampling rates, and that the continuity of the point spread function (PSF) is intrinsically maintained in the FSS. Based on these two findings, we propose a learning-based FSS reconstruction approach that removes aliasing over the whole focal stack in a single pass. Moreover, a novel conjugate-symmetric loss function is proposed for the optimization. Compared to previous works, our method avoids explicit depth estimation and can handle challenging large-disparity scenarios. Experimental results on both synthetic and real light field datasets show the superiority of the proposed approach for different scenes and various angular sampling rates.
1. Introduction
Light field imaging technology [13] enables digital refocusing at different focal planes after the time of capture. Basically, this is performed by integrating a light field over the angular domain, which corresponds to the slice operation in the frequency domain [22]. However, with a sparse angular coverage, i.e., when the disparity between adjacent views is more than one pixel [5], there will be significant aliasing artifacts in the out-of-focus regions of the refocused images [28], as shown in Fig.1(a).

To enhance visual quality, many approaches have been proposed [8, 28, 18, 6, 20, 7] to remove the aliasing effects based on view interpolation [6], depth-based filtering [5], or multi-scale fusion [28]. However, since most of them rely on depth estimation [19, 4], inaccurate depth maps cause severe degradation of the anti-aliasing performance. Moreover, existing methods aim at removing
Figure 1. Aliasing effects and aliasing-removed results. (a) Input with aliasing; the view count is ×. (b) Anti-aliasing output by our method. From top to bottom: refocused image at a certain depth, focal stack slice along the red line, and corresponding FSS.

aliasing artifacts from an individual refocused image, which corresponds to a specific depth layer in the whole focal stack. Without taking all the layers into consideration, they cannot provide consistent enhancement over the focal stack (as shown in the second row of Fig.1 and in Fig.12). Namely, along the focal direction in the focal stack, the PSF-continuity cannot be maintained well. This becomes more critical for refocused images with large disparities and complex occlusions.

In this paper, we focus on exploring the structural characteristics embedded in the focal stack and its corresponding Fourier spectrum. Different from EPIs (a 2D representation for light fields in the spatial domain), where the slopes of the EPI lines vary with depth, for a given light field the FSSs for different depths share the same cone-shaped pattern (as shown in Fig.3). In other words, the energy distribution of the FSS locates within the same triangular area. Furthermore, the PSF-continuity is intrinsically maintained in the FSS. These important characteristics of the FSS make it possible to exploit a unified anti-aliasing scheme for all depth contents.

The main contributions of the paper are:
1) Two important characteristics of the frequency-domain representation for the light field focal stack are explored.
The FSS preserves the PSF-continuity and provides the same bounds of spectral support along the focal axis under different angular sampling rates.
2) A deep FSS-based anti-aliasing algorithm is proposed to remove aliasing for all the refocused layers in one pass while preserving the consistency between different focal layers; only a rough depth range is needed.
3) A robust conjugate-symmetric loss function is defined in the U-Net for the optimization.
2. Related Work
The 4D light field L(u, v, x, y) [18, 10] records light rays in a 3D space. So far, much research has been devoted to analyzing the characteristics of 4D light field sampling and digital refocusing. Ng [21] pointed out that the spectrum of a light field concentrates on a 3D manifold and that each focal image can be synthesized by applying an inverse Fourier transform to a 2D slice of the manifold. Shi et al. [25] exploited the sparsity of a light field, from which a dense light field is reconstructed by applying the sparse Fourier transform. Considering the regular 2D mesh sampling structure of the light field, Levin and Durand [17] analyzed the dimensional gap between the 3D focal stack and the light field and proposed to inpaint the light field spectrum from the focal stack. Isaksen et al. [13] re-parameterized the light field as a 3D focal stack in the spatial domain. Dansereau et al. [7] extended the capability of refocusing from one single depth layer to a volumetric range of depths by replacing the slice operation with a depth-dependent band-pass filter.

However, due to the limitation of the sensor size, there always exists a trade-off between the spatial and angular resolutions, which results in aliasing artifacts in digital refocusing for angularly undersampled light fields [22, 26]. Considering the formation of aliasing effects, it is straightforward to perform anti-aliasing by angular super-resolution [14, 27]. Thus aliasing removal requires either abundant samples or appropriate filters in the spatial or frequency domains.
Spatial-domain methods.
Levoy and Hanrahan [18] proposed prefiltering to reduce the spatial artifacts. Chang et al. [6] proposed an anti-aliasing method that interpolates angular samples within each sampling interval using depth information. Bishop et al. [3] eliminated the aliasing by fusing multiview information. Xiao et al. [28] further analyzed the aliasing in the spatial domain; they first detected aliasing and then used a multi-scale fusion method to remove it. Lin et al. [19] analyzed the symmetry characteristic of the light field focal stack in the spatial domain and proved that depth-dependent light field rendering can be used to reduce aliasing.
Frequency-domain methods.
Isaksen et al. [13] first proposed a frequency-planar light field filter. Chai et al. [5] presented a comprehensive analysis of the trade-off between sampling density and depth resolution. Ng [21] suggested that band-limited filtering in the frequency domain followed by slicing could effectively inhibit the aliasing effect. Based on the focal stack and a sparse collection of views, Levin and Durand [17] employed the focal manifold in the derivation of 2D deconvolution kernels in the 4D Fourier spectrum. Dansereau et al. [7] presented linear volumetric focus for light field cameras and derived the regions of support in the frequency domain; they employed a simple, linear single-step filter to combine information over the light field.

In summary, previously published methods focus on the depth-dependent characteristics in the original spectrum of light fields, which is prone to depth errors. Additionally, these methods tackle each refocused image individually, so they cannot provide a PSF-continuous anti-aliased focal stack (as shown in Fig.12). Different from these methods, scenarios with different disparities share the same cone-shaped pattern in the proposed FSS representation, supporting a unified depth-independent solution that produces the PSF-continuous anti-aliased focal stack in one single pass.
3. Focal Stack Spectrum (FSS)
In this section, we first elucidate the way to obtain the FSS from a light field, and then analyze the characteristics of the FSS. Without loss of generality, a 2D light field EPI instead of the full 4D one is used here for better demonstration.
For better understanding, the notations used in the paper are given in Tab.1. E(u, x) denotes a 2D light field, where u and x refer to the angular and spatial dimensions, respectively. E_d(u, x) denotes the EPI sheared at a specific disparity d,

    E_d(u, x) = E(u, x + d(u − u_ref)),   (1)

where u_ref refers to the reference view (the central view is selected as the reference view in this paper). Once the sheared EPI with arbitrary disparity f ∈ [d_min, d_max] is integrated over all views, the focal stack F(f, x) is formed,

    F(f, x) = ∫ E_f(u, x) du = ∫ E(u, x + f(u − u_ref)) du.   (2)

Figure 2. Characteristic analyses of the EPI in the u-x space, the focal stack in the f-x space and the FSS in the ω_f-ω_x space. (a)-(c) Original and sheared EPIs at different disparities. (d) The focal stack without aliasing, where ratio is the ratio of down-sampling. (e) Aliased focal stack. (f) Corresponding FSS of (e). For better visualization, only one single image pixel P is considered here.

Table 1. Notations used in the paper
E(u, x)        2D light field EPI
E_d(u, x)      Sheared EPI at the disparity d
u              Angular coordinate
x              Spatial coordinate
f              Focal layer's disparity
F(f, x)        Focal stack formed by E(u, x)
d              Disparity during the shearing process
F(ω_f, ω_x)    FSS
FT(·)          Fourier transform operator
u_ref          Reference view
N_u            View number
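As a concrete illustration of Eqns.1 and 2, the shear-and-integrate refocusing can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation; the function names and the use of linear interpolation for sub-pixel shears are our own assumptions.

```python
import numpy as np

def refocus_slice(epi, f, u_ref=None):
    """One focal-stack row F(f, x): shear the EPI E(u, x) by disparity f
    (Eqn. 1) and integrate over the angular dimension (Eqn. 2)."""
    n_u, n_x = epi.shape
    if u_ref is None:
        u_ref = n_u // 2  # central reference view, as in the paper
    x = np.arange(n_x)
    sheared = np.empty((n_u, n_x), dtype=np.float64)
    for u in range(n_u):
        # E(u, x + f * (u - u_ref)), with linear sub-pixel interpolation
        sheared[u] = np.interp(x + f * (u - u_ref), x, epi[u].astype(np.float64))
    return sheared.mean(axis=0)

def focal_stack(epi, disparities):
    """Stack refocused rows over a disparity range -> F(f, x)."""
    return np.stack([refocus_slice(epi, f) for f in disparities])
```

Integrating a constant EPI leaves a constant row, and a point in focus (EPI line perpendicular to the x-axis after shearing) stays sharp, matching the discussion of Fig.2.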
Subsequently, the Fourier form of F(f, x) is

    F(ω_f, ω_x) = FT(F(f, x)),   (3)

where FT(·) refers to the 2D Fourier transform operator. Specifically, the view number of the given light field is N_u.

Fig.2(a) shows the EPI of a point with disparity d*. The EPI is sheared with d* (Fig.2(b)) according to Eqn.1. Since the sheared EPI line is perpendicular to the x-axis, there is no defocus blur or aliasing at point P in the refocused image, as shown in Fig.2(e). Then the original EPI is sheared with d* + α (see Fig.2(c)). Now the EPI line is not perpendicular to the x-axis, and defocus blur or aliasing appears in the focal stack, as shown in Fig.2(d) and (e) respectively. The radius of the defocus blur or aliasing increases with α, and a triangle is formed in the focal stack slice. Given a 3D light field with N_u views, the apex angle ϕ of the triangle has two forms,

    ϕ = { 2 arctan((N_u − 1)/2),       continuous focal stack,
        { 2 arctan(Δα(N_u − 1)/2),     discrete focal stack,   (4)

where Δα is the disparity gap between neighboring focal planes during the construction of the focal stack.

The diameter l of the defocus blur (or aliasing), i.e., the interval between the points P_1 and P_{N_u} in Fig.2(e), can be calculated by

    l_{P_1, P_{N_u}} = α(N_u − 1).   (5)

Note that the aliasing only occurs when α(N_u − 1) > N_u − 1 holds, i.e., when the spacing between adjacent integrated samples exceeds one pixel (|α| > 1); otherwise defocus blur appears. In other words, as long as the scene has texture information, aliasing always exists in the refocused image whenever the shear parameter |α| is large enough.

Revisiting Eqn.1, it is found that the aliasing points P_1, P_2, ..., P_{N_u} come from the views u_1, u_2, ..., u_{N_u} respectively. The slope of each aliasing line P P_i in Fig.2(e) can be computed by

    Slope(P P_i) = { 1/(u_i − u_ref),        continuous focal stack,
                 { 1/(Δα(u_i − u_ref)),      discrete focal stack.   (6)
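A small numeric helper makes Eqn.5 and the aliasing condition concrete. The threshold |α| > 1 is our reading of the condition above, and the function names are hypothetical.

```python
def blur_diameter(alpha, n_views):
    """Diameter of the defocus blur / aliasing footprint (Eqn. 5):
    l = |alpha| * (N_u - 1), in pixels, for refocus offset alpha."""
    return abs(alpha) * (n_views - 1)

def is_aliased(alpha):
    """Aliasing (rather than smooth defocus blur) appears once the gap
    between adjacent integrated samples exceeds one pixel, i.e. |alpha| > 1
    -- our reading of the condition derived from Eqn. 5."""
    return abs(alpha) > 1.0
```

For example, with N_u = 9 views and α = 0.5, the footprint spans 4 pixels but is rendered as smooth defocus blur; with α = 2 the same geometry produces visible aliasing stripes.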
According to Eqns.4 and 6, it is found that in the focal stack slice:
a) The shape of the defocus blur or aliasing line is determined by the refocus parameter Δα and the view index, and is independent of the scene depth.
b) All aliasing lines from the same view have the same slope (as shown by the lines P P_i in Fig.2(e)).

Figure 3. Different angular sampling rates and representations for a light field. From top to bottom: EPI and its spectrum, focal stack and its FSS. From left to right: (a) Intensive sampling with 121 views. (b) 5× downsampling. (c) 15× downsampling.

Given a focal stack formed from an N_u-view light field, there are N_u frequency lines in the FSS according to the property of the Fourier transform [2]: each line corresponds to a specific view, and more views lead to more lines. Consequently, the FSS has the following properties:
a) The shape of the FSS is determined by the refocus parameter Δα and the view index, and is independent of the scene depth.
b) According to the property of the Fourier transform [9], the FSS is conjugate symmetric.

Fig.3 shows the comparisons between the aliased EPI spectrum and the aliased FSS under different sampling rates. Taking a closer look at the EPI spectrum (2nd row) and the FSS (last row) in Fig.3, as the number of views decreases, more repeating areas appear in the EPI spectrum, while the structural distribution of the FSS remains. Additionally, there is a one-to-one correspondence between the lines in the FSS and the views when Δα is fixed. As shown in Fig.3(c) (the rightmost column), 9 views in the EPI correspond to 9 lines in the FSS.
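The FSS of Eqn.3 and its conjugate symmetry (property b above) can be checked numerically. This sketch assumes NumPy's FFT conventions and is not tied to the paper's implementation.

```python
import numpy as np

def fss(focal_stack_slice):
    """Focal stack spectrum: 2D Fourier transform of F(f, x) (Eqn. 3)."""
    return np.fft.fft2(focal_stack_slice)

# Conjugate symmetry of the spectrum of a real-valued slice:
# F(w_f, w_x) == conj(F(-w_f, -w_x)).
F = fss(np.random.rand(8, 16))
# flip + roll maps index (i, j) to the negative-frequency index (-i, -j) mod N
flipped = np.roll(np.flip(F), shift=(1, 1), axis=(0, 1))
assert np.allclose(F, np.conj(flipped))
```

The symmetry holds for any real-valued slice, which is exactly what the conjugate-symmetric loss of Sec.4 enforces on the reconstructed spectrum.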
4. Proposed Method
Aliasing effects are caused by insufficient angular sampling. According to the analysis in Sec.3.2, there is a correspondence between the number of views and the FSS, so the anti-aliasing problem can be tackled as a spectrum completion problem. The whole pipeline of the proposed FSS-based anti-aliasing algorithm is shown in Fig.6. (The Fourier transform tells us that the energy of all lines with the same slope in the spatial domain concentrates on a perpendicular line passing through the origin.)
Figure 4. The structure of the network.

Figure 5. U-Net architecture.
Specifically, the aliased focal stack F_a(f, x) is obtained from the undersampled EPI using Eqn.2, then the Fourier transform operator FT(·) is applied to obtain the FSS F_a(ω_f, ω_x). Finally, a CNN φ parameterized by σ is proposed to reconstruct the aliasing-removed FSS F(ω_f, ω_x) from F_a(ω_f, ω_x); F_gt is the ground truth. The parameters σ are optimized by

    arg min_σ ||F_gt − φ_σ(F_a)||.   (7)

As shown in Fig.4, a dual-stream U-Net is designed to deal with complex-number inputs. The power spectrum and phase angle are first fed into two sub-networks respectively; then the features are combined using Euler's formula to obtain the real and imaginary parts, which are concatenated into another network for optimization. Fig.5 shows the details of the U-Net proposed in this paper.

The loss function is

    loss = ||F − F_gt|| + λ · loss_s,   (8)

where loss_s constrains the conjugate symmetry of the reconstructed FSS. The scalar λ is set to 1.5 to balance the weights of the two loss terms,

    loss_s = (1 / (N_C · W)) Σ_{i=0}^{N_C−1} Σ_{j=0}^{W−1} |F(ω_i, ω_j) − F*(−ω_i, −ω_j)|,   (9)

where |·| refers to the norm of a complex number and * indicates the standard complex conjugate. N_C and W are the number of refocus layers and the image width, respectively.

The complete FSS-based deep anti-aliasing algorithm is summarized in Algorithm 1.

Figure 6. The pipeline of the proposed FSS-based anti-aliasing algorithm.
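The Euler-formula recombination of the two streams and the conjugate-symmetric loss of Eqn.9 can be sketched in NumPy as below. The names `euler_combine` and `conjugate_symmetry_loss` are hypothetical, and the actual training code would operate on TensorFlow tensors rather than NumPy arrays.

```python
import numpy as np

def euler_combine(power, phase):
    """Recombine the two network streams into a complex spectrum via
    Euler's formula: sqrt(power) * (cos(phase) + i*sin(phase))."""
    mag = np.sqrt(power)
    return mag * (np.cos(phase) + 1j * np.sin(phase))

def conjugate_symmetry_loss(F):
    """Eqn. 9: mean of |F(w_i, w_j) - conj(F(-w_i, -w_j))| over the
    spectrum. It vanishes exactly when the spectrum could come from a
    real-valued focal stack slice."""
    n_c, w = F.shape
    # flip + roll maps index (i, j) to the negative-frequency index (-i, -j)
    F_neg = np.roll(np.flip(F), shift=(1, 1), axis=(0, 1))
    return np.abs(F - np.conj(F_neg)).sum() / (n_c * w)
```

The total objective then follows Eqn.8, e.g. `loss = np.abs(F - F_gt).sum() + 1.5 * conjugate_symmetry_loss(F)` with λ = 1.5 as in the paper.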
Algorithm 1: The FSS-based deep anti-aliasing algorithm
Input: Angularly undersampled 3D light field L(u, x, y) with N_u views of H × W pixels.
Output: Anti-aliasing 3D focal stack.
  for y = 1 to H do
    Get the 2D light field E(u, x).
    Obtain the focal stack slice F_a(f, x) by Eqn.2.
    Get the aliased FSS F_a(ω_f, ω_x) by Eqn.3.
    Reconstruct the non-aliased FSS F(ω_f, ω_x) with the dual-stream U-Net.
    Perform an inverse Fourier transform on F(ω_f, ω_x).
  end for
  Assemble the anti-aliasing 3D focal stack for the light field.
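Algorithm 1 can be summarized as the following self-contained, runnable sketch. The `net` argument stands in for the dual-stream U-Net (an identity placeholder here, purely hypothetical), and all function names are our own.

```python
import numpy as np

def anti_alias_focal_stack(lf, disparities, net=lambda F: F):
    """Sketch of Algorithm 1 for a 3D light field L(u, x, y) of shape
    (N_u, W, H). For each row y: build the aliased focal stack slice,
    take its FSS, reconstruct it with `net`, and invert the transform."""
    n_u, w, h = lf.shape
    u_ref = n_u // 2                 # central reference view
    x = np.arange(w)
    out = np.empty((h, len(disparities), w))
    for y in range(h):
        epi = lf[:, :, y].astype(np.float64)       # 2D light field E(u, x)
        # aliased focal stack slice F_a(f, x), Eqn. 2 (shear + integrate)
        fs = np.stack([
            np.mean([np.interp(x + f * (u - u_ref), x, epi[u])
                     for u in range(n_u)], axis=0)
            for f in disparities])
        F_a = np.fft.fft2(fs)                      # aliased FSS, Eqn. 3
        out[y] = np.fft.ifft2(net(F_a)).real       # reconstructed slice
    return out
```

With the identity placeholder the round trip reproduces the aliased stack; plugging in a trained spectrum-completion network at `net` yields the anti-aliased one.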
5. Experimental Results
In this section, we report experimental results of the proposed FSS-based anti-aliasing algorithm on both synthetic and real light fields. The robustness of the proposed method is first analyzed on light fields with different sampling rates. Then, an ablation experiment is conducted to verify the effectiveness of the conjugate-symmetric loss. Finally, both quantitative and qualitative comparisons with SOTAs are provided to demonstrate the advantages of the proposed method. Additional experiments on light fields captured by a camera array further verify the generalization of the proposed method.
To train and verify the proposed network, both synthetic and real light fields are used. For synthetic data, 6 light fields are rendered using POV-Ray [23], 4 for training and 2 for testing. For real data, the high angular resolution light field dataset [11] is used, 10 for training and 2 for testing. Note that only 121 views are used. Additionally, the Stanford [1] and Disney [15] light fields are used to verify the performance of the proposed method on unseen light fields captured by a camera array. Tab.2 shows the resolutions. Notice that the spatial resolutions of the Couch and Church light fields are resized in the experiments.
Table 2. Experiment and data parameters of different light fields.
Data           Range of d        Angular Res.   Spatial Res.
Syn. LF        [-1.00, 0.98]     121            526 ×
Real LF [11]   [-1.00, 0.98]     121            376 ×
Couch [15]     [-2.60, -0.60]    101            628 ×
Church [15]    [-1.45, -0.53]    101            670 ×
Lego [1]       [-1.00, 0.98]     17             1024 ×

Figure 7. Aliasing-removed results for different downsampling settings. From top to bottom: 5× and 15× downsampling. (a) Anti-aliasing on the focal stack. (b) Reconstructed FSS. (c) Error map of the FSS. (d) Error map of the focal stack.

Please refer to the supplementary material for the selection of light fields.

In order to verify the capability of the proposed method for large disparities, we conduct experiments with different angular sampling rates. At present, only single-direction disparity is concerned, so the 2D EPI image can be used to represent the input light field. The details of the tested light fields under 5× and 15× downsampling scales are introduced in the supplementary material. For each light field in the experiment, the focal stack F(f, x) is constructed by performing refocusing operations 199 times with Δα = 0.01. Tab.2 shows the ranges of the refocus operations (d).

The network converges after 150 epochs, where each epoch contains 30 iterations. The Adam optimizer [16] is used for iterative optimization. The learning rate is initially set to e−. The first and second moments of the gradients are set to 0.9 and 0.99 respectively to enable adaptive learning rates. The network is implemented using the TensorFlow framework with 7 GTX 1080Ti GPUs.

Spectrum Domain. In this subsection, we demonstrate the performance of our approach with respect to different sampling rates.

Table 3. Average spectral energy loss on different test scenarios.
Dataset          5×        15×
Syn. LFs         2.03%     3.58%
Real LFs [11]    2.50%     3.13%
Lego [1]         0.56% (2×)
Couch [15]       2.71% (10×)
Church [15]      0.67% (10×)

Comparing Fig.3(b) and (c) with Fig.7, we can see that, for different sampling rates, our method achieves preferable anti-aliasing rendering performance in both the spatial and frequency domains. It is important to note that the errors in the frequency domain mainly come from the direct component (DC) of the spectrum. The main reason is that the energy of the DC is much larger than that of the other components [9], and deep learning is more inclined to learn the low-frequency components of an input signal [24]. The differences in DC cause color inconsistency during the FSS reconstruction. To tackle this problem, we replace the DC of the output spectrum with that of the input spectrum.

Tab.3 shows the average spectral energy loss on different test scenarios. Comparing the last column of Tab.5 with Tab.3, we find that the average spectral energy loss plays an important role in the PSNR while having little effect on the SSIM. According to [29], human eyes are more sensitive to structural similarity than to PSNR. This is especially important for refocused images, where errors in the defocus blur are hard for people to distinguish. So a higher SSIM in Tab.5 better demonstrates the advantages of the proposed method.

Image Domain. Fig.8 shows the results of anti-aliasing in different focal images under the 5× and 15× downsampling settings respectively. In the 5× downsampling part, the proposed method removes aliasing on severely occluded objects (the trees in the top row). The middle and bottom rows show the anti-aliasing results at different focal layers for the same scene (Bicycle), where the focused depth is out of the range of the scene depth and in the range of the scene depth, respectively.
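The DC-replacement step described above amounts to a single coefficient swap on the spectrum. A minimal sketch with a hypothetical helper name:

```python
import numpy as np

def replace_dc(F_out, F_in):
    """Copy the DC (zero-frequency) coefficient of the input spectrum into
    the network output, avoiding the color shift caused by DC errors."""
    F_fixed = F_out.copy()
    F_fixed[0, 0] = F_in[0, 0]   # index (0, 0) holds the DC term in fft2 layout
    return F_fixed
```

Because the DC term carries the mean intensity of the slice, this keeps the overall brightness of the reconstructed focal stack consistent with the input.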
Although the radius of the defocus blur varies from the middle row to the bottom row, the proposed method eliminates the aliasing effects in the middle one while retaining the sharp edges in the bottom one, which verifies the PSF-continuity of the proposed method. In 15× downsampling, although the aliasing is more serious due to the large disparity, our method can still obtain good aliasing-removed results.

Fig.9 shows the quantitative results of our method across focal layers under different downsampling settings. The relative PSNR is defined as the absolute PSNR value minus the PSNR of the input focal layer image. As shown in Fig.9(b) and (d), the PSNR and SSIM fluctuate with the refocusing depth. When there is less aliasing in the refocused image, the improvement in PSNR and SSIM is relatively limited (the bottom row in Fig.8). Moreover, the varying trends of PSNR and SSIM demonstrate that our approach maintains the continuity of the PSF in the focal stack.

Figure 8. Aliasing-removed results at different focal layers under 5× downsampling and 15× downsampling. (a) Input image. (b) Ground truth. (c) Our result.

Table 4. Average PSNR/SSIM under different loss functions.
Scene                    w/o loss_s     w/ loss_s
Tree (Syn. LF)           34.97/0.880    35.92/0.912
Bicycle (Real LF) [11]   33.08/0.949    34.45/0.960
Conjugate symmetry loss. To verify the effectiveness of the conjugate symmetry loss, an ablation experiment is conducted. Tab.4 shows the average PSNR/SSIM at different focal layers under 15× downsampling with and without loss_s. Both the PSNR and the SSIM are improved after applying the conjugate symmetry loss in the U-Net.

Vertical EPI. All the above experiments are carried out in the x-u dimension. Fig.10 shows the aliasing-removed results in the y-v dimension on the Stanford light fields [1], from which we find that our method is also effective in the y-v direction of the light field.

Undersampled light field. Finally, we conduct an experiment on the HCI dataset [12]. In this set, the disparity between adjacent views is larger, which produces severe aliasing effects when refocusing. As shown in Fig.1, when refocusing on the red cloth in the background, the bees and wooden balls in the foreground are aliased. The proposed method eliminates these aliasing effects well.
In this subsection, we compare our method against two SOTAs, Kalantari et al. [14] and Xiao et al. [28]. Tab.5 shows the average PSNR/SSIM on both synthetic (abbreviated Syn.) and real light fields (Real for short) over all focus layers. The proposed method outperforms the SOTA methods. Qualitative comparisons on two test scenes are shown in Figs.11 and 12. Fig.13 shows the quantitative comparisons of Fig.11 on each focus layer.

Figure 9. Quantitative results of our method across focal layers under different downsampling settings. Top row: the Tree scene. Bottom row: the Bicycle scene. (a) Absolute PSNR. (b) Relative PSNR. (c) Absolute SSIM. (d) Relative SSIM.

Figure 10. Anti-aliasing results of Lego [1] along the y-v direction. The first row shows the refocused image at a certain depth, the second row shows the focal stack along the red line, and the last row displays the corresponding FSS. (a) Input with aliasing; the number of views is ×. (b) Anti-aliasing output by the proposed method.

Table 5. Quantitative comparisons (PSNR/SSIM) with SOTAs under different downsampling rates on synthetic and real light fields.
            Kalantari [14]   Xiao [28]   Ours
Syn. LFs    ×                ×           ×

Fig.11 compares the anti-aliasing results under 15× downsampling. It is generally accepted that aliasing results from sparse view sampling, so we regard the method of Kalantari et al. [14] as a view synthesis network for eliminating aliasing. However, Kalantari et al. [14] does not perform well for view synthesis under large parallax: the edges of image objects are distorted significantly, and extensive disparities and complex occlusions also lead to severe aliasing in the out-of-focus regions. For the second test scene, the focal plane is located on the yellow step in front, so significant aliasing appears in the background. Although the method of Xiao et al. [28] also locates the aliasing area, it cannot deal with this massive-disparity situation. Moreover, multi-scale image fusion can eliminate aliasing to a certain extent, but the whole image is consequently blurred; for example, Xiao et al. [28] increases the number of pyramid layers to remove severe aliasing, which also blurs the entire image space (as shown in the first row of Fig.11(d)). Our method can not only locate but also eliminate these aliasing effects.

Fig.12 shows the results with different focal parameters under the 15× downsampling setting. According to the analysis in Section 3.3, the focal stack slices have the symmetry property along the f-axis, and the symmetric structure is similar for scene points located at different depths (PSF-continuity). Maintaining this characteristic is one of the important indicators for evaluating anti-aliasing effects. As shown in Fig.12, compared to the other two methods, our method can not only eliminate aliasing but also maintain this PSF-continuity. Quantitative results are shown in Fig.13. The PSNR values of Xiao et al. [28] are higher than those of our method on several focal layers; however, since Xiao et al. [28] processes each focus layer image separately and cannot preserve the continuity of the PSF, the overall trends of its PSNR and SSIM curves fluctuate.

Figure 11. Anti-aliasing results by different methods under 15× downsampling (refocused images and error maps). (a) Input focal image and ground truth. The results by (b) our method, (c) Kalantari et al. [14] and (d) Xiao et al. [28]. Several local areas are zoomed in for better visualization.

Figure 12. Anti-aliasing results on the focal stack slice under 15× downsampling. (a) Ground truth. (b) Input. The results by (c) our method, (d) Kalantari et al. [14] and (e) Xiao et al. [28]. Several local areas are zoomed in for better visualization.

Results of light fields from camera array.
In order to verify the generalization performance of our algorithm, we test our model on part of the Disney datasets [15] and the light field Lego from the Stanford dataset [1]. Because the angular resolutions of the Disney and Stanford LFs differ from those of our synthetic and real LFs (see Tabs.3 and 5), we set the sampling rates for these two datasets to × and × to guarantee the same angular resolution in the downsampled LFs (9 views). Fig.14 compares the anti-aliasing results on the Church LF. When the image is refocused on the tree in the front, there are obvious aliasing effects in the non-focused area. Our method significantly removes the aliasing in the non-focused areas, such as the white building in the distance. The average PSNR and SSIM over the whole aliasing-removed focal stack are listed in the last three rows of Tab.5.

Figure 13. Quantitative comparisons on two test scenes for different focal layers. Top row: the synthetic scene. Bottom row: the real scene. The blue, red and yellow lines indicate the results by ours, Kalantari et al. [14] and Xiao et al. [28] respectively.

Figure 14. Anti-aliasing results by different methods on the Disney datasets. (a) Ground truth. (b) Input focal image under 10× downsampling. (c) Our method, (d) Kalantari et al. [14], (e) Xiao et al. [28]. Several local areas are zoomed in for better visualization.

Currently, the proposed method can only process a light field with one angular dimension. For a light field with two angular dimensions, the contents from different lines damage the linear structure in the focal stack (red box in Fig.15), which in turn damages the characteristics of the spectrum.
6. Conclusions and Future Work
In this paper, we propose an FSS-based anti-aliasing method for angularly undersampled light fields. The FSS preserves the PSF-continuity and the spectrum distribution under different angular sampling rates.
Figure 15. Focal stack and FSS of the Lego dataset [1]. The focal stack and FSS are obtained from vertical-horizontal direction (u, v) views and horizontal direction (u) views respectively. The first row shows the focal stack and the last row displays the corresponding FSS. (a) The number of views is 9 × (× downsampling). (b) The number of views is 1 × (× downsampling).

References
[1] The new Stanford light field archive. http://lightfield.stanford.edu/lfs.html.
[2] J. Bigun and G. H. Granlund. Optimal orientation detection of linear symmetry. In IEEE ICCV, pages 433–438, 1987.
[3] T. E. Bishop and P. Favaro. The light field camera: Extended depth of field, aliasing, and superresolution. TPAMI, 34(5):972–986, 2011.
[4] T. Broad and M. Grierson. Light field completion using focal stack propagation. In ACM SIGGRAPH 2016 Posters, pages 1–2, 2016.
[5] J.-X. Chai, X. Tong, S.-C. Chan, and H.-Y. Shum. Plenoptic sampling. In ACM SIGGRAPH, pages 307–318. ACM Press/Addison-Wesley Publishing Co., 2000.
[6] A.-C. Chang, T.-P. Sung, K.-T. Shih, and H. H. Chen. Anti-aliasing for light field rendering. In IEEE ICME, pages 1–6. IEEE, 2014.
[7] D. G. Dansereau, O. Pizarro, and S. B. Williams. Linear volumetric focus for light field cameras. ACM TOG, 34(2):15, 2015.
[8] T. Georgiev and A. Lumsdaine. Reducing plenoptic camera artifacts. In Computer Graphics Forum, volume 29, pages 1955–1968. Wiley Online Library, 2010.
[9] R. C. Gonzalez and R. E. Woods. Digital Image Processing, 2002.
[10] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pages 43–54, 1996.
[11] M. Guo, H. Zhu, G. Zhou, and Q. Wang. Dense light field reconstruction from sparse sampling using residual network. In Springer ACCV, pages 1–14, 2018.
[12] K. Honauer, O. Johannsen, D. Kondermann, and B. Goldluecke. 4D light field dataset. http://hci-lightfield.iwr.uni-heidelberg.de/, 2016.
[13] A. Isaksen, L. McMillan, and S. J. Gortler. Dynamically reparameterized light fields. In ACM SIGGRAPH, pages 297–306. ACM Press/Addison-Wesley Publishing Co., 2000.
[14] N. K. Kalantari, T.-C. Wang, and R. Ramamoorthi. Learning-based view synthesis for light field cameras. ACM TOG, 35(6):193:1–193:10, 2016.
[15] C. Kim, H. Zimmer, Y. Pritch, A. Sorkine-Hornung, and M. H. Gross. Scene reconstruction from high spatio-angular resolution light fields. ACM TOG, 32(4):73:1–73:12, 2013.
[16] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[17] A. Levin and F. Durand. Linear view synthesis using a dimensionality gap light field prior. In IEEE CVPR, pages 1831–1838, 2010.
[18] M. Levoy and P. Hanrahan. Light field rendering. In ACM SIGGRAPH, pages 31–42. ACM, 1996.
[19] H. Lin, C. Chen, S. B. Kang, and J. Yu. Depth recovery from light field using focal stack symmetry. In IEEE ICCV, 2015.
[20] A. Lumsdaine, T. Georgiev, et al. Full resolution lightfield rendering. Indiana University and Adobe Systems, Tech. Rep., 91:92, 2008.
[21] R. Ng. Fourier slice photography. In ACM TOG, volume 24, pages 735–744. ACM, 2005.
[22] R. Ng. Digital Light Field Photography. PhD thesis, Stanford University, 2006.
[23] POV-Ray.
[24] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, and A. Courville. On the spectral bias of neural networks. In ICML, pages 5301–5310. PMLR, 2019.
[25] L. Shi, H. Hassanieh, A. Davis, D. Katabi, and F. Durand. Light field reconstruction using sparsity in the continuous Fourier domain. ACM TOG, 34(1):12:1–12:13, 2014.
[26] J. Stewart, J. Yu, S. Gortler, and L. McMillan. A new reconstruction filter for undersampled light fields. In ACM International Conference Proceeding Series. Eurographics Association/Association for Computing Machinery, 2003.
[27] T.-C. Wang, J.-Y. Zhu, N. K. Kalantari, A. A. Efros, and R. Ramamoorthi. Light field video capture using a learning-based hybrid imaging system. ACM TOG, 36(4):133:1–133:13, 2017.
[28] Z. Xiao, Q. Wang, G. Zhou, and J. Yu. Aliasing detection and reduction scheme on angularly undersampled light fields. IEEE TIP, 26(5):2103–2115, 2017.
[29] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE TIP, 13(4):600–612, 2004.

Supplementary Material

EPIs under different view downsampling rates
Fig.16 shows the details of the tested light fields under 5× and 15× downsampling.

Figure 16. EPIs under different view downsampling rates. (a) Reference view. (b) Original EPI. (c) 5× downsampling. (d) 15× downsampling.
Fig.17 shows the reference views of our synthetic light field datasets. In the synthetic light field datasets, we use the Tree scene and the Pot-cube scene for testing because the occlusion relationships in these scenes are complex. In the real light field datasets [11], the Bicycle and the Firehydrant are selected for testing due to their large disparity ranges, which better verify the generalization of the proposed method under different scenes. In the Disney datasets [15], the Couch and Church are selected since there is no motion in these two light fields and the disparity is large. Besides, we also choose the StillLife light field [12] for testing. Different from the previous datasets, which have dense views, the sampling is inadequate in StillLife, i.e., there is aliasing in the refocused image even when using all of the views. The proposed method eliminates this intrinsic aliasing in StillLife, as shown in Fig.1 of the paper.
More results
Please refer to the attached video result.mp4 for the pipeline of the FSS reconstruction and more anti-aliasing results.