SparkleVision: Seeing the World through Random Specular Microfacets
Zhengdong Zhang, Phillip Isola, and Edward H. Adelson
Massachusetts Institute of Technology
{zhangzd, phillipi, eadelson}@mit.edu

Abstract
In this paper, we study the problem of recovering the world lighting from a single image of an object whose surface is covered with random specular microfacets. We show that such reflectors can be interpreted as a randomized mapping from the lighting to the image. Such specular objects have very different optical properties from both diffuse surfaces and smooth specular objects like metals, so we design a special imaging system to robustly and effectively photograph them. We present simple yet reliable algorithms to calibrate the proposed system and to perform the inference. We conduct experiments to verify the correctness of our model assumptions and demonstrate the effectiveness of our pipeline.
1. Introduction
An object's appearance depends on the properties of the object itself as well as the surrounding light. How much can we tell about the light from looking at the object? If the object is smooth and matte, then we can tell rather little [3, 12, 11, 13]. However, if the object is irregular and/or non-matte, there are more possibilities.

Figure 1 shows a picture of a surface covered in glitter. The glitter is sparkly, and the image shows a scattering of bright specularities. We may think of the glitter as containing randomly oriented mirror facets. Each facet reflects light at a certain angle. If we knew the optical and geometrical properties of the facets, we could potentially decode the reflected scene.

Figure 2 shows a variety of optical arrangements in which light rays travel from a scene to a camera sensor by way of a reflector. For simplicity we assume the scene is planar; for example, it could be a computer display screen showing a test image. A subset of rays are seen by the sensor in the camera. Here we show a pinhole camera for simplicity.

Figure 1. Reproducing the world from a single image of specular random facets: (a) shows the image of a surface covered with glitter, illuminated by a screen showing an image of Obama. (b) gives a close-up look at (a), highlighting both the bright spots and the dark spots. (c) shows the lighting, i.e., the face of Obama, that our algorithm reconstructs from (a).

Figure 2(a) shows the case of an ordinary flat mirror reflector. The pinhole camera forms an image of the display screen (reflected in the mirror) in the ordinary way. There is a simple mapping between screen pixels and sensor pixels. Figure 2(b) shows the same arrangement with a curved mirror. Again there is a simple mapping between screen pixels and sensor pixels; the field of view is wider due to the mirror's curvature. Figure 2(c) shows the case of a smashed mirror, which forms an irregular array of mirror facets. The ray directions are scrambled, but the mapping between screen pixels and sensor pixels is still relatively simple. This is the situation we consider in the present work.

Figure 2(d) shows the case of an irregular matte reflector. Each sensor pixel sees a particular point on the matte reflector, but that point integrates light from a broad area of the display screen. Unscrambling the resulting image is almost impossible, although there are cases where some information may be retrieved, as shown by [17] in their discussion of accidental pinhole cameras. Figure 2(e) shows the case of an irregular mirror, but without the benefit of a pinhole camera restricting the rays hitting the sensor. This case corresponds to the random camera proposed by Fergus et al. [6], in which the reflector itself is the only imaging element. Since each pixel captures light from many directions, unscrambling is extremely difficult.

The case in Figure 2(c), with a sparkly surface and a pinhole camera, deserves study. We call this case "SparkleVision". It involves relatively little mixing of light rays, so unscrambling seems feasible. Moreover, it could be of practical value, since irregular specular surfaces occur in the real world (e.g., with metals, certain fabrics, micaceous minerals, and the Fresnel reflections from foliage or wet surfaces).

Figure 2. Optical arrangements in which light rays travel from a scene to a camera sensor by way of a reflector: (a) flat mirror, (b) curved mirror, (c) smashed mirror, (d) irregular matte reflector, (e) irregular mirror without a pinhole camera. "SparkleVision" refers to the setup in (c).

For a surface covered in glitter, it is difficult to build a proper physical model. Instead of an explicit model, we can describe the sparkly surface plus camera as providing a linear transform on the test image. With a planar display screen, each sparkle provides information about some limited parts of the screen. Non-planar facets and limited optical resolution will lead to some mixture of light from multiple locations. However, the transform is still linear. There exists a forward scrambling matrix, and in principle we can find its inverse and unscramble the image.

To learn the forward matrix we can probe the system by displaying a series of test images. These could be orthogonal bases, such as a set of impulses, or the DCT basis functions. They could also be non-orthogonal sets, and can be overcomplete. Having determined the forward matrix we can compute its inverse.

All the optical systems shown in Figure 2 implement linear transforms, and all can be characterized in the same manner. However, if a system is ill-conditioned, the inversion will be noisy and unreliable. The performance in practice is an empirical question. We will show that the case in Figure 2(c), SparkleVision, allows one to retrieve an image that is good enough to recognize objects and faces.
A diffuse object like a ping-pong ball tells us little about the lighting. If a Lambertian object is convex, its appearance approximately lies in a nine-dimensional subspace [3, 12, 11, 13], making it impossible to reconstruct more than a very coarse environment map. For non-convex objects, an image of the object under all possible lighting conditions lies in a much higher-dimensional space due to shadows and occlusions [20], enabling the reconstruction of light at higher resolution [7]. But in general, matte surfaces are tough to work with.

A smooth specular object like a curved mirror provides a distorted image of the lighting, which humans can recognize [4] and algorithms can process [2, 5, 18]. However, specular random facets are different: they are typically highly irregular and discontinuous, which makes the reflection hard even for humans to perceive. We use a model similar to inverse light transport [14, 15] to analyze this new setup, and propose a novel pipeline to effectively reduce the noise and increase the stability of the system, in both calibration and reconstruction.

Researchers have applied micro-lens arrays to capture lightfields [1, 9]. To some extent, a specular reflector can also be considered a coded aperture of a general camera system [8]. Our work differs from this previous work in that our setup is randomized: to the best of our knowledge, previous work in this domain mainly uses specially manufactured arrays with known mappings, whereas in our system the array is randomly distributed.

Many ideas in this paper are inspired by previous work on the random camera [6]. However, the key difference between our paper and that work is that in [6] no lens is used, and hence light from all directions in the lightfield gets mixed up, which is difficult to invert. In our setup, we place a lens between the world and the camera sensor, which makes the problem significantly easier and more tractable to solve.
Similar ideas also appear in the "single-pixel camera" [16], where measurements of the light are randomized for compressed sensing.

The idea that some everyday objects can accidentally serve as a camera has been explored before. It is shown in [10] that a photograph of a human eye reflects the environment in front of the eye, and this can be used for relighting. In addition, a window or a door can act like a pinhole, in effect imaging the world outside the opening [17].
2. The formulation of SparkleVision
We discuss the optical setup of SparkleVision in a discretized setting. Suppose the lightfield in a particular environment is denoted by a stacked, discrete vector x. We place an object O with random specular microfacets into the environment. Further, we use a camera C with a focused lens to capture the intensity of the light reflected by O. Let the discrete vector y be the sensor output. It is well known that any passive optical system is linear, so we use a matrix A to represent the linear mapping relating the lightfield x to y. Therefore, y = Ax.

Note that the above discussion makes no assumption about the material, albedo, smoothness, or continuity properties of the objects in the scene. Therefore, this linear representation holds for any random specular microfacets. In this notation, the task of SparkleVision can be summarized as:

• Accurately capture the image y of a sparkling object.
• Determine the matrix A, which is a calibration task.
• Infer the light x from the image y, which is an inference task.

In the later discussion we will use many pairs of lightings and images, so we use the subscript (x_i, y_i) to denote the i-th such pair. In addition, let e_i be the i-th unit vector of the identity basis, i.e., a vector whose entries are all zero except the i-th entry, which is one. Similarly, let d_i represent the i-th basis vector of the Discrete Cosine Transform (DCT). We use b_i to represent a random basis vector whose entries are i.i.d. random variables. Also let A = [a_1, a_2, ..., a_N] with a_i as its i-th column.
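As a sanity check on this formulation, the forward model y = Ax and its linearity can be sketched in a few lines of NumPy. The sizes and the random A below are illustrative stand-ins, not measured quantities:

```python
import numpy as np

# Illustrative sizes: a light map with N entries imaged onto M sensor pixels.
N, M = 9, 100
rng = np.random.default_rng(0)

# A stands in for the unknown transport matrix of the sparkling surface;
# a random matrix mimics the randomly oriented microfacets.
A = rng.random((M, N))

x = rng.random(N)   # stacked, discrete light map
y = A @ x           # forward model: any passive optical system is linear

# Linearity: superposition of lightings gives superposition of images.
x2 = rng.random(N)
assert np.allclose(A @ (x + x2), y + A @ x2)
```

Everything that follows — calibration and inference — operates on this single linear relation.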
3. Imaging specular random facets through HDR
In this section we examine the properties of sparkling objects with microfacets. Their special characteristics enable the recovery of lighting from an image, while imposing unique challenges on accurately capturing images of them. To deal with these challenges, we use High Dynamic Range (HDR) imaging with multiple exposures of the same scene.
Specular random microfacets can be considered a randomized mapping between the world light and the camera. Each facet has a random orientation and acts as a mirror reflecting all incoming light. However, because of the camera's focused lens and the small size of each facet, only light from a very narrow range of directions is reflected into the camera from any given facet. Therefore, given a single point light source, only a very small number of the facets will reflect light to the camera and appear bright; the rest will be unilluminated. This effect makes the dynamic range of a photo of specular facets extremely high, creating a unique challenge in photographing them. Figures 3(a) and 3(b) show a photo of such a reflector and its histogram.

Now suppose we slightly change the location of the impulse light, generating a small disturbance in the direction of the incoming light to the facets. Thanks to the narrow range of reflecting directions of each facet, this slight disturbance causes a large change in the light pattern on the random facets. Provided that the orientations of the facets are random across the whole surface, we should expect the set of aligned facets to be significantly different. Figure 3(c) illustrates this phenomenon.

Figure 3. Optical properties of specular random microfacets: (a) shows an image of a specular facet surface with scattered bright spots. (b) shows its histogram; although the bright spots in the image are shining, most pixels are actually dark. (c) shows the surface simultaneously illuminated by two adjacent impulse lights, one red and one green. The reflected green and red lights seldom overlap, as few spots in the image are yellow.

Intuitively, if our task were just to decide whether a certain point light source is on or not, we could simply check whether the corresponding set of facets for that light's position is active. This also suggests that our system is very sensitive to the alignment of the geometric setup, which we address in Section 6.6.
As we have seen, the dynamic range of an image of a sparkling object is extremely high. Dark regions are noisy and numerous throughout the image. To capture them accurately, a long exposure is needed. Unfortunately, a long exposure saturates the bright spots and therefore breaks the linearity assumption. If we instead adjust the exposure for the sparse bright spots, the exposure time is too short to capture the noisy dark pixels accurately. Therefore, it is not practical to capture both high- and low-intensity illumination with a single shot of a commercial DSLR camera.

Our solution is to take K shots of the same scene with different exposure times t_k. Let the resulting images be I_1, I_2, ..., I_K. We then combine these K images into a single image I with a much higher dynamic range than any of the original K images. Note that we use a heavy tripod in the experiments, so we assume all I_k are already registered. Therefore, we only need a way to decide I(x) from the I_k(x) at any location x.

The Canon Rebel T2i camera that we use in our experiments has a roughly linear response with respect to exposure time over a fairly large intensity range. Above the upper end of this range the response curve bends and gradually saturates, so the linearity assumption breaks down; below the lower end the image is very noisy. So we discard these undesired intensities. Denote the remaining exposure-time and intensity pairs (t_i, I_i(x)). The goal is to determine the value I(x) independently for each location x.
We solve this problem by fitting a least-squares line to the pairs (t_i, I_i(x)):

I(x) = argmin_s Σ_i (s · t_i − I_i(x))²    (1)

With simple algebra we can derive a closed-form solution:

I(x) = (Σ_i t_i I_i(x)) / (Σ_i t_i²)    (2)

Note that the derived solution can be viewed as an average of intensities under different exposures, weighted by the exposure time.
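A per-pixel implementation of this weighted merge is short. The linear-range thresholds `lo` and `hi` below are placeholders, since the source's exact values are garbled in this copy:

```python
import numpy as np

def hdr_merge(images, exposures, lo=0.02, hi=0.95):
    """Per-pixel weighted least-squares fit I(x) = argmin_s sum_i (s*t_i - I_i(x))^2,
    keeping only exposures whose response lies in the (assumed) linear range (lo, hi)."""
    images = np.asarray(images, dtype=float)           # shape (K, H, W)
    t = np.asarray(exposures, dtype=float)[:, None, None]
    valid = (images > lo) & (images < hi)              # drop saturated / too-noisy pixels
    num = np.where(valid, t * images, 0.0).sum(axis=0)
    den = np.where(valid, t * t, 0.0).sum(axis=0)
    return np.where(den > 0, num / np.maximum(den, 1e-12), 0.0)
```

On noiseless synthetic exposures of a constant radiance, the merge recovers that radiance exactly, matching the closed-form solution (2).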
4. Calibration and Inference of the SparkleVision System
In this section, we examine the algorithm used to calibrate the system and reconstruct the environment map x from y.

We probe the system y = Ax by illuminating the object with impulse lights e_i. Ideally, y_i = A e_i = a_i, so we can scan through all e_i to obtain A. However, due to the presence of noise, the measured y_i will typically differ from the a_i of an ideal system. As we will show in the experimental sections, this noise on the calibrated matrix A is lethal to the recovery of the light: our system relies on a clean, accurate transformation matrix A to succeed. Therefore, we further probe the system with multiple different bases. Specifically, we use the DCT basis vectors d_i and a set of random basis vectors b_i. Doing this makes the probing overcomplete, and hence the estimated A becomes more robust to noise. Let E be the N × N impulse basis matrix, D be the DCT basis matrix, and B_K ∈ R^{N×K} be the matrix of K random basis vectors, with Y_E, Y_D, and Y_B the corresponding measured images. This leads to the following optimization for calibration:

min_A ||Y_E − AE||_F² + λ ||Y_D − AD||_F² + λ ||Y_B − AB_K||_F²    (3)

Here λ is a weight to balance the error terms, since the illumination from impulse lights tends to be much dimmer than the illumination from DCT and random lighting. In our experiments we set λ = N.

To further protect the calibrated A against the noise in its dominant dark regions, we retain only intensities above a certain threshold during calibration. Specifically, let Ω_i be the set of the 1% brightest pixels in the image y_i illuminated by the impulse e_i, and let Ω = ∪_i Ω_i. We then keep only the pixels inside Ω and discard the rest. Let P_Ω(·) represent such a projection. This turns the calibration into the following optimization:

min_A ||P_Ω(Y_E) − AE||_F² + λ ||P_Ω(Y_D) − AD||_F² + λ ||P_Ω(Y_B) − AB_K||_F²    (4)

Note that the size of the output A from (4) differs from that of (3) due to the projection P_Ω(·).
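In the noise-free case, stacking the weighted probes into a single least-squares problem recovers A exactly. The sketch below illustrates this with made-up sizes; the value of λ and the use of `photograph` as a stand-in for the physical imaging step are our assumptions:

```python
import numpy as np

N, M, K = 16, 64, 32           # light-map size, image size, number of random probes
lam = N                        # balancing weight (illustrative value)
rng = np.random.default_rng(1)
A_true = rng.random((M, N))    # stands in for the system being calibrated

E = np.eye(N)                  # impulse basis
n = np.arange(N)               # orthonormal DCT-II basis as columns of D
D = np.sqrt(2.0 / N) * np.cos(np.pi * (n[:, None] + 0.5) * n[None, :] / N)
D[:, 0] /= np.sqrt(2.0)
B = rng.random((N, K))         # random basis vectors

def photograph(X):             # image of the object under each basis lighting
    return A_true @ X          # noiseless here; real probes are noisy

# Stack the weighted probes and solve min_A ||Y - A X||_F in one shot.
X = np.hstack([E, lam * D, lam * B])
Y = np.hstack([photograph(E), lam * photograph(D), lam * photograph(B)])
A_est = Y @ np.linalg.pinv(X)
assert np.allclose(A_est, A_true)
```

With real, noisy probes the redundant DCT and random columns average out measurement noise that a pure impulse scan would bake into A.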
Given A, reconstructing an environment map from an image y is a classic inverse problem. A straightforward approach is to solve it by least squares. However, unconstrained least squares may produce entries less than 0, which is not physically meaningful. Instead, we can solve a constrained least-squares problem:

min_x ||y − Ax||²,  s.t. x ≥ 0    (5)

Nevertheless, through experiments we find that constrained solvers are too slow in practice. Even at modest screen resolutions, solving the inequality-constrained least squares is approximately 100 times slower, yet the improvement is minor. So we solve the naive least squares without the non-negativity constraint and then clip the out-of-range pixels back to [0, 1].

We observe that in practice there is room to improve the output of the above optimization; for example, we could impose stronger image priors to make the result more visually appealing. However, doing so would disguise some of the intrinsic physical behavior of SparkleVision, so we stick to the most naive optimization (5).

We use RAW files from the camera to avoid the non-linear post-processing of image formats like JPEG and PNG. In addition, we model the background light as e_0 and capture y_0 = A e_0 by turning all active light sources off. We subtract y_0 from every y_i in the experiments by default. Since y_0 is used for all of the y_i, we photograph it multiple times and take the average as the actual image to suppress the noise in y_0.

We can easily extend the pipeline to handle color images, in which case the transformation matrix A grows threefold in each dimension to cover the three channels. For calibration, we simply enumerate each e_i three times, in red, green, and blue. Reconstruction is essentially the same.
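The reconstruction step then reduces to a few lines. This sketch uses plain least squares with clipping, the fast alternative described above:

```python
import numpy as np

def reconstruct(A, y):
    """Unconstrained least squares, then clip out-of-range entries to [0, 1]
    (the fast alternative to non-negative least squares)."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.clip(x, 0.0, 1.0)

# Noiseless round trip: recover a known light map from its image.
rng = np.random.default_rng(2)
A = rng.random((100, 9))
x_true = rng.random(9)
assert np.allclose(reconstruct(A, A @ x_true), x_true)
```

In the noiseless, overdetermined case the round trip is exact; with noise, the conditioning of A governs how much the recovery degrades.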
5. Simulated Analysis of SparkleVision
In this section we conduct synthetic experiments to systematically analyze how noise affects the performance of the proposed pipeline. In addition, we study how the size of the mirror facets and the spatial resolution of the sparkling surface influence the resolution of the lighting that the system can recover. These experiments improve our understanding of the limits of the geometric setup, provide guidance for tuning the setup, and help us interpret the results.
The setup of the simulated experiment is shown in Figure 4: a planar screen is reflected by a sparkling surface to the camera.

Figure 4. Configuration of the synthetic simulation. The pixel p_i, with a certain width, is reflected by the facet q_i to the camera.

The resolution of the screen is the resolution of the lightmap. We model the sparkling plane as a rectangle of fixed size. We divide the plane into blocks, and each block is fully covered by a mirror facing a certain orientation. The resolution of the sparkling plane is just the number of blocks in that rectangular region. For simplicity, we do not consider interreflection or occlusion between the mirror facets.

For each mirror facet, we assume that its orientation follows a distribution. Let θ ∈ [0, π/2] be the slant of the orientation and φ its tilt. We model the tilt φ as uniformly distributed over its range. In practice the mirror orientations are centered around the mean surface normal, so we model the slant θ as the positive half of a Gaussian distribution with standard deviation σ_θ. Specifically, we have

P_θ(θ) = (2 / (√(2π) σ_θ)) exp(−θ² / (2σ_θ²)),  θ ≥ 0    (6)

Note that the mean of θ is not 0, and hence the actual standard deviation of θ is not σ_θ.

We assume that the orientation of each facet is independent of the others, so we can independently sample each orientation to create a random reflector. We use a classic ray-tracing algorithm to render the light reflected by the specular surface into the camera. We test the pipeline in this synthetic setup, and in the ideal noise-free case the recovery is perfect.

For simplicity, we model the image noise as i.i.d. white noise. We perform three control experiments to tease apart the effects of noise during calibration versus during test-time reconstruction. In the first test, we add noise to both calibration and test images. In the second test, we add noise only to the training dictionary, while in the third we add noise only to the test images.
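The orientation model above — uniform tilt and half-normal slant — can be sampled directly. The tilt range [0, 2π) and the Cartesian convention below are our assumptions where the source text is garbled:

```python
import numpy as np

def sample_facet_normals(M, sigma_theta, seed=None):
    """Sample M microfacet normals: tilt phi uniform (assumed range [0, 2*pi)),
    slant theta drawn from the positive half of a Gaussian with std sigma_theta."""
    rng = np.random.default_rng(seed)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=M)
    theta = np.abs(rng.normal(0.0, sigma_theta, size=M))  # half-normal slant
    # Unit normals, with z along the mean surface normal.
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=1)

normals = sample_facet_normals(1000, sigma_theta=0.2, seed=3)
assert np.allclose(np.linalg.norm(normals, axis=1), 1.0)
```

Feeding these normals to a ray tracer, as the simulation does, produces a random reflector whose sparkle statistics are controlled by σ_θ.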
Suppose we test our pipeline on N test lightmaps I_i, 1 ≤ i ≤ N, and obtain the recoveries Î_i. We measure the error of each recovery by the average sum of squared differences (SSD) between I_i and Î_i. Varying the standard deviation of the noise, we obtain the three curves shown in Figure 5.

Figure 5. How noise in the calibration stage and in the reconstruction stage affects recovery accuracy (SSD vs. noise standard deviation for three conditions: noisy dictionary + noisy test, clean dictionary + noisy test, and noisy dictionary + clean test). Our pipeline is stable to noise in the reconstruction stage, but not to noise in the training stage.

The result indicates that our system is much more robust to noise in the test image than to noise in the calibration images. Even at substantial noise levels the recovery remains good as long as the calibration images are clean, and at the same noise level the SSD caused by pure training noise is far larger than the SSD caused by pure testing noise. This comparison validates the need for the overcomplete basis in our proposed pipeline to reduce the noise in the training stage.

The spatial resolution of the random reflector determines the resolution of the lightmap that we can recover. Keep in mind that each microfacet in our model is a mirror, and our system relies on light from the screen being reflected by the facets to the camera. If some part of the screen is never reflected to the camera, there is no hope of recovering what that part of the screen is showing from the photograph taken by the camera. Since the facets are randomly oriented, this undesirable situation may well happen.

Figure 6 demonstrates such a phenomenon. Figure 6(a) shows a high-resolution image displayed on the screen, serving as the lightmap. The light is reflected by the microfacets to the camera sensor. However, the number of facets is very small. As a consequence, some blocks of the photo are dark and part of the lightmap is missing, as observed in Figure 6(b). Intuitively speaking, if we have more facets, the chance that part of the lightmap is reflected
Figure 6. Photography of a specular reflector with low spatial res-olution. to the camera will increase, even if the size of each facet issmaller.We develop a mathematical model to approximatelycompute the probability that a block of pixels on the screenwill be reflected by at least one micro facet to the camerasensor. The model involves several approximations such asa small angle approximation, so the relative values in thisanalysis are more important than the raw values. Followingthe general setup in Figure 4, we first calculate the probabil-ity that a certain pixel p i on the screen gets reflected by themicro facet q i to the camera. Suppose the width of the pixelis w , then the foreshortened area of the pixel with respect tothe incoming light direction −−→ p i q i is w cos θ , where θ is theangle between −−→ q i p i and the screen. Then the solid angle ofthis foreshortened area with respect to q i is w cos θ (cid:107)−−→ p i q i (cid:107) .The normal n that just reflects −−→ p i q i to the camera C is thenormalized bisector of −−→ q i p i and −−→ q i C . Since the incominglights can vary in the solid angle of w cos θ (cid:107)−−→ p i q i (cid:107) , n can vary in w cos θ (cid:107)−−→ p i q i (cid:107) and still the mirror can reflect some light from thepixel on the screen to the camera. Let q i ◦ p i be the eventthat the facet at q i will reflect some light emitted from p i tothe camera C . Then its chance is the same as the probabilityfor the orientation of the facet to be within that range, whichis approximated by Pr ( q i ◦ p i ) = 2 √ πσ θ exp (cid:18) − θ σ θ (cid:19) w cos θ (cid:107)−−→ p i q i (cid:107) (7) Suppose there are M micro facets in total and we com-pute Pr ( q i ◦ p i ) for all ≤ i ≤ M . Then we can computethe probability that the light from pixel p i is reflected by atleast one micro facet to the camera as follows. 
Pr(∃j, q_j ∘ p_i) = 1 − Pr(∀j, q_j ∘̸ p_i)
                 = 1 − Π_j Pr(q_j ∘̸ p_i)
                 = 1 − Π_j (1 − Pr(q_j ∘ p_i))

We visualize this probability in Figure 7 for four different configurations of screen and specular-surface resolutions.

Figure 7. Probability map of light from a block of pixels being reflected by the specular surface to the camera, for four combinations of object and screen resolutions (panels (a)-(d)).
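The complement-product above is easy to evaluate numerically; working in log space keeps it stable when each per-facet probability is tiny:

```python
import numpy as np

def coverage_probability(p):
    """Pr(at least one facet reflects pixel p_i to the camera) = 1 - prod_j (1 - p_j),
    for independent per-facet probabilities p_j = Pr(q_j o p_i)."""
    p = np.asarray(p, dtype=float)
    # -expm1(sum(log1p(-p))) == 1 - prod(1 - p), but avoids underflow for small p_j.
    return -np.expm1(np.sum(np.log1p(-p)))

assert abs(coverage_probability([0.5, 0.5]) - 0.75) < 1e-12
```

Evaluating this for every screen block, with p_j from equation (7), yields the probability maps of Figure 7.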
From the results, we can see that, overall, a higher resolution of the specular object and a lower resolution of the screen reduce the chance that some block of pixels on the screen is never reflected to the sensor. In addition, on the same screen, the chance of avoiding such bad events differs between blocks of pixels, due to the different distances and relative angles between different parts of the screen and the reflector. This suggests that for a given specular object there is a limit on the resolution of the lightmap we can infer from it.
6. Experiments
We place the sparkling object in front of a computer screen in a dark room and use a camera to photograph the object. The images displayed on the screen serve as the light map. Figure 8(a) illustrates the setup. Specifically, in this experiment we use a 24-inch Acer screen, a Canon Rebel T2i camera, and a set of specular objects including a hairpin, a skull, and a painted glitter board. The camera is placed on a heavy tripod to prevent even the slightest movement. We show that our system can reconstruct the world lighting at a resolution at which many objects, such as faces, can be easily recognized.

Overlapping of bright pixels
We measure quantitatively how a displacement of the impulse light changes the pattern of bright spots. Let a_i be the image illuminated by the i-th impulse light, and let S_i be the set of bright pixels in a_i with intensities larger than a fixed fraction of the maximum. Then the

Figure 8. (a) The setup of the experiment in the lab. (b) The overlap graph between images from different impulses; only images from neighboring impulses show slight overlap.

(a) diffuse paper (b) glitter (c) hairpin (d) skull
(e) κ ≈ 1846  (f) κ ≈ 6  (g) κ ≈ 26
(h) κ ≈ 14

Figure 9. Singular-value distributions of the transformation matrices of the different objects, together with their condition numbers. Note that the sparkling reflectors create systems with much lower condition numbers than a diffuse object.

overlap between a_i and a_j, i ≠ j, can be defined as

O(i, j) = |S_i ∩ S_j| / min(|S_i|, |S_j|),  i ≠ j    (8)

Here |S| denotes set cardinality. There is one impulse basis image per block of the world light map, and the overlaps can be plotted as a matrix whose entry at the i-th row and j-th column is O(i, j), as shown in Figure 8(b). As the figure suggests, most of the overlap happens between images from neighboring impulses, and the maximal overlap max_{i≠j} O(i, j) is small. This further validates the non-overlapping property of the SparkleVision system.

Condition Number
The condition number κ(A) of the transformation matrix A determines its invertibility. κ(A) is defined as the ratio between the largest and the smallest singular values of A. For all the optical systems shown in Figure 9, we plot the singular values of their A in descending order. From the figure, we can see that the best systems achieve a low condition number, which is good in practice.

Figure 10. SparkleVision through a glitter board. Qualitatively, the recovery is fairly close to the ground-truth light map, and humans can easily recognize the objects in the recovered image.

Figure 11. Colored SparkleVision through a glitter board. Although there is slight color distortion, the overall quality of the recovery is good.
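The condition numbers reported in Figure 9 are just ratios of singular values, e.g.:

```python
import numpy as np

def condition_number(A):
    """kappa(A): ratio of the largest to the smallest singular value of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

# An orthogonal matrix is perfectly conditioned; a skewed diagonal is not.
assert np.isclose(condition_number(np.eye(4)), 1.0)
assert np.isclose(condition_number(np.diag([10.0, 1.0])), 10.0)
```

A large κ(A) means that some lighting components are barely observed, so inverting A amplifies noise along those directions — which is why the diffuse-paper system in Figure 9 is far harder to invert than the glitter systems.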
Here we show results of our pipeline using the glitter board as the reflector. For the gray-scale setting, we push the screen resolution as high as our system supports; for the colored setting, we present a few tests at a lower resolution to demonstrate that our system generalizes. The gray-scale results are shown in Figure 10 and the color ones in Figure 11. They demonstrate the success of the pipeline at these resolutions.
Figure 12. Adding more random basis vectors to the calibration helps reduce the recovery error, but the benefit eventually saturates.

(a) No noise (b) 0.04 (c) 0.08 (d) 0.12 (e) 0.16
Figure 13. Stability to noise on the test image; the title of each subfigure gives the noise level. The noise is large considering that the images lie in [0, 1] and only a few spots are bright.

Figure 12 illustrates how increasing the number of random basis vectors used in the calibration improves the recovery of the light x; the numbers of impulse and DCT basis vectors are both fixed by the light resolution. It is worth noting that the benefit gradually saturates, so we only need to employ a limited number of random basis vectors.

We perform synthetic experiments by adding noise to the real test image to understand how robust the real calibrated transformation matrix is. We measure the robustness by the root-mean-squared error (RMSE) between the noisy recovery and the noise-free recovery. Figure 13 plots how the reconstructed lighting changes as the noise level increases. The result validates the robustness of our system.
The success of SparkleVision relies largely on the sensitivity of the light pattern on a specular object to even a slight movement of the source light. However, this property simultaneously makes the whole system extremely sensitive to subtle misalignment. To show this, we perform synthetic experiments by shifting the test image I by Δx and examining the RMSE. Representative results and the RMSE curve are shown in Figure 14.

Figure 14. Instability to misalignment, for shifts of (a) 1 pixel, (b) 0.8 pixel, (c) 0.4 pixel, (d) 0.2 pixel, and (e) no shift. Even if we shift the test image by one pixel horizontally, there is significant degradation in the output. We can compensate for this by grid search and an image prior.

We can compensate for this misalignment by performing a grid search over Δx and picking the recovery with the minimum total variation Σ_x ||∇I(x)||.

For the recovery of light, this phenomenon is harmful. But such sensitivity to even subpixel misalignment could enable the detection and magnification of motions of the object that are invisible to the eye, as in [19]. We leave this for future work.
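A minimal sketch of this compensation strategy — grid search over shifts, scored by total variation — might look as follows. It handles integer shifts only; `recover` stands for any image-to-lightmap inverse, such as the least-squares one:

```python
import numpy as np

def total_variation(img):
    """Anisotropic total variation: sum of absolute finite differences."""
    return np.abs(np.diff(img, axis=0)).sum() + np.abs(np.diff(img, axis=1)).sum()

def best_shift(y_img, recover, shifts):
    """Try each candidate shift of the captured image, reconstruct, and keep
    the recovery with minimum total variation. Integer shifts only; subpixel
    shifts would require interpolation."""
    best = None
    for dx, dy in shifts:
        x_hat = recover(np.roll(y_img, shift=(dy, dx), axis=(0, 1)))
        tv = total_variation(x_hat)
        if best is None or tv < best[0]:
            best = (tv, (dx, dy), x_hat)
    return best[1], best[2]
```

The heuristic is that a misaligned inverse produces noisy, high-gradient recoveries, so the correctly aligned candidate tends to have the lowest total variation.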
7. Discussions and Conclusion
In this paper we show that it is possible to infer an image of the world around an object that is covered in random specular facets. This class of objects provides rich information about the environment map and is significantly different from the smooth objects, with either Lambertian or specular surfaces, that researchers in the field of shape-from-X have studied.

The main contributions of this paper are twofold. First, we have presented the phenomenon that random specular microfacets can encode a large amount of information about the surrounding light. This property may seem mysterious at first sight but is in fact simple and intuitive once understood. We also analyze the factors that determine the optical limits of these reflectors. Second, we proposed and analyzed a physical system that can efficiently calibrate and infer the surrounding light map from these sparkling surfaces.

Currently our approach reconstructs only a single image of the scene facing the sparkling object. Such an image corresponds to a slice of the light field around the object. Using an identical setup, it should be possible to reconstruct other slices of the light field, so our system could be naturally extended to work as a light field camera. In addition, this new reflector can reveal subtle motions of the optical setup. We leave these exciting directions for future exploration.

References

[1] E. H. Adelson and J. Y. Wang. Single lens stereo with a plenoptic camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):99–106, 1992. 2
[2] Y. Adato, Y. Vasilyev, O. Ben-Shahar, and T. Zickler. Toward a theory of shape from specular flow. In . 2
[3] R. Basri and D. W. Jacobs. Lambertian reflectance and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 25:218–233, 2003. 1, 2
[4] A. Blake. Does the brain know the physics of specular reflection? Nature, 343(6254):165–168, 1990. 2
[5] G. D. Canas, Y. Vasilyev, Y. Adato, T. Zickler, S. Gortler, and O. Ben-Shahar. A linear formulation of shape from specular flow. In . 2
[6] R. Fergus, A. Torralba, and W. T. Freeman. Random lens imaging. MIT CSAIL Technical Report, (058), 2006. 1, 2
[7] S. W. Hasinoff, A. Levin, P. R. Goode, and W. T. Freeman. Diffuse reflectance imaging with astronomical applications. In . 2
[8] A. Levin, R. Fergus, F. Durand, and W. T. Freeman. Image and depth from a conventional camera with a coded aperture. ACM Transactions on Graphics (TOG), 26(3):70, 2007. 2
[9] M. Levoy. Light fields and computational imaging. IEEE Computer, 39(8):46–55, 2006. 2
[10] K. Nishino and S. K. Nayar. Eyes for relighting. In SIGGRAPH 2004, pages 704–711. 2
[11] R. Ramamoorthi and P. Hanrahan. A signal-processing framework for inverse rendering. In SIGGRAPH 2001, pages 117–128. 1, 2
[12] R. Ramamoorthi and P. Hanrahan. On the relationship between radiance and irradiance: determining the illumination from images of a convex Lambertian object. JOSA A, 18:2448–2459, 2001. 1, 2
[13] R. Ramamoorthi and P. Hanrahan. A signal-processing framework for reflection. ACM Transactions on Graphics, 23(4):1004–1042, 2004. 1, 2
[14] S. M. Seitz, Y. Matsushita, and K. N. Kutulakos. A theory of inverse light transport. In . 2
[15] P. Sen, B. Chen, G. Garg, S. R. Marschner, M. Horowitz, M. Levoy, and H. P. A. Lensch. Dual photography. In SIGGRAPH 2005, pages 745–755. 2
[16] D. Takhar, J. Laska, M. Wakin, M. Duarte, D. Baron, S. Sarvotham, K. Kelly, and R. Baraniuk. A new compressive imaging camera architecture using optical-domain compression. In Computational Imaging IV at SPIE Electronic Imaging, 2006. 2
[17] A. Torralba and W. T. Freeman. Accidental pinhole and pinspeck cameras: Revealing the scene outside the picture. In . 1, 2
[18] Y. Vasilyev, T. Zickler, S. Gortler, and O. Ben-Shahar. Shape from specular flow: Is one flow enough? In . 2
[19] H.-Y. Wu, M. Rubinstein, E. Shih, J. Guttag, F. Durand, and W. T. Freeman. Eulerian video magnification for revealing subtle changes in the world. ACM Transactions on Graphics (Proc. SIGGRAPH 2012), 31(4), 2012. 8
[20] Y. Zhang, C. Mu, H.-W. Kuo, and J. Wright. Toward guaranteed illumination models for non-convex objects. In 14th International Conference on Computer Vision (ICCV 2013).