Abe Davis
Massachusetts Institute of Technology
Publications
Featured research published by Abe Davis.
International Conference on Computer Graphics and Interactive Techniques | 2014
Abe Davis; Michael Rubinstein; Neal Wadhwa; Gautham J. Mysore; William T. Freeman
When sound hits an object, it causes small vibrations of the object's surface. We show how, using only high-speed video of the object, we can extract those minute vibrations and partially recover the sound that produced them, allowing us to turn everyday objects---a glass of water, a potted plant, a box of tissues, or a bag of chips---into visual microphones. We recover sounds from high-speed footage of a variety of objects with different properties, and use both real and simulated data to examine some of the factors that affect our ability to visually recover sound. We evaluate the quality of recovered sounds using intelligibility and SNR metrics and provide input and recovered audio samples for direct comparison. We also explore how to leverage the rolling shutter in regular consumer cameras to recover audio from standard frame-rate videos, and use the spatial resolution of our method to visualize how sound-related vibrations vary over an object's surface, which we can use to recover the vibration modes of an object.
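The toy sketch below illustrates only the core idea of turning tiny image motions into a 1D signal; it is not the paper's pipeline, which analyzes local phase in a complex steerable pyramid. The `frames` array, `fps`, and the single-image, horizontal-motion brightness-constancy fit are assumptions made purely for illustration.

```python
# Minimal sketch (assumed setup, not the authors' implementation): estimate a 1D
# vibration signal from high-speed video using the brightness-constancy relation
# I(t) - I(0) ~= d(t) * dI/dx, i.e. a one-dimensional least-squares motion fit per frame.
import numpy as np

def recover_vibration(frames, fps):
    """frames: (T, H, W) float array from a high-speed camera; fps: its frame rate."""
    ref = frames[0]
    gx = np.gradient(ref, axis=1)                    # horizontal spatial gradient
    denom = np.sum(gx * gx) + 1e-9
    signal = np.array([np.sum((ref - f) * gx) / denom for f in frames])
    signal -= signal.mean()                          # remove the DC offset
    signal /= (np.abs(signal).max() + 1e-9)          # normalize to [-1, 1]
    return signal, fps                               # play back at the camera frame rate
```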
Computer Graphics Forum | 2012
Abe Davis; Marc Levoy
We present a system for interactively acquiring and rendering light fields using a hand‐held commodity camera. The main challenge we address is assisting a user in achieving good coverage of the 4D domain despite the challenges of hand‐held acquisition. We define coverage by bounding reprojection error between viewpoints, which accounts for all 4 dimensions of the light field. We use this criterion together with a recent Simultaneous Localization and Mapping technique to compute a coverage map on the space of viewpoints. We provide users with real‐time feedback and direct them toward under‐sampled parts of the light field. Our system is lightweight and has allowed us to capture hundreds of light fields. We further present a new rendering algorithm that is tailored to the unstructured yet dense data we capture. Our method can achieve piecewise‐bicubic reconstruction using a triangulation of the captured viewpoints and subdivision rules applied to reconstruction weights.
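As a hedged sketch of what a reprojection-error-based coverage test might look like: the error incurred by reusing a nearby captured view is bounded using the baseline between viewpoints, the focal length, and an assumed scene depth range. The specific bound, variable names, and pixel threshold below are illustrative, not taken from the paper.

```python
# Sketch of a coverage criterion in the spirit of the abstract (assumptions noted above).
import numpy as np

def reprojection_error_bound(baseline_m, focal_px, z_min, z_max):
    """Worst-case pixel error when reusing a view offset by `baseline_m` meters,
    for scene content anywhere in the depth range [z_min, z_max] meters."""
    return focal_px * baseline_m * abs(1.0 / z_min - 1.0 / z_max)

def is_covered(desired_pos, captured_positions, focal_px, z_min, z_max, max_err_px=1.0):
    """A desired viewpoint counts as covered if some captured viewpoint keeps
    the reprojection error below the pixel threshold."""
    for p in captured_positions:
        b = np.linalg.norm(np.asarray(desired_pos) - np.asarray(p))
        if reprojection_error_bound(b, focal_px, z_min, z_max) <= max_err_px:
            return True
    return False
```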
ACM Transactions on Graphics | 2014
Lixin Shi; Haitham Hassanieh; Abe Davis; Dina Katabi
Sparsity in the Fourier domain is an important property that enables the dense reconstruction of signals, such as 4D light fields, from a small set of samples. The sparsity of natural spectra is often derived from continuous arguments, but reconstruction algorithms typically work in the discrete Fourier domain. These algorithms usually assume that sparsity derived from continuous principles will hold under discrete sampling. This article makes the critical observation that sparsity is much greater in the continuous Fourier spectrum than in the discrete spectrum. This difference is caused by a windowing effect. When we sample a signal over a finite window, we convolve its spectrum by an infinite sinc, which destroys much of the sparsity that was in the continuous domain. Based on this observation, we propose an approach to reconstruction that optimizes for sparsity in the continuous Fourier spectrum. We describe the theory behind our approach and discuss how it can be used to reduce sampling requirements and improve reconstruction quality. Finally, we demonstrate the power of our approach by showing how it can be applied to the task of recovering non-Lambertian light fields from a small number of 1D viewpoint trajectories.
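The windowing effect the abstract describes follows from a standard Fourier pairing; a short sketch in my own notation (not the paper's):

```latex
% Sampling x(t) only over a finite window of length T multiplies it by a rect,
% which convolves its spectrum with a sinc:
\begin{align*}
  y(t) &= x(t)\,\operatorname{rect}\!\Bigl(\tfrac{t}{T}\Bigr), \\
  \hat{y}(f) &= (\hat{x} * \hat{w})(f), \qquad \hat{w}(f) = T\,\operatorname{sinc}(T f).
\end{align*}
% A single continuous-domain spike \hat{x}(f) = \delta(f - f_0) with f_0 off the
% DFT grid therefore leaks energy into many discrete bins, which is why the
% discrete spectrum appears far less sparse than the continuous one.
```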
Computer Vision and Pattern Recognition | 2015
Abe Davis; Katherine L. Bouman; Justin G. Chen; Michael Rubinstein; William T. Freeman
The estimation of material properties is important for scene understanding, with many applications in vision, robotics, and structural engineering. This paper connects fundamentals of vibration mechanics with computer vision techniques in order to infer material properties from small, often imperceptible motion in video. Objects tend to vibrate in a set of preferred modes. The shapes and frequencies of these modes depend on the structure and material properties of an object. Focusing on the case where geometry is known or fixed, we show how information about an object's modes of vibration can be extracted from video and used to make inferences about that object's material properties. We demonstrate our approach by estimating material properties for a variety of rods and fabrics by passively observing their motion in high-speed and regular frame-rate video.
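For intuition about how a recovered mode frequency constrains material properties, here is a hedged sketch for the clamped-free rod case using the standard Euler-Bernoulli cantilever relation; the known-geometry inputs and the simple FFT peak-picking are illustrative assumptions, not the paper's estimation procedure.

```python
# Sketch: pick the dominant vibration frequency from a video-derived motion signal,
# then invert f1 = (lam1^2 / (2*pi*L^2)) * sqrt(E*I / (rho*A)) for Young's modulus E.
import numpy as np

def dominant_frequency(signal, fps):
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return freqs[np.argmax(spectrum[1:]) + 1]        # skip the DC bin

def youngs_modulus_from_f1(f1, length, radius, density):
    area = np.pi * radius**2                         # cross-sectional area of the rod
    inertia = np.pi * radius**4 / 4.0                # second moment of area
    lam1 = 1.8751                                    # first cantilever eigenvalue
    # Solve the cantilever relation above for E, given measured f1 and known geometry.
    return (2.0 * np.pi * f1 * length**2 / lam1**2) ** 2 * density * area / inertia
```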
Journal of Infrastructure Systems | 2017
Justin G. Chen; Abe Davis; Neal Wadhwa; William T. Freeman; Oral Buyukozturk
Visual testing, as one of the oldest methods for nondestructive testing (NDT), plays a large role in the inspection of civil infrastructure. As NDT has evolved, more quantitative techniques...
Communications of The ACM | 2016
Neal Wadhwa; Hao-Yu Wu; Abe Davis; Michael Rubinstein; Eugene Shih; Gautham J. Mysore; Justin G. Chen; Oral Buyukozturk; John V. Guttag; William T. Freeman
The world is filled with important but visually subtle signals. A person's pulse, the breathing of an infant, the sag and sway of a bridge---these all create visual patterns that are too difficult to see with the naked eye. We present Eulerian Video Magnification, a computational technique for visualizing subtle color and motion variations in ordinary videos by making the variations larger. It is a microscope for small changes that are hard or impossible for us to see by ourselves. In addition, these small changes can be quantitatively analyzed and used to recover sounds from vibrations in distant objects, characterize material properties, and remotely measure a person's pulse.
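A heavily simplified sketch of the Eulerian idea follows. The published method operates on a spatial (Laplacian) pyramid with carefully chosen filters and amplification factors; the Butterworth band, `alpha`, and raw per-pixel processing below are illustrative assumptions only.

```python
# Sketch: temporally band-pass filter each pixel over the video, amplify that band,
# and add it back to the input frames.
import numpy as np
from scipy.signal import butter, filtfilt

def magnify(frames, fps, low_hz=0.8, high_hz=3.0, alpha=50.0):
    """frames: (T, H, W, C) float array in [0, 1]; fps: video frame rate."""
    b, a = butter(2, [low_hz, high_hz], btype="band", fs=fps)
    band = filtfilt(b, a, frames, axis=0)            # temporal filter, per pixel
    return np.clip(frames + alpha * band, 0.0, 1.0)  # amplify the band and add it back
```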
International Conference on Computer Graphics and Interactive Techniques | 2015
Abe Davis; Justin G. Chen
We present algorithms for extracting an image-space representation of object structure from video and using it to synthesize physically plausible animations of objects responding to new, previously unseen forces. Our representation of structure is derived from an image-space analysis of modal object deformation: projections of an object's resonant modes are recovered from the temporal spectra of optical flow in a video, and used as a basis for the image-space simulation of object dynamics. We describe how to extract this basis from video, and show that it can be used to create physically plausible animations of objects without any knowledge of scene geometry or material properties.
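The image-space simulation can be pictured as modal superposition of independent damped oscillators. The sketch below is a hedged illustration of that step only: the mode shapes, frequencies, damping, and forcing are treated as given inputs rather than recovered from optical-flow spectra as in the paper.

```python
# Sketch: evolve each mode as a damped oscillator and superpose mode shapes into
# a per-pixel displacement field for each output frame.
import numpy as np

def simulate_modes(mode_shapes, freqs_hz, damping, force_time, fps, n_frames):
    """mode_shapes: (M, H, W, 2) displacement basis; force_time: (n_frames, M) modal forcing.
    Returns a list of (H, W, 2) displacement fields, one per frame."""
    dt = 1.0 / fps
    omega = 2.0 * np.pi * np.asarray(freqs_hz)
    q = np.zeros(len(freqs_hz))                      # modal positions
    v = np.zeros(len(freqs_hz))                      # modal velocities
    frames = []
    for t in range(n_frames):
        acc = force_time[t] - 2.0 * damping * omega * v - omega**2 * q
        v += dt * acc                                # semi-implicit Euler step per mode
        q += dt * v
        frames.append(np.tensordot(q, mode_shapes, axes=1))  # superpose mode shapes
    return frames
```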
Computer Vision and Pattern Recognition | 2012
YiChang Shih; Abe Davis; Samuel W. Hasinoff; William T. Freeman
It is often desirable to detect whether a surface has been touched, even when the changes made to that surface are too subtle to see in a pair of before and after images. To address this challenge, we introduce a new imaging technique that combines computational photography and laser speckle imaging. Without requiring controlled laboratory conditions, our method is able to detect surface changes that would be indistinguishable in regular photographs. It is also mobile and does not need to be present at the time of contact with the surface, making it well suited for applications where the surface of interest cannot be constantly monitored. Our approach takes advantage of the fact that tiny surface deformations cause phase changes in reflected coherent light which alter the speckle pattern visible under laser illumination. We take before and after images of the surface under laser light and can detect subtle contact by correlating the speckle patterns in these images. A key challenge we address is that speckle imaging is very sensitive to the location of the camera, so removing and reintroducing the camera requires high-accuracy viewpoint alignment. To this end, we use a combination of computational rephotography and correlation analysis of the speckle pattern as a function of camera translation. Our technique provides a reliable way of detecting subtle surface contact at a level that was previously only possible under laboratory conditions. With our system, the detection of these subtle surface changes can now be brought into the wild.
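A minimal sketch of the correlation test described above, assuming the before and after speckle images are already registered; the window size and decorrelation threshold are illustrative assumptions, and the full system's viewpoint re-alignment step is omitted.

```python
# Sketch: a local window whose speckle pattern decorrelates between the registered
# before/after images is flagged as a likely surface contact.
import numpy as np

def speckle_change_map(before, after, win=32, threshold=0.5):
    """before, after: (H, W) float images under laser illumination, pre-registered."""
    H, W = before.shape
    touched = np.zeros((H // win, W // win), dtype=bool)
    for i in range(H // win):
        for j in range(W // win):
            a = before[i*win:(i+1)*win, j*win:(j+1)*win].ravel()
            b = after[i*win:(i+1)*win, j*win:(j+1)*win].ravel()
            a = (a - a.mean()) / (a.std() + 1e-9)
            b = (b - b.mean()) / (b.std() + 1e-9)
            touched[i, j] = np.dot(a, b) / a.size < threshold   # decorrelated patch
    return touched
```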
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017
Abe Davis; Katherine L. Bouman; Justin G. Chen; Michael Rubinstein; Oral Buyukozturk; William T. Freeman
The estimation of material properties is important for scene understanding, with many applications in vision, robotics, and structural engineering. This paper connects fundamentals of vibration mechanics with computer vision techniques in order to infer material properties from small, often imperceptible motions in video. Objects tend to vibrate in a set of preferred modes. The frequencies of these modes depend on the structure and material properties of an object. We show that by extracting these frequencies from video of a vibrating object, we can often make inferences about that object's material properties. We demonstrate our approach by estimating material properties for a variety of objects by observing their motion in high-speed and regular frame-rate video.
ACM Transactions on Graphics | 2017
Mackenzie Leake; Abe Davis; Anh Truong; Maneesh Agrawala
We present a system for efficiently editing video of dialogue-driven scenes. The input to our system is a standard film script and multiple video takes, each capturing a different camera framing or performance of the complete scene. Our system then automatically selects the most appropriate clip from one of the input takes for each line of dialogue, based on a user-specified set of film-editing idioms. Our system starts by segmenting the input script into lines of dialogue and then splitting each input take into a sequence of clips time-aligned with each line. Next, it labels the script and the clips with high-level structural information (e.g., emotional sentiment of dialogue, camera framing of clip, etc.). After this pre-process, our interface offers a set of basic idioms that users can combine in a variety of ways to build custom editing styles. Our system encodes each basic idiom as a Hidden Markov Model that relates editing decisions to the labels extracted in the pre-process. For short scenes (< 2 minutes, 8--16 takes, 6--27 lines of dialogue), applying the user-specified combination of idioms to the pre-processed inputs generates an edited sequence in 2--3 seconds. We show that this is significantly faster than the hours of user time skilled editors typically require to produce such edits and that the quick feedback lets users iteratively explore the space of edit designs.
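A hedged sketch of how an idiom encoded as an HMM-like cost model could pick one take per line of dialogue via Viterbi decoding; the cost matrices and labels below are illustrative stand-ins for the labels and idioms the paper extracts, not its actual models.

```python
# Sketch: hidden states are takes, observations are lines of dialogue. Emission
# costs score how well a take's labels fit the idiom for a given line; transition
# costs discourage disallowed cuts. Viterbi finds the minimum-cost edit.
import numpy as np

def viterbi_edit(emission_cost, transition_cost):
    """emission_cost: (n_lines, n_takes); transition_cost: (n_takes, n_takes).
    Returns the take index chosen for each line of dialogue."""
    n_lines, n_takes = emission_cost.shape
    cost = emission_cost[0].copy()
    back = np.zeros((n_lines, n_takes), dtype=int)
    for t in range(1, n_lines):
        total = cost[:, None] + transition_cost + emission_cost[t][None, :]
        back[t] = np.argmin(total, axis=0)           # best previous take for each current take
        cost = np.min(total, axis=0)
    path = [int(np.argmin(cost))]
    for t in range(n_lines - 1, 0, -1):              # backtrace the minimum-cost path
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```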