Tianfan Xue
Massachusetts Institute of Technology
Publications
Featured research published by Tianfan Xue.
european conference on computer vision | 2016
Jiajun Wu; Tianfan Xue; Joseph J. Lim; Yuandong Tian; Joshua B. Tenenbaum; Antonio Torralba; William T. Freeman
Understanding 3D object structure from a single image is an important but difficult task in computer vision, mostly due to the lack of 3D object annotations in real images. Previous work tackles this problem by either solving an optimization task given 2D keypoint positions, or training on synthetic data with ground-truth 3D information. In this work, we propose the 3D INterpreter Network (3D-INN), an end-to-end framework that sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data. This is made possible mainly by two technical innovations. First, we propose a Projection Layer, which projects the estimated 3D structure into 2D space, so that 3D-INN can be trained to predict 3D structural parameters supervised by 2D annotations on real images. Second, heatmaps of keypoints serve as an intermediate representation connecting real and synthetic data, enabling 3D-INN to benefit from the variation and abundance of synthetic 3D objects without suffering from the statistical differences between real and synthesized images caused by imperfect rendering. The network achieves state-of-the-art performance on both 2D keypoint estimation and 3D structure recovery. We also show that the recovered 3D information can be used in other vision applications, such as 3D rendering and image retrieval.
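The Projection Layer described above boils down to projecting predicted 3D keypoints into the image plane so that a 2D keypoint loss can supervise the 3D parameters. Below is a minimal numpy sketch of that idea under a weak-perspective camera; the function names, the weak-perspective assumption, and the toy data are illustrative, not the paper's exact formulation.

```python
import numpy as np

def project_weak_perspective(X, R, s, t):
    """Project Nx3 keypoints X with rotation R (3x3), scale s, 2D translation t.

    Weak-perspective model: rotate into camera coordinates, drop depth,
    then scale and shift in the image plane.
    """
    X_cam = X @ R.T
    return s * X_cam[:, :2] + t

def keypoint_loss(X_pred, R, s, t, kp_2d, visible):
    """L2 loss between projected 3D keypoints and annotated 2D keypoints.

    `visible` masks out unannotated or occluded keypoints, so real images with
    only 2D labels can still supervise the 3D structure parameters.
    """
    proj = project_weak_perspective(X_pred, R, s, t)
    diff = (proj - kp_2d) * visible[:, None]
    return np.sum(diff ** 2) / max(visible.sum(), 1)

# Toy usage: a unit cube's corners, an identity camera, and noisy 2D labels.
X = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
kp = X[:, :2] * 100 + np.random.randn(8, 2)   # hypothetical 2D annotations
print(keypoint_loss(X, np.eye(3), 100.0, np.zeros(2), kp, np.ones(8)))
```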
international conference on multimedia and expo | 2012
Yanjie Li; Tianfan Xue; Lifeng Sun; Jianzhuang Liu
The fast development of time-of-flight (ToF) cameras in recent years enables the capture of high frame-rate 3D depth maps of moving objects. However, the resolution of depth maps captured by ToF cameras is rather limited, so they cannot be directly used to build high-quality 3D models. To handle this problem, we propose a novel joint example-based depth map super-resolution method, which converts a low-resolution depth map to a high-resolution one using a registered high-resolution color image as a reference. Unlike previous depth map SR methods that have no training stage, we learn a mapping function from a set of training samples and enhance the resolution of the depth map via a sparse coding algorithm. We further use a reconstruction constraint to make object edges sharper. Experimental results show that our method outperforms state-of-the-art methods for depth map super-resolution.
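A toy sketch of the example-based idea: learn a coupled pair of dictionaries from training patches (low-resolution depth plus the registered color patch as features, high-resolution depth as targets), then super-resolve a new patch by sparse-coding its features and reusing the codes with the high-resolution dictionary. This is a simplified stand-in under assumed patch sizes and random placeholder data, not the paper's exact formulation, and it omits the reconstruction-constraint step.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode

# Assumed training data: each row is a vectorized patch.
#   F_train: upsampled low-res depth patch concatenated with the registered
#            color patch -- the "joint" features.
#   H_train: the corresponding high-res depth patch.
rng = np.random.default_rng(0)
F_train = rng.standard_normal((500, 2 * 25))   # e.g. two 5x5 patches stacked
H_train = rng.standard_normal((500, 81))       # e.g. one 9x9 high-res patch

# 1) Learn a dictionary and sparse codes for the low-res/color features.
dl = DictionaryLearning(n_components=64, transform_algorithm='omp',
                        transform_n_nonzero_coefs=5, max_iter=20, random_state=0)
codes = dl.fit_transform(F_train)              # (500, 64) sparse codes
D_low = dl.components_                         # (64, 50) feature dictionary

# 2) Fit a coupled high-res dictionary so that codes @ D_high ~= H_train.
D_high, *_ = np.linalg.lstsq(codes, H_train, rcond=None)

# 3) Super-resolve a new patch: encode its features, decode with D_high.
f_new = rng.standard_normal((1, 2 * 25))
code_new = sparse_encode(f_new, D_low, algorithm='omp', n_nonzero_coefs=5)
h_patch = code_new @ D_high                    # predicted high-res depth patch
print(h_patch.shape)
```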
international conference on computer graphics and interactive techniques | 2015
Tianfan Xue; Michael Rubinstein; Ce Liu; William T. Freeman
We present a unified computational approach for taking photos through reflecting or occluding elements such as windows and fences. Rather than capturing a single image, we instruct the user to take a short image sequence while slightly moving the camera. Differences that often exist in the relative positions of the background and the obstructing elements with respect to the camera allow us to separate them based on their motions, and to recover the desired background scene as if the visual obstructions were not there. We show results on controlled experiments and many real and practical scenarios, including shooting through reflections, fences, and raindrop-covered windows.
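The paper's layer-decomposition optimization is beyond a short sketch, but its core intuition, that background and obstruction move differently across a short burst, can be illustrated with a much cruder heuristic: register every frame to a reference (the homography tends to lock onto the dominant background motion) and take a per-pixel temporal median, which rejects the obstruction as an outlier. This is only an assumed simplification, not the authors' method.

```python
import numpy as np
import cv2

def remove_obstruction_median(frames):
    """Crude motion-based obstruction suppression over a short burst of frames.

    frames: list of BGR images (e.g. from a handheld burst). The first frame
    is the reference; the others are warped to it before the temporal median.
    """
    ref = frames[0]
    h, w = ref.shape[:2]
    orb = cv2.ORB_create(2000)
    kp_r, des_r = orb.detectAndCompute(cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY), None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    aligned = [ref.astype(np.float32)]
    for f in frames[1:]:
        kp_f, des_f = orb.detectAndCompute(cv2.cvtColor(f, cv2.COLOR_BGR2GRAY), None)
        matches = matcher.match(des_f, des_r)
        src = np.float32([kp_f[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp_r[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        # RANSAC homography: dominated by the (planar-ish) background motion.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        aligned.append(cv2.warpPerspective(f, H, (w, h)).astype(np.float32))

    # Obstruction pixels drift across the aligned stack, so the median drops them.
    return np.median(np.stack(aligned, axis=0), axis=0).astype(np.uint8)
```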
computer vision and pattern recognition | 2012
Tianfan Xue; Jianzhuang Liu; Xiaoou Tang
Recovering 3D geometry from a single 2D line drawing is an important and challenging problem in computer vision. It has wide applications in interactive 3D modeling from images, computer-aided design, and 3D object retrieval. Previous methods of 3D reconstruction from line drawings are mainly based on a set of heuristic rules; they are not robust to sketch errors and often fail for objects that do not satisfy the rules. In this paper, we propose a novel approach, called example-based 3D object reconstruction from line drawings, based on the observation that a natural or man-made complex 3D object normally consists of a set of basic 3D objects. Given a line drawing, a graphical model is built in which each node denotes a basic object whose candidates come from a 3D model (example) database. The 3D reconstruction is solved by maximum-a-posteriori (MAP) estimation so that the reconstructed result best fits the line drawing. Our experiments show that this approach achieves much better reconstruction accuracy and is more robust to imperfect line drawings than previous methods.
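The MAP step above amounts to choosing, for every node of the graphical model, one candidate 3D example so that per-node fit costs and pairwise compatibility costs are jointly minimized. The toy sketch below performs exact MAP on a chain-structured model by dynamic programming; the chain structure and the random costs are illustrative placeholders, not the paper's actual potentials.

```python
import numpy as np

def chain_map(unary, pairwise):
    """Exact MAP (minimum-cost) labeling on a chain by dynamic programming.

    unary:    list of length-T arrays; unary[t][k] = cost of candidate k at node t
              (e.g. how well example k fits that part of the line drawing).
    pairwise: list of T-1 matrices; pairwise[t][k, l] = cost of candidate k at
              node t next to candidate l at node t+1 (geometric compatibility).
    Returns the per-node candidate indices minimizing the total cost.
    """
    T = len(unary)
    cost = unary[0].astype(float)
    back = []
    for t in range(1, T):
        total = cost[:, None] + pairwise[t - 1] + unary[t][None, :]
        back.append(np.argmin(total, axis=0))   # best predecessor per label
        cost = np.min(total, axis=0)
    labels = [int(np.argmin(cost))]
    for bp in reversed(back):
        labels.append(int(bp[labels[-1]]))
    return labels[::-1]

# Toy usage: 3 nodes, 4 candidate examples each, random costs.
rng = np.random.default_rng(1)
u = [rng.random(4) for _ in range(3)]
p = [rng.random((4, 4)) for _ in range(2)]
print(chain_map(u, p))
```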
european conference on computer vision | 2014
Tianfan Xue; Michael Rubinstein; Neal Wadhwa; Anat Levin; William T. Freeman
We present principled algorithms for measuring the velocity and 3D location of refractive fluids, such as hot air or gas, from natural videos with textured backgrounds. Our main observation is that intensity variations related to movements of refractive fluid elements, as observed by one or more video cameras, are consistent over small space-time volumes. We call these intensity variations “refraction wiggles”, and use them as features for tracking and stereo fusion to recover the fluid motion and depth from video sequences. We give algorithms for 1) measuring the (2D, projected) motion of refractive fluids in monocular videos, and 2) recovering the 3D position of points on the fluid from stereo cameras. Unlike pixel intensities, wiggles can be extremely subtle and cannot be known with the same level of confidence for all pixels, depending on factors such as background texture and physical properties of the fluid. We thus carefully model uncertainty in our algorithms for robust estimation of fluid motion and depth. We show results on controlled sequences, synthetic simulations, and natural videos. Different from previous approaches for measuring refractive flow, our methods operate directly on videos captured with ordinary cameras, do not require auxiliary sensors, light sources or designed backgrounds, and can correctly detect the motion and location of refractive fluids even when they are invisible to the naked eye.
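A bare-bones way to approximate the "refraction wiggles" idea with off-the-shelf tools: compute the small apparent motion of the textured background in each frame relative to a reference (the wiggles), then run optical flow again on those wiggle fields to see how the distortions themselves move. This rough illustration ignores the paper's uncertainty modeling and stereo fusion, and the Farneback parameters are assumed defaults.

```python
import numpy as np
import cv2

def refraction_flow(frames):
    """Rough 'flow of the wiggles' sketch (not the paper's probabilistic model).

    frames: list of 8-bit grayscale images of a textured background; frames[0]
            is treated as the undistorted (or time-averaged) reference.
    Step 1: optical flow from the reference to each frame = the wiggle field.
    Step 2: optical flow between consecutive wiggle-magnitude maps = how the
            refractive distortion itself moves across the image.
    """
    ref = frames[0]
    wiggle_mags = []
    for f in frames[1:]:
        w = cv2.calcOpticalFlowFarneback(ref, f, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(w, axis=2)
        wiggle_mags.append(cv2.normalize(mag, None, 0, 255,
                                         cv2.NORM_MINMAX).astype(np.uint8))

    flows = []
    for a, b in zip(wiggle_mags[:-1], wiggle_mags[1:]):
        flows.append(cv2.calcOpticalFlowFarneback(a, b, None,
                                                  0.5, 3, 15, 3, 5, 1.2, 0))
    return flows  # one (H, W, 2) refractive-motion field per frame pair
```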
computer vision and pattern recognition | 2011
Tianfan Xue; Jianzhuang Liu; Xiaoou Tang
Recovering 3D geometry from a single view of an object is an important and challenging problem in computer vision. Previous methods mainly focus on one specific class of objects without large topological changes, such as cars, faces, or human bodies. In this paper, we propose a novel single-view reconstruction algorithm for symmetric piecewise planar objects that is not restricted to particular object classes. Symmetry is ubiquitous in man-made and natural objects and provides rich information for 3D reconstruction. Given a single view of a symmetric piecewise planar object, we first find all the symmetric line pairs; the geometric properties of symmetric objects are used to narrow down the search space. Then, based on the symmetric lines, a depth map is recovered through a Markov random field. Experimental results show that our algorithm can efficiently recover the 3D shapes of different objects with significant topological variations.
IEEE Transactions on Image Processing | 2012
Tianfan Xue; Jianzhuang Liu; Xiaoou Tang
3-D technologies are considered the next generation of multimedia applications. Currently, one of the challenges faced by 3-D applications is the shortage of 3-D resources. To solve this problem, many 3-D modeling methods have been proposed to directly recover 3-D geometry from 2-D images. However, these single-view modeling methods either require intensive user interaction or are restricted to a specific kind of object. In this paper, we propose a novel 3-D modeling approach that recovers 3-D geometry from a single image of a symmetric object with minimal user interaction. Symmetry is one of the most common properties of natural and man-made objects. Given a single view of a symmetric object, the user marks some symmetric lines and depth-discontinuity regions on the image. Our algorithm first finds a set of planes that approximately fit the object, and then a rough 3-D point cloud is generated by an optimization procedure. The occluded part of the object is further recovered using symmetry information. Experimental results on various indoor and outdoor objects show that the proposed system can obtain 3-D models from single images with only a little user interaction.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018
Shaul Oron; Tali Dekel; Tianfan Xue; William T. Freeman; Shai Avidan
We propose a novel method for template matching in unconstrained environments. Its essence is the Best-Buddies Similarity (BBS), a useful, robust, and parameter-free similarity measure between two sets of points. BBS is based on counting the number of Best-Buddies Pairs (BBPs)—pairs of points in source and target sets that are mutual nearest neighbours, i.e., each point is the nearest neighbour of the other. BBS has several key features that make it robust against complex geometric deformations and high levels of outliers, such as those arising from background clutter and occlusions. We study these properties, provide a statistical analysis that justifies them, and demonstrate the consistent success of BBS on a challenging real-world dataset while using different types of features.
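The Best-Buddies Similarity itself can be stated in a few lines: count the mutual nearest-neighbour pairs between the two point sets. A minimal numpy sketch follows; it uses plain Euclidean distance and normalizes by min(|P|, |Q|), whereas the paper works with joint appearance-plus-location features, so treat the feature choice here as an assumption.

```python
import numpy as np

def best_buddies_similarity(P, Q):
    """Best-Buddies Similarity between point sets P (n, d) and Q (m, d).

    A pair (p_i, q_j) is a Best-Buddies Pair if p_i's nearest neighbour in Q
    is q_j AND q_j's nearest neighbour in P is p_i. BBS counts such pairs,
    normalized here by min(n, m) so the score lies in [0, 1].
    """
    # Pairwise squared Euclidean distances, shape (n, m).
    d = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    nn_pq = d.argmin(axis=1)          # for each p_i, index of its NN in Q
    nn_qp = d.argmin(axis=0)          # for each q_j, index of its NN in P
    mutual = sum(1 for i, j in enumerate(nn_pq) if nn_qp[j] == i)
    return mutual / min(len(P), len(Q))

# Toy usage: template points vs. a candidate window containing a matching
# part plus clutter, each point a hypothetical (x, y, r, g, b) feature vector.
rng = np.random.default_rng(0)
P = rng.random((50, 5))
Q = np.vstack([P[:30] + 0.01 * rng.standard_normal((30, 5)),   # matching part
               rng.random((40, 5))])                            # clutter
print(best_buddies_similarity(P, Q))
```

Because a point only contributes when the nearest-neighbour relation is mutual, outliers from clutter or occlusion rarely find a best buddy, which is what makes the measure robust without any tunable parameters.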
acm multimedia | 2011
Yanjie Li; Lifeng Sun; Tianfan Xue
Recent development of depth sensors has facilitated the progress of 2D-plus-depth methods for 3D video representation, for which frame-rate up-conversion (FRUC) of the depth video is a critical step. However, due to the computational cost of state-of-the-art FRUC methods, real-time application of 2D-plus-depth is still limited. In this paper, we present a method that speeds up FRUC of the depth video by treating it as part of the video coding process, combined with a novel color-mapping algorithm that improves the quality of temporal upsampling. Experiments show that the proposed system saves up to 99.5% of the frame interpolation time while producing reconstructed depth video virtually identical to that of state-of-the-art methods.
computer vision and pattern recognition | 2015
Tianfan Xue; Hossein Mobahi; Frederic Durand; William T. Freeman
When viewed through a small aperture, a moving image provides incomplete information about the local motion: only the component of motion along the local image gradient is constrained. In an essential part of optical flow algorithms, information must be aggregated from nearby image locations in order to estimate all components of motion. This limitation of local evidence for estimating optical flow is called “the aperture problem”. We pose and solve a generalization of the aperture problem for moving refractive elements. We consider a common setup in air flow imaging or telescope observation: a camera viewing a static background, with an unknown refractive element undergoing unknown motion between them. We then address a fundamental question: what does the local image motion tell us about the motion of refractive elements? We show that the information gleaned through a local aperture in this case is very different from that for optical flow. In optical flow, the movement of 1D structure already constrains the motion in a certain direction. However, we cannot infer any information about the refractive motion from the movement of 1D structure in the observed sequence, and can only recover one component of the motion from 2D structure. Results on both simulated and real sequences are shown to illustrate our theory.
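The classical aperture problem the abstract builds on can be made concrete with the structure tensor of a patch: for 1D structure (a straight edge) the tensor is rank-deficient and only the normal component of motion is recoverable, while 2D structure (a corner) constrains both components. The snippet below illustrates that standard optical-flow fact; it is not the paper's new result about refractive motion.

```python
import numpy as np

def structure_tensor_eigs(patch):
    """Eigenvalues of a patch's structure tensor.

    Two clearly positive eigenvalues -> 2D structure: both motion components
    are constrained. One (near-)zero eigenvalue -> 1D structure: only the
    component along the gradient is known (the aperture problem).
    """
    gy, gx = np.gradient(patch.astype(float))
    J = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    return np.linalg.eigvalsh(J)

x, y = np.meshgrid(np.arange(32), np.arange(32))
edge = (x > 16).astype(float)                   # 1D structure: vertical edge
corner = ((x > 16) & (y > 16)).astype(float)    # 2D structure: a corner
print(structure_tensor_eigs(edge))              # smallest eigenvalue ~ 0
print(structure_tensor_eigs(corner))            # both eigenvalues well above 0
```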