Yuandong Tian
Carnegie Mellon University
Publications
Featured research published by Yuandong Tian.
European Conference on Computer Vision | 2012
Yuandong Tian; C. Lawrence Zitnick; Srinivasa G. Narasimhan
Human pose estimation requires a versatile yet well-constrained spatial model for grouping locally ambiguous parts together to produce a globally consistent hypothesis. Previous works either use local deformable models deviating from a certain template, or use a global mixture representation in the pose space. In this paper, we propose a new hierarchical spatial model that can capture an exponential number of poses with a compact mixture representation on each part. Using latent nodes, it can represent high-order spatial relationships among parts with exact inference. Unlike recent hierarchical models that associate each latent node with a mixture of appearance templates (such as HOG), we use the hierarchical structure as a pure spatial prior, avoiding the large and often confounding appearance space. We verify the effectiveness of this model in three ways. First, samples representing human-like poses can be drawn from our model, showing its ability to capture high-order dependencies of parts. Second, our model achieves accurate reconstruction of unseen poses compared to a nearest-neighbor pose representation. Finally, our model achieves state-of-the-art performance on three challenging datasets, and substantially outperforms recent hierarchical models.
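To make the hierarchical-spatial-model idea concrete, here is a minimal sketch (not the paper's model or code): exact max-product inference on a tiny tree of parts where each child part carries a small mixture of preferred spatial offsets relative to its parent. The chain of parts, the 1-D position grid, the random unary scores and the offset values are all illustrative.

```python
import numpy as np

# Toy chain of parts on a 1-D position grid; K spatial mixture components per child part.
G, K = 20, 2
parts = ["torso", "upper_arm", "lower_arm"]
parent = {"upper_arm": "torso", "lower_arm": "upper_arm"}   # hypothetical tree structure

rng = np.random.default_rng(0)
unary = {p: rng.random(G) for p in parts}                   # illustrative local appearance scores
mean_offset = {("upper_arm", 0): 2, ("upper_arm", 1): -2,   # preferred child-parent offset per mixture type
               ("lower_arm", 0): 3, ("lower_arm", 1): -3}

def pair_score(child, k, x_child, x_parent):
    """Quadratic spatial prior around the k-th mixture offset."""
    return -0.5 * (x_child - x_parent - mean_offset[(child, k)]) ** 2

def upward_message(child, x_parent):
    """Best score of the child's subtree given the parent position (max-product)."""
    best = -np.inf
    for x in range(G):
        subtree = sum(upward_message(c, x) for c, p in parent.items() if p == child)
        for k in range(K):
            best = max(best, unary[child][x] + subtree + pair_score(child, k, x, x_parent))
    return best

root_scores = [unary["torso"][x] + upward_message("upper_arm", x) for x in range(G)]
print("best torso position:", int(np.argmax(root_scores)))
```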
International Conference on Computer Vision | 2009
Yuandong Tian; Srinivasa G. Narasimhan
A video sequence of an underwater scene taken from above the water surface suffers from severe distortions due to water fluctuations. In this paper, we simultaneously estimate the shape of the water surface and recover the planar underwater scene without using any calibration patterns, image priors, multiple viewpoints or active illumination. The key idea is to build a compact spatial distortion model of the water surface using the wave equation. Based on this model, we present a novel tracking technique that is designed specifically for water surfaces and addresses two unique challenges: the absence of an object model or template, and the presence of complex appearance changes in the scene due to water fluctuations. We show the effectiveness of our approach on both simulated and real scenes, with text and texture.
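As a rough illustration of a compact, wave-equation-based distortion model (a sketch under simplifying assumptions, not the paper's method), the snippet below evolves a 1-D surface height field with an explicit finite-difference wave-equation step and then warps a synthetic scene by a displacement proportional to the surface slope, a crude stand-in for refraction. All constants are illustrative.

```python
import numpy as np

# Evolve a 1-D water-surface height field with an explicit wave-equation step.
N, steps = 128, 200
c, dt, dx = 1.0, 0.1, 1.0                              # illustrative wave speed and step sizes
h = np.exp(-((np.arange(N) - N // 2) ** 2) / 50.0)     # initial bump on the surface
h_prev = h.copy()                                      # zero initial velocity

for _ in range(steps):
    lap = np.roll(h, 1) - 2 * h + np.roll(h, -1)       # discrete spatial Laplacian (periodic)
    h_next = 2 * h - h_prev + (c * dt / dx) ** 2 * lap
    h_prev, h = h, h_next

# Warp a synthetic 1-D "scene" by a displacement proportional to the surface slope,
# a crude stand-in for refraction at the water surface.
scene = np.sin(np.linspace(0, 8 * np.pi, N))
slope = np.gradient(h, dx)
coords = np.clip(np.arange(N) + 10.0 * slope, 0, N - 1)
distorted = np.interp(coords, np.arange(N), scene)
print("max displacement (pixels):", float(np.abs(10.0 * slope).max()))
```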
Computer Vision and Pattern Recognition | 2011
Yuandong Tian; Srinivasa G. Narasimhan
Distortions in images of documents, such as the pages of books, adversely affect the performance of optical character recognition (OCR) systems. Removing such distortions requires the 3D deformation of the document that is often measured using special and precisely calibrated hardware (stereo, laser range scanning or structured light). In this paper, we introduce a new approach that automatically reconstructs the 3D shape and rectifies a deformed text document from a single image. We first estimate the 2D distortion grid in an image by exploiting the line structure and stroke statistics in text documents. This approach does not rely on more noise-sensitive operations such as image binarization and character segmentation. The regularity in the text pattern is used to constrain the 2D distortion grid to be a perspective projection of a 3D parallelogram mesh. Based on this constraint, we present a new shape-from-texture method that computes the 3D deformation up to a scale factor using SVD. Unlike previous work, this formulation imposes no restrictions on the shape (e.g., a developable surface). The estimated shape is then used to remove both geometric distortions and photometric (shading) effects in the image. We demonstrate our techniques on documents containing a variety of languages, fonts and sizes.
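The phrase "computes the 3D deformation up to a scale factor using SVD" refers to the standard trick of solving a homogeneous linear system A x = 0 by taking the right singular vector associated with the smallest singular value. The snippet below illustrates only that generic step on synthetic data; it is not the paper's shape-from-texture formulation.

```python
import numpy as np

rng = np.random.default_rng(1)
x_true = rng.normal(size=4)
x_true /= np.linalg.norm(x_true)                       # unknown vector, defined only up to scale

# Synthetic constraint rows orthogonal to x_true (so A @ x_true ~= 0), plus measurement noise.
A = rng.normal(size=(20, 4))
A -= np.outer(A @ x_true, x_true)
A += 1e-3 * rng.normal(size=A.shape)

# Solve A x = 0 up to scale: right singular vector of the smallest singular value.
_, _, Vt = np.linalg.svd(A)
x_est = Vt[-1]
print("|cosine| to ground truth:", abs(float(x_est @ x_true)))
```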
Computer Vision and Pattern Recognition | 2010
Yuandong Tian; Srinivasa G. Narasimhan
Image alignment in the presence of non-rigid distortions is a challenging task. Typically, this involves estimating the parameters of a dense deformation field that warps a distorted image back to its undistorted template. Generative approaches based on parameter optimization such as Lucas-Kanade can get trapped within local minima. On the other hand, discriminative approaches like Nearest-Neighbor require a large number of training samples that grows exponentially with the desired accuracy. In this work, we develop a novel data-driven iterative algorithm that combines the best of both generative and discriminative approaches. For this, we introduce the notion of a “pull-back” operation that enables us to predict the parameters of the test image using training samples that are not in its neighborhood (not ∊-close) in parameter space. We prove that our algorithm converges to the global optimum using a significantly lower number of training samples that grows only logarithmically with the desired accuracy. We analyze the behavior of our algorithm extensively using synthetic data and demonstrate successful results on experiments with complex deformations due to water and clothing.
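A minimal toy sketch of the coarse-to-fine, nearest-neighbor-plus-pull-back idea described above, restricted to 1-D translation (this is not the paper's algorithm or its convergence analysis): each level predicts a shift with a small bank of shifted templates, the test signal is pulled back by the current estimate, and the next level refines the residual. The signal, grid sizes and shift values are made up.

```python
import numpy as np

N = 256
x = np.arange(N)
template = np.sin(6 * np.pi * x / N) + 0.3 * np.sin(20 * np.pi * x / N)   # periodic toy signal

def shift(signal, t):
    """Shift the signal to the right by t samples (with wraparound)."""
    return np.interp(x - t, x, signal, period=N)

test_shift = 13.7
img = shift(template, test_shift)

# Coarse-to-fine banks of shifted templates; each level refines the residual shift.
levels = [np.linspace(-20, 20, 9), np.linspace(-2.5, 2.5, 11)]
estimate = 0.0
for params in levels:
    bank = np.stack([shift(template, t) for t in params])
    residual = shift(img, -estimate)        # "pull back" the test signal by the current estimate
    estimate += params[np.argmin(np.linalg.norm(bank - residual, axis=1))]

print("true shift:", test_shift, "| estimate:", round(estimate, 2))
```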
Computer Vision and Pattern Recognition | 2009
Mohit Gupta; Yuandong Tian; Srinivasa G. Narasimhan; Li Zhang
Most active scene recovery techniques assume that a scene point is illuminated only directly by the illumination source. Consequently, global illumination effects due to inter-reflections, sub-surface scattering and volumetric scattering introduce strong biases in the recovered scene shape. Our goal is to recover scene properties in the presence of global illumination. To this end, we study the interplay between global illumination and the depth cue of illumination defocus. By expressing both these effects as low pass filters, we derive an approximate invariant that can be used to separate them without explicitly modeling the light transport. This is directly useful in any scenario where limited depth-of-field devices (such as projectors) are used to illuminate scenes with global light transport and significant depth variations. We show two applications: (a) accurate depth recovery in the presence of global illumination, and (b) factoring out the effects of defocus for correct direct-global separation in large depth scenes. We demonstrate our approach using scenes with complex shapes, reflectances, textures and translucencies.
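The key modeling observation in the abstract is that defocus and global illumination both act as low-pass filters on the incident illumination. The snippet below only checks the consequence that two such linear filters commute, which is what makes an order-independent invariant possible; the kernels are illustrative and this is not the paper's derivation.

```python
import numpy as np

rng = np.random.default_rng(2)
illum = rng.random(256)                                # high-frequency illumination pattern

defocus = np.exp(-np.linspace(-3, 3, 15) ** 2)         # illustrative defocus blur kernel
defocus /= defocus.sum()
global_lt = np.exp(-np.abs(np.linspace(-3, 3, 31)))    # illustrative global-transport kernel (also low-pass)
global_lt /= global_lt.sum()

# Linear low-pass filters commute: the order of defocus and global transport does not matter.
a = np.convolve(np.convolve(illum, defocus), global_lt)
b = np.convolve(np.convolve(illum, global_lt), defocus)
print("max difference between the two orders:", float(np.abs(a - b).max()))   # ~1e-16
```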
International Journal of Computer Vision | 2012
Yuandong Tian; Srinivasa G. Narasimhan
Image alignment in the presence of non-rigid distortions is a challenging task. Typically, this involves estimating the parameters of a dense deformation field that warps a distorted image back to its undistorted template. Generative approaches based on parameter optimization such as Lucas-Kanade can get trapped within local minima. On the other hand, discriminative approaches like nearest-neighbor require a large number of training samples that grows exponentially with respect to the dimension of the parameter space, and polynomially with the desired accuracy 1/ϵ. In this work, we develop a novel data-driven iterative algorithm that combines the best of both generative and discriminative approaches. For this, we introduce the notion of a “pull-back” operation that enables us to predict the parameters of the test image using training samples that are not in its neighborhood (not ϵ-close) in the parameter space. We prove that our algorithm converges to the global optimum using a significantly lower number of training samples that grows only logarithmically with the desired accuracy. We analyze the behavior of our algorithm extensively using synthetic data and demonstrate successful results on experiments with complex deformations due to water and clothing.
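As a rough back-of-the-envelope illustration of the sample-complexity claim above (made-up numbers, not the paper's analysis): a flat nearest-neighbor grid over a parameter range R at accuracy ϵ needs on the order of R/ϵ samples, while a coarse-to-fine scheme whose prediction shrinks the residual range by a constant factor at each level needs a number of samples that grows only logarithmically in R/ϵ.

```python
import math

R, eps, per_level = 40.0, 0.01, 8   # parameter range, target accuracy, samples per level (all made up)

flat = math.ceil(R / eps)           # flat nearest-neighbor grid at spacing eps

# Assume each level's prediction shrinks the residual range by roughly a factor of per_level.
levels = math.ceil(math.log(R / eps, per_level))
hierarchical = per_level * levels

print("flat grid samples:", flat)               # 4000
print("coarse-to-fine samples:", hierarchical)  # 32
```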
International Journal of Computer Vision | 2012
Mohit Gupta; Yuandong Tian; Srinivasa G. Narasimhan; Li Zhang
Projectors are increasingly being used as light-sources in computer vision applications. In several applications, they are modeled as point light sources, thus ignoring the effects of illumination defocus. In addition, most active vision techniques assume that a scene point is illuminated only directly by the light source, thus ignoring global light transport effects. Since both defocus and global illumination co-occur in virtually all scenes illuminated by projectors, ignoring them can result in strong, systematic biases in the recovered scene properties. To make computer vision techniques work for general real world scenes, it is thus important to account for both these effects. In this paper, we study the interplay between defocused illumination and global light transport. We show that both these seemingly disparate effects can be expressed as low pass filters on the incident illumination. Using this observation, we derive an invariant between the two effects, which can be used to separate the two. This is directly useful in scenarios where limited depth-of-field devices (such as projectors) are used to illuminate scenes with global light transport and significant depth variations. We show applications in two scenarios: (a) accurate depth recovery in the presence of global light transport, and (b) factoring out the effects of illumination defocus for correct direct-global component separation. We demonstrate our approach using scenes with complex shapes, reflectance properties, textures and translucencies.
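To make the "depth cue of illumination defocus" concrete, here is a small synthetic sketch (not the paper's algorithm): a projected high-frequency stripe pattern loses contrast as the blur grows with distance from the projector's focal plane, so ranking points by observed stripe contrast orders them by that distance. The blur-versus-depth law and all constants are purely illustrative.

```python
import numpy as np

def gaussian_blur(signal, sigma):
    """Blur a 1-D signal with a normalized Gaussian kernel."""
    r = int(4 * sigma)
    xs = np.arange(-r, r + 1)
    k = np.exp(-xs ** 2 / (2.0 * sigma ** 2))
    return np.convolve(signal, k / k.sum(), mode="same")

pattern = (np.arange(512) // 4 % 2).astype(float)       # projected stripes, 8-pixel period
focal_plane = 1.0
depths = np.array([1.1, 1.3, 1.7])                      # three scene points, arbitrary units

contrast = []
for d in depths:
    sigma = 1.0 + 8.0 * abs(d - focal_plane)            # illustrative blur-versus-depth law
    obs = gaussian_blur(pattern, sigma)[64:448]         # interior region only, to avoid border effects
    contrast.append(obs.max() - obs.min())

print("stripe contrast per point:", np.round(contrast, 4))
print("closest-to-focal-plane first:", np.argsort(contrast)[::-1])
```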
Computer Vision and Pattern Recognition | 2011
Dong Huang; Yuandong Tian; Fernando De la Torre
Kernel methods have been popular over the last decade for solving many computer vision, statistics and machine learning problems. An important open problem in kernel methods, both theoretically and practically, is the pre-image problem. The pre-image problem consists of finding a vector in the input space whose mapping is known in the feature space induced by a kernel. To solve the pre-image problem, this paper proposes a framework that computes an isomorphism between local Gram matrices in the input and feature spaces. Unlike existing methods that rely on analytic properties of kernels, our framework derives closed-form solutions to the pre-image problem in the case of non-differentiable and application-specific kernels. Experiments on the pre-image problem for visualizing cluster centers computed by kernel k-means and denoising high-dimensional images show that our algorithm outperforms state-of-the-art methods.
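To make the pre-image problem itself concrete, the snippet below approximates the input-space pre-image of a feature-space cluster mean under a Gaussian (RBF) kernel using the classical fixed-point iteration of Mika et al.; this is a standard analytic baseline of the kind the paper moves away from, not the local-Gram-matrix method described above.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(loc=2.0, scale=0.5, size=(50, 2))        # one synthetic cluster in input space
gamma = 1.0                                             # RBF kernel parameter
alpha = np.full(len(X), 1.0 / len(X))                   # coefficients of the feature-space cluster mean

z = X[0].copy()                                         # initialize at an arbitrary cluster member
for _ in range(100):
    w = alpha * np.exp(-gamma * np.sum((X - z) ** 2, axis=1))
    z = (w[:, None] * X).sum(axis=0) / w.sum()          # fixed-point update for the Gaussian kernel

print("pre-image estimate:", z)
print("input-space cluster mean:", X.mean(axis=0))
```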
Computer Vision and Pattern Recognition | 2012
Yuandong Tian; Srinivasa G. Narasimhan; Alan J. Vannevel
Turbulence near hot surfaces, such as desert terrains and roads during the summer, causes shimmering, distortion and blurring in images. While recent works have focused on image restoration, this paper explores what information about the scene can be extracted from the distortion caused by turbulence. Based on the physical model of wave propagation, we first study the relationship between the scene depth and the amount of distortion caused by homogeneous turbulence. We then extend this relationship to more practical scenarios such as finite-extent and height-varying turbulence, and present simple algorithms to estimate depth ordering, depth discontinuities and relative depth from a sequence of short-exposure images. In the case of general non-homogeneous turbulence, we show that a statistical property of turbulence can be used to improve long-range structure-from-motion (or stereo). We demonstrate the accuracy of our methods in both laboratory and outdoor settings and conclude that turbulence (when present) can be a strong and useful depth cue.
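A small synthetic sketch of the depth-ordering idea above (not the paper's physical model): if the variance of tracked image positions grows with scene depth under homogeneous turbulence, then ranking points by the empirical variance of their positions over many short-exposure frames recovers a near-to-far ordering. The linear variance-depth relation below is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
true_depths = np.array([5.0, 20.0, 60.0])               # three scene points, arbitrary units
frames = 500                                            # number of short-exposure frames

# Tracked 1-D image positions over the sequence: jitter std grows with depth (illustrative).
tracks = np.stack([0.02 * d * rng.normal(size=frames) for d in true_depths])

estimated_std = tracks.std(axis=1)
ordering = np.argsort(estimated_std)                    # smallest distortion first = nearest first
print("recovered near-to-far ordering:", ordering, "(expected [0 1 2])")
```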
Neural Information Processing Systems | 2017
Yuandong Tian; Qucheng Gong; Wenling Shang; Yuxin Wu; C. Lawrence Zitnick