Publication


Featured research published by Yilei Xu.


Computer Vision and Pattern Recognition | 2007

Pose and Illumination Invariant Face Recognition in Video

Yilei Xu; Amit K. Roy-Chowdhury; Keyur Patel

The use of video sequences for face recognition has been studied less than image-based approaches. In this paper, we present a framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions. Our method is based on a recently obtained theoretical result that can integrate the effects of motion, lighting and shape in generating an image using a perspective camera. This result can be used to estimate the pose and illumination conditions for each frame of the probe sequence. Then, using a 3D face model, we synthesize images corresponding to the pose and illumination conditions estimated in the probe sequence. Similarity between the synthesized images and the probe video is computed by integrating over the entire sequence. The method can handle situations where the pose and lighting conditions in the training and testing data are completely disjoint.


International Conference on Computer Vision | 2005

Integrating the effects of motion, illumination and structure in video sequences

Yilei Xu; Amit K. Roy-Chowdhury

Most work in computer vision has concentrated on studying the individual effect of motion and illumination on a 3D object. In this paper, we present a theory for combining the effects of motion, illumination, 3D structure, albedo, and camera parameters in a sequence of images obtained by a perspective camera. We show that the set of all Lambertian reflectance functions of a moving object, illuminated by arbitrarily distant light sources, lies close to a bilinear subspace consisting of nine illumination variables and six motion variables. This result implies that, given an arbitrary video sequence, it is possible to recover the 3D structure, motion and illumination conditions simultaneously using the bilinear subspace formulation. The derivation is based on the intuitive notion that, given an illumination direction, the images of a moving surface cannot change suddenly over a short time period. We experimentally compare the images obtained using our theory with ground truth data and show that the difference is small and acceptable. We also provide experimental results on real data by synthesizing video sequences of a 3D face with various combinations of motion and illumination directions.
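
One schematic way to write the bilinear claim, with the grouping and the basis images $b_i^t$, $A_{ij}^t$ being our notation rather than the paper's:

$$ I_{t+\delta t}(\mathbf{x}) \;\approx\; \sum_{i=1}^{9} l_i \left( b_i^{t}(\mathbf{x}) + \sum_{j=1}^{6} m_j \, A_{ij}^{t}(\mathbf{x}) \right), $$

where $l_1, \dots, l_9$ are spherical-harmonic illumination coefficients, $m_1, \dots, m_6$ are the rigid-motion increments (3D translation and rotation) over a short interval $\delta t$, and $b_i^t$, $A_{ij}^t$ are basis images determined by the shape and albedo at time $t$. The image is linear in the illumination for fixed motion and linear in the motion for fixed illumination, which is exactly the nine-by-six bilinear structure the abstract describes.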


International Conference on Image Processing | 2007

Super-Resolved Facial Texture Under Changing Pose and Illumination

Jiangang Yu; Bir Bhanu; Yilei Xu; Amit K. Roy-Chowdhury

In this paper, we propose a method to incrementally super-resolve 3D facial texture by integrating information frame by frame from a video captured under changing poses and illuminations. First, we recover illumination, 3D motion and shape parameters from our tracking algorithm. This information is then used to super-resolve 3D texture using iterative back-projection (IBP) method. Finally, the super-resolved texture is fed back to the tracking part to improve the estimation of illumination and motion parameters. This closed-loop process continues to refine the texture as new frames come in. We also propose a local-region based scheme to handle non-rigidity of the human face. Experiments demonstrate that our framework not only incrementally super-resolves facial images, but recovers the detailed expression changes in high quality.
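
As a rough illustration of the back-projection step, here is a minimal sketch, assuming the frames have already been aligned (in the paper, alignment comes from the estimated 3D motion and illumination) and using a box-average downsampler as a placeholder camera model; none of these names come from the paper.

import numpy as np

def downsample(img, s):
    # Placeholder camera model: box-average blur followed by decimation by s.
    h, w = img.shape
    return img[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(img, s):
    # Back-project a low-resolution residual onto the high-resolution grid.
    return np.repeat(np.repeat(img, s, axis=0), s, axis=1)

def ibp(observations, scale, n_iters=50, step=1.0):
    # Start from the average of the naively upsampled low-resolution frames.
    x = np.mean([upsample(y, scale) for y in observations], axis=0)
    for _ in range(n_iters):
        residual = np.zeros_like(x)
        for y in observations:
            # Simulate the camera on the current estimate and back-project the error.
            residual += upsample(y - downsample(x, scale), scale)
        x += step * residual / len(observations)
    return x

In the paper this loop is closed further: the refined texture is fed back to the tracker, improving the pose and illumination estimates that the alignment depends on.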


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011

A Physics-Based Analysis of Image Appearance Models

Yilei Xu; Amit K. Roy-Chowdhury

Linear and multilinear models (PCA, 3DMM, AAM/ASM, and multilinear tensors) of object shape/appearance have been very popular in computer vision. In this paper, we analyze the applicability of these heuristic models from the fundamental physical laws of object motion and image formation. We prove that under suitable conditions, the image appearance space can be closely approximated as multilinear, with the illumination and texture subspaces being trilinearly combined with the direct sum of the motion and deformation subspaces. This result provides a physics-based understanding of many of the successes and limitations of the linear and multilinear approaches existing in the computer vision literature, and also identifies some of the conditions under which they are valid. It provides an analytical representation of the image space in terms of different physical factors that affect the image formation process. Numerical analysis of the accuracy of the physics-based models is performed, and tracking results on real data are presented.
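
Read literally, the multilinear claim can be sketched as follows; the core tensor $\mathcal{C}$ and the mode-product notation are ours, not the paper's:

$$ \mathbf{v} \;\approx\; \mathcal{C} \times_1 \mathbf{l} \times_2 \boldsymbol{\rho} \times_3 (\mathbf{m} \oplus \mathbf{d}), $$

where $\mathbf{v}$ is the vectorized image, $\mathbf{l}$ the illumination coefficients, $\boldsymbol{\rho}$ the texture coefficients, $\mathbf{m} \oplus \mathbf{d}$ the direct sum of the motion and deformation parameters, and $\times_k$ the mode-$k$ tensor-vector product. Holding any two factors fixed leaves a model that is linear in the third, which is why purely linear methods such as PCA succeed locally but degrade when several factors vary at once.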


International Conference on Image Processing | 2005

The joint illumination and motion space of video sequences

Yilei Xu; Amit K. Roy-Chowdhury

It has been proved that the set of all Lambertian reflectance functions obtained with arbitrarily distant light sources lies close to a 9D linear subspace. We extend this result from still images to video sequences. We show that the set of all Lambertian reflectance functions of a moving object at any position, illuminated by arbitrarily distant light sources, lies close to a bilinear subspace consisting of nine illumination variables and six motion variables. This result implies that, when the position and 3D model of an object are known at one time instant, the reflectance images at future time instants can be estimated using the bilinear subspace. This is based on the fact that, given the illumination direction, the image of a moving surface cannot change suddenly over a short time period. We apply our theory to synthesize video sequences of a 3D face with various combinations of motion and illumination directions.
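
The 9D result invoked here is the standard spherical-harmonic approximation of Lambertian reflectance: for a surface point with albedo $\lambda$ and unit normal $\mathbf{n} = (n_x, n_y, n_z)$, the nine basis images are, up to constant factors,

$$ \lambda, \quad \lambda n_x, \quad \lambda n_y, \quad \lambda n_z, \quad \lambda (3 n_z^2 - 1), \quad \lambda n_x n_z, \quad \lambda n_y n_z, \quad \lambda (n_x^2 - n_y^2), \quad \lambda n_x n_y, $$

and any image of the (convex) object under arbitrary distant illumination is closely approximated by a linear combination of them. The paper's extension lets the normals, and hence this basis, evolve with the six rigid-motion parameters, which is what produces the bilinear structure.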


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008

Inverse Compositional Estimation of 3D Pose and Lighting in Dynamic Scenes

Yilei Xu; Amit K. Roy-Chowdhury

In this paper, we show how we can estimate, accurately and efficiently, the 3D motion of a rigid object and time-varying lighting in a dynamic scene. This is achieved in an inverse compositional tracking framework with a novel warping function that involves a 2D → 3D → 2D transformation. This also allows us to extend traditional two-frame inverse compositional tracking to a sequence of frames, leading to even higher computational savings. We prove the theoretical convergence of this method and show that it leads to a significant reduction in computational burden. Experimental analysis on multiple video sequences shows impressive speedup over existing methods while retaining a high level of accuracy.
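
A minimal sketch of where the savings come from, assuming a generic warp; warp_image, jacobian, and compose_inverse are caller-supplied placeholders of our own naming, and the paper's actual 2D → 3D → 2D warp is not reproduced here. Everything that depends only on the template is computed once, outside the per-frame loop.

import numpy as np

def inverse_compositional(template, grad_template, jacobian, warp_image,
                          compose_inverse, p, frames, n_iters=20, tol=1e-6):
    # Steepest-descent images (n_pixels x n_params): depend only on the template,
    # so they are precomputed once. Here jacobian is dW/dp, assumed constant
    # across pixels (true for, e.g., a pure-translation warp).
    sd = grad_template @ jacobian
    H_inv = np.linalg.inv(sd.T @ sd)  # Gauss-Newton Hessian, also precomputed
    for frame in frames:
        for _ in range(n_iters):
            # Only this residual is recomputed per iteration.
            error = warp_image(frame, p) - template
            dp = H_inv @ (sd.T @ error)
            # Inverse compositional update: compose the *inverse* increment.
            p = compose_inverse(p, dp)
            if np.linalg.norm(dp) < tol:
                break
    return p

Because sd and H_inv never change, a long sequence amortizes the expensive linear algebra over all frames, which is the multi-frame extension the abstract highlights.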


EURASIP Journal on Advances in Signal Processing | 2007

Integrating Illumination, Motion, and Shape Models for Robust Face Recognition in Video

Yilei Xu; Amit K. Roy-Chowdhury; Keyur Patel

The use of video sequences for face recognition has been studied less than image-based approaches. In this paper, we present an analysis-by-synthesis framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions. This requires tracking the video sequence, as well as recognition algorithms that are able to integrate information over the entire video; we address both of these problems. Our method is based on a recently obtained theoretical result that can integrate the effects of motion, lighting, and shape in generating an image using a perspective camera. This result can be used to estimate the pose and structure of the face and the illumination conditions for each frame in a video sequence in the presence of multiple point and extended light sources. We propose a new inverse compositional estimation approach for this purpose. We then synthesize images using the face model estimated from the training data corresponding to the conditions in the probe sequences. Similarity between the synthesized and the probe images is computed using suitable distance measures. The method can handle situations where the pose and lighting conditions in the training and testing data are completely disjoint. We show detailed performance analysis results and recognition scores on a large video dataset.
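
A minimal sketch of the analysis-by-synthesis decision rule, assuming the tracking stage has already produced per-frame pose and illumination estimates; all names here (synthesize, gallery_models, and the plain L2 distance) are illustrative stand-ins, not the paper's API.

import numpy as np

def recognize(probe_frames, probe_params, gallery_models, synthesize):
    # probe_params: one (pose, light) estimate per frame, from the tracker.
    # gallery_models: identity -> 3D face model learned from training data.
    scores = {}
    for identity, model in gallery_models.items():
        # Render the gallery face under the probe's conditions and integrate
        # the per-frame distances over the whole sequence.
        scores[identity] = sum(
            np.linalg.norm(synthesize(model, pose, light) - frame)
            for frame, (pose, light) in zip(probe_frames, probe_params)
        )
    # The identity whose synthesized sequence is closest to the probe wins.
    return min(scores, key=scores.get)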


Computer Vision and Pattern Recognition | 2008

Learning a geometry integrated image appearance manifold from a small training set

Yilei Xu; Amit K. Roy-Chowdhury

While low-dimensional image representations have been very popular in computer vision, they suffer from two limitations: (i) they require collecting a large and varied training set to learn a low-dimensional set of basis functions, and (ii) they do not retain information about the 3D geometry of the object being imaged. In this paper, we show that it is possible to estimate low-dimensional manifolds that describe object appearance while retaining the geometrical information about the 3D structure of the object. By using a combination of analytically derived geometrical models and statistical learning methods, this can be achieved using a much smaller training set than most of the existing approaches. Specifically, we derive a quadrilinear manifold of object appearance that can represent the effects of illumination, pose, identity and deformation, and the basis functions of the tangent space to this manifold depend on the 3D surface normals of the objects. We show experimental results on constructing this manifold and how to efficiently track on it using an inverse compositional algorithm.
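
To make "quadrilinear manifold" concrete, here is a toy numpy rendering of a four-factor multilinear map; the random core tensor and the dimensions are placeholders, whereas in the paper the core is derived analytically and its tangent-space basis depends on the 3D surface normals.

import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for the four factors named in the abstract.
n_pix, n_light, n_id, n_pose, n_def = 64, 9, 5, 6, 4

# Stand-in core tensor; the paper derives this analytically from geometry.
core = rng.normal(size=(n_pix, n_light, n_id, n_pose, n_def))

light = rng.normal(size=n_light)   # illumination coefficients
ident = rng.normal(size=n_id)      # identity coefficients
pose = rng.normal(size=n_pose)     # pose/motion coefficients
deform = rng.normal(size=n_def)    # deformation coefficients

# Quadrilinear map: the image is linear in each factor with the others fixed.
image = np.einsum('pabcd,a,b,c,d->p', core, light, ident, pose, deform)

Fixing three of the four factors leaves a small linear subspace in the fourth, which is what makes such a manifold learnable from far fewer training images than an unstructured appearance model.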


Archive | 2007

Pose and Illumination Invariant Face Recognition Using Video Sequences

Amit K. Roy-Chowdhury; Yilei Xu

Pose and illumination variations remain a persistent challenge in face recognition. In this paper, we present a framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions. Our method is based on a recently obtained theoretical result that can integrate the effects of motion, lighting and shape in generating an image using a perspective camera. This result can be used to estimate the pose and structure of the face and the illumination conditions for each frame in a video sequence in the presence of multiple point and extended light sources. The pose and illumination estimates in the probe and gallery sequences can then be compared for recognition applications. If similar parameters exist in both the probe and gallery, the similarity between the sets of images can be computed directly. If the lighting and pose parameters in the probe and gallery are different, we synthesize the images using the face model estimated from the training data corresponding to the conditions in the probe sequences. The method can handle situations where the pose and lighting conditions in the training and testing data are very different. We show results on a video-based face recognition dataset that we have collected.


Computer Vision and Pattern Recognition | 2008

A theoretical analysis of linear and multi-linear models of image appearance

Yilei Xu; Amit K. Roy-Chowdhury

Linear and multi-linear models of object shape/appearance (PCA, 3DMM, AAM/ASM, multilinear tensors) have been very popular in computer vision. In this paper, we analyze the validity of these models from the fundamental physical laws of object motion and image formation. We rigorously prove that the image appearance space can be closely approximated as locally multilinear, with the illumination subspace being bilinearly combined with the direct sum of the motion, deformation and texture subspaces. This result allows us to understand theoretically many of the successes and limitations of the linear and multi-linear approaches existing in the computer vision literature, and also identifies some of the conditions under which they are valid. It provides an analytical representation of the image space in terms of different physical factors that affect the image formation process. We also experimentally analyze the accuracy of the theoretical models and demonstrate tracking on real data using the analytically derived basis functions of this space.

Collaboration


Dive into Yilei Xu's collaborations.

Top Co-Authors

Keyur Patel, University of California

Bir Bhanu, University of California

Jiangang Yu, University of California

Long Nguyen, University of California