Yuxiao Hu
University of Illinois at Urbana–Champaign
Publications
Featured research published by Yuxiao Hu.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005
Xiaofei He; Shuicheng Yan; Yuxiao Hu; Partha Niyogi; Hong-Jiang Zhang
We propose an appearance-based face recognition method called the Laplacianface approach. Using locality preserving projections (LPP), face images are mapped into a face subspace for analysis. Unlike principal component analysis (PCA) and linear discriminant analysis (LDA), which effectively see only the Euclidean structure of the face space, LPP finds an embedding that preserves local information and obtains a face subspace that best detects the essential face manifold structure. The Laplacianfaces are the optimal linear approximations to the eigenfunctions of the Laplace–Beltrami operator on the face manifold. In this way, unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. Theoretical analysis shows that PCA, LDA, and LPP can be obtained from different graph models. We compare the proposed Laplacianface approach with the Eigenface and Fisherface methods on three different face data sets. Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
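The core of the approach is a generalized eigenproblem built from a neighborhood graph over the training faces. Below is a minimal sketch of LPP in Python, not the authors' code: the k-NN graph with heat-kernel weights and the small ridge term are common choices, and the paper additionally applies a PCA projection first to keep the problem well conditioned.

```python
# A minimal sketch of Locality Preserving Projections (LPP), the core of
# the Laplacianface approach. Graph construction choices are illustrative.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, n_components=10, k=5, t=1.0):
    """X: (n_samples, n_features) faces flattened to vectors."""
    n = X.shape[0]
    dist = cdist(X, X, "sqeuclidean")
    # k-nearest-neighbour adjacency with heat-kernel weights.
    W = np.zeros((n, n))
    nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]   # skip self at column 0
    for i in range(n):
        W[i, nbrs[i]] = np.exp(-dist[i, nbrs[i]] / t)
    W = np.maximum(W, W.T)                        # symmetrise the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # graph Laplacian
    # Generalised eigenproblem X^T L X a = lambda X^T D X a; the
    # eigenvectors with the smallest eigenvalues preserve locality.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])   # ridge for stability
    _, V = eigh(A, B)
    return V[:, :n_components]                    # the Laplacianfaces
```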
Computer Vision and Pattern Recognition | 2007
Deng Cai; Xiaofei He; Yuxiao Hu; Jiawei Han; Thomas S. Huang
Subspace learning based face recognition methods have attracted considerable interest in recent years, including principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projection (LPP), neighborhood preserving embedding (NPE), marginal Fisher analysis (MFA) and local discriminant embedding (LDE). These methods consider an n1 × n2 image as a vector in R^(n1 × n2) and treat the pixels of each image as independent. However, an image represented in the plane is intrinsically a matrix, and pixels spatially close to each other may be correlated. Even though we have n1 × n2 pixels per image, this spatial correlation suggests that the real number of degrees of freedom is far less. In this paper, we introduce a regularized subspace learning model using a Laplacian penalty to constrain the coefficients to be spatially smooth. All these existing subspace learning algorithms fit into this model and produce a spatially smooth subspace which is better for image representation than their original versions. Recognition, clustering and retrieval can then be performed in the image subspace. Experimental results on face recognition demonstrate the effectiveness of our method.
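To make the penalty concrete, the sketch below builds the discretized image Laplacian as a Kronecker sum and forms the smoothness regularizer. It is an illustration of the kind of Laplacian penalty the paper describes, not the authors' code, and how it combines with a particular subspace method is stated only in outline.

```python
# Discretised 2D Laplacian penalty over an n1 x n2 pixel grid; a^T J a
# measures how rough a basis vector a is when viewed as an image.
import numpy as np

def second_difference(n):
    """1D second-order difference operator (discrete Laplacian)."""
    return -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

def image_laplacian(n1, n2):
    """2D Laplacian over an n1 x n2 grid, row-major flattening."""
    return (np.kron(np.eye(n1), second_difference(n2))
            + np.kron(second_difference(n1), np.eye(n2)))

def smoothness_penalty(n1, n2):
    """Penalty matrix J with a^T J a = ||Delta a||^2 for an image a."""
    Delta = image_laplacian(n1, n2)
    return Delta.T @ Delta

# In outline, any of the subspace methods above can be regularised by
# adding alpha * J to the constraint-side matrix of its (generalised)
# eigenproblem, which drives the basis images to be spatially smooth.
```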
Pattern Recognition | 2005
Dalong Jiang; Yuxiao Hu; Shuicheng Yan; Lei Zhang; Hong-Jiang Zhang; Wen Gao
Face recognition under varying pose, illumination and expression (PIE) is a challenging problem. In this paper, we propose an analysis-by-synthesis framework for face recognition with variant PIE. First, an efficient two-dimensional (2D)-to-three-dimensional (3D) integrated face reconstruction approach is introduced to reconstruct a personalized 3D face model from a single frontal face image with neutral expression and normal illumination. Then, realistic virtual faces with different PIE are synthesized based on the personalized 3D face to characterize the face subspace. Finally, face recognition is conducted based on these representative virtual faces. Compared with other related work, this framework has the following advantages: (1) only a single frontal face is required for face recognition, which avoids burdensome enrollment work; (2) the synthesized face samples make it possible to conduct recognition under difficult conditions such as complex PIE; and (3) compared with other 3D reconstruction approaches, our proposed 2D-to-3D integrated face reconstruction approach is fully automatic and more efficient. Extensive experimental results show that the synthesized virtual faces significantly improve the accuracy of face recognition under varying PIE.
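Only the final step of this pipeline fits a short example: once virtual faces have been synthesized per enrolled subject, recognition reduces to a nearest-match search over the virtual gallery. The sketch below assumes precomputed feature vectors and uses cosine similarity as an illustrative matcher; the 3D reconstruction and rendering stages are out of scope here.

```python
# Nearest-match recognition over a gallery of synthesized virtual faces.
# Cosine similarity is an illustrative choice, not the paper's matcher.
import numpy as np

def recognize(probe, gallery):
    """probe: (d,) feature vector; gallery: {subject: (n_virtual, d)
    array of features of that subject's synthesized virtual faces}."""
    def cos_sim(a, B):
        return (B @ a) / (np.linalg.norm(B, axis=1)
                          * np.linalg.norm(a) + 1e-12)
    # Score each subject by its best-matching virtual face.
    scores = {s: cos_sim(probe, feats).max() for s, feats in gallery.items()}
    return max(scores, key=scores.get)
```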
International Conference on Computer Vision | 2003
Xiaofei He; Shuicheng Yan; Yuxiao Hu; Hong-Jiang Zhang
We demonstrate that face recognition performance can be improved significantly in low-dimensional linear subspaces. Conventionally, principal component analysis (PCA) and linear discriminant analysis (LDA) are considered effective for deriving such a face subspace. However, both effectively see only the Euclidean structure of the face space. We propose a new approach that maps face images into a subspace obtained by locality preserving projections (LPP) for face analysis, which we call the Laplacianface approach. Unlike PCA and LDA, LPP finds an embedding that preserves local information and obtains a face space that best detects the essential manifold structure. In this way, unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. We compare the proposed Laplacianface approach with the Eigenface and Fisherface methods on three test data sets. Experimental results show that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
IEEE International Conference on Automatic Face and Gesture Recognition | 2004
Yuxiao Hu; Dalong Jiang; Shuicheng Yan; Lei Zhang; Hong-Jiang Zhang
An analysis-by-synthesis framework for face recognition under varying pose, illumination and expression (PIE) is proposed in this paper. First, an efficient 2D-to-3D integrated face reconstruction approach is introduced to reconstruct a personalized 3D face model from a single frontal face image with neutral expression and normal illumination. Then, realistic virtual faces with different PIE are synthesized based on the personalized 3D face to characterize the face subspace. Finally, face recognition is conducted based on these representative virtual faces. Compared with other related work, this framework has the following advantages: 1) only a single frontal face is required for face recognition, which avoids burdensome enrollment work; 2) the synthesized face samples make it possible to conduct recognition under difficult conditions such as complex PIE; and 3) the proposed 2D-to-3D integrated face reconstruction approach is fully automatic and more efficient. Extensive experimental results show that the synthesized virtual faces significantly improve the accuracy of face recognition under varying PIE.
European Conference on Computer Vision | 2016
Yandong Guo; Lei Zhang; Yuxiao Hu; Xiaodong He; Jianfeng Gao
In this paper, we design a benchmark task, and provide the associated datasets, for recognizing face images and linking them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, using all face images of each individual that can be collected on the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve recognition accuracy, and contributes to various real-world applications such as image captioning and news video analysis. Associated with this task, we design and provide a concrete measurement set, an evaluation protocol, and training data. We also present our experimental setup in detail and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer vision. To the best of our knowledge, our training dataset, which contains 10M images in version 1, is the largest publicly available one in the world.
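As a rough illustration of how such a benchmark can be scored, the sketch below computes precision over the most confident fraction of answers. The actual protocol relates precision and coverage (systems may abstain on hard queries); this variant fixes the answered fraction and measures precision over it. The entity keys and confidences below are made-up examples.

```python
# Toy precision-at-coverage scoring for a recognition benchmark where
# predictions link face images to knowledge-base entity keys.
def precision_at_coverage(predictions, labels, coverage=0.95):
    """predictions: list of (entity_key, confidence) per query image;
    labels: ground-truth entity keys. A system may answer only its most
    confident queries; precision is computed over that fraction."""
    ranked = sorted(zip(predictions, labels),
                    key=lambda pair: pair[0][1], reverse=True)
    n_answered = max(1, round(coverage * len(ranked)))
    answered = ranked[:n_answered]
    correct = sum(key == truth for (key, _), truth in answered)
    return correct / n_answered

preds = [("m.0123x", 0.99), ("m.0456y", 0.42), ("m.0123x", 0.87)]
truth = ["m.0123x", "m.0789z", "m.0123x"]
print(precision_at_coverage(preds, truth, coverage=0.66))  # -> 1.0
```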
IEEE Transactions on Image Processing | 2011
Xu Zhao; Kai-Hsiang Lin; Yun Fu; Yuxiao Hu; Yuncai Liu; Thomas S. Huang
Detecting text and captions in videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. In this paper, we present a corner-based approach to detecting text and captions in videos. This approach is inspired by the observation that corner points occur densely and in an orderly fashion in characters, especially in text and captions. We use several discriminative features to describe the text regions formed by the corner points. These features can be used flexibly and thus adapted to different applications. Language independence is an important advantage of the proposed method. Moreover, based upon the text features, we further develop a novel algorithm to detect moving captions in videos. In this algorithm, motion features extracted by optical flow are combined with the text features to detect moving caption patterns, and a decision tree is adopted to learn the classification criteria. Experiments conducted on a large volume of real video shots demonstrate the efficiency and robustness of our proposed approaches and the real-world system. Our text and caption detection system was recently highlighted in a worldwide multimedia retrieval competition, Star Challenge, achieving superior performance with a top ranking.
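A minimal sketch of the underlying observation, assuming OpenCV: corner points are counted per block of a grayscale frame, and corner-dense blocks are grouped into candidate text regions. The detector choice, block size, and threshold are illustrative; the paper's discriminative features and decision-tree classifier are not reproduced here.

```python
# Candidate text/caption regions from corner-point density.
import cv2
import numpy as np

def text_region_candidates(gray, block=16, min_corners=6):
    """Return (x, y, w, h) boxes of corner-dense regions in pixels."""
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=3000,
                                      qualityLevel=0.01, minDistance=3)
    h, w = gray.shape
    grid = np.zeros((h // block + 1, w // block + 1), np.int32)
    if corners is not None:
        for x, y in corners.reshape(-1, 2).astype(int):
            grid[y // block, x // block] += 1
    # Blocks with many corners form a binary map whose connected
    # components are the candidate text regions.
    dense = (grid >= min_corners).astype(np.uint8)
    _, _, stats, _ = cv2.connectedComponentsWithStats(dense)
    return [(x * block, y * block, bw * block, bh * block)
            for x, y, bw, bh, _ in stats[1:]]      # row 0 is background
```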
ACM Multimedia | 2004
Lei Zhang; Yuxiao Hu; Mingjing Li; Wei-Ying Ma; Hong-Jiang Zhang
In this paper, we propose and investigate a new user scenario for face annotation, in which users are allowed to multi-select a group of photographs and assign names to them. The system then attempts to propagate names from the photograph level to the face level, i.e., to infer the correspondence between names and faces. Given a face similarity measure that combines methodologies from face recognition and content-based image retrieval, we formulate name propagation as an optimization problem. We define the objective function as the sum of similarities between each pair of faces of the same individual in different photographs, and propose an iterative optimization algorithm to infer the optimal correspondence. To make the propagation results reliable, a rejection scheme is adopted to discard those with low confidence scores. Furthermore, we investigate the combination and alternation of a browsing mode for propagation and a viewer mode for annotation, so that each mode can benefit from additional inputs from the other. Experimental evaluation has been conducted on a typical family album of over one thousand photographs, and the results show that the proposed approach is effective and efficient for automated face annotation in family albums.
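The optimization can be pictured as follows: holding all other photographs fixed, re-match one photograph's names to its faces so that faces sharing a name across photographs are maximally similar, and sweep until stable. The sketch below, with a hypothetical `sim` function and the simplifying assumption that each photograph has at least as many faces as names, omits the paper's rejection scheme and is only meant to make the objective concrete.

```python
# Iterative name propagation: coordinate-ascent over per-photo
# name-to-face matchings, maximising same-name cross-photo similarity.
import itertools

def propagate_names(photos, sim, n_sweeps=10):
    """photos: list of (faces, names) pairs, where faces is a list of
    feature vectors and names the photo-level name list."""
    # Arbitrary start: i-th name assigned to i-th face in each photo.
    assign = [dict(zip(range(len(names)), names)) for _, names in photos]
    for _ in range(n_sweeps):
        for p, (faces, names) in enumerate(photos):
            best, best_score = assign[p], float("-inf")
            for perm in itertools.permutations(range(len(faces)),
                                               len(names)):
                # Objective: similarity between same-name face pairs
                # in different photographs.
                score = sum(sim(faces[i], photos[q][0][j])
                            for i, name in zip(perm, names)
                            for q in range(len(photos)) if q != p
                            for j, other in assign[q].items()
                            if other == name)
                if score > best_score:
                    best, best_score = dict(zip(perm, names)), score
            assign[p] = best
    return assign
```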
IEEE International Conference on Automatic Face & Gesture Recognition | 2008
Yuxiao Hu; Zhihong Zeng; Lijun Yin; Xiaozhou Wei; Xi Zhou; Thomas S. Huang
The ability to handle multi-view facial expressions is important for computers to understand affective behavior in less constrained environments. However, most existing methods for facial expression recognition are based on near-frontal face data and are likely to fail in non-frontal facial expression analysis. In this paper, we investigate the analysis of multi-view facial expressions. Three local patch descriptors (HoG, LBP, and SIFT) are used to extract facial features, which are the inputs to a nearest-neighbor indexing method that identifies facial expressions. We also investigate the influence of dimensionality reduction (PCA, LDA, and LPP) and classifier fusion on recognition performance. We test our approaches on multi-view data generated from the BU-3DFE 3D facial expression database, which includes 100 subjects with 6 emotions at 4 intensity levels. Our extensive person-independent experiments suggest that the SIFT descriptor outperforms HoG and LBP, and that LPP outperforms PCA and LDA in this application, but that classifier fusion shows no significant advantage over the SIFT-only classifier.
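The recognition stage has a familiar shape: patch descriptors, then dimensionality reduction, then nearest-neighbor labelling. The sketch below uses scikit-learn's PCA and a 1-NN classifier purely to show that shape; the paper's strongest combination (SIFT descriptors with LPP) is not reproduced here.

```python
# A stand-in pipeline for the recognition stage only:
# descriptors -> dimensionality reduction -> nearest-neighbour labelling.
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def expression_classifier(train_feats, train_labels, n_components=50):
    """train_feats: (n_samples, n_dims) descriptors, one row per face
    image; train_labels: one of the six basic emotions per row."""
    clf = make_pipeline(PCA(n_components=n_components),
                        KNeighborsClassifier(n_neighbors=1))
    clf.fit(train_feats, train_labels)
    return clf   # clf.predict(test_feats) labels unseen views
```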
IEEE International Conference on Automatic Face and Gesture Recognition | 2004
Yuxiao Hu; Longbin Chen; Yi Zhou; Hong-Jiang Zhang
A robust pose estimation approach is proposed that combines facial appearance asymmetry and 3D geometry in a coarse-to-fine framework. The rough face pose is first estimated by analyzing the asymmetry of the distribution of facial component detection confidences over the image, which reflects an intrinsic relation between face pose and facial appearance. This rough pose, together with its error bandwidth, is then used in a 3D-to-2D geometric model matching step to refine the pose estimate. The proposed approach is able to track a face moving quickly in front of a cluttered background and recover its pose robustly, accurately, and in real time. Experimental results are provided to demonstrate its efficiency and accuracy.
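The refinement stage can be approximated with a standard 3D-to-2D fit: a generic 3D landmark model is matched to detected 2D landmarks, seeded by the coarse pose. The sketch below uses OpenCV's solvePnP with an illustrative generic head model and crude camera intrinsics; it is a stand-in for the paper's geometric matching with error bandwidth, not a reproduction of it.

```python
# Coarse-to-fine pose refinement via 3D-to-2D landmark fitting.
import cv2
import numpy as np

# Generic 3D landmark positions (arbitrary units): nose tip, chin,
# outer eye corners, mouth corners. Values are illustrative.
MODEL_3D = np.array([[0.0, 0.0, 0.0],
                     [0.0, -330.0, -65.0],
                     [-225.0, 170.0, -135.0],
                     [225.0, 170.0, -135.0],
                     [-150.0, -150.0, -125.0],
                     [150.0, -150.0, -125.0]])

def refine_pose(landmarks_2d, rough_rvec, image_size):
    """landmarks_2d: (6, 2) detected points matching MODEL_3D;
    rough_rvec: (3,) coarse rotation vector used as the initial guess."""
    h, w = image_size
    K = np.array([[w, 0, w / 2],     # crude intrinsics: focal ~ width
                  [0, w, h / 2],
                  [0, 0, 1]], dtype=float)
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_3D, landmarks_2d.astype(float), K, None,
        rvec=rough_rvec.reshape(3, 1).astype(float),
        tvec=np.array([[0.0], [0.0], [1000.0]]),
        useExtrinsicGuess=True, flags=cv2.SOLVEPNP_ITERATIVE)
    return rvec, tvec                # refined pose (rotation, translation)
```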