Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Songfan Yang is active.

Publication


Featured research published by Songfan Yang.


IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) | 2012

Understanding Discrete Facial Expressions in Video Using an Emotion Avatar Image

Songfan Yang; Bir Bhanu

Existing video-based facial expression recognition techniques analyze geometry-based and appearance-based information in every frame and explore the temporal relations among frames. In contrast, we present a new image-based representation, the emotion avatar image (EAI), together with an associated reference image, the avatar reference. The representation leverages out-of-plane head rotation; it is not only robust to outliers but also provides a way to aggregate dynamic information from expressions of varying length. The approach to facial expression analysis consists of the following steps: 1) face detection; 2) face registration of video frames with the avatar reference to form the EAI representation; 3) computation of features from EAIs using both local binary patterns and local phase quantization; and 4) classification of the features as one of the emotion types using a linear support vector machine classifier. Our system is tested on the Facial Expression Recognition and Analysis Challenge (FERA2011) data, i.e., the Geneva Multimodal Emotion Portrayal-Facial Expression Recognition and Analysis Challenge (GEMEP-FERA) dataset. The experimental results demonstrate that the information captured in an EAI for a facial expression is a very strong cue for emotion inference. Moreover, our method suppresses person-specific information and performs well on unseen data.
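
As a rough illustration of steps 3) and 4) of this pipeline, the sketch below computes uniform-LBP histograms from already registered images and classifies them with a linear SVM using scikit-image and scikit-learn. Face detection, registration, and the LPQ descriptor are omitted, and the images and labels are synthetic stand-ins, so this is a minimal sketch rather than the paper's implementation.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

def lbp_histogram(img, P=8, R=1):
    """Uniform LBP codes pooled into a normalized histogram."""
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2))
    return hist / hist.sum()

rng = np.random.default_rng(0)
eais = rng.random((40, 64, 64))         # stand-ins for registered EAI images
labels = rng.integers(0, 5, size=40)    # five toy emotion classes

X = np.array([lbp_histogram(im) for im in eais])
clf = LinearSVC(C=1.0).fit(X, labels)   # step 4: linear SVM classification
print("predicted emotion:", clf.predict(X[:1]))
```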


IEEE International Conference on Automatic Face and Gesture Recognition (FG) | 2011

Facial expression recognition using emotion avatar image

Songfan Yang; Bir Bhanu

Existing facial expression recognition techniques analyze the spatial and temporal information of every single frame in a human emotion video. In contrast, we create the Emotion Avatar Image (EAI) as a single, representative image for each video or image sequence for emotion recognition. In this paper, we adopt the recently introduced SIFT flow algorithm to register every frame with respect to an avatar reference face model. An iterative algorithm is then used not only to super-resolve the EAI representation for each video and the avatar reference, but also to improve recognition performance. Subsequently, we extract features from the EAIs using both Local Binary Patterns (LBP) and Local Phase Quantization (LPQ). The results from both texture descriptors are tested on the Facial Expression Recognition and Analysis Challenge (FERA2011) data, i.e., the GEMEP-FERA dataset. To evaluate this simple yet powerful idea, we train our algorithm using only the 155 given training videos from the GEMEP-FERA dataset. The results show that our algorithm eliminates person-specific information and performs well on unseen data.
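
The iterative refinement can be sketched as below; `register` is a hypothetical placeholder for SIFT-flow alignment (an identity map here), since the actual warping is beyond a short example. Each video's registered frames are averaged into an EAI, and the avatar reference is then re-estimated from the EAIs.

```python
import numpy as np

def register(frame, reference):
    # Placeholder for SIFT-flow alignment (an assumption, not the
    # paper's code): a real version would warp `frame` toward `reference`.
    return frame

def build_eais(videos, n_iters=3):
    reference = np.mean([f for v in videos for f in v], axis=0)
    for _ in range(n_iters):
        eais = [np.mean([register(f, reference) for f in v], axis=0)
                for v in videos]           # one EAI per video
        reference = np.mean(eais, axis=0)  # refined avatar reference
    return eais, reference

videos = [np.random.rand(10, 64, 64) for _ in range(5)]  # five toy clips
eais, avatar = build_eais(videos)
```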


IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) | 2013

Reference-based person re-identification

Le An; Mehran Kafai; Songfan Yang; Bir Bhanu

Person re-identification refers to recognizing people across non-overlapping cameras at different times and locations. Due to variations in pose, illumination, background, and occlusion, person re-identification is inherently difficult. In this paper, we propose a reference-based method for across-camera person re-identification. During training, we learn a subspace in which the correlations of the reference data from different cameras are maximized using Regularized Canonical Correlation Analysis (RCCA). For re-identification, the gallery data and the probe data are projected into the RCCA subspace, and the reference descriptors (RDs) of the gallery and probe are constructed by measuring their similarity to the reference data. The identity of the probe is determined by comparing the RD of the probe with the RDs of the gallery. Experiments on a benchmark dataset show that the proposed method outperforms state-of-the-art approaches.
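
A minimal numpy/scipy sketch of the RCCA step under its standard formulation (ridge-regularized covariances, whitening, then an SVD); the feature dimensions, regularizer `lam`, and subspace size `d` are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.linalg import inv, sqrtm, svd

def rcca(X, Y, lam=1e-2, d=10):
    """Projections Wx, Wy maximizing correlation between paired rows
    of X (camera A) and Y (camera B), with ridge-regularized covariances."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = len(X)
    Cxx = Xc.T @ Xc / n + lam * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + lam * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Wx_white = inv(sqrtm(Cxx).real)         # whitening transforms
    Wy_white = inv(sqrtm(Cyy).real)
    U, s, Vt = svd(Wx_white @ Cxy @ Wy_white)
    return Wx_white @ U[:, :d], Wy_white @ Vt[:d].T

rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 10))          # shared latent signal
X = Z @ rng.standard_normal((10, 40))       # camera-A reference features
Y = Z @ rng.standard_normal((10, 30))       # camera-B reference features
Wx, Wy = rcca(X, Y, d=5)                    # project new data via X @ Wx, Y @ Wy
```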


IEEE Transactions on Circuits and Systems for Video Technology | 2016

Person Reidentification With Reference Descriptor

Le An; Mehran Kafai; Songfan Yang; Bir Bhanu

Person identification across nonoverlapping cameras, also known as person reidentification, aims to match people at different times and locations. Reidentifying people is of great importance in crucial applications such as wide-area surveillance and visual tracking. Due to the appearance variations in pose, illumination, and occlusion in different camera views, person reidentification is inherently difficult. To address these challenges, a reference-based method is proposed for person reidentification across different cameras. Instead of directly matching people by their appearance, the matching is conducted in a reference space where the descriptor for a person is translated from the original color or texture descriptors to similarity measures between this person and the exemplars in the reference set. A subspace is first learned in which the correlations of the reference data from different cameras are maximized using regularized canonical correlation analysis (RCCA). For reidentification, the gallery data and the probe data are projected onto this RCCA subspace and the reference descriptors (RDs) of the gallery and probe are generated by computing the similarity between them and the reference data. The identity of a probe is determined by comparing the RD of the probe and the RDs of the gallery. A reranking step is added to further improve the results using a saliency-based matching scheme. Experiments on publicly available datasets show that the proposed method outperforms most of the state-of-the-art approaches.
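
Building on an RCCA projection such as the sketch above, the reference-descriptor step can be illustrated as follows: each projected probe or gallery sample is re-described by its similarities to the projected reference set, and matching compares these RDs. Cosine similarity and all sizes are assumptions; the saliency-based reranking step is omitted.

```python
import numpy as np

def cosine_sim(A, B):
    A = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    B = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
    return A @ B.T

rng = np.random.default_rng(1)
ref_p = rng.standard_normal((30, 10))      # reference set in the RCCA subspace
gallery_p = rng.standard_normal((50, 10))  # projected gallery
probe_p = rng.standard_normal((5, 10))     # projected probes

rd_gallery = cosine_sim(gallery_p, ref_p)  # one RD row per gallery subject
rd_probe = cosine_sim(probe_p, ref_p)      # one RD row per probe
ranking = np.argsort(-cosine_sim(rd_probe, rd_gallery), axis=1)
print(ranking[:, :5])                      # top-5 gallery matches per probe
```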


IEEE Signal Processing Letters | 2015

Person Re-Identification by Robust Canonical Correlation Analysis

Le An; Songfan Yang; Bir Bhanu

Person re-identification is the task of matching people across surveillance cameras at different times and locations. Due to significant view and pose changes across non-overlapping cameras, directly matching data from different views is challenging. In this letter, we propose a robust canonical correlation analysis (ROCCA) to match people from different views in a coherent subspace. Given a small training set, as in most re-identification problems, direct application of canonical correlation analysis (CCA) may lead to poor performance due to inaccuracy in estimating the data covariance matrices. The proposed ROCCA, with shrinkage estimation and a smoothing technique, is simple to implement and can robustly estimate the data covariance matrices from limited training samples. Experimental results on two publicly available datasets show that the proposed ROCCA outperforms regularized CCA (RCCA) and achieves state-of-the-art matching results for person re-identification compared with the most recent methods.
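
The shrinkage idea can be illustrated with the classic blend of the sample covariance and a scaled identity; the exact estimator and smoothing used in ROCCA may differ, so treat this as a generic sketch.

```python
import numpy as np

def shrunk_covariance(X, alpha=0.1):
    """Blend the sample covariance with a scaled identity so it stays
    well conditioned when samples are scarce (n << d)."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(X)                 # sample covariance
    mu = np.trace(C) / C.shape[0]          # average eigenvalue
    return (1 - alpha) * C + alpha * mu * np.eye(C.shape[0])

X = np.random.default_rng(2).standard_normal((20, 100))  # 20 samples, 100 dims
C = shrunk_covariance(X)
print(np.linalg.cond(C))  # finite, unlike the rank-deficient sample covariance
```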


Neurocomputing | 2015

Efficient smile detection by Extreme Learning Machine

Le An; Songfan Yang; Bir Bhanu

Smile detection is a specialized task in facial expression analysis with applications such as photo selection, user experience analysis, and patient monitoring. As one of the most important and informative expressions, a smile conveys underlying emotional states such as joy, happiness, and satisfaction. In this paper, an efficient smile detection approach is proposed based on the Extreme Learning Machine (ELM). Faces are first detected, and a holistic flow-based face registration is applied that requires neither manual labeling nor key point detection. ELM is then used to train the classifier. The proposed smile detector is tested with different feature descriptors on publicly available databases, including real-world face images. Comparisons against benchmark classifiers, including the Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA), suggest that the proposed ELM-based smile detector generally performs better and is very efficient. Compared to state-of-the-art smile detectors, the proposed method achieves competitive results without preprocessing or manual registration.
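
A compact sketch of an ELM classifier as the abstract describes it: random hidden-layer weights stay fixed, and only the output weights are solved in closed form (a ridge solution here). The hidden size, activation, and regularizer are assumptions, and the smile labels are synthetic.

```python
import numpy as np

class ELM:
    """Single-hidden-layer ELM: random input weights, closed-form output weights."""
    def __init__(self, n_hidden=200, reg=1e-3, seed=0):
        self.n_hidden, self.reg, self.seed = n_hidden, reg, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)           # fixed random hidden layer
        T = np.eye(int(y.max()) + 1)[y]            # one-hot targets
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)    # ridge solution for output weights
        return self

    def predict(self, X):
        return np.argmax(np.tanh(X @ self.W + self.b) @ self.beta, axis=1)

X = np.random.default_rng(1).standard_normal((100, 59))  # e.g. LBP-style features
y = (X[:, 0] > 0).astype(int)                            # toy smile / non-smile labels
print("train accuracy:", (ELM().fit(X, y).predict(X) == y).mean())
```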


International Conference on Distributed Smart Cameras (ICDSC) | 2013

Improving person re-identification by soft biometrics based reranking

Le An; Xiaojing Chen; Mehran Kafai; Songfan Yang; Bir Bhanu

The problem of person re-identification is to recognize a target subject across non-overlapping distributed cameras at different times and locations. Applications of person re-identification include security, surveillance, and multi-camera tracking. In a real-world scenario, person re-identification is challenging due to dramatic changes in a subject's appearance in terms of pose, illumination, background, and occlusion. Existing approaches either design robust features to identify a subject across different views or learn distance metrics that maximize the similarity between different views of the same person and minimize the similarity between views of different persons. In this paper, we aim to improve re-identification performance by reranking the returned results based on soft biometric attributes, such as gender, which describe probe and gallery subjects at a higher level. During reranking, the soft biometric attributes are detected, and attribute-based distance scores are calculated between pairs of images using a regression model. These distance scores are then used to rerank the initially returned matches. Experiments on a benchmark database with different baseline re-identification methods show that reranking improves recognition accuracy by moving up returned gallery matches that share the same soft biometric attributes as the probe subject.
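
The reranking step can be sketched as a simple score fusion: an attribute-based distance (e.g., from a gender regression model) is blended with the baseline appearance distance, and the gallery is re-sorted. The fusion weight `w` and the toy distances are assumptions.

```python
import numpy as np

def rerank(appearance_dist, attribute_dist, w=0.3):
    """Fuse two (n_probes, n_gallery) distance matrices and re-sort the gallery."""
    fused = (1 - w) * appearance_dist + w * attribute_dist
    return np.argsort(fused, axis=1)        # gallery indices, best match first

rng = np.random.default_rng(3)
d_app = rng.random((4, 20))                 # distances from a baseline re-id method
d_attr = rng.random((4, 20))                # e.g. |predicted gender score difference|
print(rerank(d_app, d_attr)[:, :5])         # top-5 gallery matches after reranking
```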


International Conference on Affective Computing and Intelligent Interaction (ACII) | 2011

A psychologically-inspired match-score fusion model for video-based facial expression recognition

Albert C. Cruz; Bir Bhanu; Songfan Yang

Communication between humans is complex and is not limited to verbal signals; emotions are also conveyed through gesture, pose, and facial expression. Facial Emotion Recognition and Analysis (FERA), the set of techniques by which non-verbal communication is quantified, is an exemplar case where humans consistently outperform computer methods. While the field of FERA has seen many advances, no system has been proposed that scales well to very large datasets. The challenge for computer vision is how to automatically and non-heuristically downsample the data while retaining maximum representational power and without sacrificing accuracy. In this paper, we propose a method inspired by human vision and attention theory [2]. Video is segmented into temporal partitions with a dynamic sampling rate based on the frequency of visual information. Regions are homogenized by a match-score fusion technique. The approach is shown to provide classification rates higher than the baseline on the AVEC 2011 video-subchallenge dataset [15].
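
A toy sketch of the attention-style sampling: frames are drawn with probability proportional to frame-to-frame change, and per-frame match scores within a partition are fused (a mean rule here). Both the change measure and the fusion rule are assumptions, not the paper's exact procedure.

```python
import numpy as np

def dynamic_sample(video, budget=10, seed=0):
    """Pick frames with probability proportional to frame-to-frame change."""
    diffs = np.abs(np.diff(video, axis=0)).mean(axis=(1, 2))
    idx = np.random.default_rng(seed).choice(
        len(diffs), size=budget, replace=False, p=diffs / diffs.sum())
    return video[np.sort(idx) + 1]

def fuse_scores(per_frame_scores):
    """Match-score fusion over a temporal partition (mean rule)."""
    return per_frame_scores.mean(axis=0)

video = np.random.rand(100, 48, 48)          # toy emotion clip
frames = dynamic_sample(video)               # denser where visual change is high
emotion = np.argmax(fuse_scores(np.random.rand(10, 5)))  # fuse toy per-frame scores
```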


Information Sciences | 2016

Sparse representation matching for person re-identification

Le An; Xiaojing Chen; Songfan Yang; Bir Bhanu

The need to recognize people across distributed surveillance cameras has led to growing research interest in person re-identification. Person re-identification aims at matching people in non-overlapping cameras at different times and locations. It is a difficult pattern matching task due to significant appearance variations in pose, illumination, or occlusion across camera views. To address this multi-view matching problem, we first learn a subspace using canonical correlation analysis (CCA) in which the goal is to maximize the correlation between data from different cameras that correspond to the same people. Given a probe from one camera view, we represent it using a sparse representation over a jointly learned coupled dictionary in the CCA subspace. The ℓ1-induced sparse representation is regularized by an ℓ2 regularization term. The introduction of ℓ2 regularization allows learning a sparse representation while maintaining the stability of the sparse coefficients. To compute the matching scores between a probe and the gallery, their ℓ2-regularized sparse representations are matched using a modified cosine similarity measure. Experimental results with extensive comparisons on challenging datasets demonstrate that the proposed method outperforms state-of-the-art methods, and that the ℓ1 + ℓ2 regularized sparse representation is more accurate than using a single ℓ1 or ℓ2 regularization term.
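
The ℓ1 + ℓ2 coding step can be sketched with scikit-learn's ElasticNet, which combines exactly these two penalties: a probe (assumed already projected into the CCA subspace) is coded over a dictionary, and matching uses cosine similarity of the codes. The dictionary, sizes, and penalty weights are illustrative, and the paper's modified cosine measure is approximated by the plain one.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def code(x, D, alpha=0.05, l1_ratio=0.5):
    """l1+l2 (elastic net) coefficients of sample x over dictionary D."""
    en = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                    fit_intercept=False, max_iter=5000)
    return en.fit(D, x).coef_

rng = np.random.default_rng(4)
D = rng.standard_normal((30, 80))            # dictionary: 80 atoms in a 30-dim subspace
probe = rng.standard_normal(30)              # probe, assumed CCA-projected
gallery = rng.standard_normal((10, 30))      # gallery, assumed CCA-projected

c_p = code(probe, D)
c_g = np.array([code(g, D) for g in gallery])
sims = c_g @ c_p / (np.linalg.norm(c_g, axis=1) * np.linalg.norm(c_p) + 1e-12)
print(np.argsort(-sims))                     # gallery ranked by code similarity
```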


Neurocomputing | 2016

Person re-identification via hypergraph-based matching

Le An; Xiaojing Chen; Songfan Yang

Person re-identification, which aims to match people across non-overlapping cameras, has become an important research topic due to increasing demand in applications such as video surveillance and security monitoring. Matching people across different cameras is challenging since the appearance of the same subject may change dramatically between views due to variations in pose, lighting conditions, etc. To reduce the feature discrepancy caused by view change, most existing methods focus either on robust feature extraction or on view-invariant feature transformation. During matching, a subject to be identified, i.e., a probe, is compared with each subject in the gallery with known identities, and a ranked list is generated based on the similarity scores. However, such a matching process considers only the pairwise similarity between the probe and a gallery subject, while higher-order relationships between the probe and the gallery, or even among the gallery subjects, are ignored. To address this issue, we propose a hypergraph-based matching scheme in which both pairwise and higher-order relationships among the probe and gallery subjects are discovered through hypergraph learning. In this way, improved similarity scores are obtained compared to the conventional pairwise similarity measure. We conduct experiments on two widely used person re-identification datasets, and the results demonstrate that matching through hypergraph learning outperforms the state of the art. Furthermore, the proposed approach can easily be incorporated into any existing approach in which similarities between probe and gallery are computed.
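
A sketch of Zhou-style hypergraph learning consistent with the description above: each subject spawns a hyperedge containing its k nearest neighbours, and a probe indicator vector is propagated over the normalized hypergraph to obtain higher-order similarity scores. The neighbourhood size k, the propagation weight gamma, and the features are assumptions.

```python
import numpy as np

def hypergraph_scores(feats, probe_idx, k=3, gamma=0.5):
    n = len(feats)
    dist = np.linalg.norm(feats[:, None] - feats[None], axis=2)
    H = np.zeros((n, n))                     # incidence: one hyperedge per vertex
    for v in range(n):
        H[np.argsort(dist[v])[:k + 1], v] = 1.0
    Dv_is = np.diag(1.0 / np.sqrt(H.sum(1))) # inverse sqrt of vertex degrees
    De_inv = np.diag(1.0 / H.sum(0))         # inverse edge degrees
    Theta = Dv_is @ H @ De_inv @ H.T @ Dv_is # normalized hypergraph adjacency
    y = np.zeros(n); y[probe_idx] = 1.0      # indicator vector of the probe
    f = np.linalg.solve(np.eye(n) - gamma * Theta, (1 - gamma) * y)
    return f                                 # higher-order similarity scores

feats = np.random.default_rng(5).standard_normal((12, 8))
scores = hypergraph_scores(feats, probe_idx=0)
print(np.argsort(-scores)[1:6])              # top matches, probe itself excluded
```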

Collaboration


Dive into Songfan Yang's collaborations.

Top Co-Authors

Le An
University of North Carolina at Chapel Hill

Bir Bhanu
University of California

Xiaojing Chen
University of California

Ninad Thakoor
University of California

Albert C. Cruz
University of California

Menglong Yang
Vaughn College of Aeronautics and Technology