Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Meina Kan is active.

Publication


Featured research published by Meina Kan.


European Conference on Computer Vision | 2014

Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment

Jie Zhang; Shiguang Shan; Meina Kan; Xilin Chen

Accurate face alignment is a vital prerequisite for most face perception tasks such as face recognition, facial expression analysis and non-realistic face re-rendering. It can be formulated as the nonlinear inference of the facial landmarks from the detected face region. A deep network seems a good choice to model this nonlinearity, but it is nontrivial to apply one directly. In this paper, instead of a straightforward application of a deep network, we propose a Coarse-to-Fine Auto-encoder Networks (CFAN) approach, which cascades a few successive Stacked Auto-encoder Networks (SANs). Specifically, the first SAN quickly predicts a preliminary but sufficiently accurate estimate of the landmarks by taking as input a low-resolution version of the detected face holistically. The following SANs then progressively refine the landmarks by taking as input local features extracted around the current landmarks (the output of the previous SAN) at higher and higher resolution. Extensive experiments conducted on three challenging datasets demonstrate that our CFAN outperforms the state-of-the-art methods and runs in real time (40+ fps on a desktop, excluding face detection).
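
The cascaded inference loop the abstract describes can be summarized in a few lines. The sketch below is only a hedged illustration, not the authors' released code: `coarse_san`, `refine_sans`, and `extract_patches` are hypothetical callables standing in for the trained stacked auto-encoders and the local feature extractor.

```python
# Minimal sketch of CFAN-style coarse-to-fine inference (hypothetical
# network objects; not the authors' implementation).
import numpy as np

def cfan_inference(face_img: np.ndarray, coarse_san, refine_sans, extract_patches):
    """face_img: cropped face region as a 2-D array.
    coarse_san: callable mapping a low-resolution face to initial landmarks (N x 2).
    refine_sans: list of callables, each mapping local features to landmark updates (N x 2).
    extract_patches: callable(img, landmarks, scale) -> features around each landmark.
    """
    # Stage 1: a global, low-resolution prediction of all landmarks at once.
    low_res = face_img[::4, ::4]                 # crude stand-in for downsampling
    landmarks = coarse_san(low_res.ravel())      # (N, 2) preliminary estimate

    # Stages 2..K: refine with local features at progressively higher resolution.
    for k, san in enumerate(refine_sans):
        scale = 2.0 ** (k - len(refine_sans) + 1)    # finer patches each stage
        feats = extract_patches(face_img, landmarks, scale)
        landmarks = landmarks + san(feats)       # each SAN predicts a residual update
    return landmarks
```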


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Multi-View Discriminant Analysis

Meina Kan; Shiguang Shan; Haihong Zhang; Shihong Lao; Xilin Chen

In many computer vision systems, the same object can be observed from varying viewpoints or even by different sensors, which creates the challenging demand of recognizing objects from distinct, even heterogeneous, views. In this work we propose a Multi-view Discriminant Analysis (MvDA) approach, which seeks a single discriminant common space for multiple views in a non-pairwise manner by jointly learning multiple view-specific linear transforms. Specifically, MvDA is formulated to jointly solve for the multiple linear transforms by optimizing a generalized Rayleigh quotient, i.e., maximizing the between-class variations and minimizing the within-class variations, both intra-view and inter-view, in the common space. By reformulating this problem as a ratio trace problem, the multiple linear transforms are obtained analytically and simultaneously through generalized eigenvalue decomposition. Furthermore, inspired by the observation that different views share similar data structures, a constraint is introduced to enforce view-consistency across the multiple linear transforms. The proposed method is evaluated on three tasks: face recognition across pose, photo versus sketch face recognition, and visible light versus near-infrared face recognition, on the Multi-PIE, CUFSF and HFB databases, respectively. Extensive experiments show that MvDA achieves significant improvements over the best known results.
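
A LaTeX sketch of the objective the abstract describes, with notation paraphrased for illustration (w_1, ..., w_v are the view-specific transforms stacked as W, and S_B, S_W the joint between- and within-class scatter built over samples from all views; the exact scatter definitions are in the paper):

```latex
% Ratio-trace form of the MvDA objective (paraphrased from the abstract).
(w_1^{*},\dots,w_v^{*}) = \arg\max_{w_1,\dots,w_v}
    \frac{\operatorname{tr}\left(W^{\top} S_B\, W\right)}
         {\operatorname{tr}\left(W^{\top} S_W\, W\right)},
\qquad
W = \left[\, w_1^{\top}\;\cdots\;w_v^{\top} \,\right]^{\top} .

% Reformulated as a ratio-trace problem, the transforms are recovered from the
% leading generalized eigenvectors of
S_B\, w = \lambda\, S_W\, w .
```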


Computer Vision and Pattern Recognition | 2014

Stacked Progressive Auto-Encoders (SPAE) for Face Recognition Across Poses

Meina Kan; Shiguang Shan; Hong Chang; Xilin Chen

Identifying subjects under pose variations is one of the most challenging tasks in face recognition, since the difference in appearance caused by pose may be even larger than the difference due to identity. Inspired by the observation that pose variations change non-linearly but smoothly, we propose to learn pose-robust features by modeling the complex non-linear transform from non-frontal face images to frontal ones through a deep network in a progressive way, termed Stacked Progressive Auto-Encoders (SPAE). Specifically, each shallow progressive auto-encoder of the stacked network is designed to map face images at large poses to a virtual view at a smaller pose, while keeping images already at small poses unchanged. Stacking multiple such shallow auto-encoders then converts non-frontal face images to frontal ones progressively, meaning the pose variations are narrowed down to zero step by step. As a result, the outputs of the topmost hidden layers of the stacked network contain very small pose variations, which can be used as pose-robust features for face recognition. An additional attraction of the proposed method is that no pose estimation is needed for the test images. The proposed method is evaluated on two datasets with pose variations, i.e., the Multi-PIE and FERET datasets, and the experimental results demonstrate the superiority of our method over existing works, especially 2D ones.
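
The per-layer training target described above can be written compactly. This is a hedged paraphrase of the abstract, with symbols introduced here for illustration: theta_k is the pose range handled after layer k, and x_i^{(+-theta_k)} is the same subject's image at the nearest pose inside that range.

```latex
% One progressive auto-encoder g_{\Theta_k}: push large poses toward the range
% |pose| <= \theta_k, and leave images already inside that range unchanged.
\min_{\Theta_k} \sum_i \bigl\| g_{\Theta_k}(x_i) - t_k(x_i) \bigr\|_2^2,
\qquad
t_k(x_i) =
\begin{cases}
x_i^{(\pm\theta_k)}, & |\mathrm{pose}(x_i)| > \theta_k,\\[2pt]
x_i, & |\mathrm{pose}(x_i)| \le \theta_k .
\end{cases}
```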


European Conference on Computer Vision | 2012

Multi-view discriminant analysis

Meina Kan; Shiguang Shan; Haihong Zhang; Shihong Lao; Xilin Chen

The same object can be observed at different viewpoints or even by different sensors, generating multiple distinct, even heterogeneous, samples, and more and more applications need to recognize objects across such distinct views. Some seminal works have been proposed for object recognition across two views and extended to multiple views in an inefficient pairwise manner. In this paper, we propose a Multi-view Discriminant Analysis (MvDA) method, which seeks a discriminant common space by jointly learning multiple view-specific linear transforms, in a non-pairwise manner, for robust object recognition from multiple views. Specifically, MvDA is formulated to jointly solve for the multiple linear transforms by optimizing a generalized Rayleigh quotient, i.e., maximizing the between-class variations and minimizing the within-class variations of the low-dimensional embeddings, both intra-view and inter-view, in the common space. By reformulating this problem as a ratio trace problem, an analytical solution is obtained via generalized eigenvalue decomposition. The proposed method is applied to three multi-view face recognition problems: face recognition across poses, photo-sketch face recognition, and Visible (VIS) versus Near-Infrared (NIR) face recognition, evaluated on the Multi-PIE, CUFSF and HFB databases, respectively. Intensive experiments show that MvDA achieves a more discriminant common space, with up to 13% improvement over the best known results.
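
Both MvDA abstracts reduce the learning problem to a ratio-trace / generalized-eigenvalue step. Below is a generic, illustrative sketch of that step written against SciPy; it is not the authors' code, and `S_b`/`S_w` stand for the joint between- and within-class scatter matrices assembled over all views.

```python
# Illustrative ratio-trace solver: top generalized eigenvectors of (S_b, S_w).
import numpy as np
from scipy.linalg import eigh

def ratio_trace_transforms(S_b, S_w, dim, reg=1e-6):
    """Return the `dim` leading generalized eigenvectors of S_b w = lambda S_w w."""
    d = S_w.shape[0]
    # A small ridge keeps S_w positive definite when it is rank deficient.
    eigvals, eigvecs = eigh(S_b, S_w + reg * np.eye(d))
    order = np.argsort(eigvals)[::-1]          # largest generalized eigenvalues first
    return eigvecs[:, order[:dim]]             # columns = stacked view-specific transforms
```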


Pattern Recognition | 2013

Adaptive discriminant learning for face recognition

Meina Kan; Shiguang Shan; Yu Su; Dong Xu; Xilin Chen

Face recognition from a Single Sample per Person (SSPP) is extremely challenging because only one sample is available for each person. While many discriminant analysis methods, such as Fisherfaces and its numerous variants, have achieved great success in face recognition, they cannot work in this scenario, because more than one sample per person is needed to calculate the within-class scatter matrix. To address this problem, we propose Adaptive Discriminant Analysis (ADA), in which the within-class scatter matrix of each enrolled subject is inferred from his/her single sample by leveraging a generic set with multiple samples per person. Our method is motivated by the assumption that subjects who look alike generally share similar within-class variations. In ADA, a limited number of neighbors for each single sample are first determined from the generic set using kNN regression or Lasso regression. Then, the within-class scatter matrix of this single sample is inferred as the weighted average of the within-class scatter matrices of these neighbors, based on the arithmetic mean or Riemannian mean. Finally, the optimal ADA projection directions are computed analytically using the inferred within-class scatter matrices and the actual between-class scatter matrix. The proposed method is evaluated on three databases: the FERET database, the FRGC database and a large real-world passport-like face database. Extensive results demonstrate the effectiveness of ADA compared with existing solutions to the SSPP problem.
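
A hedged sketch of the scatter-inference step described above, restricted to the kNN and arithmetic-mean variant (the paper also considers Lasso regression and a Riemannian mean). Function and argument names are hypothetical, introduced only for illustration.

```python
# Estimate the within-class scatter of a single enrolled sample as a weighted
# average of the scatter matrices of its nearest generic-set subjects.
import numpy as np

def infer_within_scatter(x, generic_means, generic_scatters, k=5):
    """x: single enrolled sample (d,).
    generic_means: (m, d) mean face of each generic subject.
    generic_scatters: (m, d, d) within-class scatter of each generic subject."""
    dists = np.linalg.norm(generic_means - x, axis=1)
    nn = np.argsort(dists)[:k]                      # kNN over generic subjects
    w = 1.0 / (dists[nn] + 1e-8)
    w = w / w.sum()                                 # normalized similarity weights
    # Arithmetic-mean combination of the neighbors' scatter matrices.
    return np.tensordot(w, generic_scatters[nn], axes=1)
```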


British Machine Vision Conference | 2011

Side-Information based Linear Discriminant Analysis for Face Recognition

Meina Kan; Shiguang Shan; Dong Xu; Xilin Chen

In recent years, face recognition in unconstrained environments has attracted increasing attention, and a number of methods have been evaluated on the Labeled Faces in the Wild (LFW) database. Under unconstrained conditions, we sometimes cannot obtain the full class label information of all the subjects; instead we can only get weak label information, such as side-information, i.e., image pairs known to come from the same or different subjects. In this scenario, many multi-class methods (e.g., the well-known Fisher Linear Discriminant Analysis (FLDA)) fail to work due to the lack of full class label information. To effectively utilize the side-information in such cases, we propose Side-Information based Linear Discriminant Analysis (SILD), in which the within-class and between-class scatter matrices are calculated directly from the side-information. Moreover, we theoretically prove that SILD is equivalent to FLDA when the full class label information is available. Experiments on the LFW and FRGC databases support our theoretical analysis, and SILD using multiple features also achieves promising performance compared with state-of-the-art methods.
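
In this construction the scatter matrices are assembled directly from labeled pairs. A hedged LaTeX rendering, with normalization constants omitted, where S is the set of same-subject pairs and D the set of different-subject pairs:

```latex
% Within- and between-class scatter built from pair-wise side-information.
S_w = \sum_{(i,j)\in\mathcal{S}} (x_i - x_j)(x_i - x_j)^{\top},
\qquad
S_b = \sum_{(i,j)\in\mathcal{D}} (x_i - x_j)(x_i - x_j)^{\top},

% and the projections maximize the usual Fisher criterion
W^{*} = \arg\max_{W}
  \frac{\operatorname{tr}\left(W^{\top} S_b\, W\right)}
       {\operatorname{tr}\left(W^{\top} S_w\, W\right)} .
```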


International Conference on Computer Vision | 2015

AgeNet: Deeply Learned Regressor and Classifier for Robust Apparent Age Estimation

Xin Liu; Shaoxin Li; Meina Kan; Jie Zhang; Shuzhe Wu; Wenxian Liu; Hu Han; Shiguang Shan; Xilin Chen

Apparent age estimation from face images has attracted increasing attention as it is useful in several real-world applications. In this work, we propose an end-to-end learning approach for robust apparent age estimation, named AgeNet. Specifically, we address the apparent age estimation problem by fusing two kinds of models, i.e., real-value-based regression models and Gaussian label distribution based classification models. For both kinds of models, a large-scale deep convolutional neural network is adopted to learn informative age representations. Another key feature of the proposed AgeNet is that, to avoid over-fitting on the small apparent-age training set, we exploit a general-to-specific transfer learning scheme. Technically, AgeNet is first pre-trained on a large-scale web-collected face dataset with identity labels, then fine-tuned on a large-scale real-age dataset with noisy age labels, and finally fine-tuned on a small training set with apparent age labels. The experimental results on the ChaLearn 2015 Apparent Age Competition demonstrate that AgeNet achieves state-of-the-art performance in apparent age estimation.
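
An illustrative sketch of the two label encodings the abstract combines: a real-valued regression target and a Gaussian label distribution over discrete ages for the classification branch. This is not the released model; the fusion rule and the sigma value below are placeholder assumptions.

```python
import numpy as np

def gaussian_label_distribution(age, ages=np.arange(0, 101), sigma=2.0):
    """Soft classification target: probability mass centred on the annotated age."""
    p = np.exp(-0.5 * ((ages - age) / sigma) ** 2)
    return p / p.sum()

def fuse_predictions(reg_age, cls_probs, ages=np.arange(0, 101)):
    """Simple fusion stand-in: average the regressed age with the expectation of
    the classification distribution (the paper's exact fusion may differ)."""
    cls_age = float(np.dot(cls_probs, ages))
    return 0.5 * (reg_age + cls_age)
```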


IEEE International Conference on Automatic Face and Gesture Recognition | 2015

Report on the FG 2015 Video Person Recognition Evaluation

J. Ross Beveridge; Hao Zhang; Bruce A. Draper; Patrick J. Flynn; Zhen-Hua Feng; Patrik Huber; Josef Kittler; Zhiwu Huang; Shaoxin Li; Yan Li; Meina Kan; Ruiping Wang; Shiguang Shan; Xilin Chen; Haoxiang Li; Gang Hua; Vitomir Struc; Janez Krizaj; Changxing Ding; Dacheng Tao; P. Jonathon Phillips

This report presents results from the Video Person Recognition Evaluation held in conjunction with the 11th IEEE International Conference on Automatic Face and Gesture Recognition. Two experiments required algorithms to recognize people in videos from the Point-and-Shoot Face Recognition Challenge Problem (PaSC). The first consisted of videos from a tripod-mounted, high-quality video camera; the second contained videos acquired from five different handheld video cameras. Each experiment comprised 1,401 videos of 265 subjects, and the subjects, scenes, and actions carried out by the people are the same in both experiments. Five groups from around the world participated in the evaluation. The video handheld experiment was also included in the International Joint Conference on Biometrics (IJCB) 2014 Handheld Video Face and Person Recognition Competition; the top verification rate in this evaluation is double that of the top performer in the IJCB competition. Analysis shows that the factor most affecting algorithm performance is the combination of location and action: where the video was acquired and what the person was doing.


International Journal of Computer Vision | 2014

Domain Adaptation for Face Recognition: Targetize Source Domain Bridged by Common Subspace

Meina Kan; Junting Wu; Shiguang Shan; Xilin Chen

In many applications, a face recognition model learned on a source domain but applied to a novel target domain degrades, sometimes significantly, due to the mismatch between the two domains. Aiming at learning a better face recognition model for the target domain, this paper proposes a simple but effective domain adaptation approach that transfers the supervision knowledge from a labeled source domain to the unlabeled target domain. Our basic idea is to convert the source domain images to the target domain (termed targetizing the source domain hereinafter) while keeping their supervision information. For this purpose, each source domain image is represented as a linear combination of sparse target domain neighbors in the image space, with the combination coefficients, however, learned in a common subspace. The principle behind this strategy is that the common knowledge is only favorable for accurate cross-domain reconstruction, whereas for classification in the target domain, the specific knowledge of the target domain is also essential and thus should be mostly preserved (through targetization in the image space in this work). To discover the common knowledge, a common subspace is learned in which the structures of both domains are preserved while the disparity between the source and target domains is reduced. The proposed method is extensively evaluated under three face recognition scenarios, i.e., domain adaptation across view angle, across ethnicity, and across imaging condition. The experimental results illustrate the superiority of our method over competing ones.
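
A simplified sketch of the targetization idea described above: each source image is rebuilt as a sparse combination of target-domain images. For brevity the coefficients here are fit directly in the image space with an off-the-shelf Lasso solver, whereas the paper learns them in the common subspace; treat this purely as an illustration of the reconstruction step.

```python
import numpy as np
from sklearn.linear_model import Lasso

def targetize(source_imgs, target_imgs, alpha=0.1):
    """source_imgs: (n_s, d) flattened source images, target_imgs: (n_t, d).
    Returns targetized source images (n_s, d) that inherit the source labels."""
    targetized = np.empty_like(source_imgs, dtype=float)
    for i, x in enumerate(source_imgs):
        lasso = Lasso(alpha=alpha, max_iter=5000)
        lasso.fit(target_imgs.T, x)              # x ~= target_imgs.T @ coef, coef sparse
        targetized[i] = target_imgs.T @ lasso.coef_
    return targetized
```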


IEEE Transactions on Image Processing | 2013

Learning Prototype Hyperplanes for Face Verification in the Wild

Meina Kan; Dong Xu; Shiguang Shan; Wen Li; Xilin Chen

In this paper, we propose a new scheme called Prototype Hyperplane Learning (PHL) for face verification in the wild using only weakly labeled training samples (i.e., we only know whether each pair of samples is from the same class or different classes, without knowing the class label of each sample) by leveraging a large number of unlabeled samples in a generic data set. Our scheme represents each sample in the weakly labeled data set as a mid-level feature, with each entry being the decision value from the classification hyperplane (referred to as the prototype hyperplane) of one Support Vector Machine (SVM) model, in which a sparse set of support vectors is selected from the unlabeled generic data set based on the learned combination coefficients. To learn the optimal prototype hyperplanes for the extraction of mid-level features, we propose a Fisher's Linear Discriminant-like (FLD-like) objective function that maximizes the discriminability on the weakly labeled data set with a constraint enforcing sparsity on the combination coefficients of each SVM model, which is solved using an alternating optimization method. Then, we use Side-Information based Linear Discriminant Analysis (SILD) for dimensionality reduction and a cosine similarity measure for final face verification. Comprehensive experiments on two data sets, Labeled Faces in the Wild (LFW) and YouTube Faces, demonstrate the effectiveness of our scheme.
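
A hedged sketch of the mid-level feature described above: each entry is the decision value of one prototype hyperplane, itself a sparse combination of unlabeled generic samples. The kernel is fixed to a plain dot product here, and all names are illustrative rather than the paper's.

```python
import numpy as np

def midlevel_features(X, generic_set, coeffs, biases):
    """X: (n, d) weakly labeled samples.  generic_set: (m, d) unlabeled samples.
    coeffs: (p, m) sparse combination coefficients, one row per prototype hyperplane.
    biases: (p,) hyperplane offsets.  Returns (n, p) mid-level features."""
    # Decision value of hyperplane j for sample x: sum_k coeffs[j, k] * <x, g_k> + b_j
    kernel = X @ generic_set.T                   # (n, m) dot-product kernel
    return kernel @ coeffs.T + biases            # (n, p) decision values
```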

Collaboration


Dive into Meina Kan's collaborations.

Top Co-Authors

Shiguang Shan
Chinese Academy of Sciences

Xilin Chen
Chinese Academy of Sciences

Jie Zhang
Chinese Academy of Sciences

Xin Liu
Chinese Academy of Sciences

Zhenliang He
Chinese Academy of Sciences

Shuzhe Wu
Chinese Academy of Sciences

Dong Xu
University of Sydney

Shaoxin Li
Chinese Academy of Sciences

Wanglong Wu
Chinese Academy of Sciences

Yu Su
Harbin Institute of Technology