Zhen-Hua Feng
University of Surrey
Publication
Featured research published by Zhen-Hua Feng.
IEEE Signal Processing Letters | 2015
Zhen-Hua Feng; Patrik Huber; Josef Kittler; William J. Christmas; Xiaojun Wu
In this letter, we present a random cascaded-regression copse (R-CR-C) for robust facial landmark detection. Its key innovations include a new parallel cascade structure design, and an adaptive scheme for scale-invariant shape update and local feature extraction. Evaluation on two challenging benchmarks shows the superiority of the proposed algorithm to state-of-the-art methods.
IEEE International Conference on Automatic Face and Gesture Recognition | 2015
J. Ross Beveridge; Hao Zhang; Bruce A. Draper; Patrick J. Flynn; Zhen-Hua Feng; Patrik Huber; Josef Kittler; Zhiwu Huang; Shaoxin Li; Yan Li; Meina Kan; Ruiping Wang; Shiguang Shan; Xilin Chen; Haoxiang Li; Gang Hua; Vitomir Struc; Janez Krizaj; Changxing Ding; Dacheng Tao; P. Jonathon Phillips
This report presents results from the Video Person Recognition Evaluation held in conjunction with the 11th IEEE International Conference on Automatic Face and Gesture Recognition. Two experiments required algorithms to recognize people in videos from the Point-and-Shoot Face Recognition Challenge Problem (PaSC). The first consisted of videos from a tripod-mounted, high-quality video camera. The second contained videos acquired from 5 different handheld video cameras. Each experiment comprised 1401 videos of 265 subjects. The subjects, the scenes, and the actions carried out by the people are the same in both experiments. Five groups from around the world participated in the evaluation. The handheld video experiment was included in the International Joint Conference on Biometrics (IJCB) 2014 Handheld Video Face and Person Recognition Competition. The top verification rate from this evaluation is double that of the top performer in the IJCB competition. Analysis shows that the factor most affecting algorithm performance is the combination of location and action: where the video was acquired and what the person was doing.
International Conference on Image Processing | 2015
Patrik Huber; Zhen-Hua Feng; William J. Christmas; Josef Kittler; Matthias Rätsch
In this paper, we propose a novel fitting method that uses local image features to fit a 3D Morphable Face Model to 2D images. To overcome the obstacle of optimising a cost function that contains a non-differentiable feature extraction operator, we use a learning-based cascaded regression method that learns the gradient direction from data. The method solves for shape and pose parameters simultaneously. Our method is thoroughly evaluated on Morphable Model generated data, and first results on real data are presented. Compared to traditional fitting methods, which use simple raw features like pixel colour or edge maps, local features have been shown to be much more robust against variations in imaging conditions. Our approach is unique in that we are the first to use local features to fit a 3D Morphable Model. Because of its speed, our method is suitable for real-time applications. Our cascaded regression framework is available as an open source library at github.com/patrikhuber/superviseddescent.
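The idea of learning update directions from data, instead of differentiating the cost function, can be illustrated with a toy one-dimensional cascade. This is only a sketch under stated assumptions and not the API of the superviseddescent library: the function `h` is a smooth stand-in for the feature extractor, each stage is a plain least-squares line fit, and the names `fit_line`, `train_cascade` and `apply_cascade` are hypothetical.

```python
import math


def h(x):
    # Toy stand-in for the feature extractor (assumption: the paper uses
    # non-differentiable local image features, not a sine).
    return math.sin(x)


def mse(estimates, targets):
    return sum((e - t) ** 2 for e, t in zip(estimates, targets)) / len(targets)


def fit_line(features, targets):
    """Least-squares fit of targets ~ slope * features + bias."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    var = sum((f - mean_f) ** 2 for f in features)
    if var < 1e-12:
        # Degenerate features: fall back to a constant update.
        return 0.0, mean_t
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    slope = cov / var
    return slope, mean_t - slope * mean_f


def train_cascade(true_params, x0, n_stages):
    """Learn one linear regressor per stage that maps the feature residual
    h(x_k) - y to a parameter update, starting every sample from x0."""
    observations = [h(p) for p in true_params]
    estimates = [x0] * len(true_params)
    cascade = []
    errors = [mse(estimates, true_params)]
    for _ in range(n_stages):
        feats = [h(x) - y for x, y in zip(estimates, observations)]
        updates = [p - x for p, x in zip(true_params, estimates)]
        slope, bias = fit_line(feats, updates)
        cascade.append((slope, bias))
        estimates = [x + slope * f + bias for x, f in zip(estimates, feats)]
        errors.append(mse(estimates, true_params))
    return cascade, errors


def apply_cascade(cascade, observation, x0):
    """Run the learned regressors in sequence on a new observation."""
    x = x0
    for slope, bias in cascade:
        x += slope * (h(x) - observation) + bias
    return x
```

Training on parameters drawn from a range such as [-1, 1] and then calling `apply_cascade(cascade, h(0.5), 0.0)` recovers an estimate of 0.5 without ever evaluating a derivative of `h`; the regressors play the role of the learned descent directions.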
IEEE Transactions on Image Processing | 2015
Zhen-Hua Feng; Guosheng Hu; Josef Kittler; William J. Christmas; Xiaojun Wu
A large amount of training data is usually crucial for successful supervised learning. However, the task of providing training samples is often time-consuming, involving a considerable amount of tedious manual work. In addition, the amount of training data available is often limited. As an alternative, in this paper, we discuss how best to augment the available data for the application of automatic facial landmark detection. We propose the use of a 3D morphable face model to generate synthesized faces for training a regression-based detector. Benefiting from the large synthetic training data, the learned detector is shown to exhibit a better capability to detect the landmarks of a face with pose variations. Furthermore, the synthesized training data set provides accurate and consistent landmarks automatically, in contrast to manually annotated landmarks, especially for occluded facial parts. The synthetic data and real data are from different domains; hence the detector trained using only synthesized faces does not generalize well to real faces. To deal with this problem, we propose a cascaded collaborative regression algorithm, which generates a cascaded shape updater that has the ability to overcome the difficulties caused by pose variations, as well as achieving better accuracy when applied to real faces. The training is based on a mix of synthetic and real image data with the mixing controlled by a dynamic mixture weighting schedule. Initially, the training relies heavily on the synthetic data, as this can model the gross variations between the various poses. As the training proceeds, progressively more of the natural images are incorporated, as these can model finer detail. To improve the performance of the proposed algorithm further, we designed a dynamic multi-scale local feature extraction method, which captures more informative local features for detector training. An extensive evaluation on both controlled and uncontrolled face data sets demonstrates the merit of the proposed algorithm.
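The dynamic mixture weighting schedule described above can be pictured as a per-stage weight that shifts from synthetic to real data. The abstract does not give the functional form, so the linear ramp below, its endpoints (0.9 and 0.1), and the names `mixture_weights` and `blended_loss` are all assumptions made purely for illustration.

```python
def mixture_weights(stage, n_stages, start=0.9, end=0.1):
    """Return (synthetic_weight, real_weight) for a 0-based cascade stage.

    Assumption: a linear ramp from `start` to `end`; the paper's actual
    schedule may differ.
    """
    if n_stages < 1 or not 0 <= stage < n_stages:
        raise ValueError("stage must satisfy 0 <= stage < n_stages")
    if n_stages == 1:
        synthetic = start
    else:
        t = stage / (n_stages - 1)
        synthetic = start + t * (end - start)
    return synthetic, 1.0 - synthetic


def blended_loss(synthetic_losses, real_losses, stage, n_stages):
    """Combine the mean per-sample losses of the two domains using the
    stage-dependent weights."""
    w_syn, w_real = mixture_weights(stage, n_stages)
    mean_syn = sum(synthetic_losses) / len(synthetic_losses)
    mean_real = sum(real_losses) / len(real_losses)
    return w_syn * mean_syn + w_real * mean_real
```

With five stages, the first stage weights synthetic data at 0.9 and the last at 0.1, which mirrors the described progression from gross pose variation towards finer detail in natural images.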
Computer Vision and Pattern Recognition | 2017
Zhen-Hua Feng; Josef Kittler; William J. Christmas; Patrik Huber; Xiaojun Wu
We present a new Cascaded Shape Regression (CSR) architecture, namely Dynamic Attention-Controlled CSR (DAC-CSR), for robust facial landmark detection on unconstrained faces. Our DAC-CSR divides facial landmark detection into three cascaded sub-tasks: face bounding box refinement, general CSR and attention-controlled CSR. The first two stages refine initial face bounding boxes and output intermediate facial landmarks. Then, an online dynamic model selection method is used to choose appropriate domain-specific CSRs for further landmark refinement. The key innovation of our DAC-CSR is the fault-tolerant mechanism, using fuzzy set sample weighting, for attention-controlled domain-specific model training. Moreover, we advocate data augmentation with a simple but effective 2D profile face generator, and context-aware feature extraction for better facial feature representation. Experimental results obtained on challenging datasets demonstrate the merits of our DAC-CSR over the state-of-the-art methods.
Articulated Motion and Deformable Objects | 2016
Josef Kittler; Patrik Huber; Zhen-Hua Feng; Guosheng Hu; William J. Christmas
3D Morphable Face Models (3DMM) have been used in face recognition for some time now. They can be applied in their own right as a basis for 3D face recognition and analysis involving 3D face data. However, their prevalent use over the last decade has been as a versatile tool in 2D face recognition to normalise pose, illumination and expression of 2D face images. A 3DMM has the generative capacity to augment the training and test databases for various 2D face processing related tasks. It can be used to expand the gallery set for pose-invariant face matching. For any 2D face image it can furnish complementary information, in terms of its 3D face shape and texture. It can also aid multiple frame fusion by providing the means of registering a set of 2D images. A key enabling technology for this versatility is 3D face model to 2D face image fitting. This paper reviews recent developments in 3D face modelling and model fitting, and illustrates their merits in the context of diverse applications with several examples, including pose and illumination invariant face recognition, and 3D face reconstruction from video.
IEEE Transactions on Systems, Man, and Cybernetics | 2017
Xiaoning Song; Zhen-Hua Feng; Guosheng Hu; Xiaojun Wu
This paper presents a half-face dictionary integration (HFDI) algorithm for representation-based classification. The proposed HFDI algorithm measures residuals between an input signal and the reconstructed one, using both the original and the synthesized dual-column (row) half-face training samples. More specifically, we first generate a set of virtual half-face samples for the purpose of training data augmentation. The aim is to obtain high-fidelity collaborative representation of a test sample. In this half-face integrated dictionary, each original training vector is replaced by an integrated dual-column (row) half-face matrix. Second, to reduce the redundancy between the original dictionary and the extended half-face dictionary, we propose an elimination strategy to gain the most robust training atoms. The last contribution of the proposed HFDI method is the use of a competitive fusion method weighting the reconstruction residuals from different dictionaries for robust face classification. Experimental results obtained from the Facial Recognition Technology, Aleix and Robert, Georgia Tech, ORL, and Carnegie Mellon University-pose, illumination and expression data sets demonstrate the effectiveness of the proposed method, especially in the case of the small sample size problem.
Pattern Recognition | 2018
Paul Koppen; Zhen-Hua Feng; Josef Kittler; Muhammad Awais; William J. Christmas; Xiaojun Wu; He-Feng Yin
A Gaussian Mixture 3DMM (GM-3DMM) which models the global population as a mixture of Gaussian subpopulations. An ESO-based model selection strategy for GM-3DMM fitting. A GM-3DMM-based face recognition framework by fusing multiple experts, which has achieved state-of-the-art results on the Multi-PIE face dataset. A new 3D face dataset, SURREY-JNU, comprising 942 3D face scans of people with mixed backgrounds.
3D Morphable Face Models (3DMM) have been used in pattern recognition for some time now. They have been applied as a basis for 3D face recognition, as well as in an assistive role for 2D face recognition to perform geometric and photometric normalisation of the input image, or in 2D face recognition system training. The statistical distribution underlying 3DMM is Gaussian. However, the single-Gaussian model seems at odds with reality when we consider different cohorts of data, e.g. Black and Chinese faces. Their means are clearly different. This paper introduces the Gaussian Mixture 3DMM (GM-3DMM) which models the global population as a mixture of Gaussian subpopulations, each with its own mean. The proposed GM-3DMM extends the traditional 3DMM naturally, by adopting a shared covariance structure to mitigate small sample estimation problems associated with data in high dimensional spaces. We construct a GM-3DMM, the training of which involves a multiple cohort dataset, SURREY-JNU, comprising 942 3D face scans of people with mixed backgrounds. Experiments in fitting the GM-3DMM to 2D face images to facilitate their geometric and photometric normalisation for pose and illumination invariant face recognition demonstrate the merits of the proposed mixture of Gaussians 3D face model.
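The combination of per-cohort means with a shared covariance can be sketched as a pooled covariance estimate. The example below is a minimal illustration on small vectors, assuming labelled cohorts and a standard pooled (within-class) estimator; it is not the paper's training procedure, and the name `cohort_means_and_shared_cov` is hypothetical.

```python
def cohort_means_and_shared_cov(samples, labels):
    """Estimate one mean per cohort and a single covariance shared by all
    cohorts, pooling the deviations of every sample from its own cohort mean."""
    dim = len(samples[0])
    groups = {}
    for x, c in zip(samples, labels):
        groups.setdefault(c, []).append(x)

    # Per-cohort mean vectors.
    means = {
        c: [sum(col) / len(xs) for col in zip(*xs)] for c, xs in groups.items()
    }

    # Accumulate outer products of within-cohort deviations.
    cov = [[0.0] * dim for _ in range(dim)]
    for x, c in zip(samples, labels):
        d = [xi - mi for xi, mi in zip(x, means[c])]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j]

    # Pooled estimate: N samples minus K estimated means.
    dof = len(samples) - len(groups)
    if dof <= 0:
        raise ValueError("need more samples than cohorts")
    cov = [[v / dof for v in row] for row in cov]
    return means, cov
```

Because every cohort contributes to the same covariance, the number of samples behind the estimate is the size of the whole dataset rather than of a single cohort, which is the motivation the abstract gives for sharing it.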
Pattern Recognition | 2017
Guosheng Hu; Fei Yan; Josef Kittler; William J. Christmas; Chi-Ho Chan; Zhen-Hua Feng; Patrik Huber
We propose an efficient stepwise optimisation (ESO) strategy that optimises sequentially the pose, shape, light direction, light strength and skin texture parameters in separate steps leading to an accurate and efficient fitting. A perspective camera and Phong reflectance model are used to model the geometric projection and illumination respectively. Linear methods that are adapted to camera and illumination models are proposed. We propose a fully automatic face recognition system based on ESO. This system supports 3D-assisted global and local feature extraction.
3D face reconstruction of shape and skin texture from a single 2D image can be performed using a 3D Morphable Model (3DMM) in an analysis-by-synthesis approach. However, performing this reconstruction (fitting) efficiently and accurately in a general imaging scenario is a challenge. Such a scenario would involve a perspective camera to describe the geometric projection from 3D to 2D, and the Phong model to characterise illumination. Under these imaging assumptions the reconstruction problem is nonlinear and, consequently, computationally very demanding. In this work, we present an efficient stepwise 3DMM-to-2D image-fitting procedure, which sequentially optimises the pose, shape, light direction, light strength and skin texture parameters in separate steps. By linearising each step of the fitting process we derive closed-form solutions for the recovery of the respective parameters, leading to efficient fitting. The proposed optimisation process involves all the pixels of the input image, rather than randomly selected subsets, which enhances the accuracy of the fitting. It is referred to as Efficient Stepwise Optimisation (ESO). The proposed fitting strategy is evaluated using reconstruction error as a performance measure. In addition, we demonstrate its merits in the context of a 3D-assisted 2D face recognition system which detects landmarks automatically and extracts both holistic and local features using a 3DMM. This contrasts with most other methods which only report results that use manual face landmarking to initialise the fitting. Our method is tested on the public CMU-PIE and Multi-PIE face databases, as well as one internal database. The experimental results show that the face reconstruction using ESO is significantly faster, and its accuracy is at least as good as that achieved by the existing 3DMM fitting algorithms. A face recognition system integrating ESO to provide a pose and illumination invariant solution compares favourably with other state-of-the-art methods. In particular, it outperforms deep learning methods when tested on the Multi-PIE database.
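The stepwise principle, in which a jointly nonlinear problem is split into steps that each have a closed-form solution, can be shown on a toy model. The sketch below fits y = scale * (x + offset), which is nonlinear in the two parameters together but linear in each one when the other is held fixed. It is an illustration of alternating closed-form updates only, not the ESO algorithm, and `fit_scale_offset` is a hypothetical name.

```python
def fit_scale_offset(xs, ys, n_iters=200):
    """Alternately solve for `scale` and `offset` in y = scale * (x + offset).

    Each step is an exact least-squares solution for one parameter with the
    other fixed, so the squared error never increases between steps.
    """
    scale, offset = 1.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        # Step 1: offset fixed, the model is linear in scale.
        shifted = [x + offset for x in xs]
        denom = sum(v * v for v in shifted)
        if denom == 0.0:
            raise ValueError("degenerate data: all shifted inputs are zero")
        scale = sum(y * v for y, v in zip(ys, shifted)) / denom

        # Step 2: scale fixed, the model is linear in offset.
        if scale == 0.0:
            raise ValueError("scale collapsed to zero")
        offset = sum(y - scale * x for x, y in zip(xs, ys)) / (scale * n)
    return scale, offset
```

For data generated with a scale of 2 and an offset of 3, the alternation approaches those values from the initial guess without any gradient computation or step-size tuning, which is the practical appeal of solving each parameter group in closed form.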
Information Sciences | 2017
Changbin Shao; Xiaoning Song; Zhen-Hua Feng; Xiaojun Wu; Yuhui Zheng
In this study, we present a new sparse-representation-based face-classification algorithm that exploits dynamic dictionary optimization on an extended dictionary using synthesized faces. More specifically, given a dictionary consisting of face examples, we first augment the dictionary with a set of virtual faces generated by calculating the image difference of a pair of faces. This results in an extended dictionary with hybrid training samples, which enhances the capacity of the dictionary to represent new samples. Second, to reduce the redundancy of the extended dictionary and improve the classification accuracy, we use a dictionary-optimization method. We truncate the extended dictionary to a more compact structure by discarding the original samples that contribute little to representing a test sample. Finally, we perform sparse-representation-based face classification using the optimized dictionary. Experimental results obtained using the AR and FERET face datasets demonstrate the superiority of the proposed method in terms of accuracy, especially for small-sample-size problems.