Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Zenghai Chen is active.

Publication


Featured research published by Zenghai Chen.


International Conference on Multimodal Interfaces | 2014

Emotion Recognition in the Wild with Feature Fusion and Multiple Kernel Learning

Junkai Chen; Zenghai Chen; Zheru Chi; Hong Fu

This paper presents our approach to the second Emotion Recognition in the Wild Challenge. We propose a new feature descriptor, Histogram of Oriented Gradients from Three Orthogonal Planes (HOG_TOP), to represent facial expressions. We also explore the properties of visual and audio features, and adopt Multiple Kernel Learning (MKL) to find an optimal feature fusion. An SVM with multiple kernels is trained for facial expression classification. Experimental results demonstrate that our method achieves promising performance: the overall classification accuracies on the validation set and test set are 40.21% and 45.21%, respectively.
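To make the fusion step concrete, here is a minimal sketch of combining a visual and an audio kernel for a multi-kernel SVM. It is not the authors' MKL solver: the mixing weight `beta` is fixed by hand rather than learned, and the feature matrices are random stand-ins for HOG_TOP and audio descriptors.

```python
# Sketch: kernel-level fusion of visual and audio features with an SVM.
# `beta` is hand-picked here; MKL would optimize it jointly with the SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
n = 200
X_vis = rng.normal(size=(n, 128))   # stand-in for HOG_TOP visual features
X_aud = rng.normal(size=(n, 40))    # stand-in for audio features
y = rng.integers(0, 7, size=n)      # 7 basic expression classes

beta = 0.6
K = beta * rbf_kernel(X_vis, gamma=1.0 / 128) \
    + (1.0 - beta) * rbf_kernel(X_aud, gamma=1.0 / 40)

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(K, y)
print("training accuracy:", clf.score(K, y))
```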


Neurocomputing | 2013

Multi-instance multi-label image classification: A neural approach

Zenghai Chen; Zheru Chi; Hong Fu; Dagan Feng

In this paper, a multi-instance multi-label algorithm based on neural networks is proposed for image classification. The proposed algorithm, termed multi-instance multi-label neural network (MIMLNN), consists of two stages of MultiLayer Perceptrons (MLPs). For multi-instance multi-label image classification, all the regional features are fed to the first-stage MLP, with one MLP copy processing one image region. The second-stage MLP then combines the outputs of the first-stage MLPs to produce the final labels for the input image. The first-stage MLP is expected to model the relationship between regions and labels, while the second-stage MLP aims at capturing label correlations for classification refinement. Error back-propagation (BP) is adopted to tune the parameters of MIMLNN. Since the traditional gradient-descent algorithm suffers from the long-term dependency problem, a refined BP algorithm named Rprop is extended to train MIMLNN effectively. Experiments are conducted on a synthetic dataset and the Corel dataset. Experimental results demonstrate the superior performance of MIMLNN compared with state-of-the-art algorithms for multi-instance multi-label image classification.
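A minimal PyTorch sketch of the two-stage idea follows: one shared MLP maps each region to label scores, and a second MLP refines the combined scores to capture label correlations. All dimensions are illustrative, and `torch.optim.Rprop` is the stock Rprop rather than the paper's extended variant.

```python
# Sketch of the two-stage MIMLNN architecture (illustrative sizes).
import torch
import torch.nn as nn

class MIMLNN(nn.Module):
    def __init__(self, feat_dim=64, n_labels=10, n_regions=5):
        super().__init__()
        # Stage 1: applied to every region with shared weights.
        self.region_mlp = nn.Sequential(
            nn.Linear(feat_dim, 32), nn.Tanh(), nn.Linear(32, n_labels))
        # Stage 2: takes all per-region label scores, outputs final labels.
        self.label_mlp = nn.Sequential(
            nn.Linear(n_regions * n_labels, 32), nn.Tanh(),
            nn.Linear(32, n_labels))

    def forward(self, regions):            # regions: (batch, n_regions, feat_dim)
        scores = self.region_mlp(regions)  # (batch, n_regions, n_labels)
        return self.label_mlp(scores.flatten(1))

model = MIMLNN()
x = torch.randn(8, 5, 64)
y = torch.randint(0, 2, (8, 10)).float()     # multi-label targets
opt = torch.optim.Rprop(model.parameters())  # stock Rprop, not the refined one
loss = nn.BCEWithLogitsLoss()(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```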


IEEE Transactions on Affective Computing | 2018

Facial Expression Recognition in Video with Multiple Feature Fusion

Junkai Chen; Zenghai Chen; Zheru Chi; Hong Fu

Video-based facial expression recognition is a long-standing problem that has attracted growing attention recently. The key to a successful facial expression recognition system is to exploit the potential of audiovisual modalities and to design robust features that effectively characterize the facial appearance and configuration changes caused by facial motions. We propose an effective framework to address this issue. In our study, both the visual modality (face images) and the audio modality (speech) are utilized. A new feature descriptor called Histogram of Oriented Gradients from Three Orthogonal Planes (HOG-TOP) is proposed to extract dynamic textures from video sequences and characterize facial appearance changes, and a new effective geometric feature derived from the warp transformation of facial landmarks is proposed to capture facial configuration changes. Moreover, the role of the audio modality in recognition is also explored. We apply multiple feature fusion to tackle video-based facial expression recognition both under a lab-controlled environment and in the wild. Experiments conducted on the extended Cohn-Kanade (CK+) database and the Acted Facial Expressions in the Wild (AFEW) 4.0 database show that our approach is robust in both settings compared with other state-of-the-art methods.
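The sketch below illustrates the HOG-TOP idea: HOG descriptors computed on the three orthogonal planes (XY, XT, YT) of a video volume and concatenated. The paper aggregates over many slices; here only the central slice of each plane is used, and the HOG parameters are guesses rather than the paper's configuration.

```python
# Sketch: HOG over the three orthogonal planes of a video volume.
import numpy as np
from skimage.feature import hog

def hog_top(video):                      # video: (T, H, W) grayscale volume
    t, h, w = video.shape
    planes = [video[t // 2],             # XY: appearance at the middle frame
              video[:, h // 2, :],       # XT: horizontal motion slice
              video[:, :, w // 2]]       # YT: vertical motion slice
    feats = [hog(p, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for p in planes]
    return np.concatenate(feats)

video = np.random.rand(32, 64, 64)       # stand-in for a face clip
print(hog_top(video).shape)
```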


Systems, Man and Cybernetics | 2012

An Adaptive Recognition Model for Image Annotation

Zenghai Chen; Hong Fu; Zheru Chi; David Dagan Feng

In this paper, an adaptive recognition model (ARM) is proposed for image annotation. The ARM consists of an adaptive classification network (CFN) and a nonlinear correlation network (CLN). The adaptive CFN annotates an image with keywords, and the CLN unveils the correlative information of keywords for annotation refinement. Image annotation is carried out by an ARM in two stages. In the first stage, the features extracted from regions of the input image are fed to the CFN to produce classification labels. In the second stage, the CLN uses keyword correlations learned from the training images to refine the classification result. The ARM works in a forward-propagating manner, resulting in high annotation efficiency. Furthermore, the computational time of an ARM is insensitive to the number of regions of the input image and to the vocabulary size. In this paper, the effect of keyword correlation in image annotation is comprehensively investigated on a real image dataset and a synthetic image dataset. The controllable synthetic dataset helps to systematically study the function of keyword correlation and to analyze the performance of the ARM effectively. Experimental results demonstrate the efficiency and effectiveness of the ARM.
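A toy sketch of the two-stage forward pass follows: a classification network scores keywords from regional features, then a correlation network refines those scores with a keyword correlation matrix. All weights here are random placeholders; in the ARM both stages are trained networks.

```python
# Toy sketch of the ARM's forward-propagating two-stage annotation.
import numpy as np

rng = np.random.default_rng(0)
n_keywords = 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cfn(region_feats, W):
    # Score each region, then pool region scores into image-level scores.
    return sigmoid(region_feats @ W).max(axis=0)

W_cfn = rng.normal(size=(16, n_keywords))          # classification weights
W_cln = rng.normal(size=(n_keywords, n_keywords))  # keyword correlations

regions = rng.normal(size=(5, 16))   # 5 regions, 16-D features each
initial = cfn(regions, W_cfn)
refined = sigmoid(W_cln @ initial)   # correlation-based refinement
print(initial.round(2), refined.round(2))
```

Because both stages are plain forward passes, annotation cost grows only linearly with the number of regions, which matches the efficiency claim in the abstract.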


International Conference on Control, Automation, Robotics and Vision | 2010

A neural network model with adaptive structure for image annotation

Zenghai Chen; Hong Fu; Zheru Chi; David Dagan Feng

A neural network model with an adaptive structure for image annotation is proposed in this paper. The adaptive structure enables the proposed model to utilize both global and regional visual features, as well as the correlative information of annotated keywords. To reach an approximate global optimum rather than a local optimum, a genetic algorithm and the traditional back-propagation algorithm are combined for model training. The model is evaluated on a synthetic image dataset with controllable parameters that has not been used in previous image annotation experiments. Experimental results demonstrate the effectiveness of the proposed model.
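The hybrid training scheme can be illustrated with a toy example: a genetic phase coarsely searches the weight space to escape poor local optima, then a gradient phase fine-tunes the best candidate. The "network" below is reduced to a quadratic loss purely for illustration; it is not the paper's model.

```python
# Toy sketch: genetic search followed by gradient fine-tuning.
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=8)
loss = lambda w: np.sum((w - target) ** 2)   # stand-in for network error

# Genetic phase: selection + mutation over a small population.
pop = rng.normal(size=(20, 8))
for _ in range(50):
    fitness = np.array([loss(w) for w in pop])
    parents = pop[np.argsort(fitness)[:10]]                     # keep best half
    children = parents + 0.1 * rng.normal(size=parents.shape)   # mutate
    pop = np.vstack([parents, children])

# Gradient phase: fine-tune the GA's best solution (stands in for BP).
w = pop[np.argmin([loss(w) for w in pop])]
for _ in range(100):
    w -= 0.05 * 2 * (w - target)   # gradient of the quadratic loss
print("final loss:", loss(w))
```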


Asian Conference on Computer Vision | 2014

Hierarchical Local Binary Pattern for Branch Retinal Vein Occlusion Recognition

Zenghai Chen; Hui Zhang; Zheru Chi; Hong Fu

Branch retinal vein occlusion (BRVO) is one of the most common retinal vascular diseases of the elderly; it can dramatically impair vision if not diagnosed and treated in time. Automatic recognition of BRVO could significantly reduce an ophthalmologist's workload, make the diagnosis more efficient, and save patients time and costs. In this paper, we propose, for the first time to the best of our knowledge, automatic recognition of BRVO using fundus images. In particular, we propose the Hierarchical Local Binary Pattern (HLBP) to represent the visual content of a fundus image for classification. HLBP applies Local Binary Patterns (LBP) in a hierarchical fashion with max-pooling. To evaluate the performance of HLBP, we establish a BRVO dataset for experiments. HLBP is compared with several state-of-the-art feature representation methods on this dataset. Experimental results demonstrate the superior performance of our proposed method for BRVO recognition.
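A sketch of the hierarchical-LBP-with-max-pooling idea follows: LBP histograms are computed on a grid of cells, then neighbouring histograms are max-pooled level by level and all levels are concatenated. The grid size and LBP parameters are guesses, not the paper's configuration.

```python
# Sketch: LBP histograms pooled up a hierarchy by element-wise max.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_hist(patch, P=8, R=1):
    codes = local_binary_pattern(patch, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2))
    return hist / max(hist.sum(), 1)

def hlbp(image, grid=4):
    h, w = image.shape
    cells = [lbp_hist(image[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid])
             for i in range(grid) for j in range(grid)]
    level = np.stack(cells)              # level 0: per-cell histograms
    feats = [level.ravel()]
    while len(level) > 1:                # max-pool pairs up the hierarchy
        if len(level) % 2:               # pad odd levels by repeating the last
            level = np.vstack([level, level[-1:]])
        level = np.maximum(level[0::2], level[1::2])
        feats.append(level.ravel())
    return np.concatenate(feats)

image = np.random.rand(64, 64)           # stand-in for a fundus image
print(hlbp(image).shape)
```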


Systems, Man and Cybernetics | 2015

Eye-Tracking Aided Digital System for Strabismus Diagnosis

Zenghai Chen; Hong Fu; Wai-Lun Lo; Zheru Chi

Strabismus is a common ophthalmic disease with a relatively high prevalence (4%), and one of the most common vision disorders in preschool children. If not diagnosed and treated in time, strabismus can cause amblyopia and even permanent vision loss, so diagnosis is essential. However, most diagnosis methods, such as cover testing, are conducted manually by an ophthalmologist; the examination cost is relatively high and the results are subjective. In this paper, we propose an eye-tracking method for strabismus diagnosis. This method allows us to develop an objective and automatic strabismus diagnosis system that could significantly increase examination efficiency and reduce cost. Experimental results demonstrate the effectiveness of the proposed eye-tracking method for strabismus diagnosis.


Systems, Man and Cybernetics | 2015

Learning to Detect Saliency with Deep Structure

Yu Hu; Zenghai Chen; Zheru Chi; Hong Fu

Deep learning has shown great success in solving various computer vision problems. To the best of our knowledge, however, little existing work applies deep learning to saliency modeling. In this paper, a new saliency model based on a convolutional neural network is proposed. The proposed model produces a saliency map directly from an image's pixels. In the model, multi-level output values are adopted to simulate the continuous values of a saliency map. Unlike most neural networks, which use a relatively small number of output nodes, the output layer of our model has a large number of nodes. To make training more efficient, an improved learning algorithm is adopted. Experimental results show that the proposed model succeeds in generating acceptable saliency maps after proper training.
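A minimal PyTorch sketch of such an architecture follows: a small CNN regresses a saliency map through one large fully connected output layer, with outputs quantized to a few discrete levels to mimic the multi-level output values. The layer sizes are illustrative, not the paper's network.

```python
# Sketch: CNN with a large output layer producing a quantized saliency map.
import torch
import torch.nn as nn

class SaliencyCNN(nn.Module):
    def __init__(self, out_hw=24, levels=8):
        super().__init__()
        self.levels, self.out_hw = levels, out_hw
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8))
        # One output node per saliency-map pixel: a large output layer.
        self.fc = nn.Linear(32 * 8 * 8, out_hw * out_hw)

    def forward(self, x):
        s = torch.sigmoid(self.fc(self.conv(x).flatten(1)))
        # Quantize to a few levels at inference (training would handle
        # the discretization differently, e.g. via the loss).
        s = torch.round(s * (self.levels - 1)) / (self.levels - 1)
        return s.view(-1, self.out_hw, self.out_hw)

model = SaliencyCNN()
print(model(torch.randn(2, 3, 96, 96)).shape)   # -> (2, 24, 24)
```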


International Conference on Image Processing | 2015

Dynamic texture and geometry features for facial expression recognition in video

Junkai Chen; Zenghai Chen; Zheru Chi; Hong Fu

Facial expression recognition in video has attracted growing attention recently. In this paper, we propose to handle this problem with dynamic appearance and geometric features. We propose a new feature descriptor called HOG from Three Orthogonal Planes (HOG-TOP) to represent dynamic appearance features. In addition, we introduce two types of geometric features to represent rigid and non-rigid facial changes, respectively. Multiple Kernel Learning (MKL) is applied to find an optimal combination of the two types of features, and finally a Support Vector Machine (SVM) with multiple kernels is trained for facial expression classification. Extensive experiments conducted on the extended Cohn-Kanade dataset show that our method achieves competitive performance compared with other state-of-the-art methods.
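The rigid/non-rigid split can be illustrated with facial landmarks: a Procrustes alignment absorbs the rigid change (a global similarity transform), and the residual landmark displacement captures the non-rigid deformation. This is a stand-in illustration of the split, not the paper's exact geometric features.

```python
# Toy sketch: separating rigid and non-rigid landmark changes.
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(0)
neutral = rng.normal(size=(68, 2))                 # reference landmarks
apex = neutral + 0.05 * rng.normal(size=(68, 2))   # expressive frame

# Procrustes removes translation, scale, and rotation (the rigid part);
# what remains is the non-rigid deformation of the face.
aligned_ref, aligned_apex, disparity = procrustes(neutral, apex)
non_rigid = (aligned_apex - aligned_ref).ravel()
print("non-rigid feature dim:", non_rigid.shape, "disparity:", disparity)
```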


Journal of Healthcare Engineering | 2018

Strabismus Recognition Using Eye-Tracking Data and Convolutional Neural Networks

Zenghai Chen; Hong Fu; Wai-Lun Lo; Zheru Chi

Strabismus is one of the most common vision diseases; it can cause amblyopia and even permanent vision loss, so timely diagnosis is crucial. In contrast to manual diagnosis, automatic recognition can significantly reduce labor cost and increase diagnosis efficiency. In this paper, we propose to recognize strabismus using eye-tracking data and convolutional neural networks. In particular, an eye tracker is first used to record a subject's eye movements. A gaze deviation (GaDe) image is then proposed to characterize the subject's eye-tracking data according to the accuracy of the gaze points. The GaDe image is fed to a convolutional neural network (CNN) that has been trained on a large image database called ImageNet, and the outputs of the fully connected layers of the CNN are used as the GaDe image's features for strabismus recognition. A dataset containing eye-tracking data of both strabismic and normal subjects is established for experiments. Experimental results demonstrate that natural image features transfer well to eye-tracking data, and that strabismus can be effectively recognized by our proposed method.
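The transfer-learning step can be sketched as follows: feed a GaDe-style image through an ImageNet-pretrained CNN and take a fully connected layer's activations as the feature vector. VGG-16 is used here as a plausible stand-in; the paper does not necessarily use this exact architecture or layer, and the weights API assumes torchvision 0.13 or newer.

```python
# Sketch: ImageNet-pretrained CNN as a fixed feature extractor.
import torch
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
# Drop the final 1000-way classification layer; keep the fc activations.
extractor = torch.nn.Sequential(*list(vgg.classifier.children())[:-1])

gade = torch.randn(1, 3, 224, 224)   # stand-in for a GaDe image tensor
with torch.no_grad():
    feats = extractor(vgg.features(gade).flatten(1))
print(feats.shape)                   # -> (1, 4096) feature vector
```

These fixed feature vectors would then be fed to a downstream classifier (e.g., an SVM) for strabismic-versus-normal recognition.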

Collaboration


Dive into Zenghai Chen's collaborations.

Top Co-Authors

Zheru Chi, Hong Kong Polytechnic University
Hong Fu, Chu Hai College of Higher Education
Junkai Chen, Hong Kong Polytechnic University
Wai-Lun Lo, Chu Hai College of Higher Education
Runqi Zhao, Hong Kong Polytechnic University
Yu Hu, Hong Kong Polytechnic University
Jing Wu, University of Sydney