
Publication


Featured research published by Zhanpeng Zhang.


European Conference on Computer Vision | 2014

Facial Landmark Detection by Deep Multi-task Learning

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang

Facial landmark detection has long been impeded by the problems of occlusion and pose variation. Instead of treating the detection task as a single and independent problem, we investigate the possibility of improving detection robustness through multi-task learning. Specifically, we wish to optimize facial landmark detection together with heterogeneous but subtly correlated tasks, e.g. head pose estimation and facial attribute inference. This is non-trivial since different tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, with task-wise early stopping to facilitate learning convergence. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) reduces model complexity drastically compared to the state-of-the-art method based on cascaded deep model [21].
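The task-wise early stopping described in this abstract can be illustrated with a small sketch. Assuming each auxiliary task tracks its own validation-loss history, one plausible stopping rule (the function name, window, and threshold below are hypothetical, not the paper's exact criterion) halts a task once its validation loss stops improving:

```python
def should_stop_task(val_losses, window=3, threshold=1e-3):
    """Task-wise early stopping: halt an auxiliary task once its
    recent validation loss stops decreasing meaningfully, so a
    converged (or overfitting) side task no longer disturbs the
    main landmark-detection task."""
    if len(val_losses) < 2 * window:
        return False  # not enough history yet
    recent = sum(val_losses[-window:]) / window
    earlier = sum(val_losses[-2 * window:-window]) / window
    return earlier - recent < threshold
```

During training, a task whose check fires would simply be dropped from the joint loss for the remaining epochs, while the main landmark task keeps training.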


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Learning Deep Representation for Face Alignment with Auxiliary Attributes

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang

In this study, we show that landmark detection or face alignment task is not a single and independent problem. Instead, its robustness can be greatly improved with auxiliary information. Specifically, we jointly optimize landmark detection together with the recognition of heterogeneous but subtly correlated facial attributes, such as gender, expression, and appearance attributes. This is non-trivial since different attribute inference tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, which not only learns the inter-task correlation but also employs dynamic task coefficients to facilitate the optimization convergence when learning multiple complex tasks. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing face alignment methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) reduces model complexity drastically compared to the state-of-the-art methods based on cascaded deep model.
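The joint objective described above, landmark regression plus coefficient-weighted auxiliary attribute tasks, can be sketched as follows. This is a simplified stand-in for the paper's formulation, with hypothetical names:

```python
import numpy as np

def multitask_loss(landmark_pred, landmark_gt, attr_logits, attr_labels, task_coeffs):
    """Joint objective: mean squared landmark-regression error plus a
    coefficient-weighted softmax cross-entropy term per auxiliary
    attribute task."""
    loss = np.mean((landmark_pred - landmark_gt) ** 2)
    for task, logits in attr_logits.items():
        z = logits - logits.max(axis=1, keepdims=True)      # numerically stable softmax
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        ce = -log_probs[np.arange(len(logits)), attr_labels[task]].mean()
        loss += task_coeffs[task] * ce                      # task coefficient
    return loss
```

The dynamic task coefficients of the paper would adjust `task_coeffs` over the course of training; with a coefficient of zero, a converged auxiliary task drops out of the objective entirely.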


International Conference on Computer Vision | 2015

Learning Social Relation Traits from Face Images

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang

Social relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. Motivated by psychological studies, we investigate whether such fine-grained and high-level relation traits can be characterized and quantified from face images in the wild. To address this challenging problem, we propose a deep model that learns a rich face representation to capture gender, expression, head pose, and age-related attributes, and then performs pairwise-face reasoning for relation prediction. To learn from heterogeneous attribute sources, we formulate a new network architecture with a bridging layer to leverage the inherent correspondences among these datasets. It can also cope with missing target attribute labels. Extensive experiments show that our approach is effective for fine-grained social relation learning in images and videos.
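Coping with missing target attribute labels, as mentioned above, typically amounts to masking unlabeled samples out of each attribute's loss. A minimal sketch, assuming a sentinel value marks missing labels (the sentinel and function name are hypothetical):

```python
import numpy as np

def masked_attribute_loss(logits, labels, missing=-1):
    """Softmax cross-entropy over samples whose attribute label is
    present; samples carrying the `missing` sentinel contribute
    nothing, so datasets annotated with different attribute subsets
    can be trained together."""
    mask = labels != missing
    if not mask.any():
        return 0.0
    z = logits[mask] - logits[mask].max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(mask.sum()), labels[mask]].mean()
```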


European Conference on Computer Vision | 2016

Joint Face Representation Adaptation and Clustering in Videos

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang

Clustering faces in movies or videos is extremely challenging since characters’ appearance can vary drastically under different scenes. In addition, the various cinematic styles make it difficult to learn a universal face representation for all videos. Unlike previous methods that assume fixed handcrafted features for face clustering, in this work, we formulate a joint face representation adaptation and clustering approach in a deep learning framework. The proposed method allows face representation to gradually adapt from an external source domain to a target video domain. The adaptation of deep representation is achieved without any strong supervision but through iteratively discovered weak pairwise identity constraints derived from potentially noisy face clustering results. Experiments on three benchmark video datasets demonstrate that our approach generates character clusters with high purity compared to existing video face clustering methods, which are either based on deep face representation (without adaptation) or carefully engineered features.
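The iteratively discovered weak pairwise identity constraints can be sketched as follows: from a possibly noisy clustering, keep only confident must-link and cannot-link pairs, which then supervise the next round of representation adaptation. The margin and function name below are hypothetical:

```python
import numpy as np

def mine_pairwise_constraints(features, labels, centers, margin=0.5):
    """From a (possibly noisy) clustering, keep only confident pairs:
    must-link pairs are points lying close to the same cluster center,
    cannot-link pairs come from different clusters."""
    dists = np.linalg.norm(features - centers[labels], axis=1)
    idx = np.where(dists < margin)[0]   # points near their own center
    must_link, cannot_link = [], []
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            i, j = int(idx[a]), int(idx[b])
            (must_link if labels[i] == labels[j] else cannot_link).append((i, j))
    return must_link, cannot_link
```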


International Journal of Computer Vision | 2018

From Facial Expression Recognition to Interpersonal Relation Prediction

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang

Interpersonal relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. We investigate whether such fine-grained and high-level relation traits can be characterized and quantified from face images in the wild. We address this challenging problem by first studying a deep network architecture for robust recognition of facial expressions. Unlike existing models that typically learn from facial expression labels alone, we devise an effective multitask network that is capable of learning from rich auxiliary attributes such as gender, age, and head pose, beyond just facial expression data. While conventional supervised training requires datasets with complete labels (e.g., all samples must be labeled with gender, age, and expression), we show that this requirement can be relaxed via a novel attribute propagation method. The approach further allows us to leverage the inherent correspondences between heterogeneous attribute sources despite the disparate distributions of different datasets. With the network, we demonstrate state-of-the-art results on existing facial expression recognition benchmarks. To predict interpersonal relations, we use the expression recognition network as branches for a Siamese model. Extensive experiments show that our model is capable of mining mutual context of faces for accurate fine-grained interpersonal prediction.
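Using the expression recognition network as the branches of a Siamese model can be sketched with a toy shared branch; the linear maps below stand in for the real networks and are purely illustrative:

```python
import numpy as np

def embed(face, W):
    """Shared branch: a single linear map with ReLU stands in for the
    expression-recognition network used as each Siamese branch."""
    return np.maximum(face @ W, 0.0)

def relation_scores(face_a, face_b, W, V):
    """Siamese relation head: both faces pass through the *same* branch
    (shared weights W), the two embeddings are concatenated, and a
    linear layer V scores each relation trait."""
    pair = np.concatenate([embed(face_a, W), embed(face_b, W)], axis=-1)
    return pair @ V
```

Because the branch weights are shared, both faces of a pair are represented in the same feature space before the relation head reasons over them jointly.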


Pattern Recognition | 2015

Hierarchical facial landmark localization via cascaded random binary patterns

Zhanpeng Zhang; Wei Zhang; Huijun Ding; Jianzhuang Liu; Xiaoou Tang

The main challenge of facial landmark localization in real-world applications is that large changes in head pose and facial expression cause substantial variations in image appearance. To avoid high-dimensional facial shape regression, we propose a hierarchical pose regression approach, estimating the head rotation, face components, and facial landmarks hierarchically. The regression process works in a unified cascaded fern framework with binary patterns. We present generalized gradient boosted ferns (GBFs) for the regression framework, which give better performance than ferns. The framework also achieves real-time performance. We verify our method on the latest benchmark datasets and show that it achieves state-of-the-art performance.

Highlights:
A regression framework is designed for multi-view facial landmark localization.
The use of comparison-based features is highly efficient for landmark localization.
Gradient-boosted decision trees are superior to random forests for the localization task.
Accuracy and speed are tested on the widely used open datasets LFW, AFLW, and 300-W.
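The cascaded fern framework with binary comparison features can be sketched at inference time as follows. The data layout is hypothetical, and training of the per-bin updates is omitted:

```python
import numpy as np

def fern_bin(features, pairs, thresholds):
    """A random fern maps an input to one of 2**F bins via F binary
    comparison tests of the form feature[i] - feature[j] > t."""
    bits = 0
    for (i, j), t in zip(pairs, thresholds):
        bits = (bits << 1) | int(features[i] - features[j] > t)
    return bits

def cascaded_fern_regression(features, shape, ferns):
    """Cascaded shape regression: each fern looks up the shape
    increment stored for its bin and adds it to the running estimate."""
    for pairs, thresholds, bin_updates in ferns:
        shape = shape + bin_updates[fern_bin(features, pairs, thresholds)]
    return shape
```

The binary tests are cheap pixel-difference comparisons, which is why such cascades reach real-time speeds.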


ACM Multimedia | 2013

Facial landmark localization based on hierarchical pose regression with cascaded random ferns

Zhanpeng Zhang; Wei Zhang; Jianzhuang Liu; Xiaoou Tang

The main challenge of facial landmark localization in real-world applications is that large changes in head pose and facial expression cause substantial variations in image appearance. To avoid high-dimensional regression in the 3D and 2D facial pose spaces simultaneously, we propose a hierarchical pose regression approach, estimating the head rotation, facial components, and landmarks hierarchically. The regression process works in a unified cascaded fern framework. We present generalized gradient boosted ferns (GBFs) for the regression framework, which give better performance than traditional ferns. The framework also achieves real-time performance. We verify our method on the latest benchmark datasets. The results show that it outperforms state-of-the-art methods in both accuracy and speed.


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Multiview Facial Landmark Localization in RGB-D Images via Hierarchical Regression With Binary Patterns

Zhanpeng Zhang; Wei Zhang; Jianzhuang Liu; Xiaoou Tang

In this paper, we propose a real-time system for multiview facial landmark localization in RGB-D images. The facial landmark localization problem is formulated in a regression framework that estimates both the head pose and the landmark positions. In this framework, we propose a coarse-to-fine approach to handle the high-dimensional regression output. First, 3-D face position and rotation are estimated from the depth observation via a random regression forest. Afterward, the 3-D pose is refined by fusing in the estimate from the RGB observation. Finally, the landmarks are located from the RGB observation with gradient boosted decision trees in a pose-conditional model. The benefits of the proposed localization framework are twofold. First, pose estimation and landmark localization are solved with hierarchical regression, in contrast to previous approaches that iteratively optimize the pose and landmark locations and therefore rely heavily on the initial pose estimate. Second, because the RGB and depth cues have different characteristics, they are used for landmark localization at different stages and incorporated in a robust manner. In the experiments, we show that the proposed approach outperforms state-of-the-art algorithms on facial landmark localization with RGB-D input.
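The fusion step, refining the depth-based pose with the RGB-based estimate, can be sketched as a confidence-weighted average; this is a simplified stand-in for the paper's fusion, with hypothetical names:

```python
import numpy as np

def fuse_pose(pose_depth, conf_depth, pose_rgb, conf_rgb):
    """Refine the coarse depth-based pose estimate with the RGB-based
    one via a confidence-weighted average of the two."""
    return (conf_depth * pose_depth + conf_rgb * pose_rgb) / (conf_depth + conf_rgb)
```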


arXiv preprint | 2014

Learning and Transferring Multi-task Deep Representation for Face Alignment

Zhanpeng Zhang; Ping Luo; Chen Change Loy; Xiaoou Tang


International Conference on Robotics and Automation | 2018

Fusing Object Context to Detect Functional Area for Cognitive Robots

Hui Cheng; Junhao Cai; Quande Liu; Zhanpeng Zhang; Kai Yang; Chen Change Loy; Liang Lin

Collaboration


Dive into Zhanpeng Zhang's collaboration.

Top Co-Authors

Xiaoou Tang (The Chinese University of Hong Kong)
Chen Change Loy (The Chinese University of Hong Kong)
Ping Luo (The Chinese University of Hong Kong)
Wei Zhang (Chinese Academy of Sciences)
Quande Liu (The Chinese University of Hong Kong)
Hui Cheng (Sun Yat-sen University)
Junhao Cai (Sun Yat-sen University)
Kai Yang (Sun Yat-sen University)