Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jianshu Li is active.

Publication


Featured research published by Jianshu Li.


International Conference on Multimodal Interfaces | 2016

Happiness level prediction with sequential inputs via multiple regressions

Jianshu Li; Sujoy Roy; Jiashi Feng; Terence Sim

This paper presents our solution to the Emotion Recognition in the Wild (EmotiW 2016) group-level happiness intensity prediction sub-challenge. The objective of this sub-challenge is to predict the overall happiness level given an image of a group of people in a natural setting. We note that both the global setting and the faces of the individuals in the image influence the group-level happiness intensity of the image. Hence the challenge lies in building a solution that incorporates both of these factors and considers the right combination of the two. Our proposed solution incorporates both factors as a combination of global and local information. We use a convolutional neural network to extract discriminative face features and a recurrent neural network to selectively memorize the important features for the group-level happiness prediction task. Experimental evaluations show promising performance improvements, resulting in a Root Mean Square Error (RMSE) reduction of about 0.5 units on the test set compared to the baseline algorithm that uses only global information.
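
As a rough illustration of the CNN-plus-RNN pipeline the abstract describes, here is a minimal PyTorch sketch; the backbone, dimensions, and training details are assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn as nn

class GroupHappinessRegressor(nn.Module):
    """Sketch of the described pipeline: a CNN extracts a feature per face,
    an LSTM aggregates the sequence of face features, and a linear head
    regresses the group-level happiness intensity."""
    def __init__(self, feat_dim=128, hidden_dim=64):
        super().__init__()
        # Stand-in face CNN (the paper's backbone is not specified here).
        self.face_cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # The LSTM plays the role of "selectively memorizing" face features.
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, faces):
        # faces: (batch, num_faces, 3, H, W) -- the faces found in one image.
        b, n = faces.shape[:2]
        feats = self.face_cnn(faces.flatten(0, 1)).view(b, n, -1)
        _, (h, _) = self.rnn(feats)
        return self.head(h[-1]).squeeze(-1)  # one happiness level per image

model = GroupHappinessRegressor()
pred = model(torch.randn(2, 5, 3, 64, 64))       # two images, five faces each
loss = nn.MSELoss()(pred, torch.tensor([2.0, 4.0]))  # RMSE = sqrt(MSE)
```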


ACM Multimedia | 2016

Robust Face Recognition with Deep Multi-View Representation Learning

Jianshu Li; Jian Zhao; Fang Zhao; Hao Liu; Jing Li; Shengmei Shen; Jiashi Feng; Terence Sim

This paper describes our proposed method targeting the MSR Image Recognition Challenge MS-Celeb-1M. The challenge is to recognize one million celebrities from their face images captured in the real world. The challenge provides a large-scale dataset crawled from the Web, which contains a large number of celebrities with many images for each subject. Given a new testing image, the challenge requires an identity for the image and a corresponding confidence score. To complete the challenge, we propose a two-stage approach consisting of data cleaning and multi-view deep representation learning. The data cleaning effectively reduces the noise level of the training data and thus improves the performance of deep learning based face recognition models. The multi-view representation learning enables the learned face representations to be more specific and discriminative, substantially alleviating the difficulty of recognizing faces among a huge number of subjects. Our proposed method achieves a coverage of 46.1% at 95% precision on the random set and a coverage of 33.0% at 95% precision on the hard set of this challenge.
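
The abstract does not detail the cleaning procedure. The sketch below shows one common instantiation of cleaning for web-crawled identity data (keep only the samples closest to their identity centroid); the paper's actual procedure may differ.

```python
import numpy as np

def clean_by_centroid(features, labels, keep_ratio=0.8):
    """Per identity, keep only the samples whose features lie closest to
    the identity's centroid; noisy web labels tend to be far from it.

    features: (N, D) L2-normalized face descriptors
    labels:   (N,) integer identity labels
    """
    keep = np.zeros(len(labels), dtype=bool)
    for ident in np.unique(labels):
        idx = np.where(labels == ident)[0]
        centroid = features[idx].mean(axis=0)
        centroid /= np.linalg.norm(centroid)
        sims = features[idx] @ centroid          # cosine similarity
        order = np.argsort(-sims)                # most similar first
        keep[idx[order[: max(1, int(keep_ratio * len(idx)))]]] = True
    return keep

feats = np.random.randn(100, 64)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
mask = clean_by_centroid(feats, np.random.randint(0, 5, size=100))
```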


European Conference on Computer Vision | 2018

Multi-fiber Networks for Video Recognition

Yunpeng Chen; Yannis Kalantidis; Jianshu Li; Shuicheng Yan; Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks. To this end, we present the novel Multi-Fiber architecture that slices a complex neural network into an ensemble of lightweight networks, or fibers, that run through the network. To facilitate information flow between fibers, we further incorporate multiplexer modules and end up with an architecture that reduces the computational cost of 3D networks by an order of magnitude while increasing recognition performance at the same time. Extensive experimental results show that our multi-fiber architecture significantly boosts the efficiency of existing convolutional networks for both image and video recognition tasks, achieving state-of-the-art performance on the UCF-101, HMDB-51 and Kinetics datasets. Our proposed model requires over 9× and 13× less computation than the I3D [1] and R(2+1)D [2] models, respectively, while providing higher accuracy.
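
A minimal sketch of the multi-fiber idea, assuming the fibers can be modeled with grouped convolutions and the multiplexer with a pointwise convolution. It is shown in 2D for brevity; the paper targets spatio-temporal networks, where Conv2d would become Conv3d.

```python
import torch
import torch.nn as nn

class MultiFiberBlock(nn.Module):
    """Sketch of the multi-fiber idea (dimensions and details are
    assumptions): grouped convolutions act as parallel lightweight
    'fibers', and a 1x1 'multiplexer' convolution lets information
    flow across fibers."""
    def __init__(self, channels, fibers=8):
        super().__init__()
        # Each fiber only sees channels/fibers channels, so the block
        # needs far fewer FLOPs than a dense convolution over all channels.
        self.fibers = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=fibers),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, groups=fibers),
        )
        # Multiplexer: a cheap pointwise conv that mixes all fibers.
        self.multiplexer = nn.Conv2d(channels, channels, 1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.fibers(self.multiplexer(x))
        return self.relu(out + x)  # residual connection

block = MultiFiberBlock(64)
y = block(torch.randn(1, 64, 16, 16))
```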


International Joint Conference on Artificial Intelligence | 2018

3D-Aided Deep Pose-Invariant Face Recognition

Jian Zhao; Lin Xiong; Yu Cheng; Yi Cheng; Jianshu Li; Li Zhou; Yan Xu; Jayashree Karlekar; Sugiri Pranata; Shengmei Shen; Junliang Xing; Shuicheng Yan; Jiashi Feng

Learning from synthetic faces, though perhaps appealing for its high data efficiency, may not bring satisfactory performance due to the distribution discrepancy between synthetic and real face images. To mitigate this gap, we propose a 3D-Aided Deep Pose-Invariant Face Recognition Model (3D-PIM), which automatically recovers realistic frontal faces from arbitrary poses through a 3D face model in a novel way. Specifically, 3D-PIM incorporates a simulator with the aid of a 3D Morphable Model (3DMM) to obtain shape and appearance priors that accelerate face normalization learning and require less training data. It further leverages a global-local Generative Adversarial Network (GAN) with multiple critical improvements as a refiner to enhance the realism of both global structures and local details of the face simulator's output using unlabelled real data only, while preserving the identity information. Qualitative and quantitative experiments on both controlled and in-the-wild benchmarks clearly demonstrate the superiority of the proposed model over the state of the art.
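
As a hedged sketch of the refiner's training signal, the function below combines the three kinds of terms the abstract alludes to (adversarial realism, identity preservation, and fidelity to the simulator output). The weights, loss forms, and the `id_embed` network are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn.functional as F

def refiner_loss(refined, simulated, d_logits, id_embed,
                 lam_adv=0.01, lam_id=1.0):
    """Illustrative loss for a GAN refiner (assumed, not the paper's exact
    objective):
    - adversarial term: the refined face should fool the discriminator;
    - identity term: the recognition embedding should stay unchanged;
    - pixel term: stay close to the simulator's output."""
    adv = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits))
    ident = 1 - F.cosine_similarity(
        id_embed(refined), id_embed(simulated)).mean()
    pixel = F.l1_loss(refined, simulated)
    return lam_adv * adv + lam_id * ident + pixel

# Toy usage with a stand-in embedding network.
embed = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 64 * 64, 128))
x = torch.randn(2, 3, 64, 64)
loss = refiner_loss(x, x.clone(), torch.randn(2, 1), embed)
```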


Computer Vision and Pattern Recognition | 2017

3D-Assisted Coarse-to-Fine Extreme-Pose Facial Landmark Detection

Shengtao Xiao; Jianshu Li; Yunpeng Chen; Zhecan Wang; Jiashi Feng; Shuicheng Yan; Ashraf A. Kassim

We propose a novel 3D-assisted coarse-to-fine extreme-pose facial landmark detection system in this work. For a given face image, our system first refines the face bounding box with landmark locations inferred from a 3D face model generated by a Recurrent 3D Regressor (R3R) at the coarse level. Another R3R is then employed to fit a 3D face model onto the 2D face image cropped with the refined bounding box at the fine scale. 2D landmark locations inferred from the fitted 3D face are further adjusted with a popular 2D regression method, Local Binary Features (LBF). The 3D-assisted coarse-to-fine strategy and the 2D adjustment process explicitly ensure both robustness to extreme face poses and bounding-box disturbance and accuracy down to pixel-level landmark displacement. Extensive experiments on the Menpo Challenge test sets demonstrate the superior performance of our system.
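
The three-stage flow can be summarized as an interface sketch. The stage implementations (the two R3R fits and the LBF regressor) are hypothetical callables here, not the authors' code.

```python
from typing import Callable, Tuple
import numpy as np

# Interface sketch of the coarse-to-fine pipeline the abstract describes.
Image = np.ndarray
Box = Tuple[int, int, int, int]          # x, y, w, h
Landmarks = np.ndarray                   # (n_points, 2) image coordinates

def detect_landmarks(
    image: Image,
    init_box: Box,
    r3r_coarse: Callable[[Image, Box], Landmarks],   # 3D fit -> 2D points
    r3r_fine: Callable[[Image, Box], Landmarks],
    lbf_refine: Callable[[Image, Landmarks], Landmarks],
) -> Landmarks:
    # Stage 1: coarse 3D fit; its projected landmarks refine the box.
    coarse_pts = r3r_coarse(image, init_box)
    x0, y0 = coarse_pts.min(axis=0)
    x1, y1 = coarse_pts.max(axis=0)
    refined_box = (int(x0), int(y0), int(x1 - x0), int(y1 - y0))
    # Stage 2: fine 3D fit inside the refined box.
    fine_pts = r3r_fine(image, refined_box)
    # Stage 3: 2D regression (LBF in the paper) for pixel-level accuracy.
    return lbf_refine(image, fine_pts)
```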


IEEE Transactions on Circuits and Systems for Video Technology | 2017

Towards a Comprehensive Face Detector in the Wild

Jianshu Li; Luoqi Liu; Jianan Li; Jiashi Feng; Shuicheng Yan; Terence Sim

In this paper, we aim to build a comprehensive face detection system that provides a one-stop solution to various practical challenges for face detection in realistic scenarios, e.g., detecting faces from multiple views, faces with occlusions, exaggerated expressions, or blurred faces. Moreover, we introduce an automatic data harvesting algorithm to effectively improve the generalization performance of the system even when collecting training faces containing various challenging patterns is difficult. In particular, we introduce three critical components to build the system, i.e., a widely used deep convolutional neural network (CNN), a novel blur-aware bi-channel network architecture, and a new self-learning mechanism capable of exploiting video contexts continuously. All of the aforementioned challenges except detecting blurred faces can potentially be addressed by the CNN component, owing to its robustness to local deformation of target faces. The more challenging problem of detecting blurred faces is addressed by the bi-channel architecture component, which processes blurred and clear faces adaptively. In addition, to address the difficulty of improving the generalization performance of the learning-based face detection system, we introduce a video-context-based self-learning mechanism into the system, which enables the system to continuously enhance its performance by automatically harvesting faces with challenging training patterns. To exploit video context, the detector is applied to massive unlabeled videos, and challenging faces are captured based on temporal inference. These recaptured faces, generally corresponding to one or more of the challenges mentioned above, are fed into the detection system to further improve its performance. Extensive experiments with the proposed detection system provide new state-of-the-art performance on the FDDB, PASCAL Face, AFW, and WIDER Face data sets.
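
One plausible reading of "processes blurred and clear faces adaptively" is a learned gate that blends two specialized branches; the sketch below illustrates that pattern and is an assumption, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BiChannelDetectorHead(nn.Module):
    """Sketch of a blur-aware bi-channel design: two feature branches
    specialize in clear vs. blurred faces, and a learned blur score
    blends their outputs per sample."""
    def __init__(self, in_dim=256, out_dim=2):
        super().__init__()
        self.clear_branch = nn.Linear(in_dim, out_dim)
        self.blur_branch = nn.Linear(in_dim, out_dim)
        self.blur_gate = nn.Sequential(nn.Linear(in_dim, 1), nn.Sigmoid())

    def forward(self, feat):
        g = self.blur_gate(feat)   # estimated "blurriness" in [0, 1]
        return g * self.blur_branch(feat) + (1 - g) * self.clear_branch(feat)

head = BiChannelDetectorHead()
scores = head(torch.randn(4, 256))  # face/non-face logits for 4 proposals
```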


ACM Multimedia | 2018

Multi-Human Parsing Machines

Jianshu Li; Jian Zhao; Yunpeng Chen; Sujoy Roy; Shuicheng Yan; Jiashi Feng; Terence Sim

Human parsing is an important task in human-centric analysis. Despite the remarkable progress in single-human parsing, the more realistic case of multi-human parsing remains challenging in terms of the data and the model. Compared with the considerable number of available single-human parsing datasets, the datasets for multi-human parsing are very limited in number, mainly due to the huge annotation effort required. Besides the data challenge, the persons in real-world scenarios are often entangled with each other due to close interaction and body occlusion, making it difficult to distinguish body parts from different person instances. In this paper, we propose the Multi-Human Parsing Machines (MHPM) system, which contains an MHP Montage model and an MHP Solver, to address both challenges in multi-human parsing. Specifically, the MHP Montage model in MHPM generates realistic images with multiple persons together with the parsing labels. It intelligently composes single persons onto background scene images while maintaining the structural information between persons and the scene. The generated images can be used to train better multi-human parsing algorithms. On the other hand, the MHP Solver in MHPM solves the bottleneck of distinguishing multiple entangled persons with close interaction. It employs a Group-Individual Push and Pull (GIPP) loss function, which can effectively separate persons with close interaction. We experimentally show that the proposed MHPM can achieve state-of-the-art performance on the multi-human parsing benchmark and the person individualization benchmark, which distinguishes closely entangled person instances.
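
The GIPP loss is not specified in the abstract beyond "push and pull". The sketch below shows a generic push-pull (discriminative) embedding loss in that spirit: pull pixel embeddings toward their instance mean and push different instance means apart; the paper's exact formulation may differ.

```python
import torch

def push_pull_loss(embeddings, instance_ids, margin=1.0):
    """Generic push-pull embedding loss (illustrative, not the paper's
    exact GIPP formulation).

    embeddings:   (N, D) per-pixel embedding vectors
    instance_ids: (N,) person-instance label per pixel
    """
    ids = instance_ids.unique()
    means, pull = [], 0.0
    for ident in ids:
        e = embeddings[instance_ids == ident]
        mu = e.mean(dim=0)
        means.append(mu)
        pull = pull + ((e - mu).norm(dim=1) ** 2).mean()  # pull term
    means = torch.stack(means)
    if len(ids) > 1:
        # Push term: hinge on pairwise distances between instance means.
        dists = torch.cdist(means, means)
        off_diag = ~torch.eye(len(ids), dtype=torch.bool)
        push = torch.clamp(margin - dists[off_diag], min=0).pow(2).mean()
    else:
        push = embeddings.new_zeros(())
    return pull / len(ids) + push

emb = torch.randn(100, 8, requires_grad=True)
loss = push_pull_loss(emb, torch.randint(0, 3, (100,)))
```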


IEEE International Conference on Automatic Face and Gesture Recognition | 2017

BoxFlow: Unsupervised Face Detector Adaptation from Images to Videos

Jianshu Li; Jiashi Feng; Luoqi Liu; Terence Sim

Face detectors are usually trained on static images but deployed in the wild, such as in surveillance videos. Due to the domain shift between images and videos, directly applying image-based face detectors to videos usually gives unsatisfactory performance. In this paper, we introduce BoxFlow, a new unsupervised detector adaptation method that can effectively adapt a face detector pre-trained on static images to videos. BoxFlow adapts face detectors without supervision by fully exploiting the motion contexts across video frames. In particular, BoxFlow introduces three novel components: (1) a generalized heat map representation of face locations with augmented shape flexibility; (2) motion-based temporal contextual regularization among adjacent frames for unsupervised face detection refinement; and (3) a self-paced learning strategy that adapts face detectors from easy data samples to challenging ones progressively. With these key components, we develop a systematic unsupervised face detector adaptation framework that helps face detectors adapt to various deployment environments. Extensive experiments on the IDA dataset clearly demonstrate the superiority of our proposed method. Without utilizing any annotations, BoxFlow achieves about a 10%-20% gain in Average Precision over directly applying image-based face detectors.
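
To make the heat-map representation and temporal regularization concrete, here is a small NumPy sketch; the Gaussian parameterization and the simple adjacent-frame penalty are illustrative assumptions.

```python
import numpy as np

def boxes_to_heatmap(boxes, shape, sigma_scale=0.25):
    """Render detected face boxes as a Gaussian heat map, a soft location
    representation like the one the abstract describes (the
    parameterization here is an assumption).

    boxes: list of (cx, cy, w, h) box centers and sizes
    shape: (H, W) of the output map
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros(shape)
    for cx, cy, bw, bh in boxes:
        sx, sy = sigma_scale * bw, sigma_scale * bh
        g = np.exp(-((xs - cx) ** 2 / (2 * sx ** 2)
                     + (ys - cy) ** 2 / (2 * sy ** 2)))
        heat = np.maximum(heat, g)
    return heat

def temporal_consistency(heat_t, heat_next):
    """Simple adjacent-frame regularizer: detections should move smoothly,
    so consecutive heat maps should largely agree."""
    return float(np.mean((heat_t - heat_next) ** 2))

h1 = boxes_to_heatmap([(40, 30, 20, 24)], (64, 64))
h2 = boxes_to_heatmap([(42, 31, 20, 24)], (64, 64))
penalty = temporal_consistency(h1, h2)
```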


Neural Information Processing Systems | 2017

Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis

Jian Zhao; Lin Xiong; Jayashree Karlekar; Jianshu Li; Fang Zhao; Zhecan Wang; Sugiri Pranata; Shengmei Shen; Shuicheng Yan; Jiashi Feng


ACM Multimedia | 2015

Deep Face Beautification

Jianshu Li; Chao Xiong; Luoqi Liu; Xiangbo Shu; Shuicheng Yan

Collaboration


Dive into Jianshu Li's collaborations.

Top Co-Authors

Jiashi Feng, National University of Singapore
Shuicheng Yan, National University of Singapore
Jian Zhao, National University of Singapore
Fang Zhao, National University of Singapore
Terence Sim, National University of Singapore
Yunpeng Chen, National University of Singapore
Yu Cheng, National University of Singapore