Shengmei Shen
Panasonic
Publications
Featured research published by Shengmei Shen.
IEEE Transactions on Multimedia | 2018
Jianan Li; Xiaodan Liang; Shengmei Shen; Tingfa Xu; Jiashi Feng; Shuicheng Yan
In this paper, we consider the problem of pedestrian detection in natural scenes. Intuitively, instances of pedestrians at different spatial scales may exhibit dramatically different features. Thus, large variance in instance scales, which results in undesirable large intra-category variance in features, may severely hurt the performance of modern object instance detection methods. We argue that this issue can be substantially alleviated by the divide-and-conquer philosophy. Taking pedestrian detection as an example, we illustrate how to leverage this philosophy to develop a Scale-Aware Fast R-CNN (SAF R-CNN) framework. The model introduces multiple built-in subnetworks, each of which detects pedestrians within a disjoint scale range. Outputs from all of the subnetworks are then adaptively combined, via a gate function defined over the sizes of object proposals, to generate final detection results that are shown to be robust to large variance in instance scales. Extensive evaluations on several challenging pedestrian detection datasets demonstrate the effectiveness of the proposed SAF R-CNN. In particular, our method achieves state-of-the-art performance on Caltech [P. Dollar, C. Wojek, B. Schiele, and P. Perona, “Pedestrian detection: An evaluation of the state of the art,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 4, pp. 743–761, Apr. 2012], and obtains competitive results on INRIA [N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2005, pp. 886–893], ETH [A. Ess, B. Leibe, and L. V. Gool, “Depth and appearance for mobile scene analysis,” in Proc. Int. Conf. Comput. Vis., 2007, pp. 1–8], and KITTI [A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2012, pp. 3354–3361].
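The scale-gating idea behind the fusion step is easy to sketch: weight the outputs of a small-scale and a large-scale subnetwork by a soft gate computed from the proposal height. The snippet below is a minimal illustration of such a gate; the threshold and smoothing parameter are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def scale_gate(proposal_height, threshold=120.0, beta=0.05):
    """Soft gate in [0, 1]: ~1 for small proposals, ~0 for large ones."""
    return 1.0 / (1.0 + np.exp(beta * (proposal_height - threshold)))

def combine_scores(score_small_net, score_large_net, proposal_height):
    """Adaptively fuse the two subnetworks' detection scores."""
    w = scale_gate(proposal_height)
    return w * score_small_net + (1.0 - w) * score_large_net

# A 60-px proposal leans on the small-scale subnetwork,
# a 200-px proposal on the large-scale one.
print(combine_scores(0.9, 0.4, proposal_height=60.0))   # ~0.88
print(combine_scores(0.9, 0.4, proposal_height=200.0))  # ~0.41
```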
ACM Multimedia | 2016
Jianshu Li; Jian Zhao; Fang Zhao; Hao Liu; Jing Li; Shengmei Shen; Jiashi Feng; Terence Sim
This paper describes our proposed method targeting the MSR Image Recognition Challenge MS-Celeb-1M. The challenge is to recognize one million celebrities from their face images captured in the real world. The challenge provides a large-scale dataset crawled from the Web, which contains a large number of celebrities with many images for each subject. Given a new test image, the challenge requires an identity for the image together with a confidence score. To complete the challenge, we propose a two-stage approach consisting of data cleaning and multi-view deep representation learning. The data cleaning effectively reduces the noise level of the training data and thus improves the performance of deep learning based face recognition models. The multi-view representation learning makes the learned face representations more specific and discriminative, substantially easing the difficulty of recognizing faces among a huge number of subjects. Our proposed method achieves a coverage of 46.1% at 95% precision on the random set and a coverage of 33.0% at 95% precision on the hard set of this challenge.
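Coverage at a fixed precision, the metric reported above, can be computed by sweeping the confidence threshold: rank predictions by confidence, find the largest answered prefix whose precision stays at or above the target, and report the answered fraction. A minimal sketch with toy data, not the challenge's official evaluation code:

```python
import numpy as np

def coverage_at_precision(confidences, correct, precision_target=0.95):
    """Fraction of queries answered at the loosest confidence threshold
    whose running precision still meets precision_target."""
    order = np.argsort(-np.asarray(confidences))      # most confident first
    hits = np.asarray(correct, dtype=float)[order]
    precision = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    ok = np.where(precision >= precision_target)[0]
    return 0.0 if len(ok) == 0 else (ok[-1] + 1) / len(hits)

conf = [0.99, 0.97, 0.90, 0.80, 0.60]
corr = [1, 1, 1, 0, 1]
print(coverage_at_precision(conf, corr))  # 0.6: top three answered at 100% precision
```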
International Joint Conference on Artificial Intelligence | 2018
Jian Zhao; Lin Xiong; Yu Cheng; Yi Cheng; Jianshu Li; Li Zhou; Yan Xu; Jayashree Karlekar; Sugiri Pranata; Shengmei Shen; Junliang Xing; Shuicheng Yan; Jiashi Feng
Learning from synthetic faces, though appealing for its high data efficiency, may not bring satisfactory performance due to the distribution discrepancy between synthetic and real face images. To mitigate this gap, we propose a 3D-Aided Deep Pose-Invariant Face Recognition Model (3D-PIM), which automatically recovers realistic frontal faces from arbitrary poses through a 3D face model in a novel way. Specifically, 3D-PIM incorporates a simulator with the aid of a 3D Morphable Model (3DMM) to obtain shape and appearance priors that accelerate face normalization learning and reduce the required training data. It further leverages a global-local Generative Adversarial Network (GAN) with multiple critical improvements as a refiner, enhancing the realism of both global structures and local details of the face simulator’s output using only unlabelled real data, while preserving identity information. Qualitative and quantitative experiments on both controlled and in-the-wild benchmarks clearly demonstrate the superiority of the proposed model over the state of the art.
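At inference time this kind of model is a two-stage pipeline: a 3DMM-guided simulator produces a coarse frontal face, a GAN refiner restores realism, and an identity embedding keeps the subject recognizable during training. The skeleton below only illustrates that control flow; every module name is a hypothetical stand-in under assumed interfaces, not the authors' code.

```python
import torch
import torch.nn as nn

class PoseInvariantPipeline(nn.Module):
    """Skeleton of a simulator + refiner frontalization pipeline.
    `simulator`, `refiner`, and `id_encoder` are assumed interfaces."""

    def __init__(self, simulator, refiner, id_encoder):
        super().__init__()
        self.simulator = simulator    # 3DMM-guided coarse frontalization
        self.refiner = refiner        # GAN generator refining realism
        self.id_encoder = id_encoder  # frozen face-recognition embedder

    def forward(self, profile_face):
        coarse = self.simulator(profile_face)   # shape/appearance prior
        refined = self.refiner(coarse)          # global + local realism
        return refined

    def identity_loss(self, profile_face, refined):
        # Cosine distance between embeddings, encouraging the refiner
        # to preserve who the person is, not just how real the face looks.
        a = self.id_encoder(profile_face)
        b = self.id_encoder(refined)
        return 1.0 - nn.functional.cosine_similarity(a, b, dim=1).mean()
```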
Pacific Rim Conference on Multimedia | 2017
Keke Liu; Yazhou Liu; Quansen Sun; Sugiri Pranata; Shengmei Shen
Driver head analysis is of paramount interest for advanced driver assistance systems (ADAS). Most recently proposed methods, especially deep learning ones, rely on training with labeled samples. However, the labeling process is a subjective and tiresome manual task, and in the driver-assistance setting the training data itself is difficult to capture. In this paper, we present a rendering pipeline that uses computer 3D animation software to synthesize an annotated virtual-world dataset of driver head poses and facial landmarks, varying the driver’s gender, dress, hairstyle, hats, and glasses. This large labeled virtual-world dataset and a small labeled real-world dataset are first trained together with a deeply supervised transfer metric learning method. We treat this as a cross-domain task in which the labeled virtual data is the source domain and the unlabeled real-world data is the target domain. By exploiting the feature self-learning characteristic of deep networks, we find a common feature subspace between the two domains and transfer discriminative knowledge from the labeled source domain to the target domain. Finally, we employ a small amount of real-world data to fine-tune the model iteratively. Our experiments show high accuracy on real-world driver head images.
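The cross-domain step can be sketched generically: pass labeled virtual batches and unlabeled real batches through a shared feature network, supervise the virtual branch with its labels, and add a distribution-discrepancy penalty so both domains land in a common feature subspace. The sketch below uses a simple linear-kernel maximum mean discrepancy (MMD) term as the alignment loss; it is a generic illustration of domain alignment, not the paper's deeply supervised transfer metric learning objective.

```python
import torch
import torch.nn as nn

def mmd_linear(f_src, f_tgt):
    """Linear-kernel MMD between source and target feature batches."""
    delta = f_src.mean(dim=0) - f_tgt.mean(dim=0)
    return (delta * delta).sum()

def train_step(feature_net, head, opt, x_virtual, y_virtual, x_real, lam=0.1):
    """One adaptation step: supervised loss on labeled virtual data
    plus an alignment penalty pulling real features into the same space."""
    opt.zero_grad()
    f_v = feature_net(x_virtual)      # labeled source (virtual) features
    f_r = feature_net(x_real)         # unlabeled target (real) features
    loss = nn.functional.cross_entropy(head(f_v), y_virtual) \
        + lam * mmd_linear(f_v, f_r)
    loss.backward()
    opt.step()
    return loss.item()
```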
arXiv: Computer Vision and Pattern Recognition | 2017
Lin Xiong; Jayashree Karlekar; Jian Zhao; Jiashi Feng; Sugiri Pranata; Shengmei Shen
International Conference on Computer Vision | 2017
Yu Cheng; Jian Zhao; Zhecan Wang; Yan Xu; Karlekar Jayashree; Shengmei Shen; Jiashi Feng
Computer Vision and Pattern Recognition | 2018
Jian Zhao; Yu Cheng; Yan Xu; Lin Xiong; Jianshu Li; Fang Zhao; Karlekar Jayashree; Sugiri Pranata; Shengmei Shen; Junliang Xing; Shuicheng Yan; Jiashi Feng
Computer Vision and Pattern Recognition | 2018
Yan Xu; Xi Ouyang; Yu Cheng; Shining Yu; Lin Xiong; Choon-Ching Ng; Sugiri Pranata; Shengmei Shen; Junliang Xing
arXiv: Computer Vision and Pattern Recognition | 2018
Jian Zhao; Yu Cheng; Yi Cheng; Yang Yang; Haochong Lan; Fang Zhao; Lin Xiong; Yan Xu; Jianshu Li; Sugiri Pranata; Shengmei Shen; Junliang Xing; Hengzhu Liu; Shuicheng Yan; Jiashi Feng
arXiv: Computer Vision and Pattern Recognition | 2018
Jubin Johnson; Shunsuke Yasugi; Yoichi Sugino; Sugiri Pranata; Shengmei Shen