Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Shihong Lao is active.

Publication


Featured research published by Shihong Lao.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2007

High-Performance Rotation Invariant Multiview Face Detection

Chang Huang; Haizhou Ai; Yuan Li; Shihong Lao

Rotation invariant multiview face detection (MVFD) aims to detect faces with arbitrary rotation-in-plane (RIP) and rotation-off-plane (ROP) angles in still images or video sequences. MVFD is crucial as the first step in automatic face processing for general applications since face images are seldom upright and frontal unless they are taken cooperatively. In this paper, we propose a series of innovative methods to construct a high-performance rotation invariant multiview face detector, including the width-first-search (WFS) tree detector structure, the vector boosting algorithm for learning vector-output strong classifiers, the domain-partition-based weak learning method, the sparse feature in granular space, and the heuristic search for sparse feature selection. As a result, our multiview face detector achieves low computational complexity, a broad detection scope, and high detection accuracy on both standard test sets and real-life images.
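To make the width-first-search (WFS) tree structure concrete, here is a minimal sketch of how such a view-partition tree can be traversed: each branching node emits one confidence per child (standing in for a vector-output strong classifier), and a window keeps following every branch whose confidence is positive. The tree, the stub predictors and all thresholds below are hypothetical; this is not the paper's trained detector.

```python
from dataclasses import dataclass, field
from typing import Callable, List
import numpy as np

@dataclass
class Node:
    """A branching node of the view-partition tree. `predict` returns one
    confidence per child (a vector-output strong classifier in the paper);
    a child stays active only if its confidence is positive."""
    predict: Callable[[np.ndarray], np.ndarray]
    children: List["Node"] = field(default_factory=list)
    label: str = ""

def wfs_detect(root: Node, window: np.ndarray) -> List[str]:
    """Width-first search over the tree: all surviving branches of one level
    are evaluated before descending, so a window can follow several views."""
    active, accepted = [root], []
    while active:
        next_level = []
        for node in active:
            if not node.children:            # leaf = a concrete face view accepted
                accepted.append(node.label)
                continue
            conf = node.predict(window)
            for c, child in zip(conf, node.children):
                if c > 0:
                    next_level.append(child)
        active = next_level
    return accepted

# Toy tree: the root splits frontal vs. profile; stub lambdas stand in for the
# learned vector-boosting predictors (purely illustrative).
leaf_f = Node(predict=None, label="frontal")
leaf_p = Node(predict=None, label="profile")
root = Node(predict=lambda w: np.array([w.mean() - 0.4, 0.6 - w.mean()]),
            children=[leaf_f, leaf_p])
print(wfs_detect(root, np.full((24, 24), 0.7)))   # ['frontal']
```

Evaluating all surviving branches level by level is what lets a single window be tested against several views without committing to one of them early.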


IEEE International Conference on Automatic Face and Gesture Recognition | 2004

Fast rotation invariant multi-view face detection based on real Adaboost

Bo Wu; Haizhou Ai; Chang Huang; Shihong Lao

In this paper, we propose a rotation invariant multi-view face detection method based on the Real AdaBoost algorithm. Human faces are divided into several categories according to their varying appearance from different viewpoints. For each view category, weak classifiers are configured as confidence-rated look-up tables (LUTs) of Haar features. The Real AdaBoost algorithm is used to boost these weak classifiers and construct a nesting-structured face detector. To make it rotation invariant, we divide the whole 360-degree range into 12 sub-ranges and construct their corresponding view-based detectors separately. To improve performance, a pose estimation method is introduced, resulting in a processing speed of four frames per second on 320×240 images. Experiments on faces with 360-degree in-plane rotation and ±90-degree out-of-plane rotation are reported, in which the frontal face detector subsystem retrieves 94.5% of the faces with 57 false alarms on the CMU+MIT frontal face test set and the multi-view face detector subsystem retrieves 89.8% of the faces with 221 false alarms on the CMU profile face test set.
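The confidence-rated look-up-table (LUT) weak classifier mentioned above can be sketched in a few lines. The following is a minimal, hypothetical Real AdaBoost round on scalar feature values (think of a single Haar-like feature response); the equal-width binning, bin count and smoothing constant are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def train_lut_weak_classifier(feature_vals, labels, weights, n_bins=8, eps=1e-6):
    """One Real AdaBoost round: a confidence-rated look-up table over feature bins.
    labels are +1 / -1; weights are the current boosting weights (sum to 1)."""
    # Partition the feature range into equal-width bins (domain partition).
    edges = np.linspace(feature_vals.min(), feature_vals.max(), n_bins + 1)
    bins = np.clip(np.digitize(feature_vals, edges[1:-1]), 0, n_bins - 1)

    # Weighted class mass per bin.
    w_pos = np.array([weights[(bins == b) & (labels == +1)].sum() for b in range(n_bins)])
    w_neg = np.array([weights[(bins == b) & (labels == -1)].sum() for b in range(n_bins)])

    # Confidence-rated output per bin (Schapire and Singer's real-valued hypothesis).
    lut = 0.5 * np.log((w_pos + eps) / (w_neg + eps))
    return edges, lut

def predict(feature_vals, edges, lut):
    bins = np.clip(np.digitize(feature_vals, edges[1:-1]), 0, len(lut) - 1)
    return lut[bins]

# Toy usage: positives have larger feature responses than negatives.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(2.0, 1.0, 200), rng.normal(0.0, 1.0, 200)])
y = np.concatenate([np.ones(200), -np.ones(200)])
w = np.full(400, 1.0 / 400)

edges, lut = train_lut_weak_classifier(x, y, w)
h = predict(x, edges, lut)
w = w * np.exp(-y * h)          # Real AdaBoost weight update
w /= w.sum()
print("training error of the single LUT weak classifier:", np.mean(np.sign(h) != y))
```

Boosting many such LUT hypotheses, each on a different feature, yields the confidence-rated strong classifiers that the nesting-structured detector is built from.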


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008

Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans

Yuan Li; Haizhou Ai; Takayoshi Yamashita; Shihong Lao; Masato Kawade

Tracking objects in low frame rate video or with abrupt motion poses two main difficulties that most conventional tracking methods can hardly handle: 1) poor motion continuity and an increased search space; 2) fast appearance variation of the target and more background clutter due to the increased search space. In this paper, we address the problem from a view that integrates conventional tracking and detection, and present a temporal probabilistic combination of discriminative observers of different lifespans. Each observer is learned from a different range of samples, with different subsets of features, to achieve a varying level of discriminative power at varying cost. Efficient fusion and temporal inference are then performed by a cascade particle filter consisting of multiple stages of importance sampling. Experiments show significantly improved accuracy of the proposed approach in comparison with existing tracking methods on low frame rate data with abrupt motion of both the target and the camera.
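A cascade particle filter with multiple importance-sampling stages can be illustrated on a toy 1-D state. In the sketch below, two Gaussian likelihood functions stand in for the learned discriminative observers (a loose, long-lifespan one followed by a sharp, short-lifespan one), and the wide motion noise mimics the enlarged search space at low frame rates. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def resample(particles, weights):
    """Multinomial resampling; returns equally weighted particles."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

def cascade_particle_filter_step(particles, observers, motion_std=5.0):
    """One frame of a cascade particle filter on a toy 1-D state.
    `observers` is an ordered list of likelihood functions, cheapest and most
    general first, most discriminative last; each stage reweights and resamples."""
    # Wide proposal to cope with poor motion continuity at low frame rates.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    for observe in observers:
        w = observe(particles)
        particles = resample(particles, w / w.sum())
    return particles

# Toy observers centered on a target location of 42 (unknown to the filter).
coarse = lambda x: np.exp(-0.5 * ((x - 42.0) / 10.0) ** 2) + 1e-12   # long-lifespan, loose
fine   = lambda x: np.exp(-0.5 * ((x - 42.0) / 2.0) ** 2) + 1e-12    # short-lifespan, sharp

particles = rng.uniform(0.0, 100.0, size=500)
for _ in range(10):
    particles = cascade_particle_filter_step(particles, [coarse, fine])
print("state estimate:", particles.mean())
```

Running the cheap, general observer first concentrates the particles before the expensive, discriminative observer is applied, which is what keeps the enlarged search space affordable.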


International Conference on Computer Vision | 2007

Eyeblink-based Anti-Spoofing in Face Recognition from a Generic Webcamera

Gang Pan; Lin Sun; Zhaohui Wu; Shihong Lao

We present a real-time liveness detection approach against photograph spoofing in face recognition by recognizing spontaneous eyeblinks, in a non-intrusive manner. The approach requires no extra hardware beyond a generic webcamera. Eyeblink sequences often have a complex underlying structure. We formulate blink detection as inference in an undirected conditional graphical framework, and learn compact and efficient observation and transition potentials from data. For quick and accurate recognition of the blink behavior, eye closity, an easily computed discriminative measure derived from the adaptive boosting algorithm, is developed and then smoothly embedded into the conditional model. An extensive set of experiments is presented to show the effectiveness of our approach and how it outperforms cascaded AdaBoost and HMM in the task of eyeblink detection.
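As a rough illustration of turning a per-frame closity-like score into a blink decision, the sketch below runs Viterbi decoding over a simple two-state (open/closed) chain. This is a generic temporal smoother, not the undirected conditional model of the paper, and the transition probability and toy scores are assumptions.

```python
import numpy as np

def viterbi_blink(closity, p_stay=0.9):
    """Decode an open(0)/closed(1) state sequence from per-frame 'closity'
    scores in [0, 1] using a two-state chain. A generic temporal smoother,
    not the conditional random field of the paper."""
    T = len(closity)
    # Per-frame log-likelihoods of open / closed, treating closity as P(closed).
    emis = np.stack([np.log(1.0 - closity + 1e-9), np.log(closity + 1e-9)], axis=1)
    trans = np.log(np.array([[p_stay, 1 - p_stay], [1 - p_stay, p_stay]]))
    score = np.zeros((T, 2))
    back = np.zeros((T, 2), dtype=int)
    score[0] = np.log(0.5) + emis[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + trans      # previous state x current state
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + emis[t]
    states = np.zeros(T, dtype=int)
    states[-1] = score[-1].argmax()
    for t in range(T - 2, -1, -1):
        states[t] = back[t + 1][states[t + 1]]
    return states

# A blink appears as a short run of 'closed' frames (toy scores).
closity = np.array([0.1, 0.2, 0.1, 0.8, 0.9, 0.85, 0.2, 0.1])
print(viterbi_blink(closity))   # [0 0 0 1 1 1 0 0]
```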


International Conference on Computer Vision | 2005

Vector boosting for rotation invariant multi-view face detection

Chang Huang; Haizhou Ai; Yuan Li; Shihong Lao

In this paper, we propose a novel tree-structured multi-view face detector (MVFD), which adopts a coarse-to-fine strategy to divide the entire face space into smaller and smaller subspaces. For this purpose, a newly extended boosting algorithm named vector boosting is developed to train the predictors for the branching nodes of the tree, which have multi-component outputs as vectors. Our MVFD covers a large range of the face space, namely ±45° rotation in plane (RIP) and ±90° rotation off plane (ROP), and achieves high accuracy and remarkable speed (about 40 ms per frame on a 320×240 video sequence) compared with previously published works. As a result, by simply rotating the detector by 90°, 180° and 270°, a rotation invariant (360° RIP) MVFD is implemented that achieves real-time performance (11 fps on a 320×240 video sequence) with high accuracy.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Multi-View Discriminant Analysis

Meina Kan; Shiguang Shan; Haihong Zhang; Shihong Lao; Xilin Chen

In many computer vision systems, the same object can be observed at varying viewpoints or even by different sensors, which brings the challenging demand of recognizing objects from distinct, even heterogeneous, views. In this work we propose a Multi-view Discriminant Analysis (MvDA) approach, which seeks a single discriminant common space for multiple views in a non-pairwise manner by jointly learning multiple view-specific linear transforms. Specifically, MvDA is formulated to jointly solve the multiple linear transforms by optimizing a generalized Rayleigh quotient, i.e., maximizing the between-class variations and minimizing the within-class variations, both intra-view and inter-view, in the common space. By reformulating this problem as a ratio trace problem, the multiple linear transforms are obtained analytically and simultaneously through generalized eigenvalue decomposition. Furthermore, inspired by the observation that different views share similar data structures, a constraint is introduced to enforce view-consistency of the multiple linear transforms. The proposed method is evaluated on three tasks: face recognition across pose, photo versus sketch face recognition, and visible light versus near-infrared face recognition, on the Multi-PIE, CUFSF and HFB databases respectively. Extensive experiments show that MvDA achieves significant improvements over the best known results.
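The ratio-trace step at the heart of MvDA can be reproduced with a standard generalized eigensolver once each view's samples are zero-padded into a stacked space, so that the stacked projection applied to an embedded sample equals the corresponding view-specific projection. The sketch below follows that construction on hypothetical toy data; it omits the view-consistency constraint and any efficiency considerations, so treat it as a compact reconstruction rather than the reference implementation.

```python
import numpy as np
from scipy.linalg import eigh

def mvda(views, labels_per_view, out_dim=2, reg=1e-6):
    """A compact sketch of Multi-view Discriminant Analysis (MvDA).
    Each view-j sample is embedded into the concatenated space by zero-padding,
    so the stacked transform W = [W_1; ...; W_v] applied to the embedding equals
    the view-specific projection W_j^T x. Ordinary between-/within-class scatters
    in this embedded space then give the joint scatters, and the ratio-trace
    problem is solved by a generalized eigendecomposition."""
    dims = [V.shape[1] for V in views]
    offsets = np.cumsum([0] + dims)
    D = offsets[-1]

    X, y = [], []
    for j, (V, lab) in enumerate(zip(views, labels_per_view)):
        E = np.zeros((V.shape[0], D))
        E[:, offsets[j]:offsets[j + 1]] = V      # zero-padded embedding of view j
        X.append(E)
        y.append(np.asarray(lab))
    X, y = np.vstack(X), np.concatenate(y)

    mean_all = X.mean(axis=0)
    Sw = np.zeros((D, D))
    Sb = np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                     # class mean pooled over all views
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)

    # Generalized eigenproblem Sb w = lambda Sw w (ratio-trace solution).
    evals, evecs = eigh(Sb, Sw + reg * np.eye(D))
    W = evecs[:, np.argsort(evals)[::-1][:out_dim]]
    # Split the stacked solution back into the view-specific transforms W_j.
    return [W[offsets[j]:offsets[j + 1]] for j in range(len(views))]

# Toy usage with two 3-D views of a 2-class problem (hypothetical data).
rng = np.random.default_rng(0)
v1 = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(3, 1, (20, 3))])
v2 = np.vstack([rng.normal(1, 1, (20, 3)), rng.normal(4, 1, (20, 3))])
labels = [0] * 20 + [1] * 20
W1, W2 = mvda([v1, v2], [labels, labels], out_dim=1)
print(W1.shape, W2.shape)        # (3, 1) (3, 1)
```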


Computer Vision and Pattern Recognition | 2009

Multi-object tracking through occlusions by local tracklets filtering and global tracklets association with detection responses

Junliang Xing; Haizhou Ai; Shihong Lao

This paper presents an online detection-based two-stage multi-object tracking method for dense visual surveillance scenarios with a single camera. In the local stage, a particle filter with observer selection that can deal with partial object occlusion is used to generate a set of reliable tracklets. In the global stage, detection responses are collected from a temporal sliding window to deal with the ambiguity caused by full object occlusion and to generate a set of potential tracklets. The reliable tracklets generated in the local stage and the potential tracklets generated within the temporal sliding window are associated by the Hungarian algorithm on a modified pairwise tracklet association cost matrix to obtain the globally optimal association. The method is applied to the pedestrian class and evaluated on two challenging datasets. The experimental results demonstrate the effectiveness of our method.
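The global association step reduces to a min-cost bipartite assignment, for which SciPy's Hungarian-algorithm solver can be used directly. In the sketch below the cost is simply the spatial gap between tracklet endpoints; the paper's cost matrix also encodes appearance and motion cues, so this is only an illustrative stand-in with made-up coordinates and threshold.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_tracklets(reliable, potential, max_cost=50.0):
    """Associate reliable tracklets (local stage) with potential tracklets
    (sliding window) by solving a min-cost assignment. The cost here is the
    distance between the end of one tracklet and the start of the other."""
    cost = np.array([[np.linalg.norm(r[-1] - p[0]) for p in potential]
                     for r in reliable])
    rows, cols = linear_sum_assignment(cost)           # Hungarian algorithm
    # Reject assignments that are too expensive to be plausible links.
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]

# Toy tracklets: lists of 2-D positions over time (hypothetical numbers).
reliable = [np.array([[10., 10.], [12., 11.]]), np.array([[50., 50.], [52., 49.]])]
potential = [np.array([[53., 48.], [55., 47.]]), np.array([[13., 12.], [15., 13.]])]
print(associate_tracklets(reliable, potential))        # [(0, 1), (1, 0)]
```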


International Conference on Acoustics, Speech, and Signal Processing | 2007

Person-Specific SIFT Features for Face Recognition

Jun Luo; Yong Ma; Erina Takikawa; Shihong Lao; Masato Kawade; Bao-Liang Lu

The scale invariant feature transform (SIFT) proposed by Lowe has been widely and successfully applied to object detection and recognition. However, the representation ability of SIFT features in face recognition has rarely been investigated systematically. In this paper, we propose to use person-specific SIFT features and a simple non-statistical matching strategy that combines local and global similarity on keypoint clusters to solve face recognition problems. Large-scale experiments on the FERET and CAS-PEAL face databases using only one training sample per person have been carried out to compare this approach with non-person-specific features such as the Gabor wavelet feature and the local binary pattern feature. The experimental results demonstrate the robustness of SIFT features to expression, accessory and pose variations.
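For reference, a plain SIFT keypoint-matching similarity between two face crops looks like the sketch below (assuming opencv-python with SIFT available, version 4.4 or later). It uses a brute-force matcher with Lowe's ratio test and does not reproduce the person-specific keypoint clusters or the combined local-plus-global matching strategy of the paper; the file names in the usage comment are hypothetical.

```python
import cv2
import numpy as np

def sift_face_similarity(gallery_img, probe_img, ratio=0.75):
    """A generic SIFT keypoint-matching similarity between two grayscale face
    images; a baseline illustration only, not the paper's matching strategy."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(gallery_img, None)
    kp2, des2 = sift.detectAndCompute(probe_img, None)
    if des1 is None or des2 is None:
        return 0.0
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        # Lowe's ratio test keeps only distinctive matches.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    # Normalize by the number of gallery keypoints to get a rough similarity.
    return len(good) / max(len(kp1), 1)

# Hypothetical usage with two aligned face crops read from disk:
# g = cv2.imread("gallery_face.png", cv2.IMREAD_GRAYSCALE)
# p = cv2.imread("probe_face.png", cv2.IMREAD_GRAYSCALE)
# print(sift_face_similarity(g, p))
```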


Computer Vision and Pattern Recognition | 2007

Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Lifespans

Yuan Li; Haizhou Ai; Takayoshi Yamashita; Shihong Lao; Masato Kawade

Tracking objects in low frame rate video or with abrupt motion poses two main difficulties that conventional tracking methods can barely handle: 1) poor motion continuity and an increased search space; 2) fast appearance variation of the target and more background clutter due to the increased search space. In this paper, we address the problem from a view that integrates conventional tracking and detection, and present a temporal probabilistic combination of discriminative observers of different lifespans. Each observer is learned from a different range of samples, with different subsets of features, to achieve a varying level of discriminative power at varying cost. Efficient fusion and temporal inference are then performed by a cascade particle filter consisting of multiple stages of importance sampling. Experiments show significantly improved accuracy of the proposed approach in comparison with existing tracking methods on low frame rate data with abrupt motion of both the target and the camera.


Asian Conference on Computer Vision | 2012

Histogram of oriented normal vectors for object recognition with a depth sensor

Shuai Tang; Xiaoyu Wang; Xutao Lv; Tony X. Han; James M. Keller; Zhihai He; Marjorie Skubic; Shihong Lao

We propose a feature, the Histogram of Oriented Normal Vectors (HONV), designed specifically to capture local geometric characteristics for object recognition with a depth sensor. Through our derivation, the normal vector orientation, represented as an ordered pair of azimuthal angle and zenith angle, can be computed easily from the gradients of the depth image. We form the HONV as a concatenation of local histograms of azimuthal angle and zenith angle. Since the HONV is inherently the local distribution of the tangent plane orientation of an object surface, we use it as a feature for object detection and classification tasks. Object detection experiments on the standard RGB-D dataset [1] and a self-collected Chair-D dataset show that the HONV significantly outperforms traditional features such as HOG on the depth image and HOG on the intensity image, with an improvement of 11.6% in average precision. For object classification, the HONV achieved a 5.0% improvement over state-of-the-art approaches.
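Since the normal of the surface z = d(x, y) is proportional to (-dd/dx, -dd/dy, 1), the azimuth and zenith angles fall out of the depth gradients, and the descriptor is a concatenation of per-cell 2-D histograms over those angles. The sketch below follows that recipe on a synthetic depth image; the cell size, bin counts and normalization are illustrative choices, not the paper's exact parameters.

```python
import numpy as np

def honv(depth, cell=8, azimuth_bins=8, zenith_bins=8):
    """Histogram of Oriented Normal Vectors for one depth image (a sketch).
    The normal of z = d(x, y) is (-dd/dx, -dd/dy, 1) up to scale, so azimuth
    and zenith follow directly from the depth gradients; per-cell 2-D histograms
    over (azimuth, zenith) are concatenated into the descriptor."""
    gy, gx = np.gradient(depth.astype(np.float64))
    azimuth = np.arctan2(-gy, -gx)                     # in (-pi, pi]
    zenith = np.arctan(np.hypot(gx, gy))               # angle from +z axis, in [0, pi/2)

    h, w = depth.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            az = azimuth[i:i + cell, j:j + cell].ravel()
            ze = zenith[i:i + cell, j:j + cell].ravel()
            hist, _, _ = np.histogram2d(
                az, ze, bins=[azimuth_bins, zenith_bins],
                range=[[-np.pi, np.pi], [0, np.pi / 2]])
            hist = hist.ravel()
            feats.append(hist / (hist.sum() + 1e-9))   # L1-normalize each cell
    return np.concatenate(feats)

# Toy depth image: a slanted plane plus noise (hypothetical data).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
depth = 0.5 * xx + 0.2 * yy + rng.normal(0, 0.01, (64, 64))
print(honv(depth).shape)      # 64 cells x (8 x 8 bins) = (4096,)
```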

Collaboration


Dive into Shihong Lao's collaborations.

Top Co-Authors

Shiguang Shan (Chinese Academy of Sciences)
Bo Wu (Tsinghua University)
Junliang Xing (Chinese Academy of Sciences)
Xilin Chen (Chinese Academy of Sciences)