Shaohua Kevin Zhou | Researchain

Publication

Featured research published by Shaohua Kevin Zhou.

IEEE Transactions on Image Processing | 2004

Visual tracking and recognition using appearance-adaptive models in particle filters

Shaohua Kevin Zhou; Rama Chellappa; Baback Moghaddam

We present an approach that incorporates appearance-adaptive models in a particle filter to realize robust visual tracking and recognition algorithms. Tracking requires modeling interframe motion and appearance changes, whereas recognition requires modeling appearance changes between frames and gallery images. In conventional tracking algorithms, the appearance model is either fixed or rapidly changing, and the motion model is simply a random walk with fixed noise variance. Also, the number of particles is typically fixed. All these factors make the visual tracker unstable. To stabilize the tracker, we propose the following modifications: an observation model arising from an adaptive appearance model, an adaptive velocity motion model with adaptive noise variance, and an adaptive number of particles. The adaptive-velocity model is derived using a first-order linear predictor based on the appearance difference between the incoming observation and the previous particle configuration. Occlusion analysis is implemented using robust statistics. Experimental results on tracking visual objects in long outdoor and indoor video sequences demonstrate the effectiveness and robustness of our tracking algorithm. We then perform simultaneous tracking and recognition by embedding them in a particle filter. For recognition purposes, we model the appearance changes between frames and gallery images by constructing the intra- and extrapersonal spaces. Accurate recognition is achieved even under pose and view variations.
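The predict, weight, and resample loop of a particle filter, together with a motion noise level that adapts to the prediction error, can be sketched as follows. This is a toy 1-D illustration under stated assumptions, not the authors' tracker: the names `track` and `systematic_resample`, the Gaussian observation model, and the clamping bounds on the noise are all hypothetical choices made for the example, and the number of particles is kept fixed for brevity.

```python
import math
import random


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples with probability proportional to weights."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step
    out, cum, i = [], weights[0], 0
    for _ in range(n):
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
        u += step
    return out


def track(observations, n_particles=200, obs_sigma=0.5, seed=0):
    """Bootstrap particle filter over a 1-D position.

    The motion noise is reset each frame from the latest prediction error,
    a toy stand-in (assumption) for an adaptive noise variance.
    """
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimate, velocity, motion_sigma = observations[0], 0.0, 1.0
    estimates = [estimate]
    for z in observations[1:]:
        # Predict: shift by the estimated velocity, then diffuse.
        particles = [p + velocity + rng.gauss(0.0, motion_sigma) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - p) / obs_sigma) ** 2) + 1e-300 for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        new_estimate = sum(w * p for w, p in zip(weights, particles))
        # Adapt: a larger prediction error widens the search in the next frame.
        error = abs(z - (estimate + velocity))
        motion_sigma = min(2.0, max(0.1, error))
        velocity = new_estimate - estimate
        estimate = new_estimate
        estimates.append(estimate)
        particles = systematic_resample(particles, weights, rng)
    return estimates
```

Calling `track([0.5 * t for t in range(30)])` returns one position estimate per frame. In a real tracker the scalar position would be replaced by a motion parameter vector and the Gaussian likelihood by an appearance-based observation model.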

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

From sample similarity to ensemble similarity: probabilistic distance measures in reproducing kernel Hilbert space

Shaohua Kevin Zhou; Rama Chellappa

This paper addresses the problem of characterizing ensemble similarity from sample similarity in a principled manner. Using a reproducing kernel as a characterization of sample similarity, we suggest a probabilistic distance measure in the reproducing kernel Hilbert space (RKHS) as the ensemble similarity. Assuming normality in the RKHS, we derive analytic expressions for probabilistic distance measures that are commonly used in many applications, such as the Chernoff distance (with the Bhattacharyya distance as a special case) and the Kullback-Leibler divergence. Since the reproducing kernel implicitly embeds a nonlinear mapping, our approach presents a new way to study these distances, whose feasibility and efficiency are demonstrated using experiments with synthetic and real examples. Further, we extend the ensemble similarity to the reproducing kernel for ensemble and study the ensemble similarity for more general data representations.
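For orientation, the standard input-space forms of these distances between two Gaussians $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$ in $d$ dimensions are given below. They are textbook expressions, included here as a reference point only; the kernelized RKHS counterparts derived in the paper are not reproduced. The symbols $\alpha_1, \alpha_2$ (with $\alpha_1 + \alpha_2 = 1$) are notation assumed for this sketch.

```latex
% Chernoff distance, with \alpha_1 + \alpha_2 = 1 and \alpha_1, \alpha_2 > 0
J_C = \tfrac{1}{2}\,\alpha_1 \alpha_2\,
      (\mu_1 - \mu_2)^{\top}
      \left[\alpha_1 \Sigma_1 + \alpha_2 \Sigma_2\right]^{-1}
      (\mu_1 - \mu_2)
    + \tfrac{1}{2} \ln
      \frac{\left|\alpha_1 \Sigma_1 + \alpha_2 \Sigma_2\right|}
           {\left|\Sigma_1\right|^{\alpha_1} \left|\Sigma_2\right|^{\alpha_2}}

% Bhattacharyya distance: the special case \alpha_1 = \alpha_2 = 1/2
J_B = \tfrac{1}{8}\,
      (\mu_1 - \mu_2)^{\top}
      \left[\tfrac{1}{2}(\Sigma_1 + \Sigma_2)\right]^{-1}
      (\mu_1 - \mu_2)
    + \tfrac{1}{2} \ln
      \frac{\left|\tfrac{1}{2}(\Sigma_1 + \Sigma_2)\right|}
           {\left|\Sigma_1\right|^{1/2} \left|\Sigma_2\right|^{1/2}}

% Kullback-Leibler divergence of N(\mu_1, \Sigma_1) from N(\mu_2, \Sigma_2)
J_{KL} = \tfrac{1}{2} \left[
      \operatorname{tr}\!\left(\Sigma_2^{-1} \Sigma_1\right)
    + (\mu_2 - \mu_1)^{\top} \Sigma_2^{-1} (\mu_2 - \mu_1)
    - d
    + \ln \frac{\left|\Sigma_2\right|}{\left|\Sigma_1\right|}
    \right]
```

Setting $\mu_1 = \mu_2$ and $\Sigma_1 = \Sigma_2$ makes all three expressions vanish, which is a quick sanity check on the formulas.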

Computer Vision and Pattern Recognition | 2008

Hierarchical, learning-based automatic liver segmentation

Haibin Ling; Shaohua Kevin Zhou; Yefeng Zheng; Bogdan Georgescu; Michael Suehling; Dorin Comaniciu

In this paper we present a hierarchical, learning-based approach for automatic and accurate liver segmentation from 3D CT volumes. We target CT volumes that come from largely diverse sources (e.g., diseased in six different organs) and are generated by different scanning protocols (e.g., contrast and non-contrast, various resolutions and positions). Three key ingredients are combined to solve the segmentation problem. First, a hierarchical framework is used to efficiently and effectively monitor the accuracy propagation in a coarse-to-fine fashion. Second, two new learning techniques, marginal space learning and steerable features, are applied for robust boundary inference. This enables handling of highly heterogeneous texture patterns. Third, a novel shape space initialization is proposed to improve on traditional methods that are limited to similarity transformation. The proposed approach is tested on a challenging dataset containing 174 volumes. Our approach not only produces excellent segmentation accuracy, but also runs about fifty times faster than state-of-the-art solutions [7, 9].

International Conference on Computer Vision | 2005

Image based regression using boosting method

Shaohua Kevin Zhou; Bogdan Georgescu; Xiang Sean Zhou; Dorin Comaniciu

We present a general algorithm for image-based regression that is applicable to many vision problems. The proposed regressor, which targets a multiple-output setting, is learned using a boosting method. We formulate a multiple-output regression problem in such a way that overfitting is decreased and an analytic solution is admitted. Because we represent the image via a set of highly redundant Haar-like features that can be evaluated very quickly and select relevant features through boosting to absorb the knowledge of the training data, during testing we require no storage of the training data and evaluate the regression function in almost no time. We also propose an efficient training algorithm that breaks the computational bottleneck in the greedy feature selection process. We validate the efficiency of the proposed regressor using three challenging tasks of age estimation, tumor detection, and endocardial wall localization and achieve the best performance with a dramatic speedup, e.g., more than 1000 times faster than conventional data-driven techniques such as the support vector regressor in the experiment of endocardial wall localization.
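Haar-like features can be evaluated quickly because any rectangle sum reduces to four lookups in an integral image (summed-area table). The sketch below shows that mechanism in plain Python. The function names and the particular two-rectangle feature are illustrative assumptions, not the feature set or code used in the paper.

```python
def integral_image(img):
    """Return a (H+1) x (W+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left-half sum minus right-half sum; width is assumed to be even."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

The table is built once per image in time linear in the number of pixels, after which each feature costs a constant number of additions regardless of its size.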

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2007

Appearance Characterization of Linear Lambertian Objects, Generalized Photometric Stereo, and Illumination-Invariant Face Recognition

Shaohua Kevin Zhou; Gaurav Aggarwal; Rama Chellappa; David W. Jacobs

Traditional photometric stereo algorithms employ a Lambertian reflectance model with a varying albedo field and involve the appearance of only one object. In this paper, we generalize photometric stereo algorithms to handle all appearances of all objects in a class, in particular the human face class, by making use of the linear Lambertian property. A linear Lambertian object is one which is linearly spanned by a set of basis objects and has a Lambertian surface. The linear property leads to a rank constraint and, consequently, a factorization of an observation matrix that consists of exemplar images of different objects (e.g., faces of different subjects) under different, unknown illuminations. Integrability and symmetry constraints are used to fully recover the subspace bases using a novel linearized algorithm that takes the varying albedo field into account. The effectiveness of the linear Lambertian property is further investigated by using it for the problem of illumination-invariant face recognition using just one image. Attached shadows are incorporated in the model by a careful treatment of the inherent nonlinearity in Lambert's law. This enables us to extend our algorithm to perform face recognition in the presence of multiple illumination sources. Experimental results using standard data sets are presented.

European Conference on Computer Vision | 2002

Probabilistic Human Recognition from Video

Shaohua Kevin Zhou; Rama Chellappa

This paper presents a method for incorporating temporal information in a video sequence for the task of human recognition. A time series state space model, parameterized by a tracking state vector and a recognizing identity variable, is proposed to simultaneously characterize the kinematics and identity. Two sequential importance sampling (SIS) methods, a brute-force version and an efficient version, are developed to provide numerical solutions to the model. The joint distribution of both state vector and identity variable is estimated at each time instant and then propagated to the next time instant. Marginalization over the state vector yields a robust estimate of the posterior distribution of the identity variable. Due to the propagation of identity and kinematics, a degeneracy in posterior probability of the identity variable is achieved to give improved recognition. This evolving behavior is characterized using changes in entropy. The effectiveness of this approach is illustrated using experimental results on low resolution face data and upper body data.
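The way identity evidence accumulates over frames, and how entropy summarizes that evolution, can be illustrated with a discrete Bayesian update. This is a minimal sketch that assumes the per-frame likelihood of each identity has already been marginalized over the tracking state; it does not implement the sequential importance sampling described above, and the function names are hypothetical.

```python
import math


def update_identity_posterior(prior, likelihoods):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("all identities have zero probability")
    return [v / total for v in unnormalized]


def entropy(dist):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def run(frame_likelihoods, n_identities):
    """Propagate a uniform prior through the frames; return posteriors and entropies."""
    posterior = [1.0 / n_identities] * n_identities
    posteriors, entropies = [posterior], [entropy(posterior)]
    for likelihoods in frame_likelihoods:
        posterior = update_identity_posterior(posterior, likelihoods)
        posteriors.append(posterior)
        entropies.append(entropy(posterior))
    return posteriors, entropies
```

When the frames consistently favor one identity, the posterior mass concentrates on it and the entropy falls toward zero, which is the degeneracy the abstract refers to.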

Information Processing in Medical Imaging | 2007

Shape regression machine

Shaohua Kevin Zhou; Dorin Comaniciu

We present a machine learning approach called shape regression machine (SRM) for segmenting, in real time, an anatomic structure that manifests a deformable shape in a medical image. Traditional shape segmentation methods rely on various assumptions. For instance, the deformable model assumes that edges define the shape; the Mumford-Shah variational method assumes that the regions inside/outside the (closed) contour are homogeneous in intensity; and the active appearance model assumes that shape/appearance variations are linear. In addition, they all need a good initialization. In contrast, SRM poses no such restrictions. It is a two-stage approach that leverages (a) the underlying medical context that defines the anatomic structure and (b) an annotated database that exemplifies the shape and appearance variations of the anatomy. In the first stage, it solves the initialization problem as object detection and derives a regression solution that needs just one scan in principle. In the second stage, it learns a nonlinear regressor that predicts the nonrigid shape from image appearance. We also propose a boosting regression approach that supports real-time segmentation. We demonstrate the effectiveness of SRM using experiments on segmenting the left ventricle endocardium from an echocardiogram of an apical four-chamber view.

Computer Vision and Pattern Recognition | 2009

Learning multi-modal densities on Discriminative Temporal Interaction Manifold for group activity recognition

Ruonan Li; Rama Chellappa; Shaohua Kevin Zhou

While video-based activity analysis and recognition has received much attention, the existing body of work mostly deals with the single-object/person case. Coordinated multi-object activities, or group activities, present in a variety of applications such as surveillance, sports, and biological monitoring records, are the main focus of this paper. Unlike earlier attempts which model the complex spatial temporal constraints among multiple objects with a parametric Bayesian network, we propose a Discriminative Temporal Interaction Manifold (DTIM) framework as a data-driven strategy to characterize the group motion pattern without employing specific domain knowledge. In particular, we establish probability densities on the DTIM, whose element, the discriminative temporal interaction matrix, compactly describes the coordination and interaction among multiple objects in a group activity. For each class of group activity we learn a multi-modal density function on the DTIM. A Maximum a Posteriori (MAP) classifier on the manifold is then designed for recognizing new activities. Experiments on football play recognition demonstrate the effectiveness of the approach.

Computer Vision and Pattern Recognition | 2007

Joint Real-time Object Detection and Pose Estimation Using Probabilistic Boosting Network

Jingdan Zhang; Shaohua Kevin Zhou; Leonard McMillan; Dorin Comaniciu

In this paper, we present a learning procedure called probabilistic boosting network (PBN) for joint real-time object detection and pose estimation. Grounded in the law of total probability, PBN integrates evidence from two building blocks, namely a multiclass boosting classifier for pose estimation and a boosted detection cascade for object detection. By inferring the pose parameter, we avoid the exhaustive scanning for the pose, which hampers real-time performance. In addition, we only need one integral image/volume with no need of image/volume rotation. We implement PBN using a graph-structured network that alternates the two tasks of foreground/background discrimination and pose estimation for rejecting negatives as quickly as possible. Compared with previous approaches, we gain accuracy in object localization and pose estimation while noticeably reducing the computation. We invoke PBN to detect the left ventricle from a 3D ultrasound volume, processing about 10 volumes per second, and the left atrium from 2D images in real time.
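The law of total probability combines the two building blocks by summing, over discrete poses, the pose posterior times the pose-specific detection probability. The sketch below shows only that combination step with made-up numbers; the function name, the dictionary representation, and the pose labels are assumptions for illustration, and no boosting or cascade is implemented.

```python
def object_probability(pose_posterior, detection_given_pose):
    """P(object | image) = sum over poses of P(pose | image) * P(object | pose, image).

    Both arguments are dicts keyed by the same discrete pose labels.
    """
    if set(pose_posterior) != set(detection_given_pose):
        raise ValueError("pose labels must match")
    if abs(sum(pose_posterior.values()) - 1.0) > 1e-9:
        raise ValueError("pose posterior must sum to 1")
    return sum(pose_posterior[k] * detection_given_pose[k] for k in pose_posterior)


# Hypothetical outputs of a pose classifier and of per-pose detectors.
pose_posterior = {"-30deg": 0.1, "0deg": 0.7, "+30deg": 0.2}
detection_given_pose = {"-30deg": 0.2, "0deg": 0.9, "+30deg": 0.4}
score = object_probability(pose_posterior, detection_given_pose)
```

Because poses with negligible posterior contribute almost nothing to the sum, a practical system can skip their detectors, which is one way to read the abstract's point about avoiding exhaustive pose scanning.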

International Conference on Acoustics, Speech, and Signal Processing | 2004

Robust two-camera tracking using homography

Zhanfeng Yue; Shaohua Kevin Zhou; Rama Chellappa

The paper introduces a two-view tracking method which uses the homography relation between the two views to handle occlusions. An adaptive appearance-based model is incorporated in a particle filter to realize robust visual tracking. Occlusion is detected using robust statistics. When there is occlusion in one view, the homography from this view to the other view is estimated from previous tracking results and used to infer the correct transformation for the occluded view. Experimental results show the robustness of the two-view tracker.
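Transferring a tracked location from one view to the other through a homography amounts to a 3x3 matrix product in homogeneous coordinates followed by a division. The sketch below assumes the matrix `H` is already known; estimating it from previous tracking results, as the abstract describes, is not shown, and the function name is hypothetical.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (u / w, v / w)


# Example: a pure translation by (5, -2) between the two views (made-up values).
H_translate = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
transferred = apply_homography(H_translate, (1.0, 1.0))
```

A homography is defined only up to scale, so multiplying every entry of `H` by the same nonzero constant leaves the mapped point unchanged.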
