Guan Luo
Chinese Academy of Sciences
Publications
Featured research published by Guan Luo.
International Conference on Computer Vision | 2007
Xi Li; Weiming Hu; Zhongfei Zhang; Xiaoqin Zhang; Guan Luo
Most existing subspace analysis-based tracking algorithms use a flattened vector to represent a target, resulting in a high-dimensional data learning problem. Recently, subspace analysis has been incorporated into a multilinear framework that constructs, offline, a representation of image ensembles using high-order tensors. This substantially reduces spatio-temporal redundancies, but the computational and memory costs are high. In this paper, we present an effective online tensor subspace learning algorithm that models the appearance changes of a target by incrementally learning a low-order tensor eigenspace representation, adaptively updating the sample mean and eigenbasis. Tracking is then driven by state inference within this framework, in which a particle filter propagates sample distributions over time. A novel likelihood function, based on the tensor reconstruction error norm, measures the similarity between the test image and the learned tensor subspace model during tracking. Theoretical analysis and experimental evaluations against a state-of-the-art method demonstrate the promise and effectiveness of this algorithm.
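As a rough illustration of the reconstruction-error-based likelihood, the sketch below scores a candidate patch against a learned orthonormal eigenbasis; it uses a flattened basis rather than the paper's tensor eigenspace, and patch, mean, basis and sigma are illustrative placeholders.

```python
import numpy as np

def reconstruction_likelihood(patch, mean, basis, sigma=0.1):
    # Project the mean-subtracted candidate onto the eigenbasis (columns
    # assumed orthonormal) and turn the reconstruction error into a
    # Gaussian-shaped likelihood for the particle filter.
    x = patch.ravel() - mean
    residual = x - basis @ (basis.T @ x)
    return np.exp(-np.linalg.norm(residual) ** 2 / (2 * sigma ** 2))
```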
Asian Conference on Computer Vision | 2009
Chunfeng Yuan; Weiming Hu; Xi Li; Stephen J. Maybank; Guan Luo
This paper presents a new action recognition approach based on local spatio-temporal features. The main contributions of our approach are twofold. First, a new local spatio-temporal feature is proposed to represent the cuboids detected in video sequences. Specifically, the descriptor uses the covariance matrix to capture the self-correlation of the low-level features within each cuboid. Since covariance matrices do not lie in a Euclidean space, the Log-Euclidean Riemannian metric is used to measure the distance between them. Second, the Earth Mover's Distance (EMD) is used to match any pair of video sequences. In contrast to the widely used Euclidean distance, EMD is more robust when matching histograms/distributions of different sizes. Experimental results on two datasets demonstrate the effectiveness of the proposed approach.
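A minimal sketch of the two distance components described above, assuming each cuboid yields an (n_points x d) matrix of low-level features; this is an illustration, not the authors' implementation.

```python
import numpy as np

def covariance_descriptor(features):
    # features: (n_points, d) low-level feature vectors sampled in one cuboid
    return np.cov(features, rowvar=False)

def matrix_log(C):
    # Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition
    w, V = np.linalg.eigh(C)
    return V @ np.diag(np.log(np.maximum(w, 1e-12))) @ V.T

def log_euclidean_distance(C1, C2):
    # Log-Euclidean Riemannian distance between two covariance descriptors
    return np.linalg.norm(matrix_log(C1) - matrix_log(C2), ord='fro')
```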
International Conference on Image Processing | 2007
Qingdi Wei; Weiming Hu; Xiaoqin Zhang; Guan Luo
Action recognition is one of the most active research fields in computer vision. In this paper, we propose a novel method for classifying human actions in image sequences. A human action in an image sequence can be recognized from the time-varying contour of the human body. We first extract the shape context of each contour to form the feature space. The dominant sets approach is then used for feature clustering and classification to obtain labeled sequences. Finally, we apply a smoothing algorithm to the labeled sequences to recognize human actions. The proposed dominant sets-based approach has been tested against three classical methods: K-means, mean shift, and fuzzy C-means. Experimental results demonstrate that the dominant sets-based approach achieves the best recognition performance. Moreover, our method is robust to non-rigid deformations, significant scale changes, high action irregularities, and low-quality video.
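For readers unfamiliar with dominant sets, the sketch below shows the standard replicator-dynamics iteration used to extract a dominant set from a pairwise similarity matrix; it is a generic illustration of the clustering step, not the authors' code.

```python
import numpy as np

def dominant_set(A, iters=1000, tol=1e-8):
    # A: (n, n) nonnegative pairwise similarity matrix with zero diagonal.
    # Replicator dynamics converge to a weight vector whose large entries
    # identify the members of one dominant set (one cluster).
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iters):
        x_new = x * (A @ x)
        x_new /= x_new.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```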
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014
Guan Luo; Shuang Yang; Guodong Tian; Chunfeng Yuan; Weiming Hu; Stephen J. Maybank
In this paper, we address the problem of human action recognition by combining global temporal dynamics with local visual spatio-temporal appearance features. In the global temporal dimension, we model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are not in vector form, we propose a shift-invariant distance based on subspace angles to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the chi-squared histogram distance in the bag-of-words framework. Finally, we perform classification using maximum margin distance learning, combining the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short-clip data sets, namely Weizmann, KTH, UCF Sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13, and show competitive results compared with current state-of-the-art methods.
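The local visual distance mentioned above can be illustrated with the standard chi-squared histogram distance; the fixed-weight combination shown is only a stand-in for the paper's maximum margin distance learning.

```python
import numpy as np

def chi_squared_distance(h, g, eps=1e-10):
    # Chi-squared distance between two bag-of-words histograms
    h, g = np.asarray(h, dtype=float), np.asarray(g, dtype=float)
    return 0.5 * np.sum((h - g) ** 2 / (h + g + eps))

def combined_distance(d_dynamic, d_visual, alpha=0.5):
    # Illustrative convex combination of the global LDS distance and the
    # local visual distance (the paper learns this combination instead).
    return alpha * d_dynamic + (1 - alpha) * d_visual
```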
International Conference on Acoustics, Speech, and Signal Processing | 2009
Haiqiang Zuo; Xi Li; Ou Wu; Weiming Hu; Guan Luo
Image spam is an obfuscation method that spammers invented to bypass conventional text-based spam filters more effectively. In this paper, we describe a framework for filtering image spam using Fourier-Mellin invariant features, which are robust to most kinds of image spam variation. A one-class classifier, the support vector data description (SVDD), is used to model the boundary of the image spam class in the feature space without using any information from legitimate emails. Experimental results demonstrate that our framework is effective for fighting image spam.
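As a rough stand-in for the SVDD step (with a Gaussian kernel, SVDD is closely related to the one-class SVM), the sketch below trains a one-class boundary on spam-image features only; X_spam is a hypothetical feature matrix, not the paper's Fourier-Mellin features.

```python
import numpy as np
from sklearn.svm import OneClassSVM

X_spam = np.random.rand(200, 64)      # placeholder image-spam feature vectors
clf = OneClassSVM(kernel='rbf', nu=0.05, gamma='scale').fit(X_spam)

# +1 means a sample falls inside the learned spam boundary
is_spam = clf.predict(X_spam) == 1
```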
International World Wide Web Conference | 2009
Haiqiang Zuo; Weiming Hu; Ou Wu; Yunfei Chen; Guan Luo
Image spam is an obfuscation method that spammers invented to bypass conventional text-based spam filters more effectively. In this paper, we extract local invariant features from images and apply a one-class SVM classifier that uses the pyramid match kernel to detect image spam. Experimental results demonstrate that our algorithm is effective for fighting image spam.
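The pyramid match kernel counts feature matches via histogram intersection at increasingly coarse resolutions, weighting matches found at finer levels more heavily. A simplified one-dimensional sketch (the actual kernel operates on sets of multi-dimensional local features):

```python
import numpy as np

def pyramid_match(x, y, levels=4, finest_bins=32, lo=0.0, hi=1.0):
    # x, y: 1-D arrays of scalar features in [lo, hi]
    k, prev = 0.0, 0.0
    for i in range(levels):                     # i = 0 is the finest level
        bins = max(finest_bins >> i, 1)
        hx, _ = np.histogram(x, bins=bins, range=(lo, hi))
        hy, _ = np.histogram(y, bins=bins, range=(lo, hi))
        inter = np.minimum(hx, hy).sum()        # matches found at this level
        k += (inter - prev) / (2 ** i)          # count only the new matches
        prev = inter
    return k
```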
British Machine Vision Conference | 2008
Xi Li; Weiming Hu; Zhongfei Zhang; Xiaoqin Zhang; Guan Luo
In this paper, we present a trajectory-based video retrieval framework using Dirichlet process mixture models. The contributions of this framework are fourfold. (1) We apply a Dirichlet process mixture model (DPMM) to unsupervised trajectory learning; a DPMM is a countably infinite mixture model in which the number of components grows with the data. (2) We employ a time-sensitive Dirichlet process mixture model (tDPMM) to learn the time-series characteristics of trajectories, and propose a novel likelihood estimation algorithm for the tDPMM. (3) We develop a tDPMM-based probabilistic model matching scheme, which is empirically shown to be more error-tolerant and to deliver higher retrieval accuracy than peer methods in the literature. (4) The framework is scalable and adaptive: when data from a new cluster are presented, it automatically identifies the new cluster without retraining. Theoretical analysis and experimental evaluations against state-of-the-art methods demonstrate the promise and effectiveness of the framework.
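A rough illustration of fitting a (truncated) Dirichlet process mixture to fixed-length trajectory descriptors with scikit-learn; this is a plain DPMM sketch, not the paper's time-sensitive tDPMM, and trajectories is placeholder data.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

trajectories = np.random.rand(500, 20)   # placeholder trajectory descriptors
dpmm = BayesianGaussianMixture(
    n_components=30,                     # truncation level, not the final cluster count
    weight_concentration_prior_type='dirichlet_process',
).fit(trajectories)
labels = dpmm.predict(trajectories)      # effective number of clusters is inferred
```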
Pattern Recognition | 2013
Haoran Wang; Chunfeng Yuan; Guan Luo; Weiming Hu; Changyin Sun
In this paper, we propose a novel approach to action recognition based on linear dynamical systems (LDSs). Our main contributions are twofold. First, we introduce LDSs to action recognition. LDSs describe dynamic textures that exhibit certain stationarity properties in time. We use them to model spatiotemporal patches extracted from the video sequence, because a spatiotemporal patch is closer to a linear time-invariant system than the whole sequence is. Since LDSs do not live in a Euclidean space, we adopt the kernel principal angle to measure the similarity between LDSs, and then use multiclass spectral clustering to generate the codebook for the bag-of-features representation. Second, we propose a supervised codebook pruning method that preserves the discriminative visual words and suppresses the noise in each action class: the visual words that maximize the inter-class distance and minimize the intra-class distance are selected for classification. Our approach yields state-of-the-art performance on three benchmark datasets. In particular, the experiments on the challenging UCF Sports and Feature Films datasets demonstrate the effectiveness of the proposed approach in realistic, complex scenarios.
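The LDS parameters of a spatiotemporal patch can be estimated with the usual suboptimal closed-form dynamic-texture fit; the sketch below assumes the patch is given as a matrix whose columns are vectorized frames and is only an illustration of this step.

```python
import numpy as np

def fit_lds(Y, n=5):
    # Y: (d, T) matrix, each column a vectorized frame of one patch
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                              # observation matrix
    X = np.diag(S[:n]) @ Vt[:n, :]            # latent state sequence
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])  # state transition matrix
    return A, C
```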
Asian Conference on Computer Vision | 2007
Xiaoqin Zhang; Weiming Hu; Guan Luo; Stephen J. Maybank
This paper proposes a general Kernel-Bayesian framework for object tracking. In this framework, the kernel-based mean shift algorithm is seamlessly embedded into the Bayesian framework to provide heuristic prior information for the state transition model, aiming to alleviate the heavy computational load and avoid the sample degeneracy suffered by conventional Bayesian trackers. Moreover, the tracked object is characterized by a spatially constrained MOG (Mixture of Gaussians) appearance model, which is shown to be more discriminative than the traditional MOG-based appearance model. Meanwhile, a novel selective updating technique for the appearance model is developed to accommodate changes in both appearance and illumination. Experimental results demonstrate that, compared with Bayesian and kernel-based tracking frameworks, the proposed algorithm is more efficient and effective.
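For intuition, one mean shift iteration moves a candidate centre toward the weighted mean of nearby pixels, with weights supplied by the appearance model; the sketch below is generic, and positions, weights, center and bandwidth are illustrative.

```python
import numpy as np

def mean_shift_step(positions, weights, center, bandwidth=16.0):
    # positions: (n, 2) pixel coordinates; weights: (n,) appearance scores
    d = positions - center
    k = np.exp(-np.sum(d ** 2, axis=1) / (2 * bandwidth ** 2)) * weights
    return (k[:, None] * positions).sum(axis=0) / (k.sum() + 1e-12)
```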
Asian Conference on Computer Vision | 2010
Wei Li; Bing Li; Xiaoqin Zhang; Weiming Hu; Hanzi Wang; Guan Luo
Tracking multiple objects under occlusion is a challenging task. When occlusion occurs, only the visible part of the occluded object provides reliable information for matching. Conventional algorithms must deduce the occlusion relationship to determine the visible part, which is difficult, and the mutual dependence between the occlusion relationship and the tracking results degrades tracking performance and can even lead to tracking failure. In this paper, we propose a novel framework for multi-object tracking that handles occlusion through sparse reconstruction. Matching with l1-regularized sparse reconstruction automatically focuses on the visible part of the occluded object, eliminating the need to deduce the occlusion relationship. The tracking is simplified into a joint Bayesian inference problem. We compare our algorithm with state-of-the-art algorithms, and the experimental results show its superiority over the competing algorithms.
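A minimal sketch of l1-regularized matching, assuming a candidate patch is reconstructed from a fixed template dictionary; the solver, the positivity constraint, and the names used here are illustrative choices, not the authors' formulation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_match_error(candidate, templates, alpha=0.01):
    # templates: (d, n) dictionary, columns are vectorized template patches;
    # candidate: (d,) vectorized candidate patch. The residual of the sparse
    # reconstruction serves as the matching score.
    lasso = Lasso(alpha=alpha, positive=True, fit_intercept=False, max_iter=5000)
    lasso.fit(templates, candidate)
    return np.linalg.norm(candidate - templates @ lasso.coef_)
```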