Guijin Wang
Tsinghua University
Publication
Featured research published by Guijin Wang.
Pattern Recognition Letters | 2011
Quan Miao; Guijin Wang; Chenbo Shi; Xinggang Lin; Zhiwei Ruan
We present a new object tracking scheme by employing adaptive classifiers to match the corresponding keypoints between consecutive frames. The detection of interest points is a critical step in obtaining robust local descriptions. This paper proposes an efficient feature detector based on SURF, by incrementally predicting the search space, to enhance the repeatability of the tracked interest points. Instead of computing the SURF descriptor, we construct a classifier-based descriptor using on-line boosting. With on-line learning ability based on our sample weighting mechanism, the classifier maintains its discriminative power to establish robust feature description and reliable points matching for subsequent tracking. In addition, matching candidates are validated using improved RANSAC to ensure correct updates and accurate tracking. All of these ingredients contribute measurably to improving overall tracking performance. Experimental results demonstrate the robustness and accuracy of our proposed technique.
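A minimal sketch of the match-validation stage, assuming OpenCV's standard RANSAC homography fit in place of the paper's improved RANSAC (which, like the boosted classifier descriptor, is not reproduced here); the ratio test and thresholds are illustrative choices, and SURF requires an opencv-contrib build with nonfree modules enabled.

```python
# Hedged sketch: validate SURF matches between consecutive frames with
# OpenCV's standard RANSAC (a stand-in for the paper's improved RANSAC).
import cv2
import numpy as np

def validated_matches(prev_gray, curr_gray, ratio=0.75, ransac_thresh=3.0):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # needs contrib/nonfree
    kp1, des1 = surf.detectAndCompute(prev_gray, None)
    kp2, des2 = surf.detectAndCompute(curr_gray, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = [p for p in matcher.knnMatch(des1, des2, k=2) if len(p) == 2]
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    if len(good) < 4:                    # homography needs >= 4 correspondences
        return []

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)
    if H is None:
        return []
    return [m for m, ok in zip(good, mask.ravel()) if ok]  # inliers only
```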
Pattern Recognition | 2016
Hongzhao Chen; Guijin Wang; Jing-Hao Xue; Li He
In this paper, we propose a novel two-level hierarchical framework for three-dimensional (3D) skeleton-based action recognition, in order to tackle the challenges of high intra-class variance, movement speed variability and high computational costs of action recognition. In the first level, a new part-based clustering module is proposed. In this module, we introduce a part-based five-dimensional (5D) feature vector to explore the most relevant joints of body parts in each action sequence, upon which action sequences are automatically clustered and the high intra-class variance is mitigated. In the second level, there are two modules, motion feature extraction and action graphs. In the module of motion feature extraction, we utilize the cluster-relevant joints only and present a new statistical principle to decide the time scale of motion features, to reduce computational costs and adapt to variable movement speed. In the action graphs module, we exploit these 3D skeleton-based motion features to build action graphs, and devise a new score function based on maximum-likelihood estimation for action graph-based recognition. Experiments on the Microsoft Research Action3D dataset and the University of Texas Kinect Action dataset demonstrate that our method is superior or at least comparable to other state-of-the-art methods, achieving 95.56% recognition rate on the former dataset and 95.96% on the latter one.
Highlights:
- We propose a hierarchical framework for 3D skeleton-based action recognition.
- We introduce a part-based feature vector to automatically cluster action sequences.
- We present a statistical principle to decide the time scale of motion features.
- Our method outperforms other state-of-the-art methods on the MSRAction3D dataset.
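The exact construction of the 5D part-based vector is not given in this abstract; as a hedged sketch, one plausible stand-in is a motion-energy value per body part (torso, two arms, two legs), clustered with k-means. The joint groupings and cluster count below are illustrative assumptions.

```python
# Hedged sketch of the first-level clustering: one motion-energy component
# per body part stands in for the paper's 5D part-based feature vector.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical joint-index groups for a 20-joint skeleton (indices illustrative).
PARTS = {
    "torso":     [0, 1, 2, 3],
    "left_arm":  [4, 5, 6, 7],
    "right_arm": [8, 9, 10, 11],
    "left_leg":  [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}

def part_energy_5d(seq):
    """seq: (T, 20, 3) array of joint positions; returns a 5D energy vector."""
    step = np.linalg.norm(np.diff(seq, axis=0), axis=2)   # (T-1, 20) per-joint motion
    return np.array([step[:, idx].mean() for idx in PARTS.values()])

def cluster_sequences(sequences, n_clusters=3):
    feats = np.stack([part_energy_5d(s) for s in sequences])
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
```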
international conference on image processing | 2010
Quan Miao; Guijin Wang; Xinggang Lin; Yongming Wang; Chenbo Shi; Chao Liao
Object tracking is a major technique in image processing and computer vision. In this paper, we propose a new robust feature-based tracking scheme that employs adaptive classifiers to match the detected keypoints in consecutive frames. The novelty of this paper lies in combining the design of online boosting with the invariance of local features, so that the classifier-based descriptions are formed in association with scale and rotation information. Furthermore, we introduce a sample-weighting mechanism into the online classifier update for subsequent tracking. Experimental results demonstrate the robustness and accuracy of our proposed technique.
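The specific sample-weighting mechanism is not detailed in this abstract; the sketch below shows a generic online-boosting update in the style of Oza and Russell, where a per-sample importance weight grows when learners err. The weak-learner interface (fit_one/predict) is a hypothetical stand-in.

```python
# Generic online-boosting update (Oza-Russell style), not the paper's exact
# mechanism; `weak_learners` is a hypothetical list of objects exposing
# fit_one(x, y, sample_weight) and predict(x).
import numpy as np

class OnlineBooster:
    def __init__(self, weak_learners):
        self.learners = weak_learners
        self.correct = np.ones(len(weak_learners))   # weighted correct counts
        self.wrong = np.ones(len(weak_learners))     # weighted error counts

    def update(self, x, y):
        lam = 1.0                                    # sample importance weight
        for i, h in enumerate(self.learners):
            h.fit_one(x, y, sample_weight=lam)
            if h.predict(x) == y:
                self.correct[i] += lam
                err = self.wrong[i] / (self.correct[i] + self.wrong[i])
                lam *= 1.0 / (2.0 * (1.0 - err))     # easy sample: shrink weight
            else:
                self.wrong[i] += lam
                err = self.wrong[i] / (self.correct[i] + self.wrong[i])
                lam *= 1.0 / (2.0 * err)             # hard sample: boost weight
```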
Neurocomputing | 2015
Guijin Wang; Fei Zheng; Chenbo Shi; Jing-Hao Xue; Chunxiao Liu; Li He
Face recognition in video surveillance is a challenging task, largely due to the difficulty in matching images across cameras of distinct viewpoints and illuminations. To overcome this difficulty, this paper proposes a novel method which embeds distance metric learning into set-based image matching. First we use sets of face images, rather than individual images, as the input for recognition, since sets are the more natural input in surveillance systems. We model each image set using a convex-hull space spanned by its member images and measure the dissimilarity of two sets as the distance between the closest points of their corresponding convex-hull spaces. Then we propose a set-based distance metric learning scheme to learn a feature-space mapping to a discriminative subspace. Finally we project image sets into the learned subspace and achieve face recognition by comparing the projected sets. In this way, we can adapt to the variation in viewpoints and illuminations across cameras and thereby improve face recognition in video surveillance. Experiments on the public Honda/UCSD and ChokePoint databases demonstrate the superior performance of our method to the state-of-the-art approaches.
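The closest-point distance between two convex hulls reduces to a small quadratic program over simplex weights; below is a minimal scipy sketch of that set-to-set distance, computed here in the raw feature space (in the paper, features would first be projected by the learned metric).

```python
# Sketch of the convex-hull set-to-set distance: the minimum Euclidean
# distance between any point in hull(X) and any point in hull(Y).
import numpy as np
from scipy.optimize import minimize

def convex_hull_distance(X, Y):
    """X: (n, d), Y: (m, d) rows of image features; returns hull distance."""
    n, m = len(X), len(Y)

    def objective(w):
        diff = w[:n] @ X - w[n:] @ Y         # convex combinations of each set
        return diff @ diff

    cons = [{"type": "eq", "fun": lambda w: w[:n].sum() - 1.0},
            {"type": "eq", "fun": lambda w: w[n:].sum() - 1.0}]
    w0 = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])
    res = minimize(objective, w0, bounds=[(0, 1)] * (n + m), constraints=cons)
    return float(np.sqrt(res.fun))
```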
Neurocomputing | 2015
Li He; Guijin Wang; Qingmin Liao; Jing-Hao Xue
Depth-image-based human pose estimation faces two challenges: how to extract features which are discriminative to variations in human poses and robust against noise, and how to reliably learn body joints based on their dependence structure. To tackle the first problem, we propose a novel 3D Local Shape Context feature extracted from the human body silhouette to characterise the local structure of body joints. To tackle the second problem, we incorporate a graphical model into regression forests to exploit structural constraints. Experiments demonstrate that our method can efficiently learn local body structures and localise joints. Compared with the state-of-the-art methods, our method significantly improves the accuracy of pose estimation from depth images.
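The exact binning of the 3D Local Shape Context is not specified in this abstract; as an illustrative sketch, the descriptor below histograms body points around a joint by log-radius, azimuth and elevation, with all bin counts chosen as assumptions.

```python
# Illustrative 3D shape-context-style descriptor around a joint; the bin
# layout (log-radius x azimuth x elevation) is an assumption.
import numpy as np

def shape_context_3d(points, center, r_bins=5, az_bins=6, el_bins=3, r_max=1.0):
    v = points - center                                # offsets to the joint
    r = np.linalg.norm(v, axis=1) + 1e-9
    az = np.arctan2(v[:, 1], v[:, 0])                  # azimuth in [-pi, pi]
    el = np.arcsin(np.clip(v[:, 2] / r, -1.0, 1.0))    # elevation in [-pi/2, pi/2]
    r_edges = np.logspace(-1, 0, r_bins + 1) * r_max   # log-spaced radial shells
    hist, _ = np.histogramdd(
        np.stack([r, az, el], axis=1),
        bins=(r_edges,
              np.linspace(-np.pi, np.pi, az_bins + 1),
              np.linspace(-np.pi / 2, np.pi / 2, el_bins + 1)))
    return hist.ravel() / max(len(points), 1)          # normalised histogram
```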
international conference on image processing | 2010
Chenbo Shi; Guijin Wang; Xinggang Lin; Yongming Wang; Chao Liao; Quan Miao
This paper introduces a topology-based affine invariant descriptor for maximally stable extremal regions (MSERs). The popular SIFT descriptor computes texture information on a grey-scale patch. Instead, our descriptor uses only the topology and geometric information among MSERs, so that features can be rapidly matched regardless of the texture in the image patch. Based on ellipse fitting of the detected MSERs, geometric affine invariants between ellipse pairs are extracted as the descriptors. Finally, a topology-based voting selector is designed to achieve the best correspondences. Experiments show that our descriptor is not only computationally faster than the SIFT descriptor, but also performs better under wide changes in viewing angle and nonlinear illumination. In addition, our descriptor shows good results on multi-sensor image registration.
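A minimal OpenCV sketch of the first stage: detecting MSERs and fitting ellipses, from which the pairwise affine invariants and the voting selector (omitted here) would then be computed.

```python
# Sketch: detect MSERs and fit an ellipse to each region with OpenCV.
import cv2

def mser_ellipses(gray):
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    ellipses = []
    for pts in regions:
        if len(pts) >= 5:                      # fitEllipse needs >= 5 points
            ellipses.append(cv2.fitEllipse(pts.reshape(-1, 1, 2)))
    return ellipses                            # each: ((cx, cy), (w, h), angle)
```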
IEEE Signal Processing Letters | 2015
Zhiwei Ruan; Guijin Wang; Jing-Hao Xue; Xinggang Lin
For objects with large appearance variations, it has been shown that detection performance can be effectively improved by clustering positive training instances into subcategories and learning multi-component models for the subcategories. However, it is not trivial to generate subcategories of high quality, due to the difficulty of measuring the similarity between positive instances. In this letter we propose a new weakly supervised clustering method to achieve better sub-categorization. Our method provides a more precise measurement of the similarity by aligning the positive instances through latent variables and filtering the aligned features. As a better alternative to the initialization step of the latent-SVM algorithm for learning the multi-component models, our method can lead to a superior performance gain for object detection. We demonstrate this on various real-world datasets.
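The alignment and feature-filtering steps are paper-specific; under the strong assumption that latent alignment is already available as a crop per instance, a bare-bones sub-categorization could look like the sketch below (HOG and k-means are stand-ins, grayscale images assumed).

```python
# Hedged sketch of sub-categorization: latent alignment is stubbed out as a
# provided crop per positive instance; HOG + k-means are illustrative stand-ins.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.cluster import KMeans

def subcategorize(images, latent_boxes, n_sub=3, size=(64, 64)):
    feats = []
    for img, (x0, y0, x1, y1) in zip(images, latent_boxes):
        crop = resize(img[y0:y1, x0:x1], size)   # "align" via the latent box
        feats.append(hog(crop))                  # descriptor of aligned instance
    return KMeans(n_clusters=n_sub, n_init=10).fit_predict(np.stack(feats))
```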
Neurocomputing | 2016
Li He; Guijin Wang; Qingmin Liao; Jing-Hao Xue
Prior models of human pose play a key role in state-of-the-art techniques for monocular pose estimation. However, a simple Gaussian model cannot represent well the prior knowledge of pose diversity in depth images. In this paper, we develop a latent-variable-based prior model by introducing a latent variable into the general pictorial structure. Two key characteristics of our model (which we call the Latent Variable Pictorial Structure) are as follows: (1) it adaptively adopts prior pose models based on the estimated value of the latent variable; and (2) it enables the learning of a more accurate part classifier. Experimental results demonstrate that the proposed method outperforms other state-of-the-art methods in recognition rate on the public datasets.
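A toy sketch of the latent-variable idea: keep one Gaussian pose prior per latent state and, at inference, pick the state whose prior best explains a pose hypothesis. This only illustrates characteristic (1), not the authors' full pictorial structure.

```python
# Toy illustration: adaptively select a prior pose model via a latent state.
import numpy as np
from scipy.stats import multivariate_normal

def best_latent_state(pose, priors):
    """pose: (d,) pose vector; priors: [(mean, cov), ...], one per latent state."""
    scores = [multivariate_normal.logpdf(pose, mean, cov) for mean, cov in priors]
    return int(np.argmax(scores)), float(max(scores))
```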
2013 International Conference on Optical Instruments and Technology: Optoelectronic Imaging and Processing Technology | 2013
Hongzhao Chen; Guijin Wang; Li He
In this paper, we propose a real-time action recognition algorithm based on 3D human skeleton positions provided by the depth camera. Our contributions are threefold. First, considering that skeleton positions in different actions at different times can be similar, we adopt the Naive-Bayes-Nearest-Neighbor (NBNN) method for classification. Second, to avoid confusion between different but similar actions, which would otherwise noticeably decrease the recognition rate, we present a hierarchical model that increases the recognition rate significantly. Third, for real-time application, we apply a sliding window to buffer the input, and smooth the output with a threshold on the ratio of the second-nearest distance to the nearest distance; this also allows our method to reject undefined actions. Experimental results on the Microsoft Research Action3D dataset demonstrate that our algorithm outperforms other state-of-the-art methods in both recognition rate and computing speed, increasing the recognition rate by about 10% while running at an average of 30 fps (at 640×480 resolution).
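A minimal NBNN sketch over per-window skeleton descriptors, including the second-to-first nearest-distance ratio test used to reject undefined actions; the descriptors and the threshold value are illustrative assumptions.

```python
# Minimal NBNN classifier with the distance-ratio rejection test.
import numpy as np
from scipy.spatial import cKDTree

class NBNNClassifier:
    def __init__(self, class_descriptors):
        # class_descriptors: {label: (N_c, d) array of training descriptors}
        self.trees = {c: cKDTree(d) for c, d in class_descriptors.items()}

    def classify(self, window, reject_ratio=0.8):
        """window: (n, d) descriptors buffered by the sliding window."""
        totals = {c: t.query(window)[0].sum() for c, t in self.trees.items()}
        ranked = sorted(totals, key=totals.get)
        best, second = ranked[0], ranked[1]
        if totals[best] / totals[second] > reject_ratio:
            return None                        # too ambiguous: undefined action
        return best
```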
Neurocomputing | 2014
Zhiwei Ruan; Guijin Wang; Jing-Hao Xue; Xinggang Lin; Yong Jiang
Dog face detection is an important object detection task, widely applied in many fields such as auto-focus and image retrieval. In many applications, users only care about specific target species, which are unknown to a detection system until the users register some relevant information like a limited number of target samples. We call this scenario the detection of user-registered dog faces. Due to the great variation between different dog species, no single model can describe all the species well. Meanwhile, it is also impractical to learn individual models for every potential target species that the users may care about, given the large number of dog species. Furthermore, the registered samples are usually too few to train a robust detector directly. In this context, we propose a novel user-registered object detection framework. This framework can generate an adaptive detector, from only a limited number of user-registered target samples and a couple of off-line trained auxiliary models. In addition, we build an annotated dog face dataset, which contains 10,712 images of 32 species. Experimental results on the dataset demonstrate that the proposed framework can achieve superior detection performance to the state-of-the-art approaches.