Guangming Zhu
Xidian University
Publications
Featured research published by Guangming Zhu.
Sensors | 2016
Guangming Zhu; Liang Zhang; Peiyi Shen; Juan Song
Continuous human action recognition (CHAR) is more practical for human-robot interaction than recognition of pre-segmented actions. This paper proposes an online CHAR algorithm based on skeletal data extracted from RGB-D images captured by Kinect sensors. Each human action is modeled as a sequence of key poses and atomic motions in a particular order. To extract key poses and atomic motions, feature sequences are divided into pose feature segments and motion feature segments by an online segmentation method based on the potential differences of features. In the online model matching process, the likelihood that each feature segment can be labeled as one of the extracted key poses or atomic motions is computed. Continuous human actions are then recognized by an online classification method that applies a variable-length maximum entropy Markov model (MEMM) to these likelihoods. The variable-length MEMM keeps the proposed CHAR method both effective and efficient. Unlike previously published CHAR methods, the proposed algorithm does not need to detect the start and end points of each human action in advance. Experimental results on public datasets show that the proposed algorithm recognizes continuous human actions effectively and efficiently.
International Conference on Pattern Recognition | 2016
Guangming Zhu; Liang Zhang; Lin Mei; Jie Shao; Juan Song; Peiyi Shen
Human gesture recognition is one of the central research fields of computer vision, and effective gesture recognition remains challenging. In this paper, we present a pyramidal 3D convolutional network framework for large-scale isolated human gesture recognition. 3D convolutional networks learn spatiotemporal features from gesture videos. Pyramid input is proposed to preserve the multi-scale contextual information of gestures, and each pyramid segment is uniformly sampled with temporal jitter. Pyramid fusion layers are inserted into the 3D convolutional networks to fuse the features of the pyramid input. This strategy lets the networks recognize human gestures from entire videos rather than from independently segmented clips. We present experimental results on the 2016 ChaLearn LAP Large-scale Isolated Gesture Recognition Challenge, in which we placed third.
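The abstract does not say how the pyramid is laid out, so the following is only a minimal sketch of uniform sampling with temporal jitter. It assumes that level k of the pyramid splits the video into k equal segments and that each segment contributes a fixed number of frames; the function names, the `levels` parameter, and the bin-based jitter are all hypothetical.

```python
import random

def sample_with_jitter(num_frames, clip_len, rng=None):
    """Pick clip_len frame indices spread uniformly over num_frames frames,
    each drawn at random inside its own equal-width bin (temporal jitter)."""
    rng = rng or random.Random()
    bin_width = num_frames / clip_len
    indices = []
    for i in range(clip_len):
        start = i * bin_width
        # Jitter: a random offset inside the bin rather than a fixed position.
        pos = start + rng.random() * bin_width
        indices.append(min(int(pos), num_frames - 1))
    return indices

def pyramid_indices(num_frames, clip_len, levels=(1, 2), rng=None):
    """Assumed pyramid layout: level k splits the video into k equal
    segments and samples clip_len frames from each segment."""
    rng = rng or random.Random()
    pyramid = []
    for k in levels:
        seg_len = num_frames / k
        for s in range(k):
            lo = int(s * seg_len)
            hi = int((s + 1) * seg_len)
            local = sample_with_jitter(hi - lo, clip_len, rng)
            pyramid.append([lo + j for j in local])
    return pyramid
```

Calling `pyramid_indices(64, 8)` would yield three index lists: one covering the whole video and one for each half. In a real pipeline each list would select the frames fed to one 3D convolutional branch.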
Signal Processing: Image Communication | 2016
Guangming Zhu; Liang Zhang; Peiyi Shen; Juan Song
A human action can be intuitively considered as a sequence of key poses and atomic motions in a particular order. Building on this observation, this paper proposes a human action recognition method that uses multi-layer codebooks of key poses and atomic motions. Inspired by the dynamics models of human joints, normalized relative orientations are computed as features for each limb of the human body. To extract key poses and atomic motions precisely, feature sequences are segmented dynamically into pose feature segments and motion feature segments, based on the potential differences of the feature sequences. Multi-layer codebooks of each human action are constructed from the key poses extracted from pose feature segments and the atomic motions extracted from the motion feature segments associated with each pair of key poses. The multi-layer codebooks represent the action patterns of each human action and can be used to recognize human actions with the proposed pattern-matching method. Three classification methods are employed for action recognition based on the multi-layer codebooks. Two public action datasets, CAD-60 and MSRC-12, are used to demonstrate the advantages of the proposed method. The experimental results show that the proposed method achieves performance comparable to or better than state-of-the-art methods. Human actions are modeled by a sequence of key poses and atomic motions. Normalized relative orientations are computed as features for each limb. Feature sequences are segmented into pose and motion feature segments dynamically. Multi-layer codebooks are constructed from the extracted key poses and atomic motions. A pattern-matching method is proposed and integrated with traditional classifiers.
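One plausible reading of "normalized relative orientations" is the unit vector pointing from a limb's parent joint to its child joint. The sketch below follows that assumption, which may differ from the paper's exact definition. The joint names and the `LIMBS` list are hypothetical.

```python
import math

# Hypothetical limb list: (parent joint, child joint) pairs.
LIMBS = [("shoulder", "elbow"), ("elbow", "wrist")]

def relative_orientation(parent, child):
    """Unit vector pointing from the parent joint to the child joint."""
    d = [c - p for p, c in zip(parent, child)]
    norm = math.sqrt(sum(x * x for x in d))
    if norm == 0.0:
        return [0.0, 0.0, 0.0]  # degenerate limb: the joints coincide
    return [x / norm for x in d]

def pose_features(joints, limbs=LIMBS):
    """Concatenate the unit orientation of every limb into one feature vector."""
    feats = []
    for parent, child in limbs:
        feats.extend(relative_orientation(joints[parent], joints[child]))
    return feats
```

Dividing by the limb length makes the feature independent of body size and of the subject's distance from the sensor, which is the usual motivation for this kind of normalization.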
International Conference on Image Processing | 2016
Liang Zhang; Peiyi Shen; Shu'e Zhang; Juan Song; Guangming Zhu
Depth images captured by sensors such as Kinect contain heavy noise and many holes, which severely hinders the use of depth information. In this paper, a novel depth enhancement algorithm with improved exemplar-based inpainting and joint trilateral guided filtering is proposed. The improved exemplar-based inpainting method fills the holes in the depth images; it introduces a level set distance component into the priority evaluation function. A joint trilateral guided filter is then applied to denoise and smooth the inpainted results. Experimental results show that the proposed algorithm achieves better enhancement than existing methods in terms of both subjective and objective quality measurements.
Robotics and Biomimetics | 2015
Guangming Zhu; Liang Zhang; Peiyi Shen; Juan Song; Lukui Zhi; Kang Yi
Human action recognition is a fundamental skill that lets personal assistive robots observe and automatically react to humans' daily activities. Generally, a human activity can be intuitively considered as a sequence of key poses and atomic motions. Thus, a human action recognition algorithm based on key poses and atomic motions is proposed in this paper. First, the normalized relative orientations of human joints are computed as the skeletal features. Second, the skeletal feature sequences are segmented into static segments and dynamic segments based on kinetic energy. Then, the codebook of key poses is constructed from the static segments using clustering algorithms, and the codebook of atomic motions is constructed from the dynamic segments associated with each pair of key poses. Lastly, the activity patterns are constructed, and the Naïve Bayes Nearest Neighbor algorithm classifies human activities by matching training and testing activity patterns. The Cornell CAD-60 dataset is used to test the proposed algorithm. The experimental results show that the proposed algorithm outperforms state-of-the-art algorithms.
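The segmentation step can be illustrated with a minimal sketch. It assumes that kinetic energy is approximated by half the squared frame-to-frame change of the feature vector (unit mass, unit time step) and that a single fixed threshold separates static from dynamic frames. Both the energy definition and the `threshold` parameter are assumptions, not details taken from the paper.

```python
def kinetic_energy(features):
    """Per-frame energy: half the squared change of the feature vector
    between consecutive frames (unit mass and unit time step assumed)."""
    if not features:
        return []
    energy = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(features, features[1:]):
        energy.append(0.5 * sum((c - p) ** 2 for p, c in zip(prev, cur)))
    return energy

def segment(features, threshold):
    """Group consecutive frames into 'static' and 'dynamic' runs.
    Returns (label, start, end) tuples with end exclusive."""
    labels = ["dynamic" if e > threshold else "static"
              for e in kinetic_energy(features)]
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the data or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments
```

Under this scheme the static runs would supply candidates for key poses, and each dynamic run between two static runs would supply a candidate atomic motion.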
Sensors | 2015
Liang Zhang; Peiyi Shen; Guangming Zhu; Wei Wei; Houbing Song
The Internet of Things (IoT) is driving innovation in an ever-growing set of application domains, such as intelligent processing for autonomous robots. For an autonomous robot, one grand challenge is how to sense its surrounding environment effectively. Simultaneous Localization and Mapping with an RGB-D Kinect camera mounted on the robot, called RGB-D SLAM, has been developed for this purpose, but some technical challenges remain. First, the algorithm is not efficient enough to satisfy real-time requirements; second, its accuracy is unacceptable. To address these challenges, this paper proposes a set of novel improvements. First, the Oriented FAST and Rotated BRIEF (ORB) method is used for feature detection and descriptor extraction. Second, a bidirectional Fast Library for Approximate Nearest Neighbors (FLANN) k-Nearest Neighbor (KNN) algorithm is applied to feature matching. Then, an improved RANdom SAmple Consensus (RANSAC) estimation method is adopted for motion transformation, while high-precision Generalized Iterative Closest Point (GICP) is used to register point clouds when optimizing the motion transformation. To improve the accuracy of SLAM, the reduced dynamic covariance scaling (DCS) algorithm is formulated as a global optimization problem under the g2o framework. The effectiveness of the improved algorithm has been verified by testing on standard data and comparing with the ground truth obtained on Freiburg University’s datasets. A Dr Robot X80 equipped with a Kinect camera is also run in a building corridor to verify the correctness of the improved RGB-D SLAM algorithm. These experiments show that the proposed algorithm achieves higher processing speed and better accuracy.
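Bidirectional matching can be sketched as a mutual nearest-neighbor check: a pair is kept only when each descriptor is the other's closest match. The sketch below uses brute-force Hamming distance on binary descriptors stored as integers in place of FLANN, purely to stay self-contained. The function names are hypothetical, and the paper's actual matching criteria may differ.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def nearest(query, candidates):
    """Index of the candidate closest to query (brute force, not FLANN)."""
    return min(range(len(candidates)),
               key=lambda j: hamming(query, candidates[j]))

def bidirectional_match(desc_a, desc_b):
    """Keep pair (i, j) only if j is the nearest neighbor of i in B
    and i is the nearest neighbor of j in A."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

The backward check discards one-sided matches before the surviving pairs are passed to an outlier-rejection step such as RANSAC.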
IEEE Access | 2017
Guangming Zhu; Liang Zhang; Peiyi Shen; Juan Song
Gesture recognition aims to recognize meaningful movements of human bodies and is of utmost importance in intelligent human-computer/robot interactions. In this paper, we present a multimodal gesture recognition method based on 3-D convolution and convolutional long short-term memory (LSTM) networks. The proposed method first learns short-term spatiotemporal features of gestures through a 3-D convolutional neural network, and then learns long-term spatiotemporal features with convolutional LSTM networks that operate on the extracted short-term features. In addition, fine-tuning among multimodal data is evaluated, and we find that it can serve as an optional technique for preventing overfitting when no pre-trained models exist. The proposed method is verified on the ChaLearn LAP large-scale isolated gesture data set (IsoGD) and the Sheffield Kinect gesture (SKIG) data set. The results show that the proposed method achieves state-of-the-art recognition accuracy (51.02% on the validation set of IsoGD and 98.89% on SKIG).
Neurocomputing | 2016
Yuanmei Tian; Juan Song; Xiangdong Zhang; Peiyi Shen; Liang Zhang; Weibin Gong; Wei Wei; Guangming Zhu
Vehicle license plate positioning is the first and most important step in license plate recognition systems. Existing license plate location algorithms are sensitive to lighting conditions and easily affected by background interference. To solve these problems, this paper presents an AdaBoost algorithm combined with a color differential model. The proposed algorithm consists of a coarse location step and a precise location step. In the coarse location step, the color differential model produces a binary image from which candidate plate regions are selected. In the precise location step, the features obtained in the coarse step, together with other features, are used by the AdaBoost algorithm to train the classifiers and precisely locate the license plates. The experimental results show that the proposed algorithm is more robust to lighting conditions and background interference. In particular, at night the precision rate can exceed 95.0%.
Chinese Conference on Pattern Recognition | 2016
Juan Song; Liang Zhang; Peiyi Shen; Xilu Peng; Guangming Zhu
In this paper, a fast enhancement method based on de-hazing is proposed for single low-light images. Instead of the dark channel prior (DCP) used in the de-hazing literature, the luminance map is used to estimate the global atmospheric light and the transmittance, motivated by the observed similarity between the luminance map and DCP. This substitution greatly reduces computational complexity and also avoids the block artifacts caused by the discontinuous transmittance estimated from DCP. Experimental results indicate that the proposed method significantly improves both enhancement quality and processing speed compared with state-of-the-art enhancement algorithms.
Robotics and Biomimetics | 2016
Liang Zhang; Zhihao Cheng; Yixin Gan; Guangming Zhu; Peiyi Shen; Juan Song
In this paper, we propose, for the first time, a real-time imitation algorithm that lets humanoid robots mimic human eight-chain whole-body motions. An eight-chain human model is constructed from skeletal data captured by Kinect to represent human whole-body motions, and an eight-chain robot model is constructed according to the body structure of the humanoid robot for human motion imitation. First, a motion mapping strategy generates robot postures by fitting the eight corresponding kinematic chains of the two models. The positions of the robot's end effectors are obtained after motion mapping. Then, an Inverse Kinematics (IK) algorithm generates the joint angles of each joint chain subject to joint angle limits. Finally, for balance control, the center of mass is kept over the support polygon by adjusting the ankle and hip joints, and a balance analysis is made for each frame to build a stable action. In post-processing, self-collision avoidance and joint angular speed limits are applied. In the experiments, the proposed algorithm is tested on the humanoid robot NAO, and the results demonstrate that NAO can imitate human eight-chain motions with high accuracy in real time.
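The balance condition, keeping the center of mass over the support polygon, can be illustrated with a static check. The sketch below assumes that each body link is reduced to a point mass and that the support polygon is convex with vertices listed counter-clockwise in the ground plane. The function names and data layout are hypothetical, and the paper's controller additionally adjusts ankle and hip joints, which is not shown here.

```python
def center_of_mass(links):
    """links: list of (mass, (x, y, z)) point masses, one per body link."""
    total = sum(m for m, _ in links)
    return tuple(sum(m * p[k] for m, p in links) / total for k in range(3))

def inside_convex_polygon(point, polygon):
    """True if the 2-D point lies inside or on a convex polygon whose
    vertices are listed counter-clockwise."""
    px, py = point
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # A negative cross product puts the point to the right of edge a->b,
        # which is outside a counter-clockwise polygon.
        if (bx - ax) * (py - ay) - (by - ay) * (px - ax) < 0:
            return False
    return True

def is_balanced(links, support_polygon):
    """Static balance test: ground projection of the CoM inside the polygon."""
    x, y, _ = center_of_mass(links)
    return inside_convex_polygon((x, y), support_polygon)
```

A per-frame balance analysis could call `is_balanced` on each candidate posture and trigger a correction whenever it returns `False`.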
