Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Du Q. Huynh is active.

Publication


Featured research published by Du Q. Huynh.


European Conference on Computer Vision | 2014

HOPC: Histogram of Oriented Principal Components of 3D Pointclouds for Action Recognition

Hossein Rahmani; Arif Mahmood; Du Q. Huynh; Ajmal S. Mian

Existing techniques for 3D action recognition are sensitive to viewpoint variations because they extract features from depth images which change significantly with viewpoint. In contrast, we directly process the pointclouds and propose a new technique for action recognition which is more robust to noise, action speed and viewpoint variations. Our technique consists of a novel descriptor and keypoint detection algorithm. The proposed descriptor is extracted at a point by encoding the Histogram of Oriented Principal Components (HOPC) within an adaptive spatio-temporal support volume around that point. Based on this descriptor, we present a novel method to detect Spatio-Temporal Key-Points (STKPs) in 3D pointcloud sequences. Experimental results show that the proposed descriptor and STKP detector outperform state-of-the-art algorithms on three benchmark human activity datasets. We also introduce a new multiview public dataset and show the robustness of our proposed method to viewpoint variations.
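
A minimal sketch of the first building block, the spatio-temporal support volume around a point. It assumes the pointcloud sequence is an (N, 4) array of (x, y, z, t) rows; the fixed radii stand in for the paper's adaptive support-volume selection and are purely illustrative.

```python
import numpy as np

def support_volume(points, center, spatial_radius=0.1, temporal_radius=2.0):
    """Points inside a spatio-temporal neighborhood of `center`.

    points : (N, 4) array of (x, y, z, t) samples from a pointcloud sequence.
    center : (4,) query point. The paper adapts the support size to the data;
    fixed radii are used here only for illustration.
    """
    near = np.linalg.norm(points[:, :3] - center[:3], axis=1) <= spatial_radius
    recent = np.abs(points[:, 3] - center[3]) <= temporal_radius
    return points[near & recent]
```

The HOPC descriptor computed on such a neighborhood is sketched after the journal version of this work below.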


Workshop on Applications of Computer Vision | 2014

Real time action recognition using histograms of depth gradients and random decision forests

Hossein Rahmani; Arif Mahmood; Du Q. Huynh; Ajmal S. Mian

We propose an algorithm which combines the discriminative information from depth images as well as from 3D joint positions to achieve high action recognition accuracy. To avoid the suppression of subtle discriminative information and also to handle local occlusions, we compute a vector of many independent local features. Each feature encodes spatiotemporal variations of depth and depth gradients at a specific space-time location in the action volume. Moreover, we encode the dominant skeleton movements by computing a local 3D joint position difference histogram. For each joint, we compute a 3D space-time motion volume which we use as an importance indicator and incorporate in the feature vector for improved action discrimination. To retain only the discriminant features, we train a random decision forest (RDF). The proposed algorithm is evaluated on three standard datasets and compared with nine state-of-the-art algorithms. Experimental results show that, on average, the proposed algorithm outperforms all other algorithms in accuracy and has a processing speed of over 112 frames/second.
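
A rough sketch of this feature's flavor: per-cell histograms of depth-gradient orientations concatenated over a clip, fed to a random forest. The grid size, bin count, and scikit-learn's RandomForestClassifier (standing in for the paper's RDF) are illustrative assumptions, and the joint-based features are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def depth_gradient_histogram(cell, bins=8):
    """Histogram of depth-gradient orientations, weighted by magnitude."""
    gy, gx = np.gradient(cell.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)

def clip_feature(depth_frames, grid=(4, 4), bins=8):
    """Concatenate per-cell gradient histograms over all frames of a clip."""
    feats = []
    for depth in depth_frames:
        h, w = depth.shape
        for i in range(grid[0]):
            for j in range(grid[1]):
                cell = depth[i * h // grid[0]:(i + 1) * h // grid[0],
                             j * w // grid[1]:(j + 1) * w // grid[1]]
                feats.append(depth_gradient_histogram(cell, bins))
    return np.concatenate(feats)

# X = np.stack([clip_feature(clip) for clip in clips])
# RandomForestClassifier(n_estimators=100).fit(X, labels)
```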


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Histogram of Oriented Principal Components for Cross-View Action Recognition

Hossein Rahmani; Arif Mahmood; Du Q. Huynh; Ajmal S. Mian

Existing techniques for 3D action recognition are sensitive to viewpoint variations because they extract features from depth images which are viewpoint dependent. In contrast, we directly process point clouds for cross-view action recognition from unknown and unseen views. We propose the histogram of oriented principal components (HOPC) descriptor that is robust to noise, viewpoint, scale and action speed variations. At a 3D point, HOPC is computed by projecting the three scaled eigenvectors of the pointcloud within its local spatio-temporal support volume onto the vertices of a regular dodecahedron. HOPC is also used for the detection of spatiotemporal keypoints (STK) in 3D pointcloud sequences so that view-invariant STK descriptors (or Local HOPC descriptors) at these key locations only are used for action recognition. We also propose a global descriptor computed from the normalized spatio-temporal distribution of STKs in 4-D, which we refer to as STK-D. We have evaluated the performance of our proposed descriptors against nine existing techniques on two cross-view and three single-view human action recognition datasets. The experimental results show that our techniques provide significant improvement over state-of-the-art methods.
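
A minimal sketch of the descriptor's core step as described above: eigen-decompose the covariance of a local point neighborhood, then project the eigenvalue-scaled eigenvectors onto the 20 vertex directions of a regular dodecahedron. The paper's sign disambiguation and quantization details are omitted; clamping negative projections to zero is an illustrative simplification.

```python
import itertools
import numpy as np

PHI = (1 + np.sqrt(5)) / 2  # golden ratio

def dodecahedron_vertices():
    """The 20 vertices of a regular dodecahedron, as unit directions."""
    verts = [np.array(v, float) for v in itertools.product((-1, 1), repeat=3)]
    for a, b in itertools.product((-1, 1), repeat=2):
        verts += [np.array([0.0, a / PHI, b * PHI]),
                  np.array([a / PHI, b * PHI, 0.0]),
                  np.array([a * PHI, 0.0, b / PHI])]
    V = np.stack(verts)
    return V / np.linalg.norm(V, axis=1, keepdims=True)

def hopc(neighborhood):
    """HOPC-style 60-bin descriptor of an (M, 3) local point neighborhood."""
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    evals, evecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(evals)[::-1]               # principal component first
    V = dodecahedron_vertices()                   # (20, 3)
    bins = [np.maximum(V @ (lam * vec), 0.0)      # drop negative projections
            for lam, vec in zip(evals[order], evecs[:, order].T)]
    d = np.concatenate(bins)                      # (60,)
    return d / (np.linalg.norm(d) + 1e-8)
```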


IEEE Transactions on Image Processing | 2013

A Gaussian Process Guided Particle Filter for Tracking 3D Human Pose in Video

Suman Sedai; Mohammed Bennamoun; Du Q. Huynh

In this paper, we propose a hybrid method that combines Gaussian process learning, a particle filter, and annealing to track the 3D pose of a human subject in video sequences. Our approach, which we refer to as annealed Gaussian process guided particle filter, comprises two steps. In the training step, we use a supervised learning method to train a Gaussian process regressor that takes the silhouette descriptor as an input and produces multiple output poses modeled by a mixture of Gaussian distributions. In the tracking step, the output pose distributions from the Gaussian process regression are combined with the annealed particle filter to track the 3D pose in each frame of the video sequence. Our experiments show that the proposed method does not require initialization and does not lose track of the pose. We compare our approach with a standard annealed particle filter using the HumanEva-I dataset and with other state-of-the-art approaches using the HumanEva-II dataset. The evaluation results show that our approach can successfully track the 3D human pose over long video sequences and give more accurate pose tracking results than the annealed particle filter.
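
A compressed sketch of one tracking step under strong simplifying assumptions: a fitted scikit-learn GaussianProcessRegressor (a single predictive Gaussian, standing in for the paper's mixture-of-Gaussians GP output) proposes particles, which are then re-weighted and resampled over a fixed annealing schedule. The observation `likelihood` function is an assumption supplied by the caller.

```python
import numpy as np

def gp_guided_pf_step(gp, silhouette_desc, likelihood, n_particles=200,
                      anneal_temps=(4.0, 2.0, 1.0), rng=None):
    """One frame of a GP-guided annealed particle filter (illustrative).

    gp         : fitted GaussianProcessRegressor, descriptor -> pose vector.
    likelihood : callable pose -> observation likelihood (assumed given).
    """
    rng = rng or np.random.default_rng()
    mean, std = gp.predict(silhouette_desc[None, :], return_std=True)
    particles = mean + std * rng.standard_normal((n_particles, mean.shape[1]))
    for temp in anneal_temps:                      # coarse-to-fine annealing
        w = np.array([likelihood(p) for p in particles]) ** (1.0 / temp)
        w /= w.sum()
        keep = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[keep] + 0.01 * temp * rng.standard_normal(particles.shape)
    return particles.mean(axis=0)                  # pose estimate for the frame
```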


International Conference on Pattern Recognition | 2014

Action Classification with Locality-Constrained Linear Coding

Hossein Rahmani; Arif Mahmood; Du Q. Huynh; Ajmal S. Mian

We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatio-temporal subsequence of a video sequence. Our proposed method divides the input video into equally spaced overlapping spatio-temporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram of Oriented 3D Gradients (HOG3D) feature to encode the information in each cell. We justify the use of LLC for encoding the block descriptor by demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor is obtained via a logistic regression classifier with L2 regularization. We evaluate and compare our algorithm with ten state-of-the-art algorithms on five benchmark datasets. Experimental results show that, on average, our algorithm gives better accuracy than these ten algorithms.
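
For reference, the widely used analytical approximation to LLC (from Wang et al., CVPR 2010, which this work builds on): each descriptor is reconstructed from its k nearest codebook atoms under a sum-to-one constraint. The codebook itself (e.g., from k-means over training descriptors) is assumed to exist already.

```python
import numpy as np

def llc_encode(x, codebook, k=5, beta=1e-4):
    """Approximate LLC code of descriptor x over a (K, d) codebook."""
    K, _ = codebook.shape
    idx = np.argsort(np.linalg.norm(codebook - x, axis=1))[:k]  # k nearest atoms
    B = codebook[idx] - x                      # shift neighborhood to the origin
    C = B @ B.T + beta * np.eye(k)             # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                               # enforce the sum-to-one constraint
    code = np.zeros(K)
    code[idx] = w
    return code
```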


Pattern Recognition Letters | 2016

Discriminative human action classification using locality-constrained linear coding

Hossein Rahmani; Du Q. Huynh; Arif Mahmood; Ajmal S. Mian

Highlights:
- We propose using locality-constrained linear coding for action classification.
- Our sequence descriptor includes cell, block, and subsequence descriptors.
- We use maximum pooling and a logistic regression classifier to encode each sequence.
- We demonstrate the effectiveness of our algorithm on both depth and RGB videos.

We propose a Locality-constrained Linear Coding (LLC) based algorithm that captures discriminative information of human actions in spatio-temporal subsequences of videos. The input video is divided into equally spaced overlapping spatio-temporal subsequences. Each subsequence is further divided into blocks and then cells. The spatio-temporal information in each cell is represented by a Histogram of Oriented 3D Gradients (HOG3D). LLC is then used to encode each block. We show that LLC gives more stable and repetitive codes compared to standard Sparse Coding. The final representation of a video sequence is obtained using logistic regression with ℓ2 regularization, and classification is performed by a linear SVM. The proposed algorithm is applicable to conventional and depth videos. Experimental comparison with ten state-of-the-art methods on three depth video and two conventional video databases shows that the proposed method consistently achieves the best performance.
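
Continuing the sketch, a plausible back end for the pipeline described above: LLC codes (llc_encode from the previous sketch) are max-pooled within each block, block descriptors are concatenated per sequence, and a linear SVM classifies. The paper's intermediate logistic-regression encoding step is elided here for brevity.

```python
import numpy as np
from sklearn.svm import LinearSVC

def sequence_descriptor(blocks, codebook, k=5):
    """Max-pool LLC codes within each block, then concatenate the blocks.

    blocks : list of blocks, each an iterable of per-cell HOG3D features.
    """
    pooled = [np.stack([llc_encode(f, codebook, k) for f in cells]).max(axis=0)
              for cells in blocks]              # max pooling inside each block
    return np.concatenate(pooled)

# X = np.stack([sequence_descriptor(b, codebook) for b in all_sequences])
# LinearSVC(C=1.0).fit(X, labels)              # final linear SVM classifier
```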


Digital Image Computing: Techniques and Applications | 2009

Context-Based Appearance Descriptor for 3D Human Pose Estimation from Monocular Images

Suman Sedai; Mohammed Bennamoun; Du Q. Huynh

In this paper we propose a novel appearance descriptor for 3D human pose estimation from monocular images using a learning-based technique. Our image descriptor is based on intermediate local appearance descriptors that we design to encapsulate local appearance context and to be resilient to noise. We encode the image by the histogram of such local appearance context descriptors computed in an image to obtain the final image descriptor for pose estimation. We name this final image descriptor the Histogram of Local Appearance Context (HLAC). We then use Relevance Vector Machine (RVM) regression to learn a direct mapping between the proposed HLAC image descriptor space and the 3D pose space. Given a test image, we first compute the HLAC descriptor and then input it to the trained regressor to obtain the final output pose in real time. We evaluated our approach on a synchronized video and 3D motion dataset, comparing the proposed HLAC image descriptor with the Histogram of Shape Context and Histogram of SIFT-like descriptors. The evaluation results show that the HLAC descriptor outperforms both of them in the context of 3D human pose estimation.
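
A bag-of-words sketch of the HLAC idea: quantize the local appearance-context descriptors against a learned codebook, histogram the assignments, and regress pose from the histogram. Scikit-learn has no RVM, so ARDRegression (a related sparse Bayesian linear model) stands in here; the local descriptors themselves are assumed computed elsewhere.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import ARDRegression

def hlac_histogram(local_descriptors, kmeans):
    """Normalized histogram of quantized local appearance-context descriptors."""
    words = kmeans.predict(local_descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-8)

# kmeans = KMeans(n_clusters=100).fit(np.vstack(train_local_descriptors))
# X = np.stack([hlac_histogram(d, kmeans) for d in per_image_descriptors])
# one sparse Bayesian regressor per pose dimension (RVM stand-in):
# models = [ARDRegression().fit(X, poses[:, j]) for j in range(poses.shape[1])]
```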


Digital Image Computing: Techniques and Applications | 2005

Trajectory Based Video Sequence Synchronization

Daniel Wedge; Peter Kovesi; Du Q. Huynh

Video sequence synchronization is often necessary for computer vision applications where multiple simultaneously recorded videos are processed. We present a coarse-to-fine approach to synchronizing two video sequences recorded at the same frame rate by stationary cameras with fixed internal parameters. At the coarse level, each sequence is broken into a set of sub-sequences, which are then matched. A voting scheme determines the range in which the sequences’ temporal offset lies. The fine synchronization step searches for the temporal offset by initially examining integer offsets, and then using the golden-section search to locate the offset to sub-frame accuracy. Our algorithm recovers the temporal offset of two sequences using the motion of a single moving object, is computationally efficient, and does not require any stationary background points as reference points. We present results for synthetic data and real video sequences, with various degrees of temporal overlap.
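
The fine-synchronization step lends itself to a compact sketch: a standard golden-section search over the temporal offset, run on the interval returned by the coarse voting stage. The unimodal `cost` function (e.g., a trajectory alignment error evaluated at a fractional offset) is an assumption supplied by the caller.

```python
import math

def golden_section_search(cost, lo, hi, tol=1e-3):
    """Minimize a unimodal cost(offset) over [lo, hi] to sub-frame accuracy."""
    inv_phi = (math.sqrt(5) - 1) / 2           # 1 / golden ratio
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = cost(c), cost(d)
    while abs(b - a) > tol:
        if fc < fd:                            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = cost(c)
        else:                                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = cost(d)
    return (a + b) / 2

# given the best integer offset from the coarse voting stage:
# offset = golden_section_search(alignment_cost, best_int - 1, best_int + 1)
```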


Pattern Recognition | 2013

Discriminative fusion of shape and appearance features for human pose estimation

Suman Sedai; Mohammed Bennamoun; Du Q. Huynh

This paper presents a method for combining the shape and appearance feature types in a discriminative learning framework for human pose estimation. We first present a new appearance descriptor that is distinctive and resilient to noise for 3D human pose estimation. We then combine the proposed appearance descriptor with a shape descriptor computed from the silhouette of the human subject using discriminative learning. Our method, which we refer to as a localized decision level fusion technique, is based on clustering the output pose space into several partitions and learning a decision level fusion model for the shape and appearance descriptors in each region. The combined shape and appearance descriptor allows complementary information of the individual feature types to be exploited, leading to improved performance of the pose estimation system. We evaluate our proposed fusion method against feature level fusion and kernel level fusion methods using a synchronized video and 3D motion dataset. Our experimental results show that the proposed feature combination method gives more accurate pose estimates than either individual feature type. Among the three fusion methods, our localized decision level fusion method is demonstrated to perform the best for 3D pose estimation.
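
A toy sketch of localized decision-level fusion as described: partition the training poses with k-means and fit one small fusion model per partition over the two base regressors' outputs. All names and the choice of a linear fusion model are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

class LocalizedFusion:
    """Per-cluster decision-level fusion of two pose regressors (sketch)."""

    def fit(self, shape_pred, app_pred, poses, n_clusters=8):
        self.km = KMeans(n_clusters=n_clusters).fit(poses)
        X = np.hstack([shape_pred, app_pred])      # stacked base predictions
        self.models = [LinearRegression().fit(X[self.km.labels_ == c],
                                              poses[self.km.labels_ == c])
                       for c in range(n_clusters)]
        return self

    def predict(self, shape_pred, app_pred):
        X = np.hstack([shape_pred, app_pred])
        guess = (shape_pred + app_pred) / 2        # crude partition assignment
        labels = self.km.predict(guess)
        return np.stack([self.models[c].predict(x[None, :])[0]
                         for x, c in zip(X, labels)])
```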


IEEE Transactions on Image Processing | 2016

Constrained Metric Learning by Permutation Inducing Isometries

Joel Bosveld; Arif Mahmood; Du Q. Huynh; Lyle Noakes

The choice of metric critically affects the performance of classification and clustering algorithms. Metric learning algorithms attempt to improve performance, by learning a more appropriate metric. Unfortunately, most of the current algorithms learn a distance function which is not invariant to rigid transformations of images. Therefore, the distances between two images and their rigidly transformed pair may differ, leading to inconsistent classification or clustering results. We propose to constrain the learned metric to be invariant to the geometry preserving transformations of images that induce permutations in the feature space. The constraint that these transformations are isometries of the metric ensures consistent results and improves accuracy. Our second contribution is a dimension reduction technique that is consistent with the isometry constraints. Our third contribution is the formulation of the isometry constrained logistic discriminant metric learning (IC-LDML) algorithm, by incorporating the isometry constraints within the objective function of the LDML algorithm. The proposed algorithm is compared with the existing techniques on the publicly available labeled faces in the wild, viewpoint-invariant pedestrian recognition, and Toy Cars data sets. The IC-LDML algorithm has outperformed existing techniques for the tasks of face recognition, person identification, and object classification by a significant margin.
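
The isometry constraint has a crisp algebraic form: a Mahalanobis metric d(x, y) = (x - y)^T M (x - y) is invariant to a feature permutation P exactly when P^T M P = M. A simple illustrative device (group averaging, not the paper's IC-LDML optimization) projects any learned M onto the invariant set, assuming the permutations form a group:

```python
import numpy as np

def permutation_matrix(perm):
    """P such that (P @ x)[i] == x[perm[i]]."""
    P = np.zeros((len(perm), len(perm)))
    P[np.arange(len(perm)), perm] = 1.0
    return P

def make_isometric(M, perms):
    """Average M over the permutation group so that P.T @ M @ P == M
    holds for every P in it; distances then ignore those permutations."""
    Ps = [permutation_matrix(p) for p in perms]
    return sum(P.T @ M @ P for P in Ps) / len(Ps)

def mahalanobis(M, x, y):
    d = x - y
    return float(d @ M @ d)

# With M_inv = make_isometric(M, group), mahalanobis(M_inv, P @ x, P @ y)
# equals mahalanobis(M_inv, x, y) for every P in the group.
```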

Collaboration


Dive into Du Q. Huynh's collaborations.

Top Co-Authors

Mohammed Bennamoun, University of Western Australia
Peter Kovesi, University of Western Australia
Ajmal S. Mian, University of Western Australia
Daniel Wedge, University of Western Australia
Hossein Rahmani, University of Western Australia
Mark Reynolds, University of Western Australia
Joel Bosveld, University of Western Australia