2020 25th International Conference on Pattern Recognition (ICPR), 2021

DeepPear: Deep Pose Estimation and Action Recognition


Abstract


Human action recognition has become a popular research topic because it can be applied to many domains, such as intelligent surveillance systems, human-robot interaction, and autonomous vehicle control. Human action recognition from RGB video is challenging because the learning of actions is easily affected by cluttered backgrounds. To cope with this problem, the proposed method first estimates 3D human poses, which helps remove the cluttered background and focus on the human body. In addition to the human poses, the proposed method also utilizes appearance features near the predicted joints to make the action prediction context-aware. Instead of using 3D convolutional neural networks as many action recognition approaches do, the proposed method uses a two-stream architecture that aggregates the results of skeleton-based and appearance-based approaches to recognize actions. Experimental results show that the proposed method achieves state-of-the-art performance on NTU RGB+D, a large-scale dataset for human action recognition.
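The two-stream aggregation described above can be illustrated with a minimal late-fusion sketch. This is not the authors' implementation: the function names, the use of softmax, and the choice of simple score averaging are all illustrative assumptions about how per-stream class scores might be combined.

```python
import math

def softmax(logits):
    """Convert raw per-class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_two_streams(skeleton_logits, appearance_logits):
    """Average the two streams' class probabilities and return the argmax class.

    Score averaging is one common late-fusion rule; the paper may use a
    different aggregation (e.g. weighted fusion or a learned combiner).
    """
    p_skel = softmax(skeleton_logits)
    p_app = softmax(appearance_logits)
    fused = [(a + b) / 2 for a, b in zip(p_skel, p_app)]
    pred = max(range(len(fused)), key=fused.__getitem__)
    return pred, fused

# Toy example with 3 hypothetical action classes: the skeleton stream is
# confident in class 0, the appearance stream is ambiguous between 0 and 1,
# and fusion combines their evidence.
cls, probs = fuse_two_streams([2.0, 0.5, 0.1], [1.5, 1.4, 0.2])
print(cls)  # → 0
```

The appeal of late fusion is that each stream can be trained and tuned independently; only the per-class scores need to be combined at inference time.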

Pages 7119-7125
DOI 10.1109/ICPR48806.2021.9413011
Language English
Journal 2020 25th International Conference on Pattern Recognition (ICPR)
