2021 3rd International Conference on Pattern Recognition and Intelligent Systems | 2021

Simultaneous temporal and spatial deep attention for imaged skeleton-based action recognition

 
 
 

Abstract


The use of skeletons as a modality for representing and recognizing human actions has gained interest thanks to the compactness of the data, its reliable representativeness, and its strong robustness. Deep learning-based recognition approaches built on this modality often improve the recognition pipeline by integrating attention into their models. The idea is to let the model focus on the relevant information in the action rather than model all of the input indiscriminately. In this article, we propose an action recognition approach that integrates spatial and temporal attention simultaneously. We first transform the input sequence data into a color matrix, called an imaged skeleton, comprising Cartesian and rotational information. This new representation is then given as input to an architecture composed of a main trunk, which performs feature extraction and classification, and several attention branches. Experimental evaluations on two popular benchmark databases, UT-Kinect [1] and SBU Kinect Interaction [2], are conducted to verify the value of the proposed approach, and improved performance is reported.

Index terms: convolutional neural network, spatio-temporal, skeleton-based action recognition, deep attention.
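The transformation of a skeleton sequence into a color matrix can be illustrated with a minimal sketch. It is not the authors' exact encoding: the abstract does not specify the layout or normalization, so the choices here are assumptions. Joints are placed on rows, frames on columns, and the x, y, z coordinates are min-max scaled into the three color channels. The rotational information mentioned in the abstract is omitted, and the function name `skeleton_to_image` is hypothetical.

```python
import numpy as np


def skeleton_to_image(seq: np.ndarray) -> np.ndarray:
    """Encode a skeleton sequence as a color image (illustrative only).

    seq: array of shape (frames, joints, 3) holding Cartesian x, y, z.
    Returns a uint8 array of shape (joints, frames, 3), i.e. one row per
    joint, one column per frame, and x/y/z mapped to the three channels.
    The layout and min-max scaling are assumptions, not the paper's spec.
    """
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")

    # Per-coordinate minimum and maximum over the whole sequence.
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)

    # Guard against a constant coordinate (zero range) to avoid dividing by 0.
    span = np.where(hi > lo, hi - lo, 1.0)

    # Scale each coordinate to [0, 255] and convert to 8-bit pixel values.
    img = np.round(255.0 * (seq - lo) / span).astype(np.uint8)

    # Reorder axes from (frames, joints, 3) to (joints, frames, 3).
    return img.transpose(1, 0, 2)


# Usage: a synthetic sequence of 30 frames with 20 joints each.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(30, 20, 3))
image = skeleton_to_image(sequence)
```

In such a layout, each column of the image corresponds to one frame and each row to one joint, so attention over columns acts temporally and attention over rows acts spatially.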

DOI 10.1145/3480651.3480668
Language English
Journal 2021 3rd International Conference on Pattern Recognition and Intelligent Systems

Full Text