ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | 2021

Symmetric Sub-graph Spatio-Temporal Graph Convolution and its application in Complex Activity Recognition

 
 

Abstract


Understanding complex hand actions, such as assembly tasks or kitchen activities, from hand skeleton data is an important yet challenging task. In this paper, we analyze hand skeleton-based complex activities by modeling dynamic hand skeletons through a spatio-temporal graph convolutional neural network (ST-GCN). This model jointly learns and extracts spatio-temporal features for activity recognition. Our proposed technique, the Symmetric Sub-graph Spatio-Temporal Graph Convolutional Neural Network (S2-ST-GCN), exploits the symmetric nature of hand graphs to decompose them into smaller sub-graphs, which allow us to build a separate temporal model for the relative motion of the fingers. This sub-graph approach can be implemented efficiently by preprocessing the input data with a Haar-unit-based orthogonal matrix. Then, in addition to spatial filters, separate temporal filters can be learned for each sub-graph. We evaluate the performance of the proposed method on the First-Person Hand Action dataset. While the proposed method shows performance comparable to state-of-the-art methods in the train:test = 1:1 setting, it achieves this with greater stability. Furthermore, we demonstrate a significant performance improvement over state-of-the-art methods in the cross-person setting, where the model does not see any of a test subject's data during training. S2-ST-GCN also outperforms a finger-based decomposition of the hand graph in which no preprocessing is applied.
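The preprocessing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the five-node toy graph, the function name `haar_unit_matrix`, and the row ordering are assumptions made for illustration, and the actual hand graph and pairing of joints are not specified in the abstract. The sketch pairs mirrored nodes with a 2x2 Haar unit (scaled sum and difference), checks that the resulting matrix is orthogonal, and shows that a symmetric adjacency matrix becomes block diagonal in the new basis, so the "common" and "relative" components can be filtered separately.

```python
import numpy as np


def haar_unit_matrix(num_nodes, axis_nodes, mirror_pairs):
    """Build an orthogonal matrix from 2x2 Haar units (illustrative helper).

    axis_nodes   : nodes that the symmetry maps to themselves (kept unchanged)
    mirror_pairs : (i, j) node pairs that the symmetry swaps
    Rows are ordered as: axis nodes, pair sums, pair differences.
    """
    scale = 1.0 / np.sqrt(2.0)
    rows = []
    for n in axis_nodes:
        row = np.zeros(num_nodes)
        row[n] = 1.0
        rows.append(row)
    for sign in (1.0, -1.0):  # +1 gives sums, -1 gives differences
        for i, j in mirror_pairs:
            row = np.zeros(num_nodes)
            row[i] = scale
            row[j] = sign * scale
            rows.append(row)
    return np.stack(rows)


# Hypothetical toy graph: node 0 is a shared root, and the chains
# 1-2 and 3-4 mirror each other (swap 1<->3 and 2<->4).
edges = [(0, 1), (1, 2), (0, 3), (3, 4)]
A = np.zeros((5, 5))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

H = haar_unit_matrix(5, axis_nodes=[0], mirror_pairs=[(1, 3), (2, 4)])

# Adjacency expressed in the Haar basis: the 3x3 "sum" block and the
# 2x2 "difference" block do not interact.
B = H @ A @ H.T

# Apply the same transform to a skeleton sequence of shape
# (frames, nodes, channels); random data stands in for real joints.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5, 3))
X_haar = np.einsum("mn,tnc->tmc", H, X)
common, relative = X_haar[:, :3], X_haar[:, 3:]
```

In this toy setting, `relative` holds the scaled differences between mirrored nodes, which is one way to read the abstract's "relative motion", while `common` holds the root and the scaled sums. Separate temporal filters could then be applied to each group, although how the paper assigns filters to sub-graphs is not detailed in the abstract.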

Pages 3215-3219
DOI 10.1109/ICASSP39728.2021.9413833
Language English
Journal ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
