2021 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) | 2021
Dynamic gesture recognition based on CNN-LSTM-Attention
Abstract
Compared with traditional human-computer interaction techniques, gesture recognition is closer to natural human expression and has the advantages of being efficient and easy to master. Vision-based gesture recognition requires no additional equipment, which makes it convenient and relatively low cost. To recognize dynamic gestures against complex backgrounds, we build a backbone network based on SSD with dilated convolution, which greatly improves the quality of the detected feature maps, and we then propose a CNN-LSTM-Attention based dynamic gesture recognition network. The spatial features of the dynamic gesture at each moment are first extracted from the gesture sequence; these features are then transformed into spatio-temporal features by a recurrent neural network with an attention mechanism, and finally fed into a fully connected neural network for gesture recognition. The dynamic gesture recognition network achieves a 93.5% recognition rate on the Sahand dataset, which demonstrates its effectiveness.
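The attention step described above can be illustrated with a minimal sketch. The abstract does not specify the form of the attention mechanism, so the sketch assumes a simple dot-product scoring of each per-frame recurrent hidden state against a learned vector; the names `attention_pool` and `query` and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-frame hidden states into one sequence-level vector.

    hidden_states: list of T vectors (one per frame), each of length D,
                   standing in for the outputs of the recurrent network.
    query:         hypothetical learned scoring vector of length D
                   (an assumption; the paper's scoring function may differ).
    Returns (context, weights), where weights sum to 1 over the T frames.
    """
    # One scalar relevance score per frame: dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    # Weighted sum of the hidden states, computed dimension by dimension.
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy input: 3 frames with 2-dimensional hidden states (made-up values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(states, query=[1.0, 1.0])
```

In a full pipeline of the kind the abstract outlines, `context` would be the spatio-temporal feature passed to the fully connected classifier, and frames with larger weights contribute more to the final decision.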