2021 International Joint Conference on Neural Networks (IJCNN)

Multi-Modal Emotion Recognition Based on Deep Learning of EEG and Audio Signals


Abstract


Automatic recognition of human emotional states has recently attracted considerable attention from researchers in human-computer interaction and emotional brain-computer interfaces. However, the accuracy of emotion recognition remains unsatisfactory. Exploiting the complementary information carried by multiple emotion-related modalities, this study proposes a novel deep learning architecture that fuses emotional features from electroencephalography (EEG) signals and the corresponding audio signals for emotion recognition on the DEAP dataset. We used a convolutional neural network (CNN) to extract EEG features and a bidirectional long short-term memory (BiLSTM) network to extract audio features. We then combined the multi-modal features in a deep learning architecture to recognize arousal and valence levels. The results showed improved accuracy on both arousal and valence compared with previous studies that used only EEG signals, suggesting the effectiveness of the proposed multi-modal fused emotion recognition model. In future work, multi-modal data from natural interaction scenarios will be collected and fed into this architecture to further validate the method.
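The abstract describes a two-branch fusion design: a CNN over EEG segments, a BiLSTM over audio frames, and a joint classifier over the concatenated features. The sketch below illustrates one plausible reading of that design in PyTorch; the class name FusionNet and all layer sizes, input shapes, and hyperparameters are illustrative assumptions, since the abstract does not specify them.

```python
# Hypothetical sketch of the CNN + BiLSTM fusion architecture from the
# abstract. Layer sizes and input shapes are assumptions, not the paper's.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, eeg_channels=32, audio_features=40, num_classes=2):
        super().__init__()
        # CNN branch: extracts features from raw EEG segments
        self.eeg_cnn = nn.Sequential(
            nn.Conv1d(eeg_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # -> (batch, 128, 1)
        )
        # BiLSTM branch: extracts temporal features from audio frames
        self.audio_bilstm = nn.LSTM(
            input_size=audio_features, hidden_size=64,
            batch_first=True, bidirectional=True,
        )
        # Fusion head: concatenated features -> arousal or valence level
        self.classifier = nn.Sequential(
            nn.Linear(128 + 2 * 64, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, eeg, audio):
        # eeg: (batch, channels, time); audio: (batch, frames, features)
        eeg_feat = self.eeg_cnn(eeg).squeeze(-1)           # (batch, 128)
        _, (h_n, _) = self.audio_bilstm(audio)             # (2, batch, 64)
        audio_feat = torch.cat([h_n[0], h_n[1]], dim=1)    # (batch, 128)
        fused = torch.cat([eeg_feat, audio_feat], dim=1)   # (batch, 256)
        return self.classifier(fused)

# Example: DEAP-style EEG (32 channels) with frame-level audio features
model = FusionNet()
logits = model(torch.randn(8, 32, 512), torch.randn(8, 120, 40))
print(logits.shape)  # torch.Size([8, 2])
```

Using the final forward and backward hidden states as the audio embedding and global average pooling for the CNN branch are common choices for this kind of fusion; the authors' actual feature extraction and fusion details may differ.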

Pages 1-6
DOI 10.1109/IJCNN52387.2021.9533663
Language English
Journal 2021 International Joint Conference on Neural Networks (IJCNN)
