Journal of Physics: Conference Series | 2021

A Comprehensive Analysis of Multimodal Speech Emotion Recognition

 

Abstract


Emotion recognition is critical to everyday interpersonal interaction. Understanding a person’s emotions from their speech can greatly improve social interactions. With the rapid development of social media, single-modal emotion recognition struggles to meet the demands of current emotion recognition systems. This paper proposes a multimodal emotion recognition model that combines speech and text to improve the performance of the emotion recognition system, and presents a comprehensive analysis of speech emotion recognition using text and audio. The results show improved accuracy compared with using either audio or text alone. The results were obtained using a deep learning model, namely a long short-term memory (LSTM) network. The experimental analysis is carried out on the RAVDESS and SAVEE datasets, and the implementation is written in Python.

Volume 1917
DOI 10.1088/1742-6596/1917/1/012009
Language English
Journal Journal of Physics: Conference Series
