Journal of Physics: Conference Series | 2021

A Comprehensive Analysis of Multimodal Speech Emotion Recognition

Abstract

Emotion recognition is critical in everyday interpersonal interactions. Understanding a person's emotions through their speech can greatly help in shaping social interactions. With the rapid development of social media, single-modal emotion recognition struggles to meet the demands of current emotion recognition systems. This paper proposes a multimodal emotion recognition model that combines speech and text to improve the performance of the emotion recognition system, and presents a comprehensive analysis of speech emotion recognition using text and audio. The results show improved accuracy compared with using either audio or text alone. The results were obtained with a deep learning model, namely a long short-term memory (LSTM) network. The experimental analysis was carried out on the RAVDESS and SAVEE datasets, and the implementation was done in Python.

Volume 1917

DOI 10.1088/1742-6596/1917/1/012009

Language English

Journal Journal of Physics: Conference Series
