2021 International Conference on Artificial Intelligence (ICAI)
Multimodal Emotion Recognition using Deep Convolution and Recurrent Network
Abstract
Automatic human emotion recognition is one of the most important and fastest-growing fields of research in the Human-Computer Interaction (HCI) domain. It has a large impact on applications such as automatic human behaviour analysis and multimedia retrieval systems. Recently, Deep Neural Networks have achieved considerable success in terms of accuracy on machine learning tasks. Automatic human emotion recognition has also been addressed using deep-learning-based Convolutional Neural Networks (CNNs). Currently, most deep-learning-based algorithms for human emotion recognition focus on a single modality, such as vision, text, or audio. These algorithms are trained on one specific modality (visual, textual, or acoustic data) and perform well in controlled environments, but fail to achieve good results in many real-life cases, owing to the unpredictable nature of human behaviour. To tackle this problem, a novel deep-learning-based multimodal architecture is proposed in this paper. The algorithm utilizes visual, textual, and acoustic features jointly to enhance the accuracy of automatic human emotion recognition. Extensive experiments have been performed to improve the accuracy and transparency of the proposed work. The results show that the proposed method compares favourably with state-of-the-art results.
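The abstract describes combining visual, textual, and acoustic features in a single model. The paper does not specify the fusion mechanism here, so the following is only a minimal illustrative sketch of one common approach (late fusion by feature concatenation followed by a linear emotion classifier); the feature dimensions, the seven emotion classes, and the random "extracted" features are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over class logits
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)

# Stand-ins for per-modality feature extractors (hypothetical sizes):
# in the paper these would come from a CNN (vision) and recurrent
# networks (text / audio), not from random sampling.
visual_feat = rng.normal(size=128)   # e.g. CNN features from a face crop
text_feat = rng.normal(size=64)      # e.g. recurrent-network sentence embedding
audio_feat = rng.normal(size=32)     # e.g. recurrent-network acoustic embedding

# Late fusion: concatenate modality features into one vector
fused = np.concatenate([visual_feat, text_feat, audio_feat])  # shape (224,)

# Linear classifier over 7 basic emotion classes (assumed label set)
W = rng.normal(size=(7, fused.size)) * 0.01
probs = softmax(W @ fused)
predicted_class = int(probs.argmax())
```

In practice the fusion layer and classifier would be trained end to end with the modality encoders; the concatenation step above just shows where the three feature streams meet.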