IEEE Transactions on Affective Computing | 2021
Predicting Emotionally Salient Regions Using Qualitative Agreement of Deep Neural Network Regressors
Abstract
Automatic emotion recognition plays a crucial role in various fields such as healthcare, human-computer interaction (HCI), and security and defense. While most previous studies have focused on the recognition of emotion in isolated utterances, a more natural approach is to continuously track emotions during human interaction, identifying regions that are highly emotional. This study proposes a framework to define emotionally salient regions (hotspots), which we then attempt to detect dynamically. Our proposed approach defines hotspots relying on the qualitative agreement (QA) method, which searches for trends across continuous-time evaluations provided by different raters for arousal and valence. We illustrate the benefits of the QA method over averaging absolute values of the traces without considering trends across evaluators. After defining hotspot regions, we propose a deep learning framework to automatically detect these emotional hotspots. The proposed method relies on an ensemble of bidirectional long short-term memory (BLSTM) regressors, each trained on the individual emotional traces provided by one evaluator, whose outputs are combined to automatically detect emotional hotspots. An appealing fusion approach to combine these regressors is to rely again on the QA method, which detects emotionally salient regions with F1-scores as high as 60.9 percent for arousal and 50.4 percent for valence on the RECOLA dataset.
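As an illustration of the idea of searching for trends across raters rather than averaging absolute values, the following is a minimal sketch and not the implementation used in this study. It assumes a simplified variant of qualitative agreement that compares only consecutive windows of each trace. The function names (`window_means`, `trends`, `agreed_trends`) and the parameters `win`, `tol`, and `min_agree` are hypothetical choices made for this example.

```python
from typing import List, Optional, Sequence


def window_means(trace: Sequence[float], win: int) -> List[float]:
    """Average a trace over consecutive non-overlapping windows of `win` samples."""
    return [sum(trace[i:i + win]) / win
            for i in range(0, len(trace) - win + 1, win)]


def trends(means: Sequence[float], tol: float) -> List[int]:
    """Label each window-to-window change as +1 (rise), -1 (fall), or 0 (flat)."""
    labels = []
    for prev, cur in zip(means, means[1:]):
        diff = cur - prev
        labels.append(1 if diff > tol else -1 if diff < -tol else 0)
    return labels


def agreed_trends(traces: Sequence[Sequence[float]], win: int,
                  tol: float, min_agree: float) -> List[Optional[int]]:
    """Keep a trend label only where at least a `min_agree` share of raters give it.

    Assumption for this sketch: agreement is a simple vote over consecutive-window
    trends; positions without enough agreement are marked None.
    """
    per_rater = [trends(window_means(t, win), tol) for t in traces]
    consensus: List[Optional[int]] = []
    for labels in zip(*per_rater):
        best = max((1, -1, 0), key=labels.count)  # most frequent label
        share = labels.count(best) / len(labels)
        consensus.append(best if share >= min_agree else None)
    return consensus


# Three hypothetical raters who differ in absolute level but mostly share trends.
example_traces = [
    [0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.1, 0.1],
    [0.2, 0.2, 0.8, 0.8, 0.7, 0.7, 0.3, 0.3],
    [-0.1, -0.1, 0.3, 0.3, 0.6, 0.6, 0.0, 0.0],
]
print(agreed_trends(example_traces, win=2, tol=0.05, min_agree=2 / 3))
```

In this toy input the raters disagree on absolute values throughout, yet all three report a rise at the first transition and a fall at the last one, while the middle transition has no consensus. Working on trends keeps that shared information, which an average of the raw values would blur.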