Appl. Intell. | 2021

EmNet: a deep integrated convolutional neural network for facial emotion recognition in the wild


Abstract


In the past decade, facial emotion recognition (FER) research has made tremendous progress, leading to novel convolutional neural network (CNN) architectures for the automatic recognition of facial emotions in static images. Although these networks achieve good recognition accuracy, they incur high computational cost and memory utilization. These issues restrict their deployment in real-world applications, which require FER systems to run in real time on resource-constrained embedded devices. To alleviate these issues and to develop a robust and efficient method for recognizing facial emotions in the wild with real-time performance, this paper presents a novel deep integrated CNN model, named EmNet (Emotion Network). EmNet consists of two structurally similar DCNN models and their integrated variant, jointly optimized using a joint-optimization technique. For a given facial image, EmNet produces three predictions, which are fused using two fusion schemes, namely average fusion and weighted maximum fusion, to obtain the final decision. To test the efficiency of the proposed FER pipeline on a resource-constrained embedded platform, we optimized the EmNet model and the face detector with the TensorRT SDK and deployed the complete FER pipeline on an Nvidia Xavier device. With 4.80M parameters and a 19.3 MB model size, the proposed EmNet attains a notable improvement in accuracy over the current state of the art, together with a multi-fold improvement in computational efficiency.
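The abstract does not spell out the fusion formulas, so the sketch below is only a minimal illustration of how three class-probability vectors could be combined: average fusion as an element-wise mean, and a weighted maximum fusion in which each head's output is weighted by its peak confidence (the weighting rule here is an assumption, not the paper's definition; the 7-class probability vectors are hypothetical).

import numpy as np

def average_fusion(preds):
    # Average fusion: element-wise mean of the class-probability
    # vectors from the three EmNet heads, then arg-max.
    fused = np.mean(preds, axis=0)
    return int(np.argmax(fused)), fused

def weighted_maximum_fusion(preds):
    # Illustrative weighted maximum fusion (assumed formulation):
    # weight each head's probability vector by its peak confidence,
    # sum the weighted vectors, renormalise, then arg-max.
    preds = np.asarray(preds)
    weights = preds.max(axis=1, keepdims=True)   # peak confidence per head
    fused = (weights * preds).sum(axis=0)
    fused /= fused.sum()                         # renormalise to a distribution
    return int(np.argmax(fused)), fused

# Hypothetical softmax outputs of the two DCNNs and their integrated
# variant for a 7-class FER problem.
p1 = np.array([0.05, 0.10, 0.60, 0.05, 0.05, 0.10, 0.05])
p2 = np.array([0.10, 0.05, 0.55, 0.10, 0.05, 0.10, 0.05])
p3 = np.array([0.05, 0.05, 0.70, 0.05, 0.05, 0.05, 0.05])

print(average_fusion([p1, p2, p3]))
print(weighted_maximum_fusion([p1, p2, p3]))

Both schemes agree on the predicted class in this toy case; they would differ mainly when the three heads disagree and have markedly different confidence levels.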

Volume 51
Pages 5543-5570
DOI 10.1007/S10489-020-02125-0
Language English
Journal Appl. Intell.
