2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI) | 2019
Self-Learned Feature Reconstruction and Offset-Dilated Feature Fusion for Real-Time Semantic Segmentation
Abstract
Recent approaches to real-time semantic segmentation usually employ an encoder-decoder architecture as the backbone to generate high-quality segmentation predictions. Much research has focused on designing efficient encoding methods. However, improving the components of the decoder is also crucial for pixel-level recognition. In this paper, we propose a self-learned feature reconstruction (SFR) method and an offset-dilated feature fusion (ODFF) module to improve the prediction reconstruction capability of the decoder. Concretely, SFR reconstructs high-resolution feature maps by recombining the feature space: the space transformation matrix implicitly contained in a convolution layer selectively highlights features at each position by leveraging knowledge of the label space in a self-learned way. Moreover, the ODFF module fuses multilevel features with multiscale contextual information by feeding the feature maps into parallel offset-dilated convolutions, which enhances the feature representation capability of the decoder. Experiments on the Cityscapes and CamVid datasets demonstrate the superior performance of the proposed methods when embedded in ESPNet.
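
The abstract does not give the exact formulation of SFR, so the following is only an illustrative sketch of one common way to reconstruct a higher-resolution feature map by recombining the feature space: a per-pixel linear projection (equivalent to a 1x1 convolution) followed by a depth-to-space rearrangement. The function names `depth_to_space` and `reconstruct`, the upscaling factor `r`, the tensor shapes, and the random weights are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def depth_to_space(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Each group of r*r channels at a low-resolution position is spread
    over an r x r block of high-resolution positions.
    """
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

def reconstruct(features, weight, r):
    """Hypothetical reconstruction step (assumed, not the paper's method).

    features: (C_in, H, W) low-resolution feature map
    weight:   (C_out*r*r, C_in) matrix acting as a 1x1 convolution
    returns:  (C_out, H*r, W*r) upsampled feature map
    """
    # A 1x1 convolution is a linear map applied independently at each pixel.
    projected = np.einsum("oc,chw->ohw", weight, features)
    return depth_to_space(projected, r)

# Toy example with random data; in practice the weights would be learned.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 6))      # C_in=8, H=4, W=6
weight = rng.standard_normal((3 * 2 * 2, 8))   # C_out=3, r=2
out = reconstruct(features, weight, r=2)
print(out.shape)
```

In this sketch the projection matrix plays the role of a learned transformation from the low-resolution feature space to a block of high-resolution outputs, so the upsampling is trained end to end rather than fixed as in bilinear interpolation.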