ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | 2021

Aggregation Architecture and all-to-one Network for Real-Time Semantic Segmentation

Abstract


Deep convolutional neural networks have demonstrated outstanding performance in image semantic segmentation. However, the enormous computational complexity of existing high-precision networks limits their use in real-time segmentation tasks, so achieving a good trade-off between accuracy and speed remains a challenge. Existing solutions can be roughly divided into three categories according to network architecture: dilation, encoder-decoder, and multi-pathway, each of which has its advantages. In this paper, we make the following contributions: (i) Unlike the previous three architectures, we propose a new aggregation architecture as the network backbone. (ii) A multi-level auxiliary loss design is used in the training phase, which improves the model's segmentation quality. (iii) Based on this aggregation structure, we propose an all-to-one network (ATONet) for real-time semantic segmentation, which achieves a good trade-off between speed and accuracy by assembling the features of all blocks. (iv) The proposed network achieves 74.4% and 70.1% mIoU at inference speeds of 42.7 FPS and 93.5 FPS on the Cityscapes and CamVid datasets, respectively.
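
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of the standard mean intersection-over-union (mIoU) computation on flattened label maps. It is an illustration only: the function name `mean_iou`, its parameters, and the `ignore_index` convention are assumptions made here, and the abstract does not describe the authors' exact evaluation protocol.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: flat sequences of integer class labels of equal length.
    ignore_index: hypothetical label value to skip (e.g. unlabeled pixels).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels excluded from evaluation
        if p == t:
            # correctly labeled pixel: counts toward both intersection and union
            intersection[t] += 1
            union[t] += 1
        else:
            # mislabeled pixel: enlarges the union of both classes involved
            union[p] += 1
            union[t] += 1
    # average only over classes that occur at least once
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # toy example with two classes and four pixels
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is their average. Real benchmarks accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per image.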

Pages 2330-2334
DOI 10.1109/ICASSP39728.2021.9413454
Language English
Journal ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
