2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM) | 2019

Scene Text Detection Via Cascade FPN and Channel Enhancement

 
 

Abstract


The detection and recognition of scene text in the wild have attracted considerable attention, driven by the rapid development of convolutional neural networks. However, because scene text in the real world varies widely in scale and shape, most existing state-of-the-art algorithms have shortcomings. On the one hand, methods based on object detection require quadrangular bounding boxes and therefore cannot detect text of arbitrary shape. On the other hand, segmentation-based methods fail to model the inter-dependency among features and channels, so they may fall short on text of different scales. The Progressive Scale Expansion Network (PSENet), which can precisely detect text instances of arbitrary shape, is a simple and useful baseline. Building on PSENet, we add further aggregation across the spatial and channel dimensions of the feature map, making it possible to fully use features from different scales and channels to handle arbitrary-shaped and multi-scale text instances. To accomplish this, we introduce the Path Aggregation Module (PAM) and the Feature Selective Module (FSM). In the PAM, we implement a cascade FPN structure and a shortcut procedure with global information. In the FSM, we borrow the idea from SENet. Extensive experiments on Total-Text, ICDAR 2015, and ICDAR 2017 MLT validate the effectiveness of our method. On ICDAR 2017 MLT, our best F-measure (75.3%) outperforms PSENet by 3.1%.
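
The abstract states only that the FSM borrows the idea from SENet and gives no further detail. To illustrate that general idea, the following is a minimal sketch of a generic squeeze-and-excitation channel gate in plain Python. It is not the authors' FSM: the function name `se_gate`, the weight matrices `w1` and `w2`, and the omission of bias terms are all assumptions made for illustration.

```python
import math


def se_gate(feature_map, w1, w2):
    """Generic squeeze-and-excitation gate (illustrative sketch only).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: hypothetical reduction weights, shape (C // r) x C.
    w2: hypothetical expansion weights, shape C x (C // r).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected expansion followed by a sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of size 2 x 2, with a reduction to 1 hidden unit.
toy_map = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
toy_scaled, toy_gates = se_gate(toy_map, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
```

In this sketch each channel is reweighted by a gate computed from global statistics of all channels, which is the kind of channel inter-dependency the abstract says segmentation-based methods tend to neglect.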

Pages 155-161
DOI 10.1109/BigMM.2019.00-30
Language English
Journal 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM)
