2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC) | 2019

Attention-Based Text Recognition in Image

Abstract


Scene text recognition has attracted considerable research interest in computer vision for decades because of its wide range of applications. However, it remains a challenging task because text appearance varies in perspective distortion, text-line curvature, text style, and font size. Almost all existing state-of-the-art methods adopt an attention-based encoder-decoder framework that uses an RNN as its main structure. Inspired by the outstanding performance in natural language processing of the transformer, which also adopts an encoder-decoder framework but discards the RNN unit, we develop a recognition network based on the transformer (RNBT). We also modify the loss function to mitigate the poor recognition performance of the encoder-decoder framework on images whose text is longer than that of the images in the training set. The whole network can be trained end-to-end using only images and image-level annotations. Extensive experiments on various public datasets, including CUTE80, SVT-Perspective, IIIT5K, SVT, and the ICDAR datasets, show that the proposed method achieves excellent performance on both regular and irregular datasets.

Pages 278-283
DOI 10.1109/DSC.2019.00049
Language English
