2020 25th International Conference on Pattern Recognition (ICPR) | 2021

Sample-aware Data Augmentor for Scene Text Recognition

 
 
 
 
 
 
 

Abstract


Deep neural networks (DNNs) have been widely used in scene text recognition, and achieved remarkable performance. Such DNN-based scene text recognizers usually require plenty of training data for training, but data collection and annotation is usually cost-expensive in practice. To alleviate this issue, data augmentation is often applied to train the scene text recognizers. However, existing data augmentation methods including affine transformation and elastic transformation methods suffer from the problems of under- and over-diversity, due to the complexity of text contents and shapes. In this paper, we propose a sample-aware data augmentor to transform samples adaptively based on the contents of samples. Specifically, our data augmentor consists of three parts: gated module, affine transformation module, and elastic transformation module. In our data augmentor, affine transformation module focuses on keeping the affinity of samples, while elastic transformation module aims to improve the diversity of samples. With the gated module, our data augmentor determines transformation type adaptively based on the properties of training samples and the recognizer capability during the training process. Besides, our framework introduces an adversarial learning strategy to optimize the augmentor and the recognizer jointly. Extensive experiments on scene text recognition benchmarks show that our sample-aware data augmentor significantly improves the performance of state-of-the-art scene text recognizer.

Volume None
Pages 3978-3985
DOI 10.1109/ICPR48806.2021.9412536
Language English
Journal 2020 25th International Conference on Pattern Recognition (ICPR)

Full Text