IEEE Transactions on Multimedia | 2021

LCSegNet: An Efficient Semantic Segmentation Network for Large-Scale Complex Chinese Character Recognition

Abstract

Complex scene character recognition is a challenging yet important task in machine learning, especially for languages with large character sets, such as Chinese, which is composed of hieroglyphics with large-scale categories and similar glyphs. Recently, state-of-the-art methods based on semantic segmentation have achieved great success in scene parsing and have been applied in scene text recognition. However, because of limitations in terms of memory and computation, they are only applied in the small category recognition tasks, such as tasks involving English alphabets and digits. In this paper, we propose an efficient semantic segmentation model based on label coding (LC), called LCSegNet, to recognize large-scale Chinese characters. First, to reduce the number of labels, we design a new label coding method based on the Wubi Chinese characters code, called Wubi-CRF. In this method, glyphs and structure information of Chinese characters are encoded into 140-bit labels. Second, we employ an efficient semantic segmentation model for pixel-wise prediction and utilize a conditional random field (CRF) module to learn the constraint rules of Wubi-like coding. Finally, experiments are conducted on three benchmarks: a large Chinese text dataset in the wild (CTW), ICDAR2019-ReCTS, and HIT-OR3C dataset. Results show that the proposed method achieves state-of-the-art performances in both complex scene and handwritten character recognition tasks.

Volume 23

IEEE Transactions on Multimedia | 2021

LCSegNet: An Efficient Semantic Segmentation Network for Large-Scale Complex Chinese Character Recognition

Abstract

Volume 23

Pages 3427-3440

DOI 10.1109/tmm.2020.3025696

Language English

Journal IEEE Transactions on Multimedia

Full Text