2021 IEEE Region 10 Symposium (TENSYMP) | 2021

Multilingual Text Inversion Detection using Shape Context

 
 

Abstract


Aberrations are common in manually scanned text documents. Document inversion is among the most frequent of these anomalies, yet it is hard to detect efficiently. Moreover, an algorithm that detects text inversion in one language may not work for another. Deep learning can offer a language-agnostic solution, but such models are not the most efficient. In this paper, we present an inversion detection algorithm based on shape context, a mathematical descriptor that uses log-polar histograms to encode relative shape information. Furthermore, to localize text blocks inside images, we propose an efficient text bounding box algorithm. The end-to-end pipeline can localize text and detect inversion in multilingual text documents. Experiments show a speed improvement of around 17.5x over a standard deep learning model, with near 100% accuracy on the test dataset.
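
As an illustration of the descriptor named in the abstract, the following is a minimal sketch of a shape context computation in Python with NumPy. It is not the authors' implementation: the function name `shape_context`, the bin counts (5 radial, 12 angular), the radial range, and the normalization by mean pairwise distance are all assumptions chosen for illustration. For each point, the sketch counts how many other points fall into each cell of a log-polar grid centred on that point.

```python
import numpy as np

def shape_context(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return an (N, n_r * n_theta) array of log-polar histograms, one per point.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # diff[i, j] is the vector from point i to point j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])

    # Normalize distances by the mean pairwise distance (scale invariance).
    off_diagonal = ~np.eye(n, dtype=bool)
    dist = dist / dist[off_diagonal].mean()

    # Angles in [0, 2*pi), measured from the positive x-axis.
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Radial bin edges are spaced uniformly in log(r): the "log-polar" part.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r + 1)
    r_bin = np.digitize(dist, r_edges) - 1  # -1 or n_r means outside the grid
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    hist = np.zeros((n, n_r, n_theta), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and 0 <= r_bin[i, j] < n_r:
                hist[i, r_bin[i, j], t_bin[i, j]] += 1
    return hist.reshape(n, -1)

# Usage: descriptors for a small point set and for its 180-degree rotation.
sample_points = np.array([[0.0, 0.0], [1.0, 0.3], [0.2, 1.1], [-0.7, 0.5]])
upright = shape_context(sample_points)
inverted = shape_context(-sample_points)  # negating coordinates rotates by 180 degrees
```

Because the angular bins in this sketch are not made rotation invariant, rotating the point set by 180 degrees shifts every histogram by half of the angular bins. This suggests why such a descriptor can separate upright from inverted shapes, although the abstract does not say how the paper's pipeline makes that decision.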

Pages 1-6
DOI 10.1109/TENSYMP52854.2021.9550858
Language English
Journal 2021 IEEE Region 10 Symposium (TENSYMP)
