2019 International Conference on Document Analysis and Recognition Workshops (ICDARW) | 2019

Semantic Text Recognition via Visual Question Answering


Abstract


Scene text recognition has long been an important challenge in visual understanding. However, when reasoning is involved, generally through the inclusion of Visual Question Answering (VQA) mechanisms, current models fail to recognize the required textual data. This joint task has not received the attention it requires, owing to the lack of databases targeting it and to the additional challenges that the separate tasks of text recognition and VQA each impose. In this paper, we present a complete methodology to address this task, designing strategies to process the different components involved in the system. For our experiments, we use the data provided by the ICDAR Robust Reading Challenge on Scene Text Visual Question Answering, which gives us knowledge and insights for future work.

Volume 5
Pages 97-102
DOI 10.1109/ICDARW.2019.40088
Language English
Journal 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)
