2021 16th International Conference on Computer Science & Education (ICCSE) | 2021

A Connected Components Based Layout Analysis Approach for Educational Documents


Abstract


Layout analysis, which aims to detect and categorize areas of interest in document images, is an increasingly important part of document image processing. Existing research has addressed layout analysis for many kinds of documents, but no approach has been proposed for documents produced in teaching, such as exam papers and workbooks, which are worth studying. In this paper, we propose a novel layout analysis system that performs two tasks: for workbook pages, it segments text and non-text areas; for exam papers, it extracts regions of interest. Our system is based on connected component (CC) analysis. Specifically, it extracts geometric features and spatial information of CCs to recognize page elements. We carried out experiments on images collected from real-world scenarios, and the promising results confirm the applicability and effectiveness of our system.
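The idea of extracting geometric features from CCs can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-connectivity choice, the feature set (bounding box, area, aspect ratio, density), the function names `connected_components` and `classify`, and the height threshold are all assumptions made for illustration.

```python
from collections import deque


def connected_components(binary):
    """Label 8-connected foreground pixels (value 1) in a binary grid
    and return one feature dictionary per component, in scan order."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, bottom, left, right, area = r, r, c, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            height, width = bottom - top + 1, right - left + 1
            components.append({
                "bbox": (top, left, bottom, right),  # inclusive pixel coordinates
                "area": area,                        # number of foreground pixels
                "aspect_ratio": width / height,
                "density": area / (width * height),  # fill ratio of the bounding box
            })
    return components


def classify(component, max_text_height=3):
    """Hypothetical rule: components no taller than max_text_height are
    treated as text candidates, everything else as non-text."""
    top, _, bottom, _ = component["bbox"]
    return "text" if bottom - top + 1 <= max_text_height else "non-text"
```

In practice a system of this kind would presumably combine several such features with the spatial relations between neighbouring CCs rather than rely on a single threshold; the single-feature rule here only shows where such a decision would plug in.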

Pages 875-880
DOI 10.1109/ICCSE51940.2021.9569699
Language English
Journal 2021 16th International Conference on Computer Science & Education (ICCSE)
