Publications


Featured research published by Mathias Seuret.


International Conference on Document Analysis and Recognition | 2015

Page segmentation of historical document images with convolutional autoencoders

Kai Chen; Mathias Seuret; Marcus Liwicki; Jean Hennebert; Rolf Ingold

In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we apply convolutional autoencoders to learn features directly from pixel intensity values. Then, using these features to train an SVM, we achieve high quality segmentation without any assumption of specific topologies and shapes. Experiments on three public datasets demonstrate the effectiveness and superiority of the proposed approach.
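
The feature-learning idea the abstract describes (learn features directly from pixel intensities, then classify) can be sketched in a few lines. This is a hypothetical, simplified illustration, not the paper's implementation: a single dense autoencoder with tied weights stands in for the convolutional autoencoder, the SVM stage is omitted, and the page is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic grayscale "page": a dark text-like block on light parchment.
page = rng.uniform(0.8, 1.0, (64, 64))
page[20:40, 10:50] = rng.uniform(0.0, 0.3, (20, 40))

def extract_patches(img, size=5, n=500):
    """Sample n square patches of side `size`, flattened to vectors."""
    h, w = img.shape
    ys = rng.integers(0, h - size, n)
    xs = rng.integers(0, w - size, n)
    return np.stack([img[y:y+size, x:x+size].ravel() for y, x in zip(ys, xs)])

X = extract_patches(page)          # (500, 25) raw pixel-intensity patches
X = X - X.mean(axis=0)             # center the data

# One-hidden-layer autoencoder with tied weights, trained by plain
# gradient descent on the reconstruction error (stand-in for a CAE).
d, k = X.shape[1], 8               # input dim, learned feature dim
W = rng.normal(0, 0.1, (d, k))
errors = []
for _ in range(200):
    H = np.tanh(X @ W)             # encode
    R = H @ W.T                    # decode with tied weights
    E = R - X
    errors.append(float((E ** 2).mean()))
    gZ = (E @ W) * (1 - H ** 2)    # gradient at the hidden pre-activation
    W -= 0.01 * (X.T @ gZ + E.T @ H) / len(X)  # encoder + decoder grads

features = np.tanh(X @ W)          # learned patch features, (500, 8)
print("error:", errors[0], "->", errors[-1])
```

In the paper, such learned features are used to train an SVM that labels each pixel as periphery, background, text block, or decoration.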


Document Analysis Systems | 2016

Page Segmentation for Historical Document Images Based on Superpixel Classification with Unsupervised Feature Learning

Kai Chen; Cheng-Lin Liu; Mathias Seuret; Marcus Liwicki; Jean Hennebert; Rolf Ingold

In this paper, we present an efficient page segmentation method for historical document images. Many existing methods either rely on hand-crafted features or perform rather slowly, as they treat the problem as a pixel-level assignment problem. In order to create a method feasible for real applications, we propose to use superpixels as basic units of segmentation, with features learned directly from pixels. An image is first oversegmented into superpixels with the simple linear iterative clustering (SLIC) algorithm. Then, each superpixel is represented by the features of its central pixel. The features are learned from pixel intensity values with stacked convolutional autoencoders in an unsupervised manner. A support vector machine (SVM) classifier is used to classify superpixels into four classes: periphery, background, text block, and decoration. Finally, the segmentation results are refined by a connected-component-based smoothing procedure. Experiments on three public datasets demonstrate that, compared to our previous method, the proposed method is much faster and achieves comparable segmentation results. Additionally, far fewer pixels are needed for classifier training.
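
A minimal sketch of the superpixel pipeline, with every component replaced by a dependency-free stand-in: a regular grid instead of SLIC, the mean intensity of a 3x3 window instead of CAE-learned features, and a threshold instead of the SVM. It illustrates only the structure (oversegment, represent each superpixel by its central pixel, classify):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic page: bright background with one dark text block.
page = rng.uniform(0.8, 1.0, (60, 60))
page[15:45, 10:50] = rng.uniform(0.0, 0.3, (30, 40))

def grid_superpixels(shape, cell=10):
    """Oversegment into square 'superpixels' on a regular grid
    (stand-in for SLIC)."""
    h, w = shape
    labels = np.zeros((h, w), dtype=int)
    n = 0
    for y in range(0, h, cell):
        for x in range(0, w, cell):
            labels[y:y + cell, x:x + cell] = n
            n += 1
    return labels, n

labels, n_sp = grid_superpixels(page.shape)

# Represent each superpixel by a feature of its central pixel: here the
# mean of a 3x3 window, standing in for learned autoencoder features.
feats = np.empty(n_sp)
for s in range(n_sp):
    ys, xs = np.nonzero(labels == s)
    cy, cx = int(ys.mean()), int(xs.mean())
    feats[s] = page[max(cy - 1, 0):cy + 2, max(cx - 1, 0):cx + 2].mean()

# Classify superpixels; a threshold stands in for the SVM.
pred = np.where(feats < 0.5, "text", "background")
print(dict(zip(*np.unique(pred, return_counts=True))))
```

The speed advantage over pixel-level labeling comes from classifying a few dozen superpixels here instead of 3600 individual pixels.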


Document Recognition and Retrieval | 2015

Ground Truth Model, Tool, and Dataset for Layout Analysis of Historical Documents

Kai Chen; Mathias Seuret; Hao Wei; Marcus Liwicki; Jean Hennebert; Rolf Ingold

In this paper, we propose a new dataset and a ground-truthing methodology for layout analysis of historical documents with complex layouts. The dataset is based on a generic model for ground-truth representation of the complex layout structure of historical documents. To extract document contents uniformly, our model defines five types of regions of interest: page, text block, text line, decoration, and comment. Unconstrained polygons are used to outline the regions. A performance metric is proposed in order to evaluate various page segmentation methods based on this model. We have analysed four state-of-the-art ground-truthing tools: TRUVIZ, GEDI, WebGT, and Aletheia. From this analysis, we conceptualized and developed Divadia, a new tool that overcomes some of the drawbacks of these tools, targeting simplicity and efficiency in the layout ground-truthing process for historical document images. With Divadia, we have created a new public dataset. This dataset contains 120 pages from three historical document image collections of different styles and is made freely available to the scientific community for historical document layout analysis research.


International Conference on Frontiers in Handwriting Recognition | 2016

DIVA-HisDB: A Precisely Annotated Large Dataset of Challenging Medieval Manuscripts

Foteini Simistira; Mathias Seuret; Nicole Eichenberger; Angelika Garz; Marcus Liwicki; Rolf Ingold

This paper introduces DIVA-HisDB, a publicly available historical manuscript database for the evaluation of several Document Image Analysis (DIA) tasks. The database consists of 150 annotated pages from three different medieval manuscripts with challenging layouts. Furthermore, we provide layout analysis ground truth that has been iterated on, reviewed, and refined by an expert in medieval studies. DIVA-HisDB and the ground truth can be used for training and evaluating DIA tasks such as layout analysis, text line segmentation, binarization, and writer identification. Layout analysis results of several representative baseline technologies are also presented in order to help researchers evaluate their methods and advance the frontiers of complex historical manuscript analysis. An optimized state-of-the-art Convolutional Auto-Encoder (CAE) performs with around 95% accuracy, demonstrating that for this challenging layout there is much room for improvement. Finally, we show that existing text line segmentation methods fail due to interlinear and marginal text elements.


Document Analysis Systems | 2016

Creating Ground Truth for Historical Manuscripts with Document Graphs and Scribbling Interaction

Angelika Garz; Mathias Seuret; Fotini Simistira; Andreas Fischer; Rolf Ingold

Ground truth is both indispensable for training and evaluating document analysis methods and very tedious to create manually. This especially holds true for complex historical manuscripts that exhibit challenging layouts with interfering and overlapping handwriting. In this paper, we propose a novel semi-automatic system to support layout annotation in such a scenario, based on document graphs and pen-based scribbling interaction. Document graphs provide a sparse page representation that is already close to the desired ground truth, while scribbling enables efficient and convenient pen-based interaction with the graph. The performance of the system is demonstrated in the context of a newly introduced database of historical manuscripts with complex layouts.


Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing | 2015

Selecting Autoencoder Features for Layout Analysis of Historical Documents

Hao Wei; Mathias Seuret; Kai Chen; Andreas Fischer; Marcus Liwicki; Rolf Ingold

Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase the classification accuracy and to reduce the feature dimension, in this paper we propose a novel feature selection method. The method cascades adapted versions of two conventional methods. Compared to three conventional methods and our previous work, the proposed method achieves a higher classification accuracy in most cases, while maintaining a low feature dimension. In addition, we find that a significant number of autoencoder features are redundant or irrelevant for the classification, and we offer explanations for this. To the best of our knowledge, this paper is one of the first investigations in the field of image processing into detecting redundancy and irrelevance of autoencoder features using feature selection.
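
The abstract does not name the two conventional selectors being cascaded; as an assumed illustration of such a two-stage cascade, the sketch below first drops near-constant (irrelevant) features and then highly correlated (redundant) ones:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "autoencoder features": 200 samples x 12 dims, where some dims
# duplicate others (redundant) and some are near-constant (irrelevant).
X = rng.normal(size=(200, 6))
X = np.hstack([X,
               X[:, :3] + 0.01 * rng.normal(size=(200, 3)),  # redundant copies
               1e-4 * rng.normal(size=(200, 3))])            # near-constant noise

def select_features(X, var_thresh=1e-3, corr_thresh=0.95):
    """Two-stage cascade: (1) drop features with negligible variance,
    (2) greedily drop features highly correlated with an earlier keeper."""
    keep = np.flatnonzero(X.var(axis=0) > var_thresh)
    Xv = X[:, keep]
    corr = np.abs(np.corrcoef(Xv, rowvar=False))
    selected = []
    for j in range(Xv.shape[1]):
        if all(corr[j, s] < corr_thresh for s in selected):
            selected.append(j)
    return keep[selected]

idx = select_features(X)
print(idx)  # indices of the surviving feature columns
```

On this toy data the cascade keeps only the six informative columns, discarding the three duplicates and the three noise dimensions.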


International Conference on Frontiers in Handwriting Recognition | 2016

N-Light-N: A Highly-Adaptable Java Library for Document Analysis with Convolutional Auto-Encoders and Related Architectures

Mathias Seuret; Rolf Ingold; Marcus Liwicki

This paper presents N-light-N, a novel, highly adaptable Java framework for working with deep neural networks, especially convolutional auto-encoders (CAEs). While the most popular deep learning libraries focus on fast processing and high performance, they implement only the mainstream network architectures and units. In recent research in the document domain, however, we have shown that modified networks, units, and training processes significantly improve performance in various tasks. To provide the document research community with such capabilities, we introduce in this paper a novel, publicly available deep learning framework that is easy to use, adapt, and extend. Furthermore, we present successful applications to three tasks, including two in the domain of handwritten historical documents, and show how the framework can be used for adaptation, optimization, and deeper analysis.


Digital Scholarship in the Humanities | 2017

The use of Gabor features for semi-automatically generated polygon-based ground truth of historical document images

Hao Wei; Mathias Seuret; Marcus Liwicki; Rolf Ingold

Historical documents usually have a complex layout, making them one of the most challenging types of documents for automatic image analysis. In the pipeline of automatic document image analysis (DIA), layout analysis is an important prerequisite for further steps including optical character recognition, script analysis, and image recognition. It aims at splitting a document image into regions of interest such as text lines, background, and decorations. To train a layout analysis system, an essential prerequisite is a set of pages with corresponding ground truth (GT), i.e. existing labels (e.g. text line and decoration) annotated by human experts. Although many methods and tools exist for GT generation, most of them are not suitable for our specific data sets. In this article, we propose to use Gabor features to generate GT, and based on Gabor features, we developed a web-based interface called DIVADIAWI. DIVADIAWI applies automatic functions using Gabor features to generate the GT of text lines. For other region types, such as background and decorations, users can manually draw their GT with user-friendly operations. The evaluation shows that (1) DIVADIAWI has two advantages over state-of-the-art tools, (2) the automatic functions of DIVADIAWI greatly accelerate GT generation, and (3) DIVADIAWI obtains a high score in a system usability test.
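
To see why Gabor features suit text-line detection, the sketch below builds a hypothetical NumPy-only Gabor kernel (parameters invented for illustration) and compares its response on stripe patterns: a filter whose carrier frequency and orientation match the line spacing responds far more strongly than one oriented along the lines.

```python
import numpy as np

def gabor_kernel(size=15, theta=0.0, lam=8.0, sigma=3.0, gamma=0.5):
    """Real part of a Gabor filter: a cosine carrier under a Gaussian
    envelope, mean-subtracted so flat regions give zero response."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    k = g * np.cos(2 * np.pi * xr / lam)
    return k - k.mean()

def convolve2d_valid(img, k):
    """Naive valid-mode 2-D correlation (keeps the sketch NumPy-only)."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

# Synthetic page: dark horizontal "text lines" repeating every 8 rows.
page = np.ones((64, 64))
for r in range(0, 64, 8):
    page[r:r + 3, :] = 0.2

# A kernel whose carrier runs across the lines (theta = pi/2) resonates
# with the 8-pixel line spacing; one oriented along them barely responds.
resp_h = convolve2d_valid(page, gabor_kernel(theta=np.pi / 2))
resp_v = convolve2d_valid(page, gabor_kernel(theta=0.0))
print(np.abs(resp_h).mean(), np.abs(resp_v).mean())
```

Thresholding such orientation-tuned responses is one way a tool in the spirit of DIVADIAWI can propose text-line regions automatically.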


International Conference on Document Analysis and Recognition | 2015

Gradient-domain degradations for improving historical document images layout analysis

Mathias Seuret; Kai Chen; Nicole Eichenberger; Marcus Liwicki; Rolf Ingold

We present a novel method for adding realistic degradations to historical document images in order to generate more training data. Degradation patches are extracted from other documents and applied to the target document in the gradient domain. Working in the gradient domain has not been done for this purpose in document image analysis so far. It has the advantage of preventing color inconsistencies and efficiently avoiding border effects. This paper contains a detailed description of our novel method, with a focus on the mathematical aspects of the transition to and from the gradient domain. Furthermore, we perform quantitative experiments in which we investigate the effects of using synthetically generated training data on historical documents with different kinds of degradations.
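
The core trick — transplanting a degradation's gradients rather than its pixel values, then integrating back — can be shown in one dimension, where the inverse transition from the gradient domain is just a cumulative sum (in 2-D the paper needs a Poisson-type reconstruction instead). All signals below are synthetic stand-ins:

```python
import numpy as np

target = np.linspace(0.9, 0.7, 100)  # smooth "parchment" intensity row
stain = 0.5 + 0.3 * np.sin(np.linspace(0, 3 * np.pi, 30))  # degradation patch

# Transition to the gradient domain.
g_target = np.diff(target)
g_stain = np.diff(stain)

# Transplant the degradation's *gradients* into the target's gradients.
g_mixed = g_target.copy()
g_mixed[40:40 + len(g_stain)] = g_stain

# Transition back, anchored at the target's first value. Because only
# gradients were copied, the patch adopts the local intensity level of
# its surroundings: no color inconsistency or hard seam at the border.
composite = np.concatenate([[target[0]], target[0] + np.cumsum(g_mixed)])

print(composite.shape, float(abs(composite - target).max()))
```

Pasting the stain's raw pixel values instead would produce a visible intensity jump at index 40; pasting its gradients leaves the signal continuous there.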


International Conference on Frontiers in Handwriting Recognition | 2014

Pixel Level Handwritten and Printed Content Discrimination in Scanned Documents

Mathias Seuret; Marcus Liwicki; Rolf Ingold

Classification of the content of a scanned document as either printed or handwritten is typically tackled as a segmentation problem of pages into text lines or words. However, these methods are not applicable to documents where handwritten annotations overlay printed text. In this paper we propose to treat the task as a pixel classification task, i.e., to classify individual foreground pixels as either printed or handwritten. Our method uses various features of diverse nature, taking the surrounding window into account. The influence of the features and their parameters is investigated and optimized on a validation set. Each foreground pixel is then classified by a multilayer perceptron using feature vectors based on a pixel neighborhood. Finally, a post-processing step corrects typical misclassifications, i.e., it removes outliers based on several heuristics. We evaluated our method on printed documents with real handwritten annotations and reached an accuracy of 96.10% on the test set. This is significantly higher than previously published methods based on local features.
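
A toy version of this pixel-level pipeline (window features around each foreground pixel, a multilayer perceptron, thresholded output). The page, the raw-window features, and the network sizes are all invented for illustration; the paper's engineered features and heuristic post-processing are omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic "scan": thin regular strokes stand in for print,
# one thick blob stands in for a handwritten annotation.
page = np.ones((40, 40))
page[5:25:4, 5:35] = 0.1           # thin "printed" strokes
page[28:36, 8:20] = 0.15           # thick "handwritten" blob

def window_features(img, y, x, r=2):
    """Flatten the (2r+1)x(2r+1) neighborhood around a foreground pixel."""
    pad = np.pad(img, r, mode="edge")
    return pad[y:y + 2*r + 1, x:x + 2*r + 1].ravel()

ys, xs = np.nonzero(page < 0.5)    # foreground pixels only
X = np.stack([window_features(page, y, x) for y, x in zip(ys, xs)])
labels = (ys >= 28).astype(float)  # 1 = handwritten, 0 = printed

# One-hidden-layer perceptron trained by gradient descent on the
# logistic loss; far smaller than anything a real system would use.
W1 = rng.normal(0, 0.3, (X.shape[1], 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.3, 8);               b2 = 0.0
lr = 1.0
for _ in range(500):
    H = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(H @ W2 + b2)))   # P(handwritten)
    g = (p - labels) / len(X)              # dLoss/dlogit
    gH = np.outer(g, W2) * (1 - H ** 2)
    W2 -= lr * (H.T @ g);  b2 -= lr * g.sum()
    W1 -= lr * (X.T @ gH); b1 -= lr * gH.sum(axis=0)

H = np.tanh(X @ W1 + b1)
p = 1 / (1 + np.exp(-(H @ W2 + b2)))
acc = ((p > 0.5) == labels.astype(bool)).mean()
print(f"training accuracy: {acc:.2f}")
```

Because classification is per-pixel rather than per-line or per-word, overlapping printed and handwritten strokes pose no structural problem for this formulation.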

Collaboration


Mathias Seuret's top co-authors.

Top Co-Authors

Rolf Ingold (University of Fribourg)
Kai Chen (University of Fribourg)
Hao Wei (University of Fribourg)