Angelika Garz

Vienna University of Technology

Publication

Featured research published by Angelika Garz.

document analysis systems | 2012

Binarization-Free Text Line Segmentation for Historical Documents Based on Interest Point Clustering

Angelika Garz; Andreas Fischer; Robert Sablatnig; Horst Bunke

Segmenting page images into text lines is a crucial pre-processing step for automated reading of historical documents. Challenges in this open research field include paper or parchment background noise, ink bleed-through, artifacts due to aging, stains, and touching text lines. In this paper, we present a novel binarization-free line segmentation method that is robust to noise and copes with overlapping and touching text lines. First, interest points representing parts of characters are extracted from gray-scale images. Next, word clusters are identified in high-density regions and touching components such as ascenders and descenders are separated using seam carving. Finally, text lines are generated by concatenating neighboring word clusters, where neighborhood is defined by the prevailing orientation of the words in the document. An experimental evaluation on the Latin manuscript images of the Saint Gall database shows promising results for real-world applications in terms of both accuracy and efficiency.
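
The seam-carving step mentioned in the abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the energy map, the function name `min_horizontal_seam`, and the rule that a seam moves at most one row per column are assumptions made for illustration. The idea is that a low-energy path running left to right avoids ink and can therefore separate two touching text lines.

```python
def min_horizontal_seam(energy):
    """Return, for each column, the row of the cheapest left-to-right seam.

    `energy` is a list of rows; high values stand for ink (hypothetical input).
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c]: minimal cumulative energy of any seam ending at (r, c)
    cost = [[energy[r][0]] + [0] * (cols - 1) for r in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        for r in range(rows):
            # Assumption: the seam may shift by at most one row per column.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Backtrack from the cheapest end point in the last column.
    r = min(range(rows), key=lambda p: cost[p][cols - 1])
    seam = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        seam.append(r)
    seam.reverse()
    return seam


# Toy energy map: zeros mark an ink-free path between two lines.
toy_energy = [
    [5, 5, 5, 5],
    [0, 5, 5, 5],
    [5, 0, 0, 5],
    [5, 5, 5, 0],
    [5, 5, 5, 5],
]
print(min_horizontal_seam(toy_energy))
```

On a real page the energy would be derived from the gray-scale image, for example from gradient magnitude or darkness, which is again an assumption rather than a detail stated in the abstract.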

international conference on document analysis and recognition | 2011

Layout Analysis for Historical Manuscripts Using SIFT Features

Angelika Garz; Robert Sablatnig; Markus Diem

We propose a layout analysis method for historical manuscripts that relies on the part-based identification of layout entities. A layout entity -- such as letters of the text, initials or headings -- is composed of a set of characteristic segments or structures, which is dissimilar for distinct classes in the manuscripts under consideration. This fact is exploited in order to segment a manuscript page into homogeneous regions. Historical documents traditionally involve challenges such as uneven writing support and varying shapes of characters, fluctuating text lines, changing scripts and writing styles, and variance in the layout itself. Hence, a part-based detection of layout entities is proposed using a multi-stage algorithm for the localization of the entities, based on interest points. Results show that the proposed method is able to locate initials, headings and text areas in ancient manuscripts containing stains, tears and partially faded-out ink sufficiently well.

international conference on frontiers in handwriting recognition | 2010

Detecting Text Areas and Decorative Elements in Ancient Manuscripts

Angelika Garz; Markus Diem; Robert Sablatnig

An approach for the detection of decorative elements – such as initials and headlines – and text regions, focused on ancient manuscripts, is presented. Due to their age, ancient manuscripts suffer from degradation, staining, and faded ink. Identifying decorative elements and text regions allows indexing a manuscript and serves as input for Optical Character Recognition (OCR) as it localizes regions of interest within document pages. We propose a robust method inspired by state-of-the-art object recognition methodologies. Scale Invariant Feature Transform (SIFT) descriptors are chosen to detect the regions of interest, and the scale of the interest points is used for localization. The classification is based on the fact that local properties of the decorative elements are different to those of regular text. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate of decorative elements is not as high as for regular text but already yields promising results.

international conference on frontiers in handwriting recognition | 2016

DIVA-HisDB: A Precisely Annotated Large Dataset of Challenging Medieval Manuscripts

Foteini Simistira; Mathias Seuret; Nicole Eichenberger; Angelika Garz; Marcus Liwicki; Rolf Ingold

This paper introduces a publicly available historical manuscript database DIVA-HisDB for the evaluation of several Document Image Analysis (DIA) tasks. The database consists of 150 annotated pages of three different medieval manuscripts with challenging layouts. Furthermore, we provide a layout analysis ground-truth which has been iterated on, reviewed, and refined by an expert in medieval studies. DIVA-HisDB and the ground truth can be used for training and evaluating DIA tasks, such as layout analysis, text line segmentation, binarization and writer identification. Layout analysis results of several representative baseline technologies are also presented in order to help researchers evaluate their methods and advance the frontiers of complex historical manuscripts analysis. An optimized state-of-the-art Convolutional Auto-Encoder (CAE) performs with around 95% accuracy, demonstrating that for this challenging layout there is much room for improvement. Finally, we show that existing text line segmentation methods fail due to interlinear and marginal text elements.

document analysis systems | 2014

A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents

Andreas Fischer; Micheal Baechler; Angelika Garz; Marcus Liwicki; Rolf Ingold

Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.
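
An evaluation that accounts for substitution, deletion, and insertion errors is commonly built on an edit-distance alignment. The sketch below shows the familiar accuracy (N - S - D - I) / N over token sequences; it is not the paper's text line extraction measure, whose exact definition is not given in the abstract, and the function names are hypothetical.

```python
def edit_operations(reference, hypothesis):
    """Count (substitutions, deletions, insertions) in a minimal alignment."""
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = (total, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    dist = [[None] * (m + 1) for _ in range(n + 1)]
    dist[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        dist[i][0] = (i, 0, i, 0)  # delete every reference token
    for j in range(1, m + 1):
        dist[0][j] = (j, 0, 0, j)  # insert every hypothesis token
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = dist[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                match = (t, s, d, k)
            else:
                match = (t + 1, s + 1, d, k)
            t, s, d, k = dist[i - 1][j]
            delete = (t + 1, s, d + 1, k)
            t, s, d, k = dist[i][j - 1]
            insert = (t + 1, s, d, k + 1)
            # Tuples compare by total error count first.
            dist[i][j] = min(match, delete, insert)
    _, subs, dels, ins = dist[n][m]
    return subs, dels, ins


def accuracy(reference, hypothesis):
    """Accuracy (N - S - D - I) / N; can be negative with many insertions."""
    subs, dels, ins = edit_operations(reference, hypothesis)
    return (len(reference) - subs - dels - ins) / len(reference)


print(edit_operations(["a", "b", "c", "d"], ["a", "x", "c"]))
print(accuracy(["a", "b", "c", "d"], ["a", "x", "c"]))
```

The same alignment could be applied to extracted text lines instead of words if a matching criterion between a ground-truth line and a detected line is defined, which is left open here.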

virtual systems and multimedia | 2010

Multi-scale texture-based text recognition in ancient manuscripts

Angelika Garz; Robert Sablatnig

Text recognition in ancient documents poses specific challenges such as degradation and staining, fading out of ink, fluctuating text lines, superimposing of text-elements or varying layouts, amongst others. To cope with those challenges, a texture-based approach is proposed, which exploits the fact that different kinds of textures have distinct orientation distributions. The orientation information is extracted using the Auto-Correlation Function (ACF). The approach is applied to three different manuscripts, namely to Glagolitic manuscripts of the 11th century, a Latin and a composite Latin-German manuscript, both originating from the 14th century. The evaluation is based on manually labeled ground truth and shows the accuracy of the features chosen even when the method is applied to document pages that are different in writing style and line spacing to those in the training set.
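
To make the role of the Auto-Correlation Function more concrete, the following sketch evaluates a mean-centred autocorrelation of a small gray-scale patch at a given shift, using the direct definition rather than a fast Fourier-based computation. The patch, the function name `autocorrelation`, and the normalisation by the number of overlapping pixels are assumptions for illustration; the features actually derived in the paper are not described in the abstract.

```python
def autocorrelation(patch, dy, dx):
    """Mean-centred autocorrelation of a 2-D patch at shift (dy, dx)."""
    rows, cols = len(patch), len(patch[0])
    mean = sum(map(sum, patch)) / (rows * cols)
    total, count = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            # Only pixel pairs that both lie inside the patch contribute.
            if 0 <= y2 < rows and 0 <= x2 < cols:
                total += (patch[y][x] - mean) * (patch[y2][x2] - mean)
                count += 1
    return total / count if count else 0.0


# Horizontal stripes as a crude stand-in for text lines.
stripes = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
along = autocorrelation(stripes, 0, 1)   # shift along the stripes
across = autocorrelation(stripes, 1, 0)  # shift across the stripes
print(along > across)
```

A texture with a dominant horizontal orientation correlates strongly with itself under horizontal shifts and weakly or negatively under vertical ones, which is the kind of orientation cue the abstract refers to.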

international symposium on visual computing | 2010

Local descriptors for document layout analysis

Angelika Garz; Markus Diem; Robert Sablatnig

This paper presents a technique for layout analysis of historical document images based on local descriptors. The considered layout elements are regions of regular text and elements having a decorative meaning such as headlines and initials. The proposed technique exploits the differences in the local properties of the layout elements. For this purpose, an approach drawing its inspiration from state-of-the-art object recognition methodologies - namely Scale Invariant Feature Transform (SIFT) descriptors - is proposed. The scale of the interest points is used for localization. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate of decorative elements is not as high as for regular text but already yields promising results.

Archive | 2011