Angelika Garz
Vienna University of Technology
Publication
Featured research published by Angelika Garz.
document analysis systems | 2012
Angelika Garz; Andreas Fischer; Robert Sablatnig; Horst Bunke
Segmenting page images into text lines is a crucial pre-processing step for automated reading of historical documents. Challenging issues in this open research field include, for example, paper or parchment background noise, ink bleed-through, artifacts due to aging, stains, and touching text lines. In this paper, we present a novel binarization-free line segmentation method that is robust to noise and copes with overlapping and touching text lines. First, interest points representing parts of characters are extracted from gray-scale images. Next, word clusters are identified in high-density regions and touching components such as ascenders and descenders are separated using seam carving. Finally, text lines are generated by concatenating neighboring word clusters, where neighborhood is defined by the prevailing orientation of the words in the document. An experimental evaluation on the Latin manuscript images of the Saint Gall database shows promising results for real-world applications in terms of both accuracy and efficiency.
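The seam carving step mentioned in this abstract can be illustrated with a generic minimum-energy seam search. This is only a minimal sketch of the textbook dynamic-programming formulation, not the authors' implementation: the function name `min_horizontal_seam`, the energy grid, and the rule that a seam moves at most one row per column are all assumptions made for illustration.

```python
def min_horizontal_seam(energy):
    """Return, per column, the row index of the cheapest left-to-right seam.

    `energy` is a list of rows of numbers; in this sketch, high values are
    assumed to mark ink and low values background (hypothetical convention).
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest cumulative energy of any seam ending at (r, c)
    cost = [[0.0] * cols for _ in range(rows)]
    # back[r][c] = row of the predecessor pixel in column c - 1
    back = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        cost[r][0] = energy[r][0]
    for c in range(1, cols):
        for r in range(rows):
            # A seam may move at most one row up or down per column.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Backtrack from the cheapest end point in the last column.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        seam.append(r)
    seam.reverse()
    return seam


# Toy 3x3 energy grid: the cheap path dips to the bottom row in the middle.
toy = [[9, 9, 9],
       [1, 9, 1],
       [9, 1, 9]]
print(min_horizontal_seam(toy))
```

A seam found this way follows low-energy pixels between two touching components, so cutting along it would separate them without a prior binarization step.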
international conference on document analysis and recognition | 2011
Angelika Garz; Robert Sablatnig; Markus Diem
We propose a layout analysis method for historical manuscripts that relies on the part-based identification of layout entities. A layout entity -- such as letters of the text, initials or headings -- is composed of a set of characteristic segments or structures, which is dissimilar for distinct classes in the manuscripts under consideration. This fact is exploited in order to segment a manuscript page into homogeneous regions. Historical documents traditionally involve challenges such as uneven writing support and varying shapes of characters, fluctuating text lines, changing scripts and writing styles, and variance in the layout itself. Hence, a part-based detection of layout entities is proposed using a multi-stage algorithm for the localization of the entities, based on interest points. Results show that the proposed method is able to locate initials, headings and text areas in ancient manuscripts containing stains, tears and partially faded-out ink sufficiently well.
international conference on frontiers in handwriting recognition | 2010
Angelika Garz; Markus Diem; Robert Sablatnig
An approach for the detection of decorative elements, such as initials and headlines, and text regions in ancient manuscripts is presented. Due to their age, ancient manuscripts suffer from degradation, staining, and ink that has faded over time. Identifying decorative elements and text regions allows indexing a manuscript and serves as input for Optical Character Recognition (OCR), as it localizes regions of interest within document pages. We propose a robust method inspired by state-of-the-art object recognition methodologies. Scale Invariant Feature Transform (SIFT) descriptors are chosen to detect the regions of interest, and the scale of the interest points is used for localization. The classification is based on the fact that local properties of the decorative elements differ from those of regular text. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate of decorative elements is not as high as for regular text but already yields promising results.
international conference on frontiers in handwriting recognition | 2016
Foteini Simistira; Mathias Seuret; Nicole Eichenberger; Angelika Garz; Marcus Liwicki; Rolf Ingold
This paper introduces a publicly available historical manuscript database DIVA-HisDB for the evaluation of several Document Image Analysis (DIA) tasks. The database consists of 150 annotated pages of three different medieval manuscripts with challenging layouts. Furthermore, we provide a layout analysis ground-truth which has been iterated on, reviewed, and refined by an expert in medieval studies. DIVA-HisDB and the ground truth can be used for training and evaluating DIA tasks, such as layout analysis, text line segmentation, binarization and writer identification. Layout analysis results of several representative baseline technologies are also presented in order to help researchers evaluate their methods and advance the frontiers of complex historical manuscripts analysis. An optimized state-of-the-art Convolutional Auto-Encoder (CAE) performs with around 95% accuracy, demonstrating that for this challenging layout there is much room for improvement. Finally, we show that existing text line segmentation methods fail due to interlinear and marginal text elements.
document analysis systems | 2014
Andreas Fischer; Micheal Baechler; Angelika Garz; Marcus Liwicki; Rolf Ingold
Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.
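The abstract does not define the proposed evaluation measure, so the following is only a sketch of the standard edit-distance alignment that underlies word-error-rate style metrics, which likewise count substitution, deletion, and insertion errors. The function name `count_edit_errors` and the toy sequences are hypothetical.

```python
def count_edit_errors(reference, hypothesis):
    """Return (substitutions, deletions, insertions) of a minimum-cost alignment."""
    n, m = len(reference), len(hypothesis)
    # table[i][j] = (total, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)   # everything in the reference was deleted
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)   # everything in the hypothesis was inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = table[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, k)            # match, no error
            else:
                diagonal = (t + 1, s + 1, d, k)    # substitution
            t, s, d, k = table[i - 1][j]
            deletion = (t + 1, s, d + 1, k)
            t, s, d, k = table[i][j - 1]
            insertion = (t + 1, s, d, k + 1)
            # Keep the option with the fewest total errors.
            table[i][j] = min(diagonal, deletion, insertion, key=lambda e: e[0])
    return table[n][m][1:]


# One substituted item ("b" -> "x") and one inserted item ("d").
print(count_edit_errors(["a", "b", "c"], ["a", "x", "c", "d"]))
```

Counting all three error types with one alignment is what lets a single score be compared across tasks; how the paper maps extracted text lines onto such a sequence is not stated in the abstract.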
virtual systems and multimedia | 2010
Angelika Garz; Robert Sablatnig
Text recognition in ancient documents poses specific challenges such as degradation and staining, faded ink, fluctuating text lines, superimposed text elements, and varying layouts. To cope with those challenges, a texture-based approach is proposed, which exploits the fact that different kinds of textures have distinct orientation distributions. The orientation information is extracted using the Auto-Correlation Function (ACF). The approach is applied to three different manuscripts, namely to Glagolitic manuscripts of the 11th century, a Latin and a composite Latin-German manuscript, both originating from the 14th century. The evaluation is based on manually labeled ground truth and shows the accuracy of the features chosen even when the method is applied to document pages that differ in writing style and line spacing from those in the training set.
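To make the role of the ACF concrete, here is a minimal sketch of a normalized 2D autocorrelation evaluated at a single shift. It is not the paper's feature extraction: the function name `autocorrelation`, the cyclic boundary handling, and the striped toy patch are assumptions chosen to keep the example small.

```python
def autocorrelation(patch, dy, dx):
    """Normalized autocorrelation of a 2D patch at shift (dy, dx).

    Uses cyclic (wrap-around) boundaries, an assumption of this sketch.
    """
    rows, cols = len(patch), len(patch[0])
    mean = sum(map(sum, patch)) / (rows * cols)
    numerator = 0.0
    denominator = 0.0
    for y in range(rows):
        for x in range(cols):
            a = patch[y][x] - mean
            b = patch[(y + dy) % rows][(x + dx) % cols] - mean
            numerator += a * b
            denominator += a * a
    # A constant patch has zero variance; report zero correlation for it.
    return numerator / denominator if denominator else 0.0


# Horizontal stripes, loosely resembling alternating text lines and gaps.
stripes = [[1, 1, 1, 1],
           [0, 0, 0, 0],
           [1, 1, 1, 1],
           [0, 0, 0, 0]]
print(autocorrelation(stripes, 0, 1))  # shift along the stripes
print(autocorrelation(stripes, 1, 0))  # shift across the stripes
```

The correlation stays high for shifts along the stripes and drops for shifts across them, which is the kind of directional signature an orientation distribution can be built from.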
international symposium on visual computing | 2010
Angelika Garz; Markus Diem; Robert Sablatnig
This paper presents a technique for layout analysis of historical document images based on local descriptors. The considered layout elements are regions of regular text and elements having a decorative meaning such as headlines and initials. The proposed technique exploits the differences in the local properties of the layout elements. For this purpose, an approach drawing its inspiration from state-of-the-art object recognition methodologies, namely Scale Invariant Feature Transform (SIFT) descriptors, is proposed. The scale of the interest points is used for localization. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate of decorative elements is not as high as for regular text but already yields promising results.
Archive | 2011
Jana Machajdik; Allan Hanbury; Angelika Garz; Robert Sablatnig
document recognition and retrieval | 2016
Angelika Garz; Marcel Würsch; Andreas Fischer; Rolf Ingold
european signal processing conference | 2011
Angelika Garz; Robert Sablatnig; Markus Diem