Stefan Klink | Researchain

Publication

Featured research published by Stefan Klink.

International Journal on Document Analysis and Recognition | 2001

Rule-based document structure understanding with a fuzzy combination of layout and textual features

Stefan Klink; Thomas Kieninger

Document image processing is a crucial process in office automation and begins at the ‘OCR’ phase with difficulties in document ‘analysis’ and ‘understanding’. This paper presents a hybrid and comprehensive approach to document structure analysis. Hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are the basis for potential conditions, which in turn are used to express fuzzy-matched rules of an underlying rule base. Rules can be formulated based on features which might be observed within one specific layout object. However, rules can also express dependencies between different layout objects. In addition to its rule-driven analysis, which allows an easy adaptation to specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common objects (e.g., lists).
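
To make the idea of fuzzy-matched rules over layout and textual features concrete, here is a minimal sketch. It is not the system described in the paper: the `Block` type, the three condition functions, their thresholds, and the use of `min` as the fuzzy AND are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Block:
    """A hypothetical layout object with one textual and two layout features."""
    text: str
    top: float        # relative vertical position, 0.0 = top edge of the page
    font_size: float  # in points


Condition = Callable[[Block], float]  # returns a degree of match in [0, 1]


def near_top(block: Block) -> float:
    # Layout feature: full match at the top edge, no match below 30% of the page.
    return max(0.0, 1.0 - block.top / 0.3)


def large_font(block: Block) -> float:
    # Layout feature: 10pt or less does not match, 18pt or more matches fully.
    return min(1.0, max(0.0, (block.font_size - 10.0) / 8.0))


def few_words(block: Block) -> float:
    # Textual feature: up to 12 words matches fully, then decays linearly.
    n = len(block.text.split())
    return 1.0 if n <= 12 else max(0.0, 1.0 - (n - 12) / 12.0)


# A tiny illustrative rule base: logical label -> conditions that must hold.
RULES: Dict[str, List[Condition]] = {"title": [near_top, large_font, few_words]}


def rule_score(block: Block, conditions: List[Condition]) -> float:
    # Assumed combination: fuzzy AND, taken as the minimum of all conditions.
    return min(cond(block) for cond in conditions)


def label(block: Block, rules: Dict[str, List[Condition]], threshold: float = 0.5) -> str:
    # Pick the best-scoring rule; fall back to "unknown" below the threshold.
    name, score = max(
        ((name, rule_score(block, conds)) for name, conds in rules.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= threshold else "unknown"
```

Because every condition returns a degree rather than a yes/no answer, a block that only roughly satisfies a rule can still be labeled. Rules that express dependencies between different layout objects would need conditions that take more than one block, which this sketch leaves out.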

Document Analysis Systems | 2002

Improving Document Retrieval by Automatic Query Expansion Using Collaborative Learning of Term-Based Concepts

Stefan Klink; Armin Hust; Markus Junker; Andreas Dengel

Query expansion methods have been studied for a long time, with debatable success in many instances. In this paper, a new approach is presented based on using term concepts learned from other queries. Two important issues with query expansion are addressed: the selection and the weighting of additional search terms. In contrast to other methods, the regarded query is expanded by adding those terms which are most similar to the concept of individual query terms, rather than selecting terms that are similar to the complete query or that are directly similar to the query terms. Experiments have shown that this kind of query expansion results in notable improvements in retrieval effectiveness, measured by recall/precision, in comparison to the standard vector space model and to pseudo relevance feedback. This approach can be used to improve the retrieval of documents in Digital Libraries, in Document Management Systems, on the WWW, etc.
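
The term-by-term expansion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how a term concept is built or how similarity is measured, so here a concept is simply the term counts of documents judged relevant to earlier queries containing that term. The names `learn_concepts` and `expand` and the parameters `k` and `alpha` are hypothetical.

```python
from collections import Counter, defaultdict


def learn_concepts(history):
    """Build one concept per query term from past queries.

    history: list of (query_terms, relevant_docs), each doc being a list of terms.
    Assumption: a term's concept is the summed term counts of all documents
    that were relevant to some earlier query containing that term.
    """
    concepts = defaultdict(Counter)
    for query_terms, relevant_docs in history:
        for term in set(query_terms):
            for doc in relevant_docs:
                concepts[term].update(doc)
    return concepts


def expand(query_terms, concepts, k=2, alpha=0.5):
    """Expand a query term by term and return a term -> weight mapping.

    Selection: for each query term, take the k strongest terms of its concept.
    Weighting: original terms get 1.0; added terms get at most alpha, scaled
    by their strength relative to the strongest candidate of that concept.
    """
    weights = {term: 1.0 for term in query_terms}
    for term in query_terms:
        concept = concepts.get(term, {})
        candidates = sorted(
            ((t, c) for t, c in concept.items() if t not in query_terms),
            key=lambda pair: (-pair[1], pair[0]),  # by count, then alphabetically
        )[:k]
        if not candidates:
            continue  # nothing learned for this term, leave it unexpanded
        top_count = candidates[0][1]
        for t, c in candidates:
            weights[t] = weights.get(t, 0.0) + alpha * c / top_count
    return weights
```

The point of the sketch is the selection step: added terms come from the concept of each individual query term, not from terms similar to the query as a whole. The expanded weight mapping could then be used as the query vector in a vector space model.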

European Conference on Machine Learning | 2002

Collaborative Learning of Term-Based Concepts for Automatic Query Expansion

Stefan Klink; Armin Hust; Markus Junker; Andreas Dengel

Information Retrieval Systems have been studied in Computer Science for decades. The traditional ad-hoc task is to find all documents relevant for an ad-hoc given query, but the accuracy of ad-hoc document retrieval systems has plateaued in recent years. At DFKI, we are working on so-called collaborative information retrieval (CIR) systems which unintrusively learn from their users' search processes. In this paper, a new approach is presented called term-based concept learning (TCL) which learns conceptual description terms occurring in known queries. A new query is expanded term by term using the previously learned concepts. Experiments have shown that TCL and its combination with pseudo relevance feedback result in notable improvements in retrieval effectiveness, measured by recall/precision, in comparison to the standard vector space model and to pseudo relevance feedback. This approach can be used to improve the retrieval of documents in Digital Libraries, in Document Management Systems, on the WWW, etc.

International Conference on Document Analysis and Recognition | 1999

MergeLayouts - overcoming faulty segmentations by a comprehensive voting of commercial OCR devices

Stefan Klink; Thorsten Jäger

In this paper we present a comprehensive voting approach, taking entire layouts obtained from commercial OCR devices as input. Such a layout comprises segments of three kinds: lines, words, and characters. By combining all attributes of a segment (e.g. recognized text, font height etc.), we attain a better layout, representing the original page layout as well as possible. The voting process itself is hierarchically organized, starting with the line segments. For each level, a search tree is spawned and all fellow segments (segments from different layouts which denote the same image area) are established. A heuristic search method is utilized which is guided by a similarity measure defined on segments. Deviations in the segmentation, as well as segmentation errors of individual commercial OCR devices, are compensated by an equalization module.
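
As a rough illustration of voting over fellow segments, the sketch below matches segments from several OCR layouts by bounding-box overlap and takes a majority vote on the recognized text. It is a deliberate simplification and not the method of the paper: greedy intersection-over-union matching stands in for the heuristic tree search and its similarity measure, only one level (lines) and one attribute (text) are handled, and there is no equalization module. The `Segment` type and the `min_iou` threshold are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass(frozen=True)
class Segment:
    box: Box
    text: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = width * height
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def vote(layouts: List[List[Segment]], min_iou: float = 0.5) -> List[Segment]:
    """Merge layouts by majority vote, using the first layout as the reference.

    For each reference segment, the best-overlapping segment of every other
    layout is treated as a fellow segment if the overlap reaches min_iou.
    Ties in the vote go to the text seen first, i.e. the reference layout.
    """
    reference, others = layouts[0], layouts[1:]
    merged = []
    for seg in reference:
        fellows = [seg]
        for layout in others:
            best = max(layout, key=lambda s: iou(seg.box, s.box), default=None)
            if best is not None and iou(seg.box, best.box) >= min_iou:
                fellows.append(best)
        text, _count = Counter(f.text for f in fellows).most_common(1)[0]
        merged.append(Segment(seg.box, text))
    return merged
```

A known limitation of this simplification is that segments missing from the reference layout are dropped, so it cannot repair a faulty segmentation in the reference itself.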

Document Analysis Systems | 2000

Document Structure Analysis Based on Layout and Textual Features

Stefan Klink; Andreas Dengel; Thomas Kieninger

Text Mining | 2003

Towards Collaborative Information Retrieval: Three Approaches.

Armin Hust; Stefan Klink; Markus Junker; Andreas Dengel

Archive | 2001

Query reformulation with collaborative concept-based expansion

Stefan Klink

Archive | 2002

Query Reformulation in Collaborative Information Retrieval

Armin Hust; Stefan Klink; Markus Junker; Andreas Dengel

Archive | 2002

TCL - An Approach for Learning Meanings of Queries in Information Retrieval Systems

Stefan Klink; Armin Hust; Markus Junker

Archive | 2002

Towards Collaborative Information Retrieval

Markus Junker; Armin Hust; Stefan Klink

Collaboration

Dive into Stefan Klink's collaboration.

Top Co-Authors

Andreas Dengel

German Research Centre for Artificial Intelligence

Roland John Burns

Hewlett-Packard
