Andreas Myka | Researchain

Publication

Featured research published by Andreas Myka.

European Conference on Principles of Data Mining and Knowledge Discovery | 1998

A New Algorithm for Faster Mining of Generalized Association Rules

Jochen Hipp; Andreas Myka; Rüdiger Wirth; Ulrich Güntzer

Generalized association rules are a very important extension of boolean association rules, but with current approaches mining generalized rules is computationally very expensive. Especially when considering the rule generation as being part of an interactive KDD-process this becomes annoying. In this paper we discuss strengths and weaknesses of known approaches to generate frequent itemsets. Based on the insights we derive a new algorithm, called Prutax, to mine generalized frequent itemsets. The basic ideas of the algorithm and further optimisation are described. Experiments with both synthetic and real-life data show that Prutax is an order of magnitude faster than previous approaches.
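
The abstract does not describe how Prutax works internally, so the following is not Prutax. It is only a minimal sketch of what "generalized frequent itemsets" means, using the naive approach of extending each transaction with the ancestors of its items before counting. The taxonomy, the item names, and the function names are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
}


def ancestors(item, taxonomy):
    """Return all ancestors of an item by following child -> parent links."""
    result = set()
    while item in taxonomy:
        item = taxonomy[item]
        result.add(item)
    return result


def extend(transaction, taxonomy):
    """Add every ancestor of every item to the transaction."""
    items = set(transaction)
    for item in transaction:
        items |= ancestors(item, taxonomy)
    return items


def frequent_generalized_itemsets(transactions, taxonomy, min_support, max_size=2):
    """Count itemsets over extended transactions (naive, exhaustive)."""
    counts = Counter()
    for transaction in transactions:
        items = extend(transaction, taxonomy)
        for size in range(1, max_size + 1):
            for candidate in combinations(sorted(items), size):
                # An itemset holding an item and one of its own ancestors is
                # redundant: it has the same support as the set without the ancestor.
                if any(a in candidate for i in candidate for a in ancestors(i, taxonomy)):
                    continue
                counts[candidate] += 1
    return {c: n for c, n in counts.items() if n >= min_support}


transactions = [["jacket", "shirt"], ["ski pants"], ["jacket"]]
result = frequent_generalized_itemsets(transactions, TAXONOMY, min_support=2)
```

Here "outerwear" becomes frequent even though "ski pants" alone is not, which is the kind of rule a taxonomy makes visible. Enumerating every candidate this way is exactly the cost the paper aims to reduce.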

ADL '95 Selected Papers from the Digital Libraries, Research and Technology Advances | 1995

Fuzzy Full-Text Searches in OCR Databases

Andreas Myka; Ulrich Güntzer

Though the quality of optical character recognition software is steadily improving, it is still far from being perfect. As a result, full-text databases that are filled by means of OCR software contain many errors. These errors have to be taken into consideration if such kind of databases are examined by means of full-text searches. In this chapter, we will illustrate some of the possible methods that, to a certain extent, cope with the uncertainty of the database entries. These methods add fuzziness to precisely formulated queries in order to increase their recall. In addition, the described methods are compared to the method of matching query terms exactly: the preliminary results of tests that show their effects on recall and precision are given.
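
The abstract does not name the specific fuzzy methods that were compared. As one possible illustration of adding fuzziness to an exact query, the sketch below accepts any word within a small edit distance of the query term. The function names, the sample documents, and the distance threshold are assumptions made for this example only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_search(query, documents, max_distance=1):
    """Return ids of documents containing a word within max_distance of the query."""
    query = query.lower()
    hits = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        if any(edit_distance(query, word) <= max_distance for word in words):
            hits.append(doc_id)
    return hits


# Hypothetical OCR output; "recogniton" stands in for a recognition error.
documents = {
    "d1": "character recogniton software",
    "d2": "full-text database",
    "d3": "pattern recognition",
}
exact = fuzzy_search("recognition", documents, max_distance=0)
fuzzy = fuzzy_search("recognition", documents, max_distance=1)
```

With `max_distance=0` only the cleanly recognized document matches; allowing one edit also retrieves the document with the OCR error. A larger threshold raises recall further but admits more false matches, which is the recall and precision trade-off the abstract refers to.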

Selected Papers from the Digital Libraries Workshop on Digital Libraries: Current Issues | 1994

Automatic Hypertext Conversion of Paper Document Collections

Andreas Myka; Ulrich Güntzer

Digital libraries should include all the enhanced search functionality that can be provided by using state-of-the-art electronic tools. With respect to this main goal, the support of intuitive searches by means of employing hypertextual features is important. In order to include these features into the browsing functionality also for raster image representations of documents, the underlying implicit and explicit hypertext structure of library objects has to be modelled and detected. This internal conversion of real library objects into hypertext objects has to be done automatically as far as possible in order to make it feasible at all. Yet, this conversion has to be flexible enough to cope with the whole range of library objects. In order to do so it has to use explicit information, such as words, phrases, paragraphs etc., as well as all the implicit information contained in fonts and layout. In this chapter we will therefore describe the automatic hypertext conversion of printed articles based on a description language for link types. The description language provides for a means of describing a whole set of links by means of indicating characteristics they have in common instead of specifying single links. The usage of a description language also hides problems of the optical character recognition at specification time. Furthermore, we describe how the quality of the newly generated web can be improved and how this web can be represented to the users.

International Conference on Document Analysis and Recognition | 1997

Measuring the effects of OCR errors on similarity linking

Andreas Myka; Ulrich Güntzer

The vector-space model offers an easy and robust model for Information Retrieval. Thereby, the similarities between queries and documents as well as the similarities between documents themselves are of importance. Document similarities may be used in order to generate links between documents that lead users from one document to related ones. Studies have shown that the vector-space model is robust in the context of OCR-processing if manually constructed queries are used. However, it is not clear whether this model, if used for hypertext construction, is robust with regard to data corruption as caused by OCR engines. In this paper, we describe the performance of automatic hypertext construction, based on the vector-space model, with regard to three different measures: the number of overtakings within the used rankings, the accumulated distance of a document's position within the rankings and a comparison based on recall-precision graphs.

Database and Expert Systems Applications | 1997

On automatic similarity linking in digital libraries

Andreas Myka; Ulrich Güntzer

Hypertext links are a powerful extension of standard information retrieval techniques based on query languages. However, the generation of links is often impractical due to large manual and/or computational effort. We analyze the effects of two main approaches that aim at a restriction of the necessary efforts: the direct use of OCR-processed documents instead of manually post-processed, i.e. corrected documents; and the use of shorter excerpts of documents instead of complete documents. For our tests, similarity links were computed based on the vector-space model; the links that are generated based on unmodified OCR documents and excerpts of documents are then compared to those links that are generated based on complete documents without OCR errors.
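
As a rough illustration of similarity linking with the vector-space model, the sketch below represents each document as a raw term-frequency vector and links every pair whose cosine similarity reaches a threshold. The term weighting, the threshold, and all names are assumptions for this example; the abstract does not specify the weighting or the link criterion that were actually used.

```python
import math
from collections import Counter


def term_vector(text):
    """Raw term-frequency vector (a simplification; no tf-idf weighting)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def similarity_links(documents, threshold=0.5):
    """Return (id_a, id_b, score) for every document pair at or above the threshold."""
    vectors = {doc_id: term_vector(text) for doc_id, text in documents.items()}
    ids = list(vectors)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score >= threshold:
                links.append((a, b, score))
    return links


# Hypothetical mini-collection.
documents = {
    "x": "hypertext link generation",
    "y": "hypertext link generation",
    "z": "association rules",
}
links = similarity_links(documents, threshold=0.9)
```

Running the same function on OCR output or on excerpts, and comparing the resulting links with those from clean full texts, mirrors the kind of comparison the abstract describes. The pairwise loop is quadratic in the number of documents, which hints at the computational effort the abstract mentions.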

International Conference on Document Analysis and Recognition | 1993

Using electronic facsimiles of documents for automatic reconstruction of underlying hypertext structures

Andreas Myka; Ulrich Güntzer

When looking for detailed pieces of information within the facsimiles of documents, a user has only a highly limited set of supportive tools at his disposal. As a solution to this problem, the use of automatically generated hypertext structures is proposed. These structures also include conventional mechanisms like table of contents, index, or full text search, but extend to the possibility of associative searches. The automatic generation of hypertext structures is based on two sources of information: the output of a commercial OCR system and a document type dependent specification file including the specifications for both structure elements and link types. Thus, additional information hidden in layout and typography is taken into account in addition to the plain ASCII representation of the document. The browsing problems that may arise from a hypertext's nonlinearity do not appear, because the user also has access to the document in its original linear fashion.

International Conference on Systems | 1992

Monitoring user actions in the hypertext system “HyperMan”

Andreas Myka; Ulrich Güntzer; Frank Sarre

The hypertext system “HyperMan” provides for an automatic conversion of linear machine-readable documents into hypertexts by applying text partitioning and link generation methods. After completion of the generation process, the graphical user interface of the system enables users to browse through the converted documents very easily. To determine whether user actions allow conclusions to be drawn about a generated hypertext, a special component that records user actions has been integrated into the system. In this way sequences of actions can be identified that provide hints of relationships between two document passages or between two terms that occur in the text. Then, the relationships can be stored as links or thesaurus entries, respectively, in the system's database and can be made available to subsequent users. In addition to acquiring relationships, the user observation component also provides for hints about the acceptance of system components. These hints can serve as a basis for further development of the system.

ACM SIGWEB Newsletter | 1995

HyperFacs-building and using a digitized paper library

Andreas Myka; Ulrich Güntzer

Though the number of electronically available text documents is steadily increasing, paper is still the most common medium. With regard to older documents, there is a huge heritage of material that cannot be converted manually into an adequate electronic form. With regard to texts that are created today, paper is still the primary publishing medium because people - either as authors or as readers - are more accustomed to it.

Database and Expert Systems Applications | 1992

Hypertext for Software Engineering: Automatic Conversion of Source Code and its Documentation into an Integrated Hypertext

Frank Sarre; Andreas Myka; Ulrich Güntzer

Without the right tools it will be increasingly difficult for software engineers to manage the extensive program sources and related documentation material of large software systems. In this paper, we propose to convert program sources, inline documentation and additional technical papers of a released software version into hypertext and to store them in a common database in order to facilitate integrated management. In particular, the goal is to make explicit the numerous relationships between program and text passages, but also within the programs or within the inline and the separately kept documentation, in order that the links generated during this process can be used for the maintenance of the software system.

Objektbanken für Experten | 1992

Kooperative Zugangssysteme zu Objektbanken (Cooperative Access Systems for Object Bases)

Ulrich Güntzer; Rudolf Bayer; Frank Sarre; J. Werner; Andreas Myka

This article presents cooperative access systems for object bases that, in addition to the access methods themselves, include tools for object modelling and for the automatic generation of the object bases and of further auxiliary databases that support cooperation. The information retrieval management system Tumis enables the design and administration of information retrieval systems for large sets of objects; besides a very efficient extended Boolean retrieval interface, it includes, among other things, components for completing search queries in the sense presumably intended by the users and for interpreting natural-language queries. The hypertext system Hyperman converts arbitrary ASCII and LaTeX texts into hypertexts. A browsing component supports exploring the content of the underlying documents and navigating within them by means of full-text search and by following various kinds of links, some of which can be learned from and manipulated by the user community. A suitable coupling of the two systems makes it possible to select a few objects from a large number with Tumis and then to analyze them in greater depth with Hyperman.
