Publication


Featured research published by Michael Poprat.


cross language evaluation forum | 2013

Entity Recognition in Parallel Multi-lingual Biomedical Corpora: The CLEF-ER Laboratory Overview

Dietrich Rebholz-Schuhmann; Simon Clematide; Fabio Rinaldi; Senay Kafkas; Erik M. van Mulligen; Chinh Bui; Johannes Hellrich; Ian Lewin; David Milward; Michael Poprat; Antonio Jimeno-Yepes; Udo Hahn; Jan A. Kors

The identification and normalisation of biomedical entities in the scientific literature has a long tradition, and a number of challenges have contributed to the development of reliable solutions. Increasingly, patient records are processed to align their content with other biomedical data resources, but this approach requires analysing documents in different languages across Europe [1,2]. The CLEF-ER challenge was organized by the Mantra project partners to improve entity recognition (ER) in multilingual documents. Several corpora in different languages, i.e. Medline titles, EMEA documents and patent claims, were prepared to enable ER in parallel documents. The participants were asked to annotate entity mentions with concept unique identifiers (CUIs) in the documents of their preferred non-English language. The evaluation determines the number of correctly identified entity mentions against a silver standard (Task A) and the performance measures for the identification of CUIs in the non-English corpora (Task B). The participants could make use of the prepared terminological resources for entity normalisation and of the English silver standard corpora (SSCs) as input for concept candidates in the non-English documents. The participants used different approaches, including translation techniques and word or phrase alignments, apart from lexical lookup and other text mining techniques. Performance on Tasks A and B was lower for the patent corpus than for Medline titles and EMEA documents. In the patent documents, chemical entities were identified with higher performance, whereas the other two document types cover a higher proportion of medical terms. The number of novel terms provided from all corpora is currently under investigation. Altogether, the CLEF-ER challenge demonstrates the performance of annotation solutions in different languages against an SSC.


SETQA-NLP '08 Software Engineering, Testing, and Quality Assurance for Natural Language Processing | 2008

Building a BioWordNet by using WordNet's data formats and WordNet's software infrastructure: a failure story

Michael Poprat; Elena Beisswanger; Udo Hahn

In this paper, we describe our efforts to build on WordNet resources, using WordNet lexical data, the data format that it comes with and WordNet's software infrastructure in order to generate a biomedical extension of WordNet, the BioWordNet. We began our efforts on the assumption that the software resources were stable and reliable. In the course of our work, it turned out that this belief was far too optimistic. We discuss the stumbling blocks that we encountered, point out an error in the WordNet software with implications for research based on it, and conclude that building on the legacy of WordNet data structures and its associated software might preclude sustainable extensions that go beyond the domain of general English.


Medical Informatics and The Internet in Medicine | 2007

Biomedical information retrieval across languages

Philipp Daumke; Kornél Markó; Michael Poprat; Stefan Schulz; Rüdiger Klar

This work presents a new dictionary-based approach to biomedical cross-language information retrieval (CLIR) that addresses many of the general and domain-specific challenges in current CLIR research. Our method is based on a multilingual lexicon that was generated partly manually and partly automatically, and currently covers six European languages. It contains morphologically meaningful word fragments, termed subwords. Using subwords instead of entire words significantly reduces the number of lexical entries necessary to sufficiently cover a specific language and domain. Mediation between queries and documents is based on these subwords as well as on lists of word-n-grams that are generated from large monolingual corpora and constitute possible translation units. The translations are then sent to a standard Internet search engine. This process makes our approach an effective tool for searching the biomedical content of the World Wide Web in different languages. We evaluate this approach using the OHSUMED corpus, a large medical document collection, within a cross-language retrieval setting.


meeting of the association for computational linguistics | 2007

Quantitative Data on Referring Expressions in Biomedical Abstracts

Michael Poprat; Udo Hahn

We report on an empirical study that deals with the quantity of different kinds of referring expressions in biomedical abstracts.


medical informatics europe | 2006

A language classifier that automatically divides medical documents for experts and health care consumers.

Michael Poprat; Kornél G. Markó; Udo Hahn


language resources and evaluation | 2008

Semantic Annotations for Biology: a Corpus Development Initiative at the Jena University Language & Information Engineering (JULIE) Lab.

Udo Hahn; Elena Beisswanger; Ekaterina Buyko; Michael Poprat; Katrin Tomanek; Joachim Wermter


language resources and evaluation | 2006

Language Specific and Topic Focused Web Crawling.

Olena Medelyan; Stefan Schulz; Jan Paetzold; Michael Poprat; Kornél G. Markó


CLEF (Working Notes) | 2013

Multilingual semantic resources and parallel corpora in the biomedical domain: The CLEF-ER challenge

Dietrich Rebholz-Schuhmann; Simon Clematide; Fabio Rinaldi; Senay Kafkas; Erik M. van Mulligen; Chinh Bui; Johannes Hellrich; Ian Lewin; David Milward; Michael Poprat; Antonio Jimeno-Yepes; Udo Hahn; Jan A. Kors


Software Engineering, Testing, and Quality Assurance for Natural Language Processing | 2008

Building a BioWordNet Using WordNet Data Structures and WordNet's Software Infrastructure--A Failure Story

Michael Poprat; Elena Beisswanger; Udo Hahn


RIAO '04 Coupling approaches, coupling media and coupling languages for information retrieval | 2004

Crossing languages in text retrieval via an interlingua

Udo Hahn; Kornél G. Markó; Michael Poprat; Stefan Schulz; Joachim Wermter; Percy Nohama

Collaboration


Dive into Michael Poprat's collaborations.

Top Co-Authors

Udo Hahn

University of Freiburg

Stefan Schulz

Medical University of Graz

David Milward

St John's Innovation Centre

Ian Lewin

European Bioinformatics Institute
