Publication


Featured research published by Kornél G. Markó.


International Conference on Computational Linguistics | 2004

Cognate mapping: a heuristic strategy for the semi-supervised acquisition of a Spanish lexicon from a Portuguese seed lexicon

Stefan Schulz; Kornél G. Markó; Eduardo Sbrissia; Percy Nohama; Udo Hahn

We deal with the automated acquisition of a Spanish medical subword lexicon from an already existing Portuguese seed lexicon. Using two non-parallel monolingual corpora we determined Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We validated the emergent lexical translation hypotheses by determining the similarity of fixed-window context vectors on the basis of Portuguese and Spanish text corpora.
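
The cognate-mapping step described above can be illustrated with a minimal sketch. It assumes a small set of Portuguese-to-Spanish substring rewrite rules and a Spanish corpus vocabulary given as a set of word forms. The rules in `RULES` and the function names are hypothetical and chosen only for illustration; they are not the rule set or implementation used in the paper, and the context-vector validation step is not shown.

```python
# Hypothetical Portuguese-to-Spanish substring rewrite rules (illustrative only,
# not the rule set used in the paper).
RULES = [("ção", "ción"), ("nh", "ñ"), ("ss", "s"), ("lh", "j")]


def cognate_candidates(pt_word, rules=RULES):
    """Return the Portuguese word plus every string obtained by applying
    any subset of the rewrite rules (each rule rewrites all occurrences)."""
    candidates = {pt_word}
    for src, tgt in rules:
        candidates |= {c.replace(src, tgt) for c in candidates}
    return candidates


def map_lexicon(pt_lexicon, es_vocabulary):
    """For each Portuguese seed entry, keep only the candidates that are
    attested in the Spanish corpus vocabulary."""
    return {
        pt: sorted(cognate_candidates(pt) & es_vocabulary)
        for pt in pt_lexicon
    }


if __name__ == "__main__":
    seed = ["mulher", "massa", "febre"]
    spanish_vocab = {"mujer", "masa", "fiebre"}
    print(map_lexicon(seed, spanish_vocab))
```

Entries such as "febre", whose Spanish counterpart is not reachable through the toy rules, come back with an empty candidate list. In the approach described in the abstract, the surviving candidates would be treated only as translation hypotheses until validated against corpus context.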


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2005

Bootstrapping dictionaries for cross-language information retrieval

Kornél G. Markó; Stefan Schulz; Olena Medelyan; Udo Hahn

The bottleneck for dictionary-based cross-language information retrieval is the lack of comprehensive dictionaries, in particular for many different languages. We introduce a methodology by which multilingual dictionaries (for Spanish and Swedish) emerge automatically from simple seed lexicons. These seed lexicons are automatically generated, by cognate mapping, from previously manually constructed Portuguese, German, and English sources. Lexical and semantic hypotheses are then validated, and new ones iteratively generated, by making use of co-occurrence patterns of hypothesized translation synonyms in parallel corpora. We evaluate these newly derived dictionaries on a large medical document collection within a cross-language retrieval setting.


Methods of Information in Medicine | 2010

Subword-based Semantic Retrieval of Clinical and Bibliographic Documents

Philipp Daumke; Stefan Schulz; Marcel Lucas Müller; W. Dzeyk; L. Prinzen; Edson José Pacheco; P. Secco Cancian; Percy Nohama; Kornél G. Markó

OBJECTIVES: The increasing amount of electronically available documents in bibliographic databases and in clinical documentation requires user-friendly techniques for content retrieval. METHODS: A domain-specific approach to semantic text indexing for document retrieval is presented. It is based on a subword thesaurus and maps the content of texts in different European languages to a common interlingual representation, which supports search across multilingual document collections. RESULTS: Three use cases are presented where the semantic retrieval method has been implemented: a bibliographic database, a department EHR system, and a consumer-oriented Web portal. CONCLUSIONS: A semantic indexing and retrieval approach, the performance of which had already been empirically assessed in prior studies, proved useful in different prototypical and routine scenarios and was well accepted by several user groups.


Discovery Science | 2005

Cross-language mining for acronyms and their completions from the web

Udo Hahn; Philipp Daumke; Stefan Schulz; Kornél G. Markó

We propose a method that aligns biomedical acronyms and their long-form definitions across different languages. We use a freely available search and extraction tool by which abbreviations, together with their fully expanded forms, are massively mined from the Web. In a subsequent step, language-specific variants, synonyms, and translations of the extracted acronym definitions are normalized by referring to a language-independent, shared semantic interlingua.


Cross-Language Evaluation Forum | 2006

MorphoSaurus in ImageCLEF 2006: the effect of subwords on biomedical IR

Philipp Daumke; Jan Paetzold; Kornél G. Markó

In the 2006 ImageCLEF Medical Image Retrieval task we evaluate the effects of deep morphological analysis for mono- and cross-lingual document retrieval in the biomedical domain. The morphological analysis is based on the MorphoSaurus system, in which subwords are introduced as morphologically meaningful word units. Subwords are organized in language-specific lexica that were partly manually and partly automatically generated and currently cover six European languages. They are linked together in a multilingual thesaurus. The use of subwords instead of full words significantly reduces the number of lexical entries that are needed to sufficiently cover a specific language and domain. A further benefit of the approach is its independence from the underlying retrieval system. We combined MorphoSaurus with the open-source search engine Lucene and achieved precision gains of up to 25% over the baseline for a monolingual setting and promising results in a multilingual scenario.
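
The idea of indexing by subwords linked through a multilingual thesaurus can be sketched roughly as follows. This is a toy illustration under stated assumptions: the lexicon entries, the class identifiers such as `#STOMACH`, and the longest-match segmentation with backtracking are all hypothetical and do not reproduce the MorphoSaurus lexica or its actual segmentation procedure.

```python
# Toy subword lexicon. Values are hypothetical interlingual class identifiers;
# None marks subwords (linking vowels, suffixes) treated here as carrying no
# indexable meaning. None of this is taken from the real MorphoSaurus lexica.
LEXICON = {
    "gastr": "#STOMACH",     # English/Latin stem
    "magen": "#STOMACH",     # German stem
    "enter": "#INTESTINE",
    "itis": "#INFLAMM",
    "entzuend": "#INFLAMM",  # German stem, transliterated without umlaut
    "o": None,
    "ung": None,
}


def segment(word, lexicon):
    """Split word into lexicon subwords, preferring longer matches and
    backtracking on dead ends. Returns None if no full segmentation exists."""
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = segment(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def to_interlingua(word, lexicon=LEXICON):
    """Map a word to its sequence of interlingual class identifiers,
    or None if the word cannot be segmented."""
    parts = segment(word.lower(), lexicon)
    if parts is None:
        return None
    return [lexicon[p] for p in parts if lexicon[p] is not None]


if __name__ == "__main__":
    print(to_interlingua("Gastritis"))
    print(to_interlingua("Magenentzuendung"))
```

Because both the English and the German term reduce to the same identifier sequence in this toy lexicon, a query in one language can match a document in the other, and any retrieval engine that indexes token sequences can be used on top of the normalized output.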


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2005

A CLIR interface to a web search engine

Philipp Daumke; Stefan Schulz; Kornél G. Markó

Medical document retrieval presents a unique combination of challenges for the design and implementation of retrieval engines. We introduce a method to meet these challenges by implementing a multilingual retrieval interface for biomedical content in the World Wide Web. To this end we developed an automated method for interlingual query construction by which a standard Web search engine is enabled to process non-English queries from the biomedical domain in order to retrieve English documents.


Methods of Information in Medicine | 2005

MorphoSaurus: Design and Evaluation of an Interlingua-based, Cross-language Document Retrieval Engine for the Medical Domain

Kornél G. Markó; Stefan Schulz; Udo Hahn


American Medical Informatics Association Annual Symposium | 2003

Cross-language MeSH indexing using morpho-semantic normalization

Kornél G. Markó; Philipp Daumke; Stefan Schulz; Udo Hahn


RIAO '04 Coupling approaches, coupling media and coupling languages for information retrieval | 2004

Interlingual indexing across different languages

Kornél G. Markó; Udo Hahn; Stefan Schulz; Philipp Daumke; Percy Nohama


BMC Medical Informatics and Decision Making | 2008

Formal representation of complex SNOMED CT expressions

Stefan Schulz; Kornél G. Markó; Boontawee Suntisrivaraporn

Collaboration


Dive into Kornél G. Markó's collaborations.

Top Co-Authors

Stefan Schulz

Medical University of Graz

Udo Hahn

University of Freiburg

Percy Nohama

Pontifícia Universidade Católica do Paraná

Edson José Pacheco

The Catholic University of America
