Marek Medveď
Masaryk University
Publications
Featured research published by Marek Medveď.
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers | 2016
Marek Medveď; Vojtěch Kovář; Miloš Jakubíček
In this paper we present our approach to the Bilingual Document Alignment Task (WMT16), where the main goal was to reach the best recall in extracting aligned pages from the provided data. Our approach consists of three main parts: data preprocessing, keyword extraction, and scoring of text pairs based on keyword matching. For text preprocessing we use the TreeTagger pipeline, which contains the Unitok tool (Michelfeit et al., 2014) for tokenization and the TreeTagger morphological analyzer (Schmid, 1994). After extracting keywords from the texts according to TF-IDF scoring, our system searches for comparable English-French pairs. Using a statistical dictionary created from a large English-French parallel corpus, the system is able to find comparable documents. Finally, this procedure is combined with the baseline algorithm and the best one-to-one pairing is selected. The result reaches 91.6% recall on the provided training data. After a deep error analysis (see Section 5), the recall reached 97.4%.
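The keyword-matching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothed TF-IDF formula, the greedy pairing strategy, the toy documents and the hand-written dictionary are all assumptions made for illustration, and both the TreeTagger preprocessing and the combination with the baseline algorithm are left out.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=2):
    """Return the top_k TF-IDF terms per document (docs: name -> token list)."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    keywords = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        # Smoothed IDF (an assumption; the paper does not specify the variant).
        scores = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        }
        # Break score ties alphabetically so the result is deterministic.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords[name] = set(ranked[:top_k])
    return keywords


def align(en_docs, fr_docs, dictionary, top_k=2):
    """Pair English and French documents by translated keyword overlap."""
    en_kw = tfidf_keywords(en_docs, top_k)
    fr_kw = tfidf_keywords(fr_docs, top_k)
    candidates = []
    for en_name, en_terms in en_kw.items():
        translated = set()
        for term in en_terms:
            translated |= dictionary.get(term, set())
        for fr_name, fr_terms in fr_kw.items():
            score = len(translated & fr_terms)
            if score > 0:
                candidates.append((score, en_name, fr_name))
    # Greedy one-to-one pairing: best-scoring candidates are taken first.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    pairs, used_fr = {}, set()
    for _score, en_name, fr_name in candidates:
        if en_name not in pairs and fr_name not in used_fr:
            pairs[en_name] = fr_name
            used_fr.add(fr_name)
    return pairs


# Toy data (hypothetical): tokenized pages and a tiny hand-made dictionary
# standing in for the statistical dictionary mentioned in the abstract.
en_docs = {"en1": ["cat", "cat", "house", "the"], "en2": ["car", "road", "the", "car"]}
fr_docs = {"fr1": ["voiture", "route", "la", "voiture"], "fr2": ["chat", "maison", "la", "chat"]}
dictionary = {"cat": {"chat"}, "house": {"maison"}, "car": {"voiture"}, "road": {"route"}}

pairs = align(en_docs, fr_docs, dictionary)
print(pairs)
```

Greedy selection is only one simple way to obtain a one-to-one pairing; the abstract does not say which strategy was used.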
International Conference on Agents and Artificial Intelligence | 2018
Marek Medveď; Aleš Horák
The Automatic Question Answering (AQA) system is a representative of open-domain QA systems, in which the answer selection process relies on syntactic and semantic similarities between the question and the answering text snippets. Such an approach is specifically oriented to languages with fine-grained syntactic and morphological features that help to guide the correct QA match. In this paper, we present the latest results of the AQA system with a new implementation of word embedding criteria. All AQA processing steps (question processing, answer selection and answer extraction) are syntax-based, with advanced scoring obtained by a combination of several similarity criteria (TF-IDF, tree distance, ...). Adding the word embedding parameters helped to resolve the QA match in cases where the answer is expressed by semantically near equivalents. We describe the design and implementation of the whole QA process and provide a new evaluation of the AQA system with the word embedding criteria, measured on an expanded version of the Simple Question-Answering Database (SQAD), with more than 3000 question-answer pairs extracted from the Czech Wikipedia.
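A small sketch can show why an embedding criterion helps when the answer uses a near equivalent of the question's wording. This is not the AQA implementation: the function names, the weights, the two-dimensional toy vectors and the use of token overlap as a stand-in for the TF-IDF criterion are all assumptions, and the tree-distance criterion is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(tokens, embeddings):
    """Average the embeddings of known tokens; None if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def score(question, candidate, embeddings, weights):
    """Weighted sum of a lexical criterion and an embedding criterion."""
    q, c = set(question), set(candidate)
    overlap = len(q & c) / len(q | c) if q | c else 0.0
    q_vec = mean_vector(question, embeddings)
    c_vec = mean_vector(candidate, embeddings)
    semantic = cosine(q_vec, c_vec) if q_vec and c_vec else 0.0
    return weights["overlap"] * overlap + weights["embedding"] * semantic


# Hypothetical toy data: neither candidate shares a token with the question,
# so only the embedding criterion can separate them.
embeddings = {"capital": [1.0, 0.0], "metropolis": [0.9, 0.1], "river": [0.0, 1.0]}
weights = {"overlap": 0.5, "embedding": 0.5}
question = ["capital"]
candidates = {"A": ["metropolis"], "B": ["river"]}

best = max(candidates, key=lambda name: score(question, candidates[name], embeddings, weights))
print(best)
```

With purely lexical scoring both candidates would tie at zero; the embedding term is what ranks the semantically closer snippet first.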
Language Resources and Evaluation | 2016
Vít Baisa; Jan Michelfeit; Marek Medveď; Miloš Jakubíček
Archive | 2015
Marek Medveď; Vít Baisa; Aleš Horák
RASLAN | 2014
Aleš Horák; Marek Medveď
RASLAN | 2016
Marek Medveď; Aleš Horák; Vojtěch Kovář
RASLAN | 2015
Marek Medveď; Aleš Horák
RASLAN | 2014
Jan Rygl; Marek Medveď
RASLAN | 2013
Miloš Jakubíček; Marek Medveď
Archive | 2013
Marek Medveď; Miloš Jakubíček; Vojtěch Kovář