Marie Kopřivová
Charles University in Prague
Publication
Featured research published by Marie Kopřivová.
Journal of Linguistics/Jazykovedný časopis | 2017
Zuzana Komrsková; Marie Kopřivová; David Lukeš; Petra Poukarová; Hana Goláňová
The paper introduces the ORTOFON corpus of spontaneous spoken Czech and the DIALEKT corpus of Czech dialects, their design principles and practical solutions adopted during data collection.
International Conference on Computational and Corpus-Based Phraseology | 2017
Milena Hnátková; Tomáš Jelínek; Marie Kopřivová; Vladimír Petkevič; Alexandr Rosen; Hana Skoumalová; Pavel Vondřička
We propose a multidimensional taxonomy of multiword expressions (MWEs) as a pattern applicable to entries in a representative lexicon of Czech MWEs. The taxonomy and the lexicon are useful for many reasons concerning lexicography, teaching Czech as a foreign language, and theoretical issues of MWEs as entities standing between lexicon and grammar, as well as for NLP tasks such as tagging and parsing, identification and search of MWEs, or word sense and semantic disambiguation. In addition to the description of various types of idiomaticity, the taxonomy and the lexicon are designed to account for flexibility in morphology and word order, syntactic and lexical variants and even creatively used fragments.
International Conference on Computational and Corpus-Based Phraseology | 2017
Marie Kopřivová
This paper attempts to compile a list of the most commonly used (most typical) Czech idioms using corpus data with annotated collocations. Collocations are annotated in corpora of contemporary written Czech as well as in a corpus of spoken Czech containing transcripts of intimate conversations. Idioms are selected based on their frequency in different text types (newspapers and magazines, non-fiction, fiction, spoken language), and the resulting list is compiled using the criterion that a given idiom must occur in at least two different text types. Each text type is briefly characterized in terms of which types of idioms are typical for it (according to formal criteria). This study confirms a substantial divide between idiom use in written and spoken language. A smaller difference can be observed between fiction on the one hand and non-fiction and newspapers on the other. The main reason for this is the interactive nature of fiction texts, which leads to them containing idioms with verbal components. These are employed in a fashion similar to spoken language, in interactions among the individual characters. By contrast, non-fiction and journalistic language tends to be more descriptive, with more nominal idioms.
Text, Speech and Dialogue | 2015
David Lukeš; Petra Klimešová; Zuzana Komrsková; Marie Kopřivová
The ORAL series corpora of spontaneous spoken Czech currently contain neither lemmatization nor part-of-speech tagging. The main reason for this is that readily available NLP tools, designed primarily with written texts in mind, underperform when applied directly to speech transcripts, due to various morphological and syntactic specificities of informal spoken language and the ways these are captured in transcription. Recently, the highly optimized open-source MorphoDiTa toolchain for training and applying stochastic tagging models was released; MorphoDiTa makes it easy and fast to experiment with incremental changes in the training procedure. The article discusses modifications to the morphological dictionary and training data used by the models which are necessary in order to improve their performance on the ORAL series corpora, as well as challenges which remain to be solved.
Language Resources and Evaluation | 2014
Marie Kopřivová; Hana Goláňová; Petra Klimešová; David Lukeš
Časopis pro moderní filologii (Journal for Modern Philology) | 2018
Tomáš Jelínek; Marie Kopřivová; Vladimír Petkevič; Hana Skoumalová
Corpus Pragmatics | 2017
Anna Čermáková; Zuzana Komrsková; Marie Kopřivová; Petra Poukarová
Časopis pro moderní filologii (Journal for Modern Philology) | 2015
Petra Klimešová; Zuzana Komrsková; Marie Kopřivová; David Lukeš
Archive | 2008
Martina Waclawičová; Marie Kopřivová; Michal Křen; Lucie Válková
Archive | 2006
František Čermák; Jaroslava Hlaváčová; Milena Hnátková; Tomáš Jelínek; Jan Kocek; Marie Kopřivová; Michal Křen; Renata Novotná; Vladimír Petkevič; Věra Schmiedtová; Hana Skoumalová; Johanka Spoustová; Michal Šulc; Zdeněk Velíšek