Zdenka Uresová
Charles University in Prague
Publication
Featured research published by Zdenka Uresová.
North American Chapter of the Association for Computational Linguistics | 2014
Stephan Oepen; Marco Kuhlmann; Yusuke Miyao; Daniel Zeman; Silvie Cinková; Dan Flickinger; Jan Hajic; Zdenka Uresová
Task 18 at SemEval 2015 defines Broad-Coverage Semantic Dependency Parsing (SDP) as the problem of recovering sentence-internal predicate–argument relationships for all content words, i.e. the sema ...
Spoken Language Technology Workshop | 2008
Jan Hajic; Silvie Cinková; Marie Mikulová; Petr Pajas; Jan Ptáček; Josef Toman; Zdenka Uresová
We present a description of a new resource (Prague Dependency Treebank of Spoken Language) being created for English and Czech to be used for the task of speech understanding, broad natural language analysis for dialog systems and other speech-related tasks, including speech editing. The resources we have created so far contain audio and a standard transcription of spontaneous speech, but as a novel layer, we add an edited (“reconstructed”) version of the spoken utterances. These edits go beyond the scope of current speech reconstruction efforts in that we allow, on top of the usual deletions of speech artifacts, fillers, etc., also for word modifications, insertions and word order changes. We have used both monologue and dialogue recordings in English and Czech to verify the feasibility of such transcription. We have also assessed the quality of the resulting annotation since the relative freedom of the editing raises an issue of what a “correct” annotation is.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies | 2017
Daniel Zeman; Martin Popel; Milan Straka; Jan Hajic; Joakim Nivre; Filip Ginter; Juhani Luotolahti; Sampo Pyysalo; Slav Petrov; Martin Potthast; Francis M. Tyers; Elena Badmaeva; Memduh Gokirmak; Anna Nedoluzhko; Silvie Cinková; Jaroslava Hlaváčová; Václava Kettnerová; Zdenka Uresová; Jenna Kanerva; Stina Ojala; Anna Missilä; Christopher D. Manning; Sebastian Schuster; Siva Reddy; Dima Taji; Nizar Habash; Herman Leung; Marie-Catherine de Marneffe; Manuela Sanguinetti; Maria Simi
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.
Workshop on Events: Definition, Detection, Coreference, and Representation | 2014
Ondřej Dušek; Jan Hajic; Zdenka Uresová
We present a supervised learning method for verbal valency frame detection and selection, i.e., a specific kind of word sense disambiguation for verbs based on subcategorization information, which amounts to detecting mentions of events in text. We use the rich dependency annotation present in the Prague Dependency Treebanks for Czech and English, taking advantage of several analysis tools (taggers, parsers) developed on these datasets previously. The frame selection is based on manually created lexicons accompanying these treebanks, namely on PDT-Vallex for Czech and EngVallex for English. The results show that verbal predicate detection is easier for Czech, but in the subsequent frame selection task, better results have been achieved for English.
International Conference on Computational Linguistics | 2014
Jan Hajic; Ondrej Bojar; Zdenka Uresová
This paper describes in detail the differences between Czech and English annotation using the Abstract Meaning Representation scheme, which stresses the use of ontologies (and semantically-oriented verbal lexicons) and relations based on meaning or ontological content rather than semantics or syntax. The basic “slogan” of the AMR specification clearly states that AMR is not an interlingua, yet it is expected that many relations as well as structures constructed from these relations will be similar or even identical across languages. In our study, we have investigated 100 sentences in English and their translations into Czech, annotated manually by AMRs, with the goal to describe the differences and if possible, to classify them into two main categories: those which are merely convention differences and thus can be unified by changing such conventions in the AMR annotation guidelines, and those which are so deeply rooted in the language structure that the level of abstraction which is inherent in the current AMR scheme does not allow for such unification.
Meeting of the Association for Computational Linguistics | 2016
Zdenka Uresová; Eduard Bejček; Jan Hajic
This paper describes results of a study related to the PARSEME Shared Task on automatic detection of verbal Multi-Word Expressions (MWEs) which focuses on their identification in running texts in many languages. The Shared Task’s organizers have provided basic annotation guidelines where four basic types of verbal MWEs are defined including some specific subtypes. Czech is among the twenty languages selected for the task. We will contribute to the Shared Task dataset, a multilingual open resource, by converting data from the Prague Dependency Treebank (PDT) to the Shared Task format. The question to answer is to which extent this can be done automatically. In this paper, we concentrate on one of the relevant MWE categories, namely on the quasi-universal category called “Inherently Pronominal Verbs” (IPronV) and describe its annotation in the Prague Dependency Treebank. After comparing it to the Shared Task guidelines, we can conclude that the PDT and the associated valency lexicon, PDT-Vallex, contain sufficient information for the conversion, even if some specific instances will have to be checked. As a side effect, we have identified certain errors in PDT annotation which can now be automatically corrected.
Proceedings of the Workshop on Discontinuous Structures in Natural Language Processing | 2016
Zdenka Uresová; Eva Fučíková; Jan Hajic
We describe results of investigation of a specific type of discontinuous constructions, namely non-projective constructions concerning verbs and their arguments. This topic is especially important for languages with a relatively free word order, such as Czech, which is the language we have primarily worked with. For comparison, we have included some results for English. The corpora used for both languages are the Prague Czech-English Dependency Treebank and the Prague Dependency Treebank, which are both annotated at a dependency syntax level as well as a deep (semantic) level, including verbs and their valency (arguments). We are using traditionally defined non-projectivity on trees with full linear ordering, but the two levels of annotation are innovatively combined to determine if a particular (deep) verb-argument structure is non-projective. As a result, we have identified several types of discontinuities, which we classify either by the verb class or structurally in terms of the verb, its arguments and their dependents. In addition, we have quantitatively compared selected phenomena found in Czech translated texts (in the PCEDT) to the native Czech as found in the original Prague Dependency Treebank.
Language Resources and Evaluation | 2014
Nianwen Xue; Ondrej Bojar; Jan Hajic; Martha Palmer; Zdenka Uresová; Xiuhong Zhang
Language Resources and Evaluation | 2014
Zdenka Uresová; Jan Hajic; Pavel Pecina; Ondrej Dusek
North American Chapter of the Association for Computational Linguistics | 2013
Zdenka Uresová; Jan Hajic; Eva Fučíková; Jana Šindlerová