Alexandr Rosen
Charles University in Prague
Publication
Featured research published by Alexandr Rosen.
Language Resources and Evaluation | 2014
Alexandr Rosen; Jirka Hana; Barbora Štindlová; Anna Feldman
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, but also classified. The annotation scheme is tested on a data set including approx. 175,000 words with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute manual annotation.
Language Resources and Evaluation | 2012
Jirka Hana; Alexandr Rosen; Barbora Štindlová; Petr Jäger
The need for data about the acquisition of Czech by non-native learners prompted the compilation of the first learner corpus of Czech. After introducing its basic design and parameters, including a multi-tier manual annotation scheme and error taxonomy, we focus on the more technical aspects: the transcription of hand-written source texts, the process of annotation, and options for exploiting the result, together with the tools used for these tasks and the decisions behind the choices. To support or even substitute manual annotation, we assign some error tags automatically and use automatic annotation tools (tagger, spell checker).
Text, Speech and Dialogue | 2012
Tomáš Jelínek; Barbora Štindlová; Alexandr Rosen; Jirka Hana
We present an approach to building a learner corpus of Czech, manually corrected and annotated with error tags using a complex grammar-based taxonomy of errors in spelling, morphology, morphosyntax, lexicon and style. This grammar-based annotation is supplemented by a formal classification of errors based on surface alternations. To supply additional information about non-standard or ill-formed expressions, we aim at a synergy of manual and automatic annotation, deriving information from the original input and from the manual annotation.
Archive | 1994
Eva Hajičová; Alexandr Rosen
The present contribution describes an effort to collect lexical data for an English parser in the context of a bilingual research project. The primary source of grammatical information is a computer-usable version of OALD (Hornby, 1974). The target lexicon’s structure of verbal valency frames, inspired by the theoretical framework of functional generative description, includes an underlying level. Its content can be derived under some human supervision from OALD’s verb pattern codes. Results confirm the usefulness of machine-readable dictionaries for NLP applications.
International Conference on Computational and Corpus-Based Phraseology | 2017
Milena Hnátková; Tomáš Jelínek; Marie Kopřivová; Vladimír Petkevič; Alexandr Rosen; Hana Skoumalová; Pavel Vondřička
We propose a multidimensional taxonomy of multiword expressions (MWEs) as a pattern applicable to entries in a representative lexicon of Czech MWEs. The taxonomy and the lexicon are useful for many reasons concerning lexicography, teaching Czech as a foreign language, and theoretical issues of MWEs as entities standing between lexicon and grammar, as well as for NLP tasks such as tagging and parsing, identification and search of MWEs, or word sense and semantic disambiguation. In addition to the description of various types of idiomaticity, the taxonomy and the lexicon are designed to account for flexibility in morphology and word order, syntactic and lexical variants and even creatively used fragments.
International Conference on Computational Linguistics | 1992
Alexandr Rosen; Eva Hajičová; Jan Hajič
The authors collect lexical data for a module of English syntactic analysis in the context of a bilingual research project. The computer-usable version of OALD (Hornby, 1974) is used as the primary source. The main focus is on the structure and derivation of valency frames for verbal entries in the target lexicon. The complex relation between OALD's verb subcategorization codes and the target complementation paradigms is illustrated, and an approach to designing the derivation procedure is suggested.
Linguistic Annotation Workshop | 2010
Jirka Hana; Alexandr Rosen; Svatava Škodová; Barbora Štindlová
International Journal of Corpus Linguistics | 2012
František Čermák; Alexandr Rosen
Archive | 2005
Alexandr Rosen
Language Resources and Evaluation | 2012
Alexandr Rosen; Martin Vavřín