Kadri Muischnek | Researchain

Publication

Featured research published by Kadri Muischnek.

Language Resources and Evaluation | 2010

The variability of multi-word verbal expressions in Estonian

Kadri Muischnek; Heiki-Jaan Kaalep

This article focuses on the variability of one subtype of multi-word expressions, namely those consisting of a verb and a particle or a verb and its complement(s). We build on evidence from Estonian, an agglutinative language with free word order, analysing the behaviour of verbal multi-word expressions (opaque and transparent idioms, support verb constructions and particle verbs). Using these data, we analyse phenomena such as the order of the components of a multi-word expression, lexical substitution and morphosyntactic flexibility.

Literary and Linguistic Computing | 2013

Variation of verbal constructions in Estonian dialects

Kristel Uiboaed; Cornelius Hasselblatt; Liina Lindström; Kadri Muischnek; John Nerbonne

Traditional Estonian dialect classifications are based on phonology, morphology, and lexis, and very few studies of syntax are available. The present article is the first quantitative syntactic study of Estonian dialects. We concentrate on constructions consisting of finite and non-finite verbs, and we apply contemporary statistical methods to explore the syntactic variation. Our results show that even bare token frequencies can identify syntactic patterns quite well, and that analyses exploiting collostructional methods make the variational patterns even clearer. We use correspondence analysis and clustering to detect geographic influence on variation. The results suggest that a syntax-based classification of dialects differs from the traditional classifications based mainly on phonology and lexis. Our data reveal systematic differences between eastern and western dialects at the syntactic level, whereas analyses based on phonology and lexis distinguish mainly between northern and southern dialects. The western dialects make more use of analytic constructions consisting of a finite and a non-finite verb form.

Acta Linguistica Academica | 2017

Parsing and beyond: Tools and resources for Estonian

Kadri Muischnek; Kaili Müürisep; Tiina Puolakainen

This article gives an overview of the state of the art of tools and resources for syntactic analysis of Estonian. A morphosyntactic disambiguator, a surface-syntactic analyzer and a dependency parser are all based on the Constraint Grammar formalism. As for language resources, a 400,000-word manually annotated dependency treebank has been created; its annotation scheme is compatible with the output of the Constraint Grammar dependency parser. Part of the treebank has been converted to the Universal Dependencies annotation scheme. Our tools have also been tested by large-scale corpus annotation.

Text, Speech and Dialogue | 2018

Annotated Clause Boundaries’ Influence on Parsing Results

Dage Särg; Kadri Muischnek; Kaili Müürisep

The aim of the paper is to study the effect of pre-annotated clause boundaries on dependency parsing of Estonian new media texts. Our hypothesis is that correct identification of clause boundaries helps to improve parsing: once the text is split into smaller syntactically meaningful units, it should be easier for the parser to determine the syntactic structure of a given unit. To test the hypothesis, we performed two experiments on a 14,000-word corpus of Estonian web texts whose morphological analysis had been manually validated. In the first experiment, the corpus with gold standard morphological tags was parsed with MaltParser both with and without the manually annotated clause boundaries. In the second experiment, only the segmentation of the text was preserved and the morphological analysis was done automatically before parsing. The experiments confirmed our hypothesis about the influence of correct clause boundaries by a small margin: in both experiments, LAS improved by 0.6%.
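
For readers unfamiliar with the metric, LAS (labeled attachment score) is conventionally the share of tokens whose predicted head and dependency label both match the gold annotation. The sketch below illustrates that conventional definition only; the function name, the token representation and the toy labels are assumptions made for illustration, and the abstract does not state the paper's exact evaluation settings (for example, how punctuation was handled).

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) for one parsed text.

    gold and predicted are equal-length lists with one
    (head_index, dependency_label) tuple per token.
    This representation is a hypothetical one chosen for the example.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty text")

    total = len(gold)
    # UAS: only the head has to match.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: head and label both have to match.
    las_hits = sum(g == p for g, p in zip(gold, predicted))
    return uas_hits / total, las_hits / total


# Toy four-token sentence; heads and labels are invented for illustration.
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
predicted = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "obl")]

uas, las = attachment_scores(gold, predicted)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In this toy case the third token has the wrong head and the fourth has the wrong label, so three of four heads are correct but only two of four tokens count towards LAS.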

Proceedings of the 2010 conference on Human Language Technologies -- The Baltic Perspective: Proceedings of the Fourth International Conference Baltic HLT 2010 | 2010