Heike Zinsmeister
University of Konstanz
Publications
Featured research published by Heike Zinsmeister.
Language Resources and Evaluation | 2012
Stefanie Dipper; Heike Zinsmeister
In this paper, we present first results from annotating abstract (discourse-deictic) anaphora in German. Our annotation guidelines provide linguistic tests for identifying the antecedent, and for determining the semantic types of both the antecedent and the anaphor. The corpus consists of selected speaker turns from the Europarl corpus. To date, 100 texts have been annotated according to these guidelines. The annotations show that anaphoric personal and demonstrative pronouns differ with respect to the distance to their antecedents. A semantic analysis reveals that, contrary to suggestions put forward in the literature, referents of anaphors do not tend to be more abstract than the referents of their antecedents.
Linguistic Annotation Workshop | 2009
Stefanie Dipper; Heike Zinsmeister
In this paper, we present preliminary work on corpus-based anaphora resolution of discourse deixis in German. Our annotation guidelines provide linguistic tests for locating the antecedent, and for determining the semantic types of both the antecedent and the anaphor. The corpus consists of selected speaker turns from the Europarl corpus.
SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities | 2016
Fabian Barteld; Ingrid Schröder; Heike Zinsmeister
This paper describes our contribution to two challenges in data-driven lemmatization. We approach lemmatization as a two-stage process: lemma candidates are generated first, and a ranker then chooses the most probable lemma from these candidates. The first challenge is that languages with rich morphology like Modern German can feature morphological changes of different kinds, in particular word-internal modification. This makes generating the correct lemma a harder task than just removing suffixes (stemming). The second challenge that we address is spelling variation as it appears in non-standard texts. We experiment with different generators that are specifically tailored to deal with these two challenges. We show in an oracle setting that our methods for generating lemma candidates allow a possible increase in lemmatization accuracy of 14% on Middle Low German, a group of historical dialects of German (1200–1650 AD). Using a log-linear model to choose the correct lemma from the set, we obtain an actual increase of 5.56%.
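The two-stage setup and the difference between oracle accuracy and actual accuracy can be illustrated with a minimal Python sketch. It is not the authors' system: the generator below uses only suffix rewrite rules learned from training pairs (so it cannot handle word-internal modification), and the ranker is a simple additive score rather than a log-linear model. All function names and the toy word pairs are assumptions made for illustration.

```python
from collections import Counter

def learn_suffix_rules(pairs):
    """Learn (form_suffix -> lemma_suffix) rewrite rules from (form, lemma) pairs."""
    rules = Counter()
    for form, lemma in pairs:
        # Length of the longest common prefix of form and lemma.
        i = 0
        while i < min(len(form), len(lemma)) and form[i] == lemma[i]:
            i += 1
        rules[(form[i:], lemma[i:])] += 1
    return rules

def generate_candidates(form, rules):
    """Stage 1: apply every rule whose form suffix matches; keep the rule count."""
    candidates = {}
    for (form_suffix, lemma_suffix), count in rules.items():
        if form.endswith(form_suffix):
            stem = form[:len(form) - len(form_suffix)]
            cand = stem + lemma_suffix
            candidates[cand] = max(candidates.get(cand, 0), count)
    return candidates

def choose_lemma(form, rules, known_lemmas):
    """Stage 2 (stand-in ranker): rule count plus a bonus for known lemmas."""
    candidates = generate_candidates(form, rules)
    if not candidates:
        return form
    def score(cand):
        return (candidates[cand] + (1 if cand in known_lemmas else 0), cand)
    return max(candidates, key=score)

def oracle_accuracy(eval_pairs, rules):
    """Share of forms whose gold lemma is among the candidates (ranker upper bound)."""
    hits = sum(gold in generate_candidates(form, rules) for form, gold in eval_pairs)
    return hits / len(eval_pairs)

def actual_accuracy(eval_pairs, rules, known_lemmas):
    """Share of forms for which the ranker picks the gold lemma."""
    hits = sum(choose_lemma(form, rules, known_lemmas) == gold
               for form, gold in eval_pairs)
    return hits / len(eval_pairs)

# Hypothetical toy data, not taken from the paper.
train = [("tagen", "tag"), ("tages", "tag"), ("hunde", "hund"),
         ("hunden", "hund"), ("lande", "land"), ("hund", "hund")]
rules = learn_suffix_rules(train)
known = {lemma for _, lemma in train}
test_pairs = [("landen", "land"), ("tage", "tag"), ("häuser", "haus")]

print(oracle_accuracy(test_pairs, rules))
print(actual_accuracy(test_pairs, rules, known))
```

In this toy run, the form "häuser" is missed in both measures because a suffix-only generator never proposes "haus". This is the kind of word-internal change that motivates more capable candidate generators.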
Discourse Anaphora and Anaphor Resolution Colloquium | 2011
Stefanie Dipper; Christine Rieger; Melanie Seiss; Heike Zinsmeister
Abstract anaphors refer to abstract referents such as facts or events. Automatic resolution of this kind of anaphora still poses a problem for language processing systems. This paper presents a corpus-based comparative study of German and English abstract anaphors and their antecedents to gain further insights into the linguistic properties of different anaphor types and their distributions. To this end, parallel texts from the Europarl corpus have been annotated with functional and morpho-syntactic information. We outline the annotation process and show how we start out with a small set of well-defined markables in German. We successively expand this set in a cross-linguistic bootstrapping approach: we collect translation equivalents from English, use them to track down further forms of German anaphors, use those in turn to find further English forms, and so on.
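The bootstrapping loop described in this abstract can be sketched in Python as a fixed-point expansion over aligned form pairs. This is only an abstraction of the idea: in the study the expansion runs over annotated parallel text, whereas here the alignments are a hypothetical toy list and the function name `bootstrap_markables` is an assumption.

```python
def bootstrap_markables(seed_de, aligned_pairs, max_rounds=10):
    """Expand a German seed set via English translation equivalents.

    aligned_pairs must be a list of (german_form, english_form) tuples,
    because it is traversed several times.
    """
    de_forms = set(seed_de)
    en_forms = set()
    for _ in range(max_rounds):
        # English equivalents of the German forms collected so far.
        new_en = {en for de, en in aligned_pairs if de in de_forms} - en_forms
        en_forms |= new_en
        # German forms aligned to any collected English form.
        new_de = {de for de, en in aligned_pairs if en in en_forms} - de_forms
        de_forms |= new_de
        if not new_en and not new_de:
            break  # nothing new in either language: fixed point reached
    return de_forms, en_forms

# Hypothetical alignments, not data from the study.
alignments = [("dies", "this"), ("das", "this"), ("das", "that"),
              ("es", "that"), ("es", "it"), ("ihn", "him")]

german, english = bootstrap_markables({"dies"}, alignments)
print(sorted(german))
print(sorted(english))
```

Starting from the single seed "dies", the loop reaches "das" and "es" through shared English equivalents, while the unconnected pair ("ihn", "him") is never added.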
Linguistic Annotation Workshop | 2014
Fabian Barteld; Sarah Ihden; Ingrid Schröder; Heike Zinsmeister
When annotating non-standard languages, descriptively incomplete language phenomena (EAGLES, 1996) are often encountered. In this paper, we present examples of ambiguous forms taken from a historical corpus and offer a classification of such descriptively incomplete language phenomena and its rationale. We then discuss various approaches to the annotation of these phenomena, arguing that multiple annotations provide the most appropriate encoding strategy for the annotator. Finally, we show how multiple annotations can be encoded in existing standards such as PAULA and GrAF.
Archive | 2012
Heike Telljohann; Erhard W. Hinrichs; Sandra Kübler; Heike Zinsmeister; Kathrin Beck
Archive | 2003
Ellen Brandner; Heike Zinsmeister; Artemis Alexiadou; Miriam Butt; Tracy Holloway King; Eric Haeberli; Jóhannes Gísli Jónsson; Marcus Kracht; Diane Nelson; Halldor Armann Sigurðsson; Ralf Vogel; Ellen Woolford; Dieter Wunderlich
Archive | 2010
Lothar Lemnitzer; Heike Zinsmeister
Complexity | 2003
Heike Zinsmeister; Ulrich Heid
Language Resources and Evaluation | 2010
Heike Zinsmeister; Stefanie Dipper