
Publication


Featured research published by Ines Rehbein.


International Conference on Parsing Technologies (IWPT) | 2009

Scalable Discriminative Parsing for German

Yannick Versley; Ines Rehbein

Generative lexicalized parsing models, the mainstay of probabilistic parsing for English, do not perform as well on languages with different language-specific properties such as free(r) word order or rich morphology. For German and other non-English languages, linguistically motivated, complex treebank transformations have been shown to improve performance within the framework of PCFG parsing, while generative lexicalized models do not seem to be as easily adaptable to these languages. In this paper, we show a practical way to use grammatical functions as first-class citizens in a discriminative model that allows annotated treebank grammars to be extended with rich feature sets without suffering from sparse-data problems. We demonstrate the flexibility of the approach by integrating unsupervised PP attachment and POS-based word clusters into the parser.


GSCL | 2013

Fine-Grained POS Tagging of German Tweets

Ines Rehbein

This paper presents the first work on POS tagging German Twitter data, showing that, despite the noisy and often cryptic nature of the data, a fine-grained analysis of POS tags on Twitter microtext is feasible. Our CRF-based tagger achieves an accuracy of around 89% when trained on LDA word clusters, features from an automatically created dictionary, and additional out-of-domain training data.
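
The kind of token-level feature extraction such a CRF tagger relies on can be sketched as follows. This is a minimal illustration, not the paper's actual feature set: the cluster table stands in for the LDA word clusters, and all cluster IDs and feature names are invented for the example.

```python
# Minimal sketch of token-level feature extraction for a CRF POS tagger
# on Twitter data. The cluster lookup stands in for precomputed LDA word
# clusters; the cluster IDs below are hypothetical.

WORD_CLUSTERS = {"lol": "c17", "nicht": "c03", "@user": "c01"}  # invented IDs

def token_features(tokens, i):
    """Features for tokens[i]: surface form, shape cues, cluster ID, context."""
    w = tokens[i]
    return {
        "word.lower": w.lower(),
        "suffix3": w[-3:],                       # crude morphological cue
        "is_upper": w.isupper(),
        "is_hashtag": w.startswith("#"),         # Twitter-specific shape features
        "is_mention": w.startswith("@"),
        "cluster": WORD_CLUSTERS.get(w.lower(), "OOV"),
        "prev": tokens[i - 1].lower() if i > 0 else "<S>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</S>",
    }

feats = token_features(["@user", "lol", "nicht"], 1)
```

In a real system these feature dictionaries would be fed, one per token, to a CRF toolkit; cluster features are what let the tagger generalise over out-of-vocabulary Twitter spellings.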


Linguistic Annotation Workshop | 2009

Assessing the benefits of partial automatic pre-labeling for frame-semantic annotation

Ines Rehbein; Josef Ruppenhofer; Caroline Sporleder

In this paper, we present the results of an experiment in which we assess the usefulness of partial semi-automatic annotation for frame labeling. While we found no conclusive evidence that it can speed up human annotation, automatic pre-annotation does increase its overall quality.


Meeting of the Association for Computational Linguistics | 2006

German particle verbs and pleonastic prepositions

Ines Rehbein; Josef van Genabith

This paper discusses the behaviour of German particle verbs formed by two-way prepositions in combination with pleonastic PPs including the verb particle as a preposition. These particle verbs have a characteristic feature: some of them license directional prepositional phrases in the accusative, some only allow for locative PPs in the dative, and some particle verbs can occur with PPs in the accusative and in the dative. Directional particle verbs together with directional PPs present an additional problem: the particle and the preposition in the PP seem to provide redundant information. The paper gives an overview of the semantic verb classes influencing this phenomenon, based on corpus data, and explains the underlying reasons for the behaviour of the particle verbs. We also show how the restrictions on particle verbs and pleonastic PPs can be expressed in a grammar theory like Lexical Functional Grammar (LFG).


Language Resources and Evaluation | 2012

Is it worth the effort? Assessing the benefits of partial automatic pre-labeling for frame-semantic annotation

Ines Rehbein; Josef Ruppenhofer; Caroline Sporleder

Corpora with high-quality linguistic annotations are an essential component in many NLP applications and a valuable resource for linguistic research. For obtaining these annotations, a large amount of manual effort is needed, making the creation of these resources time-consuming and costly. One attempt to speed up the annotation process is to use supervised machine-learning systems to automatically assign (possibly erroneous) labels to the data and ask human annotators to correct them where necessary. However, it is not clear to what extent these automatic pre-annotations are successful in reducing human annotation effort, and what impact they have on the quality of the resulting resource. In this article, we present the results of an experiment in which we assess the usefulness of partial semi-automatic annotation for frame labeling. We investigate the impact of automatic pre-annotation of differing quality on annotation time, consistency and accuracy. While we found no conclusive evidence that it can speed up human annotation, we found that automatic pre-annotation does increase its overall quality.


Linguistic Annotation Workshop | 2014

POS error detection in automatically annotated corpora

Ines Rehbein

Recent work on error detection has shown that the quality of manually annotated corpora can be substantially improved by applying consistency checks to the data and automatically identifying incorrectly labelled instances. These methods, however, cannot be used for automatically annotated corpora, where errors are systematic and cannot easily be identified by looking at the variance in the data. This paper targets the detection of POS errors in automatically annotated corpora, so-called silver standards, showing that by combining different measures sensitive to annotation quality we can identify a large part of the errors and obtain a substantial increase in accuracy.
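
The idea of combining several error-sensitive measures can be sketched as below. The two measures shown here, disagreement between two taggers and a lexicon check, are illustrative stand-ins chosen for the sketch, not the specific measures used in the paper.

```python
# Minimal sketch of combining two error-sensitive measures to flag likely
# POS errors in an automatically tagged ("silver standard") corpus. Both
# measures are illustrative stand-ins, not the paper's actual measures.

def flag_suspicious(tokens, tags_a, tags_b, lexicon):
    """Return indices where both measures signal a possible tagging error.

    tags_a / tags_b: tag sequences from two different taggers.
    lexicon: maps a word to the set of tags attested for it in gold data.
    """
    flagged = []
    for i, (w, ta, tb) in enumerate(zip(tokens, tags_a, tags_b)):
        disagree = ta != tb                                  # measure 1: tagger disagreement
        unattested = w in lexicon and ta not in lexicon[w]   # measure 2: lexicon check
        if disagree and unattested:
            flagged.append(i)
    return flagged

idx = flag_suspicious(
    ["der", "Mann", "lacht"],
    ["ART", "VVFIN", "VVFIN"],   # tagger A (errs on "Mann")
    ["ART", "NN", "VVFIN"],      # tagger B
    {"Mann": {"NN"}, "der": {"ART", "PRELS"}},
)
```

Requiring both signals to fire trades recall for precision; a real system would weight and combine more measures rather than intersect two hard indicators.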


Meeting of the Association for Computational Linguistics | 2017

Detecting annotation noise in automatically labelled data

Ines Rehbein; Josef Ruppenhofer

We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.
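
The active-learning component of such a setup can be reduced to an uncertainty-sampling step: route the instances the model is least confident about to a human annotator. The sketch below shows only that selection step, with invented instance names and scores; the paper's generative model and full AL loop are not reproduced here.

```python
# Minimal sketch of the uncertainty-sampling step in an active-learning
# loop for annotation-error detection: the instances the model is least
# sure about are sent to a human annotator. Names and scores are invented.

def select_for_annotation(instances, confidences, batch_size):
    """Pick the batch_size instances with the lowest model confidence."""
    ranked = sorted(zip(confidences, instances))  # ascending confidence
    return [inst for _, inst in ranked[:batch_size]]

batch = select_for_annotation(
    ["sent_1", "sent_2", "sent_3", "sent_4"],
    [0.91, 0.42, 0.77, 0.55],
    2,
)
```

The human labels for the selected batch would then be fed back to update the model before the next selection round.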


Linguistic Annotation Workshop | 2017

Catching the common cause: extraction and annotation of causal relations and their participants

Ines Rehbein; Josef Ruppenhofer

In this paper, we present a simple, yet effective method for the automatic identification and extraction of causal relations from text, based on a large English-German parallel corpus. The goal of this effort is to create a lexical resource for German causal relations. The resource will consist of a lexicon that describes constructions that trigger causality as well as the participants of the causal event, and will be augmented by a corpus with annotated instances for each entry, that can be used as training data to develop a system for automatic classification of causal relations. Focusing on verbs, our method harvested a set of 100 different lexical triggers of causality, including support verb constructions. At the moment, our corpus includes over 1,000 annotated instances. The lexicon and the annotated data will be made available to the research community.
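
Harvesting candidate German triggers from a word-aligned parallel corpus can be sketched as follows. The seed list, the alignment format, and the toy sentence pair are all invented for illustration; the paper's actual extraction pipeline is richer, covering support verb constructions as well.

```python
# Sketch: harvest candidate German causality triggers from a word-aligned
# English-German parallel corpus by collecting the German words aligned to
# known English causal verbs. Seed list and alignment format are invented.
from collections import Counter

ENGLISH_CAUSAL_SEEDS = {"cause", "causes", "caused", "lead", "leads"}

def harvest_triggers(bitext):
    """bitext: list of (en_tokens, de_tokens, alignments) triples,
    where alignments is a set of (en_index, de_index) pairs."""
    counts = Counter()
    for en, de, align in bitext:
        for i, j in align:
            if en[i].lower() in ENGLISH_CAUSAL_SEEDS:
                counts[de[j].lower()] += 1
    return counts

counts = harvest_triggers([
    (["Smoking", "causes", "cancer"],
     ["Rauchen", "verursacht", "Krebs"],
     {(0, 0), (1, 1), (2, 2)}),
])
```

The frequency counts give a ranked candidate list; in practice the candidates would still be filtered manually before entering the lexicon.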


Proceedings of the Second Workshop on Extra-Propositional Aspects of Meaning in Computational Semantics (ExProM 2015) | 2015

Filled Pauses in User-generated Content are Words with Extra-propositional Meaning

Ines Rehbein

In this paper, we present a corpus study investigating the use of the fillers äh (uh) and ähm (uhm) in informal spoken German youth language and in written text from social media. Our study shows that filled pauses occur in both corpora as markers of hesitations, corrections, repetitions and unfinished sentences, and that the form as well as the type of the fillers are distributed similarly in both registers. We present an analysis of fillers in written microblogs, illustrating that äh and ähm are used intentionally and can add a subtext to the message that is understandable to both author and reader. We thus argue that filled pauses in user-generated content from social media are words with extra-propositional meaning.


Zeitschrift für Germanistische Linguistik | 2010

Der Einfluss der Dependenzgrammatik auf die Computerlinguistik

Ines Rehbein

In 1959, Lucien Tesnière's main work Éléments de syntaxe structurale was published. While its impact on theoretical linguistics was not very strong at first, 50 years later a variety of linguistic theories based on Tesnière's work exist. In computational linguistics, as in theoretical linguistics, dependency grammar was not very influential at first. The last 10–15 years, however, have brought a noticeable change, and dependency grammar has found its way into computational linguistics. Syntactically annotated corpora based on dependency representations are available for a variety of languages, as are statistical parsers which give a syntactic analysis of running text, describing the underlying dependency relations between word tokens. This article gives an overview of the areas of computational linguistics that have been influenced by dependency grammar. It discusses the pros and cons of different types of syntactic representation used in natural language processing and their suitability as representations of meaning. Finally, an attempt is made to give an outlook on the future impact of dependency grammar on computational linguistics.

Collaboration


Dive into Ines Rehbein's collaborations.

Top Co-Authors

Hagen Hirschmann
Humboldt University of Berlin

Anke Lüdeling
Humboldt University of Berlin