Publication


Featured research published by Sabine Schulte im Walde.


Computational Linguistics | 2006

Experiments on the Automatic Induction of German Semantic Verb Classes

Sabine Schulte im Walde

This article presents clustering experiments on German verbs: A statistical grammar model for German serves as the source for a distributional verb description at the lexical syntax-semantics interface, and the unsupervised clustering algorithm k-means uses the empirical verb properties to perform an automatic induction of verb classes. Various evaluation measures are applied to compare the clustering results to gold standard German semantic verb classes under different criteria. The primary goals of the experiments are (1) to empirically utilize and investigate the well-established relationship between verb meaning and verb behavior within a cluster analysis and (2) to investigate the required technical parameters of a cluster analysis with respect to this specific linguistic task. The clustering methodology is developed on a small-scale verb set and then applied to a larger-scale verb set including 883 German verbs.


Meeting of the Association for Computational Linguistics | 2002

Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information

Sabine Schulte im Walde; Chris Brew

The paper describes the application of k-Means, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated, hand-constructed semantic verb classes. A series of post-hoc cluster analyses explored the influence of specific frames and frame groups on the coherence of the verb classes, and supported the tight connection between the syntactic behaviour of the verbs and their lexical meaning components.
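
As a rough illustration of the setup this abstract describes, the sketch below runs plain k-means over probability distributions of verbs over subcategorisation frames. The verbs, frame labels, probabilities, the squared Euclidean distance, and the hand-picked initial centroids are all assumptions made for the example; they are not the paper's data or settings.

```python
# Toy k-means over verb-by-frame probability distributions.
# Verbs, frame labels, and probabilities are invented for illustration.

FRAMES = ["n", "na", "nad", "np"]  # hypothetical frame labels, one per vector column

VERBS = {
    "laufen":  [0.70, 0.05, 0.05, 0.20],
    "rennen":  [0.65, 0.05, 0.05, 0.25],
    "geben":   [0.05, 0.35, 0.55, 0.05],
    "bringen": [0.05, 0.40, 0.50, 0.05],
}


def sq_dist(p, q):
    """Squared Euclidean distance between two frame distributions."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(data, initial_centroids, max_iter=100):
    """Assign each verb to its nearest centroid until assignments stop changing."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the caller's lists
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {}
        for verb, vec in data.items():
            distances = [sq_dist(vec, c) for c in centroids]
            new_assignment[verb] = distances.index(min(distances))
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [data[v] for v, c in assignment.items() if c == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = centroid(members)
    return assignment


clusters = kmeans(VERBS, [VERBS["laufen"], VERBS["geben"]])
print(clusters)
```

On this toy data the two motion verbs and the two transfer verbs fall into separate clusters, which mirrors, in miniature, the idea that similar frame distributions indicate similar meaning.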


Empirical Methods in Natural Language Processing | 2002

Spectral clustering for German verbs

Chris Brew; Sabine Schulte im Walde

We describe and evaluate the application of a spectral clustering technique (Ng et al., 2002) to the unsupervised clustering of German verbs. Our previous work has shown that standard clustering techniques succeed in inducing Levin-style semantic classes from verb subcategorisation information. But clustering in the very high-dimensional spaces that we use is fraught with technical and conceptual difficulties. Spectral clustering performs a dimensionality reduction on the verb frame patterns, and provides a robustness and efficiency that standard clustering methods do not display in direct use. The clustering results are evaluated according to the alignment (Cristianini et al., 2002) between the Gram matrix defined by the cluster output and the corresponding matrix defined by a gold standard.
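
The alignment-based evaluation mentioned in this abstract can be sketched as follows. The code uses the standard definition of alignment as a normalised Frobenius inner product of two matrices. How the paper builds its Gram matrices is not stated in the abstract, so the binary "same cluster" matrix used here, along with the example label lists, is an assumption for illustration.

```python
import math


def cluster_gram(labels):
    """Binary Gram matrix: entry (i, j) is 1.0 if items i and j share a label."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def frobenius(A, B):
    """Frobenius inner product: sum of element-wise products of two matrices."""
    return sum(x * y for row_a, row_b in zip(A, B) for x, y in zip(row_a, row_b))


def alignment(A, B):
    """Normalised Frobenius inner product; 1.0 means the matrices are proportional."""
    return frobenius(A, B) / math.sqrt(frobenius(A, A) * frobenius(B, B))


# Hypothetical cluster labels for four verbs.
gold = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same partition as gold, different label names
mixed = [0, 1, 1, 1]        # one verb placed in the wrong cluster

score_same = alignment(cluster_gram(relabelled), cluster_gram(gold))
score_mixed = alignment(cluster_gram(mixed), cluster_gram(gold))
print(score_same, score_mixed)
```

Because the Gram matrix depends only on which items are grouped together, the score is unaffected by how clusters are named, which is convenient when comparing unsupervised output to a gold standard.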


International Conference on Computational Linguistics | 2000

Robust German noun chunking with a probabilistic context-free grammar

Helmut Schmid; Sabine Schulte im Walde

We present a noun chunker for German which is based on a head-lexicalised probabilistic context-free grammar. A manually developed grammar was semi-automatically extended with robustness rules in order to allow parsing of unrestricted text. The model parameters were learned from unlabelled training data by a probabilistic context-free parser. For extracting noun chunks, the parser generates all possible noun chunk analyses, scores them with a novel algorithm which maximizes the best chunk sequence criterion, and chooses the most probable chunk sequence. An evaluation of the chunker on 2,140 hand-annotated noun chunks yielded 92% recall and 93% precision.


Journal of Psycholinguistic Research | 2001

Verb Frame Frequency as a Predictor of Verb Bias

Mirella Lapata; Frank Keller; Sabine Schulte im Walde

There is considerable evidence showing that the human sentence processor is guided by lexical preferences in resolving syntactic ambiguities. Several types of preferences have been identified, including morphological, syntactic, and semantic ones. However, the literature fails to provide a uniform account of what lexical preferences are and how they should be measured. The present paper provides evidence for the view that lexical preferences are records of prior linguistic experience. We show that a type of lexical syntactic preference, viz., verb biases as measured by norming experiments, can be approximated by verb frame frequencies extracted from a large, balanced corpus using computational learning techniques.


Conference of the European Chapter of the Association for Computational Linguistics | 2014

Chasing Hypernyms in Vector Spaces with Entropy

Enrico Santus; Alessandro Lenci; Qin Lu; Sabine Schulte im Walde

In this paper, we introduce SLQS, a new entropy-based measure for the unsupervised identification of hypernymy and its directionality in Distributional Semantic Models (DSMs). SLQS is assessed through two tasks: (i) identifying the hypernym in hyponym-hypernym pairs, and (ii) discriminating hypernymy among various semantic relations. In both tasks, SLQS outperforms other state-of-the-art measures.


Empirical Methods in Natural Language Processing | 2005

Identifying Semantic Relations and Functional Properties of Human Verb Associations

Sabine Schulte im Walde; Alissa Melinger

This paper uses human verb associations as the basis for an investigation of verb properties, focusing on semantic verb relations and prominent nominal features. First, the lexical semantic taxonomy GermaNet is checked on the types of classic semantic relations in our data; verb-verb pairs not covered by GermaNet can help to detect missing links in the taxonomy, and provide a useful basis for defining non-classical relations. Second, a statistical grammar is used for determining the conceptual roles of the noun responses. We present prominent syntax-semantic roles and evidence for the usefulness of co-occurrence information in distributional verb descriptions.


Joint Conference on Lexical and Computational Semantics | 2014

Contrasting Syntagmatic and Paradigmatic Relations: Insights from Distributional Semantic Models

Gabriella Lapesa; Stefan Evert; Sabine Schulte im Walde

This paper presents a large-scale evaluation of bag-of-words distributional models on two datasets from priming experiments involving syntagmatic and paradigmatic relations. We interpret the variation in performance achieved by different settings of the model parameters as an indication of which aspects of distributional patterns characterize these types of relations. Contrary to what has been argued in the literature (Rapp, 2002; Sahlgren, 2006), namely that bag-of-words models based on second-order statistics mainly capture paradigmatic relations and that syntagmatic relations need to be gathered from first-order models, we show that second-order models perform well on both paradigmatic and syntagmatic relations if their parameters are properly tuned. In particular, our results show that the size of the context window and dimensionality reduction play a key role in differentiating DSM performance on paradigmatic vs. syntagmatic relations.
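
To make the first-order versus second-order distinction in this abstract concrete, here is a minimal sketch under simplifying assumptions. First-order association is read as a raw co-occurrence count within a symmetric context window, and second-order similarity as the cosine between two words' co-occurrence vectors. The toy corpus, the window size, and the use of raw counts without weighting or dimensionality reduction are assumptions for the example, not the paper's configuration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence(tokens, window):
    """Count, for each word, the words appearing within `window` positions of it."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[word][tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented toy corpus.
tokens = "the dog barks the cat meows the dog sleeps the cat sleeps".split()
counts = cooccurrence(tokens, window=1)

first_order_dog_barks = counts["dog"]["barks"]       # direct co-occurrence count
first_order_dog_cat = counts["dog"]["cat"]           # these two never co-occur
second_order_dog_cat = cosine(counts["dog"], counts["cat"])
print(first_order_dog_barks, first_order_dog_cat, second_order_dog_cat)
```

In this toy corpus "dog" and "cat" never occur next to each other, yet their context vectors are similar, while "dog" and "barks" co-occur directly. Changing `window` changes both kinds of statistics, which is the parameter sensitivity the abstract refers to.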


Conference of the European Chapter of the Association for Computational Linguistics | 2003

Experiments on the choice of features for learning verb classes

Sabine Schulte im Walde

The choice of verb features is crucial for the learning of verb classes. This paper presents clustering experiments on 168 German verbs, which explore the relevance of features on three levels of verb description: purely syntactic frame types, prepositional phrase information, and selectional preferences. In contrast to previous approaches concentrating on the sparse data problem, we present evidence for a linguistically defined limit on the usefulness of features which is driven by the idiosyncratic properties of the verbs and the specific attributes of the desired verb classification.


Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014) | 2014

Distinguishing Degrees of Compositionality in Compound Splitting for Statistical Machine Translation

Marion Weller; Fabienne Cap; Stefan Müller; Sabine Schulte im Walde; Alexander M. Fraser

The paper presents an approach to morphological compound splitting that takes the degree of compositionality into account. We apply our approach to German noun compounds and particle verbs within a German‐English SMT system, and study the effect of only splitting compositional compounds as opposed to an aggressive splitting. A qualitative study explores the translational behaviour of non-compositional compounds.

Collaboration


Dive into Sabine Schulte im Walde's collaborations.

Top Co-Authors

Stefan Bott

Pompeu Fabra University

Jason Utt

University of Stuttgart
