Publication


Featured research published by Martin Riedl.


Journal of Language Modelling | 2013

Text: now in 2D! A framework for lexical expansion with contextual similarity

Chris Biemann; Martin Riedl

A new metaphor of two-dimensional text for data-driven semantic modeling of natural language is proposed, which provides an entirely new angle on the representation of text: not only are syntagmatic relations annotated in the text, but paradigmatic relations are also made explicit by generating lexical expansions. We operationalize distributional similarity in a general framework for large corpora, and describe a new method to generate similar terms in context. Our evaluation shows that distributional similarity is able to produce high-quality lexical resources in an unsupervised and knowledge-free way, and that our highly scalable similarity measure yields better scores in a WordNet-based evaluation than previous measures for very large corpora. Evaluating on a lexical substitution task, we find that our contextualization method improves over a non-contextualized baseline across all parts of speech, and we show how the metaphor can be applied successfully to part-of-speech tagging. A number of ways to extend and improve the contextualization method within our framework are discussed. Unlike comparable approaches, our framework defines a model of lexical expansions in context that can generate the expansions rather than rank a given list, and thus does not require existing lexical-semantic resources.


International Conference on Pervasive Computing | 2009

Rule-based activity recognition framework: Challenges, technique and learning

Holger Storf; Martin Becker; Martin Riedl

Among the central challenges of Ambient Assisted Living systems are the autonomous and reliable recognition of the assisted person's current situation and the proactive offering and rendering of adequate assistance services. In the context of emergency support, such situations may be acute emergency situations or long-term deviations from typical behavior that will result in emergency situations in the future. To optimize the treatment of the former and the prevention of the latter, reliable recognition of characteristic activities of daily living is necessary. In this paper, we present our multi-agent-based activity recognition framework as well as the experience gained with it. Besides a detailed discussion of our hybrid recognition approach, we also elaborate on the tailoring of the underlying reasoning models to the individual environments and users in an initial learning phase. Finally, we present the experience gained with the recognition framework in our Ambient Assisted Living Laboratory.


Meeting of the Association for Computational Linguistics | 2014

That's sick dude!: Automatic identification of word sense change across different timescales

Sunny Mitra; Ritwik Mitra; Martin Riedl; Chris Biemann; Animesh Mukherjee; Pawan Goyal

In this paper, we propose an unsupervised method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books. We construct distributional-thesauri-based networks from data at different time points and cluster each of them separately to obtain word-centric sense clusters corresponding to the different time points. Subsequently, we compare the sense clusters of two different time points to find whether (i) a new sense has been born, (ii) an older sense has split into more than one sense, (iii) a newer sense has formed from the joining of older senses, or (iv) a particular sense has died. We conduct a thorough evaluation of the proposed methodology both manually and through comparison with WordNet. Manual evaluation indicates that the algorithm could correctly identify 60.4% of birth cases from a set of 48 randomly picked samples and 57% of split/join cases from a set of 21 randomly picked samples. Remarkably, in 44% of cases the birth of a novel sense is attested by WordNet, while split and join are confirmed by WordNet in 46% and 43% of cases, respectively. Our approach can be applied to lexicography, as well as to applications like word sense disambiguation or semantic search.
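The comparison of sense clusters at two time points can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `compare_sense_clusters`, the overlap criterion, and the `threshold` parameter are all assumptions made for illustration, and clusters are assumed to be non-empty sets of words.

```python
def compare_sense_clusters(old, new, threshold=0.5):
    """Label sense changes between two lists of word clusters (sets).

    Hypothetical overlap rules, chosen only for illustration:
      birth: a new cluster shares few words with any old cluster
      death: an old cluster shares few words with any new cluster
      split: at least two new clusters are mostly drawn from one old cluster
      join:  at least two old clusters are mostly absorbed by one new cluster
    """
    old_vocab = set().union(*old)
    new_vocab = set().union(*new)
    events = []
    for i, n in enumerate(new):
        if len(n & old_vocab) / len(n) < threshold:
            events.append(("birth", i))
        parents = [j for j, o in enumerate(old) if len(o & n) / len(o) >= threshold]
        if len(parents) >= 2:
            events.append(("join", i))
    for j, o in enumerate(old):
        if len(o & new_vocab) / len(o) < threshold:
            events.append(("death", j))
        children = [i for i, n in enumerate(new) if len(n & o) / len(n) >= threshold]
        if len(children) >= 2:
            events.append(("split", j))
    return events


# Toy clusters for the word "sick" at an earlier and a later time point.
earlier = [{"ill", "unwell", "nauseous"}]
later = [{"ill", "unwell", "nauseous"}, {"cool", "awesome", "rad"}]
print(compare_sense_clusters(earlier, later))
```

In this toy input the second later cluster has no words in common with the earlier data, so it is reported as a birth under the assumed rule.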


Empirical Methods in Natural Language Processing | 2015

A Single Word is not Enough: Ranking Multiword Expressions Using Distributional Semantics

Martin Riedl; Chris Biemann

We present a new unsupervised mechanism that ranks word n-grams according to their multiwordness. It relies heavily on a new uniqueness measure that computes, based on a distributional thesaurus, how often an n-gram could be replaced in context by a single-word term. Combined with a downweighting mechanism for incomplete terms, this forms a new measure called DRUID. Results show large improvements on two small test sets over competitive baselines. We demonstrate the scalability of the method to large corpora, and show that the measure does not depend on shallow syntactic filtering.


Meeting of the Association for Computational Linguistics | 2015

JoBimViz: A Web-based Visualization for Graph-based Distributional Semantic Models

Eugen Ruppert; Manuel Kaufmann; Martin Riedl; Chris Biemann

This paper introduces a web-based visualization framework for graph-based distributional semantic models. The visualization supports a wide range of data structures, including term similarities, similarities of contexts, support of multiword expressions, sense clusters for terms and sense labels. In contrast to other browsers of semantic resources, our visualization accepts input sentences, which are subsequently processed in language-independent or language-dependent ways to compute term-context representations. Our web demonstrator currently contains models for multiple languages, based on different preprocessing such as dependency parsing and n-gram context representations. These models can be accessed from a database, the web interface and via a RESTful API. The latter facilitates the quick integration of such models in research prototypes.


Natural Language Engineering | 2015

An automatic approach to identify word sense changes in text media across timescales

Sunny Mitra; Ritwik Mitra; Suman Kalyan Maity; Martin Riedl; Chris Biemann; Pawan Goyal; Animesh Mukherjee

In this paper, we propose an unsupervised and automated method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books and millions of tweets posted per day. We construct distributional-thesauri-based networks from data at different time points and cluster each of them separately to obtain word-centric sense clusters corresponding to the different time points. Subsequently, we propose a split/join based approach to compare the sense clusters at two different time points to find whether there is a ‘birth’ of a new sense. The approach also helps us to find whether an older sense was ‘split’ into more than one sense, a newer sense has been formed from the ‘join’ of older senses, or a particular sense has undergone ‘death’. We use this completely unsupervised approach (a) within the Google books data to identify word sense differences within a medium, and (b) across Google books and Twitter data to identify differences in word sense distribution across different media. We conduct a thorough evaluation of the proposed methodology both manually and through comparison with WordNet.


North American Chapter of the Association for Computational Linguistics | 2016

Unsupervised Compound Splitting With Distributional Semantics Rivals Supervised Methods

Martin Riedl; Chris Biemann

In this paper we present a word decompounding method that is based on distributional semantics. Our method does not require any linguistic knowledge and is initialized using a large monolingual corpus. The core idea of our approach is that parts of compounds (like “candle” and “stick”) are semantically similar to the entire compound, which helps to exclude spurious splits (like “candles” and “tick”). We report results for German and Dutch: for German, our unsupervised method is on par with a rule-based and a supervised method and significantly outperforms two unsupervised baselines. For Dutch, our method performs only slightly below a rule-based optimized compound splitter.
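The core idea, that valid parts of a compound are semantically similar to the whole compound, can be sketched as follows. This is a simplified illustration and not the method from the paper: the function `split_compound`, the toy thesaurus `similar_terms`, and the scoring rule are hypothetical, and only two-part splits are considered.

```python
def split_compound(word, similar_terms, min_len=3):
    """Return the two-part split whose parts are most similar to the compound.

    similar_terms is a hypothetical toy thesaurus mapping a word to the set
    of words that are distributionally similar to it. A split point scores
    one point for each part found among the compound's similar terms.
    """
    neighbours = similar_terms.get(word, set())
    best_score, best_split = 0, [word]
    for i in range(min_len, len(word) - min_len + 1):
        left, right = word[:i], word[i:]
        score = (left in neighbours) + (right in neighbours)
        if score > best_score:
            best_score, best_split = score, [left, right]
    return best_split


# Toy thesaurus entry; in practice it would be computed from a large corpus.
thesaurus = {"candlestick": {"candle", "stick", "lamp"}}
print(split_compound("candlestick", thesaurus))
```

Because neither “candles” nor “tick” appears among the similar terms of the compound in this toy thesaurus, the spurious split receives no score and the word is left unsplit unless a better candidate exists.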


Meeting of the Association for Computational Linguistics | 2016

Impact of MWE Resources on Multiword Recognition

Martin Riedl; Chris Biemann

In this paper, we demonstrate the impact of Multiword Expression (MWE) resources in the task of MWE recognition in text. We present results based on the Wiki50 corpus for MWE resources generated using unsupervised methods from raw text, and for resources that are extracted using manual text markup and lexical resources. We show that resources acquired from manual annotation yield the best MWE tagging performance. However, a more fine-grained analysis that differentiates MWEs according to their part of speech (POS) reveals that automatically acquired MWE lists outperform the resources generated from human knowledge for three out of four classes.


Recent Advances in Natural Language Processing | 2017

Multilingual and Cross-Lingual Complex Word Identification

Seid Muhie Yimam; Sanja Štajner; Martin Riedl; Chris Biemann

Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility. Due to the lack of CWI datasets, previous works largely depend on Simple English Wikipedia and edit histories for obtaining ‘gold standard’ annotations, which are of doubtful quality and limited to English. We collect complex words/phrases (CP) for English, German and Spanish, annotated by both native and non-native speakers, and propose language-independent features that can be used to train multilingual and cross-lingual CWI models. We show that the performance of cross-lingual CWI systems (using a model trained on one language and applying it to the other languages) is comparable to the performance of monolingual CWI systems.


Language Resources and Evaluation | 2018

A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework

Flavio Massimiliano Cecchini; Martin Riedl; Elisabetta Fersini; Chris Biemann

This article presents a comparison of different Word Sense Induction (WSI) clustering algorithms on two novel pseudoword data sets of semantic-similarity and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts to simulate a polysemous word. The evaluation is performed comparing the algorithm’s output on a pseudoword’s ego word graph (i.e., a graph that represents the pseudoword’s context in the corpus) with the known subdivision given by the components corresponding to the monosemous source words forming the pseudoword. The main contribution of this article is to present a self-sufficient pseudoword-based evaluation framework for WSI graph-based clustering algorithms, thereby defining a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
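The pseudoword definition used here, two monosemous terms merged together with their contexts, can be sketched in a few lines. The function name `build_pseudoword`, the underscore naming scheme, and the choice to skip sentences containing both source words are assumptions for illustration and are not taken from the article.

```python
def build_pseudoword(sentences, word_a, word_b):
    """Merge two source words into one artificial ambiguous token.

    sentences is a list of token lists. Every occurrence of word_a or
    word_b is replaced by the pseudoword, and the original source word is
    kept as the gold sense label for that sentence. Sentences containing
    neither or both source words are skipped (an illustrative choice).
    """
    pseudoword = f"{word_a}_{word_b}"
    merged, gold = [], []
    for tokens in sentences:
        sources = {t for t in tokens if t in (word_a, word_b)}
        if len(sources) != 1:
            continue
        merged.append([pseudoword if t in sources else t for t in tokens])
        gold.append(sources.pop())
    return pseudoword, merged, gold


corpus = [
    ["the", "banana", "is", "ripe"],
    ["the", "door", "is", "open"],
    ["a", "cat", "sleeps"],
]
pseudoword, merged, gold = build_pseudoword(corpus, "banana", "door")
print(pseudoword, merged, gold)
```

A clustering algorithm run on the contexts of the merged token can then be scored against the gold labels, since each context is known to come from exactly one source word.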

Collaboration


Dive into Martin Riedl's collaborations.

Top Co-Authors

Animesh Mukherjee
Indian Institute of Technology Kharagpur

Ritwik Mitra
Indian Institute of Technology Kharagpur

Sunny Mitra
Indian Institute of Technology Kharagpur

Eugen Ruppert
Technische Universität Darmstadt

Matthew Hatem
Technische Universität Darmstadt

Sanja Štajner
University of Wolverhampton