Martin Jansche | Researchain

Publication

Featured research published by Martin Jansche.

Empirical Methods in Natural Language Processing | 2002

Information Extraction from Voicemail Transcripts

Martin Jansche; Steven P. Abney

Voicemail is not like email. Even such basic information as the name of the caller/sender or a phone number for returning calls is not represented explicitly and must be obtained from message transcripts or other sources. We discuss techniques for doing this and the challenges these tasks present.

Meeting of the Association for Computational Linguistics | 2003

Parametric Models of Linguistic Count Data

Martin Jansche

It is well known that occurrence counts of words in documents are often modeled poorly by standard distributions like the binomial or Poisson. Observed counts vary more than simple models predict, prompting the use of overdispersed models like Gamma-Poisson or Beta-binomial mixtures as robust alternatives. Another deficiency of standard models is due to the fact that most words never occur in a given document, resulting in large amounts of zero counts. We propose using zero-inflated models for dealing with this, and evaluate competing models on a Naive Bayes text classification task. Simple zero-inflated models can account for practically relevant variation, and can be easier to work with than overdispersed models.
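To make the zero-inflation idea in the abstract above concrete, here is a minimal sketch. It assumes a zero-inflated Poisson as the base model, which is only one member of the family the abstract mentions. The function names, the toy counts, and the parameter values are all hypothetical and are not taken from the paper.

```python
import math

def poisson_pmf(k, lam):
    """Probability of count k under a plain Poisson with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (assumed form): with probability pi the count is
    a structural zero; otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def log_likelihood(counts, pmf):
    """Sum of log-probabilities of the observed counts under pmf."""
    return sum(math.log(pmf(k)) for k in counts)

# Made-up per-document counts for one word: absent from 90 documents,
# present a few times in the remaining 10.
counts = [0] * 90 + [3, 4, 5, 2, 6, 3, 4, 5, 3, 4]

mean = sum(counts) / len(counts)                     # Poisson MLE of the rate
nonzero = [k for k in counts if k > 0]
pi_guess = 1.0 - len(nonzero) / len(counts)          # crude guess, not an MLE
lam_guess = sum(nonzero) / len(nonzero)              # crude guess, not an MLE

ll_poisson = log_likelihood(counts, lambda k: poisson_pmf(k, mean))
ll_zip = log_likelihood(counts, lambda k: zip_pmf(k, lam_guess, pi_guess))
print(f"Poisson log-likelihood: {ll_poisson:.2f}")
print(f"ZIP log-likelihood:     {ll_zip:.2f}")
```

On data like this, where most documents contain no occurrences of the word, even the crudely parameterised zero-inflated model assigns the observations a higher likelihood than the best-fitting plain Poisson. That is the kind of gap the abstract describes.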

International Conference on Computational Linguistics | 2002

Named entity extraction with conditional Markov models and classifiers

Martin Jansche

Our approach to multilingual named entity (NE) recognition in the context of the CoNLL Shared Task consists of the following ingredients. Feature engineering: a human expert (though not necessarily a language expert) determines relevant features to be used to determine whether or not a word is part of a named entity.

North American Chapter of the Association for Computational Linguistics | 2001

Re-engineering letter-to-sound rules

Martin Jansche

Using finite-state automata for the text analysis component in a text-to-speech system is problematic in several respects: the rewrite rules from which the automata are compiled are difficult to write and maintain, and the resulting automata can become very large and therefore inefficient. Converting the knowledge represented explicitly in rewrite rules into a more efficient format is difficult. We take an indirect route, learning an efficient decision tree representation from data and tapping information contained in existing rewrite rules, which increases performance compared to learning exclusively from a pronunciation lexicon.

Archive | 2002

Generating Spatial Descriptions from a Cognitive Point of View

Robert Porzel; Martin Jansche; Ralf Meyer-Klabunde

Speakers’ descriptions of one and the same spatial scenario differ greatly in respect to the linearisation of the objects and the strategies and points of view employed. Modelling the conceptual processes that are responsible for selecting and processing this information for a spatial description constitutes a complex task, since a multitude of factors come into play. Empirical findings and principles from various fields must be included in modelling this rather complex problem-solving task. We present several empirical studies that show how specific features of the spatial representation, the addressee, and the communicative task influence linearisation processes. Based on these studies, a natural language generation system, ParOLE, is introduced as an example of a cognitively motivated model of the conceptual processes underlying the production of spatial descriptions.

Conference of the International Speech Communication Association | 2011