Martin Jansche
Ohio State University
Publication
Featured research published by Martin Jansche.
empirical methods in natural language processing | 2002
Martin Jansche; Steven P. Abney
Voicemail is not like email. Even such basic information as the name of the caller/sender or a phone number for returning calls is not represented explicitly and must be obtained from message transcripts or other sources. We discuss techniques for doing this and the challenges these tasks present.
meeting of the association for computational linguistics | 2003
Martin Jansche
It is well known that occurrence counts of words in documents are often modeled poorly by standard distributions like the binomial or Poisson. Observed counts vary more than simple models predict, prompting the use of overdispersed models like Gamma-Poisson or Beta-binomial mixtures as robust alternatives. Another deficiency of standard models is due to the fact that most words never occur in a given document, resulting in large amounts of zero counts. We propose using zero-inflated models for dealing with this, and evaluate competing models on a Naive Bayes text classification task. Simple zero-inflated models can account for practically relevant variation, and can be easier to work with than overdispersed models.
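The zero-inflation idea in the abstract above can be illustrated with a minimal sketch. It assumes a Poisson base distribution purely for illustration; the abstract does not say which zero-inflated models were evaluated, and this is not the paper's implementation. The names `poisson_pmf`, `zip_pmf`, `lam`, and `pi` are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing count k under a Poisson with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson (illustrative assumption, not the paper's model).

    With probability pi the count is a "structural" zero (the word simply
    does not occur in the document); otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    # Only the zero count receives the extra point mass pi.
    return pi + base if k == 0 else base


if __name__ == "__main__":
    lam, pi = 2.0, 0.5
    print("P(0) plain Poisson:", poisson_pmf(0, lam))
    print("P(0) zero-inflated:", zip_pmf(0, lam, pi))
```

The mixture places extra probability on zero while leaving the shape of the positive counts proportional to the base distribution, which is one way to account for words that are absent from most documents.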
international conference on computational linguistics | 2002
Martin Jansche
Our approach to multilingual named entity (NE) recognition in the context of the CoNLL Shared Task consists of the following ingredients. Feature engineering: A human expert (though not necessarily a language expert) determines relevant features to be used to determine whether or not a word is part of a named entity.
north american chapter of the association for computational linguistics | 2001
Martin Jansche
Using finite-state automata for the text analysis component in a text-to-speech system is problematic in several respects: the rewrite rules from which the automata are compiled are difficult to write and maintain, and the resulting automata can become very large and therefore inefficient. Converting the knowledge represented explicitly in rewrite rules into a more efficient format is difficult. We take an indirect route, learning an efficient decision tree representation from data and tapping information contained in existing rewrite rules, which increases performance compared to learning exclusively from a pronunciation lexicon.
Archive | 2002
Robert Porzel; Martin Jansche; Ralf Meyer-Klabunde
Speakers’ descriptions of one and the same spatial scenario differ greatly in respect to the linearisation of the objects and the strategies and points of view employed. Modelling the conceptual processes that are responsible for selecting and processing this information for a spatial description constitutes a complex task, since a multitude of factors come into play. Empirical findings and principles from various fields must be included in modelling this rather complex problem-solving task. We present several empirical studies that show how specific features of the spatial representation, the addressee, and the communicative task influence linearisation processes. Based on these studies, a natural language generation system, ParOLE, is introduced as an example of a cognitively motivated model of the conceptual processes underlying the production of spatial descriptions.
conference of the international speech communication association | 2011
Yun-Hsuan Sung; Martin Jansche; Pedro J. Moreno
conference of the international speech communication association | 2011
Samantha Ainsley; Linne Ha; Martin Jansche; Ara Kim; Masayuki Nanzawa
Archive | 2002
Gerald Baumgartner; Martin Jansche; Konstantin Läufer
workshop spoken language technologies for under resourced languages | 2018
Isin Demirsahin; Martin Jansche; Alexander Gutkin
workshop spoken language technologies for under resourced languages | 2018
Alexander Gutkin; Tatiana Merkulova; Martin Jansche