Stefan Harbeck
University of Erlangen-Nuremberg
Publication
Featured research published by Stefan Harbeck.
Bioinformatics | 1999
Uwe Ohler; Stefan Harbeck; Heinrich Niemann; Elmar Nöth; Martin G. Reese
MOTIVATION: We describe a new content-based approach for the detection of promoter regions of eukaryotic protein-encoding genes. Our system is based on three interpolated Markov chains (IMCs) of different order which are trained on coding, non-coding and promoter sequences. It was recently shown that the interpolation of Markov chains leads to stable parameters and improves on the results in microbial gene finding (Salzberg et al., Nucleic Acids Res., 26, 544-548, 1998). Here, we present new methods for an automated estimation of optimal interpolation parameters and show how the IMCs can be applied to detect promoters in contiguous DNA sequences. Our interpolation approach can also be employed to obtain a reliable scoring function for human coding DNA regions, and the trained models can easily be incorporated in the general framework for gene recognition systems. RESULTS: A 5-fold cross-validation evaluation of our IMC approach on a representative sequence set yielded a mean correlation coefficient of 0.84 (promoter versus coding sequences) and 0.53 (promoter versus non-coding sequences). Applied to the task of eukaryotic promoter region identification in genomic DNA sequences, our classifier identifies 50% of the promoter regions in the sequences used in the most recent review and comparison by Fickett and Hatzigeorgiou (Genome Res., 7, 861-878, 1997), while having a false-positive rate of 1/849 bp.
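The idea of scoring a sequence with Markov chains of several orders at once can be illustrated with a minimal sketch. It is not the system described in the abstract: the interpolation weights here are fixed by hand (the paper estimates them automatically), the counts use simple add-one smoothing, and the class name `InterpolatedMarkovChain`, the helper `classify`, and the toy training sequences are all hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class InterpolatedMarkovChain:
    """Toy interpolated Markov chain over DNA with fixed, hand-set weights."""

    def __init__(self, max_order, weights):
        # weights[k] is the (assumed) interpolation weight of the order-k chain.
        assert len(weights) == max_order + 1
        assert all(w > 0 for w in weights)
        assert abs(sum(weights) - 1.0) < 1e-9
        self.max_order = max_order
        self.weights = weights
        # counts[k][context][symbol]: how often `symbol` followed `context`.
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            for i, sym in enumerate(seq):
                for k in range(min(i, self.max_order) + 1):
                    self.counts[k][seq[i - k:i]][sym] += 1

    def _prob(self, k, context, sym):
        # Add-one smoothed conditional probability under the order-k chain.
        ctx_counts = self.counts[k].get(context, {})
        total = sum(ctx_counts.values())
        return (ctx_counts.get(sym, 0) + 1) / (total + len(ALPHABET))

    def log_likelihood(self, seq):
        total = 0.0
        for i, sym in enumerate(seq):
            # Near the start, only orders with enough context are usable;
            # their weights are renormalised so they still sum to one.
            usable = min(i, self.max_order)
            norm = sum(self.weights[:usable + 1])
            p = sum(self.weights[k] / norm * self._prob(k, seq[i - k:i], sym)
                    for k in range(usable + 1))
            total += math.log(p)
        return total


def classify(seq, models):
    """Return the name of the class model with the highest log-likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(seq))


# Toy data, invented for illustration only.
promoter = InterpolatedMarkovChain(2, [0.2, 0.3, 0.5])
promoter.train(["TATAAATATAAA", "TATATATAAA"])
coding = InterpolatedMarkovChain(2, [0.2, 0.3, 0.5])
coding.train(["GCGGCGCCGCGG", "GGCGCCGCGC"])
models = {"promoter": promoter, "coding": coding}
print(classify("TATATA", models))
```

Scanning a contiguous sequence would apply the same likelihood comparison to a sliding window; the window length and decision threshold are further choices this sketch does not make.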
Pacific Symposium on Biocomputing | 1999
Uwe Ohler; Georg Stemmer; Stefan Harbeck; Heinrich Niemann
We present a new statistical approach for eukaryotic polymerase II promoter recognition. We apply stochastic segment models in which each state represents a functional part of the promoter. The segments are trained in an unsupervised way. We compare segment models with three and five states with our previous system which modeled the promoters as a whole, i.e. as a single state. Results on the classification of a representative collection of human and D. melanogaster promoter and non-promoter sequences show great improvements. The practical importance is demonstrated on the mining of large contiguous sequences.
International Conference on Acoustics, Speech, and Signal Processing | 1999
Volker Warnke; Stefan Harbeck; Elmar Nöth; Heinrich Niemann; Michael Levit
In this paper we present a new approach for estimating the interpolation parameters of language models (LMs) which are used as classifiers. Classical maximum likelihood (ML) estimation theoretically requires a huge amount of data, and the fundamental density assumption has to be correct. Usually one of these conditions is violated, so different optimization techniques like maximum mutual information (MMI) and minimum classification error (MCE) can be used instead, where the interpolation parameters are not optimized on their own but in consideration of all models together. We show how MCE and MMI techniques can be applied to two different kinds of interpolation strategies: linear interpolation, which is the standard interpolation method, and rational interpolation. We compare ML, MCE and MMI on the German part of the Verbmobil corpus, where we get a reduction of 3% in classification error when discriminating between 18 dialog act classes.
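The difference between tuning an interpolation weight per model and tuning it against the classification result of all models together can be sketched as follows. This is only an illustration under stated assumptions: it uses linear interpolation of add-one smoothed unigram and bigram estimates, and it picks a single shared weight by a plain grid search on held-out classification error, which is a crude stand-in for the MCE and MMI criteria named in the abstract. The names `InterpolatedBigramLM`, `pick_lambda`, and the two toy dialog act labels are hypothetical.

```python
import math
from collections import Counter


class InterpolatedBigramLM:
    """Class-specific LM: P(w | prev) = lam * P_bi + (1 - lam) * P_uni."""

    def __init__(self, sentences, vocab):
        # `vocab` is assumed to cover every word seen at test time.
        self.vocab_size = len(vocab)
        self.uni, self.bi, self.ctx = Counter(), Counter(), Counter()
        for words in sentences:
            self.uni.update(words)
            for prev, cur in zip(words, words[1:]):
                self.bi[(prev, cur)] += 1
                self.ctx[prev] += 1
        self.total = sum(self.uni.values())

    def log_prob(self, words, lam):
        lp = 0.0
        for i, w in enumerate(words):
            p = (self.uni[w] + 1) / (self.total + self.vocab_size)
            if i > 0:
                prev = words[i - 1]
                p_bi = (self.bi[(prev, w)] + 1) / (self.ctx[prev] + self.vocab_size)
                p = lam * p_bi + (1 - lam) * p
            lp += math.log(p)
        return lp


def classify(words, models, lam):
    """Pick the class whose LM assigns the utterance the highest score."""
    return max(models, key=lambda c: models[c].log_prob(words, lam))


def error_rate(data, models, lam):
    wrong = sum(1 for words, label in data if classify(words, models, lam) != label)
    return wrong / len(data)


def pick_lambda(heldout, models, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Choose the weight that minimises held-out classification error."""
    return min(grid, key=lambda lam: error_rate(heldout, models, lam))


# Toy data, invented for illustration only.
train = {
    "GREET": [["hello", "there"], ["good", "morning"]],
    "BYE": [["good", "bye"], ["see", "you"]],
}
vocab = {w for sents in train.values() for s in sents for w in s}
models = {label: InterpolatedBigramLM(sents, vocab) for label, sents in train.items()}
heldout = [(["good", "bye"], "BYE"), (["good", "morning"], "GREET")]
best = pick_lambda(heldout, models)
print(best, error_rate(heldout, models, best))
```

The point of the sketch is that `pick_lambda` evaluates the weight through the joint decision of all class models, whereas an ML estimate would fit each model's weight to its own training data in isolation.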
Archive | 1997
Volker Warnke; Stefan Harbeck; Heinrich Niemann; Elmar Nöth
In this paper we present a new approach for topic spotting based on subword units and feature vectors instead of words. In our first approach, we only use vector quantized feature vectors and polygram language models for topic representation. In the second approach, we use phonemes instead of the vector quantized feature vectors and model the topics again using polygram language models. We trained and tested the two methods on two different corpora. The first is a part of a media corpus which contains data from TV shows for three different topics. The second is the VERBMOBIL corpus, where we used 18 dialog acts as topics. Each corpus was split into disjoint test and training sets. We achieved recognition rates up to 82% for the three topics of the media corpus and up to 64% using 18 dialog acts of the VERBMOBIL corpus as topics.
Text, Speech and Dialogue | 1999
Elmar Nöth; Florian Gallwitz; Maria Aretoulaki; Jürgen Haas; Stefan Harbeck; Richard Huber; Heinrich Niemann
In this paper we present extensions to the spoken dialogue system EVAR that address issues crucial for the next generation of dialogue systems. EVAR was developed at the University of Erlangen. In 1994, it became accessible over the telephone line and could answer inquiries in German about German InterCity train connections. It has since been continuously improved and extended, including some unique features, such as the processing of out-of-vocabulary words and a flexible dialogue strategy that adapts to the quality of the recognition of the user input.
Archive | 1998
Elmar Nöth; Florian Gallwitz; Heinrich Niemann; Jürgen Haas; Maria Aretoulaki; Manuela Boros; Richard Huber; Stefan Harbeck
Conference of the International Speech Communication Association | 1997
Ernst Günter Schukat-Talamazzini; Florian Gallwitz; Stefan Harbeck; Volker Warnke
Conference of the International Speech Communication Association | 1995
Stefan Harbeck; Andreas Kießling; Ralf Kompe; Heinrich Niemann; Elmar Nöth
Conference of the International Speech Communication Association | 1998
Maria Aretoulaki; Stefan Harbeck; Florian Gallwitz; Elmar Nöth; Heinrich Niemann; Jozef Ivanecký; Ivo Ipšić; Nikola Pavešić; Václav Matoušek
Conference of the International Speech Communication Association | 1999
Stefan Harbeck; Uwe Ohler