Publication


Featured research published by Ralf Kompe.


Speech Communication | 2004

Generating non-native pronunciation variants for lexicon adaptation

Silke Goronzy; Stefan Rapp; Ralf Kompe

Handling non-native speech in automatic speech recognition (ASR) systems is an area of increasing interest. The majority of systems are tailored to native speech only, and as a consequence performance for non-native speakers is often not satisfactory. One way to approach the problem is to adapt the acoustic models to the new speaker. Another important means to improve performance for non-native speakers is to consider non-native pronunciations in the dictionary. The difficulty here lies in the generation of the non-native variants, especially if various accents are to be considered. Traditional approaches to modelling pronunciation variation require either phonetic expertise or extensive speech databases. They are too costly, especially if a flexible modelling of several accents is desired. We propose to exclusively use native speech databases to derive non-native pronunciation variants. We use an English phoneme recogniser to generate English pronunciations for German words and use these to train decision trees that are able to predict the respective English-accented variant from the German canonical transcription. Furthermore, we combine this approach with online, incremental weighted MLLR speaker adaptation. Using the enhanced dictionary and the speaker adaptation alone improved the word error rate of the baseline system by 5.2% and 16.8%, respectively. When both methods were combined, we achieved an improvement of 18.2%.


conference of the international speech communication association | 1995

Prosodic scoring of word hypotheses graphs

Ralf Kompe; Andreas Kießling; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini; A. Zottmann; Anton Batliner

Prosodic boundary detection is important to disambiguate parsing, especially in spontaneous speech, where elliptic sentences occur frequently. Word graphs are an efficient interface between word recognition and parser. Prosodic classification of word chains has been published earlier. The adjustments necessary for applying these classification techniques to word graphs are discussed in this paper. When classifying a word hypothesis, a set of context words has to be determined appropriately. A method has been developed to use stochastic language models for prosodic classification. This as well has been adopted for the use on word graphs. We also improved the set of acoustic-prosodic features, with which the recognition errors were reduced by about [...] on the read speech we were working on previously, now achieving [...] error rate for boundary classes and [...] for accent classes. Moving to spontaneous speech, the recognition error increases significantly, e.g. for a class boundary task. We show that even on word graphs the combination of language models, which model a larger context, with acoustic-prosodic classifiers reduces the recognition error by up to [...].


international conference on computational linguistics | 1996

Integrating syntactic and prosodic information for the efficient detection of empty categories

Anton Batliner; Anke Feldhaus; Stefan Geißler; Andreas Kießling; Tibor Kiss; Ralf Kompe; Elmar Nöth

We describe a number of experiments that demonstrate the usefulness of prosodic information for a processing module which parses spoken utterances with a feature-based grammar employing empty categories. We show that by requiring certain prosodic properties from those positions in the input, where the presence of an empty category has to be hypothesized, a derivation can be accomplished more efficiently. The approach has been implemented in the machine translation project VERBMOBIL and results in a significant reduction of the work-load for the parser.


conference of the international speech communication association | 1993

Prosody takes over: a prosodically guided dialog system

Ralf Kompe; Andreas Kießling; Thomas Kuhn; Marion Mast; Heinrich Niemann; Elmar Nöth; K. Ott; Anton Batliner

In this paper first experiments with naive persons using the speech understanding and dialog system EVAR are discussed. The domain of EVAR is train table inquiry. We observed that in real human-human dialogs when the officer transmits the information the customer very often interrupts. Many of these interruptions are just repetitions of the time of day given by the officer. The functional role of these interruptions is determined by prosodic cues only. An important result of the experiments with EVAR is that it is hard to follow the system giving the train connection via speech synthesis. In this case it is even more important than in human-human dialogs that the user has the opportunity to interact during the answer phase. Therefore we extended the dialog module to allow the user to repeat the time of day and we added a prosody module guiding the continuation of the dialog.


conference of the international speech communication association | 1994

Improving parsing by incorporating "prosodic clause boundaries" into a grammar

Gabriele Bakenecker; U. Block; Anton Batliner; Ralf Kompe; Elmar Nöth; P. Regel-Brietzmann

In written language, punctuation is used to separate main and subordinate clauses. In spoken language, ambiguities arise due to missing punctuation, but clause boundaries are often marked prosodically and can be used instead. We detect PCBs (Prosodically marked Clause Boundaries) by using prosodic features (duration, intonation, energy, and pause information) with a neural network, achieving a recognition rate of 82%. PCBs are integrated into our grammar using a special syntactic category ‘break’ that can be used in the phrase-structure rules of the grammar in a similar way as punctuation is used in grammars for written language. Whereas punctuation is in most cases obligatory, PCBs are sometimes optional. Moreover, they can in principle occur anywhere in the sentence, e.g. due to hesitations or misrecognition. To cope with these problems we tested two different approaches: a slightly modified parser for word chains containing PCBs and a word graph parser that takes the probabilities of PCBs into account. Tests were conducted on a subset of infinitive subordinate clauses from a large speech database containing sentences from the domain of train table inquiries. The average number of syntactic derivations could be reduced by about 70% even when working on recognized word graphs.


conference of the international speech communication association | 1994

Automatic labeling of phrase accents in German

Andreas Kiessling; Ralf Kompe; Anton Batliner; Heinrich Niemann; Elmar Nöth

In this paper, a method for the automatic labeling of phrase accents is described, based on a large text corpus that has been generated automatically and read by 100 speakers. Perception experiments on a subset of 500 utterances show a high agreement between the automatically generated accent labels and the judgment scores obtained. We computed different prosodic feature vectors from the speech signal for each syllable and trained different Gaussian distribution classifiers and artificial neural networks using the automatically generated accent labels. Recognition rates of up to 83% could be achieved for the distinction of accentuated vs. unaccentuated syllables. Similar results could be obtained for the comparison of the listeners' judgments with the automatic classification.


Archive | 1994

Detection of phrase boundaries and accents

Andreas Kießling; Ralf Kompe; Heinrich Niemann; Elmar Nöth; Anton Batliner

Experiments for the recognition of phrase boundaries and phrase accents were performed on a large speech database read by untrained speakers. We used durational features as well as features derived from pitch and energy contours and pause information. Different sets of features were compared. For distinguishing three different boundary classes, a recognition rate of 75.7% could be achieved, and for distinguishing accentuated from unaccentuated syllables, a recognition rate of 88.7%.


Archive | 2013

Classification of boundaries and accents in spontaneous speech

Andreas Kießling; Ralf Kompe; Anton Batliner; Heinrich Niemann; Elmar Nöth

The research project underlying this report was funded by the Federal Minister for Education, Science, Research and Technology under grant numbers 01 IV 102 H/0 and 01 IV 102 F/4. Responsibility for the content of this work lies with the authors.


Mustererkennung 1997, 19. DAGM-Symposium | 1997

Prosodische Information: Begriffsbestimmung und Nutzen für das Sprachverstehen

Elmar Nöth; Anton Batliner; Andreas Kießling; Ralf Kompe; Heinrich Niemann

Prosodic information plays a major role in human-human communication, yet until now this source of information has not been used in automatic speech processing. Only since automatic speech processing turned to spontaneous speech and less restricted tasks has the use of prosody become truly essential. We describe the reasons for this in detail and show, through the integration of prosody into the automatic translation system Verbmobil, that this use is also successful. Verbmobil is the first complete automatic speech processing system worldwide to use prosodic information during linguistic analysis. Currently, the most effective prosodic information is provided by the probabilities for sentence boundaries, which are recognized correctly in 94% of cases. During syntactic parsing of word hypothesis graphs, using the sentence boundary information speeds up the syntactic analysis by 92% and reduces the number of syntactic readings by 96%.


conference of the international speech communication association | 1994

Going back to the source: inverse filtering of the speech signal with ANNs

Joachim Denzler; Ralf Kompe; Andreas Kießling; Heinrich Niemann; Elmar Nöth

In this paper we present a new method transforming speech signals to voice source signals (VSS) using artificial neural networks (ANN). We will point out that the ANN mapping of speech signals into source signals is quite accurate, and most of the irregularities in the speech signal will lead to an irregularity in the source signal, produced by the ANN (ANN-VSS). We will show that the mapping of the ANN is robust with respect to untrained speakers, different recording conditions and facilities, and different vocabularies. We will also present preliminary results which show that from the ANN source signal pitch periods can be determined accurately.

Collaboration


Top Co-Authors

Raquel Tato

Technical University of Madrid

Elmar Nöth

University of Erlangen-Nuremberg

Stefan Rapp

University of Stuttgart

Andreas Kießling

University of Erlangen-Nuremberg

Anton Batliner

Ludwig Maximilian University of Munich