Ernst Günter Schukat-Talamazzini
University of Erlangen-Nuremberg
Publications
Featured research published by Ernst Günter Schukat-Talamazzini.
Pattern Recognition | 1995
Horst Bunke; M. Roth; Ernst Günter Schukat-Talamazzini
A method for the off-line recognition of cursive handwriting based on hidden Markov models (HMMs) is described. The features used in the HMMs are based on the arcs of skeleton graphs of the words to be recognized. An algorithm is applied to the skeleton graph of a word that extracts the edges in a particular order. Given the sequence of edges extracted from the skeleton graph, each edge is transformed into a 10-dimensional feature vector. The features represent information about the location of an edge relative to the four reference lines, its curvature, and the degree of the nodes incident to the considered edge. The linear model was adopted as the basic HMM topology. Each letter of the alphabet is represented by a linear HMM. Given a dictionary of fixed size, an HMM for each dictionary word is built by sequential concatenation of the HMMs representing the individual letters of the word. Training of the HMMs is done by means of the Baum-Welch algorithm, while the Viterbi algorithm is used for recognition. An average correct recognition rate of over 98% on the word level has been achieved in experiments with cooperative writers using two dictionaries of 150 words each.
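The word-model construction and Viterbi scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete observation symbols in place of the 10-dimensional edge feature vectors, a fixed self-loop probability per state, and hand-set emission tables instead of Baum-Welch training. All function names are hypothetical.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def make_letter_hmm(emission_tables, stay=0.5):
    """Linear HMM: one state per emission table; each state loops or advances."""
    return [{"stay": stay, "emit": table} for table in emission_tables]

def build_word_hmm(word, letter_hmms):
    """Concatenate the letter HMM states in spelling order."""
    states = []
    for ch in word:
        states.extend(letter_hmms[ch])
    return states

def viterbi_log_score(states, observations):
    """Best-path log-probability, starting in the first and ending in the last state."""
    if not states or not observations:
        return NEG_INF
    n = len(states)
    delta = [NEG_INF] * n  # delta[j]: best log score of a path ending in state j
    delta[0] = safe_log(states[0]["emit"].get(observations[0], 0.0))
    for obs in observations[1:]:
        new = [NEG_INF] * n
        for j, st in enumerate(states):
            stay = delta[j] + safe_log(st["stay"])
            move = NEG_INF
            if j > 0:
                move = delta[j - 1] + safe_log(1.0 - states[j - 1]["stay"])
            new[j] = max(stay, move) + safe_log(st["emit"].get(obs, 0.0))
        delta = new
    return delta[-1]

def recognize(observations, dictionary, letter_hmms):
    """Score every dictionary word and return the best one with all scores."""
    scores = {w: viterbi_log_score(build_word_hmm(w, letter_hmms), observations)
              for w in dictionary}
    return max(scores, key=scores.get), scores

# Toy usage with two one-state letter models (made-up emission probabilities).
toy_letters = {
    "a": make_letter_hmm([{"x": 0.9, "y": 0.1}]),
    "b": make_letter_hmm([{"x": 0.1, "y": 0.9}]),
}
best_word, word_scores = recognize(["x", "x", "y"], ["ab", "ba"], toy_letters)
print(best_word)
```

Because every word model is just a longer linear chain, the same Viterbi routine serves all dictionary entries; only the concatenation step depends on the spelling.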
international joint conference on artificial intelligence | 1996
Marion Mast; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini
This paper presents automatic methods for the classification of dialog acts. In the Verbmobil application (speech-to-speech translation of face-to-face dialogs), at most 50% of the utterances are analyzed in depth; the rest receive only shallow processing. The dialog component keeps track of the dialog using this shallow processing. For the classification of utterances without in-depth processing, two methods are presented: Semantic Classification Trees (SCTs) and Polygrams. For both methods, the classification algorithm is trained automatically from a corpus of labeled data. The novel idea with respect to SCTs is the use of dialog-state-dependent classification trees (CTs); with respect to Polygrams, it is the use of competing language models for the classification of dialog acts.
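The idea of competing language models can be sketched as one language model per dialog act, with an utterance assigned to the act whose model gives it the highest likelihood. The sketch below is an assumption-laden stand-in: it uses add-one-smoothed bigram models rather than Polygrams, uniform class priors, and invented dialog-act labels and training utterances. The class and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

SPECIALS = {"<s>", "</s>", "<unk>"}

def tokens(utterance):
    """Lower-case, whitespace-split, and add sentence boundary markers."""
    return ["<s>"] + utterance.lower().split() + ["</s>"]

class BigramLM:
    """Add-one-smoothed bigram model (a simple stand-in for a Polygram)."""

    def __init__(self, utterances, vocab):
        self.vocab = vocab
        self.bigrams = Counter()
        self.contexts = Counter()
        for u in utterances:
            toks = self._map(tokens(u))
            for prev, cur in zip(toks, toks[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def _map(self, toks):
        # Words outside the shared vocabulary are mapped to <unk>.
        return [t if t in self.vocab else "<unk>" for t in toks]

    def log_prob(self, utterance):
        toks = self._map(tokens(utterance))
        v = len(self.vocab)
        return sum(
            math.log((self.bigrams[(p, c)] + 1) / (self.contexts[p] + v))
            for p, c in zip(toks, toks[1:])
        )

def train_models(labeled):
    """Train one language model per dialog-act label over a shared vocabulary."""
    by_label = defaultdict(list)
    vocab = set(SPECIALS)
    for text, label in labeled:
        by_label[label].append(text)
        vocab.update(text.lower().split())
    return {label: BigramLM(texts, vocab) for label, texts in by_label.items()}

def classify(utterance, models):
    """Pick the dialog act whose model assigns the highest log-likelihood."""
    return max(models, key=lambda label: models[label].log_prob(utterance))

# Invented toy corpus; labels are illustrative, not the Verbmobil inventory.
toy_corpus = [
    ("hello good morning", "GREET"),
    ("good morning mister smith", "GREET"),
    ("how about monday at ten", "SUGGEST"),
    ("how about tuesday afternoon", "SUGGEST"),
    ("yes that suits me", "ACCEPT"),
    ("yes monday is fine", "ACCEPT"),
]
toy_models = train_models(toy_corpus)
print(classify("how about friday", toy_models))
```

A class prior can be added to each model's log-likelihood when the dialog acts are not equally frequent; it is omitted here to keep the sketch short.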
conference of the international speech communication association | 1995
Ralf Kompe; Andreas Kießling; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini; A. Zottmann; Anton Batliner
Prosodic boundary detection is important to disambiguate parsing, especially in spontaneous speech, where elliptic sentences occur frequently. Word graphs are an efficient interface between word recognition and parser. Prosodic classification of word chains has been published earlier. The adjustments necessary for applying these classification techniques to word graphs are discussed in this paper. When classifying a word hypothesis, a set of context words has to be determined appropriately. A method has been developed to use stochastic language models for prosodic classification. This, too, has been adapted for use on word graphs. We also improved the set of acoustic-prosodic features, with which the recognition errors were reduced by about [...] on the read speech we were working on previously, now achieving [...] error rate for [...] boundary classes and [...] for [...] accent classes. Moving to spontaneous speech, the recognition error increases significantly (e.g. [...] for a [...]-class boundary task). We show that even on word graphs the combination of language models, which model a larger context, with acoustic-prosodic classifiers reduces the recognition error by up to [...].
international conference on acoustics, speech, and signal processing | 1995
Ernst Günter Schukat-Talamazzini; Joachim Hornegger; Heinrich Niemann
Linear discriminant or Karhunen-Loeve transforms are established techniques for mapping features into a lower dimensional subspace. This paper introduces a uniform statistical framework, where the computation of the optimal feature reduction is formalized as a maximum-likelihood estimation problem. The experimental evaluation of this suggested extension of linear selection methods shows a slight improvement of the recognition accuracy.
Mustererkennung 1991, 13. DAGM-Symposium | 1991
Ernst Günter Schukat-Talamazzini; Heinrich Niemann
The Isadora system is an HMM-based system for the analysis of speech signals. Phonetic, morphological, and grammatical speech units are represented by the nodes of a hierarchical constituent network. Ordinary left-to-right Markov models serve for the acoustic modeling of minimal network nodes, while the models of more complex nodes are constructed by suitable combinations (serial connection, parallel connection, feedback) of smaller HMMs.
international conference on acoustics, speech, and signal processing | 1992
Ernst Günter Schukat-Talamazzini; H. Niemann; W. Eckert; Thomas Kuhn; S. Rieck
The authors address the choice of suitable subword units for the hidden Markov model (HMM)-based front-end of a speaker-independent large vocabulary continuous speech dialog system (EVAR). In contrast to the well-known approach of using context-dependent phone-like units (for instance generalized triphones) the authors developed inventories of larger-sized subword units, so-called context-freezing units (CFU). CFU models can be considered as an approximation to the extremely desirable situation of having whole word HMMs under the limiting conditions of the training speech data at hand. Recognition experiments indicate an advantage of the context-freezing units over triphone/biphone/phone combinations in terms of the achieved word accuracy, at least in the case of German speech. Using triphones with contexts generalized by means of broad phonetic classes, the authors achieved results comparable to the CFU ones.
Speech recognition and understanding | 1992
H. Niemann; Gerhard Sagerer; U. Ehrlic; Ernst Günter Schukat-Talamazzini; Franz Kummert
This contribution describes an approach to integrate a speech understanding and dialog system into a homogeneous architecture based on semantic networks. The definition of the network as well as its use in speech understanding is described briefly. A scoring function for word hypotheses meeting the requirements of a graph search algorithm is presented. The main steps of the linguistic analysis, i.e. syntax, semantics, and pragmatics, are described and their realization in the semantic network is shown. The processing steps alternating between data- and model-driven phases are outlined using an example sentence which demonstrates a tight interaction between word recognition and linguistic processing.
international conference on acoustics, speech, and signal processing | 1993
Ernst Günter Schukat-Talamazzini; M. Bielecki; Heinrich Niemann; Thomas Kuhn; S. Rieck
Three algorithms to speed up full-covariance multivariate Gaussian vector quantizers are presented. The speed-up is achieved by avoiding the distance calculation for a considerable number of codebook classes at each input frame. In two cases, this pruning is guided by thresholds $U_{\kappa\lambda}$, which are computed for each class pair in a preprocessing stage. The computation of the $U_{\kappa\lambda}$ from the codebook parameters by the gradient projection method leads to an admissible search strategy. Two of the proposed search procedures trade accuracy for speed. Both of them allow a more than fivefold speed-up of vector quantization at very low frame error rates, and without any degradation of word accuracy.
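The general idea of pruning a codebook search with precomputed pairwise thresholds can be illustrated with a much simpler setting than the one in this abstract. The sketch below swaps in plain Euclidean distance and thresholds derived from the triangle inequality (half the distance between two codebook centres); it does not use full-covariance Gaussians or the gradient projection method, and the function names are hypothetical. Like the admissible strategy mentioned above, this pruning never changes the search result.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_thresholds(codebook):
    """Preprocessing: half the distance between each pair of codebook centres."""
    n = len(codebook)
    return [[0.5 * dist(codebook[k], codebook[l]) for l in range(n)]
            for k in range(n)]

def pruned_nearest(x, codebook, thresholds):
    """Return (index of the nearest centre, number of distance evaluations)."""
    best = 0
    best_d = dist(x, codebook[0])
    evaluations = 1
    for l in range(1, len(codebook)):
        # Triangle inequality: if d(x, c_best) <= 0.5 * d(c_best, c_l),
        # then d(x, c_l) >= d(x, c_best), so class l can be skipped safely.
        if best_d <= thresholds[best][l]:
            continue
        d = dist(x, codebook[l])
        evaluations += 1
        if d < best_d:
            best, best_d = l, d
    return best, evaluations

def brute_force_nearest(x, codebook):
    """Reference search that evaluates every codebook centre."""
    return min(range(len(codebook)), key=lambda k: dist(x, codebook[k]))

# Toy usage: a frame close to the first centre needs only one evaluation.
toy_codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
toy_thresholds = pairwise_thresholds(toy_codebook)
print(pruned_nearest((1.0, 1.0), toy_codebook, toy_thresholds))
```

How many evaluations are saved depends on the order in which classes are visited; starting from a likely candidate, such as the winner of the previous frame, is one common choice.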
international conference on pattern recognition | 1992
S. Rieck; Ernst Günter Schukat-Talamazzini; H. Niemann
Presents a new approach to speaker adaptation based on semi-continuous hidden Markov models (SCHMM). The authors introduce a modification of the semi-continuous codebook updating which allows rapid speaker adaptation. The approach is based on the idea that phonetic information already incorporated in a trained model should be used to update the codebook. Thus the different acoustic representation of a new speaker is learned while the connection between codebook entries and model states remains the same. Several experiments were carried out with a small speech sample. It is possible to demonstrate that the new codebook updating performs better than conventional SCHMM codebook updating and that using a speech sample comprising about 40 seconds of adaptation speech is enough to achieve 50 percent of the difference in performance between full speaker-dependent training and no adaptation at all.
Mustererkennung 1995, 17. DAGM-Symposium | 1995
Keren Yu; Bernard Achermann; C. Nyffenegger; Xiaoyi Jiang; Horst Bunke; Ernst Günter Schukat-Talamazzini
In this contribution we present one method each for the recognition of frontal and profile views of human faces. Profile recognition is based on a shape comparison, while hidden Markov models (HMMs) are used for the frontal views. Combining both methods significantly improves the recognition rate.