Ernst Günter Schukat-Talamazzini

Publication

Featured research published by Ernst Günter Schukat-Talamazzini.

Pattern Recognition | 1995

Off-line cursive handwriting recognition using hidden Markov models

Horst Bunke; M. Roth; Ernst Günter Schukat-Talamazzini

A method for the off-line recognition of cursive handwriting based on hidden Markov models (HMMs) is described. The features used in the HMMs are based on the arcs of skeleton graphs of the words to be recognized. An algorithm is applied to the skeleton graph of a word that extracts the edges in a particular order. Given the sequence of edges extracted from the skeleton graph, each edge is transformed into a 10-dimensional feature vector. The features represent information about the location of an edge relative to the four reference lines, its curvature, and the degree of the nodes incident to the considered edge. The linear model was adopted as the basic HMM topology. Each letter of the alphabet is represented by a linear HMM. Given a dictionary of fixed size, an HMM for each dictionary word is built by sequential concatenation of the HMMs representing the individual letters of the word. Training of the HMMs is done by means of the Baum-Welch algorithm, while the Viterbi algorithm is used for recognition. An average correct recognition rate of over 98% on the word level has been achieved in experiments with cooperative writers using two dictionaries of 150 words each.
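The word-model construction and Viterbi recognition described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete observation symbols instead of the 10-dimensional edge feature vectors, uses hand-set probabilities instead of Baum-Welch training, and ignores the exit probability of the final state. The names `linear_hmm`, `concatenate`, `viterbi_log_score`, `recognize`, and the toy letter models and dictionary are all hypothetical.

```python
import math

NEG_INF = float("-inf")

def linear_hmm(emissions, stay=0.5):
    """Linear HMM: one state per emission table; each state either
    loops with probability `stay` or advances to the next state."""
    return [(stay, table) for table in emissions]

def concatenate(letter_models, word):
    """Build a word HMM by chaining the letter HMMs in spelling order."""
    states = []
    for letter in word:
        states.extend(letter_models[letter])
    return states

def viterbi_log_score(states, observations, floor=1e-6):
    """Log probability of the best path that starts in the first state
    and ends in the last one (unseen symbols get a small floor value)."""
    n = len(states)

    def emit(i, symbol):
        return math.log(states[i][1].get(symbol, floor))

    score = [NEG_INF] * n
    score[0] = emit(0, observations[0])
    for symbol in observations[1:]:
        new = [NEG_INF] * n
        for i in range(n):
            stay = score[i] + math.log(states[i][0])
            move = NEG_INF
            if i > 0:
                move = score[i - 1] + math.log(1.0 - states[i - 1][0])
            best = max(stay, move)
            if best > NEG_INF:
                new[i] = best + emit(i, symbol)
        score = new
    return score[-1]

# Hypothetical toy letter models over made-up edge symbols.
LETTERS = {
    "a": linear_hmm([{"arc": 0.8, "stroke": 0.2}]),
    "l": linear_hmm([{"stroke": 0.7, "loop": 0.3}]),
    "o": linear_hmm([{"loop": 0.9, "arc": 0.1}]),
}
DICTIONARY = ["al", "lo", "ola"]

def recognize(observations):
    """Return the dictionary word whose concatenated HMM scores best."""
    return max(
        DICTIONARY,
        key=lambda w: viterbi_log_score(concatenate(LETTERS, w), observations),
    )

print(recognize(["stroke", "loop"]))
print(recognize(["loop", "stroke", "arc"]))
```

Because every dictionary word is scored with its own concatenated model, the cost of recognition in this sketch grows linearly with the dictionary size, which is consistent with the abstract's restriction to a dictionary of fixed size.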

International Joint Conference on Artificial Intelligence | 1996

Automatic classification of dialog acts with semantic classification trees and polygrams

Marion Mast; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini

This paper presents automatic methods for the classification of dialog acts. In the Verbmobil application (speech-to-speech translation of face-to-face dialogs), at most 50% of the utterances are analyzed in depth; the rest receive only shallow processing. The dialog component keeps track of the dialog with this shallow processing. For the classification of utterances without in-depth processing, two methods are presented: Semantic Classification Trees (SCTs) and Polygrams. For both methods the classification algorithm is trained automatically from a corpus of labeled data. The novel idea with respect to SCTs is the use of dialog-state-dependent classification trees, and with respect to Polygrams it is the use of competing language models for the classification of dialog acts.
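The idea of competing language models can be sketched as follows: one language model is trained per dialog act, and an utterance is assigned to the act whose model gives it the highest likelihood. This sketch is a simplification and not the paper's method: it uses add-one-smoothed bigrams rather than polygrams, assumes equal class priors, and the function names, labels, and toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_bigram_models(labeled_utterances):
    """Collect per-class bigram and context counts from (label, words) pairs."""
    bigrams = defaultdict(Counter)
    contexts = defaultdict(Counter)
    vocab = {"</s>"}
    for label, words in labeled_utterances:
        tokens = ["<s>"] + words + ["</s>"]
        vocab.update(words)
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[label][(prev, cur)] += 1
            contexts[label][prev] += 1
    return bigrams, contexts, vocab

def log_likelihood(words, label, model):
    """Add-one-smoothed bigram log-likelihood of an utterance under one class."""
    bigrams, contexts, vocab = model
    tokens = ["<s>"] + words + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = bigrams[label][(prev, cur)] + 1
        denominator = contexts[label][prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

def classify(words, model):
    """Let the class-specific models compete; the best-scoring class wins."""
    labels = list(model[0].keys())
    return max(labels, key=lambda label: log_likelihood(words, label, model))

# Hypothetical toy corpus of labeled utterances.
TOY_CORPUS = [
    ("suggest", "how about monday".split()),
    ("suggest", "how about next week".split()),
    ("accept", "monday is fine".split()),
    ("accept", "that is fine".split()),
]
model = train_bigram_models(TOY_CORPUS)

print(classify("how about friday".split(), model))
print(classify("that is fine".split(), model))
```

A class prior could be added to each score if the dialog acts are unevenly distributed; it is left out here to keep the competition between models visible.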

Conference of the International Speech Communication Association | 1995

Prosodic scoring of word hypotheses graphs

Ralf Kompe; Andreas Kießling; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini; A. Zottmann; Anton Batliner

Prosodic boundary detection is important to disambiguate parsing, especially in spontaneous speech, where elliptic sentences occur frequently. Word graphs are an efficient interface between word recognition and parser. Prosodic classification of word chains has been published earlier. The adjustments necessary for applying these classification techniques to word graphs are discussed in this paper. When classifying a word hypothesis, a set of context words has to be determined appropriately. A method has been developed to use stochastic language models for prosodic classification. This as well has been adopted for the use on word graphs. We also improved the set of acoustic-prosodic features, with which the recognition errors were reduced by about […] on the read speech we were working on previously, now achieving […] error rate for boundary classes and […] for accent classes. Moving to spontaneous speech, the recognition error increases significantly (e.g., […] for a class boundary task). We show that even on word graphs the combination of language models, which model a larger context, with acoustic-prosodic classifiers reduces the recognition error by up to […].

International Conference on Acoustics, Speech, and Signal Processing | 1995

Optimal linear feature transformations for semi-continuous hidden Markov models

Ernst Günter Schukat-Talamazzini; Joachim Hornegger; Heinrich Niemann

Linear discriminant and Karhunen-Loève transforms are established techniques for mapping features into a lower-dimensional subspace. This paper introduces a uniform statistical framework in which the computation of the optimal feature reduction is formalized as a maximum-likelihood estimation problem. The experimental evaluation of this suggested extension of linear selection methods shows a slight improvement in recognition accuracy.

Mustererkennung 1991, 13. DAGM-Symposium | 1991

Das ISADORA-System - ein akustisch-phonetisches Netzwerk zur automatischen Spracherkennung

Ernst Günter Schukat-Talamazzini; Heinrich Niemann

The Isadora system is an HMM-based system for the analysis of speech signals. Phonetic, morphological, and grammatical speech units are represented by the nodes of a hierarchical constituent network. Ordinary left-to-right Markov models serve for the acoustic modeling of minimal network nodes, while the models of more complex nodes are constructed by suitably combining smaller HMMs (sequential connection, parallel connection, and feedback).

International Conference on Acoustics, Speech, and Signal Processing | 1992

Acoustic modelling of subword units in the Isadora speech recognizer

Ernst Günter Schukat-Talamazzini; H. Niemann; W. Eckert; Thomas Kuhn; S. Rieck

The authors address the choice of suitable subword units for the hidden Markov model (HMM)-based front end of a speaker-independent, large-vocabulary continuous speech dialog system (EVAR). In contrast to the well-known approach of using context-dependent phone-like units (for instance, generalized triphones), the authors developed inventories of larger-sized subword units, so-called context-freezing units (CFUs). CFU models can be considered an approximation to the extremely desirable situation of having whole-word HMMs under the limiting conditions of the training speech data at hand. Recognition experiments indicate an advantage of the context-freezing units over triphone/biphone/phone combinations in terms of the achieved word accuracy, at least in the case of German speech. Using triphones with contexts generalized by means of broad phonetic classes, the authors achieved results comparable to the CFU ones.

Speech Recognition and Understanding | 1992

The Interaction of Word Recognition and Linguistic Processing in Speech Understanding

H. Niemann; Gerhard Sagerer; U. Ehrlic; Ernst Günter Schukat-Talamazzini; Franz Kummert

This contribution describes an approach to integrate a speech understanding and dialog system into a homogeneous architecture based on semantic networks. The definition of the network as well as its use in speech understanding is described briefly. A scoring function for word hypotheses meeting the requirements of a graph search algorithm is presented. The main steps of the linguistic analysis, i.e. syntax, semantics, and pragmatics, are described and their realization in the semantic network is shown. The processing steps alternating between data- and model-driven phases are outlined using an example sentence which demonstrates a tight interaction between word recognition and linguistic processing.

International Conference on Acoustics, Speech, and Signal Processing | 1993

A non-metrical space search algorithm for fast Gaussian vector quantization

Ernst Günter Schukat-Talamazzini; M. Bielecki; Heinrich Niemann; Thomas Kuhn; S. Rieck

Three algorithms to speed up full-covariance multivariate Gaussian vector quantizers are presented. The speed-up is achieved by avoiding the distance calculation for a considerable number of codebook classes at each input frame. In two cases, this pruning is guided by thresholds $U_{\kappa\lambda}$, which are computed for each class pair in a preprocessing stage. The computation of the $U_{\kappa\lambda}$ from the codebook parameters by the gradient projection method leads to an admissible search strategy. Two of the proposed search procedures trade accuracy for speed. Both of them allow a more than fivefold speed-up of vector quantization at very low frame error rates, and without any degradation of word accuracy.

International Conference on Pattern Recognition | 1992

Speaker adaptation using semi-continuous hidden Markov models

S. Rieck; Ernst Günter Schukat-Talamazzini; H. Niemann

Presents a new approach to speaker adaptation based on semi-continuous hidden Markov models (SCHMM). The authors introduce a modification of the semi-continuous codebook updating which allows rapid speaker adaptation. The approach is based on the idea that phonetic information already incorporated in a trained model should be used to update the codebook. Thus the different acoustic representation of a new speaker is learned while the connection between codebook entries and model states remains the same. Several experiments were carried out with a small speech sample. The results demonstrate that the new codebook updating performs better than conventional SCHMM codebook updating, and that a speech sample comprising about 40 seconds of adaptation speech is enough to achieve 50 percent of the difference in performance between full speaker-dependent training and no adaptation at all.

Mustererkennung 1995, 17. DAGM-Symposium | 1995

Kombination von Frontal- und Profilanalyse menschlicher Gesichter

Keren Yu; Bernard Achermann; C. Nyffenegger; Xiaoyi Jiang; Horst Bunke; Ernst Günter Schukat-Talamazzini

In this contribution we present one method each for the recognition of frontal and profile views of human faces. Profile recognition is based on a shape comparison, while hidden Markov models (HMMs) are used for the frontal views. Combining the two methods significantly improves the recognition rate.
