Marcel Katz
Otto-von-Guericke University Magdeburg
Publication
Featured research published by Marcel Katz.
international conference on pattern recognition | 2006
Sven E. Krüger; Martin Schafföner; Marcel Katz; Edin Andelic; Andreas Wendemuth
Speech recognition is usually based on hidden Markov models (HMMs), which represent the temporal dynamics of speech very efficiently, and Gaussian mixture models, which classify speech into single speech units (phonemes) in a non-optimal way. In this paper we use parallel mixtures of support vector machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. SVMs are very appealing due to their association with statistical learning theory and have already shown good results in pattern recognition and in continuous speech recognition. They suffer, however, from a training effort that scales at least quadratically with the number of training vectors. The SVM mixtures need only nearly linear training time, making it easier to deal with the large amount of speech data. In our hybrid system we use the SVM mixtures as acoustic models in an HMM-based decoder. We train and test the hybrid system on the DARPA Resource Management (RM1) corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
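A minimal sketch of the divide-and-conquer idea behind such SVM mixtures, assuming a simple k-means partition of the training frames; the paper's actual gating and combination scheme may differ, and all function names below are illustrative. Because each expert sees only a fraction of the frames, the total training effort grows roughly linearly with corpus size.

```python
# Sketch: mixture of SVM experts for frame/phoneme classification.
# The k-means partition and the routing are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def train_svm_mixture(X, y, n_experts=8):
    """Partition the frames, then train one probabilistic SVM per partition."""
    gate = KMeans(n_clusters=n_experts, n_init=10).fit(X)
    experts = [SVC(kernel="rbf", probability=True)
               .fit(X[gate.labels_ == k], y[gate.labels_ == k])
               for k in range(n_experts)]
    return gate, experts

def frame_posteriors(gate, experts, X, classes):
    """Route each frame to its expert; return posteriors P(phoneme | x_t).
    classes: sorted array of all phoneme labels, e.g. np.unique(y)."""
    out = np.zeros((len(X), len(classes)))
    assign = gate.predict(X)
    for k, clf in enumerate(experts):
        rows = np.where(assign == k)[0]
        if rows.size:
            cols = np.searchsorted(classes, clf.classes_)
            out[np.ix_(rows, cols)] = clf.predict_proba(X[rows])
    return out
```

In a hybrid system these posteriors would replace the Gaussian-mixture scores of the HMM decoder, typically after division by the class priors.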
Neural Computation | 2006
Edin Andelic; Martin Schafföner; Marcel Katz; Sven E. Krüger; Andreas Wendemuth
Sparse nonlinear classification and regression models in reproducing kernel Hilbert spaces (RKHSs) are considered. The use of Mercer kernels and the square loss function gives rise to an overdetermined linear least-squares problem in the corresponding RKHS. When we apply a greedy forward selection scheme, the least-squares problem may be solved by an order-recursive update of the pseudoinverse in each iteration step. The computational time is linear with respect to the number of selected training samples.
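A small numerical sketch of greedy forward selection with an order-recursive (Greville-type) pseudoinverse update under the square loss. The column-selection criterion used here (correlation with the current residual) and the variable names are simplifying assumptions; the paper's exact scheme may differ.

```python
# Sketch: sparse kernel least squares via greedy forward selection.
# Kp is maintained as the pseudoinverse of the selected columns Ksel.
import numpy as np

def greedy_forward_selection(K, y, n_select):
    """K: full kernel matrix (m x m); y: targets (m,); returns selected indices
    and the expansion coefficients alpha for the selected kernel columns."""
    m = K.shape[0]
    selected, Kp, Ksel = [], np.zeros((0, m)), np.zeros((m, 0))
    residual = y.astype(float)
    for _ in range(n_select):
        scores = np.abs(K @ residual)          # simplistic selection criterion
        scores[selected] = -np.inf             # never pick a column twice
        j = int(np.argmax(scores))
        k = K[:, j]
        q = Kp @ k                             # order-recursive pseudoinverse
        r = k - Ksel @ q                       # update (Greville step)
        if np.linalg.norm(r) > 1e-12:
            b = r / (r @ r)
        else:
            b = (Kp.T @ q) / (1.0 + q @ q)
        Kp = np.vstack([Kp - np.outer(q, b), b])
        Ksel = np.column_stack([Ksel, k])
        selected.append(j)
        alpha = Kp @ y                         # least-squares weights in the RKHS
        residual = y - Ksel @ alpha
    return selected, alpha
```

Each update step reuses the previous pseudoinverse instead of refactoring from scratch, which is the source of the claimed efficiency.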
international conference on pattern recognition | 2002
Marcel Katz; Hans-Günter Meier; Hans J. G. A. Dolfing; Dietrich Klakow
This paper focuses on the problem of robustly estimating different transformation matrices based on linear discriminant analysis (LDA) as used in automatic speech recognition systems. We investigate the effect of class distributions with artificial features and compare the resulting Fisher criterion values. The paper shows that the Fisher criterion alone is not very helpful for assessing class separability. Furthermore, we address the problem of dealing with too many additional dimensions in the estimation. Experiments performed on subsets of the Wall Street Journal (WSJ) database indicate that a minimum of about 2000 feature vectors per class is needed for robust estimation with monophones. Finally, we make predictions for future experiments on LDA matrix estimation with more classes.
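For reference, one common form of the Fisher criterion discussed above, written with the within-class scatter $S_W$ and between-class scatter $S_B$ (the paper may use a determinant-ratio variant):

```latex
J \;=\; \operatorname{tr}\!\left(S_W^{-1} S_B\right),
\qquad
S_W \;=\; \sum_{c}\,\sum_{x \in c} (x-\mu_c)(x-\mu_c)^{\top},
\qquad
S_B \;=\; \sum_{c} N_c\,(\mu_c-\mu)(\mu_c-\mu)^{\top}.
```

The LDA transform consists of the leading eigenvectors of $S_W^{-1} S_B$, so the quality of the transform rests on how reliably $S_W$ and $S_B$ can be estimated, which is where the reported minimum of about 2000 feature vectors per class comes in.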
international conference on pattern recognition | 2006
Edin Andelic; Martin Schafföner; Marcel Katz; Sven E. Krüger
In this paper, we propose a novel order-recursive training algorithm for kernel-based discriminants which is computationally efficient. We integrate this method into a hybrid HMM-based speech recognition system by translating the outputs of the kernel-based classifier into class-conditional probabilities and using them instead of Gaussian mixtures as production probabilities of an HMM-based decoder for speech recognition. The performance of the described hybrid structure is demonstrated on the DARPA Resource Management (RM1) corpus.
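A minimal sketch of how such class-conditional scores can be obtained from classifier posteriors via the standard hybrid scaled-likelihood relation $p(x \mid s) \propto P(s \mid x)/P(s)$; whether the paper applies exactly this conversion is not stated in the abstract, so treat the function below as an assumption for illustration.

```python
# Sketch: turn frame posteriors P(state | x_t) into scaled log-likelihoods
# log p(x_t | state) + const, usable as HMM emission scores during decoding.
import numpy as np

def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """posteriors: (frames, states) array of P(state | x_t);
    priors: (states,) array of P(state), e.g. from a training alignment.
    The per-frame constant log p(x_t) cancels in Viterbi decoding."""
    post = np.clip(posteriors, floor, 1.0)
    pri = np.clip(priors, floor, 1.0)
    return np.log(post) - np.log(pri)
```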
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop | 2006
Marcel Katz; Martin Schafföner; Edin Andelic; Sven E. Krüger; Andreas Wendemuth
Logistic regression is a well-known classification method in the field of statistical learning. Recently, a kernelized version of logistic regression has become very popular, because it allows non-linear probabilistic classification and shows promising results on several benchmark problems. In this paper we show that kernel logistic regression (KLR) and especially its sparse extensions (SKLR) are useful alternatives to standard Gaussian mixture models (GMMs) and support vector machines (SVMs) in speaker recognition. While the classification results of KLR and SKLR are similar to those of SVMs, we show that SKLR produces highly sparse models. Unlike SVMs, kernel logistic regression also provides an estimate of the conditional probability of class membership. In speaker identification experiments the SKLR methods outperform the SVM and the GMM baseline system on the POLYCOST database.
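For reference, the kernel logistic regression model behind these experiments in its standard two-class form, with labels $y_i \in \{-1,+1\}$; the sparse extension additionally restricts the kernel expansion to a subset of the training vectors:

```latex
P(y=+1 \mid x) \;=\; \frac{1}{1+\exp\!\big(-f(x)\big)},
\qquad
f(x) \;=\; \sum_{i=1}^{N} \alpha_i\, k(x, x_i) + b,
\qquad
\min_{\alpha,\,b}\; \sum_{i=1}^{N} \log\!\Big(1+\exp\!\big(-y_i f(x_i)\big)\Big)
+ \frac{\lambda}{2}\,\alpha^{\top} K \alpha .
```

Unlike the SVM hinge loss, this objective is smooth, and its minimizer directly yields the conditional class-membership probabilities mentioned above.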
international conference on intelligent computing | 2006
Marcel Katz; Sven E. Krüger; Martin Schafföner; Edin Andelic; Andreas Wendemuth
In this paper we investigate two discriminative classification approaches for frame-based speaker identification and verification, namely Support Vector Machine (SVM) and Sparse Kernel Logistic Regression (SKLR). SVMs have already shown good results in regression and classification in several fields of pattern recognition as well as in continuous speech recognition. While the non-probabilistic output of the SVM has to be translated into conditional probabilities, the SKLR produces the probabilities directly. In speaker identification and verification experiments both discriminative classification methods outperform the standard Gaussian Mixture Model (GMM) system on the POLYCOST database.
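A small sketch of one common way to turn raw SVM decision scores into the required conditional probabilities, namely a Platt-style sigmoid fitted on held-out scores; the abstract does not name the exact mapping used, so the parametrization below is an assumption and all names are illustrative.

```python
# Sketch: Platt-style calibration of SVM decision scores into P(target speaker | score).
import numpy as np
from scipy.optimize import minimize

def fit_sigmoid(scores, labels):
    """Fit P(y=1 | s) = 1 / (1 + exp(A*s + B)) by maximum likelihood.
    scores: held-out decision_function outputs; labels: 0/1 ground truth."""
    y = (np.asarray(labels) == 1).astype(float)
    s = np.asarray(scores, dtype=float)

    def nll(params):
        A, B = params
        p = 1.0 / (1.0 + np.exp(A * s + B))
        eps = 1e-12
        return -np.sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

    A, B = minimize(nll, x0=[-1.0, 0.0]).x
    return lambda new_scores: 1.0 / (1.0 + np.exp(A * np.asarray(new_scores) + B))
```

The SKLR, by contrast, needs no such post-hoc calibration because its output is already a probability.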
international conference on acoustics, speech, and signal processing | 2006
Martin Schafföner; Sven E. Krüger; Edin Andelic; Marcel Katz; Andreas Wendemuth
Contemporary automatic speech recognition uses hidden Markov models (HMMs) to model the temporal structure of speech, where one HMM is used for each phonetic unit. The states of the HMMs are associated with state-conditional probability density functions (PDFs), which are typically realized using mixtures of Gaussian PDFs (GMMs). Training of GMMs is error-prone, especially if the amount of training data is limited. This paper evaluates two new methods of modeling state-conditional PDFs using probabilistically interpreted support vector machines and kernel Fisher discriminants. Extensive experiments on the RM1 (P. Price et al., 1988) corpus yield substantially improved recognition rates compared to traditional GMMs. Due to their generalization ability, our new methods reduce the word error rate by up to 13% using the complete training set and by up to 33% when the training set size is reduced.
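For reference, the kernel Fisher discriminant mentioned above maximizes, in the two-class case, a Rayleigh quotient over the expansion coefficients $\alpha$ of the feature-space direction $w = \sum_i \alpha_i\,\varphi(x_i)$ (standard formulation; the paper's probabilistic interpretation of the discriminant outputs is built on top of this):

```latex
J(\alpha) \;=\; \frac{\alpha^{\top} M \alpha}{\alpha^{\top} N \alpha},
\qquad
M \;=\; (m_1 - m_2)(m_1 - m_2)^{\top},
\qquad
(m_c)_i \;=\; \frac{1}{N_c}\sum_{x \in \text{class } c} k(x_i, x),
\qquad
N \;=\; \sum_{c} K_c\Big(I - \tfrac{1}{N_c}\mathbf{1}\mathbf{1}^{\top}\Big)K_c^{\top},
```

where $K_c$ holds the kernel values between all training points and the points of class $c$.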
Archive | 2005
M. Deutscher; Marcel Katz; Sven E. Krüger; M. Bajbouj
It is hard to describe all dynamical relationships between the ground and the robot in an analytical way, and unexpected environmental changes complicate safe walking. It is therefore natural to develop a method that allows the evaluation of acoustic emissions depending on the forces acting during past and consecutive steps while walking. Practical investigations of steps on different grounds deliver information that was recorded online and interpreted by an operator. The analytical evaluation of the relationship between the robot body and the environmental reactions, together with a subjective interpretation, labels the anticipated steps as good or bad. That knowledge can be retained for further explorations. The fusion of our data and the evaluation of these relationships were done by our SVM (support vector machine), a fast statistical classifier. The recognizable relationships between the empirical information and that of the SVM can help to generate adequate movements.
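A minimal, hypothetical sketch of the classification step described above: crude spectral features from the recorded step sounds fed to an SVM trained on the operator's good/bad judgments. The feature extraction and every name below are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: classify recorded step sounds as good or bad with an SVM.
# The log band-energy features are a hypothetical, simple choice.
import numpy as np
from sklearn.svm import SVC

def band_energies(signal, n_bands=16):
    """Crude spectral features: log energy in equally wide frequency bands."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(spectrum, n_bands)
    return np.log(np.array([band.sum() + 1e-10 for band in bands]))

def train_step_classifier(step_signals, operator_labels):
    """step_signals: list of 1-D audio arrays; operator_labels: 0 = bad, 1 = good."""
    X = np.vstack([band_energies(s) for s in step_signals])
    return SVC(kernel="rbf").fit(X, operator_labels)
```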
conference of the international speech communication association | 2005
Sven E. Krüger; Martin Schafföner; Marcel Katz; Edin Andelic; Andreas Wendemuth
Archive | 2004
Andreas Wendemuth; Edin Andelic; Sebastian Barth; Stefan Dobler; Marcel Katz; Sven E. Krüger; Michael Maiwald; Mathias Mamsch; Martin Schafföner