
Publication

Featured research published by Daniel Willett.


international conference on acoustics, speech, and signal processing | 2000

Frame-discriminative and confidence-driven adaptation for LVCSR

Frank Wallhoff; Daniel Willett; Gerhard Rigoll

Maximum likelihood linear regression (MLLR) has become the most popular approach for adapting speaker-independent hidden Markov models to a specific speaker's characteristics. However, it is well known that discriminative training objectives outperform maximum likelihood training approaches, especially when training data is very limited, as is always the case in adaptation tasks. This paper therefore explores the application of a frame-based discriminative training objective for adaptation. It presents evaluations for supervised as well as unsupervised adaptation on the 1993 WSJ adaptation tests of native and non-native speakers. Relative improvements in word error rate of up to 25% were measured compared to the MLLR-adapted recognition systems. Along with unsupervised adaptation, the paper also presents the improvements achieved by the application of confidence measures, which provided an average relative improvement of 10% compared to ordinary unsupervised MLLR.
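For context, the standard MLLR formulation that this work builds on (not spelled out in the abstract, but standard in the literature) adapts the Gaussian means of the speaker-independent model with a shared affine transform:

```latex
\hat{\mu}_{m} = A\,\mu_{m} + b
```

Here $(A, b)$ is estimated from the adaptation data by maximizing its likelihood; the frame-discriminative variant studied in the paper replaces this maximum likelihood objective with a frame-level discriminative one.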


international conference on acoustics, speech, and signal processing | 2013

Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions

Zoltán Tüske; Joel Pinto; Daniel Willett; Ralf Schlüter

In this paper, Multi Layer Perceptron (MLP) based multilingual bottleneck features are investigated for acoustic modeling in three languages - German, French, and US English. We use a modified training algorithm to handle the multilingual training scenario without having to explicitly map the phonemes to a common phoneme set. Furthermore, the cross-lingual portability of bottleneck features between the three languages are also investigated. Single pass recognition experiments on large vocabulary SMS dictation task indicate that (1) multilingual bottleneck features yield significantly lower word error rates compared to standard MFCC features (2) multilingual bottleneck features are superior to monolingual bottleneck features trained for the target language with limited training data, and (3) multilingual bottleneck features are beneficial in training acoustic models in a low resource language where only mismatched training data is available-by exploiting the more matched training data from other languages.


international conference on document analysis and recognition | 1999

Performance evaluation of a new hybrid modeling technique for handwriting recognition using identical on-line and off-line data

Anja Brakensiek; Andreas Kosmala; Daniel Willett; Wenwei Wang; Gerhard Rigoll

The paper deals with the performance evaluation of a novel hybrid approach to large vocabulary cursive handwriting recognition and contains various innovations. 1) It presents the investigation of a new hybrid approach to handwriting recognition, consisting of hidden Markov models (HMMs) and neural networks trained with a special information theory based training criterion. This approach has only been recently introduced successfully to online handwriting recognition and is now investigated for the first time for offline recognition. 2) The hybrid approach is extensively compared to traditional HMM modeling techniques and the superior performance of the new hybrid approach is demonstrated. 3) The data for the comparison has been obtained from a database containing online handwritten data which has been converted to offline data. Therefore, a multiple evaluation has been carried out, incorporating the comparison of different modeling techniques and the additional comparison of each technique for online and offline recognition, using a unique database. The results confirm that online recognition leads to better recognition results due to the dynamic information of the data, but also show that it is possible to obtain recognition rates for offline recognition that are close to the results obtained for online recognition. Furthermore, it can be shown that for both online and offline recognition, the new hybrid approach clearly outperforms the competing traditional HMM techniques. It is also shown that the new hybrid approach yields superior results for the offline recognition of machine printed multifont characters.


international conference on acoustics, speech, and signal processing | 2002

Recent advances in efficient decoding combining on-line transducer composition and smoothed language model incorporation

Daniel Willett; Shigeru Katagiri

This paper presents and evaluates our recent efforts toward efficient decoding for large vocabulary continuous speech recognition in the framework of weighted finite-state transducers. We evaluate on-the-fly transducer composition for reduced memory consumption, combined with weight smearing for a more time-synchronous language model incorporation. It turns out that in the on-line composition mode, weight smoothing within the static part of the network benefits the run-time to accuracy ratio even more than in the fully precompiled case. Evaluations are carried out on a state-of-the-art recognition system with a 10k-word vocabulary, cross-word triphone acoustic models and a trigram language model. In this scenario, the Viterbi search is carried out fully time-synchronously in only a single pass. The combination of on-the-fly network composition with only the unigram part of the language model smoothly compiled into the network achieves a remarkably good run-time to accuracy ratio with only moderate memory requirements.
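The idea behind smearing the unigram part of the language model into the network is to distribute each word's score along its arcs (a look-ahead), so pruning can act before the word ends. A toy sketch, assuming a character-level prefix network and an invented three-word vocabulary:

```python
from math import log

# Toy unigram scores (log-probabilities); real systems use the full
# vocabulary -- these words and values are illustrative assumptions.
unigram = {"cat": log(0.5), "car": log(0.3), "dog": log(0.2)}

def lookahead(prefix):
    """Best attainable LM score over all words continuing this prefix."""
    scores = [s for w, s in unigram.items() if w.startswith(prefix)]
    return max(scores) if scores else float("-inf")

def smeared_arc_weight(prefix, char):
    """Incremental weight placed on the arc prefix -> prefix + char."""
    return lookahead(prefix + char) - lookahead(prefix)

# The arc weights along a word's path telescope, so the look-ahead at
# the root plus the arc weights sums to the word's full unigram score.
arc_sum = sum(smeared_arc_weight("car"[:i], "car"[i]) for i in range(3))
total = lookahead("") + arc_sum
```

Because the weights are applied as early as the prefix disambiguates them, beam pruning sees (an optimistic bound on) the language model cost frame-synchronously instead of only at word ends.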


international conference on pattern recognition | 1998

A new hybrid approach to large vocabulary cursive handwriting recognition

Gerhard Rigoll; Andreas Kosmala; Daniel Willett

Presents a hybrid modeling technique that is used for the first time in hidden Markov model-based handwriting recognition. This new approach combines the advantages of discrete and continuous Markov models, and it is shown to be especially suitable for modeling the features typically used in handwriting recognition. The performance of this hybrid technique is demonstrated by an extensive comparison with traditional modeling techniques on a difficult large vocabulary handwriting recognition task.


IEEE Transactions on Speech and Audio Processing | 2001

A continuous density interpretation of discrete HMM systems and MMI-neural networks

Christoph Neukirchen; Jörg Rottland; Daniel Willett; Gerhard Rigoll

The subject of this paper is the integration of the traditional combination of a vector quantizer (VQ) and discrete hidden Markov models (HMMs) into the mixture emission density framework commonly used in automatic speech recognition (ASR). It is shown that the probability density of a system consisting of a VQ and a discrete classifier can be interpreted as a special case of a semi-continuous mixture model. Thus, the VQ parameters and the classifier can be trained jointly. In this framework, a gradient-based VQ training method for single and multiple feature stream systems is derived. This leads to an approach that is directly related to the paradigm of maximum mutual information (MMI) neural networks, which have previously been applied successfully as VQs in ASR. In continuous speech recognition experiments carried out on the Resource Management and Wall Street Journal databases, the presented systems achieve recognition accuracies that compete well with comparable Gaussian mixture HMMs. Thus, we demonstrate that the performance degradations often reported for discrete HMM systems are not mainly caused by the vector quantization process itself, but are due to the traditional separation of the VQ and the HMM during parameter estimation. These degradations can be avoided by training the entire system as described here, while keeping the attractive computational speed of discrete HMMs.
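The reinterpretation described above can be written out explicitly (a standard formulation, not quoted from the paper): with VQ codebook classes $j$, the emission density of state $s$ becomes a shared mixture,

```latex
p(\mathbf{x} \mid s) \;=\; \sum_{j} \underbrace{p(j \mid s)}_{\text{discrete HMM emission}} \; \underbrace{p(\mathbf{x} \mid j)}_{\text{density of VQ class } j}
```

This has exactly the form of a semi-continuous mixture model with one codebook of component densities shared across all states, so the quantizer term $p(\mathbf{x} \mid j)$ and the discrete classifier term $p(j \mid s)$ can be trained jointly under a common objective rather than estimated separately.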


international conference on acoustics, speech, and signal processing | 2000

DUcoder-the Duisburg University LVCSR stackdecoder

Daniel Willett; Christoph Neukirchen; Gerhard Rigoll

With this paper, we present the DUcoder, the LVCSR decoder developed at Duisburg University. The decoder performs the Viterbi search for the most probable word sequence in recognition systems that make use of HMMs and backoff N-gram language models. In principle, the decoding strategy is similar to that of so-called stack decoders. During development of the decoder, emphasis was placed on innovations that speed up decoding through carefully chosen approximations. Besides a brief presentation of the decoder's overall design, this paper points out the crucial issues with respect to speed and recognition performance. Evaluations are carried out on a German LVCSR system with a vocabulary of 100,000 words, word-internal triphones and a trigram language model. Close-to-real-time performance is achieved with 12% additional error, while a decoder configuration running at around 40 times real-time causes no search error on the evaluation set.
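The speed-versus-search-error trade-off mentioned above typically comes from beam pruning: hypotheses scoring too far below the current best are discarded, which is fast but can discard the eventual winner. A minimal sketch of the mechanism (not the DUcoder's actual code; scores are invented):

```python
def prune(hyps, beam):
    """Beam pruning: keep only hypotheses whose log score lies within
    `beam` of the best one. Tight beams are fast but risk search errors;
    wide beams approach exact Viterbi search at higher cost."""
    best = max(score for _, score in hyps)
    return [(h, s) for h, s in hyps if s >= best - beam]

# Three partial hypotheses with illustrative log scores.
hyps = [("a", -1.0), ("b", -5.0), ("c", -1.5)]
kept_tight = prune(hyps, 2.0)    # "b" falls outside the beam
kept_wide  = prune(hyps, 10.0)   # everything survives
```

The reported operating points (close-to-real-time with 12% extra error versus 40x real-time with none) correspond to a tight versus a very wide setting of exactly this kind of threshold.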


international conference on acoustics, speech, and signal processing | 1998

A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction

Gerhard Rigoll; Daniel Willett

This paper deals with a hybrid NN/HMM architecture for continuous speech recognition. We present a novel approach to setting up a neural linear or nonlinear feature transformation that is used as a preprocessor on top of the HMM system's RBF network to produce discriminative feature vectors that are well suited to being modeled by mixtures of Gaussian distributions. In order to avoid the computational cost of discriminative training of a context-dependent system, we propose to train a discriminant neural feature transformation on a system of low complexity and reuse this transformation in the context-dependent system to output improved feature vectors. The resulting hybrid system is an extension of a state-of-the-art continuous HMM system and, in fact, is the first hybrid system capable of outperforming these standard systems in recognition accuracy without the need for discriminative training of the entire system. In experiments carried out on the Resource Management 1000-word continuous speech recognition task, we achieved a relative error reduction of about 10% with a recognition system that, even before, was among the best ever observed on this task.


Mustererkennung 2000, 22. DAGM-Symposium | 2000

Unlimited Vocabulary Script Recognition Using Character N-Grams

Anja Brakensiek; Daniel Willett; Gerhard Rigoll

In this paper a robust script recognition system is described that makes use of a language model consisting of backoff character n-grams. The system is based on hidden Markov models (HMMs) using discrete and hybrid modeling techniques, where the latter depends on a vector quantizer trained according to the MMI criterion (an information-theory-based neural network). The presented recognition results refer to the SEDAL database of degraded English documents, such as photocopies and faxes, using no dictionary, and to a writer-dependent database of cursive German handwriting samples. Our resulting character recognition system yields significantly better recognition results for an unlimited vocabulary when language models are used.
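A backoff character n-gram, as used here, scores a character from the longest seen history and falls back to shorter histories with a penalty when the full context was unseen. A toy bigram sketch with invented counts (real systems estimate these tables from large corpora):

```python
# Toy backoff bigram over characters -- all probabilities and backoff
# weights below are illustrative assumptions, not trained values.
bigram  = {("t", "h"): 0.6, ("h", "e"): 0.7}
unigram = {"t": 0.2, "h": 0.1, "e": 0.15}
backoff = {"t": 0.4, "h": 0.3}   # per-history backoff weights

def p(char, history):
    """Backoff estimate: use the bigram if it was observed, otherwise
    back off to the unigram, scaled by the history's backoff weight."""
    if (history, char) in bigram:
        return bigram[(history, char)]
    return backoff.get(history, 1.0) * unigram.get(char, 1e-6)
```

Because the model is defined over characters rather than words, it imposes no dictionary and therefore supports the unlimited-vocabulary setting of the paper.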


international conference on acoustics, speech, and signal processing | 1999

Refining tree-based state clustering by means of formal concept analysis, balanced decision trees and automatically generated model-sets

Daniel Willett; Christoph Neukirchen; Jörg Rottland; Gerhard Rigoll

Decision tree-based state clustering has emerged as the most popular approach for clustering the states of context-dependent hidden Markov model based speech recognizers. The application of sets of phones, mostly phonetically motivated, that limit the possible clusters results in reasonably good modeling of unseen phones, while still allowing specific phones to be modeled very precisely whenever this is necessary and enough training data is available. Formal concept analysis, a young mathematical discipline, provides means for the treatment of sets and sets of sets that are well suited to further improving tree-based state clustering. The possible refinements are outlined and evaluated in this paper. The major merit is the proposal of procedures for adapting the number of sets used for clustering to the amount of available training data, and of a method that generates suitable sets automatically without incorporating additional knowledge.
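In tree-based state clustering, each candidate phone-set question is scored by the likelihood gain of splitting the state's data accordingly; the best-scoring split is applied greedily. A 1-D toy sketch of that scoring step (not the paper's code; the frames and phone sets are invented):

```python
import math

def gaussian_ll(data):
    """Log-likelihood of 1-D data under its own ML single Gaussian."""
    n = len(data)
    mean = sum(data) / n
    var = max(sum((x - mean) ** 2 for x in data) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def split_gain(data, in_set):
    """Likelihood gain of splitting frames by a phone-set question:
    LL of the two resulting clusters minus LL of the pooled cluster."""
    yes = [x for p, x in data if p in in_set]
    no  = [x for p, x in data if p not in in_set]
    if not yes or not no:
        return float("-inf")
    return gaussian_ll(yes) + gaussian_ll(no) - gaussian_ll([x for _, x in data])

# Toy frames: (context phone, observed feature value); illustrative only.
frames = [("p", 1.0), ("b", 1.1), ("s", 4.0), ("z", 4.2)]
gain_good = split_gain(frames, {"p", "b"})   # separates the two clusters
gain_bad  = split_gain(frames, {"p", "s"})   # mixes them
```

The refinements discussed in the paper concern where the candidate sets (`in_set` above) come from: adapting how many are offered to the available data, or generating them automatically instead of relying on hand-written phonetic knowledge.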

Collaboration

Dive into Daniel Willett's collaborations.

Top Co-Authors

Jörg Rottland
University of Duisburg-Essen

Chuang He
Nuance Communications