Fabian Triefenbach
Ghent University
Publications
Featured research published by Fabian Triefenbach.
IEEE Transactions on Audio, Speech, and Language Processing | 2013
Fabian Triefenbach; Azarakhsh Jalalvand; Kris Demuynck; Jean-Pierre Martens
Accurate acoustic modeling is an essential requirement of a state-of-the-art continuous speech recognizer. The Acoustic Model (AM) describes the relation between the observed speech signal and the non-observable sequence of phonetic units uttered by the speaker. Nowadays, most recognizers use Hidden Markov Models (HMMs) in combination with Gaussian Mixture Models (GMMs) to model the acoustics, but neural-based architectures are on the rise again. In this work, the recently introduced Reservoir Computing (RC) paradigm is used for acoustic modeling. A reservoir is a fixed, and thus non-trained, Recurrent Neural Network (RNN) that is combined with a trained linear model. This approach combines the ability of an RNN to model the recent past of the input sequence with a simple and reliable training procedure. It is shown here that simple reservoir-based AMs achieve reasonable phone recognition accuracy and that deep hierarchical and bi-directional reservoir architectures lead to a very competitive Phone Error Rate (PER) of 23.1% on the well-known TIMIT task.
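To make the reservoir idea concrete, the following is a minimal sketch of a fixed recurrent reservoir with a trained linear readout, using ridge regression on frame-level features. All names, sizes, and parameter values here are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal reservoir-computing sketch: a fixed (non-trained) recurrent
# network whose states feed a trained linear readout. Sizes, scaling
# factors, and the leak rate below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_out = 39, 500, 40      # e.g. MFCC features in, phone classes out

# Fixed random weights: input projection and recurrent matrix.
W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
W_res = rng.normal(0.0, 1.0, (n_res, n_res))
W_res *= 0.9 / np.abs(np.linalg.eigvals(W_res)).max()   # set spectral radius

def run_reservoir(X, leak=0.4):
    """Collect reservoir states for a feature sequence X of shape (T, n_in)."""
    states = np.zeros((len(X), n_res))
    h = np.zeros(n_res)
    for t, x in enumerate(X):
        h = (1 - leak) * h + leak * np.tanh(W_in @ x + W_res @ h)
        states[t] = h
    return states

def train_readout(states, Y, ridge=1e-4):
    """Ridge regression from reservoir states to one-hot targets Y (T, n_out)."""
    A = states.T @ states + ridge * np.eye(n_res)
    return np.linalg.solve(A, states.T @ Y)   # W_out with shape (n_res, n_out)
```

Only the readout involves learning; the recurrent weights stay fixed, which is what makes the training procedure simple and reliable.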
2011 First International Conference on Informatics and Computational Intelligence | 2011
Fabian Triefenbach; Jean-Pierre Martens
It has been shown for some time that a Recurrent Neural Network (RNN) can perform an accurate acoustic-phonetic decoding of a continuous speech stream. However, the error back-propagation through time (EBPTT) training of such a network often ends up in a bad local optimum and is very time consuming. These problems hamper the deployment of sufficiently large networks that would be able to outperform state-of-the-art Hidden Markov Models. To overcome this drawback of RNNs, we recently proposed to employ a large pool of recurrently connected non-linear nodes (a so-called reservoir) with fixed weights, and to map the reservoir outputs to meaningful phonemic classes by means of a layer of linear output nodes (called the readout nodes) whose weights form the solution of a set of linear equations. In this paper, we collect experimental evidence that the performance of a reservoir-based system can be enhanced by working with non-linear readout nodes. Although this calls for an iterative training procedure, it boils down to a non-linear regression that appears to be less error-prone and less time consuming than EBPTT.
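As a hedged illustration of the non-linear readout idea, the sketch below trains a softmax readout on fixed reservoir states by plain gradient descent; gradients never flow back through the reservoir itself, which is the key difference from EBPTT. The function name and hyperparameters are assumptions, not the paper's setup.

```python
# Non-linear (softmax) readout on fixed reservoir states, trained with
# gradient descent on the cross-entropy loss. Unlike EBPTT, nothing is
# back-propagated through the recurrent reservoir.
import numpy as np

def train_softmax_readout(states, Y, lr=0.1, epochs=50):
    """states: (T, n_res) reservoir outputs; Y: (T, n_out) one-hot targets."""
    T, n_res = states.shape
    W = np.zeros((n_res, Y.shape[1]))
    for _ in range(epochs):
        logits = states @ W
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)             # class posteriors
        W -= lr * states.T @ (P - Y) / T              # cross-entropy gradient
    return W
```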
Computer Speech & Language | 2015
Azarakhsh Jalalvand; Fabian Triefenbach; Kris Demuynck; Jean-Pierre Martens
Highlights:
- Study of the robustness of Reservoir Computing (RC) based continuous digit recognizers.
- Discovery of new relations between RC control parameters and the input and output dynamics.
- Use of these relations to find heuristics that reduce the reservoir development time.
- Creation of an RC-based recognizer that is more noise robust than the AFE-GMM-HMM.

It is acknowledged that Hidden Markov Models (HMMs) with Gaussian Mixture Models (GMMs) as the observation density functions achieve excellent digit recognition performance at high signal-to-noise ratios (SNRs). Moreover, many years of research have led to good techniques to reduce the impact of noise, distortion and mismatch between training and test conditions on the recognition accuracy. Nevertheless, we still await systems that are truly robust against these confounding factors. The present paper extends recent work on acoustic modeling based on Reservoir Computing (RC), a concept that has its roots in Machine Learning. By introducing a novel analysis of reservoirs as non-linear dynamical systems, new insights are gained and translated into a new reservoir design recipe that is extremely simple and highly comprehensible in terms of the dynamics of the acoustic features and the modeled acoustic units. By tuning the reservoir to these dynamics, one can create RC-based systems that not only compete well with conventional systems in clean conditions, but also degrade more gracefully in noisy conditions. Control experiments show that the noise-robustness follows from the random fixation of the reservoir neurons, whereas tuning the reservoir dynamics increases the accuracy without compromising the noise-robustness.
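As a purely illustrative sketch of what tying a reservoir control parameter to the feature dynamics can look like (the concrete formula below is an assumption, not the paper's actual design recipe): a leaky-integrator neuron h_t = (1 - a) h_{t-1} + a f(...) forgets with a time constant of roughly frame_shift / a, so the leak rate a can be chosen from the duration of the acoustic units being modeled.

```python
# Hypothetical helper: choose a leak rate so the reservoir memory spans
# roughly one acoustic unit. The formula is an illustrative assumption,
# not the paper's exact recipe.
def leak_rate_for(unit_duration_ms, frame_shift_ms=10.0):
    """Leaky neuron h_t = (1-a)*h_{t-1} + a*f(...) has a memory time
    constant of about frame_shift_ms / a, so a = frame_shift / duration
    makes the memory span match the target unit duration."""
    return min(1.0, frame_shift_ms / unit_duration_ms)

print(leak_rate_for(100.0))   # ~0.1 for phone-sized units at a 10 ms shift
```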
IEEE Signal Processing Letters | 2014
Fabian Triefenbach; Kris Demuynck; Jean-Pierre Martens
Thanks to research in neural network based acoustic modeling, progress in Large Vocabulary Continuous Speech Recognition (LVCSR) seems to have gained momentum recently. In search of further progress, the present letter investigates Reservoir Computing (RC) as an alternative new paradigm for acoustic modeling. RC unifies the appealing dynamical modeling capacity of a Recurrent Neural Network (RNN) with the simplicity and robustness of linear regression as a method for training the weights of that network. In previous work, an RC-HMM hybrid achieving very good phone recognition accuracy on TIMIT was designed, but no evidence had yet been offered that this success would also transfer to LVCSR. This letter describes the development of an RC-HMM hybrid that provides good recognition on the Wall Street Journal benchmark. For the WSJ0 5k word task, word error rates of 6.2% (bigram language model) and 3.9% (trigram) are obtained on the Nov-92 evaluation set. Given that RC-based acoustic modeling is a fairly new approach, these results open up promising perspectives.
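In a hybrid NN-HMM system, the network's state posteriors are commonly converted into scaled likelihoods by dividing out the state priors before HMM decoding; below is a minimal sketch of that standard conversion (the letter's exact recipe may differ, and the shapes and flooring value are assumptions).

```python
# Standard hybrid-system conversion: turn readout posteriors P(s|x) into
# scaled likelihoods p(x|s) proportional to P(s|x) / P(s) for HMM decoding.
import numpy as np

def posteriors_to_logliks(P, priors, floor=1e-8):
    """P: (T, n_states) frame posteriors; priors: (n_states,) state priors
    estimated from the training alignments. Returns log-scaled likelihoods."""
    P = np.clip(P, floor, 1.0)
    return np.log(P) - np.log(priors)   # equals log p(x|s) up to a constant
```

The resulting log-scores then take the place of the GMM scores inside a conventional HMM Viterbi decoder.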
Spoken Language Technology Workshop | 2012
Fabian Triefenbach; Kris Demuynck; Jean-Pierre Martens
In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that an RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A relative word error rate reduction on the order of 20% is possible.
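A state-level score combination is typically a weighted (log-linear) interpolation of the two systems' per-state acoustic scores; the sketch below illustrates that idea under assumed input shapes and a development-set-tuned weight, and may differ from the paper's exact scheme.

```python
# Log-linear state-level combination of two acoustic models' scores.
# The interpolation weight w would be tuned on development data; the
# aligned (T, n_states) score matrices are assumed inputs.
import numpy as np

def combine_state_scores(loglik_tandem, loglik_baseline, w=0.5):
    """Frame-by-frame, state-by-state weighted sum of acoustic log-scores."""
    return w * np.asarray(loglik_tandem) + (1.0 - w) * np.asarray(loglik_baseline)
```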
Neural Information Processing Systems | 2010
Fabian Triefenbach; Azarakhsh Jalalvand; Benjamin Schrauwen; Jean-Pierre Martens
Conference of the International Speech Communication Association | 2011
Azarakhsh Jalalvand; Fabian Triefenbach; David Verstraeten; Jean-Pierre Martens
IEEE Automatic Speech Recognition and Understanding Workshop | 2013
Kris Demuynck; Fabian Triefenbach
Conference of the International Speech Communication Association | 2012
Azarakhsh Jalalvand; Fabian Triefenbach; Jean-Pierre Martens
Conference of the International Speech Communication Association | 2013
Fabian Triefenbach; Azarakhsh Jalalvand; Kris Demuynck; Jean-Pierre Martens