Herbert Reininger
Goethe University Frankfurt
Publication
Featured research published by Herbert Reininger.
international conference on acoustics, speech, and signal processing | 1985
Herbert Reininger; D. Wolf
In vector quantization schemes, speech- and speaker-dependent codebooks are usually applied in order to achieve good speech quality at medium bit rates. This paper deals with another approach: the speech waveforms are transformed into signals which ideally no longer contain speech- and speaker-specific features. Thus these signals can be encoded by a universal vector quantizer. This concept is realized by a system called RELP-VQ. The performance of this RELP-VQ scheme was evaluated by SNR measurements as well as by informal listening tests including female and male English and German speakers.
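As a rough illustration of the two ingredients named in this abstract, the following minimal sketch shows a nearest-codeword lookup in a fixed codebook and an SNR measurement in dB. The function names and the toy codebook are assumptions made for illustration; this is not the RELP-VQ system itself.

```python
import math


def nearest_codeword(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def snr_db(signal, reconstruction):
    """Signal-to-noise ratio in dB between a signal and its reconstruction."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - r) ** 2 for s, r in zip(signal, reconstruction))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical toy codebook shared by all speakers (the "universal" idea).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
index = nearest_codeword((0.9, 1.2), codebook)  # only this index would be transmitted
decoded = codebook[index]
```

Only the codeword index has to be stored or transmitted; the decoder looks the vector up in the same codebook.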
IET Biometrics | 2015
Stefan Billeb; Christian Rathgeb; Herbert Reininger; Klaus Kasper; Christoph Busch
Voice biometric data is considered personally identifiable information; consequently, the increasing demand for (mobile) speaker recognition systems calls for applications that prevent privacy threats such as identity theft or tracking without consent. Technologies of biometric template protection, in particular biometric cryptosystems, fulfil the standardised properties of irreversibility and unlinkability, which represent appropriate countermeasures to such vulnerabilities of conventional biometric recognition systems. Thereby, public confidence in and social acceptance of biometric applications are strengthened. In this work the authors propose a binarisation technique, which is used to extract scalable high-entropy binary voice reference data (templates) from speaker models, based on Gaussian mixture models and universal background models. Binary feature vectors are then protected within a template protection scheme, in particular a fuzzy commitment scheme, in which error-correcting list decoding is employed to overcome the high intra-class variance of voice samples. In experiments, which are carried out on a text-independent speaker corpus of 339 individuals, it is demonstrated that the fully ISO/IEC IS 24745 compliant system achieves privacy protection at a negligible loss of biometric performance, confirming the soundness of the presented approach.
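The general idea of a fuzzy commitment scheme can be sketched in a few lines. The sketch below is a toy under stated assumptions: it swaps the list decoding mentioned in the abstract for a simple 5-fold repetition code with majority voting, and all names (`REPEAT`, `commit`, `verify`) are hypothetical. It is not the authors' system.

```python
import hashlib
import secrets

REPEAT = 5  # toy repetition code: every key bit is repeated 5 times (assumption)


def encode(key_bits):
    """Repetition-encode the key bits into a codeword."""
    return [b for b in key_bits for _ in range(REPEAT)]


def decode(code_bits):
    """Majority-vote each block of REPEAT bits back to one key bit."""
    return [int(sum(code_bits[i:i + REPEAT]) > REPEAT // 2)
            for i in range(0, len(code_bits), REPEAT)]


def digest(bits):
    """Hash a list of 0/1 values."""
    return hashlib.sha256(bytes(bits)).hexdigest()


def commit(template_bits):
    """Bind a random key to the binary template; store only hash and helper data."""
    key = [secrets.randbelow(2) for _ in range(len(template_bits) // REPEAT)]
    helper = [c ^ t for c, t in zip(encode(key), template_bits)]
    return digest(key), helper


def verify(query_bits, key_hash, helper):
    """Accept if the query is close enough to the enrolled template."""
    noisy_codeword = [h ^ q for h, q in zip(helper, query_bits)]
    return digest(decode(noisy_codeword)) == key_hash


template = [1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
key_hash, helper = commit(template)
```

With this toy code, a query that differs from the enrolled template in at most two bits per 5-bit block still verifies, while three flipped bits in one block cause a rejection. Neither the template nor the key is stored in the clear.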
Proceedings. 24th EUROMICRO Conference (Cat. No.98EX204) | 1998
Harald Wüst; Klaus Kasper; Herbert Reininger
In order to establish solutions based on neural networks for low cost products for the mass market, low-power and low-complexity single chip neuro-processors for implementing large neural networks are needed. We introduce a highly optimized hardware design of a low-complexity and cascadable neuro-processor for realizing feedforward and in particular recurrent neural networks. One main feature of the proposed design is a special mixed floating point and fixed point arithmetic which, in contrast to high precision floating point units, reduces the necessary word lengths and the overall memory requirements. Moreover, a special activity memory structure is used to enable the efficient calculation of recurrent networks, avoiding communication and data transfer problems. Finally, the application of the proposed design to a speech recognition task and its realization on an FPGA is presented.
international conference on acoustics, speech, and signal processing | 1997
Klaus Kasper; Herbert Reininger; Dietrich Wolf
We present a robust speaker independent speech recognition system consisting of a feature extraction based on a model of the auditory periphery, and a locally recurrent neural network for scoring of the derived feature vectors. A number of recognition experiments were carried out to investigate the robustness of this combination against different types of noise in the test data. The proposed method is compared with cepstral, RASTA, and JAH-RASTA processing for feature extraction and hidden Markov models for scoring. The presented results show that the information in features from the auditory model can be best exploited by locally recurrent neural networks. The robustness achieved by this combination is comparable to that of JAH-RASTA in combination with HMM but without any requirement for an explicit adaptation to the noise in speech pauses.
international conference on acoustics, speech, and signal processing | 1991
U. Kipper; Herbert Reininger; Dietrich Wolf
The authors present a novel method for improving the speech quality of CELP (code excited linear prediction) schemes at low bit rates using an excitation sequence with a locally adapted number of pulses. The design of this adaptive CELP scheme for a data rate of 4.8 kb/s is described. The improvement of the performance as compared to that of a conventional 4.8 kb/s CELP scheme in terms of signal-to-noise ratio is about 3 dB. Due to the significantly reduced roughness, in particular in voiced segments, the processed speech sounds clearer and more natural independently of the speaker.
international conference on acoustics, speech, and signal processing | 1996
Klaus Kasper; Herbert Reininger; Harald Wüst
Recurrent neural networks (RNN) provide a solution for low cost speech recognition systems (SRS) in mass products or in products with energy constraints if their inherent parallelism can be exploited in a hardware realization. Currently, the computational complexity of SRS based on fully recurrent neural networks (FRNN), e.g. the large number of connections, prevents a hardware realization. We introduce locally recurrent neural networks (LRNN) in order to keep the properties of RNN on the one hand and to reduce the connectivity density of the network on the other hand. By simulation experiments it is shown that the recognition capability of LRNN is equivalent to that of FRNN and superior to other proposed network architectures. Furthermore, it is shown that with an appropriate representation of the network parameters and a retraining of the network, 5-bit quantization of the weights and activities is possible without significant loss in recognition performance.
international conference on acoustics, speech, and signal processing | 1995
Klaus Kasper; Herbert Reininger; Dietrich Wolf; Harald Wüst
For a variety of telephone applications it is sufficient to realize a speech recognition system (SRS) with a system vocabulary consisting of a few command words, digits, and connected digits. However, in the development of an SRS for application in a telephone environment it has to be considered that the speech is bandpass limited and that a high recognition performance has to be guaranteed under speaker independent and even adverse conditions. Furthermore, it is important that the SRS is efficiently implementable. Fully recurrent neural networks (FRNN) provide a new approach for realizing a robust SRS with a single network. FRNN are able to perform the process of feature scoring discriminatively and independently of the length of the feature sequence. In SRS based on Hidden Markov Models (HMM), different methods have to be applied for scoring the feature vectors and for compensating the variations in phone durations. Here we report on investigations to realize a monolithic SRS based on FRNN for telephone speech. Besides isolated word recognition, the capability of FRNN-SRS to deal with connected digit recognition is presented. Furthermore, it is shown how FRNN can be made robust against several types of additive noise.
ieee workshop on neural networks for signal processing | 1994
Klaus Kasper; Herbert Reininger; Dietrich Wolf; Harald Wüst
Reports on investigations concerning the application of fully recurrent neural networks (FRNN) for speaker independent speech recognition. In a phoneme based recognition system separate FRNN are used for feature scoring as well as for compensating variations in time durations of speech segments. A recognizer with a FRNN for feature scoring achieves the same recognition rate as a recognition system where the context information is provided. The performance of the FRNN used for time alignment is comparable to that of a Viterbi based alignment with durational constraints. Additionally, a monolithic speech recognizer is realized by FRNN which directly classifies feature sequences. The performance of this FRNN is comparable to that of speech recognition systems which are based on discrete hidden Markov models and use a sophisticated durational modeling. Furthermore, simulation experiments revealed that FRNN are able to extract relevant information for speech recognition from noise contaminated speech and thus achieve a robust recognition performance.
Odyssey 2016 | 2016
Marco Paulini; Christian Rathgeb; Andreas Nautsch; Hermine Reichau; Herbert Reininger; Christoph Busch
Technologies of biometric template protection grant a significant improvement in data privacy and increase the likelihood that the general public will effectively consent to the use of biometric systems. For speaker recognition, this area of research is still in its infancy. Previously proposed voice biometric template protection schemes fail to guarantee the required properties of irreversibility and unlinkability without significantly degrading the recognition accuracy. A crucial step for accurate and secure template protection schemes is the feature type transformation which might be required to binarize extracted feature vectors. In this paper we introduce a binarization technique for voice biometric features called multi-bit allocation. The proposed scheme, which builds upon a GMM-UBM-based speaker recognition system, is designed to extract discriminative compact binary feature vectors to be applied in a voice biometric template protection scheme. In a preliminary experimental study we show that the resulting binary representation causes only a marginal decrease in biometric performance compared to the baseline system, confirming the soundness and applicability of the proposed scheme.
ieee workshop on neural networks for signal processing | 1995
Klaus Kasper; Herbert Reininger; Dietrich Wolf; Harald Wüst
Speech recognition systems (SRS) designed for applications in low cost products, like telephones, or in systems like autonomous vehicles, are faced with the demand for solutions with low complexity. A small vocabulary consisting of a few command words and the digits is sufficient for most of the applications but has to be recognized robustly. Here we report on investigations concerning the application of recurrent neural networks (RNN) for speaker independent recognition of speech signals with telephone bandwidth. An RNN-SRS with low complexity is developed which recognizes isolated words as well as connected digits in adverse conditions. We introduce locally recurrent neural networks (LRNN). LRNN are layered networks which have recurrent connections only between the neurons of a hidden layer and their n-nearest neighbours. The neurons of the input and the output layer have unidirectional and sparse connections to the hidden layer. In comparison to RNN the density of the connections is drastically reduced and long distance wiring can be avoided in a VLSI realization.
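The locally recurrent connectivity described here can be pictured as a banded connection mask over the hidden layer. The sketch below assumes the hidden neurons are arranged on a line, that the neighbourhood is split evenly to both sides, and that each neuron also feeds back to itself; none of these details is stated in the abstract, and the function names are hypothetical.

```python
def local_recurrent_mask(num_hidden, n_neighbours):
    """Return mask[i][j] == 1 if hidden neuron j has a recurrent connection to neuron i."""
    half = n_neighbours // 2  # neighbours on each side (assumption)
    mask = [[0] * num_hidden for _ in range(num_hidden)]
    for i in range(num_hidden):
        for j in range(max(0, i - half), min(num_hidden, i + half + 1)):
            mask[i][j] = 1  # includes the self-connection i == j (assumption)
    return mask


def connection_density(mask):
    """Fraction of possible recurrent connections that are actually present."""
    total = len(mask) * len(mask[0])
    return sum(sum(row) for row in mask) / total


mask = local_recurrent_mask(num_hidden=6, n_neighbours=2)
density = connection_density(mask)  # 16 of 36 possible connections
```

A fully recurrent hidden layer would have density 1.0. For a fixed neighbourhood size the number of local connections grows linearly with the layer size instead of quadratically, which is what keeps the wiring short.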