Erhard Rank
Vienna University of Technology
Publication
Featured research published by Erhard Rank.
Signal Processing | 2003
Erhard Rank
We examine Bayesian learning of a regularization factor and the noise level of radial basis function (RBF) networks in the framework of nonlinear time-series prediction and system modeling. A Bayesian-trained RBF network is applied in an autonomous recursive prediction model (oscillator model) for regenerating time series generated by the Lorenz system and speech signals. The oscillator model is able to capture the invariant measures of the Lorenz system for sufficiently high SNR, and to reproduce the voiced part of speech signals.
International Conference on Acoustics, Speech, and Signal Processing | 1997
Erhard Rank; Gernot Kubin
Starting from the waveguide model for plucked strings, a new digital signal processing model for the slapping technique on electric bass guitars is derived. The model includes amplitude limitations for the string at the frets and/or the fingerboard. These highly nonlinear elements are realized by conditional reflections which depend on the local string displacement. A model of the string dynamics for the two slap-bass techniques (knocking the string with the thumb knuckle, and plucking very strongly with the index or middle finger) has been implemented both as MATLAB and C simulations and synthesizes sounds close to the natural instrument.
International Work Conference on Artificial and Natural Neural Networks | 2001
Erhard Rank; Gernot Kubin
In this paper we present a speech analysis/synthesis coder based on a combination of linear prediction with nonlinear modeling of the residual using a regularized radial basis function (RBF) network. The model has been applied to synthesis of sustained vowel signals and has been found to preserve the dynamics and spectra of the original speech signal. While several nonlinear speech models reportedly suffer from high-frequency losses in the synthesized speech due to system inherent low-pass behavior, our approach achieves good speech signal reproduction even in the higher frequency ranges. The decomposition of the speech signal by linear prediction analysis supports processing during synthesis such as pitch modifications while the nonlinear modeling provides the means for adequate reproduction of the fine-grained dynamic characteristics of speech.
Speech Communication | 2006
Erhard Rank; Gernot Kubin
The autonomous oscillator model for speech synthesis is augmented by a non-linear predictor to re-generate the modulated noise-like signal component of speech signals. The resulting ‘oscillator-plus-noise’ model in combination with vocal tract modeling by linear prediction is able to re-generate the spectral content of stationary wide-band vowel signals with high fidelity. For adequate modeling of mixed-excitation speech signals (such as voiced fricatives), the model is extended by a second linear prediction path for the independent spectral shaping of the noise-like component. With one and the same model, not only sustained voiced and mixed-excitation phonemes, but also stationary unvoiced sounds can be re-generated faithfully.
Lecture Notes in Computer Science | 2005
Gernot Kubin; Claudia Lainscsek; Erhard Rank
More than ten years ago the first successful application of a nonlinear oscillator model to high-quality speech signal processing was reported (Kubin and Kleijn, 1994). Since then, numerous developments have been initiated to turn nonlinear oscillators into a standard tool for speech technology. The present contribution will review and compare several of these attempts with a special emphasis on adaptive model identification from data and the approaches to the associated machine learning problems. This includes Bayesian methods for the regularization of the parameter estimation problem (including the pruning of irrelevant parameters) and Ansatz library (Lainscsek et al., 2001) based methods (structure selection of the model). We conclude with the observation that these advanced identification methods need to be combined with a thorough background from speech science to succeed in practical modeling tasks.
International Conference on Acoustics, Speech, and Signal Processing | 2006
Erhard Rank; Tuan Van Pham; Gernot Kubin
In this paper we address the application of a denoising algorithm based on wavelet packet decomposition and quantile noise estimation to noise suppression for automatic speech recognition. The denoising algorithm is adapted to suit the different requirements in machine recognition, as compared to human perception, and is tested in combination with state-of-the-art speech recognition systems. The results show that, if the proposed algorithm is integrated with the recognition system (including the training process), a performance comparable to recent high-quality noise suppression methods is achieved.
International Conference on Advanced Technologies for Communications | 2010
Tuan V. Pham; Michael Stark; Erhard Rank
In this paper, we analyze the performance of wavelet-based voice activity detection (VAD) algorithms with respect to the detection of target speech. In addition, the state-of-the-art VADs standardized for G.729 B and ETSI AFE ES 202 050 are evaluated extensively. Experimental results on a self-built cocktail party corpus including different target-interference speech activity conditions are provided. Results show that: (i) the wavelet-based VAD algorithms are superior to other VAD methods in terms of classification measures; (ii) the robustness of the wavelet feature still holds in a completely mismatched environment.
Conference of the International Speech Communication Association | 2003
Michael Pucher; Friedrich Neubarth; Erhard Rank; Georg Niklfeld; Qi Guan
Conference of the International Speech Communication Association | 1999
Erhard Rank
Archive | 2007
Erhard Rank; Gernot Kubin