Publication


Featured research published by David Sündermann.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Text-Independent Voice Conversion Based on Unit Selection

David Sündermann; Harald Höge; Antonio Bonafonte; Hermann Ney; Alan W. Black; Shrikanth Narayanan

So far, most voice conversion training procedures are text-dependent, i.e., they are based on parallel training utterances of the source and target speaker. Since several applications (e.g., speech-to-speech translation or dubbing) require text-independent training, training techniques that use non-parallel data have been proposed over the last two years. In this paper, we present a new approach that applies unit selection to find corresponding time frames in source and target speech. A subjective experiment shows that this technique achieves the same performance as conventional text-dependent training.
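
The frame-matching idea in this abstract can be sketched as a small dynamic program. This is a minimal illustration, not the paper's implementation: the squared Euclidean distance, the weight `alpha` that balances the source-to-target match against the continuity of consecutive target frames, and the function names are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_units(source, target, alpha=0.5):
    """For each source frame, pick one target frame index.

    The path minimizes alpha * (source-to-target distance)
    plus (1 - alpha) * (distance between consecutive chosen target frames).
    Both cost terms and alpha are assumptions of this sketch.
    """
    n = len(target)
    # cost[j]: best cumulative cost of a path that ends in target frame j
    cost = [alpha * sq_dist(source[0], t) for t in target]
    back = []  # back[k][j]: predecessor of target frame j at step k + 1
    for s in source[1:]:
        new_cost, ptr = [], []
        for t in target:
            best_i = min(
                range(n),
                key=lambda i: cost[i] + (1 - alpha) * sq_dist(target[i], t),
            )
            step = (1 - alpha) * sq_dist(target[best_i], t) + alpha * sq_dist(s, t)
            new_cost.append(cost[best_i] + step)
            ptr.append(best_i)
        cost = new_cost
        back.append(ptr)
    # Trace the cheapest path back from the last source frame.
    j = min(range(n), key=lambda i: cost[i])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path
```

With `alpha=1.0` the continuity term vanishes and each source frame is simply matched to its nearest target frame; with `alpha=0.0` the path stays on a single target frame. The search is quadratic in the number of target frames, so a practical system would need pruning.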


IEEE Automatic Speech Recognition and Understanding Workshop | 2003

VTLN-based cross-language voice conversion

David Sündermann; Hermann Ney; Harald Höge

In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As cross-language voice conversion aims at the transformation of a source speaker's voice into that of a target speaker using a different language, we want to investigate whether VTLN is an appropriate method to adapt the voice characteristics. After applying several conventional VTLN warping functions, we extend the conventional piecewise linear function to several segments, allowing a more detailed warping of the source spectrum. Experiments on cross-language voice conversion are performed on three corpora of two languages and both speaker genders.


International Conference on Acoustics, Speech, and Signal Processing | 2005

A study on residual prediction techniques for voice conversion

David Sündermann; Antonio Bonafonte; Hermann Ney

Several well-studied voice conversion techniques use line spectral frequencies as features to represent the spectral envelopes of the processed speech frames. In order to return to the time domain, these features are converted to linear predictive coefficients that serve as coefficients of a filter applied to an unknown residual signal. We compare several residual prediction approaches that have already been proposed in the literature dealing with voice conversion. We also present a novel technique that outperforms the others in terms of voice conversion performance and sound quality.
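
The role of the residual in this abstract can be illustrated with a plain all-pole synthesis filter. The sketch below assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and uses hypothetical function names. It does not cover the conversion from line spectral frequencies to linear predictive coefficients, and it is not any of the residual prediction methods the paper compares.

```python
def synthesize(residual, lpc):
    """Apply the all-pole filter 1/A(z) to a residual signal.

    lpc holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (assumed convention).
    """
    out = []
    for n, e in enumerate(residual):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def inverse_filter(signal, lpc):
    """Apply the FIR filter A(z) to a signal, which yields its residual."""
    res = []
    for n, s in enumerate(signal):
        acc = s
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        res.append(acc)
    return res
```

Inverse filtering a synthesized signal with the same coefficients returns the original residual, which is why the quality of the predicted residual bounds the quality of the converted speech.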


International Symposium on Signal Processing and Information Technology | 2003

VTLN-based voice conversion

David Sündermann; Hermann Ney

In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As voice conversion aims at the transformation of a source speaker's voice into that of a target speaker, we want to investigate whether VTLN is an appropriate method to adapt the voice characteristics. After applying several conventional VTLN warping functions, we extend the piecewise linear function to several segments, allowing a more detailed warping of the source spectrum. Experiments on voice conversion are performed on three corpora of two languages and both speaker genders.
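
A piecewise linear warping function with several segments can be sketched as linear interpolation between knots on the normalized frequency axis [0, pi]. The knot positions, the break frequency `omega0`, and the warping factor used below are illustrative values, not parameters taken from the paper.

```python
import math


def two_segment_knots(alpha, omega0=0.875 * math.pi):
    """Knots of a two-segment warp: slope alpha up to omega0,
    then a straight line to (pi, pi). omega0 is an assumed value."""
    return [(0.0, 0.0), (omega0, alpha * omega0), (math.pi, math.pi)]


def warp(omega, knots):
    """Map a frequency omega in [0, pi] through the piecewise linear
    function defined by knots, a sorted list of (input, output) pairs."""
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= omega <= x1:
            return y0 + (y1 - y0) * (omega - x0) / (x1 - x0)
    raise ValueError("omega must lie within the range covered by the knots")


# A hypothetical four-knot (three-segment) warp for finer control.
multi_knots = [
    (0.0, 0.0),
    (0.25 * math.pi, 0.30 * math.pi),
    (0.50 * math.pi, 0.55 * math.pi),
    (math.pi, math.pi),
]
```

Adding knots gives the warp more degrees of freedom than a single factor `alpha`, at the price of more parameters to estimate. The knots should stay strictly increasing in both coordinates so that the mapping remains monotonic.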


International Symposium on Signal Processing and Information Technology | 2004

Time domain vocal tract length normalization

David Sündermann; Antonio Bonafonte; Hermann Ney; Harald Höge

Recently, the speaker normalization technique VTLN (vocal tract length normalization), known from speech recognition, was applied to voice conversion. So far, VTLN has been performed in the frequency domain. However, to accelerate the conversion process, it is helpful to apply VTLN directly to the time frames of a speech signal. In this paper, we propose a technique that directly manipulates the time signal. Subjective tests show that voice conversion based on frequency-domain VTLN and on time-domain VTLN performs equivalently in terms of speech quality, while the latter requires about 20 times less processing time.


IEEE Automatic Speech Recognition and Understanding Workshop | 2005

Residual prediction based on unit selection

David Sündermann; Harald Höge; Antonio Bonafonte; Hermann Ney; Alan W. Black

Recently, we presented a study on residual prediction techniques that can be applied to voice conversion based on linear transformation or hidden Markov model-based speech synthesis. Our voice conversion experiments showed that none of the six compared techniques was capable of successfully converting the voice while achieving a fair speech quality. In this paper, we suggest a novel residual prediction technique based on unit selection that outperforms the others in terms of speech quality (mean opinion score = 3) while keeping the conversion performance.


International Conference Natural Language Processing | 2003

Synther - a new m-gram POS tagger

David Sündermann; Hermann Ney

The part-of-speech (POS) tagger synther, based on m-gram statistics, is described. After explaining its basic architecture, three smoothing approaches and the strategy for handling unknown words are presented. Subsequently, synther's performance is evaluated in comparison with four state-of-the-art POS taggers. All of them are trained and tested on three corpora of different languages and domains. In this evaluation, synther achieved the lowest or at least below-average error rates. Finally, it is shown that the linear interpolation smoothing strategy with coverage-dependent weights has better properties than the other two approaches.
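
Linear interpolation smoothing of m-gram tag probabilities can be sketched as follows. Note the simplification: the abstract describes coverage-dependent weights, whereas this sketch uses fixed weights supplied by the caller. The function names, the `<s>` padding symbol, and the toy corpus in the usage note are assumptions for illustration and are not taken from synther.

```python
from collections import Counter


def train(tag_sequences, m=3):
    """Count tag k-grams and their histories for k = 1..m."""
    ngram, hist = Counter(), Counter()
    for tags in tag_sequences:
        padded = ["<s>"] * (m - 1) + list(tags)
        for i in range(m - 1, len(padded)):
            for k in range(1, m + 1):
                h = tuple(padded[i - k + 1 : i])  # the k-1 preceding tags
                ngram[h + (padded[i],)] += 1
                hist[h] += 1
    return ngram, hist


def interpolated_prob(tag, history, ngram, hist, weights):
    """p(tag | history) as a weighted sum of relative frequencies.

    weights[k-1] is the fixed weight of the order-k model; the weights
    should sum to 1. Orders whose history was never seen contribute 0.
    """
    m = len(weights)
    padded = ("<s>",) * (m - 1) + tuple(history)
    context = padded[len(padded) - (m - 1):]  # last m-1 tags
    p = 0.0
    for k, lam in enumerate(weights, start=1):
        h = context[m - k:]  # last k-1 tags of the context
        if hist[h] > 0:
            p += lam * ngram[h + (tag,)] / hist[h]
    return p
```

For example, after `ngram, hist = train([["DET", "NOUN", "VERB"], ["DET", "ADJ", "NOUN"]])`, calling `interpolated_prob("NOUN", ["DET"], ngram, hist, (0.2, 0.3, 0.5))` mixes the unigram, bigram, and trigram estimates for a sentence-initial determiner. For a history seen in training, the probabilities over all tags sum to 1.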


Conference of the International Speech Communication Association | 2004

A first step towards text-independent voice conversion.

Hermann Ney; David Sündermann; Antonio Bonafonte; Harald Höge


Procesamiento Del Lenguaje Natural | 2004

Voice Conversion Using Exclusively Unaligned Training Data.

David Sündermann; Antonio Bonafonte; Harald Höge; Hermann Ney


International Symposium on Signal Processing and Information Technology | 2005

Residual prediction

David Sündermann; Harald Höge; Antonio Bonafonte; H. Duxans

Collaboration


Top Co-Authors

Hermann Ney, RWTH Aachen University

Antonio Bonafonte, Polytechnic University of Catalonia

Alan W. Black, Carnegie Mellon University

Guntram Strecha, Dresden University of Technology

Shrikanth Narayanan, University of Southern California