Wen-hsiang Tu | Researchain

Publication

Featured research published by Wen-hsiang Tu.

Signal Processing | 2012

Fast communication: Improved modulation spectrum enhancement methods for robust speech recognition

Jeih-weih Hung; Wen-hsiang Tu; Chien-chou Lai

In this paper, we present two novel algorithms to improve the noise robustness of features in speech recognition: modulation spectrum replacement (MSR) and modulation spectrum filtering (MSF). The magnitude spectra of feature streams are updated by referring to information collected from the clean training set, and the resulting feature streams are more noise-robust and yield higher recognition accuracy. In experiments conducted on the Aurora-2 noisy digit database, we show that the proposed MSR achieves an average relative error reduction rate of nearly 57% compared to baseline processing, and that MSF is especially effective in enhancing features preprocessed by conventional feature normalization methods, achieving even better recognition accuracy in noise-corrupted situations.

IEEE Signal Processing Letters | 2009

Incorporating Codebook and Utterance Information in Cepstral Statistics Normalization Techniques for Robust Speech Recognition in Additive Noise Environments

Jeih-weih Hung; Wen-hsiang Tu

Cepstral statistics normalization techniques have been shown to be very successful at improving the noise robustness of speech features. This letter proposes a hybrid-based scheme to achieve a more accurate estimate of the statistical information of features in these techniques. By properly integrating codebook and utterance knowledge, the resulting hybrid-based approach significantly outperforms conventional utterance-based, segment-based and codebook-based approaches in additive noise environments. Furthermore, the high-performance CS-HEQ can be implemented with a short delay and can thus be applied in real-time online systems.
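The idea of combining utterance and codebook statistics can be illustrated with a minimal sketch of mean and variance normalization on a single feature dimension. The abstract does not state how the two sources are integrated, so the linear interpolation weight `alpha`, the function names, and the weighted-codeword statistics below are all assumptions made for illustration, not the authors' method.

```python
import math


def utterance_stats(frames):
    """Mean and variance estimated from the frames of one utterance."""
    mean = sum(frames) / len(frames)
    var = sum((x - mean) ** 2 for x in frames) / len(frames)
    return mean, var


def codebook_stats(codewords, weights):
    """Mean and variance estimated from weighted codewords (hypothetical form)."""
    total = sum(weights)
    mean = sum(w * c for c, w in zip(codewords, weights)) / total
    var = sum(w * (c - mean) ** 2 for c, w in zip(codewords, weights)) / total
    return mean, var


def hybrid_mvn(frames, codewords, weights, alpha=0.5):
    """Normalize frames with statistics interpolated between the two sources.

    alpha = 1.0 uses utterance statistics only, alpha = 0.0 uses codebook
    statistics only. The interpolation rule itself is an assumption.
    """
    u_mean, u_var = utterance_stats(frames)
    c_mean, c_var = codebook_stats(codewords, weights)
    mean = alpha * u_mean + (1.0 - alpha) * c_mean
    var = alpha * u_var + (1.0 - alpha) * c_var
    std = math.sqrt(var) if var > 0 else 1.0
    return [(x - mean) / std for x in frames]
```

With `alpha=1.0` the sketch reduces to conventional utterance-based MVN, which makes it easy to compare against the baseline that the letter improves upon.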

IEEE Automatic Speech Recognition and Understanding Workshop | 2009

Sub-band modulation spectrum compensation for robust speech recognition

Wen-hsiang Tu; Sheng-Yuan Huang; Jeih-weih Hung

This paper proposes a novel scheme for applying feature statistics normalization techniques to robust speech recognition. In the proposed approach, the temporal-domain feature sequence is first converted into the modulation spectral domain. The magnitude part of the modulation spectrum is decomposed into non-uniform sub-band segments, and each sub-band segment is then processed individually by well-known normalization methods such as mean normalization (MN), mean and variance normalization (MVN) and histogram equalization (HEQ). Finally, the feature stream is reconstructed from all the modified sub-band magnitude spectral segments and the original phase spectrum using the inverse DFT. With this process, the components that correspond to more important modulation spectral bands in the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the newly proposed sub-band spectral MVN and HEQ provide relative error rate reductions of 18.66% and 23.58% over the conventional temporal MVN and HEQ, respectively.
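The pipeline described in this abstract (DFT, sub-band split of the magnitude, per-band normalization, inverse DFT with the original phase) can be sketched as follows for one real-valued feature sequence. This is an illustrative sketch only: the band edges, the reference statistics `ref_stats` (standing in for statistics that would come from clean training data), and the clipping of negative magnitudes to zero are assumptions, and a naive DFT is used to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def subband_mvn(features, band_edges, ref_stats):
    """Apply MVN separately to each sub-band of the modulation magnitude spectrum.

    band_edges: list of (lo, hi) bin ranges covering bins 0 .. N//2 (assumed).
    ref_stats:  list of (mean, std) targets per band (assumed to come from
                clean training data; values here are placeholders).
    """
    n = len(features)
    spec = dft(features)
    half = n // 2 + 1  # non-redundant bins of a real sequence
    mags = [abs(c) for c in spec[:half]]
    phases = [cmath.phase(c) for c in spec[:half]]

    new_mags = list(mags)
    for (lo, hi), (ref_mean, ref_std) in zip(band_edges, ref_stats):
        band = mags[lo:hi]
        mean = sum(band) / len(band)
        std = math.sqrt(sum((m - mean) ** 2 for m in band) / len(band))
        for i, m in enumerate(band):
            z = (m - mean) / std if std > 0 else 0.0
            # Map to the reference statistics; magnitudes cannot be negative.
            new_mags[lo + i] = max(0.0, ref_mean + ref_std * z)

    # Recombine modified magnitudes with the original phases, then restore
    # conjugate symmetry so the reconstructed sequence is real.
    new_spec = [0j] * n
    for k in range(half):
        new_spec[k] = cmath.rect(new_mags[k], phases[k])
    for k in range(half, n):
        new_spec[k] = new_spec[n - k].conjugate()
    return [c.real for c in idft(new_spec)]
```

A possible call, with placeholder band edges and targets, is `subband_mvn(seq, [(0, 1), (1, 3), (3, 5)], [(4.0, 1.0), (2.0, 0.5), (1.0, 0.2)])` for a sequence of length 8. Narrow bands at low modulation frequencies and wider bands at higher ones are one way to realize the non-uniform split mentioned above.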

EURASIP Journal on Advances in Signal Processing | 2012

Enhancing the magnitude spectrum of speech features for robust speech recognition

Jeih-weih Hung; Hao-Teng Fan; Wen-hsiang Tu

In this article, we present an effective compensation scheme to improve noise robustness for the spectra of speech signals. In this compensation scheme, called magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is performed on the frame sequence of the utterance. The magnitude spectra of non-speech frames are then reduced while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves an error reduction rate of nearly 42% relative to baseline processing. This method outperforms well-known spectral-domain speech enhancement techniques, including spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram equalization (HEQ), to achieve further improvements in recognition accuracy under noise-corrupted environments.
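The two-step structure of MSE (label frames with a VAD, then attenuate non-speech frames and amplify speech frames) can be sketched as below. The abstract does not specify the VAD or the scaling factors, so the energy-threshold rule, the `ratio` parameter, and the gain values are hypothetical placeholders rather than the published settings.

```python
def frame_energy(mag_frame):
    """Energy of one frame, computed from its magnitude spectrum."""
    return sum(m * m for m in mag_frame)


def simple_vad(mag_frames, ratio=0.5):
    """Flag a frame as speech when its energy exceeds ratio * mean energy.

    This threshold rule is an assumption standing in for the paper's VAD.
    """
    energies = [frame_energy(f) for f in mag_frames]
    threshold = ratio * sum(energies) / len(energies)
    return [e > threshold for e in energies]


def enhance_magnitude(mag_frames, speech_gain=1.5, noise_gain=0.1):
    """Amplify speech frames and shrink non-speech frames (gains are assumed)."""
    flags = simple_vad(mag_frames)
    return [
        [m * (speech_gain if is_speech else noise_gain) for m in frame]
        for frame, is_speech in zip(mag_frames, flags)
    ]
```

The enhanced magnitude spectra would then feed the usual feature extraction stage, which is where cepstral-domain methods such as MVN or HEQ could be applied on top.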

International Conference on Acoustics, Speech, and Signal Processing | 2010

Magnitude spectrum enhancement for robust speech recognition

Wen-hsiang Tu; Jeih-weih Hung

In this paper, an effective compensation scheme for the spectra of speech signals is proposed to improve their noise robustness. In this compensation scheme, named magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is first performed on the frame sequence of the utterance, and then the magnitude spectra of non-speech frames are set to be small while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves a relative error reduction rate of nearly 50% over baseline processing, outperforming the well-known spectral-domain speech enhancement techniques spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram equalization (HEQ), to further improve recognition accuracy under noise-corrupted environments.

International Conference on Computational Linguistics | 2011