
Publications


Featured research published by Chung-Chien Hsu.


International Conference on Acoustics, Speech, and Signal Processing | 2013

Voice activity detection based on frequency modulation of harmonics

Chung-Chien Hsu; Tse-En Lin; Jian-Hueng Chen; Tai-Shih Chi

In this paper, we propose a voice activity detection (VAD) algorithm based on spectro-temporal modulation structures of input sounds. A multi-resolution spectro-temporal analysis framework is used to inspect prominent speech structures. By comparing the energy of the frequency modulation of harmonics with an adaptive threshold, the proposed VAD distinguishes speech from non-speech. Compared with three standard VADs (ITU-T G.729B, ETSI AMR1, and AMR2), the proposed VAD performs significantly better in non-stationary noise in terms of receiver operating characteristic (ROC) curves and recognition rates from a practical distributed speech recognition (DSR) system.
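The adaptive-threshold decision described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's method: the frame energies stand in for the frequency-modulation energy of harmonics, and the `alpha` and `margin` constants are assumptions.

```python
import numpy as np

def energy_threshold_vad(frame_energy, alpha=0.95, margin=2.0):
    """Toy VAD: label a frame as speech when its (modulation) energy
    exceeds an adaptive noise-floor estimate by a fixed margin.
    The noise floor is tracked with a slow exponential average that
    is updated only on frames judged to be non-speech."""
    noise_floor = frame_energy[0]
    decisions = np.zeros(len(frame_energy), dtype=bool)
    for i, e in enumerate(frame_energy):
        if e > margin * noise_floor:
            decisions[i] = True  # speech: freeze the noise floor
        else:
            noise_floor = alpha * noise_floor + (1 - alpha) * e
    return decisions
```

Freezing the noise-floor update during speech is what makes the threshold adaptive to slowly varying noise without being dragged up by speech energy.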


International Conference on Acoustics, Speech, and Signal Processing | 2012

Spectro-temporal subband Wiener filter for speech enhancement

Chung-Chien Hsu; Tse-En Lin; Jian-Hueng Chen; Tai-Shih Chi

In this paper, we propose a single-channel speech enhancement algorithm that applies the conventional Wiener filter in the spectro-temporal modulation domain. The multi-resolution spectro-temporal analysis and synthesis framework for Fourier spectrograms [12] is extended to an analysis-modification-synthesis (AMS) framework for speech enhancement. Compared with two conventional speech enhancement algorithms, a Wiener filter and an extended minimum mean-square error (MMSE) algorithm, the proposed method outperforms both in objective and subjective evaluations, by a large margin in white noise and a smaller margin in babble noise.
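The core Wiener-gain computation is domain-agnostic and can be sketched as below; this is the textbook per-coefficient gain, not the paper's full AMS framework, and the STFT-bin framing plus the `floor` value are assumptions (the paper applies the same idea to spectro-temporal modulation coefficients).

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Per-coefficient Wiener gain G = xi / (1 + xi), where the a
    priori SNR xi is estimated from the noisy and noise power.
    Works elementwise over any analysis domain's coefficients."""
    xi = np.maximum(noisy_power / np.maximum(noise_power, 1e-12) - 1.0, 0.0)
    gain = xi / (1.0 + xi)
    return np.maximum(gain, floor)  # floor avoids musical-noise holes
```

Multiplying the analysis coefficients by this gain and resynthesizing completes the analysis-modification-synthesis loop.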


International Conference on Acoustics, Speech, and Signal Processing | 2016

Discriminative deep recurrent neural networks for monaural speech separation

Guan-Xiang Wang; Chung-Chien Hsu; Jen-Tzung Chien

Deep neural networks are now a trend for solving a variety of problems in speech processing. In this paper, we propose a discriminative deep recurrent neural network (DRNN) model for monaural speech separation. Our idea is to construct the DRNN as a regression model that discovers the deep structure and regularity needed to reconstruct two source spectra from their mixture. To reinforce the discrimination capability between the two separated spectra, we estimate the DRNN separation parameters by minimizing an integrated objective function consisting of two measurements. One is the within-source reconstruction error for the individual source spectra; the other conveys discrimination information that preserves the mutual difference between the two source spectra during supervised training. This discrimination information acts as a kind of regularization that maintains between-source separation in monaural source separation. In the experiments, we demonstrate the effectiveness of the proposed method for speech separation compared with other methods.
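One common form of such an integrated objective, sketched below, subtracts a weighted between-source term from the within-source reconstruction error; the exact objective and the weight `lam` here are assumptions for illustration, not the paper's reported formulation.

```python
import numpy as np

def discriminative_loss(est1, est2, src1, src2, lam=0.05):
    """Within-source reconstruction error minus a weighted
    between-source term: each estimate is pushed toward its own
    source spectrum and away from the competing one (lam is a
    hypothetical regularization weight)."""
    within = np.sum((est1 - src1) ** 2) + np.sum((est2 - src2) ** 2)
    between = np.sum((est1 - src2) ** 2) + np.sum((est2 - src1) ** 2)
    return within - lam * between
```

Minimizing this with respect to the network outputs trades off faithful reconstruction (the `within` term) against separation between the two streams (the `between` term).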


International Conference on Acoustics, Speech, and Signal Processing | 2015

Modulation Wiener filter for improving speech intelligibility

Chung-Chien Hsu; Kah-Meng Cheong; Jen-Tzung Chien; Tai-Shih Chi

This paper presents a single-channel high-dimensional Wiener filter in the spectro-temporal modulation domain. Unlike conventional noise reduction techniques, the proposed algorithm not only reduces noise but also enhances the “textures” of the speech signal. A non-iterative decision-directed noise estimation method is adopted to estimate the modulation SNR for the modulation-domain Wiener filter. The efficacy of the proposed algorithm in enhancing speech intelligibility is assessed using the short-time objective intelligibility (STOI) measure. Statistical analysis demonstrates that the proposed algorithm improves STOI scores in speech-shaped noise (SSN) and white noise conditions, but not in the babble noise condition, while the conventional Wiener filter fails to improve STOI scores in all three noise conditions.
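For context, the classic decision-directed a priori SNR recursion is sketched below; the paper adopts a non-iterative variant of this idea, so the recursion, the smoothing constant `alpha`, and the STFT-style framing here are assumptions standing in for the modulation-domain version.

```python
import numpy as np

def decision_directed_snr(noisy_power, noise_power, alpha=0.98):
    """Textbook decision-directed a priori SNR tracking.
    noisy_power: (frames, bins) power of the noisy signal.
    noise_power: (bins,) noise power estimate."""
    n_frames, n_bins = noisy_power.shape
    xi = np.zeros_like(noisy_power)
    prev_clean_power = np.zeros(n_bins)
    for l in range(n_frames):
        gamma = noisy_power[l] / np.maximum(noise_power, 1e-12)
        xi[l] = (alpha * prev_clean_power / np.maximum(noise_power, 1e-12)
                 + (1 - alpha) * np.maximum(gamma - 1.0, 0.0))
        gain = xi[l] / (1.0 + xi[l])
        prev_clean_power = (gain ** 2) * noisy_power[l]
    return xi
```

The smoothing between the previous frame's clean estimate and the current instantaneous SNR is what suppresses musical noise relative to a purely frame-local estimate.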


Conference of the International Speech Communication Association | 2016

Discriminative layered nonnegative matrix factorization for speech separation

Chung-Chien Hsu; Tai-Shih Chi; Jen-Tzung Chien

This paper proposes a discriminative layered nonnegative matrix factorization (DL-NMF) for monaural speech separation. Standard NMF conducts a parts-based representation using a single layer of bases, which was recently upgraded to the layered NMF (L-NMF), where a tree of bases is estimated for multi-level or multi-aspect decomposition of a complex mixed signal. In this study, we develop DL-NMF by extending the generative bases in L-NMF to discriminative bases estimated according to a discriminative criterion: the recovery of the mixed spectra from the separated spectra is optimized while the reconstruction errors between the separated spectra and the original source spectra are minimized. Experiments on single-channel speech separation show the superiority of DL-NMF to NMF and L-NMF in terms of the SDR, SIR, and SAR measures.
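The generative building block that L-NMF stacks and DL-NMF trains discriminatively is ordinary single-layer NMF; a minimal sketch with the standard Lee-Seung multiplicative updates for squared error is given below (iteration count, seed, and the small stabilizing constants are assumptions).

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0):
    """Single-layer NMF via Lee-Seung multiplicative updates for the
    squared-error objective ||V - WH||^2, with nonnegativity
    preserved by construction."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 1e-3
    H = rng.random((rank, T)) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H
```

In separation, source-specific bases `W` are learned from training spectra, and a mixture is decomposed onto the concatenated bases to recover per-source spectra.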


International Conference on Acoustics, Speech, and Signal Processing | 2015

A hearing model to estimate Mandarin speech intelligibility for hearing-impaired patients

Pei-Chun Tsai; Shih-Ting Lin; Wen-Chung Lee; Chung-Chien Hsu; Tai-Shih Chi; Chia-Fone Lee

A hearing model, parameterized by the hearing thresholds, degrees of loudness recruitment, and reductions of frequency resolution of a hearing-impaired (HI) patient, is proposed in this paper. The model is developed in a filter-bank framework and is flexible enough to fit the hearing-loss conditions of HI patients. Psychoacoustic experiments were conducted under clean and noisy conditions to validate the model's capability in predicting Mandarin speech intelligibility for HI patients. Statistical analysis of the hearing-test results suggests that the proposed model can predict Mandarin speech intelligibility for HI patients to a certain degree.
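Two of the model's parameters, an elevated hearing threshold and loudness recruitment, can be illustrated per band as below. This is a deliberately simplified toy mapping, not the paper's fitted model: the function form, the `recruitment` exponent, and the uncomfortable-level anchor are all assumptions.

```python
import numpy as np

def impaired_band_level(input_db, threshold_db, recruitment=2.0,
                        normal_uncomfortable=100.0):
    """Toy per-band loudness mapping for a hearing-loss simulation:
    levels below the elevated threshold are inaudible, and levels
    above it grow abnormally fast (loudness recruitment), so the
    uncomfortable level is reached at the same point as for normal
    hearing despite the reduced dynamic range."""
    input_db = np.asarray(input_db, dtype=float)
    above = np.maximum(input_db - threshold_db, 0.0)
    span = max(normal_uncomfortable - threshold_db, 1e-6)
    perceived = (above / span) ** (1.0 / recruitment) * normal_uncomfortable
    return np.where(input_db <= threshold_db, 0.0, perceived)
```

Applying such a mapping independently in each band of a filter bank, optionally with broadened filters to model reduced frequency resolution, gives a rough hearing-loss simulator in the same spirit as the parameterization described above.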


International Conference on Acoustics, Speech, and Signal Processing | 2011

FFT-based spectro-temporal analysis and synthesis of sounds

Chung-Chien Hsu; Ting-Han Lin; Tai-Shih Chi

The concept of the two-dimensional spectro-temporal modulation filtering of the auditory model [1] is implemented for the FFT spectrogram. It analyzes the spectrogram in terms of the temporal dynamics and the spectral structures of the sound. The overlap-and-add (OLA) method, which is more convenient and reliable than the iterative-projection method proposed in [1], is used to invert the FFT spectrogram back to sounds. The non-negative sparse coding (NNSC) method is adopted to demonstrate the benefit of our analysis-synthesis procedures in a noise suppression application. Even without fine-tuning parameters, the proposed analysis-synthesis procedures offer benefits in de-noising, especially under low SNR conditions.
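The FFT-spectrogram analysis and OLA inversion can be sketched in a few lines; window type, frame length, and hop here are illustrative assumptions, and the normalization by accumulated window power makes the inversion exact away from the signal edges even after the spectrogram has been modified between the two calls.

```python
import numpy as np

def stft(x, win=256, hop=128):
    """FFT spectrogram: windowed frames -> complex spectra."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(X, win=256, hop=128):
    """Overlap-and-add resynthesis of a (possibly modified) FFT
    spectrogram, with window-power normalization."""
    w = np.hanning(win)
    frames = np.fft.irfft(X, n=win, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + win)
    norm = np.zeros_like(out)
    for i, f in enumerate(frames):
        out[i * hop:i * hop + win] += f * w
        norm[i * hop:i * hop + win] += w ** 2
    return out / np.maximum(norm, 1e-8)
```

Any spectro-temporal modulation filtering would be applied to the output of `stft` before calling `istft`.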


International Symposium on Chinese Spoken Language Processing | 2014

A non-uniformly distributed three-microphone array for speech enhancement in directional and diffuse noise field

Chung-Chien Hsu; Kah-Meng Cheong; Tai-Shih Chi

This paper proposes a non-uniformly distributed three-microphone array speech enhancement system that suppresses directional interferences and diffuse noise simultaneously. Each pair of microphones is designed to tackle one kind of noise. Unlike other hybrid systems, which combine noise suppression techniques derived in different domains, the proposed system integrates two noise suppression techniques derived in a unified power spectral density domain. In terms of objective LLR and PESQ measures, the proposed hybrid system is demonstrated to be more effective at eliminating directional and diffuse noise under various SNR conditions than each individual suppression technique.
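One toy building block for the directional-interference pair is delay-and-subtract null steering, sketched below under strong simplifying assumptions (integer-sample delay, free-field propagation); the paper's actual system couples such pairwise processing in a common power-spectral-density domain rather than operating on raw waveforms like this.

```python
import numpy as np

def cancel_directional(mic_near, mic_far, delay_samples):
    """Delay-and-subtract differential pair: delaying the far
    microphone by the interferer's inter-microphone travel time and
    subtracting places a spatial null in the interferer's direction."""
    delayed = np.concatenate(
        [np.zeros(delay_samples), mic_far[:len(mic_far) - delay_samples]])
    return mic_near - delayed
```

A signal arriving from any other direction does not satisfy the delay relation and therefore passes through attenuated but not nulled.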


International Symposium/Conference on Music Information Retrieval | 2014

Bayesian singing-voice separation

Po-Kai Yang; Chung-Chien Hsu; Jen-Tzung Chien


Conference of the International Speech Communication Association | 2015

Layered Nonnegative Matrix Factorization for Speech Separation

Chung-Chien Hsu; Jen-Tzung Chien; Tai-Shih Chi

Collaboration


Dive into Chung-Chien Hsu's collaborations.

Top co-authors (all at National Chiao Tung University):

Tai-Shih Chi
Jen-Tzung Chien
Kah-Meng Cheong
Tse-En Lin
Po-Kai Yang
Guan-Xiang Wang
Pei-Chun Tsai
Shih-Ting Lin
Ting-Han Lin