Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Ji-Won Cho is active.

Publication


Featured research published by Ji-Won Cho.


Signal Processing | 2016

Independent vector analysis followed by HMM-based feature enhancement for robust speech recognition

Ji-Won Cho; Hyung-Min Park

This paper presents a feature-enhancement method that uses the outputs of independent vector analysis (IVA) for robust speech recognition. Although frequency-domain (FD) independent component analysis (ICA) can be used successfully as a preprocessing step for speech recognition because of its capability for blind source separation (BSS), the performance of conventional ICA-based approaches is significantly degraded in underdetermined cases. Assuming the target speaker is located relatively close to the microphones, the blind spatial subtraction array (BSSA) (Takahashi et al. [10]) tries to enhance target speech features by subtracting noise spectra estimated by FD ICA, even in underdetermined cases. Unfortunately, the ICA may not estimate the target speech well and may therefore produce inaccurate noise spectrum estimates. To improve the robustness of speech recognition given inaccurate noise spectra, we introduce Bayesian inference to estimate clean speech features. For a further improvement, the FD ICA and delay-and-sum beamforming in the BSSA are replaced with IVA and its target speech output, because IVA can improve separation performance without the permutation problem. Experimental results show that the proposed method reduces the relative word error rate by 60.11% and 20.07% on average compared to the BSSA for the AURORA2 and DARPA Resource Management databases, respectively.

Highlights:
- A feature-enhancement method is proposed for robust speech recognition.
- The method uses the outputs of independent vector analysis to enhance speech features.
- The method enhances speech features based on Bayesian inference with an HMM prior.
- The method circumvents the underdetermined blind source separation problem.
- The method outperformed the compared methods, including the BSSA.
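
As background for the role IVA plays here, the sketch below shows a minimal natural-gradient IVA update in which the multivariate score function couples all frequency bins of a source, which is what lets IVA sidestep the frequency permutation problem of FD ICA. It is an illustrative stand-in, not the paper's implementation, and it omits the HMM-based feature-enhancement stage; all names and constants are assumptions.

```python
import numpy as np

def iva_natural_gradient(X, n_iter=100, lr=0.1):
    """Minimal natural-gradient IVA sketch (illustrative, not the paper's rule).

    X : complex STFT tensor of shape (n_freq, n_channels, n_frames).
    Returns separated sources Y with the same shape and the demixing filters W.
    """
    n_freq, n_ch, n_frames = X.shape
    W = np.stack([np.eye(n_ch, dtype=complex) for _ in range(n_freq)])  # (F, C, C)

    for _ in range(n_iter):
        # Demix every bin: Y[f] = W[f] @ X[f]
        Y = np.einsum('fcd,fdt->fct', W, X)
        # The score function couples all bins of one source, which is what
        # lets IVA avoid the frequency permutation problem of FD ICA.
        r = np.sqrt(np.sum(np.abs(Y) ** 2, axis=0)) + 1e-12       # (C, T)
        phi = Y / r[None, :, :]                                   # (F, C, T)
        for f in range(n_freq):
            cov = phi[f] @ Y[f].conj().T / n_frames               # E[phi y^H]
            W[f] += lr * (np.eye(n_ch) - cov) @ W[f]              # natural gradient
    Y = np.einsum('fcd,fdt->fct', W, X)
    return Y, W
```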


IEEE Signal Processing Letters | 2013

An Efficient HMM-Based Feature Enhancement Method With Filter Estimation for Reverberant Speech Recognition

Ji-Won Cho; Hyung-Min Park

This letter presents an efficient feature enhancement method for reverberant speech recognition that derives a minimum mean square error estimate of clean logarithmic mel-frequency power spectral coefficients (LMPSCs) based on a hidden-Markov-model (HMM) prior. Although an observation model of the reverberant LMPSCs can be simply formulated by coarse modeling of the room impulse response (RIR), the presented method estimates not only the clean LMPSCs but also the RIR to reflect detailed reverberation. The experimental results indicate that the described method can further reduce the relative word error rate (WER) by 18.09% on average compared to a method based on coarse RIR modeling.
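
The MMSE combination at the heart of this kind of prior-based feature enhancement can be illustrated with a simplified stand-in: a GMM prior over clean features and a fixed additive observation model replace the letter's HMM prior and RIR-dependent observation model, so this is a sketch of the general idea rather than the published method; all names are assumptions.

```python
import numpy as np

def mmse_estimate_gmm(y, means, covs, weights, obs_var):
    """Posterior-weighted MMSE estimate of a clean feature vector (sketch).

    Uses a diagonal GMM prior (means, covs, weights) instead of the letter's HMM
    prior and a fixed additive observation model y = x + n, n ~ N(0, obs_var).
    """
    K, D = means.shape
    # Likelihood of y under each component, marginalizing the clean feature:
    # y | k ~ N(mean_k, cov_k + obs_var)
    log_post = np.empty(K)
    for k in range(K):
        var = covs[k] + obs_var
        log_post[k] = (np.log(weights[k])
                       - 0.5 * np.sum(np.log(2 * np.pi * var))
                       - 0.5 * np.sum((y - means[k]) ** 2 / var))
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Per-component posterior mean of the clean feature (Gaussian conditioning),
    # then mixture-weighted MMSE combination.
    x_hat = np.zeros(D)
    for k in range(K):
        gain = covs[k] / (covs[k] + obs_var)           # Wiener-style gain per dim
        x_hat += post[k] * (means[k] + gain * (y - means[k]))
    return x_hat
```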


IEEE Signal Processing Letters | 2009

Imposition of Sparse Priors in Adaptive Time Delay Estimation for Speaker Localization in Reverberant Environments

Ji-Won Cho; Hyung-Min Park

In this letter, we describe a method to estimate the time delay for speaker localization in reverberant environments. Based on an adaptive eigenvalue decomposition (AED) algorithm, the method takes reverberation fully into account by directly estimating the channel impulse responses from the speaker to the sensors, but it may suffer from whitening effects for temporally correlated natural sounds. Imposing sparse priors on the responses can reduce the temporal whitening and provide a more accurate and robust time-delay estimate of the speaker location. Experiments demonstrate that the proposed method can efficiently estimate the time delay for speaker localization in reverberant environments.
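
A rough idea of the AED approach is sketched below: two channel impulse responses are adapted to satisfy the cross relation x1 * h2 = x2 * h1, and the delay is read off their main peaks, with a soft-threshold step standing in for the sparse prior. The update rule, step sizes, and threshold are illustrative assumptions, not the letter's exact algorithm.

```python
import numpy as np

def aed_tdoa(x1, x2, L=256, mu=0.5, lam=1e-4):
    """AED-style time delay estimation with a sparsity step (illustrative sketch).

    x1, x2 : microphone signals (1-D arrays).
    Returns the estimated delay in samples between the two channels.
    """
    h1 = np.zeros(L)
    h2 = np.zeros(L)
    h1[0] = 1.0                                   # unit-norm initialization
    for n in range(L, len(x1)):
        f1 = x1[n - L:n][::-1]                    # most recent L samples, newest first
        f2 = x2[n - L:n][::-1]
        e = f1 @ h2 - f2 @ h1                     # cross-relation error (ideally zero)
        step = mu / (f1 @ f1 + f2 @ f2 + 1e-12)   # NLMS-style normalized step
        h1 += step * e * f2                       # gradient descent on e^2 / 2
        h2 -= step * e * f1
        # Soft threshold as a stand-in for the sparse prior on the responses.
        h1 = np.sign(h1) * np.maximum(np.abs(h1) - lam, 0.0)
        h2 = np.sign(h2) * np.maximum(np.abs(h2) - lam, 0.0)
        norm = np.sqrt(h1 @ h1 + h2 @ h2) + 1e-12
        h1 /= norm                                # keep the joint filter unit norm
        h2 /= norm
    # Time delay = difference between the main peaks of the two impulse responses.
    return int(np.argmax(np.abs(h2)) - np.argmax(np.abs(h1)))
```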


IEEE Signal Processing Letters | 2016

A Subband-Based Stationary-Component Suppression Method Using Harmonics and Power Ratio for Reverberant Speech Recognition

Byung Joon Cho; Haeyong Kwon; Ji-Won Cho; Chanwoo Kim; Richard M. Stern; Hyung-Min Park

This letter describes a preprocessing method called subband-based stationary-component suppression method using harmonics and power ratio (SHARP) processing for reverberant speech recognition. SHARP processing extends a previous algorithm called Suppression of Slowly varying components and the Falling edge (SSF), which suppresses the steady-state portions of subband spectral envelopes. The SSF algorithm tends to over-subtract these envelopes in highly reverberant environments when there are high levels of power in previous analysis frames. The proposed SHARP method prevents excessive suppression both by boosting the floor value using the harmonics in voiced speech segments and by inhibiting the subtraction for unvoiced speech by detecting frames in which power is concentrated in high-frequency channels. These modifications enable the SHARP algorithm to improve recognition accuracy by further reducing the mismatch between power contours of clean and reverberated speech. Experimental results indicate that the SHARP method provides better recognition accuracy in highly reverberant environments compared to the SSF algorithm. It is also shown that the performance of the SHARP method can be further improved by combining it with feature-space maximum likelihood linear regression (fMLLR).
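
For orientation, the sketch below shows the SSF-style core that SHARP builds on: a first-order lowpass tracks the steady-state subband power, which is subtracted and floored. SHARP's harmonic floor boosting and its high-frequency power-ratio test for unvoiced frames are not reproduced here; the constants are assumptions.

```python
import numpy as np

def ssf_suppress(P, lam=0.9, floor=0.01):
    """Minimal SSF-style suppression of slowly varying subband power (a sketch).

    P : nonnegative subband power array of shape (n_frames, n_subbands).
    A first-order lowpass estimates the steady-state envelope of each subband;
    it is subtracted from the instantaneous power and the result is floored at
    a small fraction of the lowpassed power.
    """
    M = np.zeros_like(P)
    out = np.zeros_like(P)
    M[0] = P[0]
    out[0] = np.maximum(P[0] - M[0], floor * M[0])
    for m in range(1, P.shape[0]):
        M[m] = lam * M[m - 1] + (1.0 - lam) * P[m]      # steady-state estimate
        out[m] = np.maximum(P[m] - M[m], floor * M[m])  # suppress, then floor
    return out
```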


IEEE Signal Processing Letters | 2016

DNN-Based Feature Enhancement Using DOA-Constrained ICA for Robust Speech Recognition

Ho Yong Lee; Ji-Won Cho; Minook Kim; Hyung-Min Park

The performance of an automatic speech recognition (ASR) system is often degraded in adverse real-world environments. Recently, deep learning has emerged as a breakthrough for acoustic modeling in ASR; accordingly, deep neural network (DNN)-based speech feature enhancement (FE) approaches have attracted much attention owing to their powerful modeling capabilities. However, DNN-based approaches are unable to achieve remarkable performance improvements for severely distorted speech in test environments that differ from the training environments. In this letter, we propose a DNN-based FE method whose DNN inputs include preenhanced spectral features computed from multichannel input signals to reconstruct noise-robust features. The preenhanced spectral features are obtained by direction-of-arrival (DOA)-constrained independent component analysis (DCICA) followed by Bayesian FE using a hidden-Markov-model prior, exploiting efficient online target speech extraction and FE with prior information for robust ASR. In addition, noise spectral features computed from DCICA are included for further improvement. The DNN is therefore trained to reconstruct a clean spectral feature vector from a sequence of corrupted input feature vectors together with the corresponding preenhanced and noise feature vectors. Experimental results demonstrate that the proposed method significantly improves recognition performance, even in mismatched noise conditions.
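
A minimal sketch of how the DNN inputs might be assembled from the three feature streams (corrupted, preenhanced, and noise features) over a context window is given below; the padding scheme, context width, and function name are assumptions for illustration.

```python
import numpy as np

def build_dnn_inputs(corrupted, preenhanced, noise, context=5):
    """Assemble per-frame DNN input vectors from three feature streams (sketch).

    corrupted, preenhanced, noise : arrays of shape (n_frames, n_dims), e.g. the
    noisy log-mel features, the DCICA/Bayesian-FE preenhanced features, and the
    noise features.  Each target frame is paired with +/- `context` frames of
    all three streams; edge padding is used at the boundaries.
    """
    streams = np.concatenate([corrupted, preenhanced, noise], axis=1)
    n_frames, dim = streams.shape
    padded = np.pad(streams, ((context, context), (0, 0)), mode='edge')
    # One flattened context window per target frame: (n_frames, (2*context+1)*dim)
    return np.stack([padded[t:t + 2 * context + 1].reshape(-1)
                     for t in range(n_frames)])
```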


Computer Speech & Language | 2017

Bayesian feature enhancement using independent vector analysis and reverberation parameter re-estimation for noisy reverberant speech recognition

Ji-Won Cho; Jong-Hyeon Park; Joon-Hyuk Chang; Hyung-Min Park

Because speech recorded by distant microphones in real-world environments is contaminated by both additive noise and reverberation, automatic speech recognition (ASR) performance is seriously degraded by the mismatch between training and testing environments. In previous studies, some of the authors proposed a Bayesian feature enhancement (BFE) method with re-estimation of reverberation filter parameters for reverberant speech recognition, and a BFE method employing independent vector analysis (IVA) to deal with speech corrupted by additive noise. Although both achieve significant improvements in either reverberation- or noise-robust ASR, most real-world environments involve both additive noise and reverberation. For robust ASR in noisy reverberant environments, this paper presents a hidden-Markov-model (HMM)-based BFE method using IVA and reverberation parameter re-estimation (RPR) that effectively removes additive-noise and reverberant distortion components from speech acquired by multiple microphones by introducing Bayesian inference into the observation model of the input speech features. Experimental results show that the presented method further reduces word error rates (WERs) compared with BFE methods based on conventional noise and/or reverberation models and with combinations of the BFE methods for reverberation- or noise-robust ASR.
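
As a rough illustration of what an observation model for noisy reverberant mel-spectral features can look like, the sketch below treats reverberation as a convolution and noise as additive in the mel power domain; this generic form is an assumption and may differ from the paper's exact model.

```python
import numpy as np

def noisy_reverberant_obs(clean_lmpsc, rir_lmpsc, noise_lmpsc):
    """Illustrative log-mel-domain observation model (an assumed generic form).

    exp(y_t) ~= sum_tau exp(h_tau) * exp(x_{t-tau}) + exp(n_t),
    i.e. reverberation acts as a convolution along time and noise is additive
    in the mel power domain.  The paper's exact observation model may differ.
    """
    T, D = clean_lmpsc.shape
    Th = rir_lmpsc.shape[0]
    power = np.exp(noise_lmpsc).copy()                 # additive noise power
    for tau in range(Th):
        # Clean power delayed by tau frames, zero-padded at the start.
        shifted = np.vstack([np.zeros((tau, D)), np.exp(clean_lmpsc[:T - tau])])
        power += np.exp(rir_lmpsc[tau]) * shifted      # reverberant contribution
    return np.log(power)
```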


human-agent interaction | 2015

A Method for Speech Dereverberation Based on an Image Deblurring Algorithm Using the Prior of Speech Magnitude Gradient Distribution in the Time-Frequency Domain

Wonyong Jo; Ji-Won Cho; Changsoo Je; Hyung-Min Park

We propose a speech dereverberation method in the time-frequency domain based on an image deblurring algorithm. The reverberant speech magnitude can be modeled as the convolution of the clean speech magnitude with a reverberation filter in the time-frequency domain, so the dereverberation problem can be regarded as an image deblurring problem. The proposed method therefore estimates the clean speech magnitude in the time-frequency domain using a fast image deconvolution method with priors on the sparsity of the clean speech magnitude gradient and on the exponentially decaying property of reverberation filters along the time axis. Dereverberation is then performed by scaling the reverberant speech magnitude with a mask obtained from the estimated clean magnitude. Experimental results show that the described method can enhance speech.
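
The final mask-and-scale step can be sketched as below, assuming the clean magnitude estimated by the deconvolution step is given; the flooring and clipping values are illustrative assumptions.

```python
import numpy as np

def mask_and_rescale(stft_reverb, est_clean_mag, eps=1e-8, floor=0.1):
    """Mask-and-scale step of the dereverberation scheme (illustrative sketch).

    stft_reverb   : complex STFT of the reverberant speech, shape (n_freq, n_frames).
    est_clean_mag : clean magnitude estimated by the deconvolution step (assumed
                    given here; the sparse-gradient deconvolution is not reproduced).
    The mask is the ratio of estimated clean to reverberant magnitude, clipped to
    [floor, 1], and scales the reverberant STFT while keeping its phase.
    """
    reverb_mag = np.abs(stft_reverb)
    mask = np.clip(est_clean_mag / (reverb_mag + eps), floor, 1.0)
    return mask * stft_reverb
```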


Archive | 2017

ICASSP2017 Poster (Paper #4319)

Hyung-Min Park; Ho-Yong Lee; Ji-Won Cho; Minook Kim


IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2016

A Speech Enhancement Algorithm Based on Blind Signal Cancelation in Diffuse Noise Environments

Jaesik Hwang; Jaepil Seo; Ji-Won Cho; Hyung-Min Park


Collaboration


Dive into Ji-Won Cho's collaboration.

Top Co-Authors

Chanwoo Kim

Carnegie Mellon University
