Jae-Hun Choi | Researchain

Publication

Featured research published by Jae-Hun Choi.

Speech Communication | 2012

On using acoustic environment classification for statistical model-based speech enhancement

Jae-Hun Choi; Joon-Hyuk Chang

In this paper, we present a statistical model-based speech enhancement technique that uses acoustic environment classification supported by a Gaussian mixture model (GMM). In the data training stage, the principal parameters of the statistical model-based speech enhancement algorithm, namely the weighting parameter in the decision-directed (DD) method, the long-term smoothing parameter of the noise estimation, and the control parameter of the minimum gain value, are each set to an optimal operating point according to the given noise information to ensure the best performance for each noise. These optimal operating points, which are specific to the different background noises, are estimated from the composite measures, the objective quality measures that correlate most highly with the actual quality of speech processed by noise suppression algorithms. In the on-line environment-aware speech enhancement step, noise classification is performed on a frame-by-frame basis using the maximum likelihood (ML)-based GMM. The speech absence probability (SAP) is used to detect speech absence periods and to update the likelihood of the GMM. According to the noise class assigned to each frame, we assign the optimal values to the aforementioned three parameters for speech enhancement. We evaluated the performance of the proposed method using objective speech quality measures and subjective listening tests under various noise environments. Our experimental results showed that the proposed method outperforms a conventional algorithm with fixed parameters.
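The on-line step described above (pick the most likely noise class for a frame, then look up that class's parameters) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the diagonal-covariance GMM form, the class names, the feature vectors, and every numeric value in `ENHANCEMENT_PARAMS` are assumptions made for the example, and the SAP-gated likelihood update is omitted.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over the mixture components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))

def classify_noise(x, models):
    """Maximum-likelihood choice among per-class GMMs.

    models maps a class name to a (weights, means, variances) tuple.
    """
    return max(models, key=lambda name: gmm_log_likelihood(x, *models[name]))

# Hypothetical per-class operating points; the values are placeholders,
# not figures from the paper.
ENHANCEMENT_PARAMS = {
    "babble": {"dd_weight": 0.98, "noise_smoothing": 0.95, "min_gain": 0.10},
    "car": {"dd_weight": 0.96, "noise_smoothing": 0.90, "min_gain": 0.05},
}

def params_for_frame(x, models):
    """Return the enhancement parameters of the most likely noise class."""
    return ENHANCEMENT_PARAMS[classify_noise(x, models)]
```

In this sketch each frame is classified independently; a practical system would likely smooth the class decision over time.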

IEEE Transactions on Audio, Speech, and Language Processing | 2014

Dual-microphone voice activity detection technique based on two-step power level difference ratio

Jae-Hun Choi; Joon-Hyuk Chang

In this paper, we propose a novel dual-microphone voice activity detection (VAD) technique based on the two-step power level difference (PLD) ratio. This technique exploits the PLD between the primary microphone and the secondary microphone in a mobile device when the distance between the microphones and the sound source is relatively short. Based on the PLD, we propose the use of the PLD ratio (PLDR) instead of the original PLD to take advantage of the relative difference between the PLD of speech and the PLD of noise. The PLDR is obtained by estimating the ratio of the PLD between the input signals to the PLD between the two channel noises during periods without speech. The proposed technique is a two-step algorithm. In the first step, it uses two PLDRs: the long-term PLDR (LT-PLDR), which characterizes long-term evolution, and the short-term PLDR (ST-PLDR), which characterizes short-time variation. LT-PLDR-based and ST-PLDR-based VAD decisions are made using the maximum a posteriori (MAP) probability derived from the model-trust algorithm and are combined in the second step to reach a superior VAD decision for both long-term and short-term situations. Extensive experimental results show that the proposed dual-microphone VAD technique outperforms the conventional two-channel VAD method as well as most standardized VAD algorithms.
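The PLD and PLDR quantities defined above can be illustrated with a short sketch. It assumes time-domain frames given as lists of samples and a noise PLD that has already been estimated during speech-absent periods; the function names and the zero-division guard are illustrative choices, and the long-term/short-term split and the MAP decision stage are not modeled.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def pld(primary, secondary):
    """Power level difference between the primary and secondary microphones."""
    return frame_power(primary) - frame_power(secondary)

def pldr(primary, secondary, noise_pld, eps=1e-12):
    """Ratio of the input PLD to the PLD of the two channel noises.

    noise_pld is assumed to come from frames without speech. The eps
    guard (an addition for this sketch) avoids division by zero when
    both channels carry nearly identical noise power.
    """
    if abs(noise_pld) < eps:
        noise_pld = eps if noise_pld >= 0 else -eps
    return pld(primary, secondary) / noise_pld
```

A source close to the primary microphone produces a large input PLD, so the ratio rises well above the value seen when only noise is present.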

The Journal of the Acoustical Society of Korea | 2011

Improved Minimum Statistics Based on Environment-Awareness for Noise Power Estimation

Young-Ho Son; Jae-Hun Choi; Joon-Hyuk Chang

In this paper, we propose an improved noise power estimation method for speech enhancement under various noise environments. The conventional minimum statistics (MS) algorithm, which tracks the minimum value within a finite search window, uses an optimally smoothed power spectrum of the signal and adopts a minimum probability. An investigation of previous MS-based methods shows that they assume a fixed minimum search window size regardless of the environment. To vary the search window size with the environment, we use a noise classification algorithm based on the Gaussian mixture model (GMM). The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. Based on this evaluation, we show that the proposed algorithm yields better results than the conventional MS method.
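The idea of choosing the minimum search window from the classified noise type can be sketched as below. This is a simplified illustration under stated assumptions: the class names and window lengths in `WINDOW_BY_NOISE_CLASS` are hypothetical, the input is an already smoothed power sequence for a single frequency bin, and the bias compensation that a full MS estimator applies to the tracked minimum is left out.

```python
from collections import deque

# Hypothetical window lengths in frames per noise class (placeholders).
WINDOW_BY_NOISE_CLASS = {"stationary": 120, "nonstationary": 40}

def track_minimum(smoothed_power, window):
    """Running minimum of a smoothed power sequence over the last `window` frames."""
    buf = deque(maxlen=window)  # old frames fall out automatically
    minima = []
    for p in smoothed_power:
        buf.append(p)
        minima.append(min(buf))
    return minima

def noise_floor(smoothed_power, noise_class):
    """Track the minimum using the window assigned to the noise class."""
    return track_minimum(smoothed_power, WINDOW_BY_NOISE_CLASS[noise_class])
```

A shorter window lets the tracked minimum rise sooner after the noise level increases, at the cost of a noisier estimate; that trade-off is what a class-dependent window is meant to tune.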

International Conference on Acoustics, Speech, and Signal Processing | 2012

Adaptive noise power estimation using spectral difference for robust speech enhancement

Jae-Hun Choi; Sang-Kyun Kim; Joon-Hyuk Chang

In this paper, we propose a spectral difference approach for noise power estimation in speech enhancement. The noise power estimate is given by recursively averaging past spectral power values using a smoothing parameter based on the current observation. The smoothing parameter in time and frequency is adjusted by the spectral difference between consecutive frames that can efficiently characterize noise variation. Specifically, we propose an effective technique based on a sigmoid-type function in order to adaptively determine the smoothing parameter based on the spectral difference. Compared to a conventional method, the proposed noise estimate is computationally efficient and able to effectively follow noise changes under various noise conditions.
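The recursive averaging with a sigmoid-controlled smoothing parameter can be sketched as follows. Several details are assumptions made for illustration and are not stated in the abstract: the spectral difference is taken as the absolute log-ratio of consecutive frame powers in each bin, a larger difference maps to a larger smoothing parameter (slower update), and the `slope`, `center`, `alpha_min`, and `alpha_max` values are placeholders.

```python
import math

def smoothing_parameter(spectral_diff, slope=2.0, center=1.0,
                        alpha_min=0.5, alpha_max=0.98):
    """Sigmoid mapping from spectral difference to a value in (alpha_min, alpha_max)."""
    s = 1.0 / (1.0 + math.exp(-slope * (spectral_diff - center)))
    return alpha_min + (alpha_max - alpha_min) * s

def update_noise_power(noise_power, prev_power, curr_power, eps=1e-12):
    """One recursive-averaging step per frequency bin.

    noise_power: previous noise power estimate for each bin
    prev_power, curr_power: observed spectral power of the previous and
    current frames for each bin
    """
    updated = []
    for n, p_prev, p_curr in zip(noise_power, prev_power, curr_power):
        # Assumed spectral difference: absolute log-ratio between frames.
        diff = abs(math.log((p_curr + eps) / (p_prev + eps)))
        alpha = smoothing_parameter(diff)
        updated.append(alpha * n + (1.0 - alpha) * p_curr)
    return updated
```

Reversing the sign of `slope` flips the mapping so that large differences speed up the update instead; which direction suits a given system depends on how speech onsets are handled elsewhere.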

IEEE International Conference on Network Infrastructure and Digital Content | 2012

On using spectral gradient in conditional MAP criterion for robust voice activity detection

Jae-Hun Choi; Joon-Hyuk Chang

In this paper, we propose a novel approach to improve a statistical model-based voice activity detection (VAD) method based on a modified conditional maximum a posteriori (MAP) criterion incorporating the spectral gradient scheme. The proposed conditional MAP incorporates not only the voice activity decision in the previous frame as in Ref. [1] but also the spectral gradient of the observed spectra between the current frame and the past frames to efficiently exploit the inter-frame correlation of voice activity. As a result, the proposed VAD uses six separate thresholds, adaptively determined in the likelihood ratio test (LRT) depending on both the previous VAD result and the estimated spectral gradient parameter. Experimental results demonstrate that the proposed approach yields better results than the previous conditional MAP-based method.
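A threshold that depends on both the previous decision and the spectral gradient can be sketched as a lookup table. The partition used here (two previous decisions times three gradient states, giving six thresholds) is an assumption made for illustration and is not taken from the abstract; the state names, the tolerance, and all threshold values are hypothetical.

```python
# Hypothetical thresholds keyed by (previous decision, gradient state).
THRESHOLDS = {
    (0, "falling"): 1.2, (0, "flat"): 1.0, (0, "rising"): 0.8,
    (1, "falling"): 0.6, (1, "flat"): 0.4, (1, "rising"): 0.2,
}

def gradient_state(spectral_gradient, tol=0.1):
    """Quantize the spectral gradient into three illustrative states."""
    if spectral_gradient < -tol:
        return "falling"
    if spectral_gradient > tol:
        return "rising"
    return "flat"

def vad_decision(log_likelihood_ratio, prev_decision, spectral_gradient):
    """Likelihood ratio test with a context-dependent threshold.

    Returns 1 for speech and 0 for non-speech.
    """
    threshold = THRESHOLDS[(prev_decision, gradient_state(spectral_gradient))]
    return 1 if log_likelihood_ratio > threshold else 0
```

With these placeholder values, the same likelihood ratio is more readily accepted as speech when the previous frame was speech, which is one way to exploit inter-frame correlation.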

Journal of the Institute of Electronics Engineers of Korea | 2010