Seung Ho Choi

Seoul National University of Science and Technology


Publication

Featured research published by Seung Ho Choi.

Advances in Multimedia | 2013

Superwideband bandwidth extension using normalized MDCT coefficients for scalable speech and audio coding

Young Han Lee; Seung Ho Choi

A bandwidth extension (BWE) algorithm from wideband to superwideband (SWB) is proposed for a scalable speech/audio codec that uses modified discrete cosine transform (MDCT) coefficients as spectral parameters. In the proposed BWE algorithm, the superwideband is first split into several subbands, each represented by a gain parameter and normalized MDCT coefficients. We then estimate the normalized MDCT coefficients of the wideband to be fetched for the superwideband and quantize the fetch indices. After that, we quantize the gain parameters by using relative ratios between adjacent subbands. The proposed BWE algorithm is embedded into a standard superwideband codec, the SWB extension of G.729.1 Annex E, and its bitrate and quality are compared with those of the BWE algorithm already employed in that codec. The comparison shows that the proposed BWE algorithm achieves a relative bitrate reduction of around 19% with better quality, compared to the BWE algorithm in the SWB extension of G.729.1 Annex E.
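
As a rough illustration of the gain/shape representation described in this abstract, the following sketch splits a coefficient vector into subbands, separates each subband into a gain and normalized coefficients, and expresses gains as ratios between adjacent subbands. It is a minimal sketch under stated assumptions, not the codec's actual procedure: the RMS gain definition, the equal-width subbands, the function names, and the toy coefficient values are all hypothetical, and no quantization is performed.

```python
import math

def split_subbands(coeffs, num_subbands):
    """Split a coefficient list into equal-width subbands (assumed layout)."""
    width = len(coeffs) // num_subbands
    return [coeffs[i * width:(i + 1) * width] for i in range(num_subbands)]

def gain_and_shape(subband):
    """Return the RMS gain (an assumption) and the gain-normalized coefficients."""
    gain = math.sqrt(sum(c * c for c in subband) / len(subband))
    if gain == 0.0:
        return 0.0, [0.0] * len(subband)
    return gain, [c / gain for c in subband]

def adjacent_gain_ratios(gains):
    """Express each gain relative to the previous subband's gain.

    Assumes every preceding gain is nonzero.
    """
    return [gains[i] / gains[i - 1] for i in range(1, len(gains))]

# Toy "MDCT coefficients" (hypothetical values, not taken from the paper).
coeffs = [3.0, -3.0, 3.0, -3.0, 1.5, 1.5, -1.5, -1.5]
bands = split_subbands(coeffs, 2)
gains, shapes = zip(*(gain_and_shape(b) for b in bands))
ratios = adjacent_gain_ratios(list(gains))
```

With this representation, the normalized coefficients carry only the spectral shape of each subband, while the ratios describe how the energy changes from one subband to the next.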

IEEE Transactions on Consumer Electronics | 2011

Sound source elevation using spectral notch filtering and directional band boosting in stereo loudspeaker reproduction

Chan Jun Chun; Hong Kook Kim; Seung Ho Choi; Sei-Jin Jang; Seok-Pil Lee

In this paper, a virtual elevation method for a sound source is proposed for stereo loudspeaker reproduction. To achieve this goal, the proposed method first applies a head-related transfer function (HRTF) to the sound source to vertically localize it. Then, spectral notch filtering followed by directional band boosting is used for the further elevation of the sound source. In particular, a filter having three notches is designed by analyzing the spectral characteristics estimated from the measured HRTF database and by investigating the effect of spectral boosting on the perception of sound elevation. To evaluate the elevation performance of the proposed method, subjective listening tests are subsequently conducted using several sound sources, including male and female speech, as well as guitar, violin, and pop music. The tests show that the sound sources processed by the proposed method are most likely to be perceived as if they are played from stereo loudspeakers located at around 20°, even though the stereo loudspeakers are actually located on a horizontal plane, i.e., 0°.

International Conference on Future Generation Communication and Networking | 2010

A Packet Loss Concealment Algorithm Robust to Burst Packet Loss Using Multiple Codebooks and Comfort Noise for CELP-Type Speech Coders

Nam In Park; Hong Kook Kim; Min A Jung; Seong Ro Lee; Seung Ho Choi

In this paper, a packet loss concealment (PLC) algorithm for CELP-type speech coders is proposed to improve the quality of decoded speech under burst packet loss. A conventional PLC algorithm is usually based on speech correlation: it reconstructs the decoded speech of lost frames by using the parameters obtained from previous frames that are assumed to be correctly received. However, this approach is apt to fail to reconstruct voice onset signals, since the parameters of the previous frames, such as pitch, LPC coefficients, and adaptive/fixed codebooks, are mostly related to silence frames. Thus, in order to reconstruct speech signals in the voice onset intervals, we propose a multiple-codebook-based approach that includes a traditional adaptive codebook and a new random codebook composed of comfort noise. The proposed PLC algorithm is designed as a PLC algorithm for G.729, and its performance is then compared with that of the PLC algorithm employed in G.729 by means of perceptual evaluation of speech quality (PESQ), a waveform comparison, and an A-B preference test under different random and burst packet loss conditions. The experiments show that the proposed PLC algorithm provides significantly better speech quality than the PLC of G.729, especially under burst packet loss and voice onset conditions.

Ubiquitous Computing | 2011

Detection of Howling Frequency Using Temporal Variations in Power Spectrum

Jae-Won Lee; Seung Ho Choi

Indoor audio feedback is a common problem in many audio amplification systems. The audio feedback can go out of control in some indoor conditions, which results in howling. Also, the howling frequency varies as the indoor environment changes. Most conventional methods attempt to eliminate the howling, but they do not provide a way to predict the occurrence of howling. This paper presents a novel method to predict the howling frequency using temporal variations in the power spectrum.

International Conference on Future Generation Communication and Networking | 2011

Improvements in Howling Margin Using Phase Dispersion

Jae-Won Lee; Seung Ho Choi

The gain of an audio system is limited mainly by the howling generated by an acoustic feedback loop. Furthermore, the howling occurs differently depending on the acoustic environment. In this paper, we propose a phase dispersion method to improve the howling margin in audio amplifier systems. In order to eliminate unexpected potential howling frequencies, the proposed method controls the phase around the howling frequency using an all-pass filter with phase dispersion. The experiments show that the howling margin is improved by around 3 dB.
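
The property that makes an all-pass filter suitable for phase control can be checked numerically: it changes the phase of a signal across frequency while leaving the magnitude untouched. The sketch below evaluates a generic first-order all-pass section, H(z) = (a + z^-1) / (1 + a z^-1), which is only an illustrative stand-in. The abstract does not specify the filter order or design, and the coefficient `a = 0.5` and the function name are hypothetical.

```python
import cmath
import math

def allpass_response(a: float, omega: float) -> complex:
    """Frequency response of H(z) = (a + z^-1) / (1 + a z^-1) at omega rad/sample."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1 + a * z_inv)

a = 0.5  # hypothetical coefficient; |a| < 1 keeps the filter stable
freqs = [k * math.pi / 8 for k in range(1, 8)]
responses = [allpass_response(a, w) for w in freqs]

# The magnitude stays at unity for every frequency...
magnitudes = [abs(h) for h in responses]
# ...while the phase shift depends on frequency.
phases = [cmath.phase(h) for h in responses]
```

Because the magnitude response is flat, such a filter can shift the phase of the feedback path without altering the loop gain at any frequency.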

International Conference on Computer Science, Environment, Ecoinformatics, and Education | 2011

A Cepstral PDF Normalization Method for Noise Robust Speech Recognition

Yong Ho Suk; Seung Ho Choi

In this paper, we propose a novel cepstrum normalization method based on the scoring procedure of order statistics for speech recognition in additive noise environments. The conventional methods normalize the mean and/or variance of the cepstrum, which results in an incomplete normalization of the probability density function (PDF). The proposed method fully normalizes the PDF of the cepstrum, providing an identical PDF for the clean and noisy cepstra. For the target PDF, the generalized Gaussian distribution is selected to accommodate various densities. In the recognition phase, a table lookup method is devised in order to save computational cost. Speaker-independent isolated-word recognition experiments show that the proposed method gives improved performance compared with the conventional methods, especially in heavy noise environments.
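
The idea of normalizing a full distribution through order statistics can be sketched as a rank-to-quantile mapping: each value is replaced by the target distribution's quantile at that value's rank. This is a minimal sketch under stated assumptions rather than the paper's method. It uses a standard Gaussian target from Python's `statistics.NormalDist` in place of the generalized Gaussian, omits the table lookup, models the distortion as a simple monotonic map instead of additive noise, and the function name and sample values are hypothetical.

```python
from statistics import NormalDist

def rank_normalize(values, target=NormalDist(0.0, 1.0)):
    """Map each value to the target quantile of its rank (order statistic)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank empirical CDF keeps the probability strictly inside (0, 1).
        p = (rank + 0.5) / n
        out[idx] = target.inv_cdf(p)
    return out

# Toy one-dimensional "cepstral" trajectory (hypothetical values).
clean = [0.2, -1.0, 0.7, 1.5, -0.3]
# A monotonically distorted copy: scale and offset change, ranks do not.
noisy = [2.0 * x + 3.0 for x in clean]

clean_norm = rank_normalize(clean)
noisy_norm = rank_normalize(noisy)
```

Since the mapping depends only on ranks, the two normalized sequences coincide here, whereas mean and variance normalization alone would match only the first two moments.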

International Conference on Informatics Engineering and Information Science | 2011

A Study on Speech Coders for Automatic Speech Recognition in Adverse Communication Environments

Seung Ho Choi

In this research work, we present the effects of several standard speech coders on automatic speech recognition in adverse communication environments such as tandem, frame erasure, and noisy conditions. The adverse conditions were chosen to simulate the operations of mobile communication environments. The comparative results can provide a guideline for selecting a speech coder when a speech recognition service is needed in digital communication networks.

International Conference on Ubiquitous and Future Networks | 2016

Joint hierarchical modulation and network coding for asymmetric data rate transmission over multiple-access relay channel

Dongho You; Ye Hoon Lee; Seung Ho Choi; Dong Ho Kim; Sunghyun Cho

In this paper, we consider a time-division multiple-access relay channel (MARC), in which two source nodes (SNs) transmit data at different data rates to a destination node (DN) with the help of a relay node (RN). To provide asymmetric data rates, we consider joint hierarchical 16QAM (HM 16QAM) and network coding. Simulation results show that the joint HM 16QAM and network coding has similar throughput to the conventional network coding with zero padding when the RN is closer to the DN, whereas the joint HM 16QAM and network coding outperforms the network coding with zero padding when the RN is closer to the SNs.
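
For context, the conventional baseline named in this abstract, network coding with zero padding, can be sketched as a bytewise XOR of two packets after the shorter one is padded with zeros. This illustrates only the baseline idea at the packet level, not the proposed joint HM 16QAM scheme. The function names and packet contents are hypothetical, and the sketch assumes the receiver already knows the original packet lengths.

```python
def xor_with_zero_padding(a: bytes, b: bytes) -> bytes:
    """XOR two packets after zero-padding the shorter one to equal length."""
    n = max(len(a), len(b))
    a_pad = a.ljust(n, b"\x00")
    b_pad = b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a_pad, b_pad))

def recover(coded: bytes, known: bytes, length: int) -> bytes:
    """Recover the other packet from the coded packet and one known packet."""
    return xor_with_zero_padding(coded, known)[:length]

long_pkt = b"HIGH-RATE-DATA"  # packet from the higher-rate source (toy data)
short_pkt = b"LOW"            # packet from the lower-rate source (toy data)

# The relay forwards a single coded packet as long as the longer input.
coded = xor_with_zero_padding(long_pkt, short_pkt)

# A receiver that already holds one packet can recover the other.
recovered_short = recover(coded, long_pkt, len(short_pkt))
recovered_long = recover(coded, short_pkt, len(long_pkt))
```

The padded positions carry no information from the lower-rate source, which is the inefficiency that motivates handling asymmetric rates differently.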

Phonetics and Speech Sciences | 2014

Syllable-Type-Based Phoneme Weighting Techniques for Listening Intelligibility in Noisy Environments

Young Ho Lee; Jong Han Joo; Seung Ho Choi

The intelligibility of speech transmitted to listeners can be significantly degraded by ambient noise in noisy environments such as auditoriums and train stations. Noise-masked speech is hard for listeners to recognize. Among the conventional methods for improving speech intelligibility, the consonant-vowel intensity ratio (CVR) approach reinforces the power of all consonants. However, excessively reinforced consonants do not help recognition. Furthermore, only some consonants are improved by the CVR approach. In this paper, we propose the corrective weighting (CW) approach, which reinforces the power of consonants differently according to the syllable type in Korean, such as consonant-vowel-consonant (CVC), consonant-vowel (CV), and vowel-consonant (VC), considering the level of listeners' recognition. The proposed CW approach was evaluated with a subjective test, the Comparison Category Rating (CCR) test of ITU-T P.800, and showed better performance: scores 0.18 and 0.24 higher than the unprocessed speech and the CVR approach, respectively. Keywords: noisy environment, speech intelligibility, syllable types, consonant-vowel intensity ratio (CVR), corrective weighting (CW)

International Journal of Distributed Sensor Networks | 2013

Ultrasonic Sensor-Based Personalized Multichannel Audio Rendering for Multiview Broadcasting Services

Yong Guk Kim; Sang-Taeck Moon; Seung Ho Choi; Hong Kook Kim

An ultrasonic sensor-based personalized multichannel audio rendering method is proposed for multiview broadcasting services. Multiview broadcasting, a representative next-generation broadcasting technique, renders video image sequences captured by several stereoscopic cameras from different viewpoints. To achieve realistic multiview broadcasting, multichannel audio that is synchronized with a user's viewpoint should be rendered in real time. For this reason, both a real-time person-tracking technique for estimating the user's position and a multichannel audio rendering technique for virtual sound localization are necessary in order to provide realistic audio. Therefore, the proposed method is composed of two parts: a person-tracking method using ultrasonic sensors and a multichannel audio rendering method using MPEG Surround parameters. In order to evaluate the perceptual quality and localization performance of the proposed method, a MUSHRA listening test is conducted, and the directivity patterns are investigated. These experiments show that the proposed method provides better perceptual quality and localization performance than a conventional multichannel audio rendering method that also uses MPEG Surround parameters.
