Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Chul Min Lee is active.

Publication


Featured research published by Chul Min Lee.


IEEE Transactions on Speech and Audio Processing | 2005

Toward detecting emotions in spoken dialogs

Chul Min Lee; Shrikanth Narayanan

The importance of automatically recognizing emotions from human speech has grown with the increasing role of spoken language interfaces in human-computer interaction applications. This paper explores the detection of domain-specific emotions using language and discourse information in conjunction with acoustic correlates of emotion in speech signals. The specific focus is on a case study of detecting negative and non-negative emotions using spoken language data obtained from a call center application. Most previous studies in emotion recognition have used only the acoustic information contained in speech. In this paper, a combination of three sources of information (acoustic, lexical, and discourse) is used for emotion recognition. To capture emotion information at the language level, an information-theoretic notion of emotional salience is introduced. Optimization of the acoustic correlates of emotion with respect to classification error was accomplished by investigating different feature sets obtained from feature selection, followed by principal component analysis. Experimental results on our call center data show that the best results are obtained when acoustic and language information are combined. Results show that combining all the information, rather than using only acoustic information, improves emotion classification by 40.7% for males and 36.4% for females (a linear discriminant classifier was used for the acoustic information).
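
The emotional salience mentioned above lends itself to a compact sketch. The snippet below is a minimal, hedged illustration of the idea, scoring a word by how far the emotion-class posterior it induces departs from the class priors; the counts, class names, and unsmoothed estimates are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of the information-theoretic "emotional salience" idea: a word
# is salient if observing it shifts the emotion-class posterior away from the
# class priors. Counts, class names, and unsmoothed estimates are assumptions.
from math import log2

def emotional_salience(word_counts_per_class, class_priors):
    """word_counts_per_class: {emotion: count of the word in that class's utterances}."""
    total = sum(word_counts_per_class.values())
    if total == 0:
        return 0.0
    salience = 0.0
    for emotion, prior in class_priors.items():
        posterior = word_counts_per_class.get(emotion, 0) / total
        if posterior > 0:
            # contribution of this class: posterior-weighted self mutual information
            salience += posterior * log2(posterior / prior)
    return salience

# Toy usage: a word seen 9 times in negative calls and once in non-negative calls.
priors = {"negative": 0.5, "non-negative": 0.5}
print(emotional_salience({"negative": 9, "non-negative": 1}, priors))
```

Words with high salience would then carry more weight when the lexical stream is combined with the acoustic classifier.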


International Conference on Multimodal Interfaces | 2004

Analysis of emotion recognition using facial expressions, speech and multimodal information

Carlos Busso; Zhigang Deng; Serdar Yildirim; Murtaza Bulut; Chul Min Lee; Abe Kazemzadeh; Sungbok Lee; Ulrich Neumann; Shrikanth Narayanan

The interaction between human beings and computers will be more natural if computers are able to perceive and respond to human non-verbal communication such as emotions. Although several approaches have been proposed to recognize human emotions based on facial expressions or speech, relatively limited work has been done to fuse these two, and other, modalities to improve the accuracy and robustness of the emotion recognition system. This paper analyzes the strengths and the limitations of systems based only on facial expressions or acoustic information. It also discusses two approaches used to fuse these two modalities: decision-level and feature-level integration. Using a database recorded from an actress, four emotions were classified: sadness, anger, happiness, and neutral state. Detailed facial motions were captured with a marker-based motion capture system, in conjunction with simultaneous speech recordings. The results reveal that the system based on facial expressions gave better performance than the system based on acoustic information alone for the emotions considered. Results also show the complementarity of the two modalities and that, when the two modalities are fused, the performance and the robustness of the emotion recognition system improve measurably.
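
As a rough illustration of the two fusion strategies contrasted above, the sketch below compares feature-level fusion (concatenating facial and acoustic feature vectors before one classifier) with decision-level fusion (combining per-modality class posteriors). The synthetic features, the logistic-regression classifier, and the product combination rule are assumptions for illustration, not the paper's configuration.

```python
# Sketch contrasting feature-level fusion (one classifier on concatenated facial
# and acoustic features) with decision-level fusion (combining per-modality
# posteriors). Synthetic features and logistic regression are stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400
labels = rng.integers(0, 4, size=n)                     # sadness, anger, happiness, neutral
face_feats = rng.normal(size=(n, 10)) + 0.5 * labels[:, None]
speech_feats = rng.normal(size=(n, 6)) + 0.3 * labels[:, None]

# Feature-level fusion: a single classifier on the concatenated feature vector.
feature_level = LogisticRegression(max_iter=1000).fit(
    np.hstack([face_feats, speech_feats]), labels)

# Decision-level fusion: independent classifiers, posteriors combined (product rule).
face_clf = LogisticRegression(max_iter=1000).fit(face_feats, labels)
speech_clf = LogisticRegression(max_iter=1000).fit(speech_feats, labels)
fused = face_clf.predict_proba(face_feats) * speech_clf.predict_proba(speech_feats)
decision_level_pred = fused.argmax(axis=1)
```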


International Conference on Multimedia and Expo | 2002

Classifying emotions in human-machine spoken dialogs

Chul Min Lee; Shrikanth Narayanan; Roberto Pieraccini

This paper reports on a comparison between various acoustic feature sets and classification algorithms for classifying spoken utterances based on the emotional state of the speaker. The data set used for the analysis comes from a corpus of human-machine dialogs obtained from a commercial application. Emotion recognition is posed as a pattern recognition problem. We used three different techniques, a linear discriminant classifier (LDC), a k-nearest neighbor (k-NN) classifier, and a support vector machine classifier (SVC), for classifying utterances into two emotion classes: negative and non-negative. Two feature sets were used in this study: the base feature set, obtained from utterance-level statistics of the pitch and energy of the speech, and a feature set derived from it by principal component analysis (PCA). The PCA features showed performance comparable to the base feature set. Overall, the LDC achieved the best performance, with error rates of 27.54% on female data and 25.46% on male data using the base feature set. The SVC, however, showed better performance under data sparsity.
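
A minimal sketch of this kind of comparison is shown below, using scikit-learn stand-ins for the three classifiers and a PCA preprocessing step; the random "base features" merely stand in for the utterance-level pitch and energy statistics and do not reproduce the reported error rates.

```python
# Comparison of LDC, k-NN, and SVC on a stand-in "base feature set" with and
# without PCA. Data are synthetic; numbers do not reflect the paper's results.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 300
y = rng.integers(0, 2, size=n)                          # negative vs. non-negative
X = rng.normal(size=(n, 12)) + 0.4 * y[:, None]         # pitch/energy statistics (stand-in)

classifiers = {
    "LDC": LinearDiscriminantAnalysis(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVC": SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    base = cross_val_score(clf, X, y, cv=5).mean()
    pca = cross_val_score(make_pipeline(PCA(n_components=6), clf), X, y, cv=5).mean()
    print(f"{name}: base accuracy={base:.3f}, PCA accuracy={pca:.3f}")
```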


American Journal of Obstetrics and Gynecology | 1994

An abnormal umbilical artery waveform: a strong and independent predictor of adverse perinatal outcome in patients with preeclampsia.

Bo Hyun Yoon; Chul Min Lee; Syng Wook Kim

OBJECTIVE An abnormal umbilical artery Doppler waveform is a risk factor for adverse perinatal outcome. However, it has not been established whether this is related to the earlier gestational age at delivery of fetuses with abnormal Doppler findings or whether Doppler findings are an independent predictor of perinatal outcome. Our purpose was to determine whether an abnormal Doppler umbilical artery waveform is associated with adverse perinatal outcome even after the gestational age at delivery is controlled for as a confounding variable in patients with preeclampsia. STUDY DESIGN Umbilical artery velocimetry studies were performed within 7 days of delivery in 72 consecutive patients admitted to our unit with preeclampsia. Adverse perinatal outcome was defined as fetal distress requiring cesarean delivery, Apgar score < 7 at 5 minutes, significant neonatal morbidity, or perinatal death. Significant neonatal morbidity was defined as neonatal sepsis, intraventricular hemorrhage (grade > or = 2), respiratory distress syndrome, pneumonia, bronchopulmonary dysplasia, acute renal failure, or necrotizing enterocolitis. Stepwise multiple logistic regression and receiver operating characteristic curve analysis were used. RESULTS Patients with abnormal umbilical artery velocimetry had a significantly higher rate of complications, including cesarean section for fetal distress, preterm delivery, low Apgar scores, significant neonatal morbidity, and perinatal death, than did patients with a normal waveform. Receiver operating characteristic curve and stepwise logistic regression analyses indicated that an abnormal umbilical artery waveform was a significant independent predictor of adverse perinatal outcome (odds ratio 14.2, p < 0.005) after adjustment for other confounding variables. CONCLUSION An abnormal Doppler umbilical artery waveform is a strong and independent predictor of adverse perinatal outcome in patients with preeclampsia.
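
For readers unfamiliar with the statistical machinery, the sketch below illustrates the general shape of such an analysis (a logistic regression yielding an adjusted odds ratio for the Doppler finding, plus an ROC summary) on simulated data; the variables, effect sizes, and regularized estimator are assumptions and do not reproduce the study's data or numbers.

```python
# Simulated illustration of an adjusted-odds-ratio / ROC analysis; all values
# below are made up and unrelated to the study's results.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n = 72
abnormal_doppler = rng.integers(0, 2, size=n)           # 1 = abnormal waveform
gestational_age = rng.normal(34, 3, size=n)             # confounder to adjust for
logit = -2.0 + 2.5 * abnormal_doppler - 0.1 * (gestational_age - 34)
adverse_outcome = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([abnormal_doppler, gestational_age])
model = LogisticRegression(max_iter=1000).fit(X, adverse_outcome)
adjusted_or = np.exp(model.coef_[0][0])                 # odds ratio for the Doppler finding
auc = roc_auc_score(adverse_outcome, model.predict_proba(X)[:, 1])
print(f"adjusted odds ratio ~ {adjusted_or:.1f}, ROC AUC ~ {auc:.2f}")
```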


Journal of the Acoustical Society of America | 2004

Effects of emotion on different phoneme classes

Chul Min Lee; Serdar Yildirim; Murtaza Bulut; Carlos Busso; Abe Kazemzadeh; Sungbok Lee; Shrikanth Narayanan

This study investigates the effects of emotion on different phoneme classes using short-term spectral features. Research on emotion in speech has mostly focused on prosodic features. In this study, based on the hypothesis that different emotions have varying effects on the properties of different speech sounds, we investigate the usefulness of phoneme-class level acoustic modeling for automatic emotion classification. Hidden Markov models (HMMs) based on short-term spectral features for five broad phonetic classes are used for this purpose, with data obtained from recordings of two actresses. Each speaker produces 211 sentences with four different emotions (neutral, sad, angry, happy). Using this speech material we trained and compared the performances of two sets of HMM classifiers: a generic set of "emotional speech" HMMs (one for each emotion) and a set of broad phonetic-class based HMMs (vowel, glide, nasal, stop, fricative) for each emotion type considered. Comparison of ...
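
A minimal sketch of the first of the two classifier sets, the generic per-emotion HMMs, might look like the following, assuming the hmmlearn package as a dependency and random vectors as stand-ins for the short-term spectral features.

```python
# Sketch of a per-emotion HMM classifier set: one model per emotion trained on
# spectral-feature frames, with a test utterance assigned to the highest-scoring
# model. hmmlearn is an assumed dependency; features are random stand-ins.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(3)
emotions = ["neutral", "sad", "angry", "happy"]

models = {}
for i, emotion in enumerate(emotions):
    # Stand-in training data: 20 "utterances" of 50 frames of 13-dim features.
    frames = rng.normal(loc=i, size=(20 * 50, 13))
    models[emotion] = GaussianHMM(n_components=3, covariance_type="diag",
                                  n_iter=20).fit(frames, lengths=[50] * 20)

test_utterance = rng.normal(loc=2, size=(60, 13))       # drawn near the "angry" cluster
scores = {e: m.score(test_utterance) for e, m in models.items()}
print(max(scores, key=scores.get))
```

The phonetic-class variant would train one such model per (emotion, phonetic class) pair and combine the class-wise scores.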


IEEE Signal Processing Letters | 2014

Stereophonic Acoustic Echo Suppression Incorporating Spectro-Temporal Correlations

Chul Min Lee; Jong Won Shin; Nam Soo Kim

In this letter, we propose an enhanced stereophonic acoustic echo suppression (SAES) algorithm incorporating spectral and temporal correlations in the short-time Fourier transform (STFT) domain. Unlike traditional stereophonic acoustic echo cancellation, SAES estimates the echo spectra in the STFT domain and uses a Wiener filter to suppress echo without performing any explicit double-talk detection. The proposed approach takes account of interdependencies among components in adjacent time frames and frequency bins, which enables more accurate estimation of the echo signals. Experimental results show that the proposed method yields improved performance compared to that of conventional SAES.
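
To make the basic mechanism concrete, the following is a hedged, single-channel sketch of STFT-domain echo suppression with a Wiener-style gain. The letter's actual contribution, stereo inputs and spectro-temporal correlations across adjacent frames and bins, is not implemented here, and the toy echo path and least-squares echo-path estimate are assumptions.

```python
# Single-channel sketch of STFT-domain echo suppression with a Wiener-style
# gain; the stereo and spectro-temporal extensions of the letter are omitted.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
rng = np.random.default_rng(4)
far_end = rng.normal(size=fs)                           # loudspeaker (reference) signal
near_speech = 0.3 * rng.normal(size=fs)                 # local talker
echo = 0.6 * np.roll(far_end, 80)                       # toy echo path: delay + attenuation
mic = near_speech + echo

f, t, Y = stft(mic, fs=fs, nperseg=512)
_, _, X = stft(far_end, fs=fs, nperseg=512)

# Per-bin echo-path estimate via least squares over time, then a Wiener-style gain.
H = np.sum(Y * np.conj(X), axis=1) / (np.sum(np.abs(X) ** 2, axis=1) + 1e-12)
E = H[:, None] * X                                      # estimated echo spectra
gain = np.maximum(np.abs(Y) ** 2 - np.abs(E) ** 2, 1e-12) / (np.abs(Y) ** 2 + 1e-12)
_, enhanced = istft(gain * Y, fs=fs, nperseg=512)
```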


International Conference on Acoustics, Speech, and Signal Processing | 2014

Parametric multichannel noise reduction algorithm utilizing temporal correlations in reverberant environment

Yu Gwang Jin; Jong Won Shin; Chul Min Lee; Soo Hyun Bae; Nam Soo Kim

In this paper, we propose a parametric multichannel noise reduction algorithm utilizing temporal correlations in a noisy and reverberant environment. Under reverberant conditions, the received acoustic signal becomes highly correlated in the time domain, which makes successful noise reduction quite difficult. The proposed parametric noise reduction method takes account of interdependencies between components observed in different frames. Extended speech and noise power spectral density (PSD) matrices containing additional temporal information are estimated, and a parametric multichannel noise reduction filter based on these PSD matrices is applied to the input microphone array signal. Experimental results show that the proposed algorithm performs better than the conventional multiplicative filtering technique, which considers only the current input signals.
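
The sketch below illustrates a parametric multichannel Wiener filter built from speech and noise PSD matrices for a single frequency bin, with a trade-off parameter playing the "parametric" role; the paper's temporal extension (stacking adjacent frames into the PSD matrices) is only indicated by a comment, and all statistics are synthetic.

```python
# Parametric multichannel Wiener filter for one frequency bin, built from
# (oracle) speech and noise PSD matrices. All statistics below are synthetic.
import numpy as np

rng = np.random.default_rng(5)
mics, frames = 4, 200
# Synthetic STFT-domain snapshots for one bin: rank-1 speech plus diffuse noise.
# The paper's temporal extension would stack adjacent frames into longer snapshots here.
speech = rng.normal(size=(mics, 1)) @ (rng.normal(size=(1, frames))
                                       + 1j * rng.normal(size=(1, frames)))
noise = 0.5 * (rng.normal(size=(mics, frames)) + 1j * rng.normal(size=(mics, frames)))
observed = speech + noise

phi_s = speech @ speech.conj().T / frames               # speech PSD matrix (oracle here)
phi_n = noise @ noise.conj().T / frames                 # noise PSD matrix
mu = 1.0                                                # noise-reduction trade-off parameter

e_ref = np.zeros((mics, 1))
e_ref[0] = 1.0                                          # reference microphone selector
w = np.linalg.solve(phi_s + mu * phi_n, phi_s @ e_ref)  # parametric multichannel Wiener filter
enhanced = w.conj().T @ observed                        # enhanced reference-channel signal
```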


International Conference on Acoustics, Speech, and Signal Processing | 2011

A data-driven residual gain approach for two-stage speech enhancement

Yu Gwang Jin; Chul Min Lee; Kiho Cho; Nam Soo Kim

In this paper, we propose a novel speech enhancement algorithm based on data-driven residual gain estimation. The system consists of two stages. A noisy input signal is processed at the first stage by a conventional speech enhancement module, from which both the enhanced signal and several signal-to-noise ratio (SNR)-related parameters are obtained. At the second stage, a residual gain, estimated by a data-driven method, is applied to the enhanced signal to further refine it. Experimental results show that the proposed algorithm performs better than both the conventional speech enhancement technique based on soft decision and the data-driven approach using an SNR grid look-up table.
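
A minimal sketch of the two-stage idea follows: a conventional per-bin gain first, then a residual gain read from a data-driven table indexed by an SNR-related parameter. The table values, SNR grid, and Wiener-style first stage are assumptions for illustration, not the paper's trained system.

```python
# Two-stage enhancement sketch: conventional gain, then a residual gain from an
# assumed look-up table indexed by an SNR-related parameter.
import numpy as np

def stage_one_gain(a_priori_snr):
    """Conventional Wiener-style gain from the a priori SNR (first stage)."""
    return a_priori_snr / (1.0 + a_priori_snr)

# Assumed table of residual gains "learned" offline, indexed by quantized SNR (dB).
snr_grid_db = np.array([-10, -5, 0, 5, 10, 15])
residual_gain_table = np.array([0.6, 0.7, 0.85, 0.95, 1.0, 1.0])

def enhance_bin(noisy_amplitude, a_priori_snr):
    g1 = stage_one_gain(a_priori_snr)
    snr_db = 10 * np.log10(a_priori_snr + 1e-12)
    idx = np.argmin(np.abs(snr_grid_db - snr_db))       # nearest grid point
    g2 = residual_gain_table[idx]                       # data-driven residual gain (second stage)
    return g2 * g1 * noisy_amplitude

print(enhance_bin(noisy_amplitude=1.0, a_priori_snr=0.5))
```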


International Conference on Acoustics, Speech, and Signal Processing | 2014

Crossband filtering for stereophonic acoustic echo suppression

Chul Min Lee; Jong Won Shin; Yu Gwang Jin; Jeoung Hun Kim; Nam Soo Kim

In this paper, we propose a novel stereophonic acoustic echo suppression (SAES) technique based on crossband filtering in the short-time Fourier transform (STFT) domain. The proposed algorithm considers spectral correlations among components in adjacent frequency bins and estimates extended power spectral density (PSD) matrices and cross-PSD vectors from the signal statistics for more precise echo estimation. The echo spectra are estimated in the STFT domain without any explicit double-talk detection. Experimental results show that the proposed algorithm performs better than the conventional SAES method.
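
The crossband idea can be sketched as follows: the echo at frequency bin k is predicted from reference bins k-1, k, and k+1 rather than bin k alone, here via a per-bin least-squares fit on synthetic single-channel data. The stereo extension and the PSD-matrix formulation of the paper are not reproduced.

```python
# Crossband echo estimation sketch: each microphone bin is regressed on the
# reference signal at the same bin and its two neighbours. Data are synthetic.
import numpy as np

rng = np.random.default_rng(6)
bins, frames = 64, 300
X = rng.normal(size=(bins, frames)) + 1j * rng.normal(size=(bins, frames))   # reference STFT
noise = 0.1 * (rng.normal(size=(bins, frames)) + 1j * rng.normal(size=(bins, frames)))
Y = 0.5 * X + 0.2 * np.roll(X, 1, axis=0) + noise       # mic STFT with crossband leakage

echo_estimate = np.zeros_like(Y)
for k in range(1, bins - 1):
    A = X[k - 1:k + 2, :].T                             # regressors: bins k-1, k, k+1
    w, *_ = np.linalg.lstsq(A, Y[k, :], rcond=None)     # per-bin crossband weights
    echo_estimate[k, :] = A @ w                         # echo predicted from three bins
```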


Journal of the Acoustical Society of America | 2004

Study of acoustic correlates associated with emotional speech

Serdar Yildirim; Sungbok Lee; Chul Min Lee; Murtaza Bulut; Carlos Busso; Ebrahim Kazemzadeh; Shrikanth Narayanan

This study investigates the acoustic characteristics of four different emotions expressed in speech. The aim is to obtain detailed acoustic knowledge of how a speech signal is modulated by changes from a neutral to a certain emotional state. Such knowledge is necessary for automatic emotion recognition and classification and for emotional speech synthesis. Speech data obtained from two semi-professional actresses are analyzed and compared. Each subject produces 211 sentences with four different emotions: neutral, sad, angry, happy. We analyze changes in temporal and acoustic parameters such as the magnitude and variability of segmental duration, fundamental frequency, and the first three formant frequencies as a function of emotion. Acoustic differences among the emotions are also explored with mutual information computation, multidimensional scaling, and acoustic likelihood comparison with normal speech. Results indicate that speech associated with anger and happiness is characterized by longer duration, shorter int...
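
A toy version of this kind of per-emotion acoustic-correlate comparison is sketched below, reporting the mean and variability of segment duration and F0 for each emotion; the simulated values are placeholders and carry no relation to the reported findings.

```python
# Per-emotion summary of segment duration and F0 (mean and spread) on simulated
# measurements; real analyses would use values extracted from the recordings.
import numpy as np

rng = np.random.default_rng(7)
emotions = ["neutral", "sad", "angry", "happy"]
data = {e: {"duration_s": rng.normal(0.3 + 0.05 * i, 0.05, 200),
            "f0_hz": rng.normal(200 + 20 * i, 25, 200)}
        for i, e in enumerate(emotions)}

for emotion, feats in data.items():
    dur, f0 = feats["duration_s"], feats["f0_hz"]
    print(f"{emotion:8s} duration={dur.mean():.2f}+/-{dur.std():.2f} s  "
          f"F0={f0.mean():.0f}+/-{f0.std():.0f} Hz")
```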

Collaboration


Dive into Chul Min Lee's collaborations.

Top Co-Authors

Shrikanth Narayanan, University of Southern California
Sungbok Lee, University of Southern California
Carlos Busso, University of Texas at Dallas
Nam Soo Kim, Seoul National University
Abe Kazemzadeh, University of Southern California
Jong Won Shin, Gwangju Institute of Science and Technology
Yu Gwang Jin, Seoul National University