Cheolwoo Jo
Changwon National University
Publication
Featured research published by Cheolwoo Jo.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2007
Jianglin Wang; Cheolwoo Jo
Diagnosis of pathological voice is one of the most important issues in biomedical applications of speech technology. This study focuses on the classification of pathological voice using the HMM (hidden Markov model), the GMM (Gaussian mixture model) and an SVM (support vector machine), and then compares the results to previous work using an ANN (artificial neural network). Speech data were collected from speakers with and without vocal disorders, and normal and pathological speech data were mixed in our experiment. Six characteristic parameters (jitter, shimmer, NHR, SPI, APQ and RAP) were chosen, and the pattern recognition methods (HMM, GMM and SVM) were used to classify the mixed data into normal and pathological speech. We found that the GMM-based method gives superior classification rates compared to the other methods.
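The classification setup the abstract describes can be sketched with class-conditional GMMs, one per class, classifying by maximum log-likelihood. Everything below (feature values, mixture sizes) is illustrative only, not the paper's configuration; in a real experiment the six parameters would come from acoustic analysis of recorded voices.

```python
# Sketch: GMM-based normal/pathological classification on six perturbation
# parameters (jitter, shimmer, NHR, SPI, APQ, RAP). Feature values are synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic 6-dim feature vectors: pathological voices tend to show higher
# jitter/shimmer/NHR than normal ones (illustrative means only).
normal = rng.normal(loc=[0.5, 2.0, 0.10, 15.0, 2.0, 0.3], scale=0.2, size=(100, 6))
patho = rng.normal(loc=[1.5, 5.0, 0.25, 25.0, 4.0, 0.9], scale=0.4, size=(100, 6))

# Fit one GMM per class on training data; classify by higher log-likelihood.
gmm_n = GaussianMixture(n_components=2, random_state=0).fit(normal[:80])
gmm_p = GaussianMixture(n_components=2, random_state=0).fit(patho[:80])

test = np.vstack([normal[80:], patho[80:]])
labels = np.array([0] * 20 + [1] * 20)
pred = (gmm_p.score_samples(test) > gmm_n.score_samples(test)).astype(int)
accuracy = (pred == labels).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

The HMM and SVM baselines from the paper would plug into the same train/score split.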
International Conference of the IEEE Engineering in Medicine and Biology Society | 2005
Cheolwoo Jo; Tao Li; Jianglin Wang
The aim of this paper is to analyze and discriminate pathological voice by separating the signal into periodic and aperiodic parts. Separation was performed recursively on the residual signal of the voice signal. Based on an initial estimate of the aperiodic part of the spectrum, the aperiodic part is determined by an extrapolation method, and the periodic part is obtained by subtracting the aperiodic part from the original spectrum. An HNR parameter is derived from this separation, and its statistics are compared with those of jitter and shimmer for normal, benign and malignant cases.
International Conference on Signal Processing | 2004
Tao Li; Cheolwoo Jo
In this paper we tried to discriminate severely noisy pathological voices from normal ones based on two parameters: spectral slope and the ratio of energies in the harmonic and noise components (HNR). The spectral slope was obtained by a curve-fitting method and the HNR was computed from the cepstrum. Speech data from normal people and patients were collected, then diagnosed and classified into three categories (normal, relatively less noisy and severely noisy). The mean and standard deviation of the two parameters were computed and compared to characterize and discriminate severely noisy pathological voice from the others.
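The two parameters above can be sketched in a few lines: spectral slope as a straight line fitted to the log-magnitude spectrum, and harmonicity as the height of the cepstral peak in the voice-pitch quefrency range (a CPP-like proxy). The details here are assumptions for illustration, not the paper's exact formulas.

```python
# Sketch: spectral slope by line fitting, harmonicity from the cepstral peak.
import numpy as np

def spectral_slope_db_per_hz(x, fs):
    """Slope of a straight line fitted to the log-magnitude spectrum."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    slope, _ = np.polyfit(freqs, 20 * np.log10(spec + 1e-12), 1)
    return slope  # dB per Hz

def cepstral_peak(x, fs, f0_lo=60.0, f0_hi=400.0):
    """Height of the cepstral peak in the typical voice-pitch quefrency range."""
    log_spec = np.log(np.abs(np.fft.rfft(x * np.hanning(len(x)))) + 1e-12)
    ceps = np.fft.irfft(log_spec)
    q = np.arange(len(ceps)) / fs                     # quefrency in seconds
    band = (q > 1.0 / f0_hi) & (q < 1.0 / f0_lo)
    return ceps[band].max()

fs = 16000
t = np.arange(4096) / fs
# Toy signals: a harmonic tone at 150 Hz vs the same tone buried in noise.
harmonic = sum(np.sin(2 * np.pi * 150 * k * t) / k for k in range(1, 6))
noisy = harmonic + 2.0 * np.random.default_rng(1).standard_normal(len(t))

print("slope (dB/Hz):", spectral_slope_db_per_hz(noisy, fs))
print("harmonic peaks higher:", cepstral_peak(harmonic, fs) > cepstral_peak(noisy, fs))
```

The noisier the voice, the flatter the harmonic comb in the log spectrum and the smaller the cepstral peak, which is what makes the measure usable for the noisy/less-noisy split.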
International Conference on Spoken Language Processing | 1996
Cheolwoo Jo; Ho-Gyun Bang; William A. Ainsworth
The paper proposes an improved method of glottal closure instant detection using linear prediction and the standard pitch concept. The main improvements are in computation speed and in reduced position-finding errors for cases that were impossible or error-prone with previous methods. The false location detection rate is reduced owing to the method's inherent interpolation capability, and the amount of computation is also reduced. Another benefit is that no additional post-processing is needed to find peaks or to smooth the pitch tracks. The authors also compare results among three different kinds of linear-prediction-based pitch detectors.
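The core idea behind linear-prediction-based GCI detection is that glottal closures show up as strong peaks in the prediction residual, roughly one pitch period apart. The simplified sketch below (autocorrelation-method LPC plus peak picking on a toy voiced signal) illustrates that idea only; it is not the paper's improved algorithm.

```python
# Sketch: glottal closure instants as peaks of the LPC residual.
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import find_peaks, lfilter

def lpc_residual(x, order=12):
    """Inverse-filter x with autocorrelation-method LPC coefficients."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])      # predictor coefficients
    return x - np.convolve(np.r_[0.0, a], x)[:len(x)]  # prediction error

fs, f0 = 8000, 100                        # 10 ms pitch period -> 80 samples
n = np.arange(2048)
# Toy voiced signal: impulse train through one "vocal tract" resonance.
excitation = (n % (fs // f0) == 0).astype(float)
pole = 0.95 * np.exp(2j * np.pi * 800 / fs)
x = lfilter([1.0], np.poly([pole, np.conj(pole)]).real, excitation)

resid = lpc_residual(x)
peaks, _ = find_peaks(np.abs(resid), height=0.5 * np.abs(resid).max(),
                      distance=int(0.5 * fs / f0))
print("candidate GCI spacing (samples):", np.diff(peaks))
```

On real speech the residual peaks are far less clean, which is where the paper's interpolation and error-reduction refinements come in.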
Text, Speech and Dialogue | 1999
Cheolwoo Jo; Attila Ferencz; Dae-Hyun Kim
This paper briefly presents the analysis, re-synthesis and evaluation of emotional Korean speech. Four emotional styles, elicited from actors, were taken into consideration. Prosodic features of these sentences were extracted and analyzed, and based on the statistical results one neutrally uttered sentence was re-synthesized into the four emotional styles. The generated artificial emotional speech was evaluated by judges using a MOS test.
International Congress on Image and Signal Processing | 2013
Cheolwoo Jo; Osho Gupta
This paper presents basic research results on visualizing the micro-movement of formant tracks, for use in analyzing speech signals in research on speech analysis, synthesis, etc. Current formant analysis mainly focuses on static statistical characteristics, but dynamic information about changing vowels is hard to acquire from conventional statistics such as the mean and variance. In this experiment we propose a new visualization method that displays the dynamic behavior of vowel changes according to the micro-movement of formants in the vowel space.
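Such a visualization needs per-frame formant estimates to begin with. One common way to obtain them, sketched below on a toy two-resonance "vowel", is LPC root-finding: the angles of the complex roots of the LPC polynomial give the formant frequencies. This is a standard estimation recipe, not the paper's tracking or display method.

```python
# Sketch: formant estimation from the roots of the LPC polynomial.
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc(frame, order):
    """Autocorrelation-method LPC; returns A(z) coefficients [1, -a1, ...]."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    return np.r_[1.0, -a]

def formants(frame, fs, order=10):
    """Formant candidates: angles of the LPC polynomial's upper-half-plane roots."""
    roots = np.roots(lpc(frame * np.hamming(len(frame)), order))
    roots = roots[roots.imag > 0.01]
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

# Toy "vowel": white noise through two resonances at 600 Hz and 1200 Hz.
fs = 8000
poles = [0.97 * np.exp(2j * np.pi * f / fs) for f in (600, 1200)]
a_true = np.poly(poles + [np.conj(p) for p in poles]).real
x = lfilter([1.0], a_true, np.random.default_rng(2).standard_normal(4000))

est = formants(x, fs, order=4)
print("estimated formants (Hz):", est)
```

Running the estimator frame by frame and plotting successive (F1, F2) points on the vowel space yields the kind of micro-movement trajectory the paper visualizes.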
International Conference on Computer Research and Development | 2011
Cheolwoo Jo; Jaehee Kim
In this paper, details of designing and implementing a voice source simulator using Simulink and MATLAB are discussed. The simulator is an implementation of the model-based design concept. The voice source can be analyzed and manipulated through various factors by choosing options from the GUI and selecting pre-defined or user-created blocks. This kind of simulation tool can ease the procedure of analyzing speech signals for purposes such as voice quality analysis, pathological voice analysis and speech coding. Basic analysis functions are also supported to compare the original signal with the manipulated ones.
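The paper's simulator is Simulink/MATLAB-based; as a rough Python analogue, the sketch below generates a Rosenberg-model glottal pulse train, one common voice-source model such a simulator might expose as a block. The model choice and parameter values are assumptions for illustration.

```python
# Sketch: Rosenberg-model glottal pulse train as a simple voice source.
import numpy as np

def rosenberg_pulse(n_open, n_close):
    """One Rosenberg glottal pulse: rising open phase, falling closing phase."""
    opening = 0.5 * (1 - np.cos(np.pi * np.arange(n_open) / n_open))
    closing = np.cos(0.5 * np.pi * np.arange(n_close) / n_close)
    return np.concatenate([opening, closing])

def glottal_source(fs=16000, f0=120, open_quotient=0.6, dur=0.05):
    """Periodic glottal flow at fundamental f0 with the given open quotient."""
    period = int(fs / f0)
    n_open = int(open_quotient * period * 2 / 3)   # assumed 2:1 rise/fall split
    n_close = int(open_quotient * period / 3)
    pulse = rosenberg_pulse(n_open, n_close)
    one_period = np.zeros(period)
    one_period[:len(pulse)] = pulse
    return np.tile(one_period, int(dur * fs / period))

src = glottal_source()
print(len(src), src.max())
```

Varying f0 and the open quotient here plays the same role as tweaking block parameters in the simulator's GUI.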
Intelligent Systems Design and Applications | 2006
Cheolwoo Jo; Jianglin Wang
We explored the voice source and vocal tract characteristics of emotional speech to estimate voice quality. The emotional speech data were collected from actors; the materials consist of 10 sentences from 3 male and 3 female speakers in 6 emotional states: sadness, anger, happiness, fear, boredom and neutral. The /a/ sound was segmented from the data and used for the analysis. For the voice source we measured jitter, shimmer, NHR, pitch and pitch range. To investigate vocal tract changes, normalized vocal tract area ratios were used: area functions were computed from the linear predictive coefficients, the vocal tract was divided into three sections to observe changes across emotions, and each computed area was normalized by dividing by the same section of the corresponding neutral speech. Jitter was largest in the neutral emotion. Shimmer was similar across emotions except for fear, which also showed the largest NHR. Pitch was elevated for all emotions except boredom, and the pitch range was largest in anger. In terms of vocal tract change, there was no remarkable difference at the lip section, but fear and sadness showed great changes at the vocal fold section.
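The perturbation measures used above can be computed from per-cycle pitch periods and peak amplitudes. The sketch below uses the usual local (cycle-to-cycle) definitions of jitter and shimmer, which is an assumption about the paper's exact formulas; the cycle measurements are toy values.

```python
# Sketch: local jitter and shimmer from per-cycle measurements.
import numpy as np

def jitter_percent(periods):
    """Mean absolute difference of consecutive periods / mean period, in %."""
    periods = np.asarray(periods, dtype=float)
    return 100 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def shimmer_percent(amplitudes):
    """Mean absolute difference of consecutive peak amplitudes / mean, in %."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return 100 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Toy cycle-by-cycle measurements (seconds and arbitrary amplitude units).
periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
amps = [1.00, 0.97, 1.03, 0.99, 1.01]
print(f"jitter  = {jitter_percent(periods):.2f}%")
print(f"shimmer = {shimmer_percent(amps):.2f}%")
```

In the study these values would be computed from the segmented /a/ vowels and then compared across the six emotional states.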
2005 IEEE International Conference on Information Acquisition | 2006
Tao Li; Cheolwoo Jo
Voice quality is considered to play an important role in the transmission of emotion in human speech communication. In this paper, we explored the acoustical characteristics of voice quality in emotional speech signals based on numerical parameters such as jitter, RAP, shimmer, APQ, NHR and SPI; we also focused on the role of pitch, pitch range and normalized speech duration. A Korean emotional speech database was collected from a professional actor: nine sentences with different contents were each uttered with six kinds of emotion (neutral, happiness, anger, sadness, fear and boredom). Jitter, RAP, shimmer, APQ, NHR and SPI were computed after extracting the voiced segment with the vowel /a/ from each emotional sentence, and pitch, pitch range and normalized speech duration were also measured or computed. A statistical analysis of the changes in these nine parameters was performed to characterize the voice quality of human emotional speech.
Conference of the International Speech Communication Association | 2004
Cheolwoo Jo; Soo-Geon Wang; Byunggon Yang; Hyung-Soon Kim; Tao Li