Byeongchang Kim | Researchain

Publication

Featured researches published by Byeongchang Kim.

ACM Transactions on Asian Language Information Processing | 2002

Morpheme-based grapheme to phoneme conversion using phonetic patterns and morphophonemic connectivity information

Byeongchang Kim; Gary Geunbae Lee; Jong-Hyeok Lee

Both dictionary-based and rule-based methods for grapheme-to-phoneme conversion have their own advantages and limitations. For example, the dictionary-based method requires a large phonetic dictionary and complex morphophonemic rules, while the LTS (letter-to-sound) rule-based method cannot by itself model the complete morphophonemic constraints. This paper describes a grapheme-to-phoneme conversion method for Korean using a dictionary-based and rule-based hybrid method with a phonetic pattern dictionary and CCV (consonant-consonant-vowel) LTS rules. The phonetic pattern dictionary, standing for the dictionary-based method, contains entries in the form of a morpheme pattern and its phonetic pattern. The patterns represent candidate phonological changes at the left and right boundaries of morphemes. The CCV LTS rules stand for the rule-based method and are in charge of grapheme-to-phoneme conversion within morphemes. The conversion method consists mainly of two steps, morpheme-to-phoneme conversion and a morphophonemic connectivity check, and two preprocessing steps, phrase break prediction and morpheme normalization. Phrase break prediction predicts phrase breaks using a stochastic method on part-of-speech (POS) information. Morpheme normalization replaces non-Korean symbols with their corresponding standard Korean graphemes. In the morpheme-phoneticizing module, each morpheme in the phrase is converted into phonetic patterns by looking it up in the phonetic pattern dictionary. Graphemes within a morpheme are grouped into CCV units and converted into phonemes by the CCV LTS rules. The morphophonemic connectivity table supports grammaticality checking of two adjacent phonetic morphemes. In experiments with a corpus of 4,973 sentences free of non-Korean symbols, we achieved a 99.98% grapheme-to-phoneme conversion performance rate and a 99.0% sentence conversion performance rate. With a broadcast news corpus of 621 sentences, 99.7% of the graphemes and 86.6% of the sentences are correctly converted. The full Korean TTS (Text-to-Speech) system is now being implemented using this conversion method.
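
As a rough illustration of the CCV grouping step described above, the following Python sketch decomposes Hangul syllables with standard Unicode arithmetic, forms (previous final consonant, initial consonant, vowel) units, and rewrites them with a tiny rule table. The rule table `TOY_RULES`, the function names, and the reading of a CCV unit as spanning a syllable boundary are assumptions made for illustration. They are not the paper's rule set, and the phonetic pattern dictionary and connectivity check are omitted.

```python
# Jamo inventories in Unicode Hangul-syllable order.
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 initial consonants
JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"       # 21 vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals

# Hypothetical toy rules: (previous final, initial) -> (new final, new initial).
TOY_RULES = {
    ("ㄱ", "ㄴ"): ("ㅇ", "ㄴ"),  # nasal assimilation
    ("ㄱ", "ㅁ"): ("ㅇ", "ㅁ"),  # nasal assimilation
    ("ㅁ", "ㅇ"): ("", "ㅁ"),    # liaison into an empty onset
    ("ㄱ", "ㅇ"): ("", "ㄱ"),    # liaison into an empty onset
}


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, vowel, final)."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 11172:
        raise ValueError(f"not a Hangul syllable: {syllable!r}")
    return CHO[index // 588], JUNG[(index % 588) // 28], JONG[index % 28]


def compose(cho, jung, jong=""):
    """Inverse of decompose."""
    code = 0xAC00 + (CHO.index(cho) * 21 + JUNG.index(jung)) * 28 + JONG.index(jong)
    return chr(code)


def ccv_units(word):
    """Return CCV units plus the final consonant left over after the last one."""
    units, prev_final = [], ""
    for syllable in word:
        cho, jung, jong = decompose(syllable)
        units.append((prev_final, cho, jung))
        prev_final = jong
    return units, prev_final


def to_phonemes(word):
    """Apply the toy rules to each CCV unit and spell the result in Hangul."""
    units, last_final = ccv_units(word)
    converted = [TOY_RULES.get((c1, c2), (c1, c2)) + (v,) for c1, c2, v in units]
    out = []
    for i, (_, cho, jung) in enumerate(converted):
        # The final of syllable i is the first consonant of unit i + 1.
        final = converted[i + 1][0] if i + 1 < len(converted) else last_final
        out.append(compose(cho, jung, final))
    return "".join(out)
```

With these toy rules, `to_phonemes("먹는")` gives `"멍는"` and `to_phonemes("음악")` gives `"으막"`; words with no matching rule pass through unchanged.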

ACM Transactions on Asian Language Information Processing | 2002

Automatic corpus-based tone and break-index prediction using K-ToBI representation

Jin-Seok Lee; Byeongchang Kim; Gary Geunbae Lee

In this article we present a prosody generation architecture based on K-ToBI (Korean Tone and Break Index) representation. ToBI is a multitier representation system based on linguistic knowledge that transcribes events in an utterance. TTS (Text-to-Speech) systems that adopt ToBI as an intermediate representation are known to exhibit higher flexibility, modularity, and domain/task portability than direct prosody generation TTS systems. However, for practical-level performance, corpus preparation is very expensive because the ToBI-labeled corpus is constructed manually by many prosody experts, and statistical prosody modeling normally requires large amounts of data. Unlike previous ToBI-based systems, this article proposes a new method that transcribes the K-ToBI labels in Korean speech completely automatically. We develop automatic corpus-based K-ToBI labeling tools and prediction methods based on several lexico-syntactic linguistic features for decision-tree induction. We demonstrate the performance of F0 generation from automatically predicted K-ToBI labels, and confirm that the performance is reasonably comparable to state-of-the-art direct prosody generation methods and previous ToBI-based methods.

International Conference on Computational Linguistics | 2000

Decision-tree based error correction for statistical phrase break prediction in Korean

Byeongchang Kim; Gary Geunbae Lee

In this paper, we present a new phrase break prediction architecture that integrates a probabilistic approach with decision-tree based error correction. The probabilistic method alone usually suffers from performance degradation due to inherent data sparseness problems, and it covers only a limited range of contextual information. Moreover, the module cannot utilize the selective morpheme tag and the relative distance to the other phrase breaks. The decision-tree based error correction was tightly integrated to overcome these limitations. The morpheme sequence, initially tagged with phrase breaks, is corrected by an error-correcting decision tree induced by C4.5 from the correctly tagged corpus together with the output of the probabilistic predictor. The decision-tree based post error correction provided improved results even with a phrase break predictor that has poor initial performance. Moreover, the system can be flexibly tuned to a new corpus without massive retraining.
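
The error-correction idea above can be sketched as training a small tree on the predictor's output plus extra context, with the gold tag as the target. The sketch below uses a simplified ID3-style information-gain tree in plain Python rather than C4.5, which the abstract names. The feature names (`pos`, `predicted`, `dist`) and the toy rows are invented for illustration and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def build_tree(rows, labels, features):
    """ID3-style induction over categorical features (not C4.5)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority

    def gain(feature):
        remainder = 0.0
        for value in set(row[feature] for row in rows):
            subset = [l for row, l in zip(rows, labels) if row[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(features, key=gain)
    if gain(best) <= 0:
        return majority
    rest = [f for f in features if f != best]
    branches = {}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree(
            [rows[i] for i in keep], [labels[i] for i in keep], rest
        )
    return {"feature": best, "branches": branches, "default": majority}


def classify(tree, row):
    """Walk the tree; unseen feature values fall back to the node majority."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Toy training set: each row pairs the initial predictor's tag with context,
# and the label is the gold tag from the correctly tagged corpus.
ROWS = [
    {"pos": "noun", "predicted": "no_break", "dist": "near"},
    {"pos": "noun", "predicted": "no_break", "dist": "far"},
    {"pos": "ending", "predicted": "no_break", "dist": "far"},
    {"pos": "ending", "predicted": "no_break", "dist": "near"},
    {"pos": "ending", "predicted": "break", "dist": "far"},
    {"pos": "noun", "predicted": "break", "dist": "near"},
    {"pos": "ending", "predicted": "break", "dist": "near"},
]
GOLD = ["no_break", "no_break", "break", "no_break", "break", "no_break", "break"]

corrector = build_tree(ROWS, GOLD, ["pos", "predicted", "dist"])
```

In this toy data the corrector overrides a `no_break` prediction after an `ending` morpheme that is far from the previous break, and it overrides a spurious `break` after a `noun`. Because the corrector is trained on the predictor's output, it can be retrained on a new corpus without touching the predictor itself.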

International Joint Conference on Natural Language Processing | 2004

High speed unknown word prediction using support vector machine for Chinese text-to-speech systems

Juhong Ha; Yu Zheng; Byeongchang Kim; Gary Geunbae Lee; Yoon-Suk Seong

One of the most significant problems in POS (Part-of-Speech) tagging of Chinese text is the identification of words in a sentence, since no blanks delimit the words. Because it is impossible to pre-register all words in a dictionary, unknown words inevitably occur during this process. The unknown word problem therefore has a marked effect on the accuracy of the sound in a Chinese TTS (Text-to-Speech) system. In this paper, we present an SVM (support vector machine) based method that predicts unknown words from the result of word segmentation and tagging. For the high-speed processing required in a TTS system, we pre-detect the candidate boundaries of unknown words before starting the actual prediction. We therefore perform unknown word prediction in two phases: detection and prediction. The experimental results are very promising, showing high precision and high recall at high speed.
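
The two-phase structure can be sketched as a cheap detection pass that narrows down candidate spans, followed by a classifier call only on those spans. In the sketch below, the detection heuristic (runs of two or more single-character tokens) and the `is_word` callback standing in for the SVM are assumptions for illustration; the abstract does not specify either.

```python
from typing import Callable, List, Tuple


def detect_candidates(tokens: List[str]) -> List[Tuple[int, int]]:
    """Phase 1: find [start, end) spans of two or more single-character tokens."""
    spans, start = [], None
    # The empty-string sentinel closes a run that reaches the end of the list.
    for i, token in enumerate(tokens + [""]):
        if len(token) == 1:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= 2:
                spans.append((start, i))
            start = None
    return spans


def merge_unknown_words(
    tokens: List[str],
    is_word: Callable[[List[str], int, int], bool],
) -> List[str]:
    """Phase 2: ask the classifier only about the detected candidate spans."""
    out, pos = [], 0
    for start, end in detect_candidates(tokens):
        out.extend(tokens[pos:start])
        if is_word(tokens, start, end):
            out.append("".join(tokens[start:end]))  # accept as one unknown word
        else:
            out.extend(tokens[start:end])           # keep the original split
        pos = end
    out.extend(tokens[pos:])
    return out
```

The speed benefit in this sketch comes from `is_word` being called once per candidate span instead of once per token position.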

Meeting of the Association for Computational Linguistics | 1998

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS

Byeongchang Kim; WonIl Lee; Gary Geunbae Lee; Jong-Hyeok Lee

This paper describes a grapheme-to-phoneme conversion method using phoneme connectivity and CCV conversion rules. The method consists mainly of four modules: morpheme normalization, phrase-break detection, morpheme-to-phoneme conversion, and phoneme connectivity check. Morpheme normalization replaces non-Korean symbols with standard Korean graphemes. The phrase-break detector assigns phrase breaks using part-of-speech (POS) information. In the morpheme-to-phoneme conversion module, each morpheme in the phrase is converted into phonetic patterns by looking it up in the morpheme phonetic pattern dictionary, which contains candidate phonological changes at the boundaries of the morphemes. Graphemes within a morpheme are grouped into CCV patterns and converted into phonemes by the CCV conversion rules. The phoneme connectivity table supports grammaticality checking of two adjacent phonetic morphemes. In experiments with a corpus of 4,973 sentences, we achieved a grapheme-to-phoneme conversion performance of 99.9%. The full Korean TTS system is now being implemented using this conversion method.

ACM Transactions on Asian Language Information Processing | 2012

Stacking Model-Based Korean Prosodic Phrasing Using Speaker Variability Reduction and Linguistic Feature Engineering

Jinsik Lee; Sungjin Lee; Jonghoon Lee; Byeongchang Kim; Gary Geunbae Lee

This article presents a prosodic phrasing model for a general purpose Korean speech synthesis system. To reflect the factors affecting prosodic phrasing in the model, linguistically motivated machine-learning features were investigated. These features were effectively incorporated using a stacking model. The phrasing performance was also improved through feature engineering. The corpus used in the experiment is a 4,392-sentence corpus (55,015 words with an average of 13 words per sentence). Because the corpus contains speaker-dependent variability and such variability is not appropriately reflected in a general purpose speech synthesis system, a method to reduce such variability is proposed. In addition, the entire set of data used in the experiment is provided to the public for future use in comparative research.

International Conference on Computational Science and Its Applications | 2006

C-ToBI-based pitch accent prediction using maximum-entropy model

Byeongchang Kim; Gary Geunbae Lee

We model Chinese pitch accent prediction as a classification problem with six C-ToBI pitch accent types, and apply conditional Maximum Entropy (ME) classification to this problem. We acquire multiple levels of linguistic knowledge from natural language processing to build well-integrated features for the ME framework. Five kinds of features were used to represent various linguistic constraints: phonetic features, POS tag features, phrase break features, position features, and length features.
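
A conditional maximum-entropy classifier over binary indicator features is equivalent to multinomial logistic regression, which the following plain-Python sketch trains by batch gradient ascent. The label names (`accent_0` to `accent_2`) are placeholders rather than the six C-ToBI accent types, and the feature templates and toy data are assumptions that use only three of the five feature kinds listed above.

```python
import math
from collections import defaultdict

LABELS = ["accent_0", "accent_1", "accent_2"]  # placeholders, not C-ToBI types


def extract(token):
    """Hypothetical feature templates: POS tag, phrase break, position."""
    return [
        f"pos={token['pos']}",
        f"break={token['break']}",
        f"position={token['position']}",
    ]


def probabilities(weights, feats):
    """p(y | x) proportional to exp(sum of weights of active (feature, y) pairs)."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train(data, epochs=200, lr=0.5):
    """Maximize conditional log-likelihood by batch gradient ascent."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in data:
            p = probabilities(weights, feats)
            for f in feats:
                for y in LABELS:
                    # Gradient: observed feature count minus expected count.
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - p[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(data)
    return weights


def predict(weights, feats):
    p = probabilities(weights, feats)
    return max(LABELS, key=lambda y: p[y])


TOY_DATA = [
    (extract({"pos": "noun", "break": "none", "position": "initial"}), "accent_0"),
    (extract({"pos": "verb", "break": "minor", "position": "medial"}), "accent_1"),
    (extract({"pos": "particle", "break": "major", "position": "final"}), "accent_2"),
    (extract({"pos": "noun", "break": "minor", "position": "medial"}), "accent_0"),
    (extract({"pos": "verb", "break": "none", "position": "initial"}), "accent_1"),
]

model = train(TOY_DATA)
```

Each additional kind of linguistic knowledge enters the model as another feature template in `extract`, which is one reason a feature-based ME formulation is convenient for integrating heterogeneous constraints.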

Computer Speech & Language | 2017

Automatic sentence stress feedback for non-native English learners

Gary Geunbae Lee; Ho-Young Lee; Jieun Song; Byeongchang Kim; Sechun Kang; Jinsik Lee; Hyosung Hwang

The proposed sentence stress feedback system consists of stress prediction, detection and feedback provision models. The accuracy of the stress prediction and detection models was 96.6% and 84.1% respectively. The stress feedback provision model provides non-native learners with sentence stress errors. Any sentence can be used for practice in the proposed system. Students trained with this system improved their accentedness and rhythm significantly more than those in the control group.

This paper proposes a sentence stress feedback system in which sentence stress prediction, detection, and feedback provision models are combined. This system provides non-native learners with feedback on sentence stress errors so that they can improve their English rhythm and fluency in a self-study setting. The sentence stress feedback system was devised to predict and detect the sentence stress of any practice sentence. The accuracy of the prediction and detection models was 96.6% and 84.1%, respectively. The stress feedback provision model offers positive or negative stress feedback for each spoken word by comparing the probability of the predicted stress pattern with that of the detected stress pattern. In an experiment that evaluated the educational effect of the proposed system incorporated in our CALL system, significant improvements in accentedness and rhythm were seen with the students who trained with our system but not with those in the control group.
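
The per-word comparison can be sketched as follows. This is a minimal illustration that assumes each model outputs a stress probability per word and that a fixed `threshold` of 0.5 turns a probability into a stressed or unstressed decision; the abstract does not state how the two probabilities are actually compared, so both the function name and the decision rule are hypothetical.

```python
from typing import List, Tuple


def stress_feedback(
    words: List[str],
    p_predicted: List[float],
    p_detected: List[float],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Label each word positive when the learner's detected stress matches
    the stress predicted from the text, and negative otherwise."""
    feedback = []
    for word, p_ref, p_obs in zip(words, p_predicted, p_detected):
        expected_stress = p_ref >= threshold   # what the text calls for
        observed_stress = p_obs >= threshold   # what the learner produced
        label = "positive" if expected_stress == observed_stress else "negative"
        feedback.append((word, label))
    return feedback
```

For example, a word with a predicted stress probability of 0.9 and a detected probability of 0.2 would be flagged as negative, since the learner did not stress a word that the text calls for stressing.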

Archive | 2016

ASR Error Management Using RNN Based Syllable Prediction for Spoken Dialog Applications

Byeongchang Kim; Junhwi Choi; Gary Geunbae Lee

We propose an automatic speech recognition (ASR) error management method using recurrent neural network (RNN) based syllable prediction for spoken dialog applications. ASR errors are detected and corrected by syllable prediction. For accurate prediction of the next syllable, we use the current syllable, the previous syllable context, and the phonetic information of the next syllable, which is given by the ASR output containing the error. The proposed method can correct ASR errors using only the text corpus used for training the target application, which means that the method is independent of the ASR engine. The method is general and can be applied to any speech-based application such as spoken dialog systems.

Archive | 2015

An Automatic Feedback System Based on Confidence Deviations of Prediction and Detection Models for English Phrase Break

Byeongchang Kim; Gary Geunbae Lee

This paper presents a method to construct a feedback provision model for English phrase breaks by utilizing confidence deviations of prediction and detection models. The proposed method consists of prediction, detection and feedback provision models. The prediction and detection models adopted conditional random field classifiers trained on the Boston University radio news corpus, and achieved accuracies of 90.15% and 91.62%, respectively. The feedback provision model determines one of three types of feedback for each disjunction using the difference between the prediction and detection confidences. In a validation experiment for the feedback provision, the proposed method demonstrated a Pearson's correlation coefficient of 0.74 between the feedback provision model's scores and human fluency assessments.
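
A minimal sketch of the two computations mentioned above: mapping a confidence deviation to one of three feedback types, and measuring Pearson's correlation between model scores and human ratings. The feedback labels (`insert_break`, `remove_break`, `ok`), the `margin` value, and the sign convention are assumptions for illustration; only the Pearson formula is standard.

```python
import math
from typing import Sequence


def break_feedback(pred_conf: float, det_conf: float, margin: float = 0.3) -> str:
    """Map the deviation between the text-based prediction confidence and the
    speech-based detection confidence to one of three feedback types."""
    deviation = pred_conf - det_conf
    if deviation > margin:
        return "insert_break"   # a break was expected but not clearly produced
    if deviation < -margin:
        return "remove_break"   # a break was produced but not expected
    return "ok"


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In a validation setup like the one described, `pearson` would be applied to per-utterance feedback scores and the corresponding human fluency assessments.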
