Jinsik Lee
Pohang University of Science and Technology
Publications
Featured research published by Jinsik Lee.
Computer Speech & Language | 2009
Jinsik Lee; Gary Geunbae Lee
This paper presents a data-driven Korean grapheme-to-phoneme conversion method comprising alignment, rule extraction, and rule pruning procedures. Novel rule extraction and pruning techniques are introduced to handle exceptional pronunciations in speech databases effectively. With the full and the reduced rule sets, the method achieves phoneme-level accuracies of 99.22% and 99.11%, respectively. Compared with previous state-of-the-art work, the experimental results show that our method is very promising.
ACM Transactions on Asian Language Information Processing | 2012
Jinsik Lee; Sungjin Lee; Jonghoon Lee; Byeongchang Kim; Gary Geunbae Lee
This article presents a prosodic phrasing model for a general-purpose Korean speech synthesis system. To reflect the factors affecting prosodic phrasing in the model, linguistically motivated machine-learning features were investigated. These features were effectively incorporated using a stacking model. The phrasing performance was also improved through feature engineering. The corpus used in the experiment contains 4,392 sentences (55,015 words, with an average of 13 words per sentence). Because the corpus contains speaker-dependent variability, and such variability is not appropriately reflected in a general-purpose speech synthesis system, a method to reduce it is proposed. In addition, the entire set of data used in the experiment is made publicly available for future comparative research.
Computer Speech & Language | 2017
Gary Geunbae Lee; Ho-Young Lee; Jieun Song; Byeongchang Kim; Sechun Kang; Jinsik Lee; Hyosung Hwang
The proposed sentence stress feedback system consists of stress prediction, detection and feedback provision models. The accuracy of the stress prediction and detection models was 96.6% and 84.1% respectively. The stress feedback provision model provides non-native learners with sentence stress errors. Any sentence can be used for practice in the proposed system. Students trained with this system improved their accentedness and rhythm significantly more than those in the control group.
This paper proposes a sentence stress feedback system in which sentence stress prediction, detection, and feedback provision models are combined. This system provides non-native learners with feedback on sentence stress errors so that they can improve their English rhythm and fluency in a self-study setting. The sentence stress feedback system was devised to predict and detect the sentence stress of any practice sentence. The accuracy of the prediction and detection models was 96.6% and 84.1%, respectively. The stress feedback provision model offers positive or negative stress feedback for each spoken word by comparing the probability of the predicted stress pattern with that of the detected stress pattern. In an experiment that evaluated the educational effect of the proposed system incorporated in our CALL system, significant improvements in accentedness and rhythm were seen with the students who trained with our system but not with those in the control group.
International Conference on Advanced Computer Science and Information Technology | 2011
Byeongchang Kim; Jinsik Lee; Gary Geunbae Lee
This paper proposes a grapheme-to-phoneme conversion method for a Korean text-to-speech system in which phonological phrasing information is incorporated. To verify the validity of the proposed method, a hybrid grapheme-to-phoneme conversion system, which combines hand-written morphophonemic rules and maximum entropy models, is implemented. The experimental results show that the prediction accuracy of tensification on eojeol (space-delimited orthographic word) boundaries improves from 93.20% to 95.45%, which leads to better overall grapheme-to-phoneme conversion performance.
Ubiquitous Computing | 2010
Byeongchang Kim; Jinsik Lee; Gary Geunbae Lee
One of the enduring problems in developing a high-quality TTS (text-to-speech) system is pitch contour generation. Taking language-specific knowledge into account, an adjusted Fujisaki model for a Korean TTS system is introduced along with refined machine learning features. The results of quantitative and qualitative evaluations show the validity of our system: the accuracy of the phrase command prediction is 0.8928; the correlations of the predicted amplitudes of a phrase command and an accent command are 0.6644 and 0.6002, respectively; and our method achieved the level of “fair” naturalness (3.6) on a MOS scale for generated F0 curves.
Chemical Communications | 2011
Sanghwa Jeong; Nayoun Won; Jinsik Lee; Jiwon Bang; Jeongsoo Yoo; Sang Geol Kim; Jeong Ah Chang; Joonghyun Kim; Sungjee Kim
Journal of Physical Chemistry C | 2010
Sanghwa Jeong; Jinsik Lee; Jutaek Nam; Kyuhyun Im; Jaehyun Hur; Jong-Jin Park; Jong Min Kim; Bonghwan Chon; Taiha Joo; Sungjee Kim
Conference of the International Speech Communication Association | 2006
Seungwon Kim; Jinsik Lee; Byeongchang Kim; Gary Geunbae Lee
Archive | 2009
Sungjee Kim; Jinsik Lee; SongJoo Oh
Conference of the International Speech Communication Association | 2006
Jinsik Lee; Seungwon Kim; Gary Geunbae Lee