Gumwon Hong | Researchain

Publication

Featured research published by Gumwon Hong.

meeting of the association for computational linguistics | 2009

Bridging Morpho-Syntactic Gap between Source and Target Sentences for English-Korean Statistical Machine Translation

Gumwon Hong; Seung Wook Lee; Hae Chang Rim

Statistical Machine Translation (SMT) between English and Korean often suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessary function words or by reordering source sentences. However, removing function words can cause a serious loss of information. In this paper, we present a method of bridging the morpho-syntactic gap for English-Korean SMT. In particular, the proposed method transforms a source sentence by inserting pseudo words and by reordering the sentence so that the source and target sentences have a similar length and word order. The proposed method achieves an increase of 2.4 in BLEU score over a baseline phrase-based system.

Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009) | 2009

A Hybrid Approach to English-Korean Name Transliteration

Gumwon Hong; Min-Jeong Kim; Do-Gil Lee; Hae-Chang Rim

This paper presents a hybrid approach to English-Korean name transliteration. The base system is built on MOSES with factored translation features enabled. We expand the base system by combining it with various transliteration methods, including Web-based n-best re-ranking, a dictionary-based method, and a rule-based method. Our standard run and best non-standard run achieve 45.1 and 78.5, respectively, in top-1 accuracy. Experimental results show that expanding the training data size contributes significantly to performance. We also find that the Web-based re-ranking method can be successfully applied to English-Korean transliteration.

north american chapter of the association for computational linguistics | 2009

A Multi-Phase Approach to Biomedical Event Extraction

Hyoung-Gyu Lee; Han-Cheol Cho; Min-Jeong Kim; Joo Young Lee; Gumwon Hong; Hae-Chang Rim

In this paper, we propose a system for biomedical event extraction using a multi-phase approach. It consists of an event trigger detector, an event type classifier, and a relation recognizer and event compositor. The system first identifies triggers in a given sentence. Then, it classifies each trigger into one of nine predefined classes. Lastly, the system examines whether each trigger has a relation with participant candidates, and composes events from the extracted relations. In the official evaluation, the proposed system scored 61.65 precision, 9.40 recall and 16.31 f-score in approximate span matching. However, we found that the threshold tuning for the third phase had a negative effect. Without the threshold tuning, the system showed 55.32 precision, 16.18 recall and 25.04 f-score.
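The reported f-scores can be checked against the reported precision and recall. The sketch below assumes the f-score is the balanced harmonic mean of precision and recall (F1), which the abstract does not state explicitly; the function name `f_score` is only illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "f-score" is F1; the abstract does not say so.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract (percentages).
official = f_score(61.65, 9.40)   # with threshold tuning
untuned = f_score(55.32, 16.18)   # without threshold tuning

print(round(official, 2), round(untuned, 2))  # 16.31 25.04
```

Under that assumption both reported f-scores are reproduced. The comparison also shows why the tuned run scored lower overall: the harmonic mean is dominated by the smaller of the two values, so the drop in recall from 16.18 to 9.40 outweighs the gain in precision.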

pacific rim international conference on artificial intelligence | 2010

Identifying idiomatic expressions using phrase alignments in bilingual parallel corpus

Hyoung Gyu Lee; Min-Jeong Kim; Gumwon Hong; Sang Bum Kim; Young Sook Hwang; Hae Chang Rim

Previous efforts to identify idiomatic expressions using a bilingual parallel corpus have focused on using word alignments to capture the sense of individual words. In this paper, we propose a method of using phrase alignments rather than word alignments in a parallel corpus to recognize the sense of phrases as well as words. Our proposed scoring functions are based on the difference in translation tendency between a phrase and its individual words. They help identify idiomatic expressions through an entropy variation and a translation difference between a phrase and its individual words. Experimental results show that our proposed method is more effective than previous approaches for the identification of idiomatic expressions. In addition, we show that linguistic constraints can be integrated into our method to improve performance.

international conference on ubiquitous information management and communication | 2011

Predicate-argument reordering based on learning to rank for English-Korean machine translation

Joo Young Lee; Gumwon Hong; Hae Chang Rim; Young In Song; Young Sook Hwang

In this paper, we propose a method of learning predicate-argument structure reordering, and present its effect on machine translation. The method takes two steps. First, it extracts generalized predicate-argument structure reordering rules from a parallel corpus using source sentence parse trees. Second, it trains a model based on a learning-to-rank framework to select the most relevant reordering rule from source language context features. The learned model is used to restructure a source sentence so that its word order resembles that of the target sentence. In our experiments on English-to-Korean machine translation, the proposed method achieves a significant improvement in BLEU score, from 19.68 to 21.84.

international conference on advanced language processing and web information technology | 2007

Korean Spacing by Improving Viterbi Segmentation

Gumwon Hong; Hae Chang Rim

This paper presents a Korean spacing approach which employs an improved Viterbi segmentation model. Traditional Viterbi segmentation using the word unigram language model is simple and fast, but has two problems: data sparseness and an improper preference for fewer segments. To overcome these limitations, the segmentation model is extended by employing a split probability based on character bigrams. Contextual information is selectively used to further resolve spacing ambiguities without much increase in complexity. Experimental results show that the extended model performs better than the traditional segmentation model. Furthermore, compared to the state-of-the-art system, our approach achieves better efficiency in terms of processing time without a significant loss of accuracy.
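For readers unfamiliar with the baseline, here is a minimal sketch of word-unigram Viterbi segmentation, the traditional model the abstract starts from. It is not the paper's extended model: the character-bigram split probability is not shown. The function name, the toy vocabulary, its probabilities, the `max_len` limit, and the unknown-character penalty are all assumptions made for illustration, and a Latin string stands in for unspaced Korean text.

```python
import math


def viterbi_segment(text, unigram_logprob, max_len=4, unk_logprob=-20.0):
    """Split `text` into the word sequence with the highest total
    unigram log-probability (dynamic programming over end positions)."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in unigram_logprob:
                score = unigram_logprob[word]
            elif len(word) == 1:
                score = unk_logprob  # fallback so every position is reachable
            else:
                continue             # unknown multi-character strings are skipped
            candidate = best[start] + score
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Hypothetical toy vocabulary with made-up probabilities.
vocab = {"the": 0.3, "cat": 0.2, "sat": 0.2, "cats": 0.05,
         "at": 0.1, "a": 0.1, "he": 0.05}
logprobs = {w: math.log(p) for w, p in vocab.items()}

print(viterbi_segment("thecatsat", logprobs))  # ['the', 'cat', 'sat']
```

Because every additional word multiplies in another probability below 1, a unigram model of this kind tends to favor segmentations with fewer words, and any word absent from the vocabulary can only be handled through the fallback penalty. These correspond to the two problems the abstract names.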

international conference on computational linguistics | 2010