Yi-Gyu Hwang
Electronics and Telecommunications Research Institute
Publication
Featured research published by Yi-Gyu Hwang.
asia information retrieval symposium | 2006
Changki Lee; Yi-Gyu Hwang; Hyo-Jung Oh; Soojong Lim; Jeong Heo; Chung-Hee Lee; Hyeon-Jin Kim; Ji-Hyun Wang; Myung-Gil Jang
In many QA systems, fine-grained named entities are extracted by a coarse-grained named entity recognizer and a fine-grained named entity dictionary. In this paper, we describe fine-grained Named Entity Recognition using Conditional Random Fields (CRFs) for question answering. We used CRFs to detect the boundaries of named entities and Maximum Entropy (ME) to classify named entity classes. Using the proposed approach, we achieved 83.2% precision, 74.5% recall, and 78.6% F1 for 147 fine-grained named entity types. Moreover, we reduced the training time to 27% without loss of performance compared to a baseline model. In question answering, the QA system with passage retrieval and AIU achieved about 26% improvement over QA with passage retrieval alone. The results demonstrate that our approach is effective for QA.
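The reported F-score can be checked against the reported precision and recall. The short sketch below assumes the balanced F-measure (the harmonic mean of precision and recall); the function name `f1_score` is chosen here for illustration and is not from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.832, 0.745)
print(f"{f1:.1%}")  # prints 78.6%
```

The result agrees with the 78.6% stated in the abstract.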
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages | 2003
Euisok Chung; Yi-Gyu Hwang; Myung-Gil Jang
Named entity recognition is important in sophisticated information service systems such as Question Answering and Text Mining, since most answer types and text mining units depend on the named entity type. Therefore, we focus on a named entity recognition model for Korean. Korean named entity recognition is difficult since the words of a named entity have no specific features such as capitalization in English. It depends heavily on large amounts of hand-labeled data and a named entity dictionary, even though these are tedious and expensive to create. In this paper, we devise an HMM-based named entity recognizer that considers various context models. Furthermore, we consider a weakly supervised learning technique, CoTraining, to combine labeled and unlabeled data.
robot and human interactive communication | 2007
Hyo-Jung Oh; Chung-Hee Lee; Yi-Gyu Hwang; Myung-Gil Jang; Jeon Gue Park; Yun Kun Lee
This paper presents a case study of an edutainment robot, which is an intelligent robot for educational use with a voice-QA model applied. The key functions of our robot are: analyzing a spoken question from a student, finding an appropriate answer in a Korean encyclopedia, and then delivering the answer with speech synthesis. We develop the ESTk, which is an Automatic Speech Recognition (ASR) system based on a Finite State Network (FSN) for processing Korean spoken questions. For answer extraction, we utilize machine learning techniques and a pattern extraction method. With our live-update interaction method, our robot can be extended with new knowledge in real time. By conducting a quiz game, we show the potential of our robot as an edutainment robot.
The KIPS Transactions: Part B | 2003
Yi-Gyu Hwang; Bo-Hyun Yun
Named entity recognition is a process indispensable to question answering and information extraction systems. This paper presents an HMM-based named entity (NE) recognition method using the construction principles of compound words. In Korean, many named entities can be decomposed into more than one word. Moreover, there are contextual relationships among the nouns in an NE, and between an NE and its surrounding words. In this paper, we classify each word as an NE in itself, a word in an NE, and/or a word adjacent to an NE, and train an HMM based on NE-related word types and parts of speech. The proposed named entity recognition (NER) system uses a trigram HMM to handle the variable length of NEs. However, the trigram HMM has a serious data sparseness problem. To solve this problem, we use multi-level back-offs. Experimental results show that our NER system achieves an F-measure of 87.6% on economic articles.
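The abstract does not specify how its multi-level back-off is defined. As a rough illustration of the general idea only, the sketch below scores a trigram from trigram counts when available, then falls back to bigram and finally to smoothed unigram counts, applying a fixed penalty at each level. The class name `BackoffTrigram`, the penalty `alpha`, the toy labels `O`/`B`/`I`, and the add-one smoothing are all assumptions made here, not the authors' scheme, and the scores are not normalized probabilities.

```python
from collections import Counter


class BackoffTrigram:
    """Trigram scorer that backs off trigram -> bigram -> unigram."""

    def __init__(self, sequences, alpha=0.4):
        self.alpha = alpha      # fixed back-off penalty (an assumption)
        self.tri = Counter()    # counts of (a, b, c)
        self.bi = Counter()     # counts of (b, c)
        self.uni = Counter()    # counts of c
        self.ctx2 = Counter()   # counts of the context (a, b)
        self.ctx1 = Counter()   # counts of the context b
        self.total = 0          # number of observed symbols
        for seq in sequences:
            padded = ["<s>", "<s>"] + list(seq)
            for a, b, c in zip(padded, padded[1:], padded[2:]):
                self.tri[(a, b, c)] += 1
                self.bi[(b, c)] += 1
                self.uni[c] += 1
                self.ctx2[(a, b)] += 1
                self.ctx1[b] += 1
                self.total += 1

    def score(self, a, b, c):
        """Score c given the two preceding symbols a and b."""
        if self.tri[(a, b, c)] > 0:
            return self.tri[(a, b, c)] / self.ctx2[(a, b)]
        if self.bi[(b, c)] > 0:
            return self.alpha * self.bi[(b, c)] / self.ctx1[b]
        # Last level: add-one smoothed unigram, so unseen symbols score > 0.
        vocab = len(self.uni)
        return self.alpha ** 2 * (self.uni[c] + 1) / (self.total + vocab + 1)


# Toy label sequences (hypothetical data, not from the paper).
model = BackoffTrigram([
    ["O", "B", "I", "O"],
    ["B", "I", "I", "O"],
    ["O", "O", "B", "O"],
])
print(model.score("O", "B", "I"))  # seen trigram: relative frequency
print(model.score("I", "O", "B"))  # unseen trigram: backs off to the bigram
print(model.score("O", "B", "X"))  # unseen symbol: backs off to the unigram
```

Backing off trades specificity for coverage: a sparse trigram table still yields a usable score because lower-order counts are much denser.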
international conference on computational linguistics | 2003
Bo-Hyun Yun; Tae-Hyun Kim; Yi-Gyu Hwang; Pal-Jin Lee; Seung-Shik Kang
Information extraction aims to extract information about the main events in a text. This paper presents an event sentence extraction method in Korean newspapers for information extraction. Event sentences contain meaningful information such as the agent, the time and the place of an event. To extract these sentences, we acquire various features such as verbs, nouns, noun phrases, 3Ws, and their weights. The system then computes the weights of sentences and extracts event sentences with our extraction algorithm. The experimental results show an average precision of 86.1%.
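The abstract gives no formula for the sentence weight. One simple reading, sketched below, sums the weights of the features found in a sentence and keeps sentences whose total reaches a threshold. The feature names, weights, threshold, and the mapping of the 3Ws to agent, time and place are all hypothetical choices made for illustration; the paper's own extraction algorithm may differ.

```python
# Hypothetical feature weights; the paper acquires its own from Korean news text.
FEATURE_WEIGHTS = {
    "verb:announced": 2.0,
    "noun:merger": 1.5,
    "3w:agent": 1.0,
    "3w:time": 1.0,
    "3w:place": 1.0,
}


def sentence_weight(features):
    """Sum the weights of known features; unknown features contribute 0."""
    return sum(FEATURE_WEIGHTS.get(f, 0.0) for f in features)


def extract_event_sentences(sentences, threshold=3.0):
    """Keep the text of each (text, features) pair whose weight meets the threshold."""
    return [text for text, feats in sentences
            if sentence_weight(feats) >= threshold]


# Toy input: features are assumed to come from an earlier analysis step.
sentences = [
    ("The company announced a merger in Seoul on Monday.",
     ["verb:announced", "noun:merger", "3w:agent", "3w:time", "3w:place"]),
    ("The weather was mild.", ["noun:weather"]),
]
print(extract_event_sentences(sentences))
```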
Lecture Notes in Computer Science | 2004
Hyun-Kyu Kang; Yi-Gyu Hwang; Pum-Mo Ryu
This paper presents an effective concept-based document classification system, which can efficiently classify Korean documents through the thesaurus tool. The thesaurus tool is an information extractor that acquires the meanings of document terms from the thesaurus. It supports effective document classification with the acquired meanings. The system uses a concept-probability vector to represent the meanings of the terms. Because the category of a document depends on the meanings rather than the terms, the system can classify the document without degradation of performance even though the size of the vector is small. The system uses the small concept-probability vector so that it can save time and space for document classification. The experimental results suggest that the presented system with the thesaurus tool can effectively classify the documents.
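The abstract does not define how the concept-probability vector is computed. A minimal sketch of one plausible construction follows: each term is mapped to its thesaurus concepts, an ambiguous term splits its count evenly among them, and the totals are normalized to sum to 1. The toy `THESAURUS`, the even split, and the function name are assumptions made here for illustration.

```python
from collections import Counter

# Toy thesaurus mapping terms to concepts (hypothetical entries).
THESAURUS = {
    "apple": ["fruit"],
    "pear": ["fruit"],
    "bank": ["finance", "geography"],
    "loan": ["finance"],
}


def concept_probability_vector(terms):
    """Map document terms to a normalized distribution over concepts."""
    counts = Counter()
    for term in terms:
        concepts = THESAURUS.get(term, [])
        for concept in concepts:
            # An ambiguous term spreads one unit of mass over its concepts.
            counts[concept] += 1 / len(concepts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: mass / total for concept, mass in counts.items()}


vector = concept_probability_vector(["bank", "loan", "loan", "apple"])
print(vector)
```

Because many terms share a concept, the vector has far fewer dimensions than a term vector, which is consistent with the space and time savings the abstract describes.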
knowledge and systems engineering | 2009
Hyo-Jung Oh; Chung-Hee Lee; Yi-Gyu Hwang; Jeong Hur; Myung-Gil Jang
For question answering, the multi-source approach is justifiable especially when different sources provide different types of knowledge. In this paper, a variety of question and answer types are revealed. The key point this paper addresses under the framework of extensible QA is efficient and consonant usage of a number of distinct QA techniques for improving the answer confidence. To prove the extensibility of the proposed model, we started with four QA modules first and added two specialty modules to the system one by one. As a result, both effectiveness and efficiency are improved as we employ more specialized QA modules.
text speech and dialogue | 2007
Hyo-Jung Oh; Chung-Hee Lee; Yi-Gyu Hwang; Myung-Gil Jang
In this paper, we introduce a practical spoken dialogue interface for intelligent TV based on goal-oriented dialogue modeling. It uses a frame structure for representing the user intention and determining the next action. To analyze discourse context, we employ several statistical learning techniques and devise an incremental dialogue strategy learning method from a training corpus. Through empirical experiments, we demonstrated the efficiency of the proposed system. In the subjective evaluation, we obtained a 73% user satisfaction ratio, while the objective evaluation result was over 90% in a restricted situation for commercialization.
information reuse and integration | 2006
Hyo-Jung Oh; Yi-Gyu Hwang; Chung-Hee Lee; Changki Lee; Ji-Hyun Wang; Hyeon-Jin Kim; Myung-Gil Jang
In this paper, we attempt an approach to integrating various QA engines based on heterogeneous answer extraction methods. Our QA system is based on a Web encyclopedia in Korean. We investigate characteristics of the encyclopedia and incorporate them in our answer acquisition methods. We defined four different types of QA engines depending on answer class: knowledge-base QA, record QA, descriptive QA, and general QA. We describe how our proposed model was applied to the QA system by reviewing a set of experimental results.
Lecture Notes in Computer Science | 2006
Changki Lee; Yi-Gyu Hwang; Hyo-Jung Oh; Soojong Lim; Jeong Heo; Chung-Hee Lee; Hyeon-Jin Kim; Ji-Hyun Wang; Myung-Gil Jang