Jin-Ji Li
Pohang University of Science and Technology
Publication
Featured research published by Jin-Ji Li.
International Joint Conference on Natural Language Processing | 2009
Jungi Kim; Jin-Ji Li; Jong-Hyeok Lee
This paper describes an approach to utilizing term weights for sentiment analysis tasks and shows how various term weighting schemes improve the performance of sentiment analysis systems. Previously, sentiment analysis was mostly studied under data-driven and lexicon-based frameworks. Such work generally exploits textual features for fact-based analysis tasks or lexical indicators from a sentiment lexicon. We propose to model term weighting in a sentiment analysis system utilizing collection statistics, contextual and topic-related characteristics, as well as opinion-related properties. Experiments carried out on various datasets show that our approach effectively improves on previous methods.
Workshop on Statistical Machine Translation | 2009
Jin-Ji Li; Jungi Kim; Dongil Kim; Jong-Hyeok Lee
Chinese and Korean belong to different language families in terms of word-order and morphological typology. Chinese is an SVO and morphologically poor language, while Korean is an SOV and morphologically rich one. In Chinese-to-Korean SMT systems, systematic differences between the verbal systems of the two languages make the generation of Korean verbal phrases difficult. To resolve these difficulties, we address two issues in this paper. The first is that the verb position differs from the viewpoint of word-order typology. The second is the difficulty of generating the complex morphology of Korean verbs from the viewpoint of morphological typology. We propose a Chinese syntactic reordering that is better at generating Korean verbal phrases in Chinese-to-Korean SMT. Specifically, we consider reordering rules targeting Chinese verb phrases (VPs), prepositional phrases (PPs), and modality-bearing words that are closely related to Korean verbal phrases. We verify our system with two corpora from different domains. Our proposed approach significantly improves the performance of our system over a baseline phrase-based SMT system. The relative improvements on the two corpora are +9.32% and +5.43%, respectively.
International Journal of Computer Processing of Languages | 2005
Jin-Ji Li; Ji-Eun Roh; Dongil Kim; Jong-Hyeok Lee
To generate a proper Korean predicate, a natural modal expression is the most important factor for a machine translation (MT) system. Tense, aspect, mood, negation, and voice are the major constituents related to modal expression. The linguistic encoding of a modal expression is quite different between Chinese and Korean in terms of linguistic typology and genealogy. In this paper, we propose a new, practical categorization of the Korean modality system, viz. tense, aspect, mood, negation, and voice, through a contrastive analysis of Chinese and Korean from the viewpoint of a practical MT system. In order to precisely determine the modal expression, effective feature selection frameworks for Chinese are presented with a variety of machine learning methods. Our proposed approach achieved an accuracy of 83.10%.
International Journal of Computer Processing of Languages | 2003
Dongil Kim; Cui Zheng; Jin-Ji Li; Jong-Hyeok Lee
We propose a new Chinese-to-Korean machine translation system called TOTAL-CK, a prototype transfer-based MT system designed for the large-scale practical domain. TOTAL-CK consists of analysis, transfer, and generation components. In this paper, we mainly discuss the transfer issues resulting from stylistic structural differences between Chinese and Korean. The dependency grammar formalism is employed for Chinese parse trees and their equivalent Korean parse trees. We deal with structural transfer ambiguities that arise when a given source language syntactic pattern can potentially be translated by a number of different target language syntactic patterns. Structural transfer ambiguities are also the major complex transfer problem appearing during dependency transfer, in the SVO pattern that occurs most frequently in Chinese sentences. There are three types of structural transfer ambiguity: transitive transfer, subject-governing transfer, and verb-governing transfer. They are identified by analyzing the ambiguity and the related formalism. Finally, we present a thesaurus-based disambiguation approach for dependency structural transfer, which is adopted in TOTAL-CK. Our approach achieves a precision of 94.6% for SVO sentences and 93.5% for all sentences.
International Conference on the Computer Processing of Oriental Languages | 2006
Xue-Mei Bai; Jin-Ji Li; Dongil Kim; Jong-Hyeok Lee
In general, there are two types of noun phrases (NP): Base Noun Phrase (BNP) and Maximal-Length Noun Phrase (MNP). MNP identification can largely reduce the complexity of full parsing, help analyze the general structure of complex sentences, and provide important clues for detecting main predicates in Chinese sentences. In this paper, we propose a two-phase hybrid approach for MNP identification which adopts salient features such as expanded chunks and classified punctuation to improve performance. Experimental results show a high F1-measure of 89.66%.
International Journal of Computer Processing of Languages | 2004
Dongil Kim; Jin-Ji Li; Jong-Hyeok Lee
Recent MT paradigms, such as example-based MT, pattern-based MT, and hybrid MT, contribute to overcoming the shortcomings of rule-based MT by improving the quality of translations. However, the problem of erroneous selection of examples or patterns also exists in these types of MT. In this paper, we identify a matching abnormality, which we term the Over-Specification Problem, that occurs when an input sentence corresponding to a pattern of general sense is actually matched to patterns of exceptional sense and is thus incorrectly translated. We formally define the problem and propose a solution based on the similarity of dependency trees and the lexical association ratio to detect whether or not the input tree is dragged into the exceptional range of example trees. Finally, promising experimental results are provided. We use a hybrid MT system as the test system and Chinese-to-Korean as the sample language pair.
Meeting of the Association for Computational Linguistics | 2010
Jungi Kim; Jin-Ji Li; Jong-Hyeok Lee
Language Resources and Evaluation | 2008
Jin-Ji Li; Dongil Kim; Jong-Hyeok Lee
NTCIR | 2011
Hwidong Na; Jin-Ji Li; Se-Jong Kim; Jong-Hyeok Lee
Journal of KIISE: Software and Applications | 2010
Han-Kyong Kim; Hwidong Na; Jin-Ji Li; Jong-Hyeok Lee