Publication


Featured research published by Jungi Kim.


Information Processing and Management | 2007

Cluster-based patent retrieval

In-Su Kang; Seung-Hoon Na; Jungi Kim; Jong-Hyeok Lee

Through the recent NTCIR workshops, patent retrieval has posed many challenging issues to the information retrieval community. Unlike newspaper articles, patent documents are very long and well structured. These characteristics make it necessary to reassess existing retrieval techniques, which have mainly been developed for short, unstructured documents such as newspaper articles. This study investigates cluster-based retrieval in the context of the invalidity search task of patent retrieval. Cluster-based retrieval assumes that clusters provide additional evidence for matching a user's information need. Thus far, cluster-based retrieval approaches have relied on automatically created clusters. Fortunately, all patents carry manually assigned cluster information: International Patent Classification (IPC) codes. The IPC is a standard taxonomy for classifying patents and currently has about 69,000 nodes organized into a five-level hierarchy. Patent documents could therefore provide the best test bed for developing and evaluating cluster-based retrieval techniques. Experiments on the NTCIR-4 patent collection showed that the cluster-based language model can improve on the cluster-less baseline language model.
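The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to build a cluster-based language model: the document model is linearly interpolated with a model of the document's cluster (here, all patents sharing an IPC code) and a collection model. The function names, the weights `lam` and `beta`, and the toy patents are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def cluster_lm_score(query, doc, cluster_docs, collection_docs, lam=0.6, beta=0.3):
    """Log-likelihood of the query under a document model smoothed with
    a cluster model and a collection model (linear interpolation)."""
    doc_tf = Counter(doc)
    cluster_tf = Counter(t for d in cluster_docs for t in d)
    coll_tf = Counter(t for d in collection_docs for t in d)
    doc_len = len(doc)
    cluster_len = sum(cluster_tf.values())
    coll_len = sum(coll_tf.values())

    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / doc_len if doc_len else 0.0
        p_cluster = cluster_tf[term] / cluster_len if cluster_len else 0.0
        p_coll = coll_tf[term] / coll_len if coll_len else 0.0
        p = lam * p_doc + beta * p_cluster + (1.0 - lam - beta) * p_coll
        if p <= 0.0:
            return float("-inf")  # query term unseen in the whole collection
        score += math.log(p)
    return score


# Toy patents: id -> (IPC subclass, tokens). All values are made up.
patents = {
    "p1": ("H04L", ["packet", "routing", "network"]),
    "p2": ("H04L", ["network", "protocol", "encryption"]),
    "p3": ("A61K", ["drug", "compound", "dosage"]),
}
all_docs = [tokens for _, tokens in patents.values()]


def score_patent(pid, query, beta=0.3):
    """Score one patent, using the patents with the same IPC code as its cluster."""
    ipc, tokens = patents[pid]
    same_cluster = [t for code, t in patents.values() if code == ipc]
    return cluster_lm_score(query, tokens, same_cluster, all_docs, beta=beta)
```

For the query `["network", "encryption"]`, patent `p1` never mentions "encryption", but its IPC cluster does, so `score_patent("p1", query)` is higher than the same score computed with `beta=0.0` (no cluster evidence).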


International Joint Conference on Natural Language Processing | 2009

Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis

Jungi Kim; Jin-Ji Li; Jong-Hyeok Lee

This paper describes an approach to utilizing term weights for sentiment analysis tasks and shows how various term weighting schemes improve the performance of sentiment analysis systems. Previously, sentiment analysis was mostly studied under data-driven and lexicon-based frameworks. Such work generally exploits textual features for fact-based analysis tasks or lexical indicators from a sentiment lexicon. We propose incorporating term weighting into a sentiment analysis system, using collection statistics, contextual and topic-related characteristics, and opinion-related properties. Experiments carried out on various datasets show that our approach improves on previous methods.


Workshop on Statistical Machine Translation | 2009

Chinese Syntactic Reordering for Adequate Generation of Korean Verbal Phrases in Chinese-to-Korean SMT

Jin-Ji Li; Jungi Kim; Dongil Kim; Jong-Hyeok Lee

Chinese and Korean belong to different language families in terms of word-order and morphological typology. Chinese is an SVO, morphologically poor language, while Korean is an SOV, morphologically rich one. In Chinese-to-Korean SMT systems, systematic differences between the verbal systems of the two languages make the generation of Korean verbal phrases difficult. To resolve these difficulties, we address two issues in this paper. The first is that the verb position differs from the viewpoint of word-order typology. The second is the difficulty of generating the complex morphology of Korean verbs from the viewpoint of morphological typology. We propose a Chinese syntactic reordering that is better at generating Korean verbal phrases in Chinese-to-Korean SMT. Specifically, we consider reordering rules targeting Chinese verb phrases (VPs), prepositional phrases (PPs), and modality-bearing words that are closely related to Korean verbal phrases. We verify our system with two corpora from different domains. Our proposed approach significantly improves the performance of our system over a baseline phrase-based SMT system. The relative improvements on the two corpora are +9.32% and +5.43%, respectively.
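The paper's actual rule set is not listed in the abstract. As a rough illustration of what a VP-targeting reordering rule can look like, the sketch below moves the verb of each Chinese VP to the end of that VP, so an SVO clause comes out in SOV order. The tuple-based tree format, the single rule, and the example sentence are assumptions made for this example; the verb tags follow the Penn Chinese Treebank convention.

```python
VERB_TAGS = {"VV", "VC", "VE", "VA"}  # Penn Chinese Treebank verb tags


def is_verb(node):
    """True for a preterminal node such as ("VV", "吃")."""
    return isinstance(node, tuple) and node[0] in VERB_TAGS


def reorder_vp(tree):
    """Recursively move the verb children of every VP to the end of that VP."""
    if isinstance(tree, str):
        return tree  # a word, nothing to reorder
    label, *children = tree
    children = [reorder_vp(child) for child in children]
    if label == "VP":
        non_verbs = [c for c in children if not is_verb(c)]
        verbs = [c for c in children if is_verb(c)]
        children = non_verbs + verbs
    return (label, *children)


def words(tree):
    """Left-to-right list of the words under a tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


# "I eat apples": subject, verb, object in the Chinese source.
sentence = ("IP",
            ("NP", ("PN", "我")),
            ("VP", ("VV", "吃"), ("NP", ("NN", "苹果"))))

print(words(sentence))              # ['我', '吃', '苹果']
print(words(reorder_vp(sentence)))  # ['我', '苹果', '吃']
```

The reordered source is then closer to Korean word order before phrase-based translation is applied; rules for PPs and modality-bearing words would need their own handling.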


International Conference on the Computer Processing of Oriental Languages | 2009

Found in Translation: Conveying Subjectivity of a Lexicon of One Language into Another Using a Bilingual Dictionary and a Link Analysis Algorithm

Jungi Kim; Hun-Young Jung; Sang-Hyeob Nam; Yeha Lee; Jong-Hyeok Lee

This paper proposes a method that automatically creates a subjectivity lexicon in a new language from a subjectivity lexicon in a resource-rich language, using only a bilingual dictionary. We resolve some of the difficulties in selecting appropriate senses when translating a lexicon, and present a framework that sequentially applies an iterative link analysis algorithm to enhance the quality of the lexicons of both the source and target languages. Experimental results show that the method improves the subjectivity lexicon in the source language and creates a good-quality lexicon in the new language.
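The abstract does not say which link analysis algorithm is used or how it is set up. The sketch below shows one generic possibility: a HITS-style mutual reinforcement over the bipartite graph formed by a bilingual dictionary, with the source-language scores kept anchored to their seed values. The function names, the anchoring weight `alpha`, and the toy English-Korean dictionary are all assumptions for illustration only.

```python
import math


def _normalize(scores):
    """Scale scores to unit L2 norm (all zeros stay zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def propagate_subjectivity(dictionary, seed, iterations=10, alpha=0.5):
    """dictionary: source word -> list of target-language translations.
    seed: source word -> initial subjectivity score.
    Returns (source_scores, target_scores)."""
    source = {w: seed.get(w, 0.0) for w in dictionary}
    target = {}
    for _ in range(iterations):
        # Target words collect the scores of the source words that translate to them.
        target = {}
        for word, translations in dictionary.items():
            for t in translations:
                target[t] = target.get(t, 0.0) + source[word]
        target = _normalize(target)
        # Source words are re-scored from their translations, then blended with the seed.
        propagated = _normalize(
            {w: sum(target[t] for t in ts) for w, ts in dictionary.items()}
        )
        source = {
            w: alpha * seed.get(w, 0.0) + (1.0 - alpha) * propagated[w]
            for w in dictionary
        }
    return source, target


# Hypothetical toy dictionary and seed lexicon.
toy_dictionary = {
    "good": ["좋다", "착하다"],
    "excellent": ["훌륭하다", "좋다"],
    "table": ["탁자", "표"],
}
toy_seed = {"good": 1.0, "excellent": 1.0, "table": 0.0}
source_scores, target_scores = propagate_subjectivity(toy_dictionary, toy_seed)
```

In this toy run, the target word reached from two subjective source words ("좋다") ends up with a higher score than a word reached from only one, and the translations of the neutral word "table" stay at zero.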


International Journal of Computer Processing of Languages | 2009

Conveying Subjectivity of a Lexicon of One Language into Another Using a Bilingual Dictionary and a Link Analysis Algorithm

Jungi Kim; Hun-Young Jung; Yeha Lee; Jong-Hyeok Lee

This paper proposes a method that automatically creates a sentiment lexicon in a new language from a sentiment lexicon in a resource-rich language, using only a bilingual dictionary. We resolve some of the difficulties in selecting appropriate senses when translating a lexicon, and present a framework that sequentially applies an iterative link analysis algorithm to enhance the quality of the lexicons of both the source and target languages. Experimental results show that the method improves the sentiment lexicon in the source language and creates a good-quality lexicon in the new language.


International Conference on the Computer Processing of Oriental Languages | 2006

Cluster-based patent retrieval using international patent classification system

Jungi Kim; In-Su Kang; Jong-Hyeok Lee

A patent collection provides a great test bed for cluster-based information retrieval. The International Patent Classification (IPC) system provides a hierarchical taxonomy with five levels of specificity. We regard the IPC codes of patent applications as cluster information, manually assigned by patent officers according to subject. Such manual clusters have advantages over clusters built automatically from document term similarities. Previous studies have successfully applied cluster-based retrieval models using language modeling. We develop cluster-based language models that exploit the advantages of manually clustered documents.
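To make the five levels concrete, here is a small sketch that splits an IPC symbol into its section, class, subclass, main group, and subgroup, so that each prefix can serve as a cluster key at a different level of specificity. The regular expression and the function name `ipc_levels` are assumptions for this example and only cover plainly formatted symbols such as `H04L 12/56`.

```python
import re

# Section letter, two-digit class, subclass letter, main group, subgroup.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(code):
    """Return the five hierarchy keys for an IPC symbol, most general first."""
    match = IPC_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized IPC symbol: {code!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    return [
        section,                          # level 1: section
        section + cls,                    # level 2: class
        prefix,                           # level 3: subclass
        f"{prefix} {group}",              # level 4: main group
        f"{prefix} {group}/{subgroup}",   # level 5: subgroup
    ]


print(ipc_levels("H04L 12/56"))
# ['H', 'H04', 'H04L', 'H04L 12', 'H04L 12/56']
```

Grouping patents by a shorter prefix gives larger, more general clusters; grouping by the full symbol gives the most specific ones.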


International Conference on the Computer Processing of Oriental Languages | 2009

Extracting Domain-Dependent Semantic Orientations of Latent Variables for Sentiment Classification

Yeha Lee; Jungi Kim; Jong-Hyeok Lee

Sentiment analysis of weblogs is a challenging problem. Most previous work used the semantic orientations of words or phrases to classify the sentiments of weblogs. The problem with this approach is that these semantic orientations are determined without considering the domain of the weblog. Weblogs contain their authors' various opinions about multifaceted topics, so semantic orientation has to be treated as domain-dependent. In this paper, we present an unsupervised learning model based on the aspect model to classify the sentiments of weblogs. Our model uses domain-dependent semantic orientations of latent variables, instead of words or phrases, to classify the sentiments of weblogs. Experiments on several domains confirm that our model assigns domain-dependent semantic orientations to latent variables correctly and classifies the sentiments of weblogs effectively.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2008

Exploiting proximity feature in bigram language model for information retrieval

Seung-Hoon Na; Jungi Kim; In-Su Kang; Jong-Hyeok Lee

Language modeling approaches have dealt effectively with dependencies among query terms by using N-gram models such as bigram or trigram models. However, bigram language models suffer from the adjacency-sparseness problem: dependent terms are not always adjacent in a document and can be far apart, sometimes separated by a few sentences. To resolve this problem, this paper proposes a new type of bigram language model that explicitly incorporates a proximity feature between two adjacent query terms. Experimental results on three test collections show that the proposed bigram language model significantly improves on the previous bigram model as well as on Tao's approach, a state-of-the-art proximity-based method.
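The abstract does not state how proximity enters the model. The sketch below illustrates the general idea with an assumed exponential decay: each occurrence of the first query term contributes a weight that shrinks with the number of words separating it from the next occurrence of the second term, instead of counting only strictly adjacent pairs. The function name and the `decay` parameter are hypothetical.

```python
def proximity_bigram_prob(doc, first, second, decay=0.5):
    """Estimate P(second | first, doc) with proximity-weighted bigram counts.

    A strictly adjacent pair contributes 1.0; a pair separated by `gap`
    intervening words contributes decay ** gap. With decay=0.0 this reduces
    to the ordinary maximum-likelihood bigram estimate.
    """
    first_positions = [i for i, tok in enumerate(doc) if tok == first]
    second_positions = [i for i, tok in enumerate(doc) if tok == second]
    if not first_positions:
        return 0.0
    total = 0.0
    for i in first_positions:
        following = [j for j in second_positions if j > i]
        if following:
            gap = following[0] - i - 1  # words between the two terms
            total += decay ** gap
    return total / len(first_positions)


adjacent = ["information", "retrieval", "models"]
separated = ["information", "about", "patent", "retrieval"]

print(proximity_bigram_prob(adjacent, "information", "retrieval"))              # 1.0
print(proximity_bigram_prob(separated, "information", "retrieval"))             # 0.25
print(proximity_bigram_prob(separated, "information", "retrieval", decay=0.0))  # 0.0
```

The second document would receive no credit from a strict bigram model, which is the adjacency-sparseness problem described above; the proximity-weighted count still rewards it, though less than an adjacent match.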


International Conference on the Computer Processing of Oriental Languages | 2009

Partially Supervised Phrase-Level Sentiment Classification

Sang-Hyob Nam; Seung-Hoon Na; Jungi Kim; Yeha Lee; Jong-Hyeok Lee

This paper presents a new partially supervised approach to phrase-level sentiment analysis that first automatically constructs a polarity-tagged corpus and then learns sequential sentiment tags from that corpus. The approach uses only sentiment sentences, which are readily available on the Internet, and does not require a manually built polarity-tagged corpus, which is hard to construct. With this approach, the system is able to classify phrase-level sentiment automatically. The results show that a system can learn sentiment expressions without a manually built polarity-tagged corpus.


Text Retrieval Conference | 2008

KLE at TREC 2008 Blog Track: Blog Post and Feed Retrieval

Yeha Lee; Seung-Hoon Na; Jungi Kim; Sang-Hyob Nam; Hun-Young Jung; Jong-Hyeok Lee

Collaboration


Dive into Jungi Kim's collaboration.

Top Co-Authors

Jong-Hyeok Lee

Pohang University of Science and Technology

Yeha Lee

Pohang University of Science and Technology

Seung-Hoon Na

Electronics and Telecommunications Research Institute

Hun-Young Jung

Pohang University of Science and Technology

In-Su Kang

Pohang University of Science and Technology

Jin-Ji Li

Pohang University of Science and Technology

Sang-Hyob Nam

Pohang University of Science and Technology

Sang-Hyeob Nam

Pohang University of Science and Technology

Dongil Kim

Seoul National University

Yong-Hun Lee

Pohang University of Science and Technology
