Publication


Featured research published by Hae Chang Rim.


Empirical Methods in Natural Language Processing | 2008

Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models

Jung Tae Lee; Sang Bum Kim; Young In Song; Hae Chang Rim

Lexical gaps between queries and questions (documents) have been a major issue in question retrieval on large online question and answer (Q&A) collections. Previous studies address the issue by implicitly expanding queries with the help of translation models pre-constructed using statistical techniques. However, since it is possible for unimportant words (e.g., non-topical words, common words) to be included in the translation models, a lack of noise control on the models can cause degradation of retrieval performance. This paper investigates a number of empirical methods for eliminating unimportant words in order to construct compact translation models for retrieval purposes. Experiments conducted on a real world Q&A collection show that substantial improvements in retrieval performance can be achieved by using compact translation models.


Empirical Methods in Natural Language Processing | 2014

Joint Relational Embeddings for Knowledge-based Question Answering

Min Chul Yang; Nan Duan; Ming Zhou; Hae Chang Rim

Transforming a natural language (NL) question into a corresponding logical form (LF) is central to the knowledge-based question answering (KB-QA) task. Unlike most previous methods that achieve this goal based on mappings between lexicalized phrases and logical predicates, this paper goes one step further and proposes a novel embedding-based approach that maps NL questions into LFs for KB-QA by leveraging semantic associations between lexical representations and KB properties in the latent space. Experimental results demonstrate that our proposed method outperforms three KB-QA baseline methods on two publicly released QA data sets.


Meeting of the Association for Computational Linguistics | 2009

The Contribution of Stylistic Information to Content-based Mobile Spam Filtering

Dae Neung Sohn; Jung Tae Lee; Hae Chang Rim

Content-based approaches to detecting mobile spam to date have focused mainly on analyzing the topical aspect of an SMS message (what it is about) but not on the stylistic aspect (how it is written). In this paper, as a preliminary step, we investigate the utility of commonly used stylistic features based on shallow linguistic analysis for learning mobile spam filters. Experimental results show that the use of stylistic information is potentially effective for enhancing the performance of mobile spam filters.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2012

Finding interesting posts in Twitter based on retweet graph analysis

Min Chul Yang; Jung Tae Lee; Seung Wook Lee; Hae Chang Rim

Millions of posts are being generated in real-time by users in social networking services, such as Twitter. However, a considerable number of those posts are mundane posts that are of interest to the authors and possibly their friends only. This paper investigates the problem of automatically discovering valuable posts that may be of potential interest to a wider audience. Specifically, we model the structure of Twitter as a graph consisting of users and posts as nodes and retweet relations between the nodes as edges. We propose a variant of the HITS algorithm for producing a static ranking of posts. Experimental results on real world data demonstrate that our method can achieve better performance than several baseline methods.
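The abstract does not spell out the proposed HITS variant, so the following is only a minimal sketch of plain HITS on a user-post retweet graph, not the authors' method. It assumes, for illustration, that users act as hubs, posts act as authorities, and each edge is a (user, post) retweet pair; the function name `hits_rank_posts` and its parameters are hypothetical.

```python
from math import sqrt


def hits_rank_posts(retweets, iterations=50):
    """Rank posts by HITS authority score.

    retweets: list of (user, post) pairs, meaning `user` retweeted `post`.
    This is standard HITS on a bipartite graph (an assumption), not the
    variant proposed in the paper.
    """
    users = {u for u, _ in retweets}
    posts = {p for _, p in retweets}
    hub = {u: 1.0 for u in users}
    auth = {p: 1.0 for p in posts}

    for _ in range(iterations):
        # Authority update: a post is good if good hubs retweet it.
        new_auth = {p: 0.0 for p in posts}
        for u, p in retweets:
            new_auth[p] += hub[u]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub update: a user is a good hub if they retweet good posts.
        new_hub = {u: 0.0 for u in users}
        for u, p in retweets:
            new_hub[u] += auth[p]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {u: v / norm for u, v in new_hub.items()}

    # Static ranking: posts sorted by descending authority score.
    return sorted(posts, key=lambda p: auth[p], reverse=True)


if __name__ == "__main__":
    edges = [("alice", "p1"), ("bob", "p1"), ("carol", "p1"), ("carol", "p2")]
    print(hits_rank_posts(edges))
```

Because the scores depend only on graph structure, the ranking is static in the sense used above: it can be computed once, independently of any query.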


Information Processing and Management | 2005

Improving query translation in English-Korean cross-language information retrieval

Hee Cheol Seo; Sang Bum Kim; Hae Chang Rim; Sung Hyon Myaeng

Query translation is a viable method for cross-language information retrieval (CLIR), but it suffers from translation ambiguities caused by multiple translations of individual query terms. Previous research has employed various methods for disambiguation, including the method of selecting an individual target query term from multiple candidates by comparing their statistical associations with the candidate translations of other query terms. This paper proposes a new method where we examine all combinations of target query term translations corresponding to the source query terms, instead of looking at the candidates for each query term and selecting the best one at a time. The goodness value for a combination of target query terms is computed based on the association value between each pair of the terms in the combination. We tested our method using the NTCIR-3 English-Korean CLIR test collection. The results show some improvements regardless of the association measures we used.
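The combination-based selection described above can be sketched roughly as follows. This is an illustrative reading, not the paper's implementation: the abstract does not say how pairwise association values are aggregated, so the sketch assumes a simple sum, and the names `best_translation`, `candidates`, and `assoc` are hypothetical.

```python
from itertools import combinations, product


def best_translation(candidates, assoc):
    """Pick the combination of target terms with the highest goodness.

    candidates: one list of candidate target terms per source query term.
    assoc: dict mapping frozenset({t1, t2}) to an association score
           (the association measure itself is left open here).
    """
    def goodness(combo):
        # Assumption: goodness is the sum of pairwise association values.
        return sum(
            assoc.get(frozenset(pair), 0.0)
            for pair in combinations(combo, 2)
        )

    # Enumerate every combination of one candidate per source term.
    return max(product(*candidates), key=goodness)


if __name__ == "__main__":
    # Toy example: "bank interest" with two candidate translations each.
    candidates = [["은행", "둑"], ["이자", "관심"]]
    assoc = {
        frozenset({"은행", "이자"}): 0.9,
        frozenset({"둑", "관심"}): 0.1,
    }
    print(best_translation(candidates, assoc))
```

Exhaustive enumeration grows exponentially with query length, which is tolerable for short queries; longer queries would need pruning or a search heuristic.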


Pattern Recognition Letters | 2012

Content-based mobile spam classification using stylistically motivated features

Dae Neung Sohn; Jung Tae Lee; Kyoung-Soo Han; Hae Chang Rim

The brevity of mobile phone messages makes it difficult to distinguish lexical patterns that identify spam. This paper proposes a novel approach to spam classification of extremely short messages using not only lexical features that reflect the content of a message but also new stylistic features that indicate the manner in which the message is written. Experiments on two mobile phone message collections in two different languages show that the approach outperforms previous content-based approaches significantly, regardless of language.


Meeting of the Association for Computational Linguistics | 2009

Bridging Morpho-Syntactic Gap between Source and Target Sentences for English-Korean Statistical Machine Translation

Gumwon Hong; Seung Wook Lee; Hae Chang Rim

Statistical Machine Translation (SMT) between English and Korean often suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessary function words, or by reordering source sentences. However, the removal of function words can cause a serious loss of information. In this paper, we present a possible method of bridging the morpho-syntactic gap for English-Korean SMT. In particular, the proposed method tries to transform a source sentence by inserting pseudo words, and by reordering the sentence in such a way that both sentences have a similar length and word order. The proposed method achieves a 2.4 increase in BLEU score over the baseline phrase-based system.


Asia Information Retrieval Symposium | 2008

A model for evaluating the quality of user-created documents

Linh Hoang; Jung Tae Lee; Young In Song; Hae Chang Rim

In this paper, we propose a model for evaluating the quality of general user-created documents. The model is based on a supervised classification approach, in which output scores are taken as the quality of a given document. In order to utilize both textual and nontextual attributes of documents, we incorporated a number of objectively measurable, real-valued features selected upon predefined criteria for quality. Experiments on two datasets of real world documents show that textual features are stable indicators for evaluating document quality. Some features are inferred to be effective for general kinds of documents.


IEEE Transactions on Audio, Speech, and Language Processing | 2009

Probabilistic Modeling of Korean Morphology

Do Gil Lee; Hae Chang Rim

This paper proposes new probabilistic models for analyzing Korean morphology. In order to take advantage of the characteristics of Korean morphology, the proposed models are based on three linguistic units: eojeol (a Korean spacing unit), morpheme, and syllable. Unlike previous approaches that are based on rules and dictionaries, the probabilistic approach proposed in this study can automatically acquire complete linguistic knowledge from part-of-speech (POS) tagged corpora. In addition, this approach, without any system modification, is easily applicable to other corpora with different tag sets and annotation guidelines. The three different models and their combinations are evaluated on three corpora over a wide range of conditions. The eojeol-unit and syllable-unit models compensate for the weaknesses of the morpheme-unit model. The eojeol-unit model performed efficiently, and improved the precision. The syllable-unit model improved in precision as well, showing a particularly robust performance in treating unknown words. The proposed approach is also proven to outperform the previous approaches.


Intelligent Information Systems | 2008

A novel retrieval approach reflecting variability of syntactic phrase representation

Young In Song; Kyoung-Soo Han; Sang Bum Kim; So Young Park; Hae Chang Rim

In this paper, we introduce the variability of syntactic phrases and propose a new retrieval approach reflecting the variability of syntactic phrase representation. With a variability measure for a phrase, we can estimate how likely a phrase in a given query would appear in relevant documents and control the impact of syntactic phrases in a retrieval model. Various experimental results over different types of queries and document collections show that our retrieval model based on variability of syntactic phrases is very effective in terms of retrieval performance, especially for long natural language queries.
