Publication


Featured research published by Sung Hyon Myaeng.


IEEE Transactions on Knowledge and Data Engineering | 2006

Some Effective Techniques for Naive Bayes Text Classification

Sang-Bum Kim; Kyoung-Soo Han; Hae-Chang Rim; Sung Hyon Myaeng

While naive Bayes is quite effective in various data mining tasks, it shows disappointing results in automatic text classification. Based on our observation of naive Bayes on natural language text, we found a serious problem in the parameter estimation process, which causes poor results in the text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and a feature weighting method. While these are somewhat ad hoc methods, our proposed naive Bayes text classifier performs very well on the standard benchmark collections, competing with state-of-the-art text classifiers based on a highly complex learning method such as SVM.
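
As a rough illustration of the per-document normalization idea, the sketch below scales each document's term counts to a common length before they enter a multinomial naive Bayes model. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names, the target length of 100, and the add-one smoothing are all illustrative choices, and the feature weighting heuristic is not shown.

```python
import math
from collections import Counter, defaultdict

def normalize_counts(tokens, target_length=100.0):
    """Scale raw term counts so that every document sums to target_length.

    target_length is an arbitrary illustrative constant (assumption)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: c * target_length / total for term, c in counts.items()}

def train(docs):
    """docs is a list of (tokens, label) pairs.

    Returns class priors, per-class accumulated term mass, and the vocabulary."""
    class_docs = Counter()
    term_mass = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        for term, weight in normalize_counts(tokens).items():
            term_mass[label][term] += weight
            vocab.add(term)
    priors = {label: n / len(docs) for label, n in class_docs.items()}
    return priors, term_mass, vocab

def classify(tokens, priors, term_mass, vocab):
    """Return the label with the highest log score (add-one smoothing, an assumption)."""
    weights = normalize_counts(tokens)
    scores = {}
    for label, prior in priors.items():
        total = sum(term_mass[label].values())
        score = math.log(prior)
        for term, weight in weights.items():
            if term not in vocab:
                continue  # ignore terms never seen in training
            p = (term_mass[label].get(term, 0.0) + 1.0) / (total + len(vocab))
            score += weight * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)
```

Because every training document contributes the same total mass, a single very long document cannot dominate a class's parameter estimates, which is the intuition behind normalizing per document.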


Information Processing and Management | 2007

A probabilistic music recommender considering user opinions and audio features

Qing Li; Sung Hyon Myaeng; Byeong Man Kim

A recommender system has an obvious appeal in an environment where the amount of on-line information vastly outstrips any individual's capability to survey. Music recommendation is considered a popular application area. In order to make personalized recommendations, many collaborative music recommender systems (CMRS) focus on capturing precise similarities among users or items based on user historical ratings. Despite the valuable information from audio features of music itself, however, few studies have investigated how to utilize information extracted directly from music for personalized recommendation in CMRS. In this paper, we describe a CMRS based on our proposed item-based probabilistic model, where items are classified into groups and predictions are made for users considering the Gaussian distribution of user ratings. In addition, this model has been extended for improved recommendation performance by utilizing audio features that help alleviate three well-known problems associated with data sparseness in collaborative recommender systems: user bias, non-association, and cold start problems in capturing accurate similarities among items. Experimental results based on two real-world data sets lead us to believe that content information is crucial in achieving better personalized recommendation beyond user ratings. We further show how primitive audio features can be combined into aggregate features for the proposed CMRS and analyze their influences on recommendation performance. Although this model was developed originally for music collaborative recommendation based on audio features, our experiment with the movie data set demonstrates that it can be applied to other domains.


Computer Speech & Language | 2004

Unsupervised word sense disambiguation using WordNet relatives

Hee Cheol Seo; Hoojung Chung; Hae Chang Rim; Sung Hyon Myaeng; Soo Hong Kim

This paper describes a sense disambiguation method for a polysemous target noun using the context words surrounding the target noun and its WordNet relatives, such as synonyms, hypernyms and hyponyms. The result of sense disambiguation is a relative that can substitute for that target noun in a context. The selection is made based on co-occurrence frequency between candidate relatives and each word in the context. Since the co-occurrence frequency is obtainable from a raw corpus, the method is considered to be an unsupervised learning algorithm and therefore does not require a sense-tagged corpus. In a series of experiments using SemCor and the corpus of SENSEVAL-2 lexical sample task, all in English, and using some Korean data, the proposed method was shown to be very promising. In particular, its performance was superior to that of the other approaches evaluated on the same test corpora.
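
The selection step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the relatives and the co-occurrence counts are hypothetical toy data rather than values drawn from WordNet or a real corpus, and summing raw counts over the context words is one simple way to combine them, not necessarily the scoring used in the paper.

```python
def disambiguate(context_words, relatives_by_sense, cooccurrence):
    """Pick the relative whose total co-occurrence with the context is highest.

    relatives_by_sense maps a sense label to candidate relatives
    (synonyms, hypernyms, hyponyms); cooccurrence maps an unordered
    word pair (a frozenset) to a count taken from a raw corpus."""
    best, best_score = None, -1
    for sense, relatives in relatives_by_sense.items():
        for relative in relatives:
            score = sum(
                cooccurrence.get(frozenset((relative, word)), 0)
                for word in context_words
            )
            if score > best_score:
                best, best_score = (sense, relative), score
    return best

# Hypothetical toy data for the target noun "bank".
relatives = {
    "financial": ["depository", "institution"],
    "river": ["riverbank", "slope"],
}
counts = {
    frozenset(("depository", "money")): 5,
    frozenset(("institution", "loan")): 3,
    frozenset(("riverbank", "water")): 7,
}
chosen = disambiguate(["money", "loan"], relatives, counts)
```

No sense-tagged data is involved: the only statistics used are pair counts, which is why the approach counts as unsupervised.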


Information Sciences | 2007

Semantic passage segmentation based on sentence topics for question answering

Hyo-Jung Oh; Sung Hyon Myaeng; Myung-Gil Jang

We propose a semantic passage segmentation method for a Question Answering (QA) system. We define a semantic passage as sentences grouped by semantic coherence, determined by the topic assigned to individual sentences. Topic assignments are done by a sentence classifier based on a statistical classification technique, Maximum Entropy (ME), combined with multiple linguistic features. We ran experiments to evaluate the proposed method and its impact on application tasks, passage retrieval and template-filling for question answering. The experimental result shows that our semantic passage retrieval method using topic matching is more useful than fixed length passage retrieval. With the template-filling task used for information extraction in the QA system, the value of the sentence topic assignment method was reinforced.
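
One simple reading of grouping sentences by topic is to merge adjacent sentences that share a topic label. The sketch below takes that reading as an assumption; the topic labels are supplied directly as hypothetical input, standing in for the output of the Maximum Entropy sentence classifier, which is not reproduced here.

```python
from itertools import groupby

def segment_passages(sentences, topics):
    """Group consecutive sentences that share a topic into one passage.

    topics[i] is the (already assigned) topic label of sentences[i]."""
    if len(sentences) != len(topics):
        raise ValueError("each sentence needs exactly one topic label")
    passages = []
    for topic, group in groupby(zip(topics, sentences), key=lambda pair: pair[0]):
        passages.append((topic, [sentence for _, sentence in group]))
    return passages
```

Unlike fixed-length passages, the resulting passages vary in size with the text, so a question whose expected topic is known can be matched against whole topical units.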


Pacific Rim International Conference on Artificial Intelligence | 2006

A hybrid mood classification approach for blog text

Yuchul Jung; Hogun Park; Sung Hyon Myaeng

As an effort to detect the mood of a blog, regardless of the length and writing style, we propose a hybrid approach to detecting a blog text's mood, which incorporates commonsense knowledge obtained from the general public (ConceptNet) and the Affective Norms for English Words (ANEW) list. Our approach picks up a blog text's unique features and computes simple statistics such as term frequency, n-gram, and point-wise mutual information (PMI) for the SVM classification method. In addition, to catch mood transitions in a given blog text, we developed a paragraph-level segmentation based on a mood flow analysis using a revised version of the GuessMood operation of ConceptNet and an ANEW-based affective sensing module. For evaluation, a mood corpus comprised of real blog texts has been built semi-automatically. Our experiments using the corpus show meaningful results for 4 mood types: happy, sad, angry, and fear.
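
Of the statistics mentioned above, point-wise mutual information is the least self-explanatory, so a small worked sketch may help. It uses the standard definition PMI(t, m) = log2(P(t, m) / (P(t) P(m))) estimated from document counts; the function name and the representation of documents as (term set, mood label) pairs are assumptions made for illustration, and how the paper feeds these values to the SVM is not shown.

```python
import math

def pmi(term, mood, docs):
    """Point-wise mutual information between a term and a mood label.

    docs is a list of (set_of_terms, mood_label) pairs.
    Returns -inf when the term never occurs with the mood."""
    n = len(docs)
    n_term = sum(1 for terms, _ in docs if term in terms)
    n_mood = sum(1 for _, label in docs if label == mood)
    n_joint = sum(1 for terms, label in docs if term in terms and label == mood)
    if n_joint == 0:
        return float("-inf")
    return math.log2((n_joint / n) / ((n_term / n) * (n_mood / n)))
```

A positive value means the term appears with the mood more often than chance would predict, zero means independence, and a negative value means the term tends to avoid that mood.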


Information Processing and Management | 2005

Improving query translation in English-Korean cross-language information retrieval

Hee Cheol Seo; Sang Bum Kim; Hae Chang Rim; Sung Hyon Myaeng

Query translation is a viable method for cross-language information retrieval (CLIR), but it suffers from translation ambiguities caused by multiple translations of individual query terms. Previous research has employed various methods for disambiguation, including the method of selecting an individual target query term from multiple candidates by comparing their statistical associations with the candidate translations of other query terms. This paper proposes a new method where we examine all combinations of target query term translations corresponding to the source query terms, instead of looking at the candidates for each query term and selecting the best one at a time. The goodness value for a combination of target query terms is computed based on the association value between each pair of the terms in the combination. We tested our method using the NTCIR-3 English-Korean CLIR test collection. The results show some improvements regardless of the association measures we used.
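
The combination-based selection can be sketched as an exhaustive search over one translation per source term. This is a minimal sketch under stated assumptions: the goodness value is taken to be the plain sum of pairwise association values, the association table is hypothetical toy data, and the placeholder strings stand in for Korean target terms.

```python
from itertools import combinations, product

def best_translation(candidates, association):
    """Score every combination of candidate translations and keep the best.

    candidates holds one list of target-language candidates per source
    query term; association maps an unordered pair of target terms
    (a frozenset) to an association value."""
    best_combo, best_score = None, float("-inf")
    for combo in product(*candidates):
        # Goodness of a combination: sum over all term pairs (assumption).
        score = sum(
            association.get(frozenset(pair), 0.0)
            for pair in combinations(combo, 2)
        )
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score

# Hypothetical candidates for a two-term source query.
candidates = [["bank_finance", "bank_river"], ["interest_rate", "interest_hobby"]]
association = {
    frozenset(("bank_finance", "interest_rate")): 0.9,
    frozenset(("bank_river", "interest_hobby")): 0.1,
}
combo, score = best_translation(candidates, association)
```

The search space is the product of the candidate list sizes, so the exhaustive form is practical mainly for short queries with a handful of candidates per term.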


Asia Information Retrieval Symposium | 2005

A probabilistic model for music recommendation considering audio features

Qing Li; Sung Hyon Myaeng; Dong Hai Guan; Byeong Man Kim

In order to make personalized recommendations, many collaborative music recommender systems (CMRS) have focused on capturing precise similarities among users or items based on user historical ratings. Despite the valuable information from audio features of music itself, however, few studies have investigated how to directly extract and utilize information from music for personalized recommendation in CMRS. In this paper, we describe a CMRS based on our proposed item-based probabilistic model, where items are classified into groups and predictions are made for users considering the Gaussian distribution of user ratings. By utilizing audio features, this model provides a way to alleviate three well-known challenges in collaborative recommender systems: user bias, non-association, and cold start problems in capturing accurate similarities among items. Experiments on a real-world data set illustrate that the audio information of music is quite useful and that our system can feasibly integrate it for better personalized recommendation.


Information Processing and Management | 2007

Use of place information for improved event tracking

Yun Jin; Sung Hyon Myaeng; Yuchul Jung

The main purpose of topic detection and tracking (TDT) is to detect, group, and organize newspaper articles reporting on the same event. Since an event is a reported occurrence at a specific time and place and the unavoidable consequences, TDT can benefit from an explicit use of time and place information. In this work, we focused on place information, using time information as in the previous research. News articles were analyzed for their characteristics of place information, and a new topic tracking method was proposed to incorporate the analysis results on place information. Experiments show that appropriate use of place information extracted automatically from news articles indeed helps event tracking, which identifies news articles reporting on the same events.


ACM Transactions on Asian Language Information Processing | 2004

Usefulness of temporal information automatically extracted from news articles for topic tracking

Pyung Kim; Sung Hyon Myaeng

Temporal information plays an important role in natural language processing (NLP) applications such as information extraction, discourse analysis, automatic summarization, and question-answering. In the topic detection and tracking (TDT) area, the temporal information often used is the publication date of a message, which is readily available but limited in its usefulness. We developed a relatively simple NLP method for extracting temporal information from Korean news articles, with the goal of improving performance of TDT tasks. To extract temporal information, we make use of finite state automata and a lexicon containing time-revealing vocabulary. Extracted information is converted into a canonicalized representation of a time point or a time duration. We first evaluated and investigated the extraction and canonicalization methods for their accuracy and the extent to which temporal information extracted as such can help TDT tasks. The experimental results show that time information extracted from the text does indeed help to significantly improve both precision and recall.
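
The extract-then-canonicalize pipeline can be illustrated with a small sketch. Several things here are assumptions rather than the paper's method: a regular expression stands in for the finite state automata, a three-word English lexicon stands in for the Korean time-revealing vocabulary, and only time points (not durations) are handled. Relative expressions are resolved against the article's publication date.

```python
import re
from datetime import date, timedelta

# Tiny illustrative lexicon of time-revealing words (assumption).
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}

PATTERN = re.compile(
    r"\b(?:(yesterday|today|tomorrow)|(\d+) days ago|(\d{4})-(\d{2})-(\d{2}))\b",
    re.IGNORECASE,
)

def extract_time_points(text, publication_date):
    """Return canonical dates for the temporal expressions found in text."""
    points = []
    for match in PATTERN.finditer(text):
        word, days_ago, year, month, day = match.groups()
        if word:
            offset = RELATIVE_DAYS[word.lower()]
            points.append(publication_date + timedelta(days=offset))
        elif days_ago:
            points.append(publication_date - timedelta(days=int(days_ago)))
        else:
            try:
                points.append(date(int(year), int(month), int(day)))
            except ValueError:
                continue  # skip strings that look like dates but are not valid
    return points
```

Mapping every expression to the same date type is what makes the extracted information comparable across articles, which the publication date alone cannot offer.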


Information Processing and Management | 2012

A novel term weighting scheme based on discrimination power obtained from past retrieval results

Sa-Kwang Song; Sung Hyon Myaeng

Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on a hypothesis that a term's role in accumulated retrieval sessions in the past affects its general importance regardless. It utilizes the availability of past retrieval results consisting of the queries that contain a particular term, retrieved documents, and their relevance judgments. A term's evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values for the relevant and non-relevant document distributions in the past are different. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results using standard test collections show that the proposed term weighting scheme improves on conventional TF*IDF and language model based schemes. It indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics based on TF*IDF. We also show how the proposed term weighting scheme based on the notion of evidential weights is related to the well-known weighting schemes based on language modeling and probabilistic models.
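
To make the core intuition concrete, the sketch below computes only the gap between a term's mean frequency in past relevant documents and in past non-relevant documents. It is a deliberately simplified stand-in, not the paper's evidential weight: the rankings and similarity values that the abstract says are taken into account are omitted, and the function name and input layout are hypothetical.

```python
from statistics import mean

def mean_frequency_gap(term, past_results):
    """Difference between a term's mean frequency in relevant and
    non-relevant documents retrieved in past sessions.

    past_results is a list of (term_frequencies, is_relevant) pairs,
    one per document retrieved for a past query containing the term."""
    relevant = [tf.get(term, 0) for tf, is_rel in past_results if is_rel]
    non_relevant = [tf.get(term, 0) for tf, is_rel in past_results if not is_rel]
    if not relevant or not non_relevant:
        return 0.0  # no evidence on one side, so no gap can be measured
    return mean(relevant) - mean(non_relevant)
```

A large positive gap suggests the term has historically separated relevant from non-relevant documents, which is evidence that collection statistics such as TF*IDF do not capture.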

Collaboration


Top Co-Authors

Yuchul Jung | Electronics and Telecommunications Research Institute
Yun Jin | Chungnam National University
Hyo-Jung Oh | Electronics and Telecommunications Research Institute
Myung-Gil Jang | Electronics and Telecommunications Research Institute
Qing Li | Southwestern University of Finance and Economics
Byeong Man Kim | Kumoh National Institute of Technology
Chang-Hoo Jeong | Korea Institute of Science and Technology
Gary Geunbae Lee | Pohang University of Science and Technology