Sun Park

Chonbuk National University

Publication

Featured research published by Sun Park.

International Conference on Neural Information Processing | 2009

Document Clustering with Cluster Refinement and Non-negative Matrix Factorization

Sun Park; Dong Un An; ByungRea Cha; Chul-Won Kim

Document clustering is an important method for document analysis and is used in many information retrieval applications. This paper proposes a new document clustering method that combines clustering based on non-negative matrix factorization (NMF) with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering for two reasons: documents are re-assigned to clusters using cluster coherence based on the similarity between documents, and the semantic feature matrix and the semantic variable matrix used in clustering better represent the inherent structure of the document set. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
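The NMF clustering step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a term-by-document matrix A, the standard Lee-Seung multiplicative updates, and the common rule of assigning each document to the semantic feature with the largest weight. The coherence-based refinement step is not shown. The names `nmf` and `cluster_documents`, the iteration count, and the toy matrix are all hypothetical.

```python
import math
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def nmf(A, k, iters=500, seed=0, eps=1e-9):
    """Approximate a nonnegative m x n matrix A by W (m x k) times H (k x n)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Strictly positive random start (an assumption; other initializations exist).
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return W, H


def cluster_documents(A, k):
    """Label each document (column of A) with the index of its dominant feature."""
    W, H = nmf(A, k)
    # Scale each row of H by the norm of the matching column of W so that
    # weights are comparable across features before taking the maximum.
    norms = [math.sqrt(sum(W[i][c] ** 2 for i in range(len(W)))) for c in range(k)]
    n = len(A[0])
    return [max(range(k), key=lambda c: H[c][j] * norms[c]) for j in range(n)]


# Toy term-by-document matrix: documents 0 and 1 share terms 0 and 1,
# documents 2 and 3 share terms 2 and 3.
toy = [[2, 4, 0, 0],
       [1, 2, 0, 0],
       [0, 0, 3, 3],
       [0, 0, 1, 1]]
labels = cluster_documents(toy, 2)
print(labels)
```

In this sketch the columns of W play the role of semantic features and the rows of H the role of semantic variables; which label number each group receives depends on the random start.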

IETE Technical Review | 2010

Automatic Multi-document Summarization Based on Clustering and Nonnegative Matrix Factorization

Sun Park; ByungRea Cha; Dong Un An

In this paper, a novel summarization method that uses nonnegative matrix factorization (NMF) and the clustering method is introduced to extract meaningful sentences relevant to a given query. The proposed method decomposes a sentence into the linear combination of sparse nonnegative semantic features so that it can represent a sentence as the sum of a few semantic features that are comprehensible intuitively. It can improve the quality of document summaries because it can avoid extracting those sentences whose similarities with the query are high but that are meaningless by using the similarity between the query and the semantic features. In addition, the proposed approach uses the clustering method to remove noise and avoid the biased inherent semantics of the documents being reflected in summaries. The method can ensure the coherence of summaries by using the rank score of sentences with respect to semantic features. The experimental results demonstrate that the proposed method has better performance than other methods that use the thesaurus, the latent semantic analysis (LSA), the K-means, and the NMF.

Journal of information and communication convergence engineering | 2013

Enhancing Text Document Clustering Using Non-negative Matrix Factorization and WordNet

Chul-Won Kim; Sun Park

A classic document clustering technique may incorrectly classify documents into different clusters when documents that should belong to the same cluster do not have any shared terms. Recently, to overcome this problem, internal and external knowledge-based approaches have been used for text document clustering. However, the clustering results of these approaches are influenced by the inherent structure and the topical composition of the documents. Further, the organization of knowledge into an ontology is expensive. In this paper, we propose a new enhanced text document clustering method using non-negative matrix factorization (NMF) and WordNet. The semantic terms extracted as cluster labels by NMF can represent the inherent structure of a document cluster well. The proposed method can also improve the quality of document clustering that uses cluster labels and term weights based on term mutual information of WordNet. The experimental results demonstrate that the proposed method achieves better performance than the other text clustering methods.

Archive | 2013

Enhancing Document Clustering Using Reweighting Terms Based on Semantic Features

Sun Park; Jin Gwan Park; Min A Jeong; Jong Geun Jeong; Yeonwoo Lee; Seong Ro Lee

This paper proposes a new document clustering method that enhances clustering by reweighting terms on the basis of semantic features. The proposed method uses sample documents supplied by the user for each cluster to reduce the semantic gap between the user’s requirements and the machine’s clustering results. The method can enhance document clustering because the reweighted terms represent well the inherent structure of the document set relevant to the user’s requirements. The experimental results demonstrate that the proposed method achieves better performance than related document clustering methods.

Journal of information and communication convergence engineering | 2013

Document Clustering Using Semantic Features and Fuzzy Relations

Chul-Won Kim; Sun Park

Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationship among terms in the data set. To resolve this problem, ontology or matrix factorization approaches are usually used. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that can cover all the concepts mentioned in a collection. This paper proposes a new document clustering method using semantic features and fuzzy relations for solving the problems of ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because the clustered documents use fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can represent the inherent structure of a document set better by using semantic features based on non-negative matrix factorization, which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

The Journal of Korean Institute of Communications and Information Sciences | 2011

Red Tide Algae Recognition using PCA and Roundness

Sun Park; Yeonwoo Lee; Min-A Jeong; Seong-Ro Lee

Red tide is a natural phenomenon in which harmful algal blooms change the color of the sea. There have been many studies on red tide because of the increasing damage it causes. However, research on automatically classifying red tide algae remains insufficient. Recognizing red tide algae is difficult because they lack matching center features for recognizing algae image objects. Previous studies have used only a few types of red tide algae for classification. In this paper, we propose a red tide algae recognition method using PCA and the roundness of image objects.
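The abstract does not say how roundness is computed. As an illustration only, the sketch below assumes one widely used definition, 4πA/P² (area A, perimeter P), which equals 1 for a circle and decreases for elongated shapes. The object outline is assumed to be available as a polygon of (x, y) points; the function names are hypothetical.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths, closing the outline back to the first point."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def roundness(points):
    """Assumed definition: 4 * pi * area / perimeter**2."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


# Example outlines: a near-circle, a unit square, and a 10 x 1 rectangle.
circle = [(math.cos(2 * math.pi * i / 360), math.sin(2 * math.pi * i / 360))
          for i in range(360)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bar = [(0, 0), (10, 0), (10, 1), (0, 1)]
for name, shape in (("circle", circle), ("square", square), ("bar", bar)):
    print(name, round(roundness(shape), 3))
```

A scalar of this kind could serve as one shape feature alongside PCA-derived features, though how the paper combines the two is not stated here.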

International Conference on Information Systems, Technology and Management | 2010

Document Clustering with Semantic Features and Fuzzy Association

Sun Park; Dong Un An; ByungRea Cha; Chul-Won Kim

This paper proposes a new document clustering method using semantic features and fuzzy association. The proposed method can improve the quality of document clustering for two reasons: documents are clustered using fuzzy association values, which distinguish dissimilar documents well, and the cluster label terms selected by the semantic features used in document clustering better represent the inherent structure of the document set. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

The KIPS Transactions: Part B | 2010

Document Clustering Method using PCA and Fuzzy Association

Sun Park; Dongun An

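This paper proposes a new document clustering method using PCA and fuzzy association. The proposed method can better represent the inherent structure of document clusters because it selects the cluster labels and the terms representing each cluster using semantic features based on PCA. It can also improve the quality of document clustering because documents are clustered using fuzzy association values, which distinguish dissimilar documents in clusters well. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

Keywords: Document Clustering, Principal Component Analysis, Semantic Features, Fuzzy Association

1. Introduction

In recent information retrieval research, there have been many studies on document categorization, which can efficiently process diverse information in order to satisfy user requirements. Document categorization sorts large volumes of documents according to the characteristics and topic of each document. It can be divided into document classification, a supervised learning approach that requires prior training, and document clustering, an unsupervised learning approach that requires no training [4]. Traditional clustering methods can be classified into partitioning-based, hierarchical, density-based, and grid-based methods. Most of these methods use a distance-based objective function and are therefore inefficient at clustering high-dimensional objects. Among these, the representative clustering methods, distinguished by how they form clusters, include those that arbitrarily fix k clusters and then expand them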

Journal of information and communication convergence engineering | 2009