Sun Park
Chonbuk National University
Publication
Featured research published by Sun Park.
international conference on neural information processing | 2009
Sun Park; Dong Un An; ByungRea Cha; Chul-Won Kim
Document clustering is an important method for document analysis and is used in many information retrieval applications. This paper proposes a new document clustering method that combines clustering based on NMF (non-negative matrix factorization) with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering for two reasons: documents are re-assigned to clusters using a cluster coherence measure based on the similarity between documents, and the semantic feature matrix and the semantic variable matrix used in clustering can better represent the inherent structure of the document set. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
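The NMF clustering step described in this abstract can be illustrated with a minimal sketch. It assumes a terms-by-documents count matrix and uses the standard Lee-Seung multiplicative updates, which the abstract does not name; the function names `nmf` and `cluster_documents` and the toy matrix are hypothetical, and the coherence-based refinement step is not reproduced because the abstract gives no formula for it.

```python
import numpy as np

def nmf(A, r, iters=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix A (terms x docs) as A ~ W @ H.

    W (terms x r) plays the role of the semantic feature matrix and
    H (r x docs) the semantic variable matrix. The multiplicative
    update rule is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r)) + 0.1  # strictly positive start
    H = rng.random((r, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_documents(A, r, **kwargs):
    """Assign each document to the semantic feature with the largest weight."""
    _, H = nmf(np.asarray(A, dtype=float), r, **kwargs)
    return H.argmax(axis=0)

# Toy corpus (hypothetical): documents 0-1 share terms 0-2,
# documents 2-3 share terms 3-5.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

labels = cluster_documents(A, r=2)
print(labels)  # cluster ids are arbitrary; only the grouping matters
```

Taking the arg-max over each column of H is the simplest assignment rule; a refinement pass such as the one the paper proposes would revisit these assignments afterwards.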
IETE Technical Review | 2010
Sun Park; ByungRea Cha; Dong Un An
In this paper, a novel summarization method that uses non-negative matrix factorization (NMF) and a clustering method is introduced to extract meaningful sentences relevant to a given query. The proposed method decomposes a sentence into a linear combination of sparse non-negative semantic features, so that a sentence can be represented as the sum of a few semantic features that are intuitively comprehensible. By using the similarity between the query and the semantic features, it can avoid extracting sentences that are highly similar to the query but meaningless, which improves the quality of document summaries. In addition, the proposed approach uses the clustering method to remove noise and to keep the biased inherent semantics of the documents from being reflected in summaries. The method can ensure the coherence of summaries by using the rank score of sentences with respect to semantic features. The experimental results demonstrate that the proposed method performs better than other methods that use a thesaurus, latent semantic analysis (LSA), K-means, and NMF.
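One way to read the query-to-feature scoring in this abstract is sketched below. It assumes an NMF factorization A ~ W @ H of a terms-by-sentences matrix is already available, measures query relevance by cosine similarity between the query and each semantic feature (column of W), and scores a sentence by its feature weights in H weighted by those similarities. The function `rank_sentences`, the scoring formula, and the toy matrices are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def rank_sentences(W, H, q, eps=1e-12):
    """Rank sentences by relevance to a query through semantic features.

    W: terms x features, H: features x sentences, q: query term vector.
    Returns (order, scores), where order lists sentence indices from
    most to least relevant. The scoring rule is an assumption.
    """
    W = np.asarray(W, dtype=float)
    H = np.asarray(H, dtype=float)
    q = np.asarray(q, dtype=float)
    # Cosine similarity between the query and each semantic feature.
    sims = (W.T @ q) / (np.linalg.norm(W, axis=0) * np.linalg.norm(q) + eps)
    # A sentence scores highly when it loads on query-relevant features.
    scores = sims @ H
    order = np.argsort(-scores)
    return order, scores

# Hand-made toy factors (hypothetical): feature 0 covers terms 0-1,
# feature 1 covers terms 2-3; three sentences.
W = [[1, 0], [1, 0], [0, 1], [0, 1]]
H = [[0.9, 0.1, 0.5],
     [0.0, 0.8, 0.5]]
q = [1, 1, 0, 0]  # query mentions terms 0 and 1 only

order, scores = rank_sentences(W, H, q)
print(order.tolist())  # → [0, 2, 1]
```

Because the score passes through the semantic features rather than raw term overlap, a sentence that shares query terms but loads on no relevant feature receives a low score, which is the behavior the abstract describes.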
Journal of information and communication convergence engineering | 2013
Chul-Won Kim; Sun Park
A classic document clustering technique may incorrectly classify documents into different clusters when documents that should belong to the same cluster do not have any shared terms. Recently, to overcome this problem, internal and external knowledge-based approaches have been used for text document clustering. However, the clustering results of these approaches are influenced by the inherent structure and the topical composition of the documents. Further, the organization of knowledge into an ontology is expensive. In this paper, we propose a new enhanced text document clustering method using non-negative matrix factorization (NMF) and WordNet. The semantic terms extracted as cluster labels by NMF can represent the inherent structure of a document cluster well. The proposed method can also improve the quality of document clustering that uses cluster labels and term weights based on term mutual information of WordNet. The experimental results demonstrate that the proposed method achieves better performance than the other text clustering methods.
Archive | 2013
Sun Park; Jin Gwan Park; Min A Jeong; Jong Geun Jeong; Yeonwoo Lee; Seong Ro Lee
This paper proposes a new document clustering method that uses terms reweighted on the basis of semantic features to enhance document clustering. The proposed method uses sample documents that the user assigns to clusters to reduce the semantic gap between the user's requirements and the machine's clustering results. The method can enhance document clustering because the reweighted terms represent well the inherent structure of the document set relevant to the user's requirements. The experimental results demonstrate that the proposed method achieves better performance than related document clustering methods.
Journal of information and communication convergence engineering | 2013
Chul-Won Kim; Sun Park
Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationship among terms in the data set. To resolve this problem, ontology or matrix factorization approaches are usually used. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that can cover all the concepts mentioned in a collection. This paper proposes a new document clustering method using semantic features and fuzzy relations for solving the problems of ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because the clustered documents use fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can represent the inherent structure of a document set better by using semantic features based on non-negative matrix factorization, which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
The Journal of Korean Institute of Communications and Information Sciences | 2011
Sun Park; Yeonwoo Lee; Min-A Jeong; Seong-Ro Lee
Red tide is a natural phenomenon in which harmful algal blooms change the color of the sea. Many studies of red tide have been carried out because of the increasing damage it causes. However, studies on automatically classifying red tide algae remain insufficient. Recognizing red tide algae is difficult because the algae lack matching center features for recognizing image objects. Previous studies used only a few types of red tide algae for classification. In this paper, we propose a red tide algae recognition method using PCA and the roundness of image objects.
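The abstract does not define roundness, so the sketch below assumes the common shape descriptor 4πA / P² (area A, perimeter P), which equals 1 for a perfect circle and decreases for elongated or irregular outlines. The object contour is taken to be a closed polygon of (x, y) points; the function names are hypothetical, and the PCA part of the method is not sketched here.

```python
import math

def polygon_area_perimeter(points):
    """Area (shoelace formula) and perimeter of a closed polygon."""
    area2 = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter

def roundness(points):
    """Assumed definition: 4 * pi * area / perimeter**2."""
    area, perimeter = polygon_area_perimeter(points)
    return 4.0 * math.pi * area / (perimeter ** 2)

# A unit square is noticeably less round than a finely sampled circle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
          for k in range(360)]

print(roundness(square))  # pi / 4, about 0.785
print(roundness(circle))  # close to 1
```

A scalar like this is independent of position, scale, and rotation, which makes it a convenient feature to combine with PCA-derived features in a classifier.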
international conference on information systems, technology and management | 2010
Sun Park; Dong Un An; ByungRea Cha; Chul-Won Kim
This paper proposes a new document clustering method using semantic features and fuzzy association. The proposed method can improve the quality of document clustering because it clusters documents using fuzzy association values, which distinguish dissimilar documents well, and because the cluster label terms selected by semantic features can better represent the inherent structure of the document set. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
The KIPS Transactions: Part B | 2010
Sun Park; Dongun An
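This paper proposes a new document clustering method using PCA and fuzzy association. The proposed method can better represent the inherent structure of document clusters because it selects the cluster labels and the terms representing each cluster by semantic features based on PCA. It can also improve the quality of document clustering because documents clustered using fuzzy association values are well distinguished from dissimilar documents in clusters. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
Keywords: Document Clustering, Principal Component Analysis, Semantic Features, Fuzzy Association
1. Introduction
In recent information retrieval research, there have been many studies on document categorization, which can process diverse information efficiently in order to satisfy users' requirements. Document categorization sorts large numbers of documents according to the characteristics and topic of each document, and can be divided into document classification, a supervised learning method that requires prior training, and document clustering, an unsupervised learning method that requires no training [4]. Traditional clustering methods can be classified into partition-based, hierarchy-based, density-based, and grid-based methods. Most of these methods use distance-based objective functions and are therefore inefficient for clustering high-dimensional objects. Among them, representative clustering methods are, depending on how the clusters are generated, those that arbitrarily fix k clusters and expand them, the non…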
Journal of information and communication convergence engineering | 2009
Sun Park; Chul-Won Kim; Dong Un An
The Journal of Advanced Navigation Technology | 2010
Sun Park; Kyung-Jun Kim