Sung-Hae Jun
Cheongju University
Publication
Featured research published by Sung-Hae Jun.
International Conference on Neural Information Processing | 2006
Sung-Hae Jun; Kyung-Whan Oh
The goal of clustering is to partition objects into groups that are internally homogeneous and heterogeneous from group to group. Clustering is an important tool for a wide range of intelligent systems, so it has been studied extensively in machine learning. Some problems nevertheless remain. One is determining the optimal number of clusters: in the K-means algorithm, the number of clusters K is chosen at the researcher's discretion. Another is overfitting of the learning model, from which the majority of clustering algorithms are not free. We therefore propose competitive co-evolving support vector clustering. Using competitive co-evolutionary computing, we overcome the overfitting problem of support vector clustering, which is a good learning model for clustering. Our method also determines the number of clusters efficiently. To verify the improved performance, we compare competitive co-evolving support vector clustering with established clustering methods using data sets from the UCI machine learning repository.
The International Journal of Fuzzy Logic and Intelligent Systems | 2002
Jae-Sung Jang; Sung-Hae Jun; Kyung-Whan Oh
Interest in data mining in artificial intelligence with fuzzy logic has been increasing. Data mining is the process of extracting desirable knowledge and interesting patterns from large data sets. With the expansion of the WWW, the volume of web data keeps growing. Besides mining web content and web structure, another important task in web mining is web usage mining, which mines web log data to discover user access patterns. The goal of web usage mining in this paper is to find interesting user patterns on the web with user feedback. Finding users' characteristics is very important in an e-business environment. In Customer Relationship Management, recommending products and sending e-mail to users according to their extracted characteristics is needed. Using our method, we extract user profiles from the results of web usage mining. In this research, we concentrate on finding association rules and verifying their validity. The proposed procedure integrates the fuzzy set concept with association rules. Fuzzy association rule mining takes a given server log file and performs several preprocessing tasks. The extracted transaction files are then used to find rules by fuzzy web usage mining. To verify the validity of the user feedback, we use the web log data from our laboratory web server.
Journal of Korean Institute of Intelligent Systems | 2004
Jin-Woo Han; Sung-Hae Jun; Kyung-Whan Oh
Fuzzy set theory has been widely used in clustering for machine learning and data mining since it was introduced in the 1960s. In particular, the fuzzy C-means algorithm remains a popular fuzzy clustering algorithm to date. With fuzzy C-means, each element is assigned to every cluster with a membership value. Like general clustering algorithms such as K-means, this algorithm is affected by the location of the initial cluster centers and by the choice of cluster size. This initial setup is subjective, so improper results can be obtained depending on the circumstances. In this paper, we propose cluster merging using an enhanced density-based fuzzy C-means clustering algorithm to solve this problem. Our algorithm determines the initial cluster size and centers from the properties of the training data, using a grid to decide them. For experiments, objective machine learning data are used to compare the performance of our algorithm with others.
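As an illustration of the membership values mentioned in this abstract, the following is a minimal sketch of the textbook fuzzy C-means membership formula, not the authors' enhanced density-based variant. The function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the sample points are assumptions made for the example.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Textbook fuzzy C-means membership of one point in each cluster.

    `m` is the fuzzifier (m > 1); 2.0 is a common default, assumed here.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        flags = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(flags)
        return [f / total for f in flags]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

# Hypothetical data: one point at distance 1 and 2 from two centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
print(u)  # memberships sum to 1; the nearer center gets the larger share
```

The memberships depend on the chosen centers, which is why the initial center placement discussed in the abstract matters.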
Journal of Korean Institute of Intelligent Systems | 2003
Minjae Park; Sung-Hae Jun; Kyung-Whan Oh
Optimal determination of cluster size affects the clustering result. In the K-means algorithm, clustering performance varies greatly with the initial K. In most clustering processes, however, the initial cluster size is determined by prior knowledge or subjective judgment, and this subjective choice may not be optimal. In this paper, a genetic-algorithm-based approach is proposed for automatically determining the cluster size and improving the clustering result. The initial population is generated from the attributes to search for the optimal cluster size. The fitness value is defined as the inverse of the dissimilarity summation, so the search converges toward improved overall performance. The mutation operation is used to address the local minima problem. Finally, bootstrap re-sampling is used to reduce the computational time cost.
IEEE International Conference on Fuzzy Systems | 2009
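The fitness definition in this abstract can be sketched as follows. The abstract does not say how dissimilarity is measured, so this example assumes it is the sum of Euclidean distances from each point to its cluster centroid; the function name `fitness` and the sample data are likewise hypothetical.

```python
import math

def fitness(clusters):
    """Inverse of the summed within-cluster dissimilarity.

    Assumption: dissimilarity is the Euclidean distance from each point
    to the centroid of its cluster. The abstract does not specify this.
    """
    total = 0.0
    for points in clusters:
        if not points:
            continue  # an empty cluster contributes no dissimilarity
        centroid = tuple(sum(coord) / len(points) for coord in zip(*points))
        total += sum(math.dist(p, centroid) for p in points)
    return math.inf if total == 0.0 else 1.0 / total

# Hypothetical data: the same four points grouped two ways.
tight = fitness([[(0, 0), (0, 2)], [(10, 0), (10, 2)]])
loose = fitness([[(0, 0), (0, 2), (10, 0), (10, 2)]])
print(tight, loose)  # the two-cluster grouping scores higher
```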
Kyung-Whan Oh; Sung-Hae Jun; Yongjun Kim
Analyzing data that include outliers is more difficult than analyzing data without them. Outliers can increase the misclassification rate and the variance of estimates in supervised learning such as classification and regression. In clustering, as unsupervised learning, an outlier can also form a cluster of its own, which makes the clustering result hard to represent. Because of these problems, outliers are generally removed before constructing a model in data mining. But when an outlier carries some information about the given data, we must not remove it from the training data set. In this paper, using kernel PCA (principal component analysis) and factor scores, we propose a preprocessing method that keeps outliers in the modeling. The effect of outliers in the given training data set is reduced by the values of kernel PCA and factor scores. We verify the improved performance of our work through experimental results using simulation data sets in a regression model.
Journal of Korean Institute of Intelligent Systems | 2004
Sung-Hae Jun; Jung-Eun Park; Kyung-Whan Oh
In various fields such as web mining, bioinformatics, and statistical data analysis, missing values occur in many different forms. These values make the training data sparse. Typically, missing values are replaced by predicted values using the mean or mode. More advanced imputation methods, such as the conditional mean, tree methods, and Markov Chain Monte Carlo algorithms, can also be used. But general imputation models have the property that their predictive accuracy decreases as the ratio of missing values in the training data increases. Moreover, the number of available imputations is limited as the missing ratio increases. To settle this problem, we propose using statistical learning theory to preprocess missing values. The statistical learning method we use is Vapnik's support vector regression. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI machine learning repository.
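For reference, the baseline that this abstract contrasts against, replacing missing values with the mean or mode, can be sketched as below. This is only the simple baseline, not the support vector regression method the paper proposes; the function name `impute_column`, the use of `None` for a missing value, and the sample columns are assumptions for the example.

```python
from statistics import mean, mode

def impute_column(values, numeric=True):
    """Fill missing entries (None) with the column mean or mode.

    Numeric columns use the mean of the observed values;
    categorical columns use the most frequent observed value.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in values]

# Hypothetical columns with one missing entry each.
ages = impute_column([20, None, 40])
colors = impute_column(["red", "blue", None, "red"], numeric=False)
print(ages, colors)
```

Because the fill value is computed only from observed entries, it becomes less reliable as the share of missing values grows, which is the limitation the abstract describes.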
Journal of Korean Institute of Intelligent Systems | 2004
Sung-Hae Jun; Sung-Won Jung; Kyung-Whan Oh
In this paper, we propose an efficient cache hoarding method for mobile computing environments using collaborative filtering. The method addresses a difficult problem of mobile computing: gaps in information service caused by low bandwidth, long delay, and frequent network disconnection. Many previous studies have examined cache hoarding approaches for solving these problems of mobile clients. But research based on the history information of mobile clients did not support all information requests of mobile clients. In our research, a collaborative filtering model using the history information and location data of mobile clients is proposed. The proposed model supports efficient service of the items a client requires. To evaluate the performance of the proposed model, we run an experiment on simulation data using SAS Enterprise Miner. According to an objective evaluation using the cache hit ratio, we show that our model gives good results.
The KIPS Transactions: Part D | 2003
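The evaluation metric named in this abstract, the cache hit ratio, is the fraction of requests that the hoarded cache can serve. A minimal sketch follows; the function name `cache_hit_ratio` and the sample request trace are hypothetical and are not taken from the paper's experiment.

```python
def cache_hit_ratio(requests, hoarded):
    """Fraction of requested items that were already hoarded in the cache."""
    if not requests:
        return 0.0  # convention assumed here for an empty trace
    hoarded = set(hoarded)
    hits = sum(1 for item in requests if item in hoarded)
    return hits / len(requests)

# Hypothetical trace: four requests, two distinct items hoarded in advance.
ratio = cache_hit_ratio(["a", "b", "c", "a"], {"a", "c"})
print(ratio)  # 3 of the 4 requests are served from the cache
```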
Sung-Hae Jun; Kyung-Whan Oh
Knowledge discovery from the web has been studied in many works. There are some difficulties in using web logs as training data for efficient predictive models. In this paper, we study a method to eliminate sparseness from web log data and to perform web user clustering. The sparseness of web data is removed by missing value imputation using Bayesian inference with MCMC. Web user clustering is then performed using self-organizing maps based on a 3-D plot of principal components. Finally, using KDD Cup data, our experimental results show the problem-solving process and the performance evaluation.
Communications for Statistical Applications and Methods | 2003
Sung-Hae Jun; Min-Taik Lim; Hong-Seok Jorn; Jin-Soo Hwang; Seong-Yong Park; Jee-Yun Kim; Kyung-Whan Oh
Proceedings of the International Conference of the Korean Institute of Intelligent Systems | 2005
Sung-Hae Jun; Kyung-Whan Oh