Publication


Featured research published by Sung-Hae Jun.


International Conference on Neural Information Processing | 2006

A competitive co-evolving support vector clustering

Sung-Hae Jun; Kyung-Whan Oh

The goal of clustering is to partition objects into groups that are internally homogeneous and heterogeneous from one group to another. Clustering is an important tool for a wide range of intelligent systems, and it has been studied extensively in machine learning. Some problems remain, however. One is determining the optimal number of clusters: in the K-means algorithm, the number of clusters K is left to the judgment of the researcher. Another is overfitting of the learning model, from which most learning algorithms for clustering are not free. We therefore propose a competitive co-evolving support vector clustering. Using competitive co-evolutionary computing, we overcome the overfitting problem of support vector clustering, which is a good learning model for clustering. Our method also determines the number of clusters efficiently. To verify the improved performance, we compare competitive co-evolving support vector clustering with established clustering methods using data sets from the UCI machine learning repository.


The International Journal of Fuzzy Logic and Intelligent Systems | 2002

Fuzzy Web Usage Mining for User Modeling

Jae-Sung Jang; Sung-Hae Jun; Kyung-Whan Oh

Interest in data mining that combines artificial intelligence with fuzzy logic has been increasing. Data mining is the process of extracting useful knowledge and interesting patterns from large data sets. With the expansion of the WWW, web data keeps growing. Besides mining web content and web structure, another important web mining task is web usage mining, which mines web log data to discover user access patterns. The goal of web usage mining in this paper is to find interesting user patterns on the web with user feedback. Finding user characteristics is very important in an e-business environment: in Customer Relationship Management, recommending products and sending e-mail to users based on their extracted characteristics are needed. Using our method, we extract user profiles from the results of web usage mining. In this research, we concentrate on finding association rules and verifying their validity. The proposed procedure integrates the fuzzy set concept with association rules. Fuzzy association rule mining takes a given server log file and performs several preprocessing tasks; the extracted transaction files are then used to find rules by fuzzy web usage mining. To verify the validity of the user feedback, we use web log data from our laboratory web server.
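
The abstract does not say how the fuzzy association rules are scored. As a rough illustration only, the sketch below uses one common convention: the fuzzy support of an itemset is the average, over transactions, of the minimum membership degree among its items, and confidence is a ratio of supports. The session data, the page names, and the choice of the minimum operator are all assumptions made for this example, not details taken from the paper.

```python
def fuzzy_support(transactions, items):
    """Average over transactions of the minimum membership across `items`."""
    if not transactions:
        return 0.0
    # A page missing from a session is treated as membership 0.0.
    total = sum(min(t.get(i, 0.0) for i in items) for t in transactions)
    return total / len(transactions)


def fuzzy_confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = fuzzy_support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(transactions, antecedent | consequent) / base


# Hypothetical sessions: page -> membership degree in [0, 1],
# for instance derived from the time a user spent on the page.
sessions = [
    {"home": 1.0, "products": 0.8, "cart": 0.5},
    {"home": 0.6, "products": 0.4},
    {"home": 0.9, "cart": 0.3},
    {"products": 0.7, "cart": 0.7},
]

support = fuzzy_support(sessions, {"products", "cart"})
confidence = fuzzy_confidence(sessions, {"products"}, {"cart"})
print(f"support={support:.3f} confidence={confidence:.3f}")
```

Rules whose support and confidence exceed chosen thresholds would be kept; the thresholds themselves are another parameter the abstract leaves unspecified.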


Journal of Korean Institute of Intelligent Systems | 2004

Cluster Merging Using Enhanced Density based Fuzzy C-Means Clustering Algorithm

Jin-Woo Han; Sung-Hae Jun; Kyung-Whan Oh

Fuzzy set theory has been widely used for clustering in machine learning and data mining since it was introduced in the 1960s. In particular, the fuzzy C-means algorithm remains a popular fuzzy clustering algorithm. With fuzzy C-means, each element is assigned to every cluster with a membership value. Like general clustering algorithms such as K-means, the algorithm is affected by the location of the initial cluster centers and by the choice of cluster size. These initial settings are subjective, so improper results can be obtained in some circumstances. In this paper, we propose cluster merging using an enhanced density-based fuzzy C-means clustering algorithm to solve this problem. Our algorithm determines the initial cluster size and centers from the properties of the training data, using a grid to decide them. In the experiments, objective machine learning data are used to compare the performance of our algorithm with that of others.
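
To make the membership idea concrete, here is a minimal sketch of the membership update from standard fuzzy C-means, not the enhanced density-based variant the paper proposes. The fuzzifier `m`, the function name, and the sample points are assumptions chosen for illustration.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy C-means membership of one point in each cluster."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers is shared among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical centers; the point is twice as close to the first one.
centers = [(0.0, 0.0), (3.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
print(memberships)
```

The memberships of a point always sum to one, and they depend directly on where the centers sit, which is why the initial centers matter as the abstract notes.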


Journal of Korean Institute of Intelligent Systems | 2003

Determination of Optimal Cluster Size Using Bootstrap and Genetic Algorithm

Minjae Park; Sung-Hae Jun; Kyung-Whan Oh

The choice of cluster size affects the result of clustering. In the K-means algorithm, clustering performance varies widely with the initial K. In most clustering processes, however, the initial cluster size is determined by prior knowledge or subjective judgment, and this subjective choice may not be optimal. In this paper, a genetic algorithm based approach is proposed to determine the cluster size automatically and to improve the clustering result. The initial population, based on attribution, is generated to search for the optimal cluster size. The fitness value is defined as the inverse of the dissimilarity summation, so the search converges toward improved overall performance. The mutation operation is used to address the local minima problem. Finally, bootstrap re-sampling is used to reduce the computational time cost.
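
Two ingredients named in the abstract, a fitness equal to the inverse of summed dissimilarity and bootstrap re-sampling, can be sketched as follows. This is only an illustration under stated assumptions: a candidate is represented here as a list of cluster centers, dissimilarity is Euclidean distance to the nearest center, and the data are made up. The paper's chromosome encoding, genetic operators, and dissimilarity measure are not given in the abstract.

```python
import math
import random


def bootstrap_sample(data, size, rng):
    """Draw `size` points from `data` with replacement."""
    return rng.choices(data, k=size)


def fitness(centers, sample):
    """Inverse of the summed distance from each point to its nearest center."""
    total = sum(min(math.dist(p, c) for c in centers) for p in sample)
    return math.inf if total == 0.0 else 1.0 / total


# Hypothetical data with two visible groups.
data = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = bootstrap_sample(data, size=3, rng=rng)

two_centers = [(0.0, 0.5), (5.0, 5.5)]
one_center = [(2.5, 3.0)]
print(fitness(two_centers, data), fitness(one_center, data))
```

Evaluating the fitness on a smaller bootstrap sample instead of the full data is what saves computation. Note that a fitness of this form on its own tends to reward adding more centers; the abstract does not say how the method accounts for that.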


IEEE International Conference on Fuzzy Systems | 2009

A preprocessing of outlier using KERNEL PCA and factor scores in regression model

Kyung-Whan Oh; Sung-Hae Jun; Yongjun Kim

Analyzing data that contain outliers is more difficult than analyzing data without them. In supervised learning such as classification and regression, outliers can increase the misclassification rate and the variance of estimates. In clustering, as unsupervised learning, an outlier can become a cluster of its own, which makes the clustering result hard to represent. Because of these problems, outliers are generally removed before a model is constructed in data mining. But when an outlier carries information about the given data, it must not be removed from the training data set. In this paper, we propose a preprocessing method that uses kernel PCA (principal component analysis) and factor scores to retain outliers in the modeling. The effect of outliers in the given training data set is reduced by the values of kernel PCA and factor scores. We verify the improved performance of our work with experimental results on simulated data sets in a regression model.


Journal of Korean Institute of Intelligent Systems | 2004

A Sparse Data Preprocessing Using Support Vector Regression

Sung-Hae Jun; Jung-Eun Park; Kyung-Whan Oh

In fields such as web mining, bioinformatics, and statistical data analysis, missing values occur in many forms. These values make the training data sparse. Missing values are usually replaced by predicted values based on the mean or mode. More advanced imputation methods can also be used, such as the conditional mean, tree methods, and the Markov Chain Monte Carlo algorithm. But general imputation models share the property that their predictive accuracy decreases as the ratio of missing values in the training data increases. Moreover, the number of available imputations becomes limited as the missing ratio grows. To settle this problem, we propose using statistical learning theory to preprocess missing values, specifically Vapnik's support vector regression. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI machine learning repository.
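
For reference, the mean and mode baseline that the abstract contrasts with can be sketched in a few lines. This shows the simple baseline only, not the support vector regression method the paper proposes; the function name and the use of `None` to mark a missing value are assumptions for this example.

```python
from statistics import mean, mode


def impute_column(values, numeric=True):
    """Fill None entries with the mean (numeric) or mode (categorical)
    of the observed values in the same column."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to estimate from, so the column is returned unchanged.
        return list(values)
    fill = mean(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical columns with missing entries marked as None.
ages = impute_column([1.0, None, 3.0])
pages = impute_column(["a", None, "b", "a"], numeric=False)
print(ages, pages)
```

Every missing entry in a column receives the same fill value here, so the estimate ignores the other attributes of the record; a regression-based imputer predicts each missing value from them instead.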


Journal of Korean Institute of Intelligent Systems | 2004

A Cache Hoarding Method Using Collaborative Filtering in Mobile Computing Environments

Sung-Hae Jun; Sung-Won Jung; Kyung-Whan Oh

In this paper, we propose an efficient cache hoarding method for mobile computing environments using collaborative filtering. The method addresses a difficult problem in mobile computing: gaps in information service caused by low bandwidth, long delay, and frequent network disconnection. Many previous studies have examined cache hoarding approaches to solve these problems for mobile clients. However, approaches based on the history information of a mobile client did not support all of the client's information requests. In our research, we propose a collaborative filtering model that uses both the history information and the location data of the mobile client. The proposed model efficiently serves the items that the client requires. To evaluate the performance of the proposed model, we run an experiment on simulation data using SAS Enterprise Miner. An objective evaluation using the cache hit ratio shows that our model gives good results.


The KIPS Transactions: Part D | 2003

Sparse Web Data Analysis Using MCMC Missing Value Imputation and PCA Plot-based SOM

Sung-Hae Jun; Kyung-Whan Oh

Knowledge discovery from the web has been studied in much research. There are some difficulties in using web logs as training data for efficient predictive models. In this paper, we study a method to eliminate sparseness from web log data and to perform web user clustering. The sparseness of the web data is removed using missing value imputation by Bayesian inference with MCMC. Web user clustering is then performed using self-organizing maps based on a 3-D plot of principal components. Finally, using KDD Cup data, our experimental results show the problem-solving process and the performance evaluation.


Communications for Statistical Applications and Methods | 2003

Web Log Analysis Using Support Vector Regression

Sung-Hae Jun; Min-Taik Lim; Hong-Seok Jorn; Jin-Soo Hwang; Seong-Yong Park; Jee-Yun Kim; Kyung-Whan Oh


Proceedings of the International Conference of the Korean Institute of Intelligent Systems | 2005

Evolving Statistical Learning Theory

Sung-Hae Jun; Kyung-Whan Oh

Collaboration


Dive into Sung-Hae Jun's collaboration.

Top Co-Authors

Jung-Eun Park

Korea Institute of Science and Technology
