Nojun Kwak
Seoul National University
Publication
Featured research published by Nojun Kwak.
IEEE Transactions on Neural Networks | 2002
Nojun Kwak; Chong-Ho Choi
Feature selection plays an important role in classifying systems such as neural networks (NNs). Datasets typically contain attributes that are relevant, irrelevant, or redundant, and since datasets can be huge, reducing their dimensionality by keeping only the relevant attributes is desirable: it promises higher performance at lower computational cost. In this paper, we propose two feature selection algorithms. The limitation of the mutual information feature selector (MIFS) is analyzed, and a method to overcome it is studied. The first proposed algorithm makes more considered use of the mutual information between input attributes and output classes than MIFS does. We show that it matches the performance of the ideal greedy selection algorithm when information is distributed uniformly, at a computational load nearly the same as that of MIFS. The second algorithm uses the Taguchi method, addressing the question of how to identify good features with as few experiments as possible. The proposed algorithms are applied to several classification problems and compared with MIFS. The two algorithms can also be combined to complement each other's limitations; the combined algorithm performed well in several experiments and should prove useful for selecting features in classification problems.
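For readers who want a concrete picture of the general scheme, here is a minimal sketch of a MIFS-style greedy selector. It illustrates the family of methods the paper builds on, not the paper's exact algorithm: the function name `mifs_select`, the redundancy weight `beta`, and the use of scikit-learn's `mutual_info_score` (which assumes discrete or pre-binned feature values) are all choices made for this sketch.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mifs_select(X, y, k, beta=0.5):
    """Greedy MIFS-style selection: at each step pick the feature that
    maximizes I(feature; class) minus a redundancy penalty with the
    already-selected features. X is (n_samples, n_features) holding
    discrete/binned values; y holds the class labels."""
    n_features = X.shape[1]
    selected, remaining = [], list(range(n_features))
    # relevance of each candidate feature to the class variable
    relevance = [mutual_info_score(X[:, j], y) for j in range(n_features)]
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in remaining:
            # penalize redundancy with features chosen so far
            redundancy = sum(mutual_info_score(X[:, j], X[:, s]) for s in selected)
            score = relevance[j] - beta * redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```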
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002
Nojun Kwak; Chong-Ho Choi
Mutual information is a good indicator of relevance between variables and has been used as a measure in several feature selection algorithms. However, mutual information is difficult to compute, and the performance of a feature selection algorithm depends on how accurately it is estimated. In this paper, we propose a new method of calculating the mutual information between the input and class variables based on the Parzen window, and we apply it to a feature selection algorithm for classification problems.
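A rough sketch of the underlying idea, estimating I(X; C) = H(C) - H(C|X) with the class posterior obtained from per-class Parzen (kernel density) sums. This is an illustrative reading of the approach, not the paper's implementation; the fixed Gaussian bandwidth `h` and the leave-self-in posterior estimate are simplifications made for brevity.

```python
import numpy as np

def parzen_mutual_info(X, y, h=0.5):
    """Parzen-window estimate of I(X; C) = H(C) - H(C|X): p(c|x) is
    estimated with a Gaussian kernel of bandwidth h placed on every
    training sample."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    n = X.shape[0]
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n
    h_class = -np.sum(priors * np.log(priors))          # H(C)

    # pairwise Gaussian kernel values K(x_i, x_j)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * h ** 2))

    # class posterior p(c | x_i) from class-conditional kernel sums
    post = np.stack([K[:, y == c].sum(axis=1) for c in classes], axis=1)
    post /= post.sum(axis=1, keepdims=True)

    h_cond = -np.mean(np.sum(post * np.log(post + 1e-12), axis=1))  # H(C|X)
    return h_class - h_cond
```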
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008
Nojun Kwak
A method of principal component analysis (PCA) based on a new L1-norm optimization technique is proposed. Unlike conventional PCA, which is based on the L2-norm, the proposed method is robust to outliers because the L1-norm is less sensitive to them. It is also invariant to rotations. The proposed L1-norm optimization technique is intuitive, simple, and easy to implement, and it is proven to find a locally maximal solution. The method is applied to several datasets, and its performance is compared with that of other conventional methods.
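The flavor of the technique can be captured in a few lines. The sketch below is an L1-PCA-style fixed-point iteration for the first component, assuming centered data and a random initialization; it should be read as an illustration of the sign-flipping idea, not as the paper's verbatim algorithm.

```python
import numpy as np

def pca_l1_first_component(X, n_iter=100, tol=1e-8):
    """Maximize sum_i |w^T x_i| subject to ||w||_2 = 1 by repeatedly
    setting w proportional to sum_i sign(w^T x_i) * x_i.
    X is (n_samples, n_dims), assumed centered."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0                      # avoid stalling on zero projections
        w_new = X.T @ s
        w_new /= np.linalg.norm(w_new)
        if np.linalg.norm(w_new - w) < tol:  # converged to a fixed point
            return w_new
        w = w_new
    return w
```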
Pattern Recognition | 2008
Nojun Kwak
This study investigates a new feature extraction method for classification problems. The method is based on independent component analysis (ICA); however, unlike the original ICA, which is unsupervised, it is adapted to classification problems by utilizing class information. The proposed method extends our previous work on binary-class problems to multi-class problems. It treats the class labels as input features in order to produce two sets of new features: one that carries much information about the class labels and one that is irrelevant to the class. The learning rule is obtained by the stochastic gradient method, maximizing the likelihood of the observed data. By using only the class-relevant features, the dimension of the feature space can be greatly reduced in line with the principle of parsimony, resulting in better generalization. The method was applied to recognizing face identities and facial expressions using databases such as the Yale, AT&T (formerly ORL), and Color FERET face databases. Its performance was compared with that of conventional methods such as principal component analysis (PCA) and Fisher's linear discriminant (FLD). The experimental results show that the proposed method performs well on face recognition problems.
IEEE Transactions on Knowledge and Data Engineering | 2003
Nojun Kwak; Chong-Ho Choi
In manipulating data, as in supervised learning, we often extract new features from the original ones to reduce the dimension of the feature space and achieve better performance. In this paper, we show how standard algorithms for independent component analysis (ICA) can be augmented with binary class labels to produce some features that carry no information about the class labels (these features are discarded) and some that do. We also provide a local stability analysis of the proposed algorithm. The advantage is that general ICA algorithms become applicable to feature extraction for classification problems, by maximizing the joint mutual information between class labels and new features, although only for two-class problems. Using the new features, we can greatly reduce the dimension of the feature space without degrading the performance of classifying systems.
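As an illustration of the general recipe (append the class label to the input, run a standard ICA, keep the label-relevant components), here is a hedged sketch built on scikit-learn's `FastICA`. It is not the paper's algorithm, which modifies the ICA learning rule itself; selecting components by the magnitude of their unmixing weight on the label column is a simplification made for this sketch.

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_label_features(X, y, n_keep):
    """Append the binary class label as an extra input dimension, run
    standard ICA, and keep the components whose unmixing weights load
    most heavily on the label column (the label-relevant features)."""
    Xa = np.column_stack([X, np.asarray(y, dtype=float)])   # augmented data
    ica = FastICA(n_components=Xa.shape[1], random_state=0, max_iter=500)
    S = ica.fit_transform(Xa)                # independent components, (n, d+1)
    W = ica.components_                      # unmixing matrix
    label_load = np.abs(W[:, -1])            # each component's weight on the label
    keep = np.argsort(label_load)[::-1][:n_keep]
    return S[:, keep], keep                  # label-relevant features + indices
```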
international symposium on neural networks | 1999
Nojun Kwak; Chong-Ho Choi
In classification problems, we use a set of attributes that are relevant, irrelevant, or redundant. By selecting only the relevant attributes of the data as input features of a classifying system and excluding redundant ones, higher performance is expected with less computational effort. We propose a feature selection algorithm that makes more careful use of the mutual information between input attributes and other variables than the mutual information feature selector (MIFS) does. The proposed algorithm is applied to several feature selection problems and compared with MIFS. Experimental results show that the proposed algorithm is well suited to feature selection problems.
IEEE Transactions on Systems, Man, and Cybernetics | 2014
Nojun Kwak
This paper proposes several principal component analysis (PCA) methods based on Lp-norm optimization techniques. The objective function is defined using the Lp-norm with an arbitrary p value, and its gradient is computed using the fact that the number of training samples is finite. The first part deals with the easier problem of extracting a single feature; in this case, the principal component is found either by a gradient ascent method or by a Lagrangian multiplier method, and when more than one feature is needed, features can be extracted greedily, one by one. The second part tackles the more difficult problem of extracting several features simultaneously. The proposed methods are shown to find a local optimal solution and are easy to implement without significantly increasing computational complexity. Finally, the methods are applied to several datasets with different values of p, and their performance is compared with that of conventional PCA methods.
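A minimal sketch of the single-feature, gradient-ascent variant described above, assuming centered data and p > 1 so that the gradient of sum_i |w^T x_i|^p stays bounded near zero projections; the step size `lr` and iteration count are arbitrary choices for this sketch, not values from the paper.

```python
import numpy as np

def pca_lp_first_component(X, p=1.5, lr=1e-3, n_iter=500):
    """Projected gradient ascent for Lp-norm PCA: maximize
    f(w) = sum_i |w^T x_i|^p subject to ||w||_2 = 1. The gradient is
    p * sum_i |w^T x_i|^(p-1) * sign(w^T x_i) * x_i; after each step,
    w is projected back onto the unit sphere. X is assumed centered."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        proj = X @ w
        grad = p * (X.T @ (np.abs(proj) ** (p - 1) * np.sign(proj)))
        w = w + lr * grad
        w /= np.linalg.norm(w)               # stay on the unit sphere
    return w
```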
international conference on artificial neural networks | 2001
Nojun Kwak; Chong-Ho Choi; Jin Young Choi
In manipulating data, as in supervised learning, we often extract new features from the original ones to reduce the dimension of the feature space and achieve better performance. In this paper, we propose a new feature extraction algorithm using independent component analysis (ICA) for classification problems. By using ICA to solve supervised classification problems, we obtain new features that are as independent from each other as possible while still conveying the output information faithfully. Using the new features along with conventional feature selection algorithms, we can greatly reduce the dimension of the feature space without degrading the performance of classifying systems.
Pattern Recognition | 2009
Nojun Kwak; Jiyong Oh
In many one-class classification problems, such as face detection and object verification, conventional linear discriminant analysis sometimes fails because it makes the inappropriate assumption that negative samples follow a Gaussian distribution. In addition, it sometimes cannot extract a sufficient number of features because it makes use only of the mean value of each class. To resolve these problems, in this paper we extend biased discriminant analysis (BDA), which was originally developed for one-class classification problems. BDA makes no assumption about the distribution of negative samples and tries to push each negative sample as far from the center of the positive samples as possible. The first extension uses a saturation technique to suppress the influence of samples located far from the decision boundary. The second utilizes the L1 norm instead of the L2 norm. We also present a method to extend BDA and its variants to multi-class classification problems. Our approach is useful in the sense that, without much added complexity, it successfully reduces the negative effect of negative samples that lie far from the center of the positive samples, resulting in better classification performance. We have applied the proposed methods to several classification problems and compared their performance with that of conventional methods.
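For context, here is a sketch of plain BDA, the baseline being extended: scatter the negatives away from the positive-class mean while keeping the positives compact, via a generalized eigenproblem. The regularizer `reg` is an assumption added here to keep the positive scatter invertible; the paper's saturation and L1 extensions are not shown.

```python
import numpy as np
from scipy.linalg import eigh

def bda_directions(X_pos, X_neg, n_dims, reg=1e-6):
    """Plain biased discriminant analysis: find directions w maximizing
    (w^T S_neg w) / (w^T S_pos w), where both scatters are measured
    around the positive-class mean. Solved as S_neg w = lambda S_pos w."""
    m_pos = X_pos.mean(axis=0)
    Dp = X_pos - m_pos
    Dn = X_neg - m_pos                       # negatives measured from the positive mean
    S_pos = Dp.T @ Dp + reg * np.eye(X_pos.shape[1])   # regularized positive scatter
    S_neg = Dn.T @ Dn
    evals, evecs = eigh(S_neg, S_pos)        # generalized symmetric eigenproblem
    order = np.argsort(evals)[::-1]          # largest scatter ratio first
    return evecs[:, order[:n_dims]]
```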
Pattern Recognition Letters | 2011
Sang Il Choi; Chong-Ho Choi; Nojun Kwak
We propose a novel 2D image-based approach that simultaneously handles illumination and pose variations to enhance the face recognition rate. It is much simpler and requires far less computational effort than methods based on 3D models, while providing a comparable or better recognition rate.