Pilsung Kang | Researchain

Publication

Featured research published by Pilsung Kang.

International Conference on Neural Information Processing | 2006

EUS SVMs: ensemble of under-sampled SVMs for data imbalance problems

Pilsung Kang; Sungzoon Cho

Data imbalance occurs when the number of patterns from one class is much larger than that from the other class. It often degrades classification performance. In this paper, we propose an Ensemble of Under-Sampled SVMs, or EUS SVMs. We applied the proposed method to two synthetic and six real data sets and found that it outperformed other methods, especially when the number of patterns belonging to the minority class is very small.
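The abstract does not spell out the sampling or voting scheme, so the following is only a minimal sketch of the general under-sampled ensemble idea. It assumes each member is trained on all minority patterns plus an equally sized random subset of the majority patterns, and that members are combined by majority vote. To stay within the standard library, a nearest-centroid classifier stands in for the SVM base learner; all function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Point = Sequence[float]
Classifier = Callable[[Point], int]


def train_nearest_centroid(X: List[Point], y: List[int]) -> Classifier:
    """Stand-in base learner (NOT an SVM): predict the class with the nearest mean."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: Point) -> int:
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def train_undersampled_ensemble(
    X: List[Point], y: List[int], minority_label: int, n_members: int, seed: int = 0
) -> List[Classifier]:
    """Train n_members classifiers, each on a balanced under-sample (assumed scheme)."""
    rng = random.Random(seed)
    minority = [(x, t) for x, t in zip(X, y) if t == minority_label]
    majority = [(x, t) for x, t in zip(X, y) if t != minority_label]
    members = []
    for _ in range(n_members):
        # Keep every minority pattern; draw an equally sized majority subset.
        subset = rng.sample(majority, k=min(len(minority), len(majority)))
        xs, ts = zip(*(subset + minority))
        members.append(train_nearest_centroid(list(xs), list(ts)))
    return members


def predict_majority_vote(members: List[Classifier], x: Point) -> int:
    """Combine member predictions by simple majority vote (assumed aggregation)."""
    votes = [member(x) for member in members]
    return max(sorted(set(votes)), key=votes.count)
```

Using an odd number of members avoids tied votes in the two-class case.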

International Conference on Biometrics | 2007

Continual retraining of keystroke dynamics based authenticator

Pilsung Kang; Seong-seob Hwang; Sungzoon Cho

Keystroke dynamics based authentication (KDA) verifies a user based on the typing pattern. During enrollment, a few typing patterns are provided, which are then used to train a classifier. The typing style of a user is not expected to change. However, sometimes it does change, resulting in a high false rejection rate. In order to achieve better authentication performance, we propose to continually retrain classifiers with recent login typing patterns by updating the training data set. There are two ways to update it. The moving window uses a fixed number of the most recent patterns, while the growing window uses all the new patterns as well as the original enrollment patterns. We applied the proposed method to a real data set involving 21 users. The experimental results show that both the moving window and the growing window approach outperform the fixed window approach, which does not retrain a classifier.
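The two update rules can be illustrated with a small sketch. It covers only the bookkeeping of the training set, not the classifier or the retraining step; the class names and the representation of a typing pattern as a list of timings are assumptions made for illustration.

```python
from collections import deque
from typing import List, Sequence

Pattern = Sequence[float]  # one typing pattern, e.g. a vector of key timings (assumed)


class MovingWindow:
    """Keep only the `size` most recent patterns."""

    def __init__(self, enroll_patterns: List[Pattern], size: int) -> None:
        # A bounded deque discards the oldest pattern once it is full.
        self.patterns = deque(enroll_patterns, maxlen=size)

    def update(self, new_pattern: Pattern) -> List[Pattern]:
        self.patterns.append(new_pattern)
        return list(self.patterns)  # training set for the next retraining


class GrowingWindow:
    """Keep the original enrollment patterns plus every new pattern."""

    def __init__(self, enroll_patterns: List[Pattern]) -> None:
        self.patterns = list(enroll_patterns)

    def update(self, new_pattern: Pattern) -> List[Pattern]:
        self.patterns.append(new_pattern)
        return list(self.patterns)  # training set for the next retraining
```

After each accepted login, the returned list would be passed to whatever classifier is being retrained. The fixed window baseline corresponds to never calling `update`.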

Expert Systems With Applications | 2011

Virtual metrology for run-to-run control in semiconductor manufacturing

Pilsung Kang; Dongil Kim; Hyoung-joo Lee; Seungyong Doh; Sungzoon Cho

In semiconductor manufacturing processes, run-to-run (R2R) control is used to improve productivity by adjusting process inputs run by run. A process is controlled based on information obtained during or after the process, including metrology values of wafers. Those metrology values, however, are usually available for only a small fraction of sampled wafers. To overcome this limitation, one can use virtual metrology (VM) to predict metrology values of all wafers, based on sensor data from production equipment and actual metrology values of sampled wafers. In this paper, we develop VM prediction models using various data mining techniques. We also develop a VM embedded R2R control system using the exponentially weighted moving average (EWMA) scheme. The experiments consist of two parts: (1) verifying VM prediction models with actual production equipment data, and (2) conducting simulations of the R2R control system. Our VM prediction models are found to be accurate enough to be directly implemented in actual manufacturing processes. The simulation results show that the VM embedded R2R control system improves productivity.
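For readers unfamiliar with EWMA-based R2R control, here is a minimal simulation sketch of a textbook single-input EWMA controller. It is not the system described in the paper: the linear process model, the parameter names, and the use of the simulated output in place of a VM prediction are all assumptions.

```python
import random
from typing import List


def ewma_r2r(
    target: float,
    true_offset: float,
    true_gain: float,
    model_gain: float,
    weight: float,
    runs: int,
    drift: float = 0.0,
    noise_sd: float = 0.0,
    seed: int = 0,
) -> List[float]:
    """Simulate an EWMA run-to-run controller on an assumed linear process.

    Assumed process: y = true_offset + drift * run + true_gain * u + noise.
    Controller model: y = offset_est + model_gain * u.
    """
    rng = random.Random(seed)
    offset_est = 0.0  # controller's running estimate of the process offset
    outputs = []
    for run in range(runs):
        # Choose the input that would hit the target under the current model.
        u = (target - offset_est) / model_gain
        # Observed output; in a VM embedded system this value would come from
        # a VM prediction whenever actual metrology is unavailable.
        y = true_offset + drift * run + true_gain * u + rng.gauss(0.0, noise_sd)
        # EWMA update of the offset estimate from the latest run.
        offset_est = weight * (y - model_gain * u) + (1.0 - weight) * offset_est
        outputs.append(y)
    return outputs
```

A larger `weight` reacts faster to shifts and drift but passes more measurement noise into the input adjustments.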

Expert Systems With Applications | 2009

A virtual metrology system for semiconductor manufacturing

Pilsung Kang; Hyoung-joo Lee; Sungzoon Cho; Dongil Kim; Jinwoo Park; Chan-Kyoo Park; Seungyong Doh

Semiconductor manufacturing has become very complex, consisting of hundreds of individual processes. If a faulty wafer is produced in an early stage but detected only at the last moment, unnecessary resource consumption is unavoidable. Measuring every wafer's quality after each process can save resources, but it is impractical because additional measuring processes inserted between each pair of contiguous processes significantly increase the total production time. Metrology, as employed today as a product quality monitoring tool, covers only a small fraction of sampled wafers. Virtual metrology (VM), on the other hand, makes it possible to predict every wafer's metrology measurements based on production equipment data and preceding metrology results. A well-established VM system, therefore, can help improve product quality and reduce production cost and cycle time. In this paper, we develop a VM system for an etching process in semiconductor manufacturing based on various data mining techniques. The experimental results show that our VM system can not only predict the metrology measurement accurately, but also detect possible faulty wafers with reasonable confidence.

Pattern Recognition | 2008

Locally linear reconstruction for instance-based learning

Pilsung Kang; Sungzoon Cho

Instance-based learning (IBL), also called memory-based reasoning (MBR), is a commonly used non-parametric learning algorithm. k-nearest neighbor (k-NN) learning is the most popular realization of IBL. Due to its usability and adaptability, k-NN has been successfully applied to a wide range of applications. However, in practice, one has to set important model parameters only empirically: the number of neighbors (k) and the weights assigned to those neighbors. In this paper, we propose structured ways to set these parameters, based on locally linear reconstruction (LLR). We then employ sequential minimal optimization (SMO) to solve the quadratic programming step involved in LLR for classification, reducing the computational complexity. Experimental results from 11 classification and eight regression tasks were promising enough to merit further investigation: not only did LLR outperform the conventional weight allocation methods without much additional computational cost, but LLR was also found to be robust to changes in k.

Information Sciences | 2015

Keystroke dynamics-based user authentication using long and free text strings from various input devices

Pilsung Kang; Sungzoon Cho

Keystroke dynamics, which refers to the typing pattern of an individual, has been highlighted as a practical behavioral biometric feature that does not require any additional recognition device for strengthening user authentication or identification. However, research in the area of keystroke dynamics-based user authentication (KDA) has focused primarily on short predefined text, such as identification (ID) and password, typed on a traditional personal computer (PC) keyboard. In this paper, we aim to explore the extensibility of KDA by considering long and free text strings from various input devices. Three fundamental questions are raised about the dependence of authentication performance on (1) the type of input device, (2) the length of text strings, and (3) the type of authentication algorithm. Based on the experimental tests, we observe that (1) a PC keyboard yielded the highest authentication accuracy, followed by a soft keyboard and a touch keyboard; (2) the authentication accuracy could be improved by increasing the length of either reference or test keystrokes; (3) the R+A and RA measures report the best performance with a PC keyboard, while the Cramer-von Mises criterion reports the best performance with the other input devices for most cases, followed by the Parzen window density estimator.

Expert Systems With Applications | 2012

Machine learning-based novelty detection for faulty wafer detection in semiconductor manufacturing

Dongil Kim; Pilsung Kang; Sungzoon Cho; Hyoung-joo Lee; Seungyong Doh

Since semiconductor manufacturing consists of hundreds of processes, a faulty wafer detection system, which allows for earlier detection of faulty wafers, is required. Statistical process control (SPC) and virtual metrology (VM) have been used to detect faulty wafers. However, both have limitations: SPC requires linear, unimodal, single-variable data, and VM underestimates the deviations of predictors. In this paper, seven different machine learning-based novelty detection methods were employed to detect faulty wafers. The models were trained with Fault Detection and Classification (FDC) data to detect wafers having faulty metrology values. The methods were tested on real-world data collected from a semiconductor fab. Since the real-world data have more than 150 input variables, we employed three different dimensionality reduction methods. The experimental results showed a high True Positive Rate (TPR). These results are promising enough to warrant further study.

Neurocomputing | 2013

Locally linear reconstruction based missing value imputation for supervised learning

Pilsung Kang

Most learning algorithms generally assume that data is complete so each attribute of all instances is filled with a valid value. However, missing values are very common in real datasets for various reasons. In this paper, we propose a new single imputation method based on locally linear reconstruction (LLR) that improves the prediction performance of supervised learning (classification & regression) with missing values. First, we investigate how missing values degrade the prediction performance with various missing ratios. Next, we compare the proposed missing value imputation method (LLR) with six well-known single imputation methods for five different learning algorithms based on 13 classification and nine regression datasets. The experimental results showed that (1) all imputation methods helped to improve the prediction accuracy, although some were very simple; (2) the proposed LLR imputation method enhanced the modeling performance more than all other imputation methods, irrespective of the learning algorithms and the missing ratios; and (3) LLR was outstanding when the missing ratio was relatively high and its prediction accuracy was similar to that of the complete dataset.

Neurocomputing | 2015

Constructing a multi-class classifier using one-against-one approach with different binary classifiers

Seokho Kang; Sungzoon Cho; Pilsung Kang

For the one-against-one approach, all the binary classifiers that form a one-against-one classifier should be sufficiently competent. If some of the classifiers are not competent, the classification results may be invalid. To address this problem, we propose the diversified one-against-one (DOAO) method, which seeks to find the best classification algorithm for each class pair when applying the one-against-one approach to multi-class classification problems. Applying the proposed method lets various classification algorithms complement each other. Since the best classification algorithm for each class pair is different, the proposed method can obtain improved classification results. Experimental results show that the proposed method outperforms other one-against-one based methods.

Highlights: We propose the diversified one-against-one (DOAO) method for multi-class classification problems. DOAO seeks to find the best classification algorithm for each class pair when applying the one-against-one approach. DOAO outperforms other one-against-one based methods.
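The per-pair selection idea can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the two candidate learners (nearest centroid and 1-nearest neighbor), the alternating holdout split used to compare them, and the majority vote across pairs are all chosen here for brevity, and every function name is hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]
Classifier = Callable[[Point], int]
Trainer = Callable[[List[Point], List[int]], Classifier]


def sq_dist(a: Point, b: Point) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


def train_centroid(X: List[Point], y: List[int]) -> Classifier:
    """Candidate 1: predict the class whose mean is nearest."""
    means = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(means, key=lambda c: sq_dist(x, means[c]))


def train_1nn(X: List[Point], y: List[int]) -> Classifier:
    """Candidate 2: predict the label of the single nearest training point."""
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda d: sq_dist(x, d[0]))[1]


def accuracy(clf: Classifier, X: List[Point], y: List[int]) -> float:
    return sum(clf(x) == t for x, t in zip(X, y)) / len(y)


def train_doao(
    X: List[Point], y: List[int], candidates: Dict[str, Trainer]
) -> Dict[Tuple[int, int], Tuple[str, Classifier]]:
    """For each class pair, keep the candidate with the best holdout accuracy."""
    pair_models = {}
    for a, b in combinations(sorted(set(y)), 2):
        pair = [(x, t) for x, t in zip(X, y) if t in (a, b)]
        # Simple alternating holdout split (an assumption; cross-validation
        # would be more robust on real data).
        fit, val = pair[0::2], pair[1::2]
        fit_X, fit_y = [x for x, _ in fit], [t for _, t in fit]
        val_X, val_y = [x for x, _ in val], [t for _, t in val]
        best = max(
            candidates,
            key=lambda name: accuracy(candidates[name](fit_X, fit_y), val_X, val_y),
        )
        # Retrain the selected algorithm on all data for this class pair.
        all_X, all_y = [x for x, _ in pair], [t for _, t in pair]
        pair_models[(a, b)] = (best, candidates[best](all_X, all_y))
    return pair_models


def predict_doao(
    pair_models: Dict[Tuple[int, int], Tuple[str, Classifier]], x: Point
) -> int:
    """Each pairwise classifier casts one vote; the most voted class wins."""
    votes = [clf(x) for _, clf in pair_models.values()]
    return max(sorted(set(votes)), key=votes.count)
```

With more candidate algorithms in the `candidates` dictionary, different class pairs can end up with different learners, which is the diversification the abstract describes.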

Engineering Applications of Artificial Intelligence | 2015

Multi-class classification via heterogeneous ensemble of one-class classifiers

Seokho Kang; Sungzoon Cho; Pilsung Kang

In this paper, a multi-class classification method based on heterogeneous ensemble of one-class classifiers is proposed. The proposed method consists of two phases: training heterogeneous one-class classifiers for each class using various one-class classification algorithms, and constructing an ensemble by combining the base classifiers using multi-response linear regression-based stacking. The use of various classification algorithms contributes towards increasing the diversity of the ensemble, while stacking resolves the normalization issues on different scales of outputs obtained from the base classifiers. In addition, we also demonstrate the selective utilization of base classifiers by adopting a stepwise variable selection procedure during stacking. Through our experiments on multi-class benchmark datasets, we concluded that our proposed method outperforms the methods that are based on single one-class classification algorithms with statistical significance.