Seokho Kang
Seoul National University
Publication
Featured research published by Seokho Kang.
Expert Systems With Applications | 2014
Seokho Kang; Sungzoon Cho
Support vector machine (SVM) is a powerful algorithm for classification and regression problems and is widely applied to real-world applications. However, its high computational load in the test phase makes it difficult to use in practice. In this paper, we propose the hybrid neural network (HNN), a method to accelerate an SVM in the test phase by approximating the SVM. The proposed method approximates the SVM using an artificial neural network (ANN). The resulting regression function of the ANN replaces the decision function or the regression function of the SVM. Since the prediction of the ANN requires significantly less computation than that of the SVM, the proposed method yields faster test speed. The proposed method is evaluated by experiments on real-world benchmark datasets. Experimental results show that the proposed method successfully accelerates SVM in the test phase with little or no prediction loss.
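The source of the speed-up can be seen by comparing per-prediction costs. The following is a sketch under assumptions not stated in the abstract: a kernel SVM with n_SV support vectors in d dimensions whose kernel evaluation costs O(d), and an ANN with a single hidden layer of h nodes (the symbols V, c, w, w_0, and the activation \sigma are illustrative).

```latex
f_{\mathrm{SVM}}(x) = \sum_{i=1}^{n_{SV}} \alpha_i y_i K(x_i, x) + b
  \quad\Rightarrow\quad O(n_{SV}\, d) \text{ per prediction}

f_{\mathrm{ANN}}(x) = w^{\top} \sigma(Vx + c) + w_0
  \quad\Rightarrow\quad O(h\, d) \text{ per prediction}
```

Under these assumptions, the ANN is cheaper at test time whenever h is much smaller than n_SV.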
Neurocomputing | 2015
Seokho Kang; Sungzoon Cho; Pilsung Kang
For the one-against-one approach, all the binary classifiers that form a one-against-one classifier should be sufficiently competent. If some of the classifiers are not competent, the classification results may be invalid. To address this problem, we propose the diversified one-against-one (DOAO) method, which seeks to find the best classification algorithm for each class pair when applying the one-against-one approach to multi-class classification problems. The proposed method allows various classification algorithms to complement each other. Since the best classification algorithm differs from one class pair to another, the proposed method can obtain improved classification results. Experimental results show that the proposed method outperforms other one-against-one based methods.
Highlights: We propose the diversified one-against-one (DOAO) method for multi-class classification problems. DOAO seeks to find the best classification algorithm for each class pair when applying the one-against-one approach. DOAO outperforms other one-against-one based methods.
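The per-pair selection idea behind DOAO can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say how the "best" algorithm is chosen or how pairwise outputs are combined, so validation accuracy and majority voting are assumptions here, and the two toy one-dimensional learners (`nearest_centroid`, `one_nn`) and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def nearest_centroid(train):
    """Fit a 1-D nearest-centroid rule; train is a list of (x, label) pairs."""
    sums, counts = {}, Counter()
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    cents = {y: sums[y] / counts[y] for y in sums}
    return lambda x: min(cents, key=lambda y: abs(x - cents[y]))


def one_nn(train):
    """Fit a 1-D one-nearest-neighbour rule."""
    return lambda x: min(train, key=lambda p: abs(x - p[0]))[1]


def accuracy(clf, data):
    """Fraction of (x, label) pairs that clf labels correctly."""
    return sum(clf(x) == y for x, y in data) / len(data)


def fit_doao(train, valid, algorithms):
    """For every class pair, keep the candidate with the best validation accuracy.

    Assumption: 'best' means highest accuracy on a held-out validation set.
    """
    classes = sorted({y for _, y in train})
    chosen = {}
    for a, b in combinations(classes, 2):
        tr = [(x, y) for x, y in train if y in (a, b)]
        va = [(x, y) for x, y in valid if y in (a, b)]
        fitted = [(name, fit(tr)) for name, fit in algorithms.items()]
        chosen[(a, b)] = max(fitted, key=lambda nf: accuracy(nf[1], va))
    return chosen


def predict_doao(chosen, x):
    """Combine the selected pairwise classifiers by majority vote (an assumption)."""
    votes = Counter(clf(x) for _, clf in chosen.values())
    return votes.most_common(1)[0][0]


# Toy three-class data on a line.
train = [(0.0, "a"), (1.0, "a"), (4.0, "b"), (5.0, "b"), (9.0, "c"), (10.0, "c")]
valid = [(0.5, "a"), (4.5, "b"), (9.5, "c")]
model = fit_doao(train, valid, {"centroid": nearest_centroid, "1nn": one_nn})
```

Each entry of `model` records which algorithm was kept for one class pair, so different pairs may end up with different algorithms.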
Engineering Applications of Artificial Intelligence | 2015
Seokho Kang; Sungzoon Cho; Pilsung Kang
In this paper, a multi-class classification method based on heterogeneous ensemble of one-class classifiers is proposed. The proposed method consists of two phases: training heterogeneous one-class classifiers for each class using various one-class classification algorithms, and constructing an ensemble by combining the base classifiers using multi-response linear regression-based stacking. The use of various classification algorithms contributes towards increasing the diversity of the ensemble, while stacking resolves the normalization issues on different scales of outputs obtained from the base classifiers. In addition, we also demonstrate the selective utilization of base classifiers by adopting a stepwise variable selection procedure during stacking. Through our experiments on multi-class benchmark datasets, we concluded that our proposed method outperforms the methods that are based on single one-class classification algorithms with statistical significance.
Expert Systems With Applications | 2015
Seokho Kang; Pilsung Kang; Taehoon Ko; Sungzoon Cho; Su-jin Rhee; Kyung-Sang Yu
Highlights: We propose a method called E3-SVM, an efficient and effective ensemble of SVMs. E3-SVM excludes superfluous data points when constructing an SVM ensemble. E3-SVM was applied to the drug failure prediction problem for type 2 diabetes. We confirmed the suitability of SVM with an accuracy of about 80%.
The treatment of patients with type 2 diabetes is mostly based on drug therapies, aiming at managing glucose levels appropriately. As the number of patients with type 2 diabetes continually increases worldwide, predicting drug treatment failure becomes an important issue. Support vector machine (SVM) can be a good method for the anti-diabetic drug failure prediction problem; however, it is difficult to train SVM on large-scale medical datasets directly because of its high training time complexity, O(N^3). To address this limitation, we propose an efficient and effective ensemble of SVMs, called E3-SVM. The proposed method excludes superfluous data points when constructing an SVM ensemble, thereby yielding a better classification performance. The proposed method consists of two phases. The first phase is to select the data points that are likely to be the support vectors by applying data selection methods. The second phase is to construct an SVM ensemble using the selected data points. We demonstrated the efficiency and effectiveness of the proposed method using the real-world dataset of the anti-diabetic drug failure prediction problem for type 2 diabetes. Experimental results show that the proposed method requires less training time to achieve comparable success, compared to the conventional SVM ensembles. Moreover, the proposed method obtains more reliable prediction results for each independent run of constructing an ensemble. In conclusion, firstly, the proposed method provides an efficient and effective way to use SVM for large-scale datasets. Secondly, we confirmed the suitability of SVM for the anti-diabetic drug failure prediction problem with an accuracy of about 80%.
IEEE Transactions on Semiconductor Manufacturing | 2015
Seokho Kang; Sungzoon Cho; Daewoong An; Jae-Young Rim
In semiconductor manufacturing, wafer fabrication is followed by chip assembly where individual dies are assembled as a packaged chip. In between, dies are tested in terms of their electrical properties and those which fail to pass the “wafer test” are filtered out. However, some faulty dies pass the test and cause a packaged chip to fail in the final test. The inaccuracy of the wafer test leads to waste in manufacturing time and cost. In this paper, we propose to predict the result of the final test at the die level before assembly using wafer test items and four derived variables concerning wafer map features: 1) distance of the die from the wafer center; 2) previous final yield at the die position; 3) wafer test fail rate for the adjacent dies; and 4) abnormalities of the wafer map pattern. We build prediction models with these variables using a random forest algorithm. Preliminary experimental results on actual data show that the use of these derived variables improves the prediction performance with statistical significance and thus merits further investigation.
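Two of the wafer map features listed above, the distance from the wafer center and the wafer test fail rate of adjacent dies, can be illustrated with a short Python sketch. The grid coordinates, the 8-neighbour definition of "adjacent", and the function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def distance_from_center(die, center):
    """Euclidean distance between a die position and the wafer center, in die units."""
    return math.hypot(die[0] - center[0], die[1] - center[1])


def adjacent_fail_rate(die, wafer_test_failed):
    """Share of the (up to 8) neighbouring dies that failed the wafer test.

    wafer_test_failed maps (row, col) -> bool for every die present on the wafer.
    Assumption: 'adjacent' means the 8-neighbourhood on the die grid.
    """
    r, c = die
    neighbours = [
        (r + dr, c + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0) and (r + dr, c + dc) in wafer_test_failed
    ]
    if not neighbours:
        return 0.0
    return sum(wafer_test_failed[n] for n in neighbours) / len(neighbours)


# Toy 3x3 wafer map with two failed dies in the top row.
wafer = {(r, c): False for r in range(3) for c in range(3)}
wafer[(0, 0)] = True
wafer[(0, 1)] = True

center_rate = adjacent_fail_rate((1, 1), wafer)   # 2 of 8 neighbours failed
corner_rate = adjacent_fail_rate((0, 0), wafer)   # 1 of 3 neighbours failed
corner_dist = distance_from_center((0, 0), (1, 1))
```

Features computed this way would be appended to the wafer test items as additional inputs to the prediction model.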
Neurocomputing | 2015
Seokho Kang; Sungzoon Cho
A commonly used strategy for solving a multi-class classification problem is to decompose the original problem into several binary subproblems. The recently proposed method, diversified one-against-one (DOAO), constructs a one-against-one classifier by selecting the best classifier for each class pair from the set of heterogeneous base classifiers. It was found to yield better classification accuracy than other one-against-one classifiers that are based on individual classification algorithms. This paper presents a novel method, called optimally diversified one-against-one (ODOAO), which is an improvement of DOAO. ODOAO is based on meta-learning, and seeks to construct a multiple classifier system where a meta-classifier effectively combines the outputs from all the heterogeneous base classifiers that are trained using various classification algorithms for every class pair. Experimental results show that ODOAO outperforms DOAO and other one-against-one based methods with statistical significance.
Highlights: ODOAO seeks to construct a one-against-one classifier based on meta-learning. ODOAO utilizes binary base classifiers from various classification algorithms. A meta-classifier effectively combines the outputs from all the base classifiers. The effectiveness of ODOAO is demonstrated through experiments.
Computers & Industrial Engineering | 2017
Seokho Kang; Eunji Kim; Jaewoong Shim; Sungzoon Cho; Wonsang Chang; Junhwan Kim
Highlights: We propose a data mining process for failure analysis of industrial products. Failures are examined by a mashup of the production and customer service data. Interpretable visualization based on relative failure density is implemented. A case study is conducted using the data of real-world products.
Analyzing the causal relationships for failures of industrial products is necessary for manufacturers to prevent the occurrence of failures and enhance customer satisfaction. The data collected from each of the production and customer divisions can be a fruitful source for failure analysis. In this paper, we present a data mining process for efficient failure analysis of industrial products by a mashup of data collected from both divisions. The process consists of four main steps: problem definition, preprocessing, modeling, and visualization. Each step is designed to satisfy two constraints in order to be practically applied to industrial products. First, it has to be quick and incremental because the life cycle of most industrial products is not sufficiently long. Second, the insight derived from the process has to be easy to understand for domain experts since they are generally not familiar with data mining methodologies. A case study is conducted to demonstrate the effectiveness of the data mining process by using real-world data collected from a manufacturer in Korea.
Pattern Analysis and Applications | 2015
Dongil Kim; Pilsung Kang; Seung-kyung Lee; Seokho Kang; Seungyong Doh; Sungzoon Cho
Virtual metrology (VM) has been applied to semiconductor manufacturing processes for the quality management of wafers. However, noise in training datasets degrades the performance of VM, which is a key obstacle to the application of VM in real-world semiconductor manufacturing processes. In this paper, we develop a VM dataset construction method that identifies and removes noise. We define noises by considering both input and output variables and classify them into fault detection and classification (FDC) noises, which have abnormal FDC variables and normal metrology variables, and metrology noises, which have normal FDC variables and abnormal metrology variables. We propose the construction of a VM training dataset that includes FDC noises and excludes metrology noises. By employing novelty detection methods, the normal and abnormal regions of FDC variables are identified. In experiments conducted on real-world photolithography (photo) data, VM models trained with the dataset constructed by the proposed method showed the best accuracy and the greatest robustness.
IEEE Transactions on Semiconductor Manufacturing | 2016
Seokho Kang; Dongil Kim; Sungzoon Cho
Virtual metrology (VM) has been successfully applied to semiconductor manufacturing as an efficient way of achieving wafer-to-wafer quality control. VM involves the estimation of metrology variables of wafer inspection using a prediction model trained with process parameters and measurements prior to the actual implementation of metrology. VM modeling should incorporate a number of process parameters and measurements collected from each piece of process equipment, which results in a greater number of input variables. Therefore, it is necessary to resolve the problem of high dimensionality through feature selection. A suitable feature selection method for VM modeling should effectively address the high dimensionality by lowering the computational cost, while also achieving high prediction accuracy as an essential requirement for the practical deployment of VM. In this paper, a feature selection method based on random forward search is proposed for efficient VM modeling. This method selects relevant variables sequentially from disjoint random subsets of candidate variables by incorporating randomness. Our preliminary experimental results obtained with real-world semiconductor manufacturing data demonstrate that the proposed feature selection method achieves comparable prediction accuracy while being computationally more efficient, and thus merits further investigation.
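One way to read "selecting variables sequentially from disjoint random subsets" is sketched below in Python. This is an interpretation under stated assumptions, not the authors' algorithm: the subset size, the rule of adding a variable only if it improves the score, and the toy scoring function are all hypothetical, as are the names `random_forward_search` and `toy_score`.

```python
import random


def random_forward_search(candidates, score, subset_size, seed=0):
    """Forward selection that examines one disjoint random subset per step.

    candidates  : list of variable names
    score       : callable taking a list of variables, higher is better
                  (e.g. negative validation error of a VM model)
    subset_size : number of candidates examined per step (assumed parameter)
    """
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    # Partition the shuffled candidates into disjoint subsets.
    subsets = [pool[i:i + subset_size] for i in range(0, len(pool), subset_size)]

    selected, best = [], score([])
    for subset in subsets:
        # Evaluate only the variables in this subset, not every remaining candidate.
        top = max(subset, key=lambda v: score(selected + [v]))
        top_score = score(selected + [top])
        if top_score > best:  # assumption: keep a variable only if it helps
            selected.append(top)
            best = top_score
    return selected


# Toy score: two relevant variables, with a small penalty per selected variable.
relevance = {"x1": 0.5, "x4": 0.3}


def toy_score(variables):
    return sum(relevance.get(v, 0.0) for v in variables) - 0.01 * len(variables)


selected = random_forward_search([f"x{i}" for i in range(6)], toy_score, subset_size=2)
```

Because each step scores only `subset_size` candidates, the number of model evaluations grows linearly with the number of variables, whereas exhaustive forward selection re-scores every remaining candidate at every step.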
Military Psychology | 2013
Soo Jin Kim; Keunyoung Seo; Seokho Kang; Sungzoon Cho
Drawing on the upper-echelons theory and diversity issues, this study examines the relationships between top management team (TMT) organizational tenure, tenure diversity, and combat performance. The study is based on the Korea Combat Training Center (KCTC), which is designed for training and evaluation of battalion combat power. Findings indicate that higher levels of TMT tenure have a positive effect on battalion combat performance, whereas tenure diversity of the TMT has a negative effect. In addition, results showed that the negative relationship between TMT tenure diversity and combat performance is attenuated by the commander’s shared experience with other TMT members.