Dae-Ki Kang
Dongseo University
Publication
Featured research published by Dae-Ki Kang.
Expert Systems With Applications | 2010
Myoung-Jong Kim; Dae-Ki Kang
In bankruptcy prediction models, accuracy is a crucial performance measure because of its significant economic impact. Ensemble learning is a widely used method for improving the performance of classification and prediction models. Two popular ensemble methods, bagging and boosting, have been applied with great success to various machine learning problems, mostly using decision trees as base classifiers. In this paper, we propose neural network ensembles to improve the performance of traditional neural networks on bankruptcy prediction tasks. Experimental results on Korean firms indicate that both bagged and boosted neural networks outperform traditional single neural networks.
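A minimal sketch of the bagged neural network idea from this abstract, using scikit-learn with synthetic data; the feature set, network size, and other hyperparameters are illustrative assumptions rather than the authors' experimental setup (newer scikit-learn versions use the `estimator` keyword where older ones use `base_estimator`).

```python
# Illustrative sketch: bagging neural networks for a binary (bankruptcy-style)
# classification task. Data and hyperparameters are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

single = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
bagged = BaggingClassifier(
    estimator=MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
    n_estimators=10,          # number of bootstrap-trained networks
    random_state=0,
)

single.fit(X_tr, y_tr)
bagged.fit(X_tr, y_tr)
print("single MLP :", accuracy_score(y_te, single.predict(X_te)))
print("bagged MLPs:", accuracy_score(y_te, bagged.predict(X_te)))
```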
Expert Systems With Applications | 2015
Myoung-Jong Kim; Dae-Ki Kang; Hong Bae Kim
Highlights: We propose a geometric mean based boosting algorithm (GMBoost) to resolve the data imbalance problem. GMBoost considers the geometric mean of the error rates of the majority and minority classes. We compare GMBoost with AdaBoost and cost-sensitive boosting on bankruptcy prediction; the comparative results show that GMBoost outperforms both on imbalanced as well as balanced data. In classification or prediction tasks, the data imbalance problem is frequently observed when most instances belong to one majority class. The problem has received considerable attention in the machine learning community because it is one of the main causes of degraded classifier or predictor performance. In this paper, we propose the geometric mean based boosting algorithm (GMBoost) to resolve the data imbalance problem. GMBoost enables learning that takes both the majority and the minority class into account because it uses the geometric mean of the two classes in its error rate and accuracy calculations. To evaluate the performance of GMBoost, we apply it to a bankruptcy prediction task. The results, and their comparison with AdaBoost and cost-sensitive boosting, indicate that GMBoost offers high prediction power and robust learning on imbalanced as well as balanced data distributions.
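The abstract describes GMBoost only at a high level, so the following is a hypothetical AdaBoost-style sketch in which each round's weighted error is the geometric mean of the per-class error rates; the update rules and constants are assumptions, not the published GMBoost formulas.

```python
# Hypothetical sketch of geometric-mean-based boosting (NOT the published
# GMBoost equations): the error used to set each round's vote combines the
# majority-class and minority-class weighted errors via a geometric mean.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def gm_boost_sketch(X, y, n_rounds=20):
    """Boosting loop for binary 0/1 labels."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)                  # instance weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        wrong = (stump.predict(X) != y).astype(float)
        # weighted error rate of each class (majority and minority)
        class_errs = [np.sum(w[y == c] * wrong[y == c]) / np.sum(w[y == c])
                      for c in np.unique(y)]
        err = float(np.clip(np.sqrt(np.prod(class_errs)), 1e-10, 1 - 1e-10))
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(alpha * wrong)           # up-weight misclassified instances
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def gm_boost_predict(learners, alphas, X):
    # weighted vote over {-1, +1}-coded 0/1 predictions
    score = sum(a * (2 * h.predict(X) - 1) for h, a in zip(learners, alphas))
    return (score > 0).astype(int)
```

With standard AdaBoost the scalar error would be the overall weighted error; taking the geometric mean of the per-class errors keeps the minority-class error from being drowned out by the majority class.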
Knowledge and Information Systems | 2006
Jun Zhang; Dae-Ki Kang; Adrian Silvescu; Vasant G. Honavar
In many application domains, there is a need for learning algorithms that can effectively exploit attribute value taxonomies (AVT)—hierarchical groupings of attribute values—to learn compact, comprehensible and accurate classifiers from data—including data that are partially specified. This paper describes AVT-NBL, a natural generalization of the naïve Bayes learner (NBL), for learning classifiers from AVT and data. Our experimental results show that AVT-NBL is able to generate classifiers that are substantially more compact and more accurate than those produced by NBL on a broad range of data sets with different percentages of partially specified values. We also show that AVT-NBL is more efficient in its use of training data: AVT-NBL produces classifiers that outperform those produced by NBL using substantially fewer training examples.
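As a rough illustration of the AVT idea (not the AVT-NBL algorithm itself, which also searches for a good cut through the taxonomy), the sketch below maps fine-grained attribute values up to coarser taxonomy nodes before naive Bayes learning; the toy taxonomy and attribute are made up.

```python
# Illustrative only: abstract an attribute's values to the nodes of a
# hypothetical attribute value taxonomy (a fixed "cut"), then train naive
# Bayes on the abstracted data.
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# toy AVT cut for a single "student status" attribute: leaf value -> coarser node
avt_cut = {
    "freshman": "undergraduate", "sophomore": "undergraduate",
    "junior": "undergraduate", "senior": "undergraduate",
    "masters": "graduate", "phd": "graduate",
}

raw_X = [["freshman"], ["phd"], ["senior"], ["masters"], ["junior"], ["phd"]]
y = [0, 1, 0, 1, 0, 1]

# replace each value by its node in the chosen cut; a partially specified
# value that already names an internal node would pass through unchanged
abstracted = [[avt_cut.get(v, v) for v in row] for row in raw_X]

enc = OrdinalEncoder()
X = enc.fit_transform(abstracted)
clf = CategoricalNB().fit(X, y)

test = [[avt_cut.get("sophomore", "sophomore")]]   # -> [["undergraduate"]]
print(clf.predict(enc.transform(test)))            # predicted class for the new value
```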
Expert Systems With Applications | 2012
Myoung-Jong Kim; Dae-Ki Kang
Ensemble learning is a method for improving the performance of classification and prediction algorithms. Many studies have demonstrated that ensemble learning can decrease the generalization error and improve on the performance of individual classifiers and predictors. However, its performance can be degraded by the multicollinearity problem, in which the classifiers of an ensemble are highly correlated with one another. This paper proposes a genetic algorithm-based coverage optimization technique for resolving the multicollinearity problem. Empirical results on bankruptcy prediction for Korean firms indicate that the proposed coverage optimization algorithm helps to design a diverse and highly accurate classification system.
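The abstract does not give the fitness function, so the sketch below is a hypothetical version of the general idea: a genetic algorithm that searches over subsets of base classifiers, rewarding accuracy and penalizing pairwise correlation among their predictions. The weighting and GA parameters are assumptions.

```python
# Hypothetical sketch: GA search for a low-correlation, high-accuracy subset
# of base classifiers. `preds` holds each base classifier's 0/1 predictions
# on a validation set (one row per classifier); `y_val` holds true labels.
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, preds, y_val, lam=0.5):
    idx = np.flatnonzero(mask)
    if len(idx) < 2:
        return -np.inf
    acc = np.mean([np.mean(preds[i] == y_val) for i in idx])
    corr = np.corrcoef(preds[idx])               # pairwise correlation matrix
    off_diag = corr[np.triu_indices(len(idx), k=1)]
    return acc - lam * np.nanmean(off_diag)      # reward accuracy, punish collinearity

def ga_select(preds, y_val, pop_size=30, gens=50, p_mut=0.05):
    n = preds.shape[0]
    pop = rng.integers(0, 2, size=(pop_size, n))
    for _ in range(gens):
        fit = np.array([fitness(ind, preds, y_val) for ind in pop])
        parents = pop[np.argsort(fit)[-pop_size // 2:]]      # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])        # one-point crossover
            child ^= (rng.random(n) < p_mut).astype(child.dtype)  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    fit = np.array([fitness(ind, preds, y_val) for ind in pop])
    return pop[np.argmax(fit)]                                # best classifier subset
```

Here `preds` would come from base classifiers already trained on the training split; the returned bitmask indicates which of them to keep in the final ensemble.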
international conference on data mining | 2004
Dae-Ki Kang; Adrian Silvescu; Jun Zhang; Vasant G. Honavar
Attribute value taxonomies (AVT) have been shown to be useful in constructing compact, robust, and comprehensible classifiers. However, in many application domains, human-designed AVTs are unavailable. We introduce AVT-Learner, an algorithm for the automated construction of attribute value taxonomies from data. AVT-Learner uses hierarchical agglomerative clustering (HAC) to cluster attribute values based on the distribution of classes that co-occur with the values. We describe experiments on UCI data sets that compare the performance of AVT-NBL (an AVT-guided naive Bayes learner) with that of the standard naive Bayes learner (NBL) applied to the original data set. Our results show that the AVTs generated by AVT-Learner are competitive with human-generated AVTs (in cases where such AVTs are available). AVT-NBL using AVTs generated by AVT-Learner achieves classification accuracies that are comparable to or higher than those obtained by NBL, and the resulting classifiers are significantly more compact than those generated by NBL.
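A small sketch of the clustering step described here: each attribute value is represented by the class distribution it co-occurs with, and values are merged bottom-up. SciPy's average-linkage clustering over Jensen-Shannon distances stands in for the paper's merge criterion, so treat the details as assumptions.

```python
# Illustrative sketch of taxonomy construction by hierarchical agglomerative
# clustering of one attribute's values, each represented by the distribution
# of classes it co-occurs with.
import numpy as np
from collections import Counter, defaultdict
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

def value_class_distributions(values, labels, classes):
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    vals = sorted(counts)
    dist = np.array([[counts[v][c] for c in classes] for v in vals], dtype=float)
    dist /= dist.sum(axis=1, keepdims=True)      # rows are P(class | attribute value)
    return vals, dist

# toy column of one nominal attribute, plus class labels
values = ["red", "crimson", "navy", "blue", "red", "navy", "blue", "crimson"]
labels = [1, 1, 0, 0, 1, 0, 0, 1]
vals, dist = value_class_distributions(values, labels, classes=[0, 1])

# bottom-up merging of attribute values with similar class distributions
Z = linkage(pdist(dist, metric="jensenshannon"), method="average")
print(vals)   # leaf order
print(Z)      # the merge tree serves as the learned taxonomy over attribute values
```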
Pattern Recognition | 2009
Dae-Ki Kang; Kiwook Sohn
We introduce the Propositionalized Attribute Taxonomy guided Decision Tree Learner (PAT-DTL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact decision trees. Since such taxonomies are unavailable in most domains, we also introduce the Propositionalized Attribute Taxonomy Learner (PAT-Learner), which automatically constructs a taxonomy from data. Our experimental results on UCI repository data sets show that the proposed algorithms can generate decision trees that are generally more compact than, and often comparable in accuracy to, those produced by standard decision tree learners.
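The sketch below shows only the propositionalization step the names refer to, turning each attribute=value pair into a Boolean feature before growing a decision tree; learning and exploiting the taxonomy over those propositionalized attributes (PAT-Learner and PAT-DTL proper) is omitted, and the toy data are made up.

```python
# Illustrative sketch of propositionalizing nominal attributes into Boolean
# features (one per attribute=value pair) and training a decision tree on them.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier, export_text

records = [
    {"outlook": "sunny",    "wind": "weak"},
    {"outlook": "rain",     "wind": "strong"},
    {"outlook": "sunny",    "wind": "strong"},
    {"outlook": "overcast", "wind": "weak"},
]
y = [1, 0, 0, 1]

vec = DictVectorizer(sparse=False)       # one Boolean column per attribute=value
X = vec.fit_transform(records)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

print(vec.get_feature_names_out())       # e.g. "outlook=sunny", "wind=weak", ...
print(export_text(tree, feature_names=list(vec.get_feature_names_out())))
```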
symposium on abstraction reformulation and approximation | 2005
Dae-Ki Kang; Jun Zhang; Adrian Silvescu; Vasant G. Honavar
In many machine learning applications that deal with sequences, there is a need for learning algorithms that can effectively exploit hierarchical groupings of words. We introduce the Word Taxonomy guided Naive Bayes Learner for the Multinomial Event Model (WTNBL-MN), which exploits a word taxonomy to generate compact classifiers, and the Word Taxonomy Learner (WTL) for the automated construction of a word taxonomy from sequence data. WTNBL-MN is a generalization of the naive Bayes learner for the multinomial event model that learns classifiers from data using a word taxonomy. WTL uses hierarchical agglomerative clustering to cluster words based on the distribution of class labels that co-occur with the words. Our experimental results on protein localization sequences and Reuters text show that the proposed algorithms can generate naive Bayes classifiers that are more compact and often more accurate than those produced by the standard naive Bayes learner for the multinomial event model.
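A minimal sketch of the underlying idea, assuming a hand-made word taxonomy cut: words are mapped to coarser taxonomy nodes and a multinomial naive Bayes classifier is trained on the abstracted text. WTNBL-MN's search for a good cut is not shown.

```python
# Illustrative sketch: abstract words to nodes of a toy word taxonomy, then
# fit a multinomial naive Bayes text classifier on the abstracted documents.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# hypothetical word taxonomy cut: word -> coarser concept
cut = {"yen": "currency", "dollar": "currency", "euro": "currency",
       "wheat": "grain", "corn": "grain", "barley": "grain"}

docs = ["dollar rises against yen", "wheat and corn exports fall",
        "euro slips versus dollar", "barley and wheat harvest grows"]
y = ["finance", "agriculture", "finance", "agriculture"]

abstracted = [" ".join(cut.get(w, w) for w in d.split()) for d in docs]

vec = CountVectorizer()
X = vec.fit_transform(abstracted)        # counts over taxonomy nodes plus raw words
clf = MultinomialNB().fit(X, y)

test = " ".join(cut.get(w, w) for w in "corn prices rise".split())
print(clf.predict(vec.transform([test])))
```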
Journal of information and communication convergence engineering | 2011
Sami Abduljalil; Dae-Ki Kang
In software system development, the application interface is the main communication platform between humans and applications, and interacting with any software application demands both mental and physical effort. Although software systems have proliferated across diverse sectors and in many forms to meet human needs, users remain concerned with usability: an application should be easy to understand and easy to navigate. Because many software developers still focus on the quantity of content rather than the quality of the interface from the user's point of view, human factors should be addressed in the early stages of design and throughout the entire design process so that usability is supported consistently. In this paper, we propose the Modified Prototype Model (MPM), which helps software designers and developers build user-friendly systems with easy-to-navigate interfaces by uncovering human factors in a convenient way. We also propose methods that help identify additional human factors relevant to software design, and we discuss the implications of the proposed model and methods.
Journal of information and communication convergence engineering | 2012
Gan Zhen Ye; Dae-Ki Kang
This paper presents a mobile robot control architecture based on hierarchical behaviors, inspired by biological life. The system is reactive, highly parallel, and does not rely on an internal representation of the environment. The behaviors are designed hierarchically from the bottom up, with priority given to primitive behaviors so as to ensure the survivability of the robot and provide robustness against failures in higher-level behaviors. Fuzzy logic is used to perform command fusion on each behavior's output. Simulations of the proposed methodology are presented and discussed; the results indicate that complex tasks can be performed by combining a few simple behaviors with a set of fuzzy inference rules.
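The paper's behavior set and rule base are not reproduced here; the sketch below only illustrates fuzzy command fusion with two hypothetical behaviors (obstacle avoidance and goal seeking) whose steering suggestions are blended by fuzzy activation weights and defuzzified with a weighted average. Membership functions and gains are assumptions.

```python
# Illustrative sketch of fuzzy command fusion between two behaviors; priority
# goes to the low-level avoidance behavior when an obstacle is close.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return float(max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0))

def fuse_steering(obstacle_dist, goal_bearing):
    # activation (truth degree) of each behavior from sensor readings
    w_avoid = tri(obstacle_dist, 0.0, 0.0, 1.0)    # strong when an obstacle is near
    w_goal  = tri(obstacle_dist, 0.5, 2.0, 2.0)    # strong when the path is clear
    # each behavior's crisp steering command (radians)
    cmd_avoid = -np.sign(goal_bearing)             # turn away from the goal side
    cmd_goal  = np.clip(goal_bearing, -0.5, 0.5)   # turn toward the goal
    # weighted-average defuzzification of the fused command
    total = w_avoid + w_goal + 1e-9
    return (w_avoid * cmd_avoid + w_goal * cmd_goal) / total

print(fuse_steering(obstacle_dist=0.3, goal_bearing=0.4))   # avoidance dominates
print(fuse_steering(obstacle_dist=1.8, goal_bearing=0.4))   # goal seeking dominates
```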
knowledge discovery and data mining | 2006
Flavian Vasile; Adrian Silvescu; Dae-Ki Kang; Vasant G. Honavar
In many application domains, there is a need for learning algorithms that generate accurate as well as comprehensible classifiers. In this paper, we present TRIPPER, a rule induction algorithm that extends RIPPER, a widely used rule learner. TRIPPER exploits knowledge in the form of taxonomies over the values of the features used to describe data. We compare the performance of TRIPPER with that of RIPPER on benchmark datasets from the Reuters-21578 corpus, using WordNet (a human-generated taxonomy) to guide rule induction in TRIPPER. Our experiments show that the rules generated by TRIPPER are generally more comprehensible and compact, and in the large majority of cases at least as accurate as, those generated by RIPPER.
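A minimal sketch of the feature-generalization step the abstract alludes to: replacing word features with WordNet hypernyms before rule induction. It assumes NLTK with the WordNet corpus installed, and the rule learner itself (RIPPER/TRIPPER) is not shown; this is not the authors' procedure.

```python
# Illustrative sketch: generalize word features to WordNet hypernyms before
# handing them to a rule learner (the rule learner is omitted here).
# Requires: pip install nltk; then run nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def generalize(word, depth=1):
    """Replace a word by a lemma of its hypernym `depth` levels up, if any."""
    synsets = wn.synsets(word)
    if not synsets:
        return word
    syn = synsets[0]
    for _ in range(depth):
        hypernyms = syn.hypernyms()
        if not hypernyms:
            break
        syn = hypernyms[0]
    return syn.lemma_names()[0]

doc = "the dealer sold the sedan and the truck"
print(" ".join(generalize(w) for w in doc.split()))
# "sedan" and "truck" may both generalize toward coarser concepts such as
# car/motor_vehicle, giving the rule learner fewer, more comprehensible features.
```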