Archive | 2019

Enhanced Classification Models for Iris Dataset

Abstract


Data mining and machine learning are both useful tools in data analysis. Classification is one of the most important techniques in data mining, so selecting suitable, efficient classification models matters when solving classification problems on the Iris data. With this goal, a decision tree induction algorithm, namely graftedTree, is proposed to build randomized decision trees. Randomization is explicitly introduced into the algorithm, so that applying it several times to the same training data yields diversified models. An ensemble classification model is then constructed from multiple randomized decision trees via majority voting. To compare the classification performance of different models, we propose using precision, recall, F-Measure, the area under the ROC curve (AUC), and the Gini coefficient as evaluation indexes on the Iris dataset. The experimental results show that the Random Forests model generally performs better than the Boosting Tree model and three other popular algorithms: KNN, SMO, and Simple Cart. However, the Gini coefficient of the Random Forests model shows that it yields a less pure training set than the other models. The new GraftedTrees model inherits the advantages of Random Forest and further employs a random mixture of two interchangeable node splitting rule inductions, with the aim of obtaining higher computational efficiency and better accuracy. Given these advantages, the new GraftedTrees model is expected to prove the most powerful classification model in the near future.
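
The abstract does not specify how graftedTree works, so the following is only a minimal sketch of the general ideas it names, not the authors' algorithm. It assumes that the two interchangeable splitting rules are Gini impurity and entropy, and it uses one-split trees (stumps) to stay short. All function names are hypothetical. It shows randomized tree induction, majority voting over the resulting ensemble, and per-class precision, recall, and F-Measure.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def majority_vote(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def fit_random_stump(X, y, rng):
    """One-split tree on a random feature with a randomly chosen rule.

    ASSUMPTION: Gini and entropy stand in for the paper's two splitting rules.
    """
    impurity = rng.choice([gini, entropy])
    f = rng.randrange(len(X[0]))
    best = None
    for t in sorted({row[f] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[f] <= t]
        right = [lab for row, lab in zip(X, y) if row[f] > t]
        if not left or not right:
            continue
        # Weighted impurity of the two children; lower is better.
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, t, majority_vote(left), majority_vote(right))
    if best is None:  # the chosen feature is constant, so no split is possible
        label = majority_vote(y)
        return lambda row: label
    _, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Repeated randomized induction on the same data gives diverse trees."""
    rng = random.Random(seed)
    return [fit_random_stump(X, y, rng) for _ in range(n_trees)]


def predict(ensemble, row):
    """Combine the individual tree predictions by majority voting."""
    return majority_vote([tree(row) for tree in ensemble])


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-Measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For a three-class problem such as Iris, `precision_recall_f1` would be called once per class and the results averaged. AUC is omitted here because it needs class scores rather than hard votes.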

Pages 946-954
DOI 10.1016/j.procs.2019.12.072
Language English
