International Journal of Computer Applications | 2021

(ISSBM) Improved Synthetic Sampling based on Model for Imbalance Data

 
 

Abstract


In data mining research, imbalanced data is characterized by a severe difference in observation frequency between classes and has received considerable attention. Prediction performance usually deteriorates when classifiers learn from imbalanced data, because most classifiers assume that the class distribution is balanced or that the costs of different types of classification errors are equal. Although several methods have been proposed to deal with imbalance problems, it is still difficult to generalize those methods to achieve stable improvement in most cases. In this study, we propose a novel framework called Improved Synthetic Sampling Based on Model (ISSBM) to deal with imbalance problems, in which we integrate improved modeling and sampling techniques to generate synthetic data. The key idea behind the proposed method is to use regression models to capture the relationship between features and to consider data diversity in the process of data generation. We conduct experiments on many datasets and compare the proposed method with five methods. The experimental results indicate that the proposed method is not only competitive or comparable but also very stable. We also provide detailed analysis of the proposed method to empirically demonstrate why it could
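
The general idea of model-based synthetic sampling can be illustrated with a minimal sketch. It is not the ISSBM procedure itself, whose details the abstract does not give. It assumes that the feature-relationship models are simple least-squares lines, and that diversity comes from drawing one "anchor" feature value uniformly within its observed minority range; the names `fit_line` and `synthesize` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # constant predictor: fall back to the mean
    return slope, mean_y - slope * mean_x


def synthesize(minority, n_new, seed=0):
    """Generate n_new synthetic minority rows (illustrative assumption only).

    For each new row: pick an anchor feature, draw its value within the
    observed minority range, then predict every other feature from the
    anchor with a per-pair linear model.
    """
    rng = random.Random(seed)
    n_features = len(minority[0])
    cols = [[row[j] for row in minority] for j in range(n_features)]
    synthetic = []
    for _ in range(n_new):
        k = rng.randrange(n_features)                  # anchor feature index
        anchor = rng.uniform(min(cols[k]), max(cols[k]))
        row = []
        for j in range(n_features):
            if j == k:
                row.append(anchor)
            else:
                slope, intercept = fit_line(cols[k], cols[j])
                row.append(slope * anchor + intercept)
        synthetic.append(row)
    return synthetic


# Toy minority class in which the second feature equals 2 * first + 1.
minority = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
new_rows = synthesize(minority, n_new=5)
```

Because the toy features are exactly linearly related, every generated row stays on the same line while taking values that do not occur in the original samples. With noisy real data, the fitted models would only approximate the relationship between features.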

DOI 10.5120/ijca2021921342
Language English
Journal International Journal of Computer Applications
