Multimedia Systems | 2021

Few-shot imbalanced classification based on data augmentation

 
 

Abstract


Few-shot imbalanced classification tasks are common in real-world applications, where the data distribution is unbalanced and rare classes have few samples. Traditional machine learning algorithms are known to perform poorly on imbalanced classification: they tend to ignore the few minority-class samples in order to achieve good overall accuracy. To address this few-shot problem, this study proposes a novel data augmentation method, H-SMOTE, which rebalances the original imbalanced data in a stable and reasonable way. Extensive experiments were carried out on 12 open datasets with imbalance rates ranging from 3.8 to 16.4. Two typical classifiers, SVM and Random Forest, were selected to test the performance and generalization of H-SMOTE, and the standard oversampling algorithm SMOTE was adopted as the comparison baseline. Averaged over the experiments, H-SMOTE outperforms SMOTE in accuracy (by 2.58%), recall (0.67%), F-measure (2.33%), G-mean (2.58%), and AUC (2.5%). In addition, the distribution of the dataset augmented by H-SMOTE is more uniform and stable. This work thus provides a useful data augmentation method for few-shot imbalanced classification that can also be generalized to many areas of multimedia systems.
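To make the abstract's terms concrete, the following is a minimal Python sketch of the quantities it mentions. It does not implement H-SMOTE, whose details are not given in the abstract. It assumes the imbalance rate is the majority-class count divided by the minority-class count, that G-mean is the geometric mean of sensitivity and specificity, and that the baseline follows the classic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. The function names `imbalance_ratio`, `g_mean`, and `smote_like` are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def smote_like(minority, n_new, k=3, seed=0):
    """Classic SMOTE-style oversampling (the baseline idea, not H-SMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data: 38 majority samples and 10 minority samples.
labels = [0] * 38 + [1] * 10
always_majority = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_majority)) / len(labels)

minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=6)
```

On this toy data, a classifier that always predicts the majority class reaches an accuracy of about 0.79 while its G-mean is 0, which illustrates why the abstract reports G-mean and recall alongside accuracy.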

DOI 10.1007/S00530-021-00827-0
Language English
Journal Multimedia Systems
