Information Sciences | 2021

Novel three-way generative classifier with weighted scoring distribution

Abstract


The naive Bayes classifier (NBC) is a classical binary generative classifier that has been extensively researched and applied owing to its simplicity and high efficiency. In practice, however, its advantages are often undermined by the conditional independence assumption among attributes and by the zero-count problem. Moreover, the NBC strictly assigns a definite label to every object and has no mechanism for handling boundary objects, which may degrade classification performance. In contrast to binary classifiers, three-way classifiers provide a delayed decision for boundary objects. However, most existing three-way classifiers are based on rough sets and suffer from high time complexity and decision conflict. In this study, a novel three-way generative classifier with a weighted scoring distribution (3WGC-WSD) is proposed that combines the advantages of the NBC and of three-way classifiers to improve classification performance. First, a scoring function that exploits the advantages of parameter estimation in the NBC is defined to calculate the scores of an object under different classes. Second, a self-adaptive attribute weighting algorithm is designed to relax the attribute conditional independence assumption through attribute weighting and attribute reduction. Third, a non-parametric binary generative classifier with a weighted scoring function (2GC-WSF) is designed based on the scoring function and the attribute weighting algorithm. Finally, inspired by three-way decision, 2GC-WSF is extended to 3WGC-WSD, which improves classification performance by providing a delayed decision for boundary objects. Experiments and comparisons on 15 widely used UCI benchmark datasets demonstrate that 3WGC-WSD outperforms three state-of-the-art classifiers and three classical classifiers in terms of four evaluation indexes. Furthermore, the efficiency of 3WGC-WSD and 2GC-WSF is demonstrated in comparison with three classifiers on 10 datasets.
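To make the two ideas the abstract builds on more concrete, the following is a minimal sketch of a categorical naive Bayes classifier with Laplace smoothing (one common remedy for the zero-count problem) followed by a generic three-way decision rule that defers boundary objects. It is not the paper's 3WGC-WSD or 2GC-WSF: the scoring function and attribute weighting are not specified in the abstract and are not reproduced here. The function names (`train_nb`, `posterior`, `three_way_decide`), the smoothing constant, and the thresholds 0.7 and 0.3 are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with add-alpha (Laplace) smoothing.

    Hypothetical helper for illustration; not the paper's scoring function.
    """
    classes = sorted(set(labels))
    n_features = len(rows[0])
    class_counts = Counter(labels)
    # value_counts[c][j][v]: number of class-c rows whose j-th attribute equals v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in classes}
    domains = [set() for _ in range(n_features)]
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[y][j][v] += 1
            domains[j].add(v)
    return {
        "classes": classes,
        "class_counts": class_counts,
        "value_counts": value_counts,
        "domains": domains,
        "alpha": alpha,
        "total": len(labels),
    }


def posterior(model, row):
    """Return normalized class posteriors for one object (row of attribute values)."""
    alpha = model["alpha"]
    log_scores = {}
    for c in model["classes"]:
        n_c = model["class_counts"][c]
        score = math.log(n_c / model["total"])  # log prior
        for j, v in enumerate(row):
            # Smoothing keeps the estimate non-zero even if v never co-occurred with c.
            num = model["value_counts"][c][j][v] + alpha
            den = n_c + alpha * len(model["domains"][j])
            score += math.log(num / den)
        log_scores[c] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}


def three_way_decide(p_positive, accept=0.7, reject=0.3):
    """Generic three-way rule: accept, reject, or defer a boundary object.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if p_positive >= accept:
        return "accept"
    if p_positive <= reject:
        return "reject"
    return "defer"


# Toy data (invented for this sketch): two categorical attributes, two classes.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)

decisions = {
    obj: three_way_decide(posterior(model, obj)["yes"])
    for obj in [("rain", "cool"), ("sunny", "mild"), ("rain", "hot")]
}
```

In this sketch, an object whose positive-class posterior falls strictly between the two thresholds is neither accepted nor rejected, which is the delayed decision that distinguishes a three-way classifier from a binary one.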

Volume 579
Pages 732-750
DOI 10.1016/J.INS.2021.08.025
Language English
Journal Information Sciences
