2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS) | 2021

Scattering-based Quality Measures

 

Abstract


Clustering algorithms differ in their settings, parameters, and initializations, and therefore generally produce different clustering solutions. It is thus essential to compare and evaluate the clustering results and to select the method that best fits the “actual” data distribution. This can be achieved by using informative quality metrics that reflect the “goodness” of the resulting solutions compared to the ground truth. Several extrinsic validation metrics have been proposed in the literature, including F-measure, Entropy, Rand Index, and Purity. However, there is a gap in the literature on evaluating the level of divergence between multiple clusterings in an aggregate, especially in consensus clustering. In this paper, we propose three scattering measures that calculate the divergence level (i.e., scattering level) between the solutions of two or more clustering algorithms: Scatter F-score, Scatter Entropy, and Scatter Purity. These measures are variants of the traditional F-measure, Entropy, and Purity quality measures, and they are used as pre-assessment criteria for deciding which clustering algorithms to combine in an aggregate. Experimental results on artificial, real, and text datasets show that the scattering measures play an important role in enhancing the clustering quality in consensus clustering and increasing the feasibility of the consensus.
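As a rough illustration of the traditional measures that the abstract builds on, the sketch below computes the standard Purity and Entropy of one clustering with respect to another labeling of the same points. It is not the paper's Scatter Purity or Scatter Entropy, whose definitions are not given in the abstract; the function names (`contingency`, `purity`, `entropy`) and the choice to treat the second labeling as the reference are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def contingency(labels_a, labels_b):
    """Count points in each (cluster of A, cluster of B) pair."""
    table = {}
    for a, b in zip(labels_a, labels_b):
        table.setdefault(a, Counter())[b] += 1
    return table


def purity(labels_a, labels_b):
    """Traditional purity of clustering A, using B as the reference labeling.

    Each cluster of A is credited with its most frequent B label.
    """
    table = contingency(labels_a, labels_b)
    n = len(labels_a)
    return sum(max(row.values()) for row in table.values()) / n


def entropy(labels_a, labels_b):
    """Traditional size-weighted entropy of clustering A with respect to B.

    0.0 means every cluster of A contains a single B label.
    """
    table = contingency(labels_a, labels_b)
    n = len(labels_a)
    total = 0.0
    for row in table.values():
        size = sum(row.values())
        h = -sum((c / size) * log2(c / size) for c in row.values())
        total += (size / n) * h
    return total


# Hypothetical label vectors for four points.
first = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same partition as `first`, different label names
crossed = [0, 1, 0, 1]     # splits every cluster of `first` in half

print(purity(first, relabeled), entropy(first, relabeled))
print(purity(first, crossed), entropy(first, crossed))
```

In this toy case the two identical partitions give maximal purity and zero entropy, while the crossed partition gives lower purity and higher entropy, which is the kind of disagreement a divergence measure between clusterings would need to capture.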

Pages 1-8
DOI 10.1109/IEMTRONICS52119.2021.9422563
Language English
