IEEE Transactions on Pattern Analysis and Machine Intelligence | 2021

Coordinate Descent Method for k-means.


Abstract


The original k-means method, based on Lloyd's algorithm, partitions a data set by minimizing a sum-of-squares cost function and converges to a local minimum. It is widely used in data analysis and machine learning, where it shows promising performance. However, Lloyd's algorithm often converges to poor local minima. In this paper, we use a coordinate descent (CD) method to address this problem. First, we show that the k-means minimization problem can be reformulated as a trace maximization problem; we then propose a simple and very efficient coordinate descent scheme to solve it. The effectiveness of our method is illustrated on several real-world data sets with varying numbers of clusters, samples, and dimensions. Extensive experiments show that CD performs better than Lloyd's algorithm, i.e., it reaches lower objective values and better local minima. Moreover, the results show that CD is more robust to initialization than Lloyd's algorithm, whether the initialization strategy is random or k-means++. In addition, a computational complexity analysis verifies that CD has the same time complexity as the original k-means method.
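The abstract does not spell out the update rule, so the following is only a minimal sketch of one way such a scheme could look, not the authors' algorithm. It relies on the standard identity that the k-means sum of squared errors equals the total squared norm of the data minus the sum over clusters of ||s_c||^2 / n_c, where s_c is the sum of the points in cluster c and n_c is its size; minimizing the former is therefore the same as maximizing the latter, which is a trace-form objective. The function names (`trace_objective`, `sse`, `cd_kmeans`), the sweep limit, the tie tolerance, and the rule of never emptying a cluster are all assumptions made for illustration.

```python
import numpy as np

def trace_objective(X, labels, k):
    """Sum over non-empty clusters of ||sum of members||^2 / cluster size."""
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        if len(members) > 0:
            s = members.sum(axis=0)
            total += float(s @ s) / len(members)
    return total

def sse(X, labels, k):
    """Classical k-means cost: squared distances to the cluster means."""
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        if len(members) > 0:
            total += float(((members - members.mean(axis=0)) ** 2).sum())
    return total

def cd_kmeans(X, labels, k, max_sweeps=100):
    """Hypothetical coordinate-descent sketch: update one label at a time."""
    labels = labels.copy()
    sums = np.zeros((k, X.shape[1]))
    counts = np.zeros(k, dtype=int)
    for c in range(k):
        mask = labels == c
        sums[c] = X[mask].sum(axis=0)
        counts[c] = mask.sum()

    for _ in range(max_sweeps):
        changed = False
        for i in range(len(X)):
            x, old = X[i], labels[i]
            if counts[old] == 1:
                continue  # assumption: never empty a cluster
            # Temporarily take x out of its current cluster.
            sums[old] -= x
            counts[old] -= 1
            # Change in each cluster's objective term if x were added to it.
            without = (sums ** 2).sum(axis=1) / np.maximum(counts, 1)
            with_x = ((sums + x) ** 2).sum(axis=1) / (counts + 1)
            gain = with_x - without
            new = int(np.argmax(gain))
            if gain[new] <= gain[old] + 1e-12:
                new = old  # stay put unless the move is a strict improvement
            sums[new] += x
            counts[new] += 1
            if new != old:
                labels[i] = new
                changed = True
        if not changed:
            break  # a full sweep with no moves: coordinate-wise optimum
    return labels
```

Because every accepted move strictly increases the trace-form objective, the sum-of-squares cost never goes up during a sweep. Keeping running cluster sums and counts means each single-sample update costs on the order of k times the dimensionality, so under these assumptions one sweep is of the same order as one Lloyd iteration.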

Volume PP
DOI 10.1109/TPAMI.2021.3085739
Language English
Journal IEEE Transactions on Pattern Analysis and Machine Intelligence
