Expert Syst. Appl. | 2019

A fast hybrid clustering technique based on local nearest neighbor using minimum spanning tree

 
 

Abstract


With the rapid explosion of information, clustering has emerged as an active research area for knowledge discovery. Most existing clustering algorithms become ineffective when they are given inappropriate parameters or are applied to a dataset whose clusters have diverse shapes, sizes, and densities. To overcome these issues, many graph-based hybrid clustering algorithms have been proposed, but these algorithms first generate a complete graph of the dataset, which takes O(N^2) time, where N is the number of data points; this limits their application to large datasets. This paper proposes a fast hybrid clustering technique based on local nearest neighbor using minimum spanning tree to reduce the computational overhead. In the first step, the algorithm partitions the dataset into a large number of sub-clusters based on the dispersion of data points, in order to capture the geometry of the clusters. After partitioning, a minimum spanning tree is constructed over the centroids of the sub-clusters to identify adjacent pairs. A novel merge method is proposed to find the genuine clusters by repeatedly merging adjacent sub-clusters. Two measures are introduced: cohesion, which captures the level of dispersion of data points with respect to the centers of an adjacent pair, and intra-similarity, which captures the average edge weight of a sub-cluster. The algorithm takes O(N^(3/2)) time, which is a √N factor improvement over popular hybrid clustering algorithms. Experimental analyses on both synthetic and gene expression datasets demonstrate that the proposed technique significantly improves on competing clustering algorithms in terms of execution time and cluster quality. Moreover, the proposed algorithm does not require any user-defined parameters and can estimate the number of clusters more accurately.
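The first two steps described in the abstract (partition into sub-clusters, then build a minimum spanning tree over their centroids) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the dispersion-based partitioning, so plain k-means stands in for it here. The choice of about √N sub-clusters is an assumption made only to show how the cost can stay near N^(3/2). The names `partition` and `centroid_mst` are hypothetical. The merge step based on cohesion and intra-similarity is omitted because the abstract does not define those measures precisely.

```python
import math
import random


def partition(points, k, iters=10, seed=0):
    """Stand-in partitioning step: plain k-means (Lloyd's algorithm).

    Assumption: the paper's dispersion-based splitting is NOT reproduced here.
    Each iteration costs O(N * k) distance evaluations.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group happens to be empty.
        centroids = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids, groups


def centroid_mst(centroids):
    """Prim's algorithm on the complete graph over the centroids, O(k^2).

    Returns a list of (parent, child, distance) edges; each edge marks a
    pair of adjacent sub-clusters that a merge step could examine.
    """
    k = len(centroids)
    in_tree = [False] * k
    best = [math.inf] * k   # cheapest known connection cost to the tree
    parent = [-1] * k
    best[0] = 0.0
    edges = []
    for _ in range(k):
        u = min((i for i in range(k) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(k):
            if not in_tree[v]:
                d = math.dist(centroids[u], centroids[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


# Toy data: two Gaussian blobs of 50 points each (illustrative only).
rng = random.Random(1)
points = [
    (rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
    for _ in range(50)
]
k = round(math.sqrt(len(points)))  # assumed sub-cluster count, about sqrt(N)
centroids, groups = partition(points, k)
adjacent_pairs = centroid_mst(centroids)
```

Under the √N assumption, each partitioning pass costs about N · √N distance evaluations and the tree over the centroids costs about (√N)^2 = N, so no step needs the complete N-by-N graph. A full method would then walk `adjacent_pairs` and decide which sub-clusters to merge.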

Volume 132
Pages 28-43
DOI 10.1016/J.ESWA.2019.04.048
Language English
Journal Expert Syst. Appl.
