Archive | 2019

Clustering High-Dimensional Data: A Reduction-Level Fusion of PCA and Random Projection

Abstract

Principal Component Analysis (PCA) is a well-known statistical tool for representing data in a lower-dimensional embedding. K-means is a prototype (centroid)-based clustering technique used in unsupervised learning tasks. Random Projection (RP) is another widely used dimensionality-reduction technique; it uses a projection matrix to project the data into a feature space. Here, we demonstrate the effectiveness of these methods by combining them to cluster both low- and high-dimensional data efficiently. Our proposed algorithms combine PCA with RP to project the data into a feature space and then perform K-means clustering in that reduced space. We compare the proposed algorithms' performance with the simple K-means and PCA-K-means algorithms on 12 benchmark datasets, of which 4 are low-dimensional and 8 are high-dimensional. Our proposed algorithms outperform the other methods.
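
The pipeline described in the abstract (reduce with PCA and RP, then run K-means in the reduced space) can be illustrated with a minimal sketch. The abstract does not say how PCA and RP are fused, so the version below simply assumes PCA followed by a Gaussian random projection; that ordering, the function names (`pca_reduce`, `random_project`, `kmeans`, `pca_rp_kmeans`), and the parameters `n_pca` and `n_rp` are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def random_project(X, n_components, rng):
    """Gaussian random projection; matrix entries are drawn from N(0, 1/n_components)."""
    scale = 1.0 / np.sqrt(n_components)
    R = rng.normal(0.0, scale, size=(X.shape[1], n_components))
    return X @ R


def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each row of X."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids


def pca_rp_kmeans(X, k, n_pca, n_rp, seed=0):
    """Assumed fusion order: PCA first, then random projection, then K-means."""
    rng = np.random.default_rng(seed)
    Z = pca_reduce(X, n_pca)
    Z = random_project(Z, n_rp, rng)
    return kmeans(Z, k, rng)
```

A call such as `labels, centroids = pca_rp_kmeans(X, k=3, n_pca=20, n_rp=10)` clusters the rows of a data matrix `X`. The two target dimensions are free parameters here, since the abstract gives no values for them.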

Pages 479-487

DOI 10.1007/978-981-13-1280-9_44

Language English
