IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2019

A Mixed-Norm Laplacian Regularized Low-Rank Representation Method for Tumor Samples Clustering


Abstract


Clustering tumor samples from biomolecular data is an active topic in the discovery of cancer classes. Extracting valuable information from high-dimensional genomic data has become an urgent problem in tumor sample clustering. In this paper, we introduce manifold regularization into the low-rank representation model and present a novel method, Mixed-norm Laplacian regularized Low-Rank Representation (MLLRR), to identify differentially expressed genes for tumor clustering from gene expression data. To improve the accuracy and stability of tumor clustering, we then build a clustering model based on Penalized Matrix Decomposition (PMD) and propose a novel clustering method named MLLRR-PMD. The method has three steps. First, MLLRR decomposes the gene expression data matrix into a low-rank representation matrix and a sparse matrix. Second, the differentially expressed genes are identified from the sparse matrix. Finally, PMD is applied to cluster the samples based on the differentially expressed genes. Experimental results on simulated data and real genomic data show that MLLRR improves robustness to outliers and performs well in extracting differentially expressed genes.
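The three-step structure described above can be illustrated with a minimal Python sketch. It is not the MLLRR-PMD algorithm: step 1 uses a crude stand-in (a rank-one per-gene median baseline plus a soft-thresholded residual) in place of MLLRR, and step 3 uses the sign of the leading singular vector of a plain SVD in place of PMD. The function names, the `threshold` parameter, the genes-in-rows layout, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def split_baseline_sparse(X, threshold=1.0):
    """Step 1 stand-in (NOT MLLRR): rank-one median baseline + sparse residual."""
    baseline = np.median(X, axis=1, keepdims=True)   # one value per gene
    L = np.tile(baseline, (1, X.shape[1]))           # rank-one "low-rank" part
    R = X - L
    # Soft-thresholding zeroes small residuals, so S stays sparse.
    S = np.sign(R) * np.maximum(np.abs(R) - threshold, 0.0)
    return L, S

def select_genes(S, k):
    """Step 2: keep the k genes (rows) with the largest L1 norm in S."""
    scores = np.abs(S).sum(axis=1)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)

def cluster_samples(X_sel):
    """Step 3 stand-in (NOT PMD): two clusters from the sign of the
    leading right singular vector of the gene-centered data."""
    Xc = X_sel - X_sel.mean(axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return (Vt[0] > 0).astype(int)

# Hypothetical toy data: 50 genes x 20 samples; genes 0-4 are shifted
# upward in samples 10-19, all other genes only carry small noise.
rng = np.random.default_rng(0)
X = (np.arange(50) % 7)[:, None] + rng.uniform(-0.1, 0.1, size=(50, 20))
X[:5, 10:] += 5.0

L, S = split_baseline_sparse(X, threshold=1.0)
genes = select_genes(S, k=5)
labels = cluster_samples(X[genes])
print("selected genes:", genes)
print("sample labels:", labels)
```

On this toy input the sparse part is nonzero only for the shifted genes, so ranking rows of `S` recovers them, and the samples separate into two groups. Which group receives label 0 or 1 is arbitrary, because the sign of a singular vector is not determined.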

Volume 16
Pages 172-182
DOI 10.1109/TCBB.2017.2769647
Language English
Journal IEEE/ACM Transactions on Computational Biology and Bioinformatics
