2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) | 2019
Single-cell RNAseq Imputation Based on Matrix Completion with Side Information
Abstract
Dropout events in single-cell RNA sequencing produce large numbers of zero values in gene expression matrices, and these zeros hinder accurate downstream analysis of single-cell RNAseq (scRNAseq) data. In this study, we propose a novel method, based on low-rank matrix completion, for estimating the zero values in scRNAseq data. The novelty of the method lies in its use of gene association information as side information. We observed that incorporating gene association information facilitates more accurate recovery of gene expression matrices. To further improve imputation accuracy, we additionally employ a statistical model that estimates the dropout probability of each gene, and we use these estimates to adjust the imputed gene expression matrices. We conducted extensive experiments on several datasets and compared the proposed method with three commonly used zero imputation methods in terms of accuracy in downstream clustering analysis. The results show that the proposed method achieves higher or comparable accuracy in zero imputation while requiring a shorter run time than the commonly used methods.
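The general idea of matrix completion with side information can be illustrated with a minimal sketch. This is a simplified stand-in, not the formulation used in this work: it assumes the expression matrix `X` (genes × cells) is approximately `A @ W`, where `A` is a hypothetical gene-feature matrix derived from gene association information and `W` is fitted by ridge regression on the nonzero entries of each cell. The function name `impute_with_side_info`, the parameter `lam`, and the choice to treat every zero as missing are assumptions made for illustration only.

```python
import numpy as np

def impute_with_side_info(X, A, lam=1e-2):
    """Fill zeros of X (genes x cells) using gene side features A (genes x d).

    Illustrative assumption: X is approximately A @ W, so the completed
    matrix has rank at most d. Every zero is treated as a missing value.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    observed = X != 0                 # nonzero entries are taken as observed
    d = A.shape[1]
    X_hat = X.copy()
    for j in range(X.shape[1]):       # fit one coefficient vector per cell
        rows = observed[:, j]
        if not rows.any():
            continue                  # nothing observed in this cell; leave as-is
        A_obs = A[rows]
        # Ridge regression restricted to the observed genes of cell j.
        w = np.linalg.solve(A_obs.T @ A_obs + lam * np.eye(d),
                            A_obs.T @ X[rows, j])
        pred = A @ w
        # Only zero entries are replaced; expression cannot be negative.
        X_hat[~rows, j] = np.maximum(pred[~rows], 0.0)
    return X_hat
```

Observed (nonzero) entries are left unchanged and only zeros are replaced. The sketch does not model the dropout probability of each gene, so it cannot distinguish true biological zeros from dropout events.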