IEEE Access | 2019

Indexing and Search of Order-Preserving Submatrix for Gene Expression Data


Abstract


Bicluster pattern discovery plays a key role in the analysis of gene expression data. One important bicluster mining model is the Order-Preserving SubMatrix (OPSM), which captures a subset of genes whose expression values follow the same trend across a subset of conditions. Most OPSM discovery methods are batch mining techniques and are not suited to low-latency queries. To make data analysis efficient and effective, in this paper we first propose a prefix-tree based indexing method, pfTree, and then give an optimization technique, pIndex, that employs row and column header tables to search for positive, negative, and time-delayed OPSMs. We also present an online sharing query technique to accelerate frequent searches. Finally, we conduct extensive experiments and compare our methods with existing approaches. Experimental results demonstrate the efficiency and effectiveness of the proposed methods.
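
To make the OPSM notion concrete, the following minimal sketch checks whether a chosen set of rows (genes) is strictly increasing along a chosen ordering of columns (conditions), which is the usual definition of a positive OPSM. It is not the pfTree or pIndex method from the paper; the function names `column_order` and `is_opsm` and the toy matrix `expr` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def column_order(row: Sequence[float]) -> Tuple[int, ...]:
    """Return the column indices of `row` sorted by ascending value."""
    return tuple(sorted(range(len(row)), key=lambda j: row[j]))


def is_opsm(matrix: List[List[float]],
            rows: Sequence[int],
            cols: Sequence[int]) -> bool:
    """True if every selected row is strictly increasing along `cols`.

    `cols` is read as an ordered sequence of column indices, so the same
    set of columns in a different order is a different candidate pattern.
    """
    return all(
        all(matrix[i][a] < matrix[i][b] for a, b in zip(cols, cols[1:]))
        for i in rows
    )


# Hypothetical toy data: 3 genes (rows) measured under 4 conditions (columns).
expr = [
    [0.2, 1.5, 0.9, 3.1],
    [1.0, 2.4, 1.7, 5.0],
    [4.0, 0.5, 2.2, 1.1],
]

# Genes 0 and 1 induce the same column ordering, so together with that
# ordering they form an order-preserving submatrix; gene 2 does not fit.
shared = column_order(expr[0])
pair_ok = is_opsm(expr, [0, 1], shared)
all_ok = is_opsm(expr, [0, 1, 2], shared)
```

An index such as a prefix tree could, under this view, store each row's column ordering so that rows sharing a prefix are grouped together; how pfTree actually does this is described in the full paper, not here.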

Volume 7
Pages 184769-184785
DOI 10.1109/ACCESS.2019.2960856
Language English
Journal IEEE Access
