F1000Research | 2019

Fast and accurate differential transcript usage by testing equivalence class counts.

 
 
 

Abstract


Background: RNA sequencing has enabled high-throughput and fine-grained quantitative analyses of the transcriptome. While differential gene expression is the most widely used application of this technology, RNA-seq data also has the resolution to infer differential transcript usage (DTU), which can elucidate the role of different transcript isoforms between experimental conditions, cell types or tissues. DTU has typically been inferred from exon-count data, which has issues with assigning reads unambiguously to counting bins, and requires alignment of reads to the genome. Recently, approaches have emerged that use transcript quantifications estimates directly for DTU. Transcript counts can be inferred from pseudo or lightweight aligners, which are significantly faster than traditional genome alignment. However, recent evaluations show lower sensitivity in DTU analysis. Transcript abundances are estimated from equivalence classes (ECs), which determine the transcripts that any given read is compatible with. Recent work has proposed performing differential expression testing directly on equivalence class read counts (ECs). Methods: Here we demonstrate that ECs can be used effectively with existing count-based methods for detecting DTU. We evaluate this approach on simulated human and drosophila data, as well as on a real dataset through subset testing. Results: We find that ECs counts have similar sensitivity and false discovery rates as exon-level counts but can be generated in a fraction of the time through the use of pseudo-aligners. Conclusions: We posit that equivalence class read counts are a natural unit on which to perform many types of analysis.

Volume 8
Pages \n 265\n
DOI 10.12688/f1000research.18276.1
Language English
Journal F1000Research

Full Text