bioRxiv | 2021

TIGAR-V2: Efficient TWAS Tool with Nonparametric Bayesian eQTL Weights of 49 Tissue Types from GTEx V8


Abstract


Standard Transcriptome-Wide Association Study (TWAS) methods first train gene expression prediction models using reference transcriptomic data, and then test the association between the predicted genetically regulated gene expression and the phenotype of interest. Most existing TWAS tools require cumbersome preparation of genotype input files and extra coding to enable parallel computation. To improve the efficiency of TWAS tools, we develop TIGAR-V2, which directly reads VCF files, enables parallel computation, and reduces computation cost by up to 90% compared to the original version. TIGAR-V2 can train gene expression imputation models using either nonparametric Bayesian Dirichlet Process Regression (DPR) or Elastic-Net (as used by PrediXcan), perform TWAS using either individual-level or summary-level GWAS data, and implement both burden and variance-component test statistics for inference. Using TIGAR-V2, we trained DPR gene expression prediction models for 49 tissues from GTEx V8 and illustrated the usefulness of these nonparametric Bayesian DPR eQTL weights through TWAS of breast and ovarian cancer using public GWAS summary statistics. We identified 88 and 37 risk genes for breast and ovarian cancer, respectively. Most of these are either known or near previously identified GWAS (∼95%) or TWAS (∼40%) risk genes of the corresponding phenotype, and three are novel independent TWAS risk genes with known functions in carcinogenesis. These findings suggest that TWAS can provide biological insight into the transcriptional regulation of complex diseases. The TIGAR-V2 tool, trained Bayesian cis-eQTL weights, and LD information from GTEx V8 are publicly available, providing a useful resource for mapping risk genes of complex diseases.
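
The second TWAS stage described in the abstract can be illustrated with a minimal Python sketch. It is not TIGAR-V2 code: the function names, the simulated data, and the specific burden formula are assumptions made for illustration. The summary-level statistic uses one commonly cited form (the S-PrediXcan-style burden Z-score, built from eQTL weights, single-SNP GWAS Z-scores, and a reference genotype covariance matrix); whether it matches TIGAR-V2's implementation in every detail is not confirmed by the abstract. The eQTL weights are taken as given here, standing in for weights that a trained DPR or Elastic-Net model would supply.

```python
import numpy as np


def predict_grex(genotypes, weights):
    """Predicted genetically regulated expression (GReX) = G @ w.

    genotypes: (n_samples, n_snps) dosage matrix.
    weights:   (n_snps,) eQTL weights from a trained model (assumed given).
    """
    return genotypes @ weights


def association_z_individual(grex, phenotype):
    """Individual-level test: t-statistic for the slope of phenotype ~ GReX."""
    x = grex - grex.mean()
    y = phenotype - phenotype.mean()
    beta = (x @ y) / (x @ x)                      # least-squares slope
    resid = y - beta * x
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (x @ x))
    return float(beta / se)


def burden_z_summary(weights, gwas_z, ld_cov):
    """Summary-level burden Z-score (S-PrediXcan-style form, assumed here).

    Z = sum_j w_j * sd_j * z_j / sqrt(w' V w), where V is the reference
    genotype covariance matrix and sd_j = sqrt(V_jj).
    """
    snp_sd = np.sqrt(np.diag(ld_cov))
    grex_var = weights @ ld_cov @ weights
    return float(np.sum(weights * snp_sd * gwas_z) / np.sqrt(grex_var))


# Simulated example data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
weights = np.array([0.5, -0.3, 0.2])
genotypes = rng.binomial(2, 0.3, size=(200, 3)).astype(float)
grex = predict_grex(genotypes, weights)
phenotype = grex + rng.normal(scale=0.1, size=200)

z_individual = association_z_individual(grex, phenotype)
z_summary = burden_z_summary(
    weights,
    gwas_z=np.array([2.0, -1.5, 1.0]),
    ld_cov=np.cov(genotypes, rowvar=False),
)
print(z_individual, z_summary)
```

With a single SNP the summary-level statistic reduces to that SNP's GWAS Z-score (for a positive weight), which is a quick sanity check on the formula.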

DOI 10.1101/2021.07.16.452700
Language English
Journal bioRxiv

Full Text