Briefings in bioinformatics | 2019

Integrating regulatory features data for prediction of functional disease-associated SNPs


Abstract


Genome-wide association studies (GWASs) are an effective strategy for identifying susceptibility loci for human complex diseases. However, missing heritability remains a major problem. Most GWAS-identified single-nucleotide polymorphisms (SNPs) are located in noncoding regions, which have long been considered unexplored territory of the genome. Recently, data from the Encyclopedia of DNA Elements (ENCODE) and Roadmap Epigenomics projects have shown that many GWAS SNPs in noncoding regions fall within regulatory elements. In this study, we developed a pipeline named functional disease-associated SNPs prediction (FDSP) to identify novel susceptibility loci for complex diseases, using machine learning to interpret the functional features of known disease-associated variants. We applied the pipeline to predict novel susceptibility SNPs for type 2 diabetes (T2D) and hypertension. The predicted SNPs could explain heritability beyond that explained by GWAS-associated SNPs. Functional annotation by expression quantitative trait loci (eQTL) analyses showed that the target genes of the predicted SNPs were significantly enriched in T2D- or hypertension-related pathways in multiple tissues. Our results suggest that combining GWAS and regulatory feature data can identify additional functional susceptibility SNPs for complex diseases. We hope FDSP will help identify novel susceptibility loci for complex diseases and help address the missing heritability problem.
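The abstract does not spell out how the functional features are encoded or which classifier is used. As an illustration only, the sketch below shows one plausible first step under stated assumptions: each SNP becomes a binary vector recording whether its position falls within each regulatory track. The track names, coordinates, SNP identifiers, and the helpers `overlaps` and `feature_vector` are hypothetical and are not taken from FDSP.

```python
from bisect import bisect_right

# Hypothetical regulatory tracks (not real ENCODE/Roadmap data):
# feature name -> sorted, non-overlapping half-open [start, end) intervals
# on a single chromosome.
TRACKS = {
    "enhancer": [(100, 200), (500, 650)],
    "dnase_hs": [(150, 300)],
    "tf_binding": [(620, 640)],
}


def overlaps(intervals, pos):
    """Return True if pos lies inside any of the sorted intervals."""
    starts = [start for start, _ in intervals]
    # Index of the last interval whose start is <= pos, or -1 if none.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def feature_vector(pos, tracks=TRACKS):
    """Binary regulatory-feature vector, ordered by sorted track name."""
    return [int(overlaps(tracks[name], pos)) for name in sorted(tracks)]


# Hypothetical SNP positions used only to exercise the functions.
snps = {"snp_a": 160, "snp_b": 630, "snp_c": 400}
features = {snp: feature_vector(pos) for snp, pos in snps.items()}
```

Vectors of this kind, labeled with known disease-associated SNPs versus background SNPs, could then be passed to a supervised classifier; which model FDSP actually uses is not stated in the abstract.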

Volume 20, Issue 1
Pages 26-32
DOI 10.1093/bib/bbx094
Language English
Journal Briefings in bioinformatics
