Bioinformatics | 2019

CoMM: a collaborative mixed model to dissecting genetic contributions to complex traits by leveraging regulatory information

Abstract


MOTIVATION
Genome-wide association studies (GWASs) have identified many genetic variants associated with complex traits. However, the mechanistic links between these variants and complex traits remain elusive. One hypothesis is that genetic variants influence complex traits at the organismal level by affecting cellular traits, for example by regulating gene expression and altering protein abundance. Although earlier work has offered promising insights into this hypothesis, statistical methods that effectively harness multilayered data (e.g. genetic variants, cellular traits and organismal traits) on a large scale for functional and mechanistic exploration are in high demand.

RESULTS
In this study, we propose a collaborative mixed model (CoMM) to investigate the mechanistic role of associated variants in complex traits. The key idea builds on emerging evidence that genetic effects at the cellular level are much stronger than those at the organismal level. Briefly, CoMM combines two models: the first relates gene expression to genotype, and the second relates phenotype to the gene expression predicted by the first. The two models are fitted jointly in CoMM, so that the uncertainty in predicting gene expression is fully accounted for. To demonstrate the advantages of CoMM over existing methods, we conducted extensive simulation studies and applied CoMM to analyze 25 traits in the NFBC1966 and Genetic Epidemiology Research on Aging (GERA) studies by integrating transcriptome information from the Genetic European Variation in Health and Disease (GEUVADIS) Project. The results indicate that, by leveraging regulatory information, CoMM can effectively improve the power of prioritizing risk variants. Regarding computational efficiency, CoMM can complete the analysis of the NFBC1966 and GERA datasets in 2 and 18 min, respectively.

AVAILABILITY AND IMPLEMENTATION
The developed R package is available at https://github.com/gordonliu810822/CoMM.

SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
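
The two-model structure described under RESULTS can be illustrated with a small sketch. What follows is not the CoMM method or the API of its R package: it is the simpler two-stage variant, in which the expression model is fitted first and its point predictions are then plugged into the phenotype model. It therefore ignores the prediction uncertainty that CoMM's joint fit is designed to account for. All function names, sample sizes, effect sizes and the ridge penalty are assumptions chosen for illustration, and the data are simulated.

```python
import random

def ridge_fit(X, y, lam):
    """Solve (X'X + lam*I) w = X'y by Gaussian elimination (small p only)."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    for c in range(p):
        # Partial pivoting for numerical stability.
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0] * p
    for c in reversed(range(p)):
        w[c] = (b[c] - sum(A[c][k] * w[k] for k in range(c + 1, p))) / A[c][c]
    return w

def predict(X, w):
    """Linear predictor X w, one value per individual."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]

def slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_genotypes(n, p, rng):
    """Hypothetical genotypes coded as 0/1/2 allele counts."""
    return [[float(rng.choice((0, 1, 2))) for _ in range(p)] for _ in range(n)]

rng = random.Random(0)
p = 5
true_w = [0.8, -0.5, 0.0, 0.3, 0.0]   # assumed SNP effects on expression
alpha = 0.5                           # assumed expression effect on phenotype

# Reference panel with genotype and expression (small sample).
G_expr = simulate_genotypes(300, p, rng)
expr = [e + rng.gauss(0, 0.5) for e in predict(G_expr, true_w)]

# GWAS sample with genotype and phenotype only (larger sample).
G_gwas = simulate_genotypes(2000, p, rng)
pheno = [alpha * e + rng.gauss(0, 1.0) for e in predict(G_gwas, true_w)]

# Model 1: expression ~ genotype, fitted in the reference panel.
w_hat = ridge_fit(G_expr, expr, lam=1.0)

# Model 2: phenotype ~ predicted expression, fitted in the GWAS sample.
expr_hat = predict(G_gwas, w_hat)
alpha_hat = slope(expr_hat, pheno)
print("estimated expression-to-phenotype effect:", round(alpha_hat, 3))
```

In this sketch the estimated weights `w_hat` are treated as fixed when the phenotype model is fitted. The abstract's point is that CoMM instead fits both models together, so the imprecision of the expression model carries through to the inference on the phenotype.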

Volume 35, Issue 10
Pages 1644-1652
DOI 10.1093/bioinformatics/bty865
Language English
Journal Bioinformatics
