Joe R. Davis
Stanford University
Publications
Featured research published by Joe R. Davis.
Nature | 2017
Xin Li; Yungil Kim; Emily K. Tsang; Joe R. Davis; Farhan N. Damani; Colby Chiang; Gaelen T. Hess; Zachary Zappala; Benjamin J. Strober; Alexandra J. Scott; Amy Li; Andrea Ganna; Michael C. Bassik; Jason D. Merker; Ira M. Hall; Alexis Battle; Stephen B. Montgomery
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
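The outlier definition in the abstract above can be made concrete with a small sketch. The cutoff of |median z| >= 2, the data layout, and the function names are assumptions chosen for illustration; the abstract does not state the threshold or normalization that was used.

```python
from statistics import mean, median, pstdev


def z_scores(values_by_individual):
    """Standardize one gene's expression within a single tissue."""
    values = list(values_by_individual.values())
    mu, sd = mean(values), pstdev(values)  # assumes sd > 0
    return {ind: (v - mu) / sd for ind, v in values_by_individual.items()}


def multi_tissue_outliers(expression, threshold=2.0):
    """Return {individual: median z} for individuals whose median z-score
    across tissues has absolute value >= threshold (hypothetical cutoff).

    expression[tissue][individual] is the expression of one gene.
    """
    per_individual = {}
    for tissue_values in expression.values():
        for ind, z in z_scores(tissue_values).items():
            per_individual.setdefault(ind, []).append(z)
    medians = {ind: median(zs) for ind, zs in per_individual.items()}
    return {ind: m for ind, m in medians.items() if abs(m) >= threshold}


# Toy data: ten individuals, two tissues; "a" is overexpressed in both.
baseline = [10, 11, 9, 10, 11, 9, 10, 11, 9]
others = "bcdefghij"
expression = {
    "blood": {"a": 20, **dict(zip(others, baseline))},
    "muscle": {"a": 40, **dict(zip(others, [2 * b for b in baseline]))},
}
outliers = multi_tissue_outliers(expression)
```

Taking the median across tissues means a single noisy tissue cannot flag an individual on its own, which is one plausible way to read "across tissues" in the abstract.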
Nature Genetics | 2017
Colby Chiang; Alexandra J. Scott; Joe R. Davis; Emily K. Tsang; Xin Li; Yungil Kim; Tarik Hadzic; Farhan N. Damani; Liron Ganel; Stephen B. Montgomery; Alexis Battle; Donald F. Conrad; Ira M. Hall
Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5–6.8% of eQTLs—a substantially higher fraction than prior estimates—and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.
Nature Methods | 2017
David Knowles; Joe R. Davis; Hilary Edgington; Anil Raj; Marie-Julie Favé; Xiaowei Zhu; James B. Potash; Myrna M. Weissman; Jianxin Shi; Douglas F. Levinson; Stephen B. Montgomery; Alexis Battle
Identifying interactions between genetics and the environment (GxE) remains challenging. We have developed EAGLE, a hierarchical Bayesian model for identifying GxE interactions based on associations between environmental variables and allele-specific expression. Combining whole-blood RNA-seq with extensive environmental annotations collected from 922 human individuals, we identified 35 GxE interactions, compared with only four using standard GxE interaction testing. EAGLE provides new opportunities for researchers to identify GxE interactions using functional genomic data.
Genome Research | 2016
Kimberly R. Kukurba; Princy Parsana; Brunilda Balliu; Kevin S. Smith; Zachary Zappala; David A. Knowles; Marie-Julie Favé; Joe R. Davis; Xin Li; Xiaowei Zhu; James B. Potash; Myrna M. Weissman; Jianxin Shi; Anshul Kundaje; Douglas F. Levinson; Alexis Battle; Stephen B. Montgomery
The X Chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. Improving our understanding of these differences offers to elucidate the molecular mechanisms underlying sex-specific traits and diseases. However, to date, most studies have either ignored the X Chromosome or had insufficient power to test for the sex-specific impact of genetic variation. By analyzing whole blood transcriptomes of 922 individuals, we have conducted the first large-scale, genome-wide analysis of the impact of both sex and genetic variation on patterns of gene expression, including comparison between the X Chromosome and autosomes. We identified a depletion of expression quantitative trait loci (eQTL) on the X Chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X Chromosome. To resolve the molecular mechanisms underlying such effects, we generated chromatin accessibility data through ATAC-sequencing to connect sex-specific chromatin accessibility to sex-specific patterns of expression and regulatory variation. As sex-specific regulatory variants discovered in our study can inform sex differences in heritable disease prevalence, we integrated our data with genome-wide association study data for multiple immune traits identifying several traits with significant sex biases in genetic susceptibilities. Together, our study provides genome-wide insight into how genetic variation, the X Chromosome, and sex shape human gene regulation and disease.
American Journal of Human Genetics | 2016
Joe R. Davis; Laure Frésard; David Knowles; Mauro Pala; Carlos Bustamante; Alexis Battle; Stephen B. Montgomery
Methods for multiple-testing correction in local expression quantitative trait locus (cis-eQTL) studies are a trade-off between statistical power and computational efficiency. Bonferroni correction, though computationally trivial, is overly conservative and fails to account for linkage disequilibrium between variants. Permutation-based methods are more powerful, though computationally far more intensive. We present an alternative correction method called eigenMT, which runs over 500 times faster than permutations and has adjusted p values that closely approximate empirical ones. To achieve this speed while also maintaining the accuracy of permutation-based methods, we estimate the effective number of independent variants tested for association with a particular gene, termed Meff, by using the eigenvalue decomposition of the genotype correlation matrix. We employ a regularized estimator of the correlation matrix to ensure Meff is robust and yields adjusted p values that closely approximate p values from permutations. Finally, using a common genotype matrix, we show that eigenMT can be applied with even greater efficiency to studies across tissues or conditions. Our method provides a simpler, more efficient approach to multiple-testing correction than existing methods and fits within existing pipelines for eQTL discovery.
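The Meff idea described in the abstract above can be illustrated with a minimal sketch. It is not the published eigenMT implementation: the 99% variance cutoff, the fixed shrinkage weight toward the identity matrix, the pure-Python Jacobi eigenvalue routine, and all function names are assumptions made for illustration.

```python
import math


def correlation_matrix(genotypes):
    """Pearson correlation between variants.

    genotypes[v][i] is the allele dosage of variant v in individual i.
    Assumes every variant is polymorphic (non-zero variance).
    """
    n = len(genotypes[0])
    scaled = []
    for row in genotypes:
        mu = sum(row) / n
        dev = [x - mu for x in row]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in scaled] for u in scaled]


def shrink(corr, lam):
    """Regularize by shrinking toward the identity (lam is a hypothetical weight)."""
    m = len(corr)
    return [[(1 - lam) * corr[i][j] + (lam if i == j else 0.0) for j in range(m)]
            for i in range(m)]


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def effective_tests(eigenvalues, var_explained=0.99):
    """Smallest number of leading eigenvalues reaching the variance cutoff."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(eigenvalues, start=1):
        running += value
        if running >= var_explained * total:
            return count
    return len(eigenvalues)


def adjusted_p(min_p, meff):
    """Bonferroni-style correction using Meff instead of the raw variant count."""
    return min(1.0, min_p * meff)


genotypes = [
    [0, 1, 2, 0, 1, 2],  # variant 1
    [0, 1, 2, 0, 1, 2],  # variant 2, in perfect LD with variant 1
    [1, 0, 1, 1, 0, 1],  # variant 3, uncorrelated with the others
]
corr = shrink(correlation_matrix(genotypes), lam=0.01)
meff = effective_tests(symmetric_eigenvalues(corr))
```

In this toy example two of the three variants are perfectly correlated, so the correction multiplies the smallest p value by fewer tests than plain Bonferroni would. In a multi-tissue study that shares one genotype matrix, Meff would only need to be computed once per gene and could be reused across tissues.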
bioRxiv | 2016
François Aguet; Andrew Anand Brown; Stephane E. Castel; Joe R. Davis; Pejman Mohammadi; Ayellet V. Segrè; Zachary Zappala; Nathan S. Abell; Laure Frésard; Eric R. Gamazon; Ellen T. Gelfand; Michael J. Gloudemans; Yuan He; Farhad Hormozdiari; Xiao Li; Xin Li; Boxiang Liu; Diego Garrido-Martín; Halit Ongen; John Palowitch; YoSon Park; Christine B. Peterson; Gerald Quon; Stephan Ripke; Andrey A. Shabalin; Tyler C. Shimko; Benjamin J. Strober; Timothy J. Sullivan; Nicole A. Teran; Emily K. Tsang
Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.
American Journal of Epidemiology | 2017
Marylyn D. Ritchie; Joe R. Davis; Hugues Aschard; Alexis Battle; David V. Conti; Mengmeng Du; Eleazar Eskin; M. Daniele Fallin; Li Hsu; Peter Kraft; Jason H. Moore; Brandon L. Pierce; Stephanie Bien; Duncan C. Thomas; Peng Wei; Stephen B. Montgomery
A growing knowledge base of genetic and environmental information has greatly enabled the study of disease risk factors. However, the computational complexity and statistical burden of testing all variants by all environments has required novel study designs and hypothesis-driven approaches. We discuss how incorporating biological knowledge from model organisms, functional genomics, and integrative approaches can empower the discovery of novel gene-environment interactions and discuss specific methodological considerations with each approach. We consider specific examples where the application of these approaches has uncovered effects of gene-environment interactions relevant to drug response and immunity, and we highlight how such improvements enable a greater understanding of the pathogenesis of disease and the realization of precision medicine.
Nature Genetics | 2017
Mauro Pala; Zachary Zappala; Mara Marongiu; Xin Li; Joe R. Davis; Roberto Cusano; Francesca Crobu; Kimberly R. Kukurba; Michael J. Gloudemans; Frederic Reinier; Riccardo Berutti; Maria Grazia Piras; Antonella Mulas; Magdalena Zoledziewska; Michele Marongiu; Elena P. Sorokin; Gaelen T. Hess; Kevin S. Smith; Fabio Busonero; Andrea Maschio; Maristella Steri; Carlo Sidore; Serena Sanna; Edoardo Fiorillo; Michael C. Bassik; Stephen Sawcer; Alexis Battle; John Novembre; Chris Jones; Andrea Angius
Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.
Bioinformatics | 2017
Nilah M. Ioannidis; Joe R. Davis; Marianne K. DeGorter; Nicholas B. Larson; Shannon K. McDonnell; Amy J. French; Alexis Battle; Trevor Hastie; Stephen N. Thibodeau; Stephen B. Montgomery; Carlos Bustamante; Weiva Sieh; Alice S. Whittemore
Motivation: Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole‐genome sequencing studies. Results: We developed FIRE (Functional Inference of Regulators of Expression), a tool to score both noncoding and coding SNVs based on their potential to regulate the expression levels of nearby genes. FIRE consists of 23 random forests trained to recognize SNVs in cis‐expression quantitative trait loci (cis‐eQTLs) using a set of 92 genomic annotations as predictive features. FIRE scores discriminate cis‐eQTL SNVs from non‐eQTL SNVs in the training set with a cross‐validated area under the receiver operating characteristic curve (AUC) of 0.807, and discriminate cis‐eQTL SNVs shared across six populations of different ancestry from non‐eQTL SNVs with an AUC of 0.939. FIRE scores are also predictive of cis‐eQTL SNVs across a variety of tissue types. Availability and implementation: FIRE scores for genome‐wide SNVs in hg19/GRCh37 are available for download at https://sites.google.com/site/fireregulatoryvariation/. Supplementary information: Supplementary data are available at Bioinformatics online.
bioRxiv | 2016
Mauro Pala; Zachary Zappala; Mara Marongiu; Xin Li; Joe R. Davis; Roberto Cusano; Francesca Crobu; Kimberly R. Kukurba; Frederic Reinier; Riccardo Berutti; Maria Grazia Piras; Antonella Mulas; Magdalena Zoledziewska; Michele Marongiu; Fabio Busonero; Andrea Maschio; Maristella Steri; Carlo Sidore; Serena Sanna; Edoardo Fiorillo; Alexis Battle; John Novembre; Chris Jones; Andrea Angius; Gonçalo R. Abecasis; David Schlessinger; Francesco Cucca; Stephen B. Montgomery
Identifying functional non-coding variants can enhance genome interpretation and inform novel genetic risk factors. We used whole genomes and peripheral white blood cell transcriptomes from 624 Sardinian individuals to identify non-coding variants that contribute to population, family, and individual differences in transcript abundance. We identified 21,183 independent expression quantitative trait loci (eQTLs) and 6,768 independent splicing quantitative trait loci (sQTLs), influencing 73% and 41% of all tested genes, respectively. When we compared Sardinian eQTLs to those previously identified in Europe, we identified differentiated eQTLs at genes involved in malarial resistance and multiple sclerosis, reflecting the long-term epidemiological history of the island’s population. Taking advantage of pedigree data for the population sample, we identified segregating patterns of outlier gene expression and allelic imbalance in 61 Sardinian trios. We identified 809 expression outliers (median z-score of 2.97), averaging 13.3 genes with outlier expression per individual. We then connected these outlier expression events to rare non-coding variants. Our results provide new insight into the effects of non-coding variants and their relationship to population history, traits and individual genetic risk.