Publication


Featured research published by Gabriel E. Hoffman.


PLOS ONE | 2012

Four loci explain 83% of size variation in the horse.

Shokouh Makvandi-Nejad; Gabriel E. Hoffman; Jeremy J. Allen; Erin Chu; Esther Gu; Alyssa Chandler; Ariel I. Loredo; Rebecca R. Bellone; Jason G. Mezey; Samantha A. Brooks; Nathan B. Sutter

Horse body size varies greatly due to intense selection within each breed. American Miniatures are less than one meter tall at the withers while Shires and Percherons can exceed two meters. The genetic basis for this variation is not known. We hypothesize that the breed population structure of the horse should simplify efforts to identify genes controlling size. In support of this, here we show with genome-wide association scans (GWAS) that genetic variation at just four loci can explain the great majority of horse size variation. Unlike humans, which are naturally reproducing and possess many genetic variants with weak effects on size, we show that horses, like other domestic mammals, carry just a small number of size loci with alleles of large effect. Furthermore, three of our horse size loci contain the LCORL, HMGA2 and ZFAT genes that have previously been found to control human height. The LCORL/NCAPG locus is also implicated in cattle growth and HMGA2 is associated with dog size. Extreme size diversification is a hallmark of domestication. Our results in the horse, complemented by the prior work in cattle and dog, serve to pinpoint those very few genes that have played major roles in the rapid evolution of size during domestication.


PLOS Genetics | 2012

Patterns of ancestry, signatures of natural selection, and genetic association with stature in Western African pygmies.

Joseph P. Jarvis; Laura B. Scheinfeldt; Sameer Soi; Charla Lambert; Larsson Omberg; Bart Ferwerda; Alain Froment; Jean-Marie Bodo; William Beggs; Gabriel E. Hoffman; Jason G. Mezey; Sarah A. Tishkoff

African Pygmy groups show a distinctive pattern of phenotypic variation, including short stature, which is thought to reflect past adaptation to a tropical environment. Here, we analyze Illumina 1M SNP array data in three Western Pygmy populations from Cameroon and three neighboring Bantu-speaking agricultural populations with whom they have admixed. We infer genome-wide ancestry, scan for signals of positive selection, and perform targeted genetic association with measured height variation. We identify multiple regions throughout the genome that may have played a role in adaptive evolution, many of which contain loci with roles in growth hormone, insulin, and insulin-like growth factor signaling pathways, as well as immunity and neuroendocrine signaling involved in reproduction and metabolism. The most striking results are found on chromosome 3, which harbors a cluster of selection and association signals between approximately 45 and 60 Mb. This region also includes the positional candidate genes DOCK3, which is known to be associated with height variation in Europeans, and CISH, a negative regulator of cytokine signaling known to inhibit growth hormone-stimulated STAT5 signaling. Finally, pathway analysis for genes near the strongest signals of association with height indicates enrichment for loci involved in insulin and insulin-like growth factor signaling.


BMC Bioinformatics | 2010

A variational Bayes algorithm for fast and accurate multiple locus genome-wide association analysis

Benjamin A. Logsdon; Gabriel E. Hoffman; Jason G. Mezey

Background: The success achieved by genome-wide association (GWA) studies in the identification of candidate loci for complex diseases has been accompanied by an inability to explain the bulk of heritability. Here, we describe the algorithm V-Bay, a variational Bayes algorithm for multiple locus GWA analysis, which is designed to identify weaker associations that may contribute to this missing heritability. Results: V-Bay provides a novel solution to the computational scaling constraints of most multiple locus methods and can complete a simultaneous analysis of a million genetic markers in a few hours, when using a desktop. Using a range of simulated genetic and GWA experimental scenarios, we demonstrate that V-Bay is highly accurate, and reliably identifies associations that are too weak to be discovered by single-marker testing approaches. V-Bay can also outperform a multiple locus analysis method based on the lasso, which has similar scaling properties for large numbers of genetic markers. For demonstration purposes, we also use V-Bay to confirm associations with gene expression in cell lines derived from the Phase II individuals of HapMap. Conclusions: V-Bay is a versatile, fast, and accurate multiple locus GWA analysis tool for the practitioner interested in identifying weaker associations without high false positive rates.


Gastroenterology | 2016

Variants in TRIM22 that affect NOD2 signaling are associated with very early onset inflammatory bowel disease

Qi Li; Cheng Hiang Lee; Lauren A. Peters; Lucas A. Mastropaolo; Cornelia Thoeni; Abdul Elkadri; Tobias Schwerd; Jun Zhu; Bin Zhang; Yongzhong Zhao; Ke Hao; Antonio Dinarzo; Gabriel E. Hoffman; Brian A. Kidd; Ryan Murchie; Ziad Al Adham; Conghui Guo; Daniel Kotlarz; Ernest Cutz; Thomas D. Walters; Dror S. Shouval; Mark E. Curran; Radu Dobrin; Carrie Brodmerkel; Scott B. Snapper; Christoph Klein; John H. Brumell; Mingjing Hu; Ralph Nanan; Brigitte Santner-Nanan

BACKGROUND & AIMS: Severe forms of inflammatory bowel disease (IBD) that develop in very young children can be caused by variants in a single gene. We performed whole-exome sequence (WES) analysis to identify genetic factors that might cause granulomatous colitis and severe perianal disease, with recurrent bacterial and viral infections, in an infant of consanguineous parents. METHODS: We performed targeted WES analysis of DNA collected from the patient and her parents. We validated our findings by a similar analysis of DNA from 150 patients with very-early-onset IBD not associated with known genetic factors analyzed in Toronto, Oxford, and Munich. We compared gene expression signatures in inflamed vs noninflamed intestinal and rectal tissues collected from patients with treatment-resistant Crohn's disease who participated in a trial of ustekinumab. We performed functional studies of identified variants in primary cells from patients and cell culture. RESULTS: We identified a homozygous variant in the tripartite motif containing 22 gene (TRIM22) of the patient, as well as in 2 patients with a similar disease phenotype. Functional studies showed that the variant disrupted the ability of TRIM22 to regulate nucleotide binding oligomerization domain containing 2 (NOD2)-dependent activation of interferon-beta signaling and nuclear factor-κB. Computational studies demonstrated a correlation between the TRIM22-NOD2 network and signaling pathways and genetic factors associated with very-early-onset and adult-onset IBD. TRIM22 is also associated with antiviral and mycobacterial effectors and markers of inflammation, such as fecal calprotectin, C-reactive protein, and Crohn's disease activity index scores. CONCLUSIONS: In WES and targeted exome sequence analyses of an infant with severe IBD characterized by granulomatous colitis and severe perianal disease, we identified a homozygous variant of TRIM22 that affects the ability of its product to regulate NOD2. Combined computational and functional studies showed that the TRIM22-NOD2 network regulates antiviral and antibacterial signaling pathways that contribute to inflammation. Further study of this network could lead to new disease markers and therapeutic targets for patients with very-early-onset and adult-onset IBD.


PLOS ONE | 2013

Correcting for population structure and kinship using the linear mixed model: theory and extensions.

Gabriel E. Hoffman

Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.


BMC Evolutionary Biology | 2011

Evolution of light-harvesting complex proteins from Chl c-containing algae

Gabriel E. Hoffman; M. Virginia Sanchez-Puerta; Charles F. Delwiche

Background: Light harvesting complex (LHC) proteins function in photosynthesis by binding chlorophyll (Chl) and carotenoid molecules that absorb light and transfer the energy to the reaction center Chl of the photosystem. Most research has focused on LHCs of plants and chlorophytes that bind Chl a and b and extensive work on these proteins has uncovered a diversity of biochemical functions, expression patterns and amino acid sequences. We focus here on a less-studied family of LHCs that typically bind Chl a and c, and that are widely distributed in Chl c-containing and other algae. Previous phylogenetic analyses of these proteins suggested that individual algal lineages possess proteins from one or two subfamilies, and that most subfamilies are characteristic of a particular algal lineage, but genome-scale datasets had revealed that some species have multiple different forms of the gene. Such observations also suggested that there might have been an important influence of endosymbiosis in the evolution of LHCs. Results: We reconstruct a phylogeny of LHCs from Chl c-containing algae and related lineages using data from recent sequencing projects to give ~10-fold larger taxon sampling than previous studies. The phylogeny indicates that individual taxa possess proteins from multiple LHC subfamilies and that several LHC subfamilies are found in distantly related algal lineages. This phylogenetic pattern implies functional differentiation of the gene families, a hypothesis that is consistent with data on gene expression, carotenoid binding and physical associations with other LHCs. In all probability LHCs have undergone a complex history of evolution of function, gene transfer, and lineage-specific diversification. Conclusion: The analysis provides a strikingly different picture of LHC diversity than previous analyses of LHC evolution. Individual algal lineages possess proteins from multiple LHC subfamilies. Evolutionary relationships showed support for the hypothesized origin of Chl c plastids. This work also allows recent experimental findings about molecular function to be understood in a broader phylogenetic context.


PLOS Computational Biology | 2013

PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data

Gabriel E. Hoffman; Benjamin A. Logsdon; Jason G. Mezey

Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.


BMC Bioinformatics | 2016

variancePartition: interpreting drivers of variation in complex gene expression studies

Gabriel E. Hoffman; Eric E. Schadt

Background: As large-scale studies of gene expression with multiple sources of biological and technical variation become widely adopted, characterizing these drivers of variation becomes essential to understanding disease biology and regulatory genetics. Results: We describe a statistical and visualization framework, variancePartition, to prioritize drivers of variation based on a genome-wide summary, and identify genes that deviate from the genome-wide trend. Using a linear mixed model, variancePartition quantifies variation in each expression trait attributable to differences in disease status, sex, cell or tissue type, ancestry, genetic background, experimental stimulus, or technical variables. Analysis of four large-scale transcriptome profiling datasets illustrates that variancePartition recovers striking patterns of biological and technical variation that are reproducible across multiple datasets. Conclusions: Our open source software, variancePartition, enables rapid interpretation of complex gene expression studies as well as other high-throughput genomics assays. variancePartition is available from Bioconductor: http://bioconductor.org/packages/variancePartition.


BMC Bioinformatics | 2012

Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control

Benjamin A. Logsdon; Gabriel E. Hoffman; Jason G. Mezey

Background: We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity to ensure few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect of genetic markers and liver gene expression traits on mouse obesity related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6J and C3H/HeJ mouse strains, where apolipoprotein E is null on the background. Results: We identified eleven genes, Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19, and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis we demonstrate that our algorithm achieves superior performance in terms of power and type I error control than other network recovery algorithms that use the lasso and have bounds on type I error control. Conclusions: Our final network contains 118 previously associated and novel genes affecting weight, cholesterol, glucose, and free fatty acid levels that are excellent obesity risk candidates.


Human Molecular Genetics | 2015

Improved integrative framework combining association data with gene expression features to prioritize Crohn's disease genes

Kaida Ning; Kyle Gettler; Wei Zhang; Sok Meng Ng; B. Monica Bowen; Jeffrey S. Hyams; Michael Stephens; Subra Kugathasan; Lee A. Denson; Eric E. Schadt; Gabriel E. Hoffman; Judy H. Cho

Genome-wide association studies in Crohn's disease (CD) have identified 140 genome-wide significant loci. However, identification of genes driving association signals remains challenging. Furthermore, genome-wide significant thresholds limit false positives at the expense of decreased sensitivity. In this study, we explored gene features contributing to CD pathogenicity, including gene-based association data from CD and autoimmune (AI) diseases, as well as gene expression features (eQTLs, epigenetic markers of expression and intestinal gene expression data). We developed an integrative model based on a CD reference gene set. This integrative approach outperformed gene-based association signals alone in identifying CD-related genes based on statistical validation, gene ontology enrichment, differential expression between M1 and M2 macrophages and a validation using genes causing monogenic forms of inflammatory bowel disease as a reference. Besides gene-level CD association P-values, association with AI diseases was the strongest predictor, highlighting generalized mechanisms of inflammation, and the interferon-γ pathway particularly. Within the 140 high-confidence CD regions, 598 of 1328 genes had low prioritization scores, highlighting genes unlikely to contribute to CD pathogenesis. For select regions, comparably high integrative model scores were observed for multiple genes. This is particularly evident for regions having extensive linkage disequilibrium such as the IBD5 locus. Our analyses provide a standardized reference for prioritizing potential CD-related genes, in regions with both highly significant and nominally significant gene-level association P-values. Our integrative model may be particularly valuable in prioritizing rare, potentially private, missense variants for which genome-wide evidence for association may be unattainable.

Collaboration


Gabriel E. Hoffman's collaborations.

Top Co-Authors

Eric E. Schadt, Icahn School of Medicine at Mount Sinai

Pamela Sklar, Icahn School of Medicine at Mount Sinai

Douglas Ruderfer, Vanderbilt University Medical Center

Kristen J. Brennand, Icahn School of Medicine at Mount Sinai

Eli A. Stahl, Icahn School of Medicine at Mount Sinai

Kiran Girdhar, Icahn School of Medicine at Mount Sinai

Panos Roussos, Icahn School of Medicine at Mount Sinai

Amanda Dobbyn, Icahn School of Medicine at Mount Sinai

Bernie Devlin, University of Pittsburgh