Aaron J. Sams
Cornell University
Publication
Featured research published by Aaron J. Sams.
PLOS ONE | 2014
Diana Chang; Feng Gao; Andrea Slavney; Li Ma; Yedael Y. Waldman; Aaron J. Sams; Paul Billing-Ross; Aviv Madar; Richard A. Spritz; Alon Keinan
Many complex human diseases are highly sexually dimorphic, suggesting a potential contribution of the X chromosome to disease risk. However, the X chromosome has been neglected or incorrectly analyzed in most genome-wide association studies (GWAS). We present tailored analytical methods and software that facilitate X-wide association studies (XWAS), which we further applied to reanalyze data from 16 GWAS of different autoimmune and related diseases (AID). We associated several X-linked genes with disease risk, among which (1) ARHGEF6 is associated with Crohn's disease and replicated in a study of ulcerative colitis, another inflammatory bowel disease (IBD). Indeed, ARHGEF6 interacts with a gastric bacterium that has been implicated in IBD. (2) CENPI is associated with three different AID, which is compelling in light of known associations with AID of autosomal genes encoding centromere proteins, as well as established autosomal evidence of pleiotropy between autoimmune diseases. (3) We replicated a previous association of FOXP3, a transcription factor that regulates T-cell development and function, with vitiligo; and (4) we discovered that C1GALT1C1 exhibits a sex-specific effect on disease risk in both IBDs. These and other X-linked genes that we associated with AID tend to be highly expressed in tissues related to immune response, participate in major immune pathways, and display differential gene expression between males and females. Combined, the results demonstrate the importance of the X chromosome in autoimmunity, reveal the potential of extensive XWAS, even based on existing data, and provide the tools and incentive to properly include the X chromosome in future studies.
Journal of Human Evolution | 2015
George H. Perry; Logan Kistler; Mary A. Kelaita; Aaron J. Sams
Nuclear genome sequence data from Neandertals, Denisovans, and archaic anatomically modern humans can be used to complement our understanding of hominin evolutionary biology and ecology through i) direct inference of archaic hominin phenotypes, ii) indirect inference of those phenotypes by identifying the effects of previously-introgressed alleles still present among modern humans, or iii) determining the evolutionary timing of relevant hominin-specific genetic changes. Here we review and reanalyze published Neandertal and Denisovan genome sequence data to illustrate an example of the third approach. Specifically, we infer the timing of five human gene presence/absence changes that may be related to particular hominin-specific dietary changes and discuss these results in the context of our broader reconstructions of hominin evolutionary ecology. We show that pseudogenizing (gene loss) mutations in the TAS2R62 and TAS2R64 bitter taste receptor genes and the MYH16 masticatory myosin gene occurred after the hominin-chimpanzee divergence but before the divergence of the human and Neandertal/Denisovan lineages. The absence of a functional MYH16 protein may explain our relatively reduced jaw muscles; this gene loss may have followed the adoption of cooking behavior. In contrast, salivary amylase gene (AMY1) duplications were not observed in the Neandertal and Denisovan genomes, suggesting a relatively recent origin for the AMY1 copy number gains that are observed in modern humans. Thus, if earlier hominins were consuming large quantities of starch-rich underground storage organs, as previously hypothesized, then they were likely doing so without the digestive benefits of increased salivary amylase production. 
Our most surprising result was the observation of a heterozygous mutation in the first codon of the TAS2R38 bitter taste receptor gene in the Neandertal individual, which likely would have resulted in a non-functional protein and inter-individual PTC (phenylthiocarbamide) taste sensitivity variation, as also observed in both humans and chimpanzees.
Genome Biology | 2016
Aaron J. Sams; Anne Dumaine; Yohann Nédélec; Vania Yotova; Carolina Alfieri; Jerome E. Tanner; Philipp W. Messer; Luis B. Barreiro
Background: The 2’-5’ oligoadenylate synthetase (OAS) locus encodes for three OAS enzymes (OAS1-3) involved in innate immune response. This region harbors high amounts of Neandertal ancestry in non-African populations; yet, strong evidence of positive selection in the OAS region is still lacking. Results: Here we used a broad array of selection tests in concert with neutral coalescent simulations to demonstrate a signal of adaptive introgression at the OAS locus. Furthermore, we characterized the functional consequences of the Neandertal haplotype in the transcriptional regulation of OAS genes at baseline and infected conditions. We found that cells from people with the Neandertal-like haplotype express lower levels of OAS3 upon infection, as well as distinct isoforms of OAS1 and OAS2. Conclusions: We present evidence that a Neandertal haplotype at the OAS locus was subjected to positive selection in the human population. This haplotype is significantly associated with functional consequences at the level of transcriptional regulation of innate immune responses. Notably, we suggest that the Neandertal-introgressed haplotype likely reintroduced an ancestral splice variant of OAS1 encoding a more active protein, suggesting that adaptive introgression occurred as a means to resurrect adaptive variation that had been lost outside Africa.
Journal of Human Evolution | 2015
Aaron J. Sams; John Hawks; Alon Keinan
The age of polymorphic alleles in humans is often estimated from population genetic patterns in extant human populations, such as allele frequencies, linkage disequilibrium, and rate of mutations. Ancient DNA can improve the accuracy of such estimates, as well as facilitate testing the validity of demographic models underlying many population genetic methods. Specifically, the presence of an allele in a genome derived from an ancient sample testifies that the allele is at least as old as that sample. In this study, we consider a common method for estimating allele age based on allele frequency as applied to variants from the US National Institutes of Health (NIH) Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project. We view these estimates in the context of the presence or absence of each allele in the genomes of the 5,300-year-old Tyrolean Iceman, Ötzi, and of the 50,000-year-old Altai Neandertal. Our results illuminate the accuracy of these estimates and their sensitivity to demographic events that were not included in the model underlying age estimation. Specifically, allele presence in the Iceman genome provides a good fit of allele age estimates to the expectation based on the age of that specimen. The equivalent based on the Neandertal genome leads to a poorer fit. This is likely due in part to the older age of the Neandertal and the older time of the split between modern humans and Neandertals, but also due to gene flow from Neandertals to modern humans not being considered in the underlying demographic model. Thus, the incorporation of ancient DNA can improve allele age estimation, demographic modeling, and tests of natural selection. Our results also point to the importance of considering a more diverse set of ancient samples for understanding the geographic and temporal range of individual alleles.
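As a rough illustration of the idea in this abstract, the sketch below estimates allele age from frequency and then checks the estimate against the age of an ancient sample that carries the allele. The abstract does not name the estimator, so the classic Kimura-Ohta expectation for a neutral allele in a constant-size population is used here as a stand-in; it is not necessarily the method applied in the study. The effective population size and generation time are hypothetical placeholders, and the function names are invented for this example.

```python
import math


def expected_allele_age_generations(freq: float, n_e: float) -> float:
    """Kimura-Ohta expected age (in generations) of a neutral allele.

    Assumes a constant-size, randomly mating population, which is a
    strong simplification relative to realistic human demography.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must be strictly between 0 and 1")
    return -4.0 * n_e * (freq / (1.0 - freq)) * math.log(freq)


def is_consistent_with_sample(estimated_age_years: float,
                              sample_age_years: float) -> bool:
    """An allele seen in an ancient genome must be at least as old as it."""
    return estimated_age_years >= sample_age_years


# Hypothetical parameters, chosen only for illustration.
N_E = 10_000            # assumed effective population size
GENERATION_YEARS = 25   # assumed generation time in years

# Sample ages as stated in the abstract.
ICEMAN_AGE_YEARS = 5_300
NEANDERTAL_AGE_YEARS = 50_000

for freq in (0.001, 0.01, 0.1):
    age_years = expected_allele_age_generations(freq, N_E) * GENERATION_YEARS
    print(freq, round(age_years),
          is_consistent_with_sample(age_years, ICEMAN_AGE_YEARS),
          is_consistent_with_sample(age_years, NEANDERTAL_AGE_YEARS))
```

An estimate that falls below the age of a sample carrying the allele flags a poor fit, which is the kind of discrepancy the abstract attributes partly to demographic events missing from the underlying model.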
PLOS ONE | 2013
Aaron J. Sams; John Hawks
Celiac disease (CD) is a common small intestinal inflammatory condition induced by wheat gluten and related proteins from rye and barley. Left untreated, the clinical presentation of CD can include failure to thrive, malnutrition, and distension in juveniles. The disease can additionally lead to vitamin deficiencies, anemia, and osteoporosis. Therefore, CD potentially negatively affected fitness in past populations utilizing wheat, barley, and rye. Previous analyses of CD risk variants have uncovered evidence for positive selection on some of these loci. These studies also suggest the possibility that risk for common autoimmune conditions such as CD may be the result of positive selection on immune-related loci in the genome to fight infection. Under this evolutionary scenario, disease phenotypes may be a trade-off from positive selection on immunity. If this hypothesis is generally true, we can expect to find a signal of natural selection when we survey across the network of loci known to influence CD risk. This study examines the non-HLA autosomal network of gene loci associated with CD risk in Europe. We reject the null hypothesis of neutrality on this network of CD risk loci. Additionally, we can localize evidence of selection in time and space by adding information from the genome of the Tyrolean Iceman. While we can show significant differentiation between continental regions across the CD network, the pattern of evidence is not consistent with primarily recent (Holocene) selection across this network in Europe. Further localization of ancient selection on this network may illuminate the ecological pressures acting on the immune system during this critically interesting phase of our evolution.
Genome Research | 2016
Yishay Pinto; Orshay Gabay; Leonardo Arbiza; Aaron J. Sams; Alon Keinan; Erez Y. Levanon
The gradual accumulation of mutations by any of a number of mutational processes is a major driving force of divergence and evolution. Here, we investigate a potentially novel mutational process that is based on the activity of members of the AID/APOBEC family of deaminases. This gene family has been recently shown to introduce, in multiple types of cancer, enzyme-induced clusters of co-occurring somatic mutations caused by cytosine deamination. Going beyond somatic mutations, we hypothesized that APOBEC3, following its rapid expansion in primates, can introduce unique germline mutation clusters that can play a role in primate evolution. In this study, we tested this hypothesis by performing a comprehensive comparative genomic screen for APOBEC3-induced mutagenesis patterns across different hominids. We detected thousands of mutation clusters introduced along primate evolution which exhibit features that strongly fit the known patterns of APOBEC3G mutagenesis. These results suggest that APOBEC3G-induced mutations have contributed to the evolution of all genomes we studied. This is the first indication of site-directed, enzyme-induced genome evolution, which played a role in the evolution of both modern and archaic humans. This novel mutational mechanism exhibits several unique features, such as its higher tendency to mutate transcribed regions and regulatory elements and its ability to generate clusters of concurrent point mutations that all occur in a single generation. Our discovery demonstrates the exaptation of an anti-viral mechanism as a new source of genomic variation in hominids with a strong potential for functional consequences.
Human Biology | 2014
Aaron J. Sams; John Hawks
Celiac disease (CD) is a multifactorial chronic inflammatory condition that results in injury of the mucosal lining of the small intestine upon ingestion of wheat gluten and related proteins from barley and rye. Although the exact mechanisms leading to CD are not fully understood, the genetic basis of CD has been relatively well characterized. In this review we briefly survey the history of discovery, clinical presentation, pathophysiology, and current understanding of the genetics underlying CD risk. Then, we discuss what is known about the current distribution and evolutionary history of genes underlying CD risk in light of other evolutionary models of disease. Specifically, we conclude that the set of loci underlying CD risk did not cohesively evolve as a response to a single past selection event such as the development of agriculture. Rather, deterministic and stochastic evolutionary processes have both contributed to the present distribution of variation in CD risk loci. Selection has shaped some components of this network, but this selection appears to have occurred at different points in the past. Other parts of the CD risk network have likely arisen due to stochastic processes such as genetic drift.
bioRxiv | 2018
Aaron J. Sams; Adam R. Boyko
Inbreeding and consanguinity leave distinct genomic traces, most notably long genomic tracts that are identical by descent and completely homozygous. These runs of homozygosity (ROH) can contribute to inbreeding depression if they contain deleterious variants that are fully or partially recessive. Several lines of evidence have been used to show that long (> 5 megabase (Mb)) ROH are disproportionately likely to harbor deleterious variation, but the extent to which long versus short tracts contribute to autozygosity at loci known to be deleterious and recessive has not been studied. In domestic dogs, nearly 200 mutations are known to cause recessive diseases, most of which can be efficiently assayed using SNP arrays. By examining genome-wide data from over 200,000 markers, including 150 recessive disease variants, we built high-resolution ROH density maps for nearly 2,500 dogs, recording ROH down to 500 kilobases. We observed over 500 homozygous deleterious recessive genotypes in the panel, 90% of which overlapped with ROH inferred by GERMLINE. Although most of these genotypes were contained in ROH over 5 Mb in length, 14% were contained in short (0.5 - 2.5 Mb) tracts, a significant enrichment compared to the genetic background, suggesting that even short tracts are useful for computing inbreeding metrics like the coefficient of inbreeding estimated from ROH (FROH). In our dataset, FROH differed significantly both within and among dog breeds. All breeds harbored some regions of reduced genetic diversity due to drift or selective sweeps, but the degree of inbreeding and the proportion of inbreeding caused by short versus long tracts differed between breeds, reflecting their different population histories. 
Although only available for a few species, large genome-wide datasets including recessive disease variants hold particular promise not only for disentangling the genetic architecture of inbreeding depression, but also evaluating and improving upon current approaches for detecting ROH.
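The inbreeding coefficient estimated from ROH (FROH) mentioned in this abstract is commonly computed as the fraction of the genome covered by ROH. The sketch below shows that calculation and a breakdown of ROH coverage by tract length, assuming ROH have already been called by some upstream tool. The 0.5 Mb minimum and the short (0.5 - 2.5 Mb) and long (> 5 Mb) classes follow the abstract; the intermediate "medium" class, the function names, and the genome length in the usage lines are assumptions made for illustration.

```python
from typing import Dict, Iterable

MIN_ROH_BP = 500_000       # minimum tract length recorded in the abstract
SHORT_MAX_BP = 2_500_000   # upper bound of the "short" class in the abstract
LONG_MIN_BP = 5_000_000    # tracts above this are "long" in the abstract


def f_roh(roh_lengths_bp: Iterable[int], genome_length_bp: int,
          min_length_bp: int = MIN_ROH_BP) -> float:
    """Fraction of the analysed genome covered by ROH of sufficient length."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(n for n in roh_lengths_bp if n >= min_length_bp)
    return covered / genome_length_bp


def roh_share_by_class(roh_lengths_bp: Iterable[int],
                       min_length_bp: int = MIN_ROH_BP) -> Dict[str, float]:
    """Share of total ROH length contributed by each length class."""
    totals = {"short": 0, "medium": 0, "long": 0}
    for n in roh_lengths_bp:
        if n < min_length_bp:
            continue  # tracts below the recording threshold are ignored
        if n <= SHORT_MAX_BP:
            totals["short"] += n
        elif n <= LONG_MIN_BP:
            totals["medium"] += n  # assumed intermediate class
        else:
            totals["long"] += n
    total = sum(totals.values())
    return {k: (v / total if total else 0.0) for k, v in totals.items()}


# Hypothetical ROH calls (in base pairs) for one individual.
example_roh = [1_000_000, 3_000_000, 6_000_000, 400_000]
example_genome_bp = 100_000_000  # placeholder, not a real genome length

print(f_roh(example_roh, example_genome_bp))
print(roh_share_by_class(example_roh))
```

Comparing the class shares across individuals or breeds gives a simple view of how much autozygosity comes from short versus long tracts, which is the contrast the abstract draws.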
PLOS Genetics | 2018
Petra E. Deane-Coe; Erin T. Chu; Andrea Slavney; Adam R. Boyko; Aaron J. Sams
Consumer genomics enables genetic discovery on an unprecedented scale by linking very large databases of personal genomic data with phenotype information voluntarily submitted via web-based surveys. These databases are having a transformative effect on human genomics research, yielding insights on increasingly complex traits, behaviors, and disease by including many thousands of individuals in genome-wide association studies (GWAS). The promise of consumer genomic data is not limited to human research, however. Genomic tools for dogs are readily available, with hundreds of causal Mendelian variants already characterized, because selection and breeding have led to dramatic phenotypic diversity underlain by a simple genetic structure. Here, we report the results of the first consumer genomics study ever conducted in a non-human model: a GWAS of blue eyes based on more than 3,000 customer dogs with validation panels including nearly 3,000 more, the largest canine GWAS to date. We discovered a novel association with blue eyes on chromosome 18 (P = 1.3 × 10⁻⁶⁸) and used both sequence coverage and microarray probe intensity data to identify the putative causal variant: a 98.6-kb duplication directly upstream of the Homeobox gene ALX4, which plays an important role in mammalian eye development. This duplication is largely restricted to Siberian Huskies, is strongly associated with the blue-eyed phenotype (chi-square P = 5.2 × 10⁻²⁹⁰), and is highly, but not completely, penetrant. These results underscore the power of consumer-data-driven discovery in non-human species, especially dogs, where there is intense owner interest in the personal genomic information of their pets, a high level of engagement with web-based surveys, and an underlying genetic architecture ideal for mapping studies.
bioRxiv | 2016
Aviv Madar; Diana Chang; Feng Gao; Aaron J. Sams; Yedael Y. Waldman; Deborah S. Cunninghame Graham; Timothy J. Vyse; Andrew G. Clark; Alon Keinan
Genetic risk for common autoimmune diseases is influenced by hundreds of small-effect, mostly non-coding variants, enriched in regulatory regions active in adaptive-immune cell types. DNaseI hypersensitivity sites (DHSs) are a genomic mark for regulatory DNA. Here, we generated a single DHS annotation from fifteen deeply sequenced DNase-seq experiments in adaptive-immune as well as non-immune cell types. Using this annotation we quantified accessibility across cell types in a matrix format amenable to statistical analysis, deduced the subset of DHSs unique to adaptive-immune cell types, and grouped DHSs by cell-type accessibility profiles. Measuring enrichment with cell-type-specific TF binding sites as well as proximal gene expression and function, we show that accessibility profiles grouped DHSs into coherent regulatory functions. Using the adaptive-immune-specific DHSs as input (0.37% of genome), we associated DHSs to six autoimmune diseases with GWAS data. Associated loci showed higher replication rates when compared to loci identified by GWAS or by considering all DHSs, allowing the additional discovery of 327 loci (FDR<0.005) below typical GWAS significance threshold, 52 of which are novel and replicating discoveries. Finally, we integrated DHS associations from six autoimmune diseases, using a network model (bird's-eye view) and a regulatory Manhattan plot schema (per locus). Taken together, we described and validated a strategy to leverage finely resolved regulatory priors, enhancing the discovery, interpretability, and resolution of genetic associations, and providing actionable insights for follow-up work.