Publication


Featured research published by Benjamin Vernot.


Nature | 2012

The accessible chromatin landscape of the human genome

Robert E. Thurman; Eric Rynes; Richard Humbert; Jeff Vierstra; Matthew T. Maurano; Eric Haugen; Nathan C. Sheffield; Andrew B. Stergachis; Hao Wang; Benjamin Vernot; Kavita Garg; Sam John; Richard Sandstrom; Daniel Bates; Lisa Boatman; Theresa K. Canfield; Morgan Diegel; Douglas Dunn; Abigail K. Ebersol; Tristan Frum; Erika Giste; Audra K. Johnson; Ericka M. Johnson; Tanya Kutyavin; Bryan R. Lajoie; Bum Kyu Lee; Kristen Lee; Darin London; Dimitra Lotakis; Shane Neph

DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ∼2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect ∼580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.


Nature | 2012

Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations

Brian J. O’Roak; Laura Vives; Santhosh Girirajan; Emre Karakoc; Niklas Krumm; Bradley P. Coe; Roie Levy; Arthur Ko; Choli Lee; Joshua D. Smith; Emily H. Turner; Ian B. Stanaway; Benjamin Vernot; Maika Malig; Carl Baker; Beau Reilly; Joshua M. Akey; Elhanan Borenstein; Mark J. Rieder; Deborah A. Nickerson; Raphael Bernier; Jay Shendure; Evan E. Eichler

It is well established that autism spectrum disorders (ASD) have a strong genetic component; however, for at least 70% of cases, the underlying genetic cause is unknown. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes—so-called sporadic or simplex families—we sequenced all coding regions of the genome (the exome) for parent–child trios exhibiting sporadic ASD, including 189 new trios and 20 that were previously reported. Additionally, we also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19), for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD. Moreover, 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected β-catenin/chromatin remodelling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes: CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3 and SCN1A. Combined with copy number variant (CNV) data, these results indicate extreme locus heterogeneity but also provide a target for future discovery, diagnostics and therapeutics.


Nature | 2012

An expansive human regulatory lexicon encoded in transcription factor footprints

Shane Neph; Jeff Vierstra; Andrew B. Stergachis; Alex Reynolds; Eric Haugen; Benjamin Vernot; Robert E. Thurman; Sam John; Richard Sandstrom; Audra K. Johnson; Matthew T. Maurano; Richard Humbert; Eric Rynes; Hao Wang; Shinny Vong; Kristen Lee; Daniel Bates; Morgan Diegel; Vaughn Roach; Douglas Dunn; Jun Neri; Anthony Schafer; R. Scott Hansen; Tanya Kutyavin; Erika Giste; Molly Weaver; Theresa K. Canfield; Peter J. Sabo; Miaohua Zhang; Gayathri Balasundaram

Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis–regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein–DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.


Journal of Computational Biology | 2006

A hybrid micro-macroevolutionary approach to gene tree reconstruction

Dannie Durand; Bjarni V. Halldórsson; Benjamin Vernot

Gene family evolution is determined by microevolutionary processes (e.g., point mutations) and macroevolutionary processes (e.g., gene duplication and loss), yet macroevolutionary considerations are rarely incorporated into gene phylogeny reconstruction methods. We present a dynamic program to find the most parsimonious gene family tree with respect to a macroevolutionary optimization criterion, the weighted sum of the number of gene duplications and losses. The existence of a polynomial delay algorithm for duplication/loss phylogeny reconstruction stands in contrast to most formulations of phylogeny reconstruction, which are NP-complete. We next extend this result to obtain a two-phase method for gene tree reconstruction that takes both micro- and macroevolution into account. In the first phase, a gene tree is constructed from sequence data, using any of the previously known algorithms for gene phylogeny construction. In the second phase, the tree is refined by rearranging regions of the tree that do not have strong support in the sequence data to minimize the duplication/loss cost. Components of the tree with strong support are left intact. This hybrid approach incorporates both micro- and macroevolutionary considerations, yet its computational requirements are modest in practice because the two-phase approach constrains the search space. Our hybrid algorithm can also be used to resolve nonbinary nodes in a multifurcating gene tree. We have implemented these algorithms in a software tool, NOTUNG 2.0, that can be used as a unified framework for gene tree reconstruction or as an exploratory analysis tool that can be applied post hoc to any rooted tree with bootstrap values. The NOTUNG 2.0 graphical user interface can be used to visualize alternate duplication/loss histories, root trees according to duplication and loss parsimony, manipulate and annotate gene trees, and estimate gene duplication times. It also offers a command line option that enables high-throughput analysis of a large number of trees.
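
The optimization criterion named in this abstract, a weighted sum of duplications and losses, can be illustrated with a minimal sketch. It only scores one fixed binary gene tree against a binary species tree using the standard lowest-common-ancestor (LCA) mapping; it does not implement the rearrangement phase or NOTUNG itself. The toy species tree, the gene names, the function names, and the weights 1.5 and 1.0 are all assumptions chosen for illustration.

```python
# Toy binary species tree ((A,B),C) as a child -> parent map (hypothetical data).
SPECIES_PARENT = {"A": "AB", "B": "AB", "AB": "ROOT", "C": "ROOT", "ROOT": None}

# Which species each gene copy was sampled from (hypothetical data).
LEAF_SPECIES = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}


def depth(species):
    """Number of edges between a species node and the root."""
    d = 0
    while SPECIES_PARENT[species] is not None:
        species = SPECIES_PARENT[species]
        d += 1
    return d


def lca(s, t):
    """Lowest common ancestor of two species nodes."""
    while s != t:
        if depth(s) >= depth(t):
            s = SPECIES_PARENT[s]
        else:
            t = SPECIES_PARENT[t]
    return s


def reconcile(gene_tree, leaf_species):
    """Return (mapped species, duplications, losses) for a binary gene tree.

    A gene tree is either a leaf name (str) or a pair (left, right).
    """
    if isinstance(gene_tree, str):
        return leaf_species[gene_tree], 0, 0
    left, right = gene_tree
    m_left, d_left, l_left = reconcile(left, leaf_species)
    m_right, d_right, l_right = reconcile(right, leaf_species)
    m = lca(m_left, m_right)
    # LCA rule: a node is a duplication if it maps to the same species
    # node as at least one of its children.
    is_dup = m in (m_left, m_right)
    dups = d_left + d_right + (1 if is_dup else 0)
    losses = l_left + l_right
    for child in (m_left, m_right):
        skipped = depth(child) - depth(m)
        # Below a speciation, one step down the species tree is expected.
        losses += skipped if is_dup else skipped - 1
    return m, dups, losses


def dl_cost(gene_tree, leaf_species, dup_weight=1.5, loss_weight=1.0):
    """Weighted duplication/loss cost; the weights are illustrative."""
    _, dups, losses = reconcile(gene_tree, leaf_species)
    return dup_weight * dups + loss_weight * losses


if __name__ == "__main__":
    congruent = (("a1", "b1"), "c1")
    extra_copy = (("a1", "b1"), "a2")
    print(reconcile(congruent, LEAF_SPECIES), dl_cost(congruent, LEAF_SPECIES))
    print(reconcile(extra_copy, LEAF_SPECIES), dl_cost(extra_copy, LEAF_SPECIES))
```

A gene tree that matches the species tree scores zero, while a tree with a second copy from species A needs one duplication and one loss (the missing B copy). A rearrangement method of the kind the abstract describes would compare such scores across alternative topologies of the weakly supported regions.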


Science | 2014

Resurrecting Surviving Neandertal Lineages from Modern Human Genomes

Benjamin Vernot; Joshua M. Akey

Neandertal Shadows in Us

Non-African modern humans carry a remnant of Neandertal DNA from interbreeding events that have been postulated to have occurred as humans migrated out of Africa. While the total amount of Neandertal sequence is estimated to be less than 3% of the modern genome, the specific retained sequences vary among individuals. Analyzing the genomes of more than 600 Europeans and East Asians, Vernot and Akey (p. 1017, published online 29 January) identified Neandertal sequences within modern humans that taken together span approximately 20% of the Neandertal genome. Some Neandertal-derived sequences appear to be under positive selection in humans, including several genes associated with skin phenotypes.

Ancestral Neandertal sequences within extant humans reveal that positive and purifying selection has occurred.

Anatomically modern humans overlapped and mated with Neandertals such that non-African humans inherit ~1 to 3% of their genomes from Neandertal ancestors. We identified Neandertal lineages that persist in the DNA of modern humans, in whole-genome sequences from 379 European and 286 East Asian individuals, recovering more than 15 gigabases of introgressed sequence that spans ~20% of the Neandertal genome (false discovery rate = 5%). Analyses of surviving archaic lineages suggest that there were fitness costs to hybridization, admixture occurred both before and after divergence of non-African modern humans, and Neandertals were a source of adaptive variation for loci involved in skin phenotypes. Our results provide a new avenue for paleogenomics studies, allowing substantial amounts of population-level DNA sequence information to be obtained from extinct groups, even in the absence of fossilized remains.


Cell | 2013

Developmental Fate and Cellular Maturity Encoded in Human Regulatory DNA Landscapes

Andrew B. Stergachis; Shane Neph; Alex Reynolds; Richard Humbert; Brady Miller; Sharon L. Paige; Benjamin Vernot; Jeffrey B. Cheng; Robert E. Thurman; Richard Sandstrom; Eric Haugen; Shelly Heimfeld; Charles E. Murry; Joshua M. Akey; John A. Stamatoyannopoulos

Cellular-state information between generations of developing cells may be propagated via regulatory regions. We report consistent patterns of gain and loss of DNase I-hypersensitive sites (DHSs) as cells progress from embryonic stem cells (ESCs) to terminal fates. DHS patterns alone convey rich information about cell fate and lineage relationships distinct from information conveyed by gene expression. Developing cells share a proportion of their DHS landscapes with ESCs; that proportion decreases continuously in each cell type as differentiation progresses, providing a quantitative benchmark of developmental maturity. Developmentally stable DHSs densely encode binding sites for transcription factors involved in autoregulatory feedback circuits. In contrast to normal cells, cancer cells extensively reactivate silenced ESC DHSs and those from developmental programs external to the cell lineage from which the malignancy derives. Our results point to changes in regulatory DNA landscapes as quantitative indicators of cell-fate transitions, lineage relationships, and dysfunction.


Science | 2013

Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution

Andrew B. Stergachis; Eric Haugen; Anthony Shafer; Wenqing Fu; Benjamin Vernot; Alex Reynolds; Anthony Raubitschek; Steven F. Ziegler; Emily LeProust; Joshua M. Akey; John A. Stamatoyannopoulos

Transcription Factor Binding Sites

Transcription factors (TFs) are proteins that bind to DNA to control gene transcription. Stergachis et al. (p. 1367; see the Perspective by Weatheritt and Babu) examined TF binding within the human genome in more than 80 cell types. Nearly 15% of coding regions simultaneously specify both amino acid sequence and TF recognition sites. The distribution of the TF binding sites evolutionarily constrains how codons within these regions can change, independent of encoded protein function. Thus, TF binding may represent a widespread and strong evolutionary force in coding regions.

Transcription factor binding within protein-coding regions of DNA constrains how the protein can evolve. [Also see Perspective by Weatheritt and Babu]

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons (“duons”) that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.


Cell | 2012

Evolutionary history and adaptation from high-coverage whole-genome sequences of diverse African hunter-gatherers

Joseph Lachance; Benjamin Vernot; Clara C. Elbers; Bart Ferwerda; Alain Froment; Jean-Marie Bodo; Godfrey Lema; Wenqing Fu; Thomas B. Nyambo; Timothy R. Rebbeck; Kun Zhang; Joshua M. Akey; Sarah A. Tishkoff

To reconstruct modern human evolutionary history and identify loci that have shaped hunter-gatherer adaptation, we sequenced the whole genomes of five individuals in each of three different hunter-gatherer populations at > 60× coverage: Pygmies from Cameroon and Khoesan-speaking Hadza and Sandawe from Tanzania. We identify 13.4 million variants, substantially increasing the set of known human variation. We found evidence of archaic introgression in all three populations, and the distribution of time to most recent common ancestors from these regions is similar to that observed for introgressed regions in Europeans. Additionally, we identify numerous loci that harbor signatures of local adaptation, including genes involved in immunity, metabolism, olfactory and taste perception, reproduction, and wound healing. Within the Pygmy population, we identify multiple highly differentiated loci that play a role in growth and anterior pituitary function and are associated with height.


Bioinformatics | 2012

Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees

Maureen Stolzer; Han Lai; Minli Xu; Deepa Sathaye; Benjamin Vernot; Dannie Durand

Motivation: Gene duplication (D), transfer (T), loss (L) and incomplete lineage sorting (I) are crucial to the evolution of gene families and the emergence of novel functions. The history of these events can be inferred via comparison of gene and species trees, a process called reconciliation, yet current reconciliation algorithms model only a subset of these evolutionary processes.

Results: We present an algorithm to reconcile a binary gene tree with a nonbinary species tree under a DTLI parsimony criterion. This is the first reconciliation algorithm to capture all four evolutionary processes driving tree incongruence and the first to reconcile non-binary species trees with a transfer model. Our algorithm infers all optimal solutions and reports complete, temporally feasible event histories, giving the gene and species lineages in which each event occurred. It is fixed-parameter tractable, with polytime complexity when the maximum species outdegree is fixed. Application of our algorithms to prokaryotic and eukaryotic data shows that use of an incomplete event model has substantial impact on the events inferred and resulting biological conclusions.

Availability: Our algorithms have been implemented in Notung, a freely available phylogenetic reconciliation software package, available at http://www.cs.cmu.edu/~durand/Notung.

Contact: [email protected]


Journal of Computational Biology | 2008

Reconciliation with non-binary species trees

Benjamin Vernot; Maureen Stolzer; Aiton Goldman; Dannie Durand

Reconciliation extracts information from the topological incongruence between gene and species trees to infer duplications and losses in the history of a gene family. The inferred duplication-loss histories provide valuable information for a broad range of biological applications, including ortholog identification, estimating gene duplication times, and rooting and correcting gene trees. While reconciliation for binary trees is a tractable and well studied problem, there are no algorithms for reconciliation with non-binary species trees. Yet a striking proportion of species trees are non-binary. For example, 64% of branch points in the NCBI taxonomy have three or more children. When applied to non-binary species trees, current algorithms overestimate the number of duplications because they cannot distinguish between duplication and incomplete lineage sorting. We present the first algorithms for reconciling binary gene trees with non-binary species trees under a duplication-loss parsimony model. Our algorithms utilize an efficient mapping from gene to species trees to infer the minimum number of duplications in O(|V(G)| · (k(S) + h(S))) time, where |V(G)| is the number of nodes in the gene tree, h(S) is the height of the species tree and k(S) is the size of its largest polytomy. We present a dynamic programming algorithm which also minimizes the total number of losses. Although this algorithm is exponential in the size of the largest polytomy, it performs well in practice for polytomies with outdegree of 12 or less. We also present a heuristic which estimates the minimal number of losses in polynomial time. In empirical tests, this algorithm finds an optimal loss history 99% of the time. Our algorithms have been implemented in NOTUNG, a robust, production quality, tree-fitting program, which provides a graphical user interface for exploratory analysis and also supports automated, high-throughput analysis of large data sets.
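
The overestimation problem described in this abstract can be made concrete with a small sketch. It contrasts the usual binary-tree LCA rule with a simplified overlap test at a polytomy: a gene tree node is treated as a required duplication only if its two child subtrees draw on the same branch below the species node it maps to. This is an illustration of the idea under stated assumptions, not the published algorithm or its running time; the toy species tree, gene names, and function names are hypothetical.

```python
# Toy species tree with one polytomy: P has three children (hypothetical data).
SPECIES_PARENT = {"A": "P", "B": "P", "C": "P", "P": None}
LEAF_SPECIES = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}


def path_to_root(species):
    """The species node followed by all of its ancestors."""
    path = []
    while species is not None:
        path.append(species)
        species = SPECIES_PARENT[species]
    return path


def lca(s, t):
    """Lowest common ancestor of two species nodes."""
    seen = set(path_to_root(s))
    return next(node for node in path_to_root(t) if node in seen)


def leaf_names(gene_tree):
    """Leaf names of a gene tree given as nested (left, right) pairs."""
    if isinstance(gene_tree, str):
        return [gene_tree]
    left, right = gene_tree
    return leaf_names(left) + leaf_names(right)


def mapped_species(gene_tree):
    """LCA of the species of every leaf in the gene (sub)tree."""
    species = [LEAF_SPECIES[name] for name in leaf_names(gene_tree)]
    result = species[0]
    for s in species[1:]:
        result = lca(result, s)
    return result


def branch_below(ancestor, species):
    """Child of `ancestor` leading to `species` (or `ancestor` if equal)."""
    path = path_to_root(species)
    i = path.index(ancestor)
    return path[i - 1] if i > 0 else ancestor


def naive_duplication(gene_tree):
    """Binary-tree LCA rule applied without regard to polytomies."""
    left, right = gene_tree
    m = mapped_species(gene_tree)
    return m in (mapped_species(left), mapped_species(right))


def required_duplication(gene_tree):
    """Simplified overlap test: do both subtrees use a common branch?"""
    left, right = gene_tree
    m = mapped_species(gene_tree)
    left_branches = {branch_below(m, LEAF_SPECIES[n]) for n in leaf_names(left)}
    right_branches = {branch_below(m, LEAF_SPECIES[n]) for n in leaf_names(right)}
    return bool(left_branches & right_branches)


if __name__ == "__main__":
    for tree in [(("a1", "b1"), "c1"), (("a1", "b1"), "a2")]:
        print(tree, naive_duplication(tree), required_duplication(tree))
```

For the gene tree ((a1,b1),c1) the naive rule reports a duplication at the root because both the root and its left child map to the polytomy, yet the two subtrees use disjoint branches (A and B versus C), so the incongruence could equally be explained by incomplete lineage sorting. For ((a1,b1),a2) both subtrees contain a copy from species A, so a duplication is needed under either rule.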

Collaboration


Dive into Benjamin Vernot's collaborations.

Top Co-Authors

Dannie Durand

Carnegie Mellon University

Eric Haugen

University of Washington

Shane Neph

University of Washington

Alex Reynolds

University of Washington