Publication


Featured research published by John A. Capra.


Cell | 2012

Dynamic and Coordinated Epigenetic Regulation of Developmental Transitions in the Cardiac Lineage

Joseph A. Wamstad; Jeffrey M. Alexander; Rebecca M. Truty; Avanti Shrikumar; Fugen Li; Kirsten E. Eilertson; Huiming Ding; John N. Wylie; Alexander R. Pico; John A. Capra; Genevieve D. Erwin; Steven Kattman; Gordon Keller; Deepak Srivastava; Stuart S. Levine; Katherine S. Pollard; Alisha K. Holloway; Laurie A. Boyer; Benoit G. Bruneau

Heart development is exquisitely sensitive to the precise temporal regulation of thousands of genes that govern developmental decisions during differentiation. However, we currently lack a detailed understanding of how chromatin and gene expression patterns are coordinated during developmental transitions in the cardiac lineage. Here, we interrogated the transcriptome and several histone modifications across the genome during defined stages of cardiac differentiation. We find distinct chromatin patterns that are coordinated with stage-specific expression of functionally related genes, including many human disease-associated genes. Moreover, we discover a novel preactivation chromatin pattern at the promoters of genes associated with heart development and cardiac function. We further identify stage-specific distal enhancer elements and find enriched DNA binding motifs within these regions that predict sets of transcription factors that orchestrate cardiac differentiation. Together, these findings form a basis for understanding developmentally regulated chromatin transitions during lineage commitment and the molecular etiology of congenital heart disease.


Bioinformatics | 2007

Predicting functionally important residues from sequence conservation

John A. Capra; Mona Singh

Motivation: Not all residues in a protein are equally important. Some are essential for the proper structure and function of the protein, whereas others can be readily replaced. Conservation analysis is one of the most widely used methods for predicting these functionally important residues in protein sequences. Results: We introduce an information-theoretic approach for estimating sequence conservation based on Jensen-Shannon divergence. We also develop a general heuristic that considers the estimated conservation of sequentially neighboring sites. In large-scale testing, we demonstrate that our combined approach outperforms previous conservation-based measures in identifying functionally important residues; in particular, it is significantly better than the commonly used Shannon entropy measure. We find that considering conservation at sequential neighbors improves the performance of all methods tested. Our analysis also reveals that many existing methods that attempt to incorporate the relationships between amino acids do not lead to better identification of functionally important sites. Finally, we find that while conservation is highly predictive in identifying catalytic sites and residues near bound ligands, it is much less effective in identifying residues in protein-protein interfaces. Availability: Data sets and code for all conservation measures evaluated are available at http://compbio.cs.princeton.edu/conservation/
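The idea in this abstract, scoring each alignment column by its Jensen-Shannon divergence from a background distribution and then blending in the scores of neighboring columns, can be illustrated with a minimal sketch. This is not the authors' released code. The uniform background, the pseudocount, the window size, and the mixing weight `lam` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Amino acid frequencies for one alignment column (gaps are ignored)."""
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[a] + pseudocount) / total for a in AMINO_ACIDS]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def conservation_scores(alignment, background=None):
    """Score every column of an alignment (list of equal-length strings)."""
    if background is None:
        # Assumption: uniform background, used here only for simplicity.
        background = [1.0 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return [js_divergence(column_distribution(col), background)
            for col in zip(*alignment)]


def window_smooth(scores, half_window=3, lam=0.5):
    """Blend each score with the mean score of its sequence neighbors."""
    smoothed = []
    for i, s in enumerate(scores):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        neighbors = [scores[j] for j in range(lo, hi) if j != i]
        if neighbors:
            smoothed.append(lam * s + (1 - lam) * sum(neighbors) / len(neighbors))
        else:
            smoothed.append(s)
    return smoothed
```

For example, `window_smooth(conservation_scores(["ACD", "ACE", "ACF"]), half_window=1)` scores a toy three-sequence alignment. The fully conserved first two columns score higher than the variable third column before smoothing.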


PLOS Computational Biology | 2009

Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure

John A. Capra; Roman A. Laskowski; Janet M. Thornton; Mona Singh; Thomas A. Funkhouser

Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).


PLOS Computational Biology | 2010

G-Quadruplex DNA Sequences Are Evolutionarily Conserved and Associated with Distinct Genomic Features in Saccharomyces cerevisiae

John A. Capra; Katrin Paeschke; Mona Singh; Virginia A. Zakian

G-quadruplex DNA is a four-stranded DNA structure formed by non-Watson-Crick base pairing between stacked sets of four guanines. Many possible functions have been proposed for this structure, but its in vivo role in the cell is still largely unresolved. We carried out a genome-wide survey of the evolutionary conservation of regions with the potential to form G-quadruplex DNA structures (G4 DNA motifs) across seven yeast species. We found that G4 DNA motifs were significantly more conserved than expected by chance, and the nucleotide-level conservation patterns suggested that the motif conservation was the result of the formation of G4 DNA structures. We characterized the association of conserved and non-conserved G4 DNA motifs in Saccharomyces cerevisiae with more than 40 known genome features and gene classes. Our comprehensive, integrated evolutionary and functional analysis confirmed the previously observed associations of G4 DNA motifs with promoter regions and the rDNA, and it identified several previously unrecognized associations of G4 DNA motifs with genomic features, such as mitotic and meiotic double-strand break sites (DSBs). Conserved G4 DNA motifs maintained strong associations with promoters and the rDNA, but not with DSBs. We also performed the first analysis of G4 DNA motifs in the mitochondria, and surprisingly found a tenfold higher concentration of the motifs in the AT-rich yeast mitochondrial DNA than in nuclear DNA. The evolutionary conservation of the G4 DNA motif and its association with specific genome features supports the hypothesis that G4 DNA has in vivo functions that are under evolutionary constraint.
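The abstract does not state how G4 DNA motifs were defined, so the sketch below uses a commonly cited pattern as an assumption: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The loop limit, the function names, and the motifs-per-kilobase summary are illustrative choices, not the study's parameters. Both strands are scanned by also searching the reverse complement.

```python
import re

# Assumed motif definition: G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches.

    Coordinates are 0-based, end-exclusive, and always given on the forward strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        # Map reverse-strand coordinates back onto the forward strand.
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


def motifs_per_kb(seq):
    """Motif density, useful for comparing regions of different length."""
    return 1000.0 * len(find_g4_motifs(seq)) / len(seq) if seq else 0.0
```

A density measure of this kind is one simple way to compare motif concentration between sequences of different lengths, such as mitochondrial and nuclear DNA.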


Bioinformatics | 2008

Characterization and prediction of residues determining protein functional specificity

John A. Capra; Mona Singh

Motivation: Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity determining positions (SDPs) aids in protein function prediction, drug design and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally determined SDPs. Results: We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a dataset of SDPs. The resulting large dataset, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large dataset of enzyme SDPs. Availability: Datasets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Science | 2016

The phenotypic legacy of admixture between modern humans and Neandertals

Corinne N. Simonti; Benjamin Vernot; Erwin P. Bottinger; David Carrell; Rex L. Chisholm; David R. Crosslin; Scott J. Hebbring; Gail P. Jarvik; Iftikhar J. Kullo; Rongling Li; Jyotishman Pathak; Marylyn D. Ritchie; Dan M. Roden; Shefali S. Verma; Gerard Tromp; Jeffrey D. Prato; William S. Bush; Joshua M. Akey; Joshua C. Denny; John A. Capra

The legacy of human-Neandertal interbreeding

Non-African humans are estimated to have inherited on average 1.5 to 4% of their genomes from Neandertals. However, how this genetic legacy affects human traits is unknown. Simonti et al. combined genotyping data with electronic health records. Individual Neandertal alleles were correlated with clinically relevant phenotypes in individuals of European descent. These archaic genetic variants were associated with medical conditions affecting the skin, the blood, and the risk of depression. Science, this issue p. 737

Genotype-phenotype association analysis of Neandertal alleles in modern humans identifies clinical effects.

Many modern human genomes retain DNA inherited from interbreeding with archaic hominins, such as Neandertals, yet the influence of this admixture on human traits is largely unknown. We analyzed the contribution of common Neandertal variants to over 1000 electronic health record (EHR)–derived phenotypes in ~28,000 adults of European ancestry. We discovered and replicated associations of Neandertal alleles with neurological, psychiatric, immunological, and dermatological phenotypes. Neandertal alleles together explained a significant fraction of the variation in risk for depression and skin lesions resulting from sun exposure (actinic keratosis), and individual Neandertal alleles were significantly associated with specific human phenotypes, including hypercoagulation and tobacco use. Our results establish that archaic admixture influences disease risk in modern humans, provide hypotheses about the effects of hundreds of Neandertal haplotypes, and demonstrate the utility of EHR data in evolutionary analyses.


PLOS Computational Biology | 2014

Integrating diverse datasets improves developmental enhancer prediction

Genevieve D. Erwin; Nir Oksenberg; Rebecca M. Truty; Dennis Kostka; Karl K. Murphy; Nadav Ahituv; Katherine S. Pollard; John A. Capra

Gene-regulatory enhancers have been identified using various approaches, including evolutionary conservation, regulatory protein binding, chromatin modifications, and DNA sequence motifs. To integrate these different approaches, we developed EnhancerFinder, a two-step method for distinguishing developmental enhancers from the genomic background and then predicting their tissue specificity. EnhancerFinder uses a multiple kernel learning approach to integrate DNA sequence motifs, evolutionary patterns, and diverse functional genomics datasets from a variety of cell types. In contrast with prediction approaches that define enhancers based on histone marks or p300 sites from a single cell line, we trained EnhancerFinder on hundreds of experimentally verified human developmental enhancers from the VISTA Enhancer Browser. We comprehensively evaluated EnhancerFinder using cross validation and found that our integrative method improves the identification of enhancers over approaches that consider a single type of data, such as sequence motifs, evolutionary conservation, or the binding of enhancer-associated proteins. We find that VISTA enhancers active in embryonic heart are easier to identify than enhancers active in several other embryonic tissues, likely due to their uniquely high GC content. We applied EnhancerFinder to the entire human genome and predicted 84,301 developmental enhancers and their tissue specificity. These predictions provide specific functional annotations for large amounts of human non-coding DNA, and are significantly enriched near genes with annotated roles in their predicted tissues and lead SNPs from genome-wide association studies. We demonstrate the utility of EnhancerFinder predictions through in vivo validation of novel embryonic gene regulatory enhancers from three developmental transcription factor loci. 
Our genome-wide developmental enhancer predictions are freely available as a UCSC Genome Browser track, which we hope will enable researchers to further investigate questions in developmental biology.


Molecular Cell | 2013

Acetylation of RNA Polymerase II Regulates Growth-Factor-Induced Gene Transcription in Mammalian Cells

Sebastian Schröder; Eva Herker; Friederike Itzen; Daniel He; Sean Thomas; Daniel A. Gilchrist; Katrin Kaehlcke; Sungyoo Cho; Katherine S. Pollard; John A. Capra; Martina Schnölzer; Philip A. Cole; Matthias Geyer; Benoit G. Bruneau; Karen Adelman; Melanie Ott

Lysine acetylation regulates transcription by targeting histones and nonhistone proteins. Here we report that the central regulator of transcription, RNA polymerase II, is subject to acetylation in mammalian cells. Acetylation occurs at eight lysines within the C-terminal domain (CTD) of the largest polymerase subunit and is mediated by p300/KAT3B. CTD acetylation is specifically enriched downstream of the transcription start sites of polymerase-occupied genes genome-wide, indicating a role in early stages of transcription initiation or elongation. Mutation of lysines or p300 inhibitor treatment causes the loss of epidermal growth-factor-induced expression of c-Fos and Egr2, immediate-early genes with promoter-proximally paused polymerases, but does not affect expression or polymerase occupancy at housekeeping genes. Our studies identify acetylation as a new modification of the mammalian RNA polymerase II required for the induction of growth factor response genes.


Philosophical Transactions of the Royal Society B | 2013

Many human accelerated regions are developmental enhancers

John A. Capra; Genevieve D. Erwin; Gabriel L. McKinsey; John L.R. Rubenstein; Katherine S. Pollard

The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology.


Neuroinformatics | 2003

Informatics center for mouse genomics: the dissection of complex traits of the nervous system

Glenn D. Rosen; Nathan T. La Porte; Boris Diechtiareff; Christopher J. Pung; Jonathan Nissanov; Carl Gustafson; Louise Bertrand; Smadar Gefen; Yingli Fan; Oleh J. Tretiak; Kenneth F. Manly; Melburn R. Park; Alexander G. Williams; Michael T. Connolly; John A. Capra; Robert W. Williams

In recent years, there has been an explosion in the number of tools and techniques available to researchers interested in exploring the genetic basis of all aspects of central nervous system (CNS) development and function. Here, we exploit a powerful new reductionist approach to explore the genetic basis of the very significant structural and molecular differences between the brains of different strains of mice, called either complex trait or quantitative trait loci (QTL) analysis. Our specific focus has been to provide universal access over the web to tools for the genetic dissection of complex traits of the CNS: tools that allow researchers to map genes that modulate phenotypes at a variety of levels ranging from the molecular all the way to the anatomy of the entire brain. Our website, The Mouse Brain Library (MBL; http://mbl.org), comprises four interrelated components that are designed to support this goal: The Brain Library, iScope, Neurocartographer, and WebQTL. The centerpiece of the MBL is an image database of histologically prepared museum-quality slides representing nearly 2000 mice from over 120 strains, a library suitable for stereologic analysis of regional volume. The iScope provides fast access to the entire slide collection using streaming video technology, enabling neuroscientists to acquire high-magnification images of any CNS region for any of the mice in the MBL. Neurocartographer provides automatic segmentation of images from the MBL by warping precisely delineated boundaries from a 3D atlas of the mouse brain. Finally, WebQTL provides statistical and graphical analysis of linkage between phenotypes and genotypes.

Collaboration


Top Co-Authors

William S. Bush

Case Western Reserve University

Dennis Kostka

University of Pittsburgh

Ling Chen

Memorial Sloan Kettering Cancer Center
