Ivan Ovcharenko
National Institutes of Health
Publication
Featured research published by Ivan Ovcharenko.
Science | 2010
Uffe Hellsten; Richard M. Harland; Michael J. Gilchrist; David A. Hendrix; Jerzy Jurka; Vladimir V. Kapitonov; Ivan Ovcharenko; Nicholas H. Putnam; Shengqiang Shu; Leila Taher; Ira L. Blitz; Bruce Blumberg; Darwin S. Dichmann; Inna Dubchak; Enrique Amaya; John C. Detter; Russell B. Fletcher; Daniela S. Gerhard; David L. Goodstein; Tina Graves; Igor V. Grigoriev; Jane Grimwood; Takeshi Kawashima; Erika Lindquist; Susan Lucas; Paul E. Mead; Therese Mitros; Hajime Ogino; Yuko Ohta; Alexander Poliakov
Frog Genome: The western clawed frog Xenopus tropicalis is the first amphibian to have its genome sequenced. Hellsten et al. (p. 633, see the cover) present an analysis of a draft assembly of the genome. The genome of the frog, which is an important model system for developmental biology, encodes over 20,000 protein-coding genes, of which more than 1700 genes have identified human disease associations. Detailed comparison of the content of protein-coding genes with other tetrapods (human and chicken) reveals extensive shared synteny, occasionally spanning entire chromosomes. Assembly, annotation, and analysis of the frog genome compares gene content and synteny with the human and chicken genomes. The western clawed frog Xenopus tropicalis is an important model for vertebrate development that combines experimental advantages of the African clawed frog Xenopus laevis with more tractable genetics. Here we present a draft genome sequence assembly of X. tropicalis. This genome encodes more than 20,000 protein-coding genes, including orthologs of at least 1700 human disease genes. Over 1 million expressed sequence tags validated the annotation. More than one-third of the genome consists of transposable elements, with unusually prevalent DNA transposons. Like that of other tetrapods, the genome of X. tropicalis contains gene deserts enriched for conserved noncoding elements. The genome exhibits substantial shared synteny with human and chicken over major parts of large chromosomes, broken by lineage-specific chromosome fusions and fissions, mainly in the mammalian lineage.
Nature | 2004
Jane Grimwood; Laurie Gordon; Anne S. Olsen; Astrid Terry; Jeremy Schmutz; Jane Lamerdin; Uffe Hellsten; David Goodstein; Olivier Couronne; Mary Tran-Gyamfi; Andrea Aerts; Michael R. Altherr; Linda Ashworth; Eva Bajorek; Stacey Black; Elbert Branscomb; Sean Caenepeel; Anthony Carrano; Yee Man Chan; Mari Christensen; Catherine A. Cleland; Alex Copeland; Eileen Dalin; Paramvir Dehal; Mirian Denys; John C. Detter; Julio Escobar; Dave Flowers; Dea Fotopulos; Carmen Garcia
Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.
Genome Research | 2010
Valer Gotea; Axel Visel; John M. Westlund; Marcelo A. Nobrega; Len A. Pennacchio; Ivan Ovcharenko
Clustering of multiple transcription factor binding sites (TFBSs) for the same transcription factor (TF) is a common feature of cis-regulatory modules in invertebrate animals, but the occurrence of such homotypic clusters of TFBSs (HCTs) in the human genome has remained largely unknown. To explore whether HCTs are also common in human and other vertebrates, we used known binding motifs for vertebrate TFs and a hidden Markov model-based approach to detect HCTs in the human, mouse, chicken, and fugu genomes, and examined their association with cis-regulatory modules. We found that evolutionarily conserved HCTs occupy nearly 2% of the human genome, with experimental evidence for individual TFs supporting their binding to predicted HCTs. More than half of the promoters of human genes contain HCTs, with a distribution around the transcription start site in agreement with the experimental data from the ENCODE project. In addition, almost half of the 487 experimentally validated developmental enhancers contain them as well, a number more than 25-fold larger than expected by chance. We also found evidence of negative selection acting on TFBSs within HCTs, as the conservation of TFBSs is stronger than the conservation of sequences separating them. The important role of HCTs as components of developmental enhancers is additionally supported by a strong correlation between HCTs and the binding of the enhancer-associated coactivator protein Ep300 (also known as p300). Experimental validation of HCT-containing elements in both zebrafish and mouse suggests that HCTs could be used to predict both the presence of enhancers and their tissue specificity, and are thus a feature that can be effectively used in deciphering the gene regulatory code. In conclusion, our results indicate that HCTs are a pervasive feature of human cis-regulatory modules and suggest that they play an important role in gene regulation in the human and other vertebrate genomes.
Cell | 2013
Axel Visel; Leila Taher; Hani Z. Girgis; Dalit May; Olga Golonzhka; Renée V. Hoch; Gabriel L. McKinsey; Kartik Pattabiraman; Shanni N. Silberberg; Matthew J. Blow; David V. Hansen; Alex S. Nord; Jennifer A. Akiyama; Amy Holt; Roya Hosseini; Sengthavy Phouanenavong; Ingrid Plajzer-Frick; Malak Shoukry; Veena Afzal; Tommy Kaplan; Arnold R. Kriegstein; Edward M. Rubin; Ivan Ovcharenko; Len A. Pennacchio; John L.R. Rubenstein
The mammalian telencephalon plays critical roles in cognition, motor function, and emotion. Though many of the genes required for its development have been identified, the distant-acting regulatory sequences orchestrating their in vivo expression are mostly unknown. Here, we describe a digital atlas of in vivo enhancers active in subregions of the developing telencephalon. We identified more than 4,600 candidate embryonic forebrain enhancers and studied the in vivo activity of 329 of these sequences in transgenic mouse embryos. We generated serial sets of histological brain sections for 145 reproducible forebrain enhancers, resulting in a publicly accessible web-based data collection comprising more than 32,000 sections. We also used epigenomic analysis of human and mouse cortex tissue to directly compare the genome-wide enhancer architecture in these species. These data provide a primary resource for investigating gene regulatory mechanisms of telencephalon development and enable studies of the role of distant-acting enhancers in neurodevelopmental disorders.
Nature Genetics | 2013
Robin P. Smith; Leila Taher; Rupali P Patwardhan; Mee J. Kim; Fumitaka Inoue; Jay Shendure; Ivan Ovcharenko; Nadav Ahituv
Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers.
Nucleic Acids Research | 2008
Valer Gotea; Ivan Ovcharenko
Regulation of gene expression in eukaryotic genomes is established through a complex cooperative activity of proximal promoters and distant regulatory elements (REs) such as enhancers, repressors and silencers. We have developed a web server named DiRE, based on the Enhancer Identification (EI) method, for predicting distant regulatory elements in higher eukaryotic genomes, namely for determining their chromosomal location and functional characteristics. The server uses gene co-expression data, comparative genomics and profiles of transcription factor binding sites (TFBSs) to determine TFBS-association signatures that can be used for discriminating specific regulatory functions. DiRE's unique feature is its ability to detect REs outside of proximal promoter regions, as it takes advantage of the full gene locus to conduct the search. DiRE can predict common REs for any set of input genes for which the user has prior knowledge of co-expression, co-function or other biologically meaningful grouping. The server predicts function-specific REs consisting of clusters of specifically-associated TFBSs and it also scores the association of individual transcription factors (TFs) with the biological function shared by the group of input genes. Its integration with the Array2BIO server allows users to start their analysis with raw microarray expression data. The DiRE web server is freely available at http://dire.dcode.org.
Genome Research | 2010
Leelavati Narlikar; Noboru Jo Sakabe; Alexander Blanski; Fabio Eiji Arimura; John M. Westlund; Marcelo A. Nobrega; Ivan Ovcharenko
The various organogenic programs deployed during embryonic development rely on the precise expression of a multitude of genes in time and space. Identifying the cis-regulatory elements responsible for this tightly orchestrated regulation of gene expression is an essential step in understanding the genetic pathways involved in development. We describe a strategy to systematically identify tissue-specific cis-regulatory elements that share combinations of sequence motifs. Using heart development as an experimental framework, we employed a combination of Gibbs sampling and linear regression to build a classifier that identifies heart enhancers based on the presence and/or absence of various sequence features, including known and putative transcription factor (TF) binding specificities. In distinguishing heart enhancers from a large pool of random noncoding sequences, the performance of our classifier is vastly superior to four commonly used methods, with an accuracy reaching 92% in cross-validation. Furthermore, most of the binding specificities learned by our method resemble the specificities of TFs widely recognized as key players in heart development and differentiation, such as SRF, MEF2, ETS1, SMAD, and GATA. Using our classifier as a predictor, a genome-wide scan identified over 40,000 novel human heart enhancers. Although the classifier used no gene expression information, these novel enhancers are strongly associated with genes expressed in the heart. Finally, in vivo tests of our predictions in mouse and zebrafish achieved a validation rate of 62%, significantly higher than what is expected by chance. These results support the existence of underlying cis-regulatory codes dictating tissue-specific transcription in mammalian genomes and validate our enhancer classifier strategy as a method to uncover these regulatory codes.
Bioinformatics | 2007
Gabriela G. Loots; Ivan Ovcharenko
Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. We have created a database of evolutionarily conserved regions (ECRs) in vertebrate genomes, entitled ECRbase, which is constructed from a collection of whole-genome alignments produced by the ECR Browser. ECRbase features a database of syntenic blocks that recapitulate the evolution of rearrangements in vertebrates and a comprehensive collection of promoters in all vertebrate genomes generated using multiple sources of gene annotation. The database also contains a collection of annotated transcription factor binding sites (TFBSs) in evolutionarily conserved and promoter elements. ECRbase currently includes human, rhesus macaque, dog, opossum, rat, mouse, chicken, frog, zebrafish and fugu genomes. It is freely accessible at http://ecrbase.dcode.org.
Nature Structural & Molecular Biology | 2011
David Martin; Cristina Pantoja; Ana Fernández Miñán; Christian Valdes-Quezada; Eduardo Moltó; Fuencisla Matesanz; Ozren Bogdanović; Elisa de la Calle-Mustienes; Orlando Domínguez; Leila Taher; Mayra Furlan-Magaril; Susana Cañón; María Fedetz; Maria A. Blasco; Paulo Pereira; Ivan Ovcharenko; Félix Recillas-Targa; Lluís Montoliu; Miguel Manzanares; Roderic Guigó; Manuel Serrano; Fernando Casares; José Luis Gómez-Skarmeta
Many genomic alterations associated with human diseases localize in noncoding regulatory elements located far from the promoters they regulate, making it challenging to link noncoding mutations or risk-associated variants with target genes. The range of action of a given set of enhancers is thought to be defined by insulator elements bound by the 11 zinc-finger nuclear factor CCCTC-binding protein (CTCF). Here we analyzed the genomic distribution of CTCF in various human, mouse and chicken cell types, demonstrating the existence of evolutionarily conserved CTCF-bound sites beyond mammals. These sites preferentially flank transcription factor–encoding genes, often associated with human diseases, and function as enhancer blockers in vivo, suggesting that they act as evolutionarily invariant gene boundaries. We then applied this concept to predict and functionally demonstrate that the polymorphic variants associated with multiple sclerosis located within the EVI5 gene impinge on the adjacent gene GFI1.
PLOS ONE | 2011
Leila Taher; Nicole M. Collette; Deepa K. Murugesh; Evan Maxwell; Ivan Ovcharenko; Gabriela G. Loots
Detailed information about stage-specific changes in gene expression is crucial for understanding the gene regulatory networks underlying development and the various signal transduction pathways contributing to morphogenesis. Here we describe the global gene expression dynamics during early murine limb development, when cartilage, tendons, muscle, joints, vasculature and nerves are specified and the musculoskeletal system of limbs is established. We used whole-genome microarrays to identify genes with differential expression at 5 stages of limb development (E9.5 to E13.5), during fore- and hind-limb patterning. We found that the onset of limb formation is characterized by an up-regulation of transcription factors, which is followed by a massive activation of genes during E10.5 and E11.5 that levels off at later time points. Among the 3520 genes identified as significantly up-regulated in the limb, we find ∼30% to be novel, dramatically expanding the repertoire of candidate genes likely to function in the limb. Hierarchical and stage-specific clustering identified expression profiles that are likely to correlate with functional programs during limb development, and further characterization of these transcripts will provide new insights into specific tissue patterning processes. Here, we provide for the first time a comprehensive analysis of developmentally regulated genes during murine limb development, and provide some novel insights into the expression dynamics governing limb morphogenesis.