Publication


Featured research published by Ivan V. Kulakovskiy.


BMC Genomics | 2014

Effects of cytosine methylation on transcription factor binding sites

Yulia A. Medvedeva; Abdullah M. Khamis; Ivan V. Kulakovskiy; Wail Ba-alawi; Shariful Islam Bhuyan; Hideya Kawaji; Timo Lassmann; Matthias Harbers; Alistair R. R. Forrest; Vladimir B. Bajic

Background: DNA methylation in promoters is closely linked to downstream gene repression. However, whether DNA methylation is a cause or a consequence of gene repression remains an open question. If it is a cause, then DNA methylation may affect the affinity of transcription factors (TFs) for their binding sites (TFBSs). If it is a consequence, then gene repression caused by chromatin modification may be stabilized by DNA methylation. Until now, these two possibilities have been supported only by non-systematic evidence and they have not been tested on a wide range of TFs. An average promoter methylation is usually used in studies, whereas recent results suggested that methylation of individual cytosines can also be important.
Results: We found that the methylation profiles of 16.6% of cytosines and the expression profiles of neighboring transcriptional start sites (TSSs) were significantly negatively correlated. We called the CpGs corresponding to such cytosines “traffic lights”. We observed a strong selection against CpG “traffic lights” within TFBSs. The negative selection was stronger for transcriptional repressors as compared with transcriptional activators or multifunctional TFs as well as for core TFBS positions as compared with flanking TFBS positions.
Conclusions: Our results indicate that direct and selective methylation of certain TFBS that prevents TF binding is restricted to special cases and cannot be considered as a general regulatory mechanism of transcription.
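
The correlation test described in the Results can be illustrated with a minimal sketch. It assumes a Spearman rank correlation and a fixed cutoff in place of a significance test; the abstract states neither the statistic nor the test used, and the sample values and the `is_negative_candidate` helper are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def is_negative_candidate(methylation, expression, cutoff=-0.8):
    """Hypothetical rule: flag a CpG whose methylation tracks expression
    inversely. The cutoff stands in for a proper significance test."""
    return spearman(methylation, expression) <= cutoff


# Made-up profiles of one cytosine and a nearby TSS across five samples.
methylation = [0.9, 0.8, 0.6, 0.3, 0.1]
expression = [1.0, 2.5, 4.0, 7.5, 9.0]
rho = spearman(methylation, expression)
```

In a real analysis the cutoff would be replaced by a p-value with multiple-testing correction over all cytosines.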


Nucleic Acids Research | 2013

HOCOMOCO: a comprehensive collection of human transcription factor binding sites models

Ivan V. Kulakovskiy; Yulia A. Medvedeva; Ulf Schaefer; Artem S. Kasianov; Ilya E. Vorontsov; Vladimir B. Bajic; Vsevolod J. Makeev

Transcription factor (TF) binding site (TFBS) models are crucial for computational reconstruction of transcription regulatory networks. In existing repositories, a TF often has several models (also called binding profiles or motifs), obtained from different experimental data. Having a single TFBS model for a TF is more pragmatic for practical applications. We show that integration of TFBS data from various types of experiments into a single model typically results in improved model quality, probably due to partial correction of source-specific technique bias. We present the Homo sapiens comprehensive model collection (HOCOMOCO, http://autosome.ru/HOCOMOCO/, http://cbrc.kaust.edu.sa/hocomoco/) containing carefully hand-curated TFBS models constructed by integration of binding sequences obtained by both low- and high-throughput methods. To construct position weight matrices to represent these TFBS models, we used ChIPMunk software in four computational modes, including a newly developed periodic positional prior mode associated with DNA helix pitch. We selected only one TFBS model per TF, unless there was clear experimental evidence for two rather distinct TFBS models. We assigned a quality rating to each model. HOCOMOCO contains 426 systematically curated TFBS models for 401 human TFs, where 172 models are based on more than one data source.
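
For readers unfamiliar with position weight matrices, the following is a minimal sketch of the textbook conversion from already aligned binding sites to a log-odds matrix. It is not ChIPMunk's algorithm, which discovers the alignment itself; the toy sites, the pseudocount, and the uniform background are assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from aligned sites.

    Each cell is ln(frequency / background), where the frequency is
    smoothed with a pseudocount so unseen bases get a finite score.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        row = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            row[base] = math.log(freq / background)
        pwm.append(row)
    return pwm


# Made-up aligned sites, used only to exercise the function.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pwm = build_pwm(sites)
```

Positive cells mark bases that are over-represented at a position relative to the background, negative cells mark under-represented ones.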


Bioinformatics | 2010

Deep and wide digging for binding motifs in ChIP-Seq data

Ivan V. Kulakovskiy; Valentina Boeva; Alexander V. Favorov; Vsevolod J. Makeev

Summary: ChIP-Seq data are a new challenge for motif discovery. Such data typically consist of thousands of DNA segments with base-specific coverage values. We present a new version of our DNA motif discovery software ChIPMunk adapted for ChIP-Seq data. ChIPMunk is an iterative algorithm that combines greedy optimization with bootstrapping and uses coverage profiles as motif positional preferences. ChIPMunk does not require truncation of long DNA segments and it is practical for processing up to tens of thousands of data sequences. Comparison with traditional (MEME) or ChIP-Seq-oriented (HMS) motif discovery tools shows that ChIPMunk identifies the correct motifs with the same or better quality but works dramatically faster.
Availability and implementation: ChIPMunk is freely available within the ru_genetika Java package: http://line.imb.ac.ru/ChIPMunk. A web-based version is also available.
Supplementary information: Supplementary data are available at Bioinformatics online.


Nucleic Acids Research | 2016

HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models

Ivan V. Kulakovskiy; Ilya E. Vorontsov; Ivan S. Yevshin; Anastasiia V. Soboleva; Artem S. Kasianov; Haitham Ashoor; Wail Ba-alawi; Vladimir B. Bajic; Yulia A. Medvedeva; Fedor A. Kolpakov; Vsevolod J. Makeev

Models of transcription factor (TF) binding sites provide a basis for a wide spectrum of studies in regulatory genomics, from reconstruction of regulatory networks to functional annotation of transcripts and sequence variants. While TFs may recognize different sequence patterns in different conditions, it is pragmatic to have a single generic model for each particular TF as a baseline for practical applications. Here we present the expanded and enhanced version of HOCOMOCO (http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco10), the collection of models of DNA patterns recognized by transcription factors. HOCOMOCO now provides position weight matrix (PWM) models for binding sites of 601 human TFs and, in addition, PWMs for 396 mouse TFs. Furthermore, we introduce the largest collection to date of dinucleotide PWM models for 86 (52) human (mouse) TFs. The update is based on the analysis of massive ChIP-Seq and HT-SELEX datasets, with the validation of the resulting models on in vivo data. To facilitate practical application, all HOCOMOCO models are linked to gene and protein databases (Entrez Gene, HGNC, UniProt) and accompanied by precomputed score thresholds. Finally, we provide command-line tools for PWM and diPWM threshold estimation and motif finding in nucleotide sequences.
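
To show how a PWM and a score threshold are typically used together for motif finding, here is a minimal sketch. It does not reproduce the HOCOMOCO command-line tools or their thresholds; the three-column `toy_pwm`, the sequence, and the threshold value are hypothetical, and only the forward strand is scanned.

```python
def score_window(pwm, window):
    """Sum the PWM cell for each base of a window of the same width."""
    return sum(column[base] for column, base in zip(pwm, window))


def find_hits(pwm, sequence, threshold):
    """Return (start, window, score) for each window scoring >= threshold.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped. Scanning the reverse complement is left out for brevity.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue
        score = score_window(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical 3-bp matrix that favors the word ACG.
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]
hits = find_hits(toy_pwm, "TTACGTACGN", threshold=3.0)
```

With a precomputed threshold, the same loop separates predicted binding sites from background windows without re-estimating score distributions for every sequence.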


Database | 2015

EpiFactors: a comprehensive database of human epigenetic factors and complexes

Yulia A. Medvedeva; Andreas Lennartsson; Rezvan Ehsani; Ivan V. Kulakovskiy; Ilya E. Vorontsov; Pouda Panahandeh; Grigory Khimulya; Takeya Kasukawa; Finn Drabløs

Epigenetics refers to stable and long-term alterations of cellular traits that are not caused by changes in the DNA sequence per se. Rather, covalent modifications of DNA and histones affect gene expression and genome stability via proteins that recognize and act upon such modifications. Many enzymes that catalyse epigenetic modifications or are critical for enzymatic complexes have been discovered, and this is encouraging investigators to study the role of these proteins in diverse normal and pathological processes. Rapidly growing knowledge in the area has resulted in the need for a resource that compiles, organizes and presents curated information to researchers in an easily accessible and user-friendly form. Here we present EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets and products. EpiFactors contains information on 815 proteins, including 95 histones and protamines. For 789 of these genes, we include expression values across several samples, in particular a collection of 458 human primary cell samples (for approximately 200 cell types, in many cases from three individual donors), covering most mammalian cell steady states, 255 different cancer cell lines (representing approximately 150 cancer subtypes) and 134 human postmortem tissues. Expression values were obtained by the FANTOM5 consortium using the Cap Analysis of Gene Expression technique. EpiFactors also contains information on 69 protein complexes that are involved in epigenetic regulation. The resource is practical for a wide range of users, including biologists, pharmacologists and clinicians. Database URL: http://epifactors.autosome.ru


Bioinformatics | 2009

Motif discovery and motif finding from genome-mapped DNase footprint data

Ivan V. Kulakovskiy; Alexander V. Favorov; Vsevolod J. Makeev

Motivation: Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions.
Results: Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. New motifs can recognize footprints with a greater sensitivity at the same false positive rate than existing models. We also discuss possible overfitting of constructed motifs.
Availability: Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.


Stem Cell Reports | 2015

Single-Cell Analyses of ESCs Reveal Alternative Pluripotent Cell States and Molecular Mechanisms that Control Self-Renewal

Dmitri Papatsenko; Henia Darr; Ivan V. Kulakovskiy; Avinash Waghray; Vsevolod J. Makeev; Ben D. MacArthur; Ihor R. Lemischka

Summary: Analyses of gene expression in single mouse embryonic stem cells (mESCs) cultured in serum and LIF revealed the presence of two distinct cell subpopulations with individual gene expression signatures. Comparisons with published data revealed that cells in the first subpopulation are phenotypically similar to cells isolated from the inner cell mass (ICM). In contrast, cells in the second subpopulation appear to be more mature. Pluripotency Gene Regulatory Network (PGRN) reconstruction based on single-cell data and published data suggested antagonistic roles for Oct4 and Nanog in the maintenance of pluripotency states. Integrated analyses of published genomic binding (ChIP) data strongly supported this observation. Certain target genes alternatively regulated by OCT4 and NANOG, such as Sall4 and Zscan10, feed back into the top hierarchical regulator Oct4. Analyses of such incoherent feedforward loops with feedback (iFFL-FB) suggest a dynamic model for the maintenance of mESC pluripotency and self-renewal.


Biophysics | 2009

Discovery of DNA motifs recognized by transcription factors through integration of different experimental sources

Ivan V. Kulakovskiy; Vsevolod J. Makeev

We suggest a new approach for discovery of DNA motifs specifically recognized by transcription factors. The approach is based on combined analysis of data obtained by different experimental methods, and is implemented as the ChIPMunk software. For 39 transcription factors of D. melanogaster we constructed improved binding motifs, typically short (∼10 bp) DNA segments that may specifically interact with these proteins. We used a wide repertoire of publicly available experimental data, including DNase I footprinting, SELEX, ChIP-chip, and the B1H system. The motifs built by integrating data from independent sources demonstrated the best sensitivity. The resulting collection of motifs can be used for more precise positioning of protein-binding sites within DNA fragments that exhibit high affinity. The collection of motifs, including the results of motif recognition power testing, is available on the web: http://line.imb.ac.ru/iDMMPMM. The ChIPMunk software is also available: http://line.imb.ac.ru/Chipmunk.


Nucleic Acids Research | 2012

Spi-1/PU.1 activates transcription through clustered DNA occupancy in erythroleukemia

Maya Ridinger-Saison; Valentina Boeva; Pauline Rimmelé; Ivan V. Kulakovskiy; Isabelle Gallais; Benjamin Levavasseur; Caroline Paccard; Patricia Legoix-Né; François Morlé; Alain Nicolas; Philippe Hupé; Emmanuel Barillot; Françoise Moreau-Gachelin; Christel Guillouf

Acute leukemias are characterized by deregulation of transcriptional networks that control the lineage specificity of gene expression. The aberrant overexpression of the Spi-1/PU.1 transcription factor leads to erythroleukemia. To determine how Spi-1 mechanistically influences the transcriptional program, we combined a ChIP-seq analysis with transcriptional profiling in cells from an erythroleukemic mouse model. We show that Spi-1 displays a selective DNA-binding that does not often cause transcriptional modulation. We report that Spi-1 controls transcriptional activation and repression partially through distinct Spi-1 recruitment to chromatin. We revealed several parameters impacting on Spi-1-mediated transcriptional activation. Gene activation is facilitated by Spi-1 occupancy close to transcriptional starting site of genes devoid of CGIs. Moreover, in those regions Spi-1 acts by binding to multiple motifs tightly clustered and with similar orientation. Finally, in contrast to the myeloid and lymphoid B cells in which Spi-1 exerts a physiological activity, in the erythroleukemic cells, lineage-specific cooperating factors do not play a prevalent role in Spi-1-mediated transcriptional activation. Thus, our work describes a new mechanism of gene activation through clustered site occupancy of Spi-1 particularly relevant in regard to the strong expression of Spi-1 in the erythroleukemic cells.


BMC Genomics | 2014

Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data.

Victor G. Levitsky; Ivan V. Kulakovskiy; Nikita I. Ershov; Dmitry Yu. Oshchepkov; Vsevolod J. Makeev; Tc Hodgman; T. I. Merkulova

Background: ChIP-Seq is widely used to detect genomic segments bound by transcription factors (TF), either directly at DNA binding sites (BSs) or indirectly via other proteins. Currently, there are many software tools implementing different approaches to identify TFBSs within ChIP-Seq peaks. However, their use for the interpretation of ChIP-Seq data is usually complicated by the absence of direct experimental verification, making it difficult both to set a threshold to avoid recognition of too many false-positive BSs, and to compare the actual performance of different models.
Results: Using ChIP-Seq data for FoxA2 binding loci in mouse adult liver and human HepG2 cells we compared FoxA binding-site predictions for four computational models of two fundamental classes: pattern matching based on existing training set of experimentally confirmed TFBSs (oPWM and SiteGA) and de novo motif discovery (ChIPMunk and diChIPMunk). To properly select prediction thresholds for the models, we experimentally evaluated affinity of 64 predicted FoxA BSs using EMSA that allows safely distinguishing sequences able to bind TF. As a result we identified thousands of reliable FoxA BSs within ChIP-Seq loci from mouse liver and human HepG2 cells. It was found that the performance of conventional position weight matrix (PWM) models was inferior with the highest false positive rate. On the contrary, the best recognition efficiency was achieved by the combination of SiteGA & diChIPMunk/ChIPMunk models, properly identifying FoxA BSs in up to 90% of loci for both mouse and human ChIP-Seq datasets.
Conclusions: The experimental study of TF binding to oligonucleotides corresponding to predicted sites increases the reliability of computational methods for TFBS-recognition in ChIP-Seq data analysis. Regarding ChIP-Seq data interpretation, basic PWMs have inferior TFBS recognition quality compared to the more sophisticated SiteGA and de novo motif discovery methods. A combination of models from different principles allowed identification of proper TFBSs.

Collaboration


Dive into Ivan V. Kulakovskiy's collaborations.

Top Co-Authors

Vsevolod J. Makeev
Russian Academy of Sciences

Ilya E. Vorontsov
Russian Academy of Sciences

Artem S. Kasianov
Russian Academy of Sciences

Yulia A. Medvedeva
Russian Academy of Sciences

Irina A. Eliseeva
Russian Academy of Sciences

Vladimir B. Bajic
King Abdullah University of Science and Technology

Anton M. Schwartz
Engelhardt Institute of Molecular Biology

Dmitry V. Kuprash
Engelhardt Institute of Molecular Biology

Fedor A. Kolpakov
Russian Academy of Sciences

Ivan S. Yevshin
Russian Academy of Sciences