Fedor A. Kolpakov
Russian Academy of Sciences
Publication
Featured research published by Fedor A. Kolpakov.
Nucleic Acids Research | 1998
T. Heinemeyer; Edgar Wingender; I. Reuter; H. Hermjakob; Alexander E. Kel; O. V. Kel; E. V. Ignatieva; Elena A. Ananko; O. A. Podkolodnaya; Fedor A. Kolpakov; Nikolay L. Podkolodny; Nikolay A. Kolchanov
TRANSFAC, TRRD (Transcription Regulatory Region Database) and COMPEL are databases which store information about transcriptional regulation in eukaryotic cells. The three databases provide distinct views on the components involved in transcription: transcription factors and their binding sites and binding profiles (TRANSFAC), the regulatory hierarchy of whole genes (TRRD), and the structural and functional properties of composite elements (COMPEL). The quantitative and qualitative changes of all three databases and connected programs are described. The databases are accessible via WWW: http://transfac.gbf.de/TRANSFAC or http://www.bionet.nsc.ru/TRRD
Nature Biotechnology | 2009
Nicolas Le Novère; Michael Hucka; Huaiyu Mi; Stuart L. Moodie; Falk Schreiber; Anatoly A. Sorokin; Emek Demir; Katja Wegner; Mirit I. Aladjem; Sarala M. Wimalaratne; Frank T. Bergman; Ralph Gauges; Peter Ghazal; Hideya Kawaji; Lu Li; Yukiko Matsuoka; Alice Villéger; Sarah E. Boyd; Laurence Calzone; Mélanie Courtot; Ugur Dogrusoz; Tom C. Freeman; Akira Funahashi; Samik Ghosh; Akiya Jouraku; Sohyoung Kim; Fedor A. Kolpakov; Augustin Luna; Sven Sahle; Esther Schmidt
Circuit diagrams and Unified Modeling Language diagrams are just two examples of standard visual languages that help accelerate work by promoting regularity, removing ambiguity and enabling software tool support for communication of complex information. Ironically, despite having one of the highest ratios of graphical to textual information, biology still lacks standard graphical notations. The recent deluge of biological knowledge makes addressing this deficit a pressing concern. Toward this goal, we present the Systems Biology Graphical Notation (SBGN), a visual language developed by a community of biochemists, modelers and computer scientists. SBGN consists of three complementary languages: process diagram, entity relationship diagram and activity flow diagram. Together they enable scientists to represent networks of biochemical interactions in a standard, unambiguous way. We believe that SBGN will foster efficient and accurate representation, visualization, storage, exchange and reuse of information on all kinds of biological knowledge, from gene regulation, to metabolism, to cellular signaling.
BMC Systems Biology | 2011
Dagmar Waltemath; Richard Adams; Frank Bergmann; Michael Hucka; Fedor A. Kolpakov; Andrew K. Miller; Ion I. Moraru; David Nickerson; Sven Sahle; Jacky L. Snoep; Nicolas Le Novère
Background: The increasing use of computational simulation experiments to inform modern biological research creates new challenges to annotate, archive, share and reproduce such experiments. The recently published Minimum Information About a Simulation Experiment (MIASE) proposes a minimal set of information that should be provided to allow the reproduction of simulation experiments among users and software tools. Results: In this article, we present the Simulation Experiment Description Markup Language (SED-ML). SED-ML encodes in a computer-readable exchange format the information required by MIASE to enable reproduction of simulation experiments. It has been developed as a community project and it is defined in a detailed technical specification and additionally provides an XML schema. The version of SED-ML described in this publication is Level 1 Version 1. It covers the description of the most frequent type of simulation experiments in the area, namely time course simulations. SED-ML documents specify which models to use in an experiment, modifications to apply on the models before using them, which simulation procedures to run on each model, what analysis results to output, and how the results should be presented. These descriptions are independent of the underlying model implementation. SED-ML is a software-independent format for encoding the description of simulation experiments; it is not specific to particular simulation tools. Here, we demonstrate that with the growing software support for SED-ML we can effectively exchange executable simulation descriptions. Conclusions: With SED-ML, software can exchange simulation experiment descriptions, enabling the validation and reuse of simulation experiments in different tools. Authors of papers reporting simulation experiments can make their simulation protocols available for other scientists to reproduce the results. Because SED-ML is agnostic about exact modeling language(s) used, experiments covering models from different fields of research can be accurately described and combined.
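The parts of a simulation experiment listed in the abstract above (models, modifications, simulation procedures, outputs) can be pictured with a small data structure. The sketch below is only an illustration of how such a description hangs together; the class and field names are hypothetical and do not reproduce the SED-ML XML schema or any SED-ML library.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    id: str
    source: str                                   # where the model file lives (hypothetical value below)
    changes: dict = field(default_factory=dict)   # parameter overrides applied before simulation


@dataclass
class TimeCourse:
    id: str
    start: float
    end: float
    points: int


@dataclass
class Task:
    id: str
    model_id: str        # which model to run
    simulation_id: str   # which simulation procedure to apply


@dataclass
class Experiment:
    models: list
    simulations: list
    tasks: list
    outputs: list        # names of results to report (illustrative only)

    def dangling_tasks(self):
        """Return ids of tasks that reference an unknown model or simulation."""
        model_ids = {m.id for m in self.models}
        sim_ids = {s.id for s in self.simulations}
        return [
            t.id
            for t in self.tasks
            if t.model_id not in model_ids or t.simulation_id not in sim_ids
        ]


experiment = Experiment(
    models=[Model("m1", "model.xml", changes={"k1": 0.5})],
    simulations=[TimeCourse("sim1", start=0.0, end=100.0, points=1000)],
    tasks=[Task("task1", "m1", "sim1"), Task("task2", "m2", "sim1")],
    outputs=["plot_of_task1"],
)
dangling = experiment.dangling_tasks()
print(dangling)
```

Keeping the model, the simulation settings and the task that links them as separate records is what lets the same description be reused across tools: nothing in the experiment depends on how a particular simulator stores the model.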
Nucleic Acids Research | 2016
Ivan V. Kulakovskiy; Ilya E. Vorontsov; Ivan S. Yevshin; Anastasiia V. Soboleva; Artem S. Kasianov; Haitham Ashoor; Wail Ba-alawi; Vladimir B. Bajic; Yulia A. Medvedeva; Fedor A. Kolpakov; Vsevolod J. Makeev
Models of transcription factor (TF) binding sites provide a basis for a wide spectrum of studies in regulatory genomics, from reconstruction of regulatory networks to functional annotation of transcripts and sequence variants. While TFs may recognize different sequence patterns in different conditions, it is pragmatic to have a single generic model for each particular TF as a baseline for practical applications. Here we present the expanded and enhanced version of HOCOMOCO (http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco10), the collection of models of DNA patterns, recognized by transcription factors. HOCOMOCO now provides position weight matrix (PWM) models for binding sites of 601 human TFs and, in addition, PWMs for 396 mouse TFs. Furthermore, we introduce the largest up to date collection of dinucleotide PWM models for 86 (52) human (mouse) TFs. The update is based on the analysis of massive ChIP-Seq and HT-SELEX datasets, with the validation of the resulting models on in vivo data. To facilitate a practical application, all HOCOMOCO models are linked to gene and protein databases (Entrez Gene, HGNC, UniProt) and accompanied by precomputed score thresholds. Finally, we provide command-line tools for PWM and diPWM threshold estimation and motif finding in nucleotide sequences.
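To make the idea of a PWM with a precomputed score threshold concrete, here is a minimal sketch of scanning a sequence with a mononucleotide matrix. The matrix values, the threshold and the function names are invented for illustration; they are not taken from HOCOMOCO and do not reflect its command-line tools.

```python
# Hypothetical 4-column PWM of log-odds scores; the numbers are made up.
PWM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -1.0, "C": 1.1, "G": -0.7, "T": -0.4},
    {"A": -0.6, "C": -0.9, "G": 1.3, "T": -1.1},
    {"A": -0.9, "C": -0.3, "G": -1.2, "T": 1.0},
]


def score_window(window, pwm):
    """Sum the per-position scores of one window (assumes only A, C, G, T)."""
    return sum(column[base] for column, base in zip(pwm, window))


def find_hits(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_window(window, pwm)
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


hits = find_hits("TTACGTAA", PWM, threshold=4.0)
print(hits)
```

A dinucleotide PWM would follow the same pattern but index each column by a pair of adjacent bases instead of a single base.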
Bioinformatics | 1998
Fedor A. Kolpakov; Elena A. Ananko; Grigory Kolesov; N. A. Kolchanov
MOTIVATION: Gene networks that provide the regulation of physiological processes are the basic feature of organisms. Information regarding the regulation of gene expression and signal transduction pathways is increasing rapidly. However, the information is hard to formalize and systematize. Ways and means for automated visualization of the gene networks based on their formalized description are needed. RESULTS: The object-oriented database GeneNet and the software for its automated visualization have been developed. The main principles of a formalized description of the gene network have been worked out. Antiviral response and erythropoiesis are provided as examples to show how this is achieved. The GeneNet graphical user interface written in Java provides automated generation of the gene network diagrams and allows visualization and exploration of the GeneNet database through the Internet. A system of filters allows the selection of particular components of the network for visualization. AVAILABILITY: The GeneNet database and its graphical user interface are available at http://wwwmgs.bionet.nsc.ru/systems/MGL/GeneNet/ CONTACT: [email protected]
Nucleic Acids Research | 2017
Ivan S. Yevshin; Ruslan N. Sharipov; Tagir Valeev; Alexander E. Kel; Fedor A. Kolpakov
GTRD—Gene Transcription Regulation Database (http://gtrd.biouml.org)—is a database of transcription factor binding sites (TFBSs) identified by ChIP-seq experiments for human and mouse. Raw ChIP-seq data were obtained from ENCODE and SRA and uniformly processed: (i) reads were aligned using Bowtie2; (ii) ChIP-seq peaks were called using peak callers MACS, SISSRs, GEM and PICS; (iii) peaks for the same factor and peak callers, but different experiment conditions (cell line, treatment, etc.), were merged into clusters; (iv) such clusters for different peak callers were merged into metaclusters that were considered as non-redundant sets of TFBSs. In addition to information on location in genome, the sets contain structured information about cell lines and experimental conditions extracted from descriptions of corresponding ChIP-seq experiments. A web interface to access GTRD was developed using the BioUML platform. It provides: (i) browsing and displaying information; (ii) advanced search possibilities, e.g. search of TFBSs near the specified gene or search of all genes potentially regulated by a specified transcription factor; (iii) integrated genome browser that provides visualization of the GTRD data: read alignments, peaks, clusters, metaclusters and information about gene structures from the Ensembl database and binding sites predicted using position weight matrices from the HOCOMOCO database.
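Steps (iii) and (iv) of the processing described above both merge genomic intervals. The abstract does not give the merging rule, so the sketch below assumes the simplest one: intervals on the same chromosome that overlap or touch are joined. The caller names are from the abstract, but the coordinates and function name are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Assumes half-open coordinates; this is an illustrative rule, not GTRD's documented one.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up peaks for one transcription factor, grouped by peak caller.
peaks_by_caller = {
    "MACS": [("chr1", 100, 200), ("chr1", 150, 260), ("chr2", 10, 50)],
    "GEM": [("chr1", 240, 300), ("chr2", 500, 520)],
}

# Step (iii): one set of clusters per peak caller.
clusters = {caller: merge_intervals(peaks) for caller, peaks in peaks_by_caller.items()}

# Step (iv): clusters from all callers merged into metaclusters.
metaclusters = merge_intervals(
    [interval for caller_clusters in clusters.values() for interval in caller_clusters]
)
print(metaclusters)
```

Merging per caller first and across callers second keeps the intermediate clusters available, which matches the abstract's note that the genome browser displays peaks, clusters and metaclusters separately.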
Nucleic Acids Research | 2018
Ivan V. Kulakovskiy; Ilya E. Vorontsov; Ivan S. Yevshin; Ruslan N. Sharipov; Alla D. Fedorova; Eugene I. Rumynskiy; Yulia A. Medvedeva; Arturo Magana-Mora; Vladimir B. Bajic; Dmitri A. Papatsenko; Fedor A. Kolpakov; Vsevolod J. Makeev
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
BMC Systems Biology | 2013
Elena Kutumova; Andrei Zinovyev; Ruslan N. Sharipov; Fedor A. Kolpakov
Background: Many mathematical models characterizing mechanisms of cell fate decisions have been constructed recently. Their further study may be impossible without development of methods of model composition, which is complicated by the fact that several models describing the same processes could use different reaction chains or incomparable sets of parameters. Detailed models not supported by sufficient volume of experimental data suffer from non-unique choice of parameter values, non-reproducible results, and difficulty of analysis. Thus, it is necessary to reduce existing models to identify key elements determining their dynamics, and it is also required to design methods that allow us to combine them. Results: Here we propose a new approach to model composition, based on reducing several models to the same level of complexity and subsequently combining them. Firstly, we suggest a set of model reduction tools that can be systematically applied to a given model. Secondly, we suggest a notion of a minimal complexity model. This model is the simplest one that can be obtained from the original model using these tools and is still able to approximate experimental data. Thirdly, we propose a strategy for composing the reduced models together. Connection with the detailed model is preserved, which can be advantageous in some applications. A toolbox for model reduction and composition has been implemented as part of the BioUML software and tested on the example of integrating two previously published models of the CD95 (APO-1/Fas) signaling pathways. We show that the reduced models lead to the same dynamical behavior of observable species and the same predictions as in the precursor models. The composite model is able to recapitulate several experimental datasets which were used by the authors of the original models to calibrate them separately, but also has new dynamical properties. Conclusion: Model complexity should be comparable to the complexity of the data used to train the model. Systematic application of model reduction methods allows implementing this modeling principle and finding models of minimal complexity compatible with the data. Combining such models is much easier than combining the precursor models and leads to new model properties and predictions.
Nucleic Acids Research | 2007
Fedor A. Kolpakov; Vladimir Poroikov; Ruslan N. Sharipov; Y. V. Kondrakhin; Alexey Zakharov; Alexey Lagunin; Luciano Milanesi; Alexander E. Kel
Computational modelling of mammalian cell cycle regulation is a challenging task, which requires comprehensive knowledge on many interrelated processes in the cell. We have developed a web-based integrated database on cell cycle regulation in mammals in normal and pathological states (Cyclonet database). It integrates data obtained by ‘omics’ sciences and chemoinformatics on the basis of systems biology approach. Cyclonet is a specialized resource, which enables researchers working in the field of anticancer drug discovery to analyze the wealth of currently available information in a systematic way. Cyclonet contains information on relevant genes and molecules; diagrams and models of cell cycle regulation and results of their simulation; microarray data on cell cycle and on various types of cancer, information on drug targets and their ligands, as well as extensive bibliography on modelling of cell cycle and cancer-related gene expression data. The Cyclonet database is also accessible through the BioUML workbench, which allows flexible querying, analyzing and editing the data by means of visual modelling. Cyclonet aims to predict promising anticancer targets and their agents by application of Prediction of Activity Spectra for Substances. The Cyclonet database is available at .
International Conference on Bioinformatics | 1999
Fedor A. Kolpakov; Elena A. Ananko
SUMMARY: The GeneNet database has been developed for a formalized hierarchical description of the gene networks. To provide rapid data accumulation in the database, a Java graphical interface has been developed that lets independent experts enter data through the Internet using convenient visual tools. AVAILABILITY: http://wwwmgs.bionet.nsc.ru/systems/MGL/GeneNet/