Publication


Featured research published by Roman A. Sutormin.


BMC Genomics | 2013

RegPrecise 3.0 – A resource for genome-scale exploration of transcriptional regulation in bacteria

Pavel S. Novichkov; Alexey E. Kazakov; Dmitry A. Ravcheev; Semen A. Leyn; Galina Yu Kovaleva; Roman A. Sutormin; Marat D. Kazanov; William J Riehl; Adam P. Arkin; Inna Dubchak; Dmitry A. Rodionov

Background: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. Comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches).

Description: RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualizations of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes.

Conclusions: RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.


bioRxiv | 2016

The DOE Systems Biology Knowledgebase (KBase)

Adam P. Arkin; Rick Stevens; Robert W. Cottingham; Sergei Maslov; Christopher S. Henry; Paramvir Dehal; Doreen Ware; Fernando Perez; Nomi L. Harris; Shane Canon; Michael W Sneddon; Matthew L Henderson; William J Riehl; Dan Gunter; Dan Murphy-Olson; Stephen Chan; Roy T Kamimura; Thomas S Brettin; Folker Meyer; Dylan Chivian; David J. Weston; Elizabeth M. Glass; Brian H. Davison; Sunita Kumari; Benjamin H Allen; Jason K. Baumohl; Aaron A. Best; Ben Bowen; Steven E. Brenner; Christopher C Bun

The U.S. Department of Energy Systems Biology Knowledgebase (KBase) is an open-source software and data platform designed to meet the grand challenge of systems biology — predicting and designing biological function from the biomolecular (small scale) to the ecological (large scale). KBase is available for anyone to use, and enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; perform large-scale analyses on scalable computing infrastructure; and combine experimental evidence and conclusions that lead to accurate models of plant and microbial physiology and community dynamics. The KBase platform has (1) extensible analytical capabilities that currently include genome assembly, annotation, ontology assignment, comparative genomics, transcriptomics, and metabolic modeling; (2) a web-browser-based user interface that supports building, sharing, and publishing reproducible and well-annotated analyses with integrated data; (3) access to extensive computational resources; and (4) a software development kit allowing the community to add functionality to the system.


Proteins | 2003

BATMAS30: Amino acid substitution matrix for alignment of bacterial transporters

Roman A. Sutormin; Aleksandra B. Rakhmaninova; Mikhail S. Gelfand

Aligned amino acid sequences of three functionally independent samples of transmembrane (TM) transport proteins have been analyzed. The concept of TM‐kernel is proposed as the most probable transmembrane region of a sequence. The average amino acid composition of TM‐kernels differs from the published amino acid composition of transmembrane segments. TM‐kernels contain more alanines and glycines, and fewer polar, charged, and aromatic residues, than non‐TM‐proteins. There are also differences between TM‐kernels of bacterial and eukaryotic proteins. We have constructed amino acid substitution matrices for bacterial TM‐kernels, named the BATMAS (BActerial Transmembrane MAtrix of Substitutions) series. In TM‐kernels, polar and charged residues, as well as proline and tyrosine, are highly conserved, whereas there are more substitutions within the group of hydrophobic residues, in contrast to non‐TM‐proteins that have fewer, relatively more conserved, hydrophobic residues. These results demonstrate that alignment of transmembrane proteins should be based on at least two amino acid substitution matrices, one for loops (e.g., the BLOSUM series) and one for TM‐segments (the BATMAS series), and the choice of the TM‐matrix should be different for eukaryotic and bacterial proteins. Proteins 2003;51:85–95.
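The recommendation to score loops and TM segments with different matrices can be illustrated with a minimal sketch. Everything in it is hypothetical: the two toy matrices hold made-up scores (they are not BLOSUM or BATMAS values), the alignment is ungapped, and the per-column TM mask is assumed to be known in advance.

```python
from typing import Dict, Sequence, Tuple

Matrix = Dict[Tuple[str, str], int]


def make_symmetric(scores: Matrix) -> Matrix:
    """Return a copy of the matrix that also contains the mirrored pairs."""
    full = dict(scores)
    for (a, b), value in scores.items():
        full[(b, a)] = value
    return full


# Hypothetical toy scores for three residues; NOT real BLOSUM/BATMAS values.
TOY_LOOP_MATRIX = make_symmetric({
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5,
    ("L", "I"): 2, ("L", "K"): -2, ("I", "K"): -3,
})
TOY_TM_MATRIX = make_symmetric({
    ("L", "L"): 2, ("I", "I"): 2, ("K", "K"): 9,
    ("L", "I"): 1, ("L", "K"): -5, ("I", "K"): -5,
})


def score_alignment(seq_a: str, seq_b: str, tm_mask: Sequence[bool],
                    loop_matrix: Matrix, tm_matrix: Matrix) -> int:
    """Sum column scores, picking the matrix by the column's TM/loop label."""
    if not (len(seq_a) == len(seq_b) == len(tm_mask)):
        raise ValueError("sequences and mask must have the same length")
    total = 0
    for a, b, in_tm in zip(seq_a, seq_b, tm_mask):
        matrix = tm_matrix if in_tm else loop_matrix
        total += matrix[(a, b)]
    return total


# Columns 2 and 3 are labeled as transmembrane, columns 1 and 4 as loop.
mixed_score = score_alignment("KLIL", "KIIK", [False, True, True, False],
                              TOY_LOOP_MATRIX, TOY_TM_MATRIX)
print(mixed_score)
```

The same pair of sequences receives a different total when every column is scored with a single matrix, which is the practical point of keeping region-specific matrices.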


Journal of Bioinformatics and Computational Biology | 2006

Recognition of transmembrane segments in proteins: review and consistency-based benchmarking of internet servers

Nataliya S. Sadovskaya; Roman A. Sutormin; Mikhail S. Gelfand

Membrane proteins perform a number of crucial functions as transporters, receptors, and components of enzyme complexes. Identification of membrane proteins and prediction of their topology is thus an important part of genome annotation. We present here an overview of transmembrane segments in protein sequences, summarize data from large-scale genome studies, and report results of benchmarking of several popular internet servers.


Nucleic Acids Research | 2012

CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation

Anna A. Nikulova; Alexander V. Favorov; Roman A. Sutormin; Vsevolod J. Makeev; Andrey A. Mironov

Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
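The abstract takes positional weight matrices as its input. As background only, here is a minimal sketch of how a single log-odds matrix can be built from site counts and used to scan a sequence. This is not the CORECLUST algorithm, which additionally models the arrangement of sites; the toy counts, the uniform background, the pseudocount, and the threshold are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def log_odds_pwm(counts: List[Dict[str, int]], pseudocount: float = 1.0,
                 background: float = 0.25) -> List[Dict[str, float]]:
    """Convert per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.get(base, 0) for base in BASES) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence: str, pwm: List[Dict[str, float]],
         threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at or above the threshold.

    The sequence is assumed to contain only the letters A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical counts for a 3-bp site that is always "TGA" in 8 aligned examples.
toy_counts = [{"T": 8}, {"G": 8}, {"A": 8}]
toy_pwm = log_odds_pwm(toy_counts)
hit_positions = [start for start, _ in scan("ACTGATTGA", toy_pwm, threshold=4.0)]
print(hit_positions)
```

A method that looks for a regulatory grammar would go on to compare the spacing and order of such hits across orthologous or co-regulated regions; that step is outside this sketch.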


Genome Biology and Evolution | 2014

Weak negative and positive selection and the drift load at splice sites.

Stepan V. Denisov; Georgii A. Bazykin; Roman A. Sutormin; Alexander V. Favorov; Andrey A. Mironov; Mikhail S. Gelfand; Alexey S. Kondrashov

Splice sites (SSs) are short sequences that are crucial for proper mRNA splicing in eukaryotic cells, and therefore can be expected to be shaped by strong selection. Nevertheless, in mammals and in other intron-rich organisms, many of the SSs often involve nonconsensus (Nc), rather than consensus (Cn), nucleotides, and beyond the two critical nucleotides, the SSs are not perfectly conserved between species. Here, we compare the SS sequences between primates, and between Drosophila fruit flies, to reveal the pattern of selection acting at SSs. Cn-to-Nc substitutions are less frequent, and Nc-to-Cn substitutions are more frequent, than neutrally expected, indicating, respectively, negative and positive selection. This selection is relatively weak (1 < |4Nes| < 4), and has a similar efficiency in primates and in Drosophila. Within some nucleotide positions, the positive selection in favor of Nc-to-Cn substitutions is weaker than the negative selection maintaining already established Cn nucleotides; this difference is due to site-specific negative selection favoring current Nc nucleotides. In general, however, the strength of negative selection protecting the Cn alleles is similar in magnitude to the strength of positive selection favoring replacement of Nc alleles, as expected under the simple nearly neutral turnover. In summary, although a fraction of the Nc nucleotides within SSs is maintained by selection, the abundance of deleterious nucleotides in this class suggests a substantial genome-wide drift load.
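To give a feel for what the range 1 < |4Nes| < 4 means, the sketch below uses the standard diffusion-approximation result for the substitution rate of a selected allele relative to a neutral one, S / (1 - exp(-S)) with S = 4Nes. Applying this textbook formula here is an assumption made for illustration; the abstract does not say how the authors estimated selection strength.

```python
import math


def relative_substitution_rate(scaled_s: float) -> float:
    """Substitution rate relative to neutrality for scaled selection S = 4*Ne*s.

    Uses S / (1 - exp(-S)); the limit at S = 0 is 1 (neutral).
    """
    if scaled_s == 0.0:
        return 1.0
    # -expm1(-S) equals 1 - exp(-S) and is accurate for small |S|.
    return scaled_s / -math.expm1(-scaled_s)


for s in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"4Nes = {s:+.0f}: relative rate = {relative_substitution_rate(s):.3f}")
```

Under this formula, negative S (deleterious changes) gives rates below the neutral expectation and positive S (advantageous changes) gives rates above it, which matches the qualitative pattern described for Cn-to-Nc and Nc-to-Cn substitutions.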


Proceedings of the National Academy of Sciences of the United States of America | 2018

Cooption of heat shock regulatory system for anhydrobiosis in the sleeping chironomid Polypedilum vanderplanki

Pavel V. Mazin; Elena I. Shagimardanova; Olga Kozlova; Alexander Cherkasov; Roman A. Sutormin; Vita V. Stepanova; Alexey Stupnikov; Maria D. Logacheva; Aleksey A. Penin; Yoichiro Sogame; Richard Cornette; Shoko Tokumoto; Yugo Miyata; Takahiro Kikawada; Mikhail S. Gelfand; Oleg Gusev

Significance: Anhydrobiosis is an ametabolic state found in several organisms that can survive extreme desiccation. It is of practical interest because its application to other systems might allow room temperature preservation of cells, tissues, or organs in the dry state. The insect Polypedilum vanderplanki is the most complex animal that can enter anhydrobiosis. Proteins responsible for desiccation tolerance in P. vanderplanki are relatively well studied, but little is known about mechanisms underlying their induction during desiccation. Here, we show that the heat shock transcription factor regulatory network was coopted during the evolution of P. vanderplanki to activate many known desiccation-protective genes, including genes encoding late embryogenesis abundant (LEA) proteins.

Polypedilum vanderplanki is a striking and unique example of an insect that can survive almost complete desiccation. Its genome and a set of dehydration–rehydration transcriptomes, together with the genome of Polypedilum nubifer (a congeneric desiccation-sensitive midge), were recently released. Here, using published and newly generated datasets reflecting detailed transcriptome changes during anhydrobiosis, as well as a developmental series, we show that the TCTAGAA DNA motif, which closely resembles the binding motif of the Drosophila melanogaster heat shock transcription activator (Hsf), is significantly enriched in the promoter regions of desiccation-induced genes in P. vanderplanki, such as genes encoding late embryogenesis abundant (LEA) proteins, thioredoxins, or trehalose metabolism-related genes, but not in P. nubifer. Unlike P. nubifer, P. vanderplanki has double TCTAGAA sites upstream of the Hsf gene itself, which is probably responsible for the stronger activation of Hsf in P. vanderplanki during desiccation compared with P. nubifer. To confirm the role of Hsf in desiccation-induced gene activation, we used the Pv11 cell line, derived from P. vanderplanki embryo. After preincubation with trehalose, Pv11 cells can enter anhydrobiosis and survive desiccation. We showed that Hsf knockdown suppresses trehalose-induced activation of multiple predicted Hsf targets (including P. vanderplanki-specific LEA protein genes) and reduces the desiccation survival rate of Pv11 cells fivefold. Thus, cooption of the heat shock regulatory system has been an important evolutionary mechanism for adaptation to desiccation in P. vanderplanki.
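The kind of motif-enrichment question raised for TCTAGAA can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the promoter sequences are invented toy strings, both strands are searched, and a one-sided hypergeometric test is used as a stand-in because the abstract does not name the statistical test.

```python
from math import comb
from typing import List, Tuple

MOTIF = "TCTAGAA"  # motif named in the abstract
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_motif(promoter: str, motif: str = MOTIF) -> bool:
    """True if the motif occurs on either strand of the promoter."""
    promoter = promoter.upper()
    return motif in promoter or reverse_complement(motif) in promoter


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n items are drawn from N, of which K carry the motif."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def enrichment(induced: List[str], other: List[str]) -> Tuple[int, int, float]:
    """Return (motif hits among induced, hits overall, one-sided p-value)."""
    k = sum(has_motif(p) for p in induced)
    K = k + sum(has_motif(p) for p in other)
    p_value = hypergeom_upper_tail(k, len(induced), K, len(induced) + len(other))
    return k, K, p_value


# Invented toy promoters; the second one carries the motif on the reverse strand.
toy_induced = ["GGTCTAGAACC", "AATTCTAGATT", "ACGTACGTACG"]
toy_other = ["ACGTACGTACG", "GGGGCCCCAAA", "TTTTTTTTTTT", "CATCATCATCA"]
hits_induced, hits_total, p_value = enrichment(toy_induced, toy_other)
print(hits_induced, hits_total, round(p_value, 4))
```

With real data the promoter window, the background gene set, and any correction for sequence composition would all need to be chosen deliberately; none of those choices are specified in the abstract.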


Nature Biotechnology | 2018

KBase: The United States Department of Energy Systems Biology Knowledgebase

Adam P. Arkin; Robert W. Cottingham; Christopher S. Henry; Nomi L. Harris; Rick Stevens; Sergei Maslov; Paramvir Dehal; Doreen Ware; Fernando Perez; Shane Canon; Michael W Sneddon; Matthew L Henderson; William J Riehl; Dan Murphy-Olson; Stephen Chan; Roy T Kamimura; Sunita Kumari; Meghan M Drake; Thomas Brettin; Elizabeth M. Glass; Dylan Chivian; Dan Gunter; David J. Weston; Benjamin H Allen; Jason K. Baumohl; Aaron A. Best; Ben Bowen; Steven E. Brenner; Christopher C Bun; John-Marc Chandonia

Author(s): Arkin, Adam P; Cottingham, Robert W; Henry, Christopher S; Harris, Nomi L; Stevens, Rick L; Maslov, Sergei; Dehal, Paramvir; Ware, Doreen; Perez, Fernando; Canon, Shane; Sneddon, Michael W; Henderson, Matthew L; Riehl, William J; Murphy-Olson, Dan; Chan, Stephen Y; Kamimura, Roy T; Kumari, Sunita; Drake, Meghan M; Brettin, Thomas S; Glass, Elizabeth M; Chivian, Dylan; Gunter, Dan; Weston, David J; Allen, Benjamin H; Baumohl, Jason; Best, Aaron A; Bowen, Ben; Brenner, Steven E; Bun, Christopher C; Chandonia, John-Marc; Chia, Jer-Ming; Colasanti, Ric; Conrad, Neal; Davis, James J; Davison, Brian H; DeJongh, Matthew; Devoid, Scott; Dietrich, Emily; Dubchak, Inna; Edirisinghe, Janaka N; Fang, Gang; Faria, Jose P; Frybarger, Paul M; Gerlach, Wolfgang; Gerstein, Mark; Greiner, Annette; Gurtowski, James; Haun, Holly L; He, Fei; Jain, Rashmi; Joachimiak, Marcin P; Keegan, Kevin P; Kondo, Shinnosuke; Kumar, Vivek; Land, Miriam L; Meyer, Folker; Mills, Marissa; Novichkov, Pavel S; Oh, Taeyun; Olsen, Gary J; Olson, Robert; Parrello, Bruce; Pasternak, Shiran; Pearson, Erik; Poon, Sarah S; Price, Gavin A; Ramakrishnan, Srividya; Ranjan, Priya; Ronald, Pamela C; Schatz, Michael C; Seaver, Samuel MD; Shukla, Maulik; Sutormin, Roman A; Syed, Mustafa H; Thomason, James; Tintle, Nathan L; Wang, Daifeng; Xia, Fangfang; Yoo, Hyunseung; Yoo, Shinjae; Yu, Dantong


RNA Biology | 2016

Probing-directed identification of novel structured RNAs

Svetlana Vinogradova; Roman A. Sutormin; Andrey A. Mironov; Ruslan A. Soldatov

Transcripts often harbor RNA elements, which regulate cell processes co- or post-transcriptionally. The functions of many regulatory RNA elements depend on their structure, thus it is important to determine the structure as well as to scan genomes for structured elements. State-of-the-art ab initio approaches to predict structured RNAs rely on DNA sequence analysis. They use two major types of information inferred from a sequence: thermodynamic stability of an RNA structure and evolutionary footprints of base-pair interactions. In recent years, chemical probing of RNA has arisen as an alternative source of structural information. RNA probing experiments detect positions accessible to specific types of chemicals or enzymes, indicating their propensity to be in a paired or unpaired state. There exist several strategies to integrate probing data into RNA secondary structure prediction algorithms that substantially improve the prediction quality. However, whether and how probing data could contribute to detection of structured RNAs remains an open question. We previously developed the energy-based approach RNASurface to detect locally optimal structured RNA elements. Here, we integrate probing data into the RNASurface energy model using a general framework. We show that the use of experimental data allows for better discrimination of ncRNAs from other transcripts. Application of RNASurface to genome-wide analysis of the human transcriptome with PARS data identifies previously undetectable segments, with evidence of functionality for some of them.


BMC Genomics | 2015

Comparative genome analysis of Pseudogymnoascus spp. reveals primarily clonal evolution with small genome fragments exchanged between lineages

Evgeny V. Leushkin; Maria D. Logacheva; Aleksey A. Penin; Roman A. Sutormin; Evgeny S. Gerasimov; G. A. Kochkina; Natalia Ivanushkina; Oleg Vasilenko; Alexey S. Kondrashov; S. M. Ozerskaya

Background: Pseudogymnoascus spp. is a wide group of fungal lineages in the family Pseudorotiaceae, including P. destructans, an aggressive pathogen of bats. Although several lineages of P. spp. were shown to produce ascospores in culture, the vast majority of P. spp. show no evidence of sexual reproduction. P. spp. can tolerate a wide range of temperatures and salinities and can survive even in the permafrost layer. The adaptability of P. spp. to different environments is accompanied by extremely variable morphology and physiology.

Results: We sequenced genotypes of 14 strains of P. spp., 5 of which were extracted from permafrost, 1 from a cryopeg, a layer of unfrozen ground in permafrost, and 8 from temperate surface environments. All sequenced genotypes are haploid. Nucleotide diversity among these genomes is very high, with a typical evolutionary distance at synonymous sites dS ≈ 0.5, suggesting that the last common ancestor of these strains lived >50 Mya. The strains extracted from permafrost do not form a separate clade. Instead, each permafrost strain has close relatives from temperate environments. We observed a strictly clonal population structure with no conflicting topologies for ~99% of genome sequences. However, there are a number of short (~100–10,000 nt) genomic segments, with a total length of 67.6 kb, which possess phylogenetic patterns strikingly different from the rest of the genome. The most remarkable case is the MAT-locus, which has 2 distinct alleles interspersed along the whole-genome phylogenetic tree.

Conclusions: The predominantly clonal structure of genome sequences is consistent with the observations that sexual reproduction is rare in P. spp. A small number of regions with noncanonical phylogenies seem to arise due to some recombination events between derived lineages of P. spp., with the MAT-locus being transferred on multiple occasions. All sequenced strains have a heterothallic configuration of the MAT-locus.

Collaboration


Dive into Roman A. Sutormin's collaborations.

Top Co-Authors

Adam P. Arkin

Lawrence Berkeley National Laboratory

William J Riehl

Lawrence Berkeley National Laboratory

Evgeny V. Leushkin

Russian Academy of Sciences

Georgii A. Bazykin

Russian Academy of Sciences
