Anton I. Petrov
Bowling Green State University
Publications
Featured research published by Anton I. Petrov.
Nucleic Acids Research | 2014
Buvaneswari Coimbatore Narayanan; John D. Westbrook; Saheli Ghosh; Anton I. Petrov; Blake A. Sweeney; Craig L. Zirbel; Neocles B. Leontis; Helen M. Berman
The Nucleic Acid Database (NDB) (http://ndbserver.rutgers.edu) is a web portal providing access to information about 3D nucleic acid structures and their complexes. In addition to primary data, the NDB contains derived geometric data, classifications of structures and motifs, standards for describing nucleic acid features, as well as tools and software for the analysis of nucleic acids. A variety of search capabilities are available, as are many different types of reports. This article describes the recent redesign of the NDB Web site with special emphasis on new RNA-derived data and annotations and their implementation and integration into the search capabilities.
RNA | 2013
Anton I. Petrov; Craig L. Zirbel; Neocles B. Leontis
The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.
Nucleic Acids Research | 2014
Anton I. Petrov; Simon Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A. Bruford; Mathew W. Wright; Sarah W. Burge; Robert D. Finn; Paul J. Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P. Chan; Todd M. Lowe; Christian Zwieb; Jacek Wower; Kelly P. Williams; Corey M. Hudson; Robin R. Gutell; Michael B. Clark; Marcel E. Dinger; Xiu Cheng Quek; Janusz M. Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao
The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.
Nucleic Acids Research | 2012
Amal S. Abu Almakarem; Anton I. Petrov; Jesse Stombaugh; Craig L. Zirbel; Neocles B. Leontis
Base triples are recurrent clusters of three RNA nucleobases interacting edge-to-edge by hydrogen bonding. We find that the central base in almost all triples forms base pairs with the other two bases of the triple, providing a natural way to geometrically classify base triples. Given 12 geometric base pair families defined by the Leontis–Westhof nomenclature, combinatoric enumeration predicts 108 potential geometric base triple families. We searched representative atomic-resolution RNA 3D structures and found instances of 68 of the 108 predicted base triple families. Model building suggests that some of the remaining 40 families may be unlikely to form for steric reasons. We developed an on-line resource that provides exemplars of all base triples observed in the structure database and models for unobserved, predicted triples, grouped by triple family, as well as by three-base combination (http://rna.bgsu.edu/Triples). The classification helps to identify recurrent triple motifs that can substitute for each other while conserving RNA 3D structure, with applications in RNA 3D structure prediction and analysis of RNA sequence evolution.
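The two counts quoted in the abstract above (12 base pair families, 108 predicted base triple families) can be reproduced with a short enumeration. The sketch below is an illustrative reconstruction, not code from the paper. It assumes that each base has three interacting edges (Watson-Crick, Hoogsteen, Sugar), that each pair is either cis or trans, and that the central base of a triple uses two different edges, one for each partner. The function names and the tuple encoding are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

EDGES = ("W", "H", "S")      # Watson-Crick, Hoogsteen, Sugar edges
ORIENTATIONS = ("c", "t")    # cis, trans


def base_pair_families():
    """Enumerate pair families: orientation x unordered pair of edges."""
    return [
        orientation + edge_a + edge_b
        for orientation in ORIENTATIONS
        for edge_a, edge_b in combinations_with_replacement(EDGES, 2)
    ]


def base_triple_families():
    """Enumerate triple families under the stated assumption.

    The central base pairs with each partner through a different edge.
    Each partner contributes any of its three edges, in cis or trans.
    """
    families = []
    # Unordered choice of the two distinct edges used by the central base.
    for central_1, central_2 in combinations(EDGES, 2):
        partner_options = list(product(ORIENTATIONS, EDGES))
        for (o1, e1), (o2, e2) in product(partner_options, repeat=2):
            families.append(((o1, central_1, e1), (o2, central_2, e2)))
    return families


if __name__ == "__main__":
    print(len(base_pair_families()))
    print(len(base_triple_families()))
```

Under these assumptions the count factors as 3 choices of central-edge pair times 6 options for each of the two partners, which agrees with the 108 families stated in the abstract.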
Journal of Physical Chemistry B | 2010
Jiří Šponer; Judit E. Šponer; Anton I. Petrov; Neocles B. Leontis
In this feature article, we provide a side-by-side introduction for two research fields: quantum chemical calculations of molecular interaction in nucleic acids and RNA structural bioinformatics. Our main aim is to demonstrate that these research areas, while largely separated in contemporary literature, have substantial potential to complement each other in ways that could significantly contribute to our understanding of the exciting world of nucleic acids. We identify research questions amenable to the combined application of modern ab initio methods and bioinformatics analysis of experimental structures while also assessing the limitations of these approaches. The ultimate aim is to attain valuable physicochemical insights regarding the nature of the fundamental molecular interactions and how they shape RNA structures, dynamics, function, and evolution.
The Plant Cell | 2011
Ryuta Takeda; Anton I. Petrov; Neocles B. Leontis; Biao Ding
Cell-to-cell trafficking of RNAs plays an important role in coordinating gene expression at the whole-plant level as well as in virus/viroid infection and host defense response. This work identifies a three-dimensional RNA structure motif in a viroid that mediates trafficking between the leaf mesophyll tissues, providing mechanistic insights into trafficking regulation. Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5′-CGA-3′...5′-GAC-3′ flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes.
Nucleic Acids Research | 2011
Anton I. Petrov; Craig L. Zirbel; Neocles B. Leontis
WebFR3D is the on-line version of ‘Find RNA 3D’ (FR3D), a program for annotating atomic-resolution RNA 3D structure files and searching them efficiently to locate and compare RNA 3D structural motifs. WebFR3D provides on-line access to the central features of FR3D, including geometric and symbolic search modes, without need for installing programs or downloading and maintaining 3D structure data locally. In geometric search mode, WebFR3D finds all motifs similar to a user-specified query structure. In symbolic search mode, WebFR3D finds all sets of nucleotides making user-specified interactions. In both modes, users can specify sequence, sequence–continuity, base pairing, base-stacking and other constraints on nucleotides and their interactions. WebFR3D can be used to locate hairpin, internal or junction loops, list all base pairs or other interactions, or find instances of recurrent RNA 3D motifs (such as sarcin–ricin and kink-turn internal loops or T- and GNRA hairpin loops) in any PDB file or across a whole set of 3D structure files. The output page provides facilities for comparing the instances returned by the search by superposition of the 3D structures and the alignment of their sequences annotated with pairwise interactions. WebFR3D is available at http://rna.bgsu.edu/webfr3d.
Nucleic Acids Research | 2018
Ioanna Kalvari; Joanna Argasinska; Natalia Quinones‐Olvera; Eric P. Nawrocki; Elena Rivas; Sean R. Eddy; Alex Bateman; Robert D. Finn; Anton I. Petrov
The Rfam database is a collection of RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. In this paper we introduce Rfam release 13.0, which switches to a new genome-centric approach that annotates a non-redundant set of reference genomes with RNA families. We describe new web interface features including faceted text search and R-scape secondary structure visualizations. We discuss a new literature curation workflow and a pipeline for building families based on RNAcentral. There are 236 new families in release 13.0, bringing the total number of families to 2687. The Rfam website is http://rfam.org.
Nucleic Acids Research | 2017
Anton I. Petrov; Simon Kay; Ioanna Kalvari; Kevin L. Howe; Kristian A. Gray; Elspeth A. Bruford; Paul J. Kersey; Guy Cochrane; Robert D. Finn; Alex Bateman; Ana Kozomara; Sam Griffiths-Jones; Adam Frankish; Christian Zwieb; Britney Y. Lau; Kelly P. Williams; Patricia P. Chan; Todd M. Lowe; Jamie J. Cannone; Robin R. Gutell; Magdalena A. Machnicka; Janusz M. Bujnicki; Maki Yoshihama; Naoya Kenmochi; Benli Chai; James R. Cole; Maciej Szymanski; Wojciech M. Karlowski; Valerie Wood; Eva Huala
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.
Nucleic Acids Research | 2015
Craig L. Zirbel; James Roll; Blake A. Sweeney; Anton I. Petrov; Meg Pirrung; Neocles B. Leontis
Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download.
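The idea of using randomly generated sequences to set an acceptance threshold, as mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the JAR3D implementation: `toy_score` is a hypothetical stand-in for a motif model's score, and the quantile rule shown here is only one simple way to bound the false positive rate on a random test set.

```python
import random


def random_sequences(count, length, rng):
    """Generate `count` uniform random RNA sequences of the given length."""
    return ["".join(rng.choice("ACGU") for _ in range(length)) for _ in range(count)]


def toy_score(sequence):
    """Hypothetical score: fraction of purines (A or G) in the sequence.

    A real motif model would return a probabilistic score instead.
    """
    return sum(base in "AG" for base in sequence) / len(sequence)


def acceptance_threshold(random_scores, alpha):
    """Return a threshold t such that at most a fraction `alpha`
    of the random scores are strictly greater than t."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(random_scores, reverse=True)
    k = int(alpha * len(ordered))  # number of random sequences allowed above t
    return ordered[k]


def accepts(sequence, threshold):
    """Accept a sequence only if it scores strictly above the threshold."""
    return toy_score(sequence) > threshold


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    scores = [toy_score(s) for s in random_sequences(1000, 8, rng)]
    threshold = acceptance_threshold(scores, alpha=0.05)
    print(threshold, accepts("GAGAGAGA", threshold))
```

Because acceptance requires a score strictly above the threshold, ties at the threshold are rejected, which keeps the empirical false positive rate on the random set at or below `alpha`.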