Olga Zagnitko
Argonne National Laboratory
Publication
Featured research published by Olga Zagnitko.
BMC Genomics | 2008
Ramy K. Aziz; Daniela Bartels; Aaron A. Best; Matthew DeJongh; Terrence Disz; Robert Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M. Glass; Michael Kubal; Folker Meyer; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Ross Overbeek; Leslie K. McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D. Pusch; Claudia I. Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko
Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.
Nucleic Acids Research | 2005
Ross Overbeek; Tadhg P. Begley; Ralph Butler; Jomuna V. Choudhuri; Han-Yu Chuang; Matthew Cohoon; Valérie de Crécy-Lagard; Naryttza N. Diaz; Terry Disz; Robert D. Edwards; Michael Fonstein; Ed D. Frank; Svetlana Gerdes; Elizabeth M. Glass; Alexander Goesmann; Andrew C. Hanson; Dirk Iwata-Reuyl; Roy A. Jensen; Neema Jamshidi; Lutz Krause; Michael Kubal; Niels Bent Larsen; Burkhard Linke; Alice C. McHardy; Folker Meyer; Heiko Neuweger; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Vasiliy A. Portnoy
The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.
Journal of Biological Chemistry | 2006
Chen Yang; Dmitry A. Rodionov; Xiaoqing Li; Olga N. Laikova; Mikhail S. Gelfand; Olga Zagnitko; Margaret F. Romine; Anna Obraztsova; Kenneth H. Nealson; Andrei L. Osterman
We used a comparative genomics approach implemented in the SEED annotation environment to reconstruct the chitin and GlcNAc utilization subsystem and regulatory network in most proteobacteria, including 11 species of Shewanella with completely sequenced genomes. Comparative analysis of candidate regulatory sites allowed us to characterize three different GlcNAc-specific regulons, NagC, NagR, and NagQ, in various proteobacteria and to tentatively assign a number of novel genes with specific functional roles, in particular new GlcNAc-related transport systems, to this subsystem. Genes SO3506 and SO3507, originally annotated as hypothetical in Shewanella oneidensis MR-1, were suggested to encode novel variants of GlcN-6-P deaminase and GlcNAc kinase, respectively. Reconstitution of the GlcNAc catabolic pathway in vitro using these purified recombinant proteins and GlcNAc-6-P deacetylase (SO3505) validated the entire pathway. Kinetic characterization of GlcN-6-P deaminase demonstrated that it is the subject of allosteric activation by GlcNAc-6-P. Consistent with genomic data, all tested Shewanella strains except S. frigidimarina, which lacked representative genes for the GlcNAc metabolism, were capable of utilizing GlcNAc as the sole source of carbon and energy. This study expands the range of carbon substrates utilized by Shewanella spp., unambiguously identifies several genes involved in chitin metabolism, and describes a novel variant of the classical three-step biochemical conversion of GlcNAc to fructose 6-phosphate first described in Escherichia coli.
Nucleic Acids Research | 2007
Leslie K. McNeil; Claudia I. Reich; Ramy K. Aziz; Daniela Bartels; Matthew Cohoon; Terry Disz; Robert Edwards; Svetlana Gerdes; Kaitlyn Hwang; Michael Kubal; Gohar Rem Margaryan; Folker Meyer; William Mihalo; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D. Pusch; Dmitry A. Rodionov; Xinghua Shi; Olga Vassieva; Veronika Vonstein; Olga Zagnitko; Fangfang Xia; Jenifer Zinner; Ross Overbeek; Rick Stevens
The National Microbial Pathogen Data Resource (NMPDR) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of ∼50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development.
BMC Genomics | 2010
Dmitry A. Rodionov; Chen-Chen Yang; Xiaoqing Li; Irina A. Rodionova; Yanbing Wang; Anna Obraztsova; Olga Zagnitko; Ross Overbeek; Margaret F. Romine; Samantha B. Reed; James K. Fredrickson; Kenneth H. Nealson; Andrei L. Osterman
Background: Carbohydrates are a primary source of carbon and energy for many bacteria. Accurate projection of known carbohydrate catabolic pathways across diverse bacteria with complete genomes constitutes a substantial challenge due to frequent variations in components of these pathways. To address a practically and fundamentally important challenge of reconstruction of carbohydrate utilization machinery in any microorganism directly from its genomic sequence, we combined a subsystems-based comparative genomic approach with experimental validation of selected bioinformatic predictions by a combination of biochemical, genetic and physiological experiments. Results: We applied this integrated approach to systematically map carbohydrate utilization pathways in 19 genomes from the Shewanella genus. The obtained genomic encyclopedia of sugar utilization includes ~170 protein families (mostly metabolic enzymes, transporters and transcriptional regulators) spanning 17 distinct pathways with a mosaic distribution across Shewanella species providing insights into their ecophysiology and adaptive evolution. Phenotypic assays revealed a remarkable consistency between predicted and observed phenotype, an ability to utilize an individual sugar as a sole source of carbon and energy, over the entire matrix of tested strains and sugars. Comparison of the reconstructed catabolic pathways with E. coli identified multiple differences that are manifested at various levels, from the presence or absence of certain sugar catabolic pathways, nonorthologous gene replacements and alternative biochemical routes to a different organization of transcription regulatory networks. Conclusions: The reconstructed sugar catabolome in Shewanella spp. includes 62 novel isofunctional families of enzymes, transporters, and regulators. In addition to improving our knowledge of genomics and functional organization of carbohydrate utilization in Shewanella, this study led to a substantial expansion of our current version of the Genomic Encyclopedia of Carbohydrate Utilization. A systematic and iterative application of this approach to multiple taxonomic groups of bacteria will further enhance it, creating a knowledge base adequate for the efficient analysis of any newly sequenced genome as well as of the emerging metagenomic data.
PLOS Computational Biology | 2011
Ying Zhang; Olga Zagnitko; Irina A. Rodionova; Andrei L. Osterman; Adam Godzik
Function diversification in large protein families is a major mechanism driving expansion of cellular networks, providing organisms with new metabolic capabilities and thus adding to their evolutionary success. However, our understanding of the evolutionary mechanisms of functional diversity in such families is very limited, which, among many other reasons, is due to the lack of functionally well-characterized sets of proteins. Here, using the FGGY carbohydrate kinase family as an example, we built a confidently annotated reference set (CARS) of proteins by propagating experimentally verified functional assignments to a limited number of homologous proteins that are supported by their genomic and functional contexts. Then, we analyzed, on both the phylogenetic and the molecular levels, the evolution of different functional specificities in this family. The results show that the different functions (substrate specificities) encoded by FGGY kinases have emerged only once in the evolutionary history following an apparently simple divergent evolutionary model. At the same time, on the molecular level, one isofunctional group (L-ribulokinase, AraB) evolved at least two independent solutions that employed distinct specificity-determining residues for the recognition of the same substrate (L-ribulose). Our analysis provides a detailed model of the evolution of the FGGY kinase family. It also shows that only combined molecular and phylogenetic approaches can help reconstruct a full picture of functional diversifications in such diverse families.
Journal of Protein Chemistry | 1992
Andrei L. Osterman; Nick V. Grishin; S. V. Smulevitch; Mikhail V. Matz; Olga Zagnitko; L. P. Revina; Valentin M. Stepanov
The primary structure of carboxypeptidase T, a Zn-dependent extracellular enzyme of Thermoactinomyces vulgaris, was determined from the cloned cpT gene nucleotide sequence and compared to Zn-carboxypeptidases from various organisms. The compilation and analysis of multiple alignment accompanied by consideration of available tertiary structure data have shown that in the overall spatial structure and active site arrangement CpT is similar to other enzymes constituting the Zn-carboxypeptidase family. Nine of 16 amino acid residues found to be strictly invariant are presumably located close to the active site. The preservation of His69, Glu72, Asn144, Arg145, His196, Tyr248, and Glu270 identified previously as essential catalytic site participants implicates basically the same catalytic mechanism in the Zn-carboxypeptidase family. It is proposed that Pro205 and Asp256 should play an important role in proper S1′-pocket spatial arrangement. The comparative analysis of amino acid variations in S1′-pocket enabled us to reveal structural determinants of the Zn-carboxypeptidase primary specificity. The relatively reduced size of the pocket and negative charge of Asp253 are supposed to contribute correspondingly to A- and B-type substrate preferences of carboxypeptidase T endowed with dual primary specificity.
Archive | 2005
Ross Overbeek; Veronika Fonstein; Andrei L. Osterman; Svetlana Gerdes; Olga Vassieva; Olga Zagnitko; Dmitry A. Rodionov
The team of the Fellowship for Interpretation of Genomes (FIG), under the leadership of Ross Overbeek, began working on this Project in November 2003. During the previous year, the Project was performed at Integrated Genomics Inc. A transition from the industrial environment to the public domain prompted us to adjust some aspects of the Project. Notwithstanding the challenges, we believe that these adjustments had a strong positive impact on our deliverables. Most importantly, the work of the research team led by R. Overbeek resulted in the deployment of a new open source genomic platform, the SEED (Specific Aim 1). This platform provided a foundation for the development of CyanoSEED, a specialized portal to comparative analysis and metabolic reconstruction of all available cyanobacterial genomes (Specific Aim 3). The SEED represents a new generation of software for genome analysis. Briefly, it is a portable and extendable system, containing one of the largest and continually growing collections of complete and partial genomes. The complete system with annotations and tools is freely available via browsing or via installation on a user's Mac or Linux computer. One of the important unique features of the SEED is the support of metabolic reconstruction and comparative genome analysis via encoding and projection of functional subsystems. During the project period, the FIG research team has validated the new software by developing a significant number of core subsystems, covering many aspects of central metabolism (Specific Aim 2), as well as metabolic areas specific for cyanobacteria and other photoautotrophic organisms (Specific Aim 3). In addition to providing a proof of technology and a starting point for further community-based efforts, these subsystems represent a valuable asset. An extensive coverage of central metabolism provides the bulk of information required for metabolic modeling in Synechocystis sp. PCC 6803.
Detailed analysis of several subsystems covering energy, carbon, and redox metabolism in Synechocystis sp. PCC 6803 and other cyanobacteria has been performed (Specific Aim 4). The main objectives for this year (adjusted to reflect the new public-domain setting of the Project research team) were: Aim 1. To develop, test, and deploy a new open source system, the SEED, for integrating community-based annotation and comparative analysis of all publicly available microbial genomes. Develop a comprehensive genomic database by integrating within SEED all publicly available complete and nearly complete genome sequences with special emphasis on genomes of cyanobacteria, phototrophic eukaryotes, and anoxygenic phototrophic bacteria, which are invaluable for comparative genomic studies of energy and carbon metabolism in Synechocystis sp. PCC 6803. Aim 2. To develop the SEED's biological content in the form of a collection of encoded Subsystems largely covering the conserved cellular machinery in prokaryotes (and central metabolic machinery in eukaryotes). Aim 3. To develop, utilizing core SEED technology, the CyanoSEED, a specialized web portal for community-based annotation and comparative analysis of all publicly available cyanobacterial genomes. Encode the set of additional subsystems representing key metabolic transformations in cyanobacteria and other photoautotrophs. We envisioned this resource as complementary to other public access databases for comparative genomic analysis currently available to the cyanobacterial research community. Aim 4. Perform in-depth analysis of several subsystems covering energy, carbon, and redox metabolism in Synechocystis sp. PCC 6803 and all other cyanobacteria with available genome sequences. Reveal inconsistencies and gaps in the current knowledge of these subsystems. Use functional and genome context analysis tools in CyanoSEED to predict, whenever possible, candidate genes for inferred functional roles.
Disseminate these conjectures and predictions freely by publishing them on CyanoSEED (http://cyanoseed.thefig.info/) and the Subsystems Forum (http://brucella.uchicago.edu/SubsystemForum/) in order to facilitate experimental analysis by our collaborator on this Project and by other experimentalists working in various fields of cyanobacterial physiology and biotechnology.
FEBS Journal | 1992
Alexei Teplyakov; Kostya Polyakov; Galya Obmolova; Boris V. Strokopytov; I. P. Kuranova; Andrei L. Osterman; Nikolai V. Grishin; Sergei Smulevitch; Olga Zagnitko; Olga V. Galperina; Michael Matz; Valentin M. Stepanov
Archive | 2006
Chen Yang; Dmitry A. Rodionov; Xiaoqing Li; Olga N. Laikova; Mikhail S. Gelfand; Olga Zagnitko; Margaret F. Romine; Anna Obraztsova; Kenneth H. Nealson; Andrei L. Osterman