Gordon D. Pusch | Researchain

Publication

Featured researches published by Gordon D. Pusch.

BMC Genomics | 2008

The RAST Server: Rapid Annotations using Subsystems Technology

Ramy K. Aziz; Daniela Bartels; Aaron A. Best; Matthew DeJongh; Terrence Disz; Robert Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M. Glass; Michael Kubal; Folker Meyer; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Ross Overbeek; Leslie K. McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D. Pusch; Claudia I. Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko

Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

Nucleic Acids Research | 2014

The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Ross Overbeek; Robert Olson; Gordon D. Pusch; Gary J. Olsen; James J. Davis; Terry Disz; Robert Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R. Wattam; Fangfang Xia; Rick Stevens

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

Nucleic Acids Research | 2005

The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes

Ross Overbeek; Tadhg P. Begley; Ralph Butler; Jomuna V. Choudhuri; Han-Yu Chuang; Matthew Cohoon; Valérie de Crécy-Lagard; Naryttza N. Diaz; Terry Disz; Robert D. Edwards; Michael Fonstein; Ed D. Frank; Svetlana Gerdes; Elizabeth M. Glass; Alexander Goesmann; Andrew C. Hanson; Dirk Iwata-Reuyl; Roy A. Jensen; Neema Jamshidi; Lutz Krause; Michael Kubal; Niels Bent Larsen; Burkhard Linke; Alice C. McHardy; Folker Meyer; Heiko Neuweger; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Vasiliy A. Portnoy

The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.

Nature | 2003

Genome sequence of Bacillus cereus and comparative analysis with Bacillus anthracis

Natalia Ivanova; Alexei Sorokin; Iain Anderson; Nathalie Galleron; Benjamin Candelon; Vinayak Kapatral; Anamitra Bhattacharyya; Gary Reznik; Natalia Mikhailova; Alla Lapidus; Lien Chu; Michael Mazur; Eugene Goltsman; Niels Bent Larsen; Mark D'Souza; Theresa L. Walunas; Yuri Grechkin; Gordon D. Pusch; Robert Haselkorn; Michael Fonstein; S. Dusko Ehrlich; Ross Overbeek; Nikos C. Kyrpides

Bacillus cereus is an opportunistic pathogen causing food poisoning manifested by diarrhoeal or emetic syndromes. It is closely related to the animal and human pathogen Bacillus anthracis and the insect pathogen Bacillus thuringiensis, the former being used as a biological weapon and the latter as a pesticide. B. anthracis and B. thuringiensis are readily distinguished from B. cereus by the presence of plasmid-borne specific toxins (B. anthracis and B. thuringiensis) and capsule (B. anthracis). But phylogenetic studies based on the analysis of chromosomal genes bring controversial results, and it is unclear whether B. cereus, B. anthracis and B. thuringiensis are varieties of the same species or different species. Here we report the sequencing and analysis of the type strain B. cereus ATCC 14579. The complete genome sequence of B. cereus ATCC 14579 together with the gapped genome of B. anthracis A2012 enables us to perform comparative analysis, and hence to identify the genes that are conserved between B. cereus and B. anthracis, and the genes that are unique for each species. We use the former to clarify the phylogeny of the cereus group, and the latter to determine plasmid-independent species-specific markers.

Nature Biotechnology | 2004

Complete sequence and comparative genome analysis of the dairy bacterium Streptococcus thermophilus

Alexander Bolotin; Benoit Quinquis; Pierre Renault; Alexei Sorokin; S. Dusko Ehrlich; Saulius Kulakauskas; Alla Lapidus; Eugene Goltsman; Michael Mazur; Gordon D. Pusch; Michael Fonstein; Ross Overbeek; Nikos Kyrpides; Bénédicte Purnelle; Deborah Prozzi; Katrina Ngui; David Masuy; Frédéric Hancy; Sophie Burteau; Marc Boutry; Jean Delcour; André Goffeau; Pascal Hols

The lactic acid bacterium Streptococcus thermophilus is widely used for the manufacture of yogurt and cheese. This dairy species of major economic importance is phylogenetically close to pathogenic streptococci, raising the possibility that it has a potential for virulence. Here we report the genome sequences of two yogurt strains of S. thermophilus. We found a striking level of gene decay (10% pseudogenes) in both microorganisms. Many genes involved in carbon utilization are nonfunctional, in line with the paucity of carbon sources in milk. Notably, most streptococcal virulence-related genes that are not involved in basic cellular processes are either inactivated or absent in the dairy streptococcus. Adaptation to the constant milk environment appears to have resulted in the stabilization of the genome structure. We conclude that S. thermophilus has evolved mainly through loss-of-function events that remarkably mirror the environment of the dairy niche resulting in a severely diminished pathogenic potential.

Nucleic Acids Research | 2000

WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction

Ross Overbeek; Niels Larsen; Gordon D. Pusch; Mark D’Souza; Evgeni Selkov; Nikos C. Kyrpides; Michael Fonstein; Natalia Maltsev

The WIT (What Is There) (http://wit.mcs.anl.gov/WIT2/) system has been designed to support comparative analysis of sequenced genomes and to generate metabolic reconstructions based on chromosomal sequences and metabolic modules from the EMP/MPW family of databases. This system contains data derived from about 40 completed or nearly completed genomes. Sequence homologies, various ORF-clustering algorithms, relative gene positions on the chromosome and placement of gene products in metabolic pathways (metabolic reconstruction) can be used for the assignment of gene functions and for development of overviews of genomes within WIT. The integration of a large number of phylogenetically diverse genomes in WIT facilitates the understanding of the physiology of different organisms.

Journal of Bacteriology | 2002

Genome Sequence and Analysis of the Oral Bacterium Fusobacterium nucleatum Strain ATCC 25586

Vinayak Kapatral; Iain Anderson; Natalia Ivanova; Gary Reznik; Tamara Los; Athanasios Lykidis; Anamitra Bhattacharyya; Allen Bartman; Warren Gardner; Galina Grechkin; Lihua Zhu; Olga Vasieva; Lien Chu; Yakov Kogan; Oleg Chaga; Eugene Goltsman; Axel Bernal; Niels Bent Larsen; Mark D'Souza; Theresa L. Walunas; Gordon D. Pusch; Robert Haselkorn; Michael Fonstein; Nikos C. Kyrpides; Ross Overbeek

We present a complete DNA sequence and metabolic analysis of the dominant oral bacterium Fusobacterium nucleatum. Although not considered a major dental pathogen on its own, this anaerobe facilitates the aggregation and establishment of several other species including the dental pathogens Porphyromonas gingivalis and Bacteroides forsythus. The F. nucleatum strain ATCC 25586 genome was assembled from shotgun sequences and analyzed using the ERGO bioinformatics suite (http://www.integratedgenomics.com). The genome contains 2.17 Mb encoding 2,067 open reading frames, organized on a single circular chromosome with 27% GC content. Despite its taxonomic position among the gram-negative bacteria, several features of its core metabolism are similar to that of gram-positive Clostridium spp., Enterococcus spp., and Lactococcus spp. The genome analysis has revealed several key aspects of the pathways of organic acid, amino acid, carbohydrate, and lipid metabolism. Nine very-high-molecular-weight outer membrane proteins are predicted from the sequence, none of which has been reported in the literature. More than 137 transporters for the uptake of a variety of substrates such as peptides, sugars, metal ions, and cofactors have been identified. Biosynthetic pathways exist for only three amino acids: glutamate, aspartate, and asparagine. The remaining amino acids are imported as such or as di- or oligopeptides that are subsequently degraded in the cytoplasm. A principal source of energy appears to be the fermentation of glutamate to butyrate. Additionally, desulfuration of cysteine and methionine yields ammonia, H₂S, methyl mercaptan, and butyrate, which are capable of arresting fibroblast growth, thus preventing wound healing and aiding penetration of the gingival epithelium. The metabolic capabilities of F. nucleatum revealed by its genome are therefore consistent with its specialized niche in the mouth.

Scientific Reports | 2015

RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

Thomas Brettin; James J. Davis; Terry Disz; Robert Edwards; Svetlana Gerdes; Gary J. Olsen; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D. Pusch; Maulik Shukla; James Thomason; Rick Stevens; Veronika Vonstein; Alice R. Wattam; Fangfang Xia

The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

Nucleic Acids Research | 2003

The ERGO™ genome analysis and discovery system

Ross Overbeek; Niels Bent Larsen; Theresa L. Walunas; Mark D'Souza; Gordon D. Pusch; Eugene Selkov; Konstantinos Liolios; Viktor Joukov; Denis Kaznadzey; Iain Anderson; Anamitra Bhattacharyya; Henry Burd; Warren Gardner; Paul Hanke; Vinayak Kapatral; Natalia Mikhailova; Olga Vasieva; Andrei L. Osterman; Veronika Vonstein; Michael Fonstein; Natalia V. Ivanova; Nikos C. Kyrpides

The ERGO (http://ergo.integratedgenomics.com/ERGO/) genome analysis and discovery suite is an integration of biological data from genomics, biochemistry, high-throughput expression profiling, genetics and peer-reviewed journals to achieve a comprehensive analysis of genes and genomes. Far beyond any conventional systems that facilitate functional assignments, ERGO combines pattern-based analysis with comparative genomics by visualizing genes within the context of regulation, expression profiling, phylogenetic clusters, fusion events, networked cellular pathways and chromosomal neighborhoods of other functionally related genes. The result of this multifaceted approach is to provide an extensively curated database of the largest available integration of genomes, with a vast collection of reconstructed cellular pathways spanning all domains of life. Although access to ERGO is provided only under subscription, it is already widely used by the academic community. The current version of the system integrates 500 genomes from all domains of life in various levels of completion, 403 of which are available for subscription.

Nucleic Acids Research | 2017

Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

Alice R. Wattam; James J. Davis; Rida Assaf; Sébastien Boisvert; Thomas Brettin; Christopher Bun; Neal Conrad; Emily M. Dietrich; Terry Disz; Joseph L. Gabbard; Svetlana Gerdes; Christopher S. Henry; Ronald Kenyon; Dustin Machi; Chunhong Mao; Eric K. Nordberg; Gary J. Olsen; Daniel Murphy-Olson; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D. Pusch; Maulik Shukla; Veronika Vonstein; Andrew S. Warren; Fangfang Xia; Hyun Seung Yoo; Rick Stevens

The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics.
