Rick Stevens | Researchain

Publication

Featured research published by Rick Stevens.

BMC Genomics | 2008

The RAST Server: Rapid Annotations using Subsystems Technology

Ramy K. Aziz; Daniela Bartels; Aaron A. Best; Matthew DeJongh; Terrence Disz; Robert Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M. Glass; Michael Kubal; Folker Meyer; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Ross Overbeek; Leslie K. McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D. Pusch; Claudia I. Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko

Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them.
Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service.
Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

BMC Bioinformatics | 2008

The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes

Folker Meyer; Daniel Paarmann; Mark D'Souza; Robert Olson; Elizabeth M. Glass; Michael Kubal; Tobias Paczian; Alexis Rodriguez; Rick Stevens; Andreas Wilke; Jared Wilkening; Robert Edwards

Background: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers.
Results: A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats.
Conclusion: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis – the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org

Nucleic Acids Research | 2014

The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Ross Overbeek; Robert Olson; Gordon D. Pusch; Gary J. Olsen; James J. Davis; Terry Disz; Robert Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R. Wattam; Fangfang Xia; Rick Stevens

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

Nucleic Acids Research | 2005

The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes

Ross Overbeek; Tadhg P. Begley; Ralph Butler; Jomuna V. Choudhuri; Han-Yu Chuang; Matthew Cohoon; Valérie de Crécy-Lagard; Naryttza N. Diaz; Terry Disz; Robert D. Edwards; Michael Fonstein; Ed D. Frank; Svetlana Gerdes; Elizabeth M. Glass; Alexander Goesmann; Andrew C. Hanson; Dirk Iwata-Reuyl; Roy A. Jensen; Neema Jamshidi; Lutz Krause; Michael Kubal; Niels Bent Larsen; Burkhard Linke; Alice C. McHardy; Folker Meyer; Heiko Neuweger; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Vasiliy A. Portnoy

The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.

Nature | 2008

Functional metagenomic profiling of nine biomes

Elizabeth A. Dinsdale; Robert Edwards; Dana Hall; Florent E. Angly; Mya Breitbart; Mike Furlan; Christelle Desnues; Matthew Haynes; Linlin Li; Lauren D. McDaniel; Mary Ann Moran; Karen E. Nelson; Christina Nilsson; Robert Olson; John H. Paul; Beltran Rodriguez Brito; Yijun Ruan; Brandon K. Swan; Rick Stevens; David L. Valentine; Rebecca Vega Thurber; Linda Wegley; Bryan A. White; Forest Rohwer

Microbial activities shape the biogeochemistry of the planet and macroorganism health. Determining the metabolic processes performed by microbes is important both for understanding and for manipulating ecosystems (for example, disruption of key processes that lead to disease, conservation of environmental services, and so on). Describing microbial function is hampered by the inability to culture most microbes and by high levels of genomic plasticity. Metagenomic approaches analyse microbial communities to determine the metabolic processes that are important for growth and survival in any given environment. Here we conduct a metagenomic comparison of almost 15 million sequences from 45 distinct microbiomes and, for the first time, 42 distinct viromes and show that there are strongly discriminatory metabolic profiles across environments. Most of the functional diversity was maintained in all of the communities, but the relative occurrence of metabolisms varied, and the differences between metagenomes predicted the biogeochemical conditions of each environment. The magnitude of the microbial metabolic capabilities encoded by the viromes was extensive, suggesting that they serve as a repository for storing and sharing genes among their microbial hosts and influence global evolutionary and metabolic processes.

Nature Biotechnology | 2010

High-throughput generation, optimization and analysis of genome-scale metabolic models

Christopher S. Henry; Matthew DeJongh; Aaron A. Best; Paul M Frybarger; Ben Linsay; Rick Stevens

Genome-scale metabolic models have proven to be valuable for predicting organism phenotypes from genotypes. Yet efforts to develop new models are failing to keep pace with genome sequencing. To address this problem, we introduce the Model SEED, a web-based resource for high-throughput generation, optimization and analysis of genome-scale metabolic models. The Model SEED integrates existing methods and introduces techniques to automate nearly every step of this process, taking ∼48 h to reconstruct a metabolic model from an assembled genome sequence. We apply this resource to generate 130 genome-scale metabolic models representing a taxonomically diverse set of bacteria. Twenty-two of the models were validated against available gene essentiality and Biolog data, with the average model accuracy determined to be 66% before optimization and 87% after optimization.

IEEE International Conference on High Performance Computing, Data, and Analytics | 2011

The International Exascale Software Project roadmap

Jack J. Dongarra; Pete Beckman; Terry Moore; Patrick Aerts; Giovanni Aloisio; Jean Claude Andre; David Barkai; Jean Yves Berthou; Taisuke Boku; Bertrand Braunschweig; Franck Cappello; Barbara M. Chapman; Xuebin Chi; Alok N. Choudhary; Sudip S. Dosanjh; Thom H. Dunning; Sandro Fiore; Al Geist; Bill Gropp; Robert J. Harrison; Mark Hereld; Michael A. Heroux; Adolfy Hoisie; Koh Hotta; Zhong Jin; Yutaka Ishikawa; Fred Johnson; Sanjay Kale; R.D. Kenway; David E. Keyes

Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. However, although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual petascale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and graphics processing units. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combining their efforts in a coordinated International Exascale Software Project.

Scientific Reports | 2015

RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

Thomas Brettin; James J. Davis; Terry Disz; Robert Edwards; Svetlana Gerdes; Gary J. Olsen; Robert Olson; Ross Overbeek; Bruce Parrello; Gordon D. Pusch; Maulik Shukla; James Thomason; Rick Stevens; Veronika Vonstein; Alice R. Wattam; Fangfang Xia

The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

New Generation Computing | 1990

The Aurora or-parallel Prolog system

Ewing L. Lusk; Ralph Butler; Terrence Disz; Robert Olson; Ross Overbeek; Rick Stevens; David H. D. Warren; Alan Calderwood; Péter Szeredi; Seif Haridi; Per Brand; Mats Carlsson; Andrzej Ciepielewski; Bogumil Hausman

Aurora is a prototype or-parallel implementation of the full Prolog language for shared-memory multiprocessors, developed as part of an informal research collaboration known as the “Gigalips Project”. It currently runs on Sequent and Encore machines. It has been constructed by adapting Sicstus Prolog, a fast, portable, sequential Prolog system. The techniques for constructing a portable multiprocessor version follow those pioneered in a predecessor system, ANL-WAM. The SRI model was adopted as the means to extend the Sicstus Prolog engine for or-parallel operation. We describe the design and main implementation features of the current Aurora system, and present some experimental results. For a range of benchmarks, Aurora on a 20-processor Sequent Symmetry is 4 to 7 times faster than Quintus Prolog on a Sun 3/75. Good performance is also reported on some large-scale Prolog applications.

PLOS Computational Biology | 2009

The GAAS Metagenomic Tool and Its Estimations of Viral and Microbial Average Genome Size in Four Major Biomes

Florent E. Angly; Dana Willner; Alejandra Prieto-Davó; Robert Edwards; Robert Schmieder; Rebecca Vega-Thurber; Dionysios A. Antonopoulos; Katie L. Barott; Matthew T. Cottrell; Christelle Desnues; Elizabeth A. Dinsdale; Mike Furlan; Matthew Haynes; Matthew R. Henn; Yongfei Hu; David L. Kirchman; Tracey McDole; John D. McPherson; Folker Meyer; R. Michael Miller; Egbert Mundt; Robert K. Naviaux; Beltran Rodriguez-Mueller; Rick Stevens; Linda Wegley; Lixin Zhang; Baoli Zhu; Forest Rohwer

Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
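The length normalization and similarity weighting mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the GAAS implementation: the function name `estimate_composition`, the input layout, and the weighting scheme (each read splits one unit of evidence across its hits in proportion to a similarity weight) are assumptions made for illustration. The underlying idea is that a longer genome contributes more reads at the same abundance, so weighted hit counts are divided by genome length before being turned into relative abundances.

```python
from collections import defaultdict


def estimate_composition(read_hits, genome_length):
    """Illustrative sketch only (hypothetical names, not the GAAS code).

    read_hits: one dict per read, mapping genome id -> similarity weight.
    genome_length: dict mapping genome id -> genome length in bases.
    Returns (relative abundance per genome, average genome length).
    """
    weighted_hits = defaultdict(float)
    for hits in read_hits:
        total = sum(hits.values())
        if total == 0:
            continue  # read with no usable similarity
        # Similarity weighting: a read with several hits is shared among them.
        for genome, weight in hits.items():
            weighted_hits[genome] += weight / total

    # Length normalization: divide by genome length to offset the fact that
    # longer genomes are sampled more often at equal abundance.
    per_base = {g: c / genome_length[g] for g, c in weighted_hits.items()}
    norm = sum(per_base.values())
    if norm == 0:
        return {}, 0.0

    abundance = {g: v / norm for g, v in per_base.items()}
    average_length = sum(abundance[g] * genome_length[g] for g in abundance)
    return abundance, average_length
```

In this sketch, one read assigned to a 1,000-base genome and four reads assigned to a 4,000-base genome yield equal relative abundances, whereas raw hit counts would favor the larger genome four to one.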
