Aaron A. Best
Hope College
Publications
Featured research published by Aaron A. Best.
BMC Genomics | 2008
Ramy K. Aziz; Daniela Bartels; Aaron A. Best; Matthew DeJongh; Terrence Disz; Robert Edwards; Kevin Formsma; Svetlana Gerdes; Elizabeth M. Glass; Michael Kubal; Folker Meyer; Gary J. Olsen; Robert Olson; Andrei L. Osterman; Ross Overbeek; Leslie K. McNeil; Daniel Paarmann; Tobias Paczian; Bruce Parrello; Gordon D. Pusch; Claudia I. Reich; Rick Stevens; Olga Vassieva; Veronika Vonstein; Andreas Wilke; Olga Zagnitko
Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.
Science | 2007
Hilary G. Morrison; Andrew G. McArthur; Frances D. Gillin; Stephen B. Aley; Rodney D. Adam; Gary J. Olsen; Aaron A. Best; W. Zacheus Cande; Feng Chen; Michael J. Cipriano; Barbara J. Davids; Scott C. Dawson; Heidi G. Elmendorf; Adrian B. Hehl; Michael E. Holder; Susan M. Huse; Ulandt Kim; Erica Lasek-Nesselquist; Gerard Manning; Anuranjini Nigam; Julie E. J. Nixon; Daniel Palm; Nora Q.E. Passamaneck; Anjali Prabhu; Claudia I. Reich; David S. Reiner; John Samuelson; Staffan G. Svärd; Mitchell L. Sogin
The genome of the eukaryotic protist Giardia lamblia, an important human intestinal parasite, is compact in structure and content, contains few introns or mitochondrial relics, and has simplified machinery for DNA replication, transcription, RNA processing, and most metabolic pathways. Protein kinases comprise the single largest protein class and reflect Giardia's requirement for a complex signal transduction network for coordinating differentiation. Lateral gene transfer from bacterial and archaeal donors has shaped Giardia's genome, and previously unknown gene families, for example, cysteine-rich structural proteins, have been discovered. Unexpectedly, the genome shows little evidence of heterozygosity, supporting recent speculations that this organism is sexual. This genome sequence will not only be valuable for investigating the evolution of eukaryotes, but will also be applied to the search for new therapeutics for this parasite.
Nature Biotechnology | 2010
Christopher S. Henry; Matthew DeJongh; Aaron A. Best; Paul M Frybarger; Ben Linsay; Rick Stevens
Genome-scale metabolic models have proven to be valuable for predicting organism phenotypes from genotypes. Yet efforts to develop new models are failing to keep pace with genome sequencing. To address this problem, we introduce the Model SEED, a web-based resource for high-throughput generation, optimization and analysis of genome-scale metabolic models. The Model SEED integrates existing methods and introduces techniques to automate nearly every step of this process, taking ∼48 h to reconstruct a metabolic model from an assembled genome sequence. We apply this resource to generate 130 genome-scale metabolic models representing a taxonomically diverse set of bacteria. Twenty-two of the models were validated against available gene essentiality and Biolog data, with the average model accuracy determined to be 66% before optimization and 87% after optimization.
BMC Bioinformatics | 2007
Matthew DeJongh; Kevin Formsma; Paul Boillot; John Gould; Matthew Rycenga; Aaron A. Best
Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. Conclusion: Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.
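As a rough illustration of what "identifying gaps in the reaction network" can mean, the sketch below flags metabolites that some reaction consumes but that no reaction produces and that are not supplied externally. This is a generic dead-end check on a toy network, not the method described in the abstract; the function name, the data layout, and the metabolite identifiers are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: reaction id -> (substrates, products)
Reaction = Tuple[Set[str], Set[str]]


def find_gap_metabolites(reactions: Dict[str, Reaction],
                         supplied: Set[str]) -> Set[str]:
    """Return metabolites that are consumed but never produced or supplied."""
    produced = set(supplied)   # externally supplied nutrients count as available
    consumed: Set[str] = set()
    for substrates, products in reactions.values():
        consumed |= substrates
        produced |= products
    return consumed - produced


# Toy network (illustrative identifiers): the step producing f6p is missing.
toy_network = {
    "r1": ({"glc"}, {"g6p"}),
    "r2": ({"f6p"}, {"fbp"}),
}
gaps = find_gap_metabolites(toy_network, supplied={"glc"})
```

Here `gaps` contains only `f6p`, pointing at the missing reaction that would convert `g6p` to `f6p`. A check of this kind could be run on each component in isolation and again on the assembled network.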
Methods in Molecular Biology | 2013
Scott Devoid; Ross Overbeek; Matthew DeJongh; Veronika Vonstein; Aaron A. Best; Christopher S. Henry
Over the past decade, genome-scale metabolic models have proven to be a crucial resource for predicting organism phenotypes from genotypes. These models provide a means of rapidly translating detailed knowledge of thousands of enzymatic processes into quantitative predictions of whole-cell behavior. Until recently, the pace of new metabolic model development was eclipsed by the pace at which new genomes were being sequenced. To address this problem, the RAST and the Model SEED framework were developed as a means of automatically producing annotations and draft genome-scale metabolic models. In this chapter, we describe the automated model reconstruction process in detail, starting from a new genome sequence and finishing on a functioning genome-scale metabolic model. We break down the model reconstruction process into eight steps: submitting a genome sequence to RAST, annotating the genome, curating the annotation, submitting the annotation to Model SEED, reconstructing the core model, generating the draft biomass reaction, auto-completing the model, and curating the model. Each of these eight steps is documented in detail.
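The eight steps listed above can be written down as an ordered enumeration, which makes it easy to track where a genome is in the process. This is only a bookkeeping sketch: the class and function names are hypothetical and do not correspond to any RAST or Model SEED interface.

```python
from enum import IntEnum
from typing import List, Optional


class ReconstructionStep(IntEnum):
    """The eight steps named in the abstract, in the order given there."""
    SUBMIT_GENOME_TO_RAST = 1
    ANNOTATE_GENOME = 2
    CURATE_ANNOTATION = 3
    SUBMIT_ANNOTATION_TO_MODEL_SEED = 4
    RECONSTRUCT_CORE_MODEL = 5
    GENERATE_DRAFT_BIOMASS_REACTION = 6
    AUTO_COMPLETE_MODEL = 7
    CURATE_MODEL = 8


def next_step(current: ReconstructionStep) -> Optional[ReconstructionStep]:
    """Return the step that follows `current`, or None after the last step."""
    if current is ReconstructionStep.CURATE_MODEL:
        return None
    return ReconstructionStep(current + 1)


def remaining_steps(current: ReconstructionStep) -> List[ReconstructionStep]:
    """Return every step still to be done after `current`, in order."""
    return [step for step in ReconstructionStep if step > current]
```

For example, `next_step(ReconstructionStep.CURATE_ANNOTATION)` returns `ReconstructionStep.SUBMIT_ANNOTATION_TO_MODEL_SEED`, the hand-off from annotation to model building.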
Journal of Bacteriology | 2011
Dmitry A. Ravcheev; Aaron A. Best; Nathan L. Tintle; Matthew DeJongh; Andrei L. Osterman; Pavel S. Novichkov; Dmitry A. Rodionov
Transcriptional regulatory networks are fine-tuned systems that help microorganisms respond to changes in the environment and cell physiological state. We applied the comparative genomics approach implemented in the RegPredict Web server combined with SEED subsystem analysis and available information on known regulatory interactions for regulatory network reconstruction for the human pathogen Staphylococcus aureus and six related species from the family Staphylococcaceae. The resulting reference set of 46 transcription factor regulons contains more than 1,900 binding sites and 2,800 target genes involved in the central metabolism of carbohydrates, amino acids, and fatty acids; respiration; the stress response; metal homeostasis; drug and metal resistance; and virulence. The inferred regulatory network in S. aureus includes ∼320 regulatory interactions between 46 transcription factors and ∼550 candidate target genes comprising 20% of its genome. We predicted ∼170 novel interactions and 24 novel regulons for the control of the central metabolic pathways in S. aureus. The reconstructed regulons are largely variable in the Staphylococcaceae: only 20% of S. aureus regulatory interactions are conserved across all studied genomes. We used a large-scale gene expression data set for S. aureus to assess relationships between the inferred regulons and gene expression patterns. The predicted reference set of regulons is captured within the Staphylococcus collection in the RegPrecise database (http://regprecise.lbl.gov).
Biochimica et Biophysica Acta | 2011
Christopher S. Henry; Ross Overbeek; Fangfang Xia; Aaron A. Best; Elizabeth M. Glass; Jack A. Gilbert; Peter E. Larsen; Robert Edwards; Terry Disz; Folker Meyer; Veronika Vonstein; Matthew DeJongh; Daniela Bartels; Narayan Desai; Mark D'Souza; Scott Devoid; Kevin P. Keegan; Robert Olson; Andreas Wilke; Jared Wilkening; Rick Stevens
BACKGROUND: The development of next generation sequencing technology is rapidly changing the face of the genome annotation and analysis field. One of the primary uses for genome sequence data is to improve our understanding and prediction of phenotypes for microbes and microbial communities, but the technologies for predicting phenotypes must keep pace with the new sequences emerging. SCOPE OF REVIEW: This review presents an integrated view of the methods and technologies used in the inference of phenotypes for microbes and microbial communities based on genomic and metagenomic data. Given the breadth of this topic, we place special focus on the resources available within the SEED Project. We discuss the two steps involved in connecting genotype to phenotype: sequence annotation, and phenotype inference, and we highlight the challenges in each of these steps when dealing with both single genome and metagenome data. MAJOR CONCLUSIONS: This integrated view of the genotype-to-phenotype problem highlights the importance of a controlled ontology in the annotation of genomic data, as this benefits subsequent phenotype inference and metagenome annotation. We also note the importance of expanding the set of reference genomes to improve the annotation of all sequence data, and we highlight metagenome assembly as a potential new source for complete genomes. Finally, we find that phenotype inference, particularly from metabolic models, generates predictions that can be validated and reconciled to improve annotations. GENERAL SIGNIFICANCE: This review presents the first look at the challenges and opportunities associated with the inference of phenotype from genotype during the next generation sequencing revolution. This article is part of a Special Issue entitled: Systems Biology of Microorganisms.
BMC Genomics | 2013
Dmitry A. Ravcheev; Aaron A. Best; Natalia V. Sernova; Marat D. Kazanov; Pavel S. Novichkov; Dmitry A. Rodionov
Background: Genome-scale annotation of regulatory interactions and reconstruction of regulatory networks are crucial problems in bacterial genomics. The Lactobacillales order of bacteria collates various microorganisms having a large economic impact, including both human and animal pathogens and strains used in the food industry. Nonetheless, no systematic genome-wide analysis of transcriptional regulation has been previously made for this taxonomic group. Results: A comparative genomics approach was used for reconstruction of transcriptional regulatory networks in 30 selected genomes of lactic acid bacteria. The inferred networks comprise regulons for 102 orthologous transcription factors (TFs), including 47 novel regulons for previously uncharacterized TFs. Numerous differences between regulatory networks of the Streptococcaceae and Lactobacillaceae groups were described on several levels. The two groups are characterized by substantially different sets of TFs encoded in their genomes. Content of the inferred regulons and structure of their cognate TF binding motifs differ for many orthologous TFs between the two groups. Multiple cases of non-orthologous displacements of TFs that control specific metabolic pathways were reported. Conclusions: The reconstructed regulatory networks substantially expand the existing knowledge of transcriptional regulation in lactic acid bacteria. In each of 30 studied genomes the obtained regulatory network contains on average 36 TFs and 250 target genes that are mostly involved in carbohydrate metabolism, stress response, metal homeostasis and amino acid biosynthesis. The inferred networks can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. All reconstructed regulons are captured within the Streptococcaceae and Lactobacillaceae collections in the RegPrecise database (http://regprecise.lbl.gov).
bioRxiv | 2016
Adam P. Arkin; Rick Stevens; Robert W. Cottingham; Sergei Maslov; Christopher S. Henry; Paramvir Dehal; Doreen Ware; Fernando Perez; Nomi L. Harris; Shane Canon; Michael W Sneddon; Matthew L Henderson; William J Riehl; Dan Gunter; Dan Murphy-Olson; Stephen Chan; Roy T Kamimura; Thomas S Brettin; Folker Meyer; Dylan Chivian; David J. Weston; Elizabeth M. Glass; Brian H. Davison; Sunita Kumari; Benjamin H Allen; Jason K. Baumohl; Aaron A. Best; Ben Bowen; Steven E. Brenner; Christopher C Bun
The U.S. Department of Energy Systems Biology Knowledgebase (KBase) is an open-source software and data platform designed to meet the grand challenge of systems biology — predicting and designing biological function from the biomolecular (small scale) to the ecological (large scale). KBase is available for anyone to use, and enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; perform large-scale analyses on scalable computing infrastructure; and combine experimental evidence and conclusions that lead to accurate models of plant and microbial physiology and community dynamics. The KBase platform has (1) extensible analytical capabilities that currently include genome assembly, annotation, ontology assignment, comparative genomics, transcriptomics, and metabolic modeling; (2) a web-browser-based user interface that supports building, sharing, and publishing reproducible and well-annotated analyses with integrated data; (3) access to extensive computational resources; and (4) a software development kit allowing the community to add functionality to the system.
Journal of Bacteriology | 2012
Irina A. Rodionova; Chen Yang; Xiaoqing Li; Oleg V. Kurnasov; Aaron A. Best; Andrei L. Osterman; Dmitry A. Rodionov
Sugar phosphorylation is an indispensable committed step in a large variety of sugar catabolic pathways, which are major suppliers of carbon and energy in heterotrophic species. Specialized sugar kinases that are indispensable for most of these pathways can be utilized as signature enzymes for the reconstruction of carbohydrate utilization machinery from microbial genomic and metagenomic data. Sugar kinases occur in several structurally distinct families with various partially overlapping as well as yet unknown substrate specificities that often cannot be accurately assigned by homology-based techniques. A subsystems-based metabolic reconstruction combined with the analysis of genome context and followed by experimental testing of predicted gene functions is a powerful approach to functional gene annotation. Here we applied this integrated approach for functional mapping of all sugar kinases constituting an extensive and diverse sugar kinome in the thermophilic bacterium Thermotoga maritima. Substrate preferences of 14 kinases mainly from the FGGY and PfkB families were inferred by bioinformatics analysis and biochemically characterized by screening with a panel of 45 different carbohydrates. Most of the analyzed enzymes displayed narrow substrate preferences corresponding to their predicted physiological roles in their respective catabolic pathways. The observed consistency supports the choice of kinases as signature enzymes for genomics-based identification and reconstruction of sugar utilization pathways. Use of the integrated genomic and experimental approach greatly speeds up the identification of the biochemical function of unknown proteins and improves the quality of reconstructed pathways.