Shoshana D. Brown

University of California, San Francisco

Publication

Featured research published by Shoshana D. Brown.

PLOS Computational Biology | 2009

Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

Alexandra M. Schnoes; Shoshana D. Brown; Igor Dodevski; Patricia C. Babbitt

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
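The per-family, per-database error levels reported above can be illustrated with a minimal sketch. It assumes each sequence has already been judged correct or misannotated; the record layout, function names, and toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def misannotation_rates(records):
    """Return {(database, family): percent of sequences misannotated}.

    `records` is an iterable of (database, family, is_misannotated) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for database, family, is_misannotated in records:
        key = (database, family)
        totals[key] += 1
        if is_misannotated:
            errors[key] += 1
    return {key: 100.0 * errors[key] / totals[key] for key in totals}


def families_above(rates, threshold=80.0):
    """List families whose error level exceeds `threshold` in any database."""
    return sorted({family for (_, family), pct in rates.items() if pct > threshold})


# Hypothetical toy data, not results from the study.
toy_records = [
    ("db_a", "family_1", False),
    ("db_a", "family_1", False),
    ("db_b", "family_1", True),
    ("db_b", "family_1", False),
    ("db_b", "family_2", True),
    ("db_b", "family_2", True),
    ("db_b", "family_2", True),
    ("db_b", "family_2", True),
    ("db_b", "family_2", True),
    ("db_b", "family_2", False),
]

rates = misannotation_rates(toy_records)
flagged = families_above(rates)
```

Keying the counts by (database, family) keeps a well-curated database from masking a poorly annotated one when the same family appears in both.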

Nucleic Acids Research | 2014

The Structure–Function Linkage Database

Eyal Akiva; Shoshana D. Brown; Daniel E. Almonacid; Alan E. Barber; Ashley F. Custer; Michael A. Hicks; Conrad C. Huang; Florian Lauck; Susan T. Mashiyama; Elaine C. Meng; David Mischel; John H. Morris; Sunil Ojha; Alexandra M. Schnoes; Doug Stryke; Jeffrey M. Yunes; Thomas E. Ferrin; Gemma L. Holliday; Patricia C. Babbitt

The Structure–Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure–function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies ‘look alike’, making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.

Nature | 2013

Discovery of new enzymes and metabolic pathways by using structure and genome context

Suwen Zhao; Ritesh Kumar; Ayano Sakai; Matthew W. Vetting; B. McKay Wood; Shoshana D. Brown; Jeffrey B. Bonanno; B. Hillerich; R.D. Seidel; Patricia C. Babbitt; Steven C. Almo; Jonathan V. Sweedler; John A. Gerlt; John E. Cronan; Matthew P. Jacobson

Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns. We and others are developing computation-guided strategies for functional discovery with ‘metabolite docking’ to experimentally derived or homology-based three-dimensional structures. Bacterial metabolic pathways often are encoded by ‘genome neighbourhoods’ (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by ‘predicting’ the intermediates in the glycolytic pathway in Escherichia coli. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.

Proceedings of the National Academy of Sciences of the United States of America | 2012

Homology models guide discovery of diverse enzyme specificities among dipeptide epimerases in the enolase superfamily

Tiit Lukk; Ayano Sakai; Chakrapani Kalyanaraman; Shoshana D. Brown; Heidi Imker; Ling Song; Alexander A. Fedorov; Elena V. Fedorov; Rafael Toro; B. Hillerich; R.D. Seidel; Yury Patskovsky; Matthew W. Vetting; Satish K. Nair; Patricia C. Babbitt; Steven C. Almo; John A. Gerlt; Matthew P. Jacobson

The rapid advance in genome sequencing presents substantial challenges for protein functional assignment, with half or more of new protein sequences inferred from these genomes having uncertain assignments. The assignment of enzyme function in functionally diverse superfamilies represents a particular challenge, which we address through a combination of computational predictions, enzymology, and structural biology. Here we describe the results of a focused investigation of a group of enzymes in the enolase superfamily that are involved in epimerizing dipeptides. The first members of this group to be functionally characterized were Ala-Glu epimerases in Escherichia coli and Bacillus subtilis, based on the operon context and enzymological studies; these enzymes are presumed to be involved in peptidoglycan recycling. We have subsequently studied more than 65 related enzymes by computational methods, including homology modeling and metabolite docking, which suggested that many would have divergent specificities; i.e., they are likely to have different (unknown) biological roles. In addition to the Ala-Phe epimerase specificity reported previously, we describe the prediction and experimental verification of: (i) a new group of presumed Ala-Glu epimerases; (ii) several enzymes with specificity for hydrophobic dipeptides, including one from Cytophaga hutchinsonii that epimerizes D-Ala-D-Ala; and (iii) a small group of enzymes that epimerize cationic dipeptides. Crystal structures for certain of these enzymes further elucidate the structural basis of the specificities. The results highlight the potential of computational methods to guide experimental characterization of enzymes in an automated, large-scale fashion.

eLife | 2014

Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks

Suwen Zhao; Ayano Sakai; Xinshuai Zhang; Matthew W. Vetting; Ritesh Kumar; B. Hillerich; Brian San Francisco; Jose O. Solbiati; Adam Steves; Shoshana D. Brown; Eyal Akiva; Alan E. Barber; R.D. Seidel; Patricia C. Babbitt; Steven C. Almo; John A. Gerlt; Matthew P. Jacobson

Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ∼85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks and GNNs will facilitate the discovery of the components of novel, uncharacterized metabolic pathways in sequenced genomes. DOI: http://dx.doi.org/10.7554/eLife.03275.001

AAPS PharmSci | 2003

A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes

Shoshana D. Brown; Jean L. Chang; Wolfgang Sadee; Patricia C. Babbitt

Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.

Journal of Biological Chemistry | 2014

New insights about enzyme evolution from large scale studies of sequence and structure relationships.

Shoshana D. Brown; Patricia C. Babbitt

Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.

Nature | 2013

Structure-guided discovery of the metabolite carboxy-SAM that modulates tRNA function

Jungwook Kim; Hui Xiao; Jeffrey B. Bonanno; Chakrapani Kalyanaraman; Shoshana D. Brown; Xiangying Tang; Nawar Al-Obaidi; Yury Patskovsky; Patricia C. Babbitt; Matthew P. Jacobson; Young Sam Lee; Steven C. Almo

The identification of novel metabolites and the characterization of their biological functions are major challenges in biology. X-ray crystallography can reveal unanticipated ligands that persist through purification and crystallization. These adventitious protein–ligand complexes provide insights into new activities, pathways and regulatory mechanisms. We describe a new metabolite, carboxy-S-adenosyl-L-methionine (Cx-SAM), its biosynthetic pathway and its role in transfer RNA modification. The structure of CmoA, a member of the SAM-dependent methyltransferase superfamily, revealed a ligand consistent with Cx-SAM in the catalytic site. Mechanistic analyses showed an unprecedented role for prephenate as the carboxyl donor and the involvement of a unique ylide intermediate as the carboxyl acceptor in the CmoA-mediated conversion of SAM to Cx-SAM. A second member of the SAM-dependent methyltransferase superfamily, CmoB, recognizes Cx-SAM and acts as a carboxymethyltransferase to convert 5-hydroxyuridine into 5-oxyacetyl uridine at the wobble position of multiple tRNAs in Gram-negative bacteria, resulting in expanded codon-recognition properties. CmoA and CmoB represent the first documented synthase and transferase for Cx-SAM. These findings reveal new functional diversity in the SAM-dependent methyltransferase superfamily and expand the metabolic and biological contributions of SAM-based biochemistry. These discoveries highlight the value of structural genomics approaches in identifying ligands within the context of their physiologically relevant macromolecular binding partners, and in revealing their functions.

Pacific Symposium on Biocomputing | 2004

Representing structure-function relationships in mechanistically diverse enzyme superfamilies.

Scott C.-H. Pegg; Shoshana D. Brown; Sunil Ojha; Conrad C. Huang; Thomas E. Ferrin; Patricia C. Babbitt

The prediction of protein function from structure or sequence data remains a problem best addressed by leveraging information available from previously determined structure-function relationships. In the case of enzymes, the study of mechanistically diverse superfamilies can provide a rich source of structure-function information useful in functional determination and enzyme engineering. To access these relationships using a computational resource, several issues must be addressed regarding the representation of enzyme function, the organization of structure-function relationships in the superfamily context, the handling of misannotations, and reliability of classifications and evidence. We discuss here our approaches to solving these problems in the development of a Structure-Function Linkage Database (SFLD) (online at http://sfld.rbvi.ucsf.edu).

Journal of Biological Chemistry | 2012

Inference of Functional Properties from Large-scale Analysis of Enzyme Superfamilies

Shoshana D. Brown; Patricia C. Babbitt

As increasingly large amounts of data from genome and other sequencing projects become available, new approaches are needed to determine the functions of the proteins these genes encode. We show how large-scale computational analysis can help to address this challenge by linking functional information to sequence and structural similarities using protein similarity networks. Network analyses using three functionally diverse enzyme superfamilies illustrate the use of these approaches for facile updating and comparison of available structures for a large superfamily, for creation of functional hypotheses for metagenomic sequences, and to summarize the limits of our functional knowledge about even well studied superfamilies.
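The idea behind a protein similarity network can be sketched in a few lines: sequences are nodes, an edge joins two sequences whose pairwise similarity meets a chosen threshold, and clusters are the connected components. The sketch below is an illustration under stated assumptions, not the authors' pipeline; it uses a generic score in which higher means more similar, and the function names and toy scores are hypothetical.

```python
def similarity_network(scores, threshold):
    """Build an undirected graph from pairwise similarity scores.

    `scores` maps (seq_a, seq_b) to a similarity value; higher means more
    similar (an assumption for this sketch). Every sequence becomes a node,
    and an edge is kept only when the score is at least `threshold`.
    """
    graph = {}
    for (a, b), score in scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return connected components as sorted lists, ordered by first member."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical toy scores, not data from the paper.
toy_scores = {
    ("seq1", "seq2"): 0.9,
    ("seq2", "seq3"): 0.6,
    ("seq3", "seq4"): 0.2,
    ("seq1", "seq4"): 0.1,
}

strict = clusters(similarity_network(toy_scores, 0.8))
lenient = clusters(similarity_network(toy_scores, 0.5))
```

Raising the threshold splits the network into smaller, more closely related groups, while lowering it merges them, so the same data can be viewed at several levels of granularity.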
