David E. Kim
University of Washington
Publication
Featured research published by David E. Kim.
Methods in Enzymology | 2011
Andrew Leaver-Fay; Michael D. Tyka; Steven M. Lewis; Oliver F. Lange; James Thompson; Ron Jacak; Kristian W. Kaufman; P. Douglas Renfrew; Colin A. Smith; Will Sheffler; Ian W. Davis; Seth Cooper; Adrien Treuille; Daniel J. Mandell; Florian Richter; Yih-En Andrew Ban; Sarel J. Fleishman; Jacob E. Corn; David E. Kim; Sergey Lyskov; Monica Berrondo; Stuart Mentzer; Zoran Popović; James J. Havranek; John Karanicolas; Rhiju Das; Jens Meiler; Tanja Kortemme; Jeffrey J. Gray; Brian Kuhlman
We have recently completed a full re-architecturing of the ROSETTA molecular modeling program, generalizing and expanding its existing functionality. The new architecture enables the rapid prototyping of novel protocols by providing easy-to-use interfaces to powerful tools for molecular modeling. The source code of this rearchitecturing has been released as ROSETTA3 and is freely available for academic use. At the time of its release, it contained 470,000 lines of code. Counting currently unpublished protocols at the time of this writing, the source includes 1,285,000 lines. Its rapid growth is a testament to its ease of use. This chapter describes the requirements for our new architecture, justifies the design decisions, sketches out central classes, and highlights a few of the common tasks that the new software can perform.
Nucleic Acids Research | 2004
David E. Kim; Dylan Chivian; David Baker
The Robetta server (http://robetta.bakerlab.org) provides automated tools for protein structure prediction and analysis. For structure prediction, sequences submitted to the server are parsed into putative domains and structural models are generated using either comparative modeling or de novo structure prediction methods. If a confident match to a protein of known structure is found using BLAST, PSI-BLAST, FFAS03 or 3D-Jury, it is used as a template for comparative modeling. If no match is found, structure predictions are made using the de novo Rosetta fragment insertion method. Experimental nuclear magnetic resonance (NMR) constraints data can also be submitted with a query sequence for RosettaNMR de novo structure determination. Other current capabilities include the prediction of the effects of mutations on protein-protein interactions using computational interface alanine scanning. The Rosetta protein design and protein-protein docking methodologies will soon be available through the server as well.
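The template-or-de-novo decision described in the abstract above can be illustrated with a minimal sketch. It is not the Robetta implementation: the `Domain` type, the `confidence` score, the `threshold` value, and the protocol labels are all hypothetical names chosen for illustration, and the real server derives its confidence from BLAST, PSI-BLAST, FFAS03, or 3D-Jury in ways not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Domain:
    """A putative domain parsed from a query sequence (hypothetical type)."""
    name: str
    template_id: Optional[str]  # assumed: ID of a matched known structure, or None
    confidence: float           # assumed: match confidence scaled to [0, 1]


def choose_protocol(domain: Domain, threshold: float = 0.9) -> str:
    """Pick a modeling protocol for one domain.

    A confident match to a known structure selects comparative modeling;
    otherwise the domain falls back to de novo fragment insertion.
    The threshold is an illustrative assumption, not a published value.
    """
    if domain.template_id is not None and domain.confidence >= threshold:
        return "comparative_modeling"
    return "de_novo"
```

Each domain is decided independently, so a multi-domain query could mix both protocols.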
Proteins | 2009
Srivatsan Raman; Robert B. Vernon; James Thompson; Michael D. Tyka; Ruslan I. Sadreyev; Jimin Pei; David E. Kim; Elizabeth H. Kellogg; Frank DiMaio; Oliver F. Lange; Lisa N. Kinch; Will Sheffler; Bong Hyun Kim; Rhiju Das; Nick V. Grishin; David Baker
We describe predictions made using the Rosetta structure prediction methodology for the Eighth Critical Assessment of Techniques for Protein Structure Prediction. Aggressive sampling and all‐atom refinement were carried out for nearly all targets. A combination of alignment methodologies was used to generate starting models from a range of templates, and the models were then subjected to Rosetta all atom refinement. For the 64 domains with readily identified templates, the best submitted model was better than the best alignment to the best template in the Protein Data Bank for 24 cases, and improved over the best starting model for 43 cases. For 13 targets where only very distant sequence relationships to proteins of known structure were detected, models were generated using the Rosetta de novo structure prediction methodology followed by all‐atom refinement; in several cases the submitted models were better than those based on the available templates. Of the 12 refinement challenges, the best submitted model improved on the starting model in seven cases. These improvements over the starting template‐based models and refinement tests demonstrate the power of Rosetta structure refinement in improving model accuracy. Proteins 2009.
Proteins | 2003
Dylan Chivian; David E. Kim; Lars Malmström; Philip Bradley; Timothy Robertson; Paul Murphy; Charles E.M. Strauss; Richard Bonneau; Carol A. Rohl; David Baker
Robetta is a fully automated protein structure prediction server that uses the Rosetta fragment‐insertion method. It combines template‐based and de novo structure prediction methods in an attempt to produce high quality models that cover every residue of a submitted sequence. The first step in the procedure is the automatic detection of the locations of domains and selection of the appropriate modeling protocol for each domain. For domains matched to a homolog with an experimentally characterized structure by PSI‐BLAST or Pcons2, Robetta uses a new alignment method, called K*Sync, to align the query sequence onto the parent structure. It then models the variable regions by allowing them to explore conformational space with fragments in a fashion similar to the de novo protocol, but in the context of the template. When no structural homolog is available, domains are modeled with the Rosetta de novo protocol, which allows the full length of the domain to explore conformational space via fragment‐insertion, producing a large decoy ensemble from which the final models are selected. The Robetta server produced quite reasonable predictions for targets in the recent CASP‐5 and CAFASP‐3 experiments, some of which were at the level of the best human predictions. Proteins 2003;53:524–533.
Structure | 2013
Yifan Song; Frank DiMaio; Raymond Y. Wang; David E. Kim; Chris Miles; T. J. Brunette; James Thompson; David Baker
We describe an improved method for comparative modeling, RosettaCM, which optimizes a physically realistic all-atom energy function over the conformational space defined by homologous structures. Given a set of sequence alignments, RosettaCM assembles topologies by recombining aligned segments in Cartesian space and building unaligned regions de novo in torsion space. The junctions between segments are regularized using a loop closure method combining fragment superposition with gradient-based minimization. The energies of the resulting models are optimized by all-atom refinement, and the most representative low-energy model is selected. The CASP10 experiment suggests that RosettaCM yields models with more accurate side-chain and backbone conformations than other methods when the sequence identity to the templates is greater than ∼15%.
Proteins | 2007
Rhiju Das; Bin Qian; Srivatsan Raman; Robert B. Vernon; James Thompson; Philip Bradley; Sagar D. Khare; Michael D. Tyka; Divya Bhat; Dylan Chivian; David E. Kim; William Sheffler; Lars Malmström; Andrew M. Wollacott; Chu Wang; Ingemar André; David Baker
We describe predictions made using the Rosetta structure prediction methodology for both template‐based modeling and free modeling categories in the Seventh Critical Assessment of Techniques for Protein Structure Prediction. For the first time, aggressive sampling and all‐atom refinement could be carried out for the majority of targets, an advance enabled by the Rosetta@home distributed computing network. Template‐based modeling predictions using an iterative refinement algorithm improved over the best existing templates for the majority of proteins with less than 200 residues. Free modeling methods gave near‐atomic accuracy predictions for several targets under 100 residues from all secondary structure classes. These results indicate that refinement with an all‐atom energy function, although computationally expensive, is a powerful method for obtaining accurate structure predictions. Proteins 2007.
Proteins | 2005
Dylan Chivian; David E. Kim; Lars Malmström; Jack Schonbrun; Carol A. Rohl; David Baker
The Robetta server and revised automatic protocols were used to predict structures for CASP6 targets. Robetta is a publicly available protein structure prediction server (http://robetta.bakerlab.org/) that uses the Rosetta de novo and homology modeling structure prediction methods. We incorporated some of the lessons learned in the CASP5 experiment into the server prior to participating in CASP6. We additionally tested new ideas that were amenable to full‐automation with an eye toward improving the server. We find that the Robetta server shows the greatest promise for the more challenging targets. The most significant finding from CASP5, that automated protocols can be roughly comparable in ability with the better human‐intervention predictors, is repeated here in CASP6. Proteins 2005;Suppl 7:157–166.
Proteins | 2005
Philip Bradley; Lars Malmström; Bin Qian; Jack Schonbrun; Dylan Chivian; David E. Kim; Jens Meiler; Kira M.S. Misura; David Baker
We describe Rosetta predictions in the Sixth Community‐Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP), focusing on the free modeling category. Methods developed since CASP5 are described, and their application to selected targets is discussed. Highlights include improved performance on larger proteins (100–200 residues) and the prediction of a 70‐residue alpha–beta protein to near‐atomic resolution. Proteins 2005;Suppl 7:128–134.
Proteins | 2005
David E. Kim; Dylan Chivian; Lars Malmström; David Baker
Domain boundary prediction is an important step in both experimental and computational protein structure characterization. We have developed two fully automated domain parsing methods: the first, Ginzu, which we have described previously, utilizes information from homologous sequences and structures, while the second, RosettaDOM, which has not been described previously, uses only information in the query sequence. Ginzu iteratively assigns domains by homology to structures and sequence families using successively less confident methods. RosettaDOM uses the Rosetta de novo structure prediction method to build three‐dimensional models, and then applies Taylor's structure‐based domain assignment method to parse the models into domains. Domain boundaries observed repeatedly in the models are predicted to be domain boundaries for the protein. Interestingly, RosettaDOM produced quite good domain predictions for proteins of a size typically considered to be beyond the reach of de novo structure prediction methods. For remote fold recognition targets and new folds, both Ginzu and RosettaDOM produced promising results, and in some cases where one method failed to detect the correct domain boundary, it was correctly identified by the other method. We describe here the successes and failures using both methods, and address the possibility of incorporating both protocols into an improved hybrid method. Proteins 2005;Suppl 7:193–200.
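The idea that boundaries "observed repeatedly in the models" become the prediction can be sketched as a simple vote across models. This is an illustrative simplification and not the published RosettaDOM procedure: the function name, the `min_fraction` cutoff, and the exact-position matching (with no tolerance window around each residue) are all assumptions.

```python
from collections import Counter
from typing import List


def consensus_boundaries(model_boundaries: List[List[int]],
                         min_fraction: float = 0.5) -> List[int]:
    """Return residue positions proposed as boundaries by enough models.

    model_boundaries holds, for each 3D model, the residue indices at which
    a domain-assignment method split that model. A position is kept when at
    least min_fraction of the models propose it (hypothetical cutoff).
    """
    counts: Counter = Counter()
    for boundaries in model_boundaries:
        counts.update(set(boundaries))  # each model votes once per position
    needed = min_fraction * len(model_boundaries)
    return sorted(pos for pos, votes in counts.items() if votes >= needed)
```

For example, `consensus_boundaries([[100, 210], [100], [100, 305], [98]])` keeps only position 100, which three of the four models agree on; a practical version would likely cluster nearby positions such as 98 and 100 rather than require exact agreement.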
Science | 2017
Sergey Ovchinnikov; Hahnbeom Park; Neha Varghese; Po-Ssu Huang; Georgios A. Pavlopoulos; David E. Kim; Hetunandan Kamisetty; Nikos C. Kyrpides; David Baker
Filling in the protein fold picture
Fewer than a third of the 14,849 known protein families have at least one member with an experimentally determined structure. This leaves more than 5000 protein families with no structural information. Protein modeling using residue-residue contacts inferred from evolutionary data has been successful in modeling unknown structures, but it requires large numbers of aligned sequences. Ovchinnikov et al. augmented such sequence alignments with metagenome sequence data (see the Perspective by Söding). They determined the number of sequences required to allow modeling, developed criteria for model quality, and, where possible, improved modeling by matching predicted contacts to known structures. Their method predicted quality structural models for 614 protein families, of which about 140 represent newly discovered protein folds. Science, this issue p. 294; see also p. 248
Combining metagenome data with protein structure prediction generates models for 614 families with unknown structures.
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families and that metagenome sequence data more than triple the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the Protein Data Bank. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.
