Yifan Song
University of Washington
Publication
Featured research published by Yifan Song.
Nature Biotechnology | 2012
Timothy A. Whitehead; Aaron Chevalier; Yifan Song; Cyrille Dreyfus; Sarel J. Fleishman; Cecilia De Mattos; Christopher A. Myers; Hetunandan Kamisetty; Patrick J. Blair; Ian A. Wilson; David Baker
We show that comprehensive sequence-function maps obtained by deep sequencing can be used to reprogram interaction specificity and to leapfrog over bottlenecks in affinity maturation by combining many individually small contributions not detectable in conventional approaches. We use this approach to optimize two computationally designed inhibitors against H1N1 influenza hemagglutinin and, in both cases, obtain variants with subnanomolar binding affinity. The most potent of these, a 51-residue protein, is broadly cross-reactive against all influenza group 1 hemagglutinins, including human H2, and neutralizes H1N1 viruses with a potency that rivals that of several human monoclonal antibodies, demonstrating that computational design followed by comprehensive energy landscape mapping can generate proteins with potential therapeutic utility.
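A sequence-function map of this kind is often summarized as one enrichment score per variant. The sketch below is an illustration only: it assumes read counts for each variant before and after selection and scores them with a log2 enrichment ratio plus a pseudocount, which is a common choice and not a statistic confirmed by the abstract. The function name and the count dictionaries are hypothetical.

```python
import math


def log2_enrichment(selected, unselected, pseudocount=0.5):
    """Per-variant log2 enrichment between two sequencing pools.

    selected / unselected: dicts mapping a variant label to its read count
    (hypothetical inputs). A positive score means the variant became more
    frequent after selection for binding.
    """
    total_selected = sum(selected.values())
    total_unselected = sum(unselected.values())
    scores = {}
    for variant, n_unselected in unselected.items():
        n_selected = selected.get(variant, 0)
        # The pseudocount keeps the ratio finite when a variant has zero reads.
        freq_selected = (n_selected + pseudocount) / total_selected
        freq_unselected = (n_unselected + pseudocount) / total_unselected
        scores[variant] = math.log2(freq_selected / freq_unselected)
    return scores


# Toy counts, invented for illustration.
scores = log2_enrichment({"A": 80, "B": 20}, {"A": 50, "B": 50})
```

Under the additional assumption that substitution effects are roughly additive, summing the scores of several individually small beneficial substitutions gives a simple ranking of combined variants.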
Structure | 2013
Yifan Song; Frank DiMaio; Raymond Y. Wang; David E. Kim; Chris Miles; T. J. Brunette; James Thompson; David Baker
We describe an improved method for comparative modeling, RosettaCM, which optimizes a physically realistic all-atom energy function over the conformational space defined by homologous structures. Given a set of sequence alignments, RosettaCM assembles topologies by recombining aligned segments in Cartesian space and building unaligned regions de novo in torsion space. The junctions between segments are regularized using a loop closure method combining fragment superposition with gradient-based minimization. The energies of the resulting models are optimized by all-atom refinement, and the most representative low-energy model is selected. The CASP10 experiment suggests that RosettaCM yields models with more accurate side-chain and backbone conformations than other methods when the sequence identity to the templates is greater than ∼15%.
Nature Chemical Biology | 2012
Sagar D. Khare; Yakov Kipnis; Per Greisen; Ryo Takeuchi; Yacov Ashani; Moshe Goldsmith; Yifan Song; Jasmine L. Gallaher; Israel Silman; Haim Leader; Joel L. Sussman; Barry L. Stoddard; Dan S. Tawfik; David Baker
The ability to redesign enzymes to catalyze noncognate chemical transformations would have wide-ranging applications. We developed a computational method for repurposing the reactivity of metalloenzyme active site functional groups to catalyze new reactions. Using this method, we engineered a zinc-containing mouse adenosine deaminase to catalyze the hydrolysis of a model organophosphate with a catalytic efficiency (k_cat/K_m) of ~10⁴ M⁻¹ s⁻¹ after directed evolution. In the high-resolution crystal structure of the enzyme, all but one of the designed residues adopt the designed conformation. The designed enzyme efficiently catalyzes the hydrolysis of the R_P isomer of a coumarinyl analog of the nerve agent cyclosarin, and it shows marked substrate selectivity for coumarinyl leaving groups. Computational redesign of native enzyme active sites complements directed evolution methods and offers a general approach for exploring their untapped catalytic potential for new reactivities.
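For readers unfamiliar with catalytic efficiency, the sketch below shows what a value of about 10^4 M^-1 s^-1 means under standard Michaelis-Menten kinetics: when the substrate concentration is far below K_m, the rate is approximately (k_cat/K_m)[E][S]. The individual k_cat and K_m values and the concentrations are hypothetical, chosen only so that their ratio matches the efficiency quoted in the abstract.

```python
def michaelis_menten_rate(kcat, km, enzyme, substrate):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]), in M/s."""
    return kcat * enzyme * substrate / (km + substrate)


def low_substrate_rate(efficiency, enzyme, substrate):
    """Approximation v ~ (kcat/Km) * [E] * [S], valid when [S] << Km."""
    return efficiency * enzyme * substrate


# Hypothetical parameters; only the ratio kcat/Km ~ 1e4 M^-1 s^-1
# comes from the abstract.
kcat = 1.0          # s^-1 (assumed)
km = 1e-4           # M (assumed)
efficiency = kcat / km
enzyme = 1e-6       # M (assumed)
substrate = 1e-7    # M, far below Km

exact = michaelis_menten_rate(kcat, km, enzyme, substrate)
approx = low_substrate_rate(efficiency, enzyme, substrate)
```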
Nature Methods | 2015
Frank DiMaio; Yifan Song; Xueming Li; Matthias J Brunner; Chunfu Xu; Vincent P. Conticello; Edward H. Egelman; Thomas C Marlovits; Yifan Cheng; David Baker
We describe a general approach for refining protein structure models on the basis of cryo-electron microscopy maps with near-atomic resolution. The method integrates Monte Carlo sampling with local density-guided optimization, Rosetta all-atom refinement and real-space B-factor fitting. In tests on experimental maps of three different systems with 4.5-Å resolution or better, the method consistently produced models with atomic-level accuracy largely independently of starting-model quality, and it outperformed the molecular dynamics–based MDFF method. Cross-validated model quality statistics correlated with model accuracy over the three test systems.
Proceedings of the National Academy of Sciences of the United States of America | 2012
Oliver F. Lange; Paolo Rossi; Nikolaos G. Sgourakis; Yifan Song; Hsiau Wei Lee; James M. Aramini; Asli Ertekin; Rong Xiao; Thomas B. Acton; Gaetano T. Montelione; David Baker
We have developed an approach for determining NMR structures of proteins over 20 kDa that utilizes sparse distance restraints obtained using transverse relaxation optimized spectroscopy experiments on perdeuterated samples to guide RASREC Rosetta NMR structure calculations. The method was tested on 11 proteins ranging from 15 to 40 kDa, seven of which were previously unsolved. The RASREC Rosetta models were in good agreement with models obtained using traditional NMR methods with larger restraint sets. In five cases X-ray structures were determined or were available, allowing comparison of the accuracy of the Rosetta models and conventional NMR models. In all five cases, the Rosetta models were more similar to the X-ray structures over both the backbone and side-chain conformations than the “best effort” structures determined by conventional methods. The incorporation of sparse distance restraints into RASREC Rosetta allows routine determination of high-quality solution NMR structures for proteins up to 40 kDa, and should be broadly useful in structural biology.
Methods in Enzymology | 2013
Andrew Leaver-Fay; Matthew J. O'Meara; Mike Tyka; Ron Jacak; Yifan Song; Elizabeth H. Kellogg; James Thompson; Ian W. Davis; Roland A. Pache; Sergey Lyskov; Jeffrey J. Gray; Tanja Kortemme; Jane S. Richardson; James J. Havranek; Jack Snoeyink; David Baker; Brian Kuhlman
Accurate energy functions are critical to macromolecular modeling and design. We describe new tools for identifying inaccuracies in energy functions and guiding their improvement, and illustrate the application of these tools to the improvement of the Rosetta energy function. The feature analysis tool identifies discrepancies between structures deposited in the PDB and low-energy structures generated by Rosetta; these likely arise from inaccuracies in the energy function. The optE tool optimizes the weights on the different components of the energy function by maximizing the recapitulation of a wide range of experimental observations. We use the tools to examine three proposed modifications to the Rosetta energy function: improving the unfolded state energy model (reference energies), using bicubic spline interpolation to generate knowledge-based torsional potentials, and incorporating the recently developed Dunbrack 2010 rotamer library (Shapovalov & Dunbrack, 2011).
Nature | 2016
Gaurav Bhardwaj; Vikram Khipple Mulligan; Christopher D. Bahl; Jason Gilmore; Peta J. Harvey; Olivier Cheneval; Garry W. Buchko; Surya V. S. R. K. Pulavarti; Quentin Kaas; Alexander Eletsky; Po-Ssu Huang; William Johnsen; Per Greisen; Gabriel J. Rocklin; Yifan Song; Thomas W. Linsky; Andrew M. Watkins; Stephen A. Rettie; Xianzhong Xu; Lauren Carter; Richard Bonneau; James M. Olson; Colin Correnti; Thomas Szyperski; David J. Craik; David Baker
Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18–47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N–C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.
Cell | 2014
Erik Procko; Geoffrey Y. Berguig; Betty W. Shen; Yifan Song; Shani L. Frayo; Anthony J. Convertine; Daciana Margineantu; Garrett C. Booth; Bruno E. Correia; Yuanhua Cheng; William R. Schief; David M. Hockenbery; Oliver W. Press; Barry L. Stoddard; Patrick S. Stayton; David Baker
Because apoptosis of infected cells can limit virus production and spread, some viruses have co-opted prosurvival genes from the host. This includes the Epstein-Barr virus (EBV) gene BHRF1, a homolog of human Bcl-2 proteins that block apoptosis and are associated with cancer. Computational design and experimental optimization were used to generate a novel protein called BINDI that binds BHRF1 with picomolar affinity. BINDI recognizes the hydrophobic cleft of BHRF1 in a manner similar to other Bcl-2 protein interactions but makes many additional contacts to achieve exceptional affinity and specificity. BINDI induces apoptosis in EBV-infected cancer lines, and when delivered with an antibody-targeted intracellular delivery carrier, BINDI suppressed tumor growth and extended survival in a xenograft disease model of EBV-positive human lymphoma. High-specificity-designed proteins that selectively kill target cells may provide an advantage over the toxic compounds used in current generation antibody-drug conjugates.
Proteins | 2014
David E. Kim; Frank DiMaio; Raymond Y. Wang; Yifan Song; David Baker
A number of methods have been described for identifying pairs of contacting residues in protein three‐dimensional structures, but it is unclear how many contacts are required for accurate structure modeling. The CASP10 assisted contact experiment provided a blind test of contact-guided protein structure modeling. We describe the models generated for these contact-guided prediction challenges using the Rosetta structure modeling methodology. For nearly all cases, the submitted models had the correct overall topology, and in some cases, they had near atomic‐level accuracy; for example, the model of the 384-residue homo‐oligomeric tetramer (Tc680o) had only 2.9 Å root‐mean‐square deviation (RMSD) from the crystal structure. Our results suggest that experimental and bioinformatic methods for obtaining contact information may need to generate only one correct contact for every 12 residues in the protein to allow accurate topology-level modeling. Proteins 2014; 82(Suppl 2):208–218.
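The two numbers quoted in this abstract can be made concrete with a short sketch. The first function applies the "one correct contact per 12 residues" rule of thumb to a chain length; the second computes RMSD between two coordinate sets. Both function names are hypothetical, and the RMSD helper assumes the structures are already optimally superimposed and have matched atoms, which a real comparison would need to handle separately.

```python
import math


def min_contacts_for_topology(n_residues, residues_per_contact=12):
    """Correct contacts suggested by the one-per-12-residues rule of thumb."""
    return math.ceil(n_residues / residues_per_contact)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched (x, y, z) coordinates.

    Assumes both sets are already superimposed and listed in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# The 384-residue target mentioned in the abstract.
contacts_needed = min_contacts_for_topology(384)

# Toy coordinates in angstroms, invented for illustration.
toy_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)])
```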
Proteins | 2011
Yifan Song; Michael D. Tyka; Andrew Leaver-Fay; James Thompson; David Baker
Accurate modeling of biomolecular systems requires accurate forcefields. Widely used molecular mechanics (MM) forcefields obtain parameters from experimental data and quantum chemistry calculations on small molecules but do not have a clear way to take advantage of the information in high‐resolution macromolecular structures. In contrast, knowledge‐based methods largely ignore the physical chemistry of interatomic interactions, and instead derive parameters almost exclusively from macromolecular structures. This can involve considerable double counting of the same physical interactions. Here, we describe a method for forcefield improvement that combines the strengths of the two approaches. We use this method to improve the Rosetta all‐atom forcefield, in which the total energy is expressed as the sum of terms representing different physical interactions as in MM forcefields and the parameters are tuned to reproduce the properties of macromolecular structures. To resolve inaccuracies resulting from possible double counting of interactions, we compare distribution functions from low‐energy modeled structures to those from crystal structures. The structural and physical bases of the deviations between the modeled and reference structures are identified and used to guide forcefield improvements. We describe improvements resolving double counting between backbone hydrogen bond interactions and Lennard‐Jones interactions in helices; between sidechain‐backbone hydrogen bonds and the backbone torsion potential; and between the sidechain torsion potential and Lennard‐Jones interactions. Discrepancies between computed and observed distributions are also used to guide the incorporation of an explicit Cα‐hydrogen bond in β sheets. The method can be used generally to integrate different sources of information for forcefield improvement. Proteins 2011;
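The comparison of distribution functions described above can be sketched as a per-bin difference between two normalized histograms of a geometric feature, one from low-energy modeled structures and one from crystal structures. This is a minimal illustration, assuming a simple histogram estimator and hypothetical function names; the abstract does not specify how the distributions were actually computed or compared.

```python
def normalized_histogram(values, lo, hi, n_bins):
    """Fraction of values falling in each of n_bins equal bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for value in values:
        if lo <= value < hi:
            # Clamp the index to guard against floating-point round-up at the edge.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [count / total for count in counts]


def distribution_deviation(modeled, reference, lo, hi, n_bins):
    """Per-bin difference (modeled minus reference) between two distributions.

    Bins with large positive values are over-populated in the models, which
    may point to an interaction the energy function counts twice.
    """
    modeled_hist = normalized_histogram(modeled, lo, hi, n_bins)
    reference_hist = normalized_histogram(reference, lo, hi, n_bins)
    return [m - r for m, r in zip(modeled_hist, reference_hist)]


# Toy hydrogen-bond distances in angstroms, invented for illustration.
deviation = distribution_deviation(
    modeled=[1.9, 1.9, 2.1, 2.1],
    reference=[1.9, 2.1, 2.1, 2.1],
    lo=1.8, hi=2.2, n_bins=2,
)
```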