Publication


Featured research published by Min-yi Shen.


Protein Science | 2006

Statistical potential for assessment and prediction of protein structures

Min-yi Shen; Andrej Sali

Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using probability theory, we derive an atomic distance‐dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non‐native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo‐electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER‐8.
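The general idea of a distance-dependent statistical potential can be illustrated with a minimal sketch. It assumes the common inverse-Boltzmann form, in which each distance bin is scored as -kT ln(p_obs / p_ref). The function name, the bin counts, and the reference probabilities are hypothetical. In particular, the reference distribution here is supplied by hand and is not DOPE's finite-sphere reference state, so this is not the DOPE implementation.

```python
import math

def statistical_potential(observed_counts, reference_probs, kT=1.0, pseudocount=1e-6):
    """Score each distance bin as -kT * ln(p_obs / p_ref).

    observed_counts: how often an atom pair was seen in each distance bin
                     across a sample of native structures (made-up here).
    reference_probs: probability of each bin for non-interacting atoms
                     (a placeholder, NOT DOPE's finite-sphere reference).
    """
    total = sum(observed_counts)
    if total <= 0:
        raise ValueError("need at least one observed distance")
    energies = []
    for count, p_ref in zip(observed_counts, reference_probs):
        p_obs = count / total
        # The pseudocount keeps the logarithm finite for empty bins.
        energies.append(-kT * math.log((p_obs + pseudocount) / (p_ref + pseudocount)))
    return energies

# Toy data: three distance bins for one hypothetical atom-pair type.
observed = [10, 60, 30]
reference = [0.2, 0.3, 0.5]
energies = statistical_potential(observed, reference)
for i, e in enumerate(energies):
    print(f"bin {i}: {e:+.3f} kT")
```

Bins that are observed more often than the reference predicts receive negative (favorable) scores, and under-represented bins receive positive scores.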


Methods in Molecular Biology | 2008

Protein Structure Modeling with MODELLER

Narayanan Eswar; David Eramian; Ben Webb; Min-yi Shen; Andrej Sali

Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. This chapter presents an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of similar protocols has resulted in models of useful accuracy for domains in more than half of all known protein sequences.


Current protocols in protein science | 2007

Comparative Protein Structure Modeling Using MODELLER

Narayanan Eswar; Ben Webb; Marc A. Marti-Renom; M.S. Madhusudhan; David Eramian; Min-yi Shen; Ursula Pieper; Andrej Sali

Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three‐dimensional (3‐D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3‐D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3‐D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target‐template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. Curr. Protoc. Protein Sci. 50:2.9.1‐2.9.31.


Protein Science | 2006

A composite score for predicting errors in protein structure models

David Eramian; Min-yi Shen; Damien P. Devos; Francisco Melo; Andrej Sali; Marc A. Marti-Renom

Reliable prediction of model accuracy is an important unsolved problem in protein structure modeling. To address this problem, we studied 24 individual assessment scores, including physics‐based energy functions, statistical potentials, and machine learning–based scoring functions. Individual scores were also used to construct ∼85,000 composite scoring functions using support vector machine (SVM) regression. The scores were tested for their abilities to identify the most native‐like models from a set of 6000 comparative models of 20 representative protein structures. Each of the 20 targets was modeled using a template of <30% sequence identity, corresponding to challenging comparative modeling cases. The best SVM score outperformed all individual scores by decreasing the average RMSD difference between the model identified as the best of the set and the model with the lowest RMSD (ΔRMSD) from 0.63 Å to 0.45 Å, while having a higher Pearson correlation coefficient to RMSD (r = 0.87) than any other tested score. The most accurate score is based on a combination of the DOPE non‐hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores. It was implemented in the SVMod program, which can now be applied to select the final model in various modeling problems, including fold assignment, target–template alignment, and loop modeling.
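The two evaluation measures named in this abstract, ΔRMSD and the Pearson correlation between score and RMSD, can be sketched as follows. This is an illustration only: the function names and the toy scores are hypothetical, and it assumes that a lower score means a better model, which the abstract does not state.

```python
import math

def delta_rmsd(scores, rmsds):
    """RMSD of the model picked by the score minus the lowest RMSD in the set.

    Assumes lower score = better model (an assumption of this sketch).
    """
    picked = min(range(len(scores)), key=lambda i: scores[i])
    return rmsds[picked] - min(rmsds)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy set of four models for one target (made-up numbers).
scores = [-120.0, -150.0, -90.0, -140.0]
rmsds = [3.0, 2.5, 6.0, 2.0]

d = delta_rmsd(scores, rmsds)
r = pearson(scores, rmsds)
print(f"delta RMSD = {d:.2f} A, r = {r:.2f}")
```

A ΔRMSD of zero would mean the score selected the most accurate model in the set.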


Journal of Molecular Biology | 2003

Investigations into Sequence and Conformational Dependence of Backbone Entropy, Inter-basin Dynamics and the Flory Isolated-pair Hypothesis for Peptides

Muhammad H. Zaman; Min-yi Shen; R. Stephen Berry; Karl F. Freed; Tobin R. Sosnick

The populations and transitions between Ramachandran basins are studied for combinations of the standard 20 amino acids in monomers, dimers and trimers using an implicit solvent Langevin dynamics algorithm and employing seven commonly used force-fields. Both the basin populations and inter-conversion rates are influenced by the nearest neighbors' conformation and identity, contrary to the Flory isolated-pair hypothesis. This conclusion is robust to the choice of force-field, even though the use of different force-fields produces large variations in the populations and inter-conversion rates between the dominant helical, extended beta, and polyproline II basins. The computed variation of conformational and dynamical properties with different force-fields exceeds the difference between explicit and implicit solvent calculations using the same force-field. For all force-fields, the inter-basin transitions exhibit a directional dependence, with most transitions going through extended beta conformation, even when it is the least populated basin. The implications of these results are discussed in the context of estimates for the backbone entropy of single residues, and for the ability of all-atom simulations to reproduce experimental protein folding data.


Biophysical Journal | 2002

Long Time Dynamics of Met-Enkephalin: Comparison of Explicit and Implicit Solvent Models

Min-yi Shen; Karl F. Freed

Met-enkephalin is one of the smallest opiate peptides. Yet, its dynamical structure and receptor docking mechanism are still not well understood. The conformational dynamics of this neuron peptide in liquid water are studied here by using all-atom molecular dynamics (MD) and implicit water Langevin dynamics (LD) simulations with AMBER potential functions and the three-site transferable intermolecular potential (TIP3P) model for water. To achieve the same simulation length in physical time, the full MD simulations require 200 times as much CPU time as the implicit water LD simulations. The solvent hydrophobicity and dielectric behavior are treated in the implicit solvent LD simulations by using a macroscopic solvation potential, a single dielectric constant, and atomic friction coefficients computed using the accessible surface area method with the TIP3P model water viscosity as determined here from MD simulations for pure TIP3P water. Both the local and the global dynamics obtained from the implicit solvent LD simulations agree very well with those from the explicit solvent MD simulations. The simulations provide insights into the conformational restrictions that are associated with the bioactivity of the opiate peptide dermorphin for the delta-receptor.
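As a rough illustration of what a Langevin dynamics step involves, the sketch below integrates a single particle in a one-dimensional harmonic well with friction and thermal noise. It is not the authors' algorithm: it uses a plain Euler-Maruyama update, a constant friction coefficient, and no solvation potential or accessible-surface-area friction, and all names and parameter values are hypothetical.

```python
import math
import random

def langevin_step(x, v, force, mass, gamma, kT, dt, rng):
    """One Euler-Maruyama step of dv = (F/m - gamma*v) dt + sqrt(2*gamma*kT/m) dW."""
    noise = math.sqrt(2.0 * gamma * kT * dt / mass) * rng.gauss(0.0, 1.0)
    v_new = v + (force(x) / mass - gamma * v) * dt + noise
    x_new = x + v_new * dt
    return x_new, v_new

def simulate(n_steps, kT, seed=0, x0=1.0, v0=0.0, k=1.0, mass=1.0, gamma=1.0, dt=0.01):
    """Run Langevin dynamics in a harmonic well F(x) = -k*x and return positions."""
    rng = random.Random(seed)  # seeded so that runs are reproducible

    def force(pos):
        return -k * pos

    x, v = x0, v0
    trajectory = [x]
    for _ in range(n_steps):
        x, v = langevin_step(x, v, force, mass, gamma, kT, dt, rng)
        trajectory.append(x)
    return trajectory

# With kT = 0 the noise vanishes and friction damps the particle to the minimum.
cold = simulate(n_steps=2000, kT=0.0)
# With kT > 0 the particle keeps fluctuating around the minimum.
warm = simulate(n_steps=2000, kT=1.0, seed=42)
print(f"final position at kT=0: {cold[-1]:.2e}")
```

The friction and noise terms stand in for the solvent, which is why an implicit solvent simulation avoids the cost of explicit water molecules.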


Proteins | 2002

All‐atom fast protein folding simulations: The villin headpiece

Min-yi Shen; Karl F. Freed

We provide a fast folding simulation using an all‐atom solute, implicit solvent method that eliminates the need for treating solvent degrees of freedom. The folding simulations for the 36‐residue villin headpiece exhibit close correspondence with the landmark all‐atom explicit solvent molecular dynamics simulations by Duan and Kollman (Duan & Kollman, Science 1998;282:740–744; Duan, Wang, & Kollman, Proc Natl Acad Sci USA 1998;95:9897–9902). Our implicit solvent approach uses only an entry‐level single CPU PC with comparable throughput (∼4 nsec/day) to the DK supercomputer simulation. The native state is shown to be stable. Our 200‐nsec folding trajectory agrees with the DK simulation in displaying a burst phase, a rapid initial shrinkage, a highly native‐like binding site structure, and more. Proteins 2002;49:439–445.


Protein Science | 2008

How well can the accuracy of comparative protein structure models be predicted?

David Eramian; Narayanan Eswar; Min-yi Shen; Andrej Sali

Comparative structure models are available for two orders of magnitude more protein sequences than are experimentally determined structures. These models, however, suffer from two limitations that experimentally determined structures do not: They frequently contain significant errors, and their accuracy cannot be readily assessed. We have addressed the latter limitation by developing a protocol optimized specifically for predicting the Cα root‐mean‐squared deviation (RMSD) and native overlap (NO3.5Å) errors of a model in the absence of its native structure. In contrast to most traditional assessment scores that merely predict one model is more accurate than others, this approach quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for intended applications. The assessment relies on a model‐specific scoring function constructed by a support vector machine. This regression optimizes the weights of up to nine features, including various sequence similarity measures and statistical potentials, extracted from a tailored training set of models unique to the model being assessed: If possible, we use similarly sized models with the same fold; otherwise, we use similarly sized models with the same secondary structure composition. This protocol predicts the RMSD and NO3.5Å errors for a diverse set of 580,317 comparative models of 6174 sequences with correlation coefficients (r) of 0.84 and 0.86, respectively, to the actual errors. This scoring function achieves the best correlation compared to 13 other tested assessment criteria that achieved correlations ranging from 0.35 to 0.71.


Nucleic Acids Research | 2006

Protein complex compositions predicted by structural similarity.

Fred P. Davis; Hannes Braberg; Min-yi Shen; Ursula Pieper; Andrej Sali; M.S. Madhusudhan

Proteins function through interactions with other molecules. Thus, the network of physical interactions among proteins is of great interest to both experimental and computational biologists. Here we present structure-based predictions of 3387 binary and 1234 higher order protein complexes in Saccharomyces cerevisiae involving 924 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined together using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential, derived from binary domain interfaces in PIBASE. The statistical potential discriminated a benchmark set of 100 interface structures from a set of sequence-randomized negative examples with a false positive rate of 3% and a true positive rate of 97%. Moreover, the predicted complexes were also filtered using functional annotation and sub-cellular localization data. The ability of the method to select the correct binding mode among alternates is demonstrated for three camelid VHH domain–porcine α-amylase interactions. We also highlight the prediction of co-complexed domain superfamilies that are not present in template complexes. Through integration with MODBASE, the application of the method to proteomes that are less well characterized than that of S. cerevisiae will contribute to expansion of the structural and functional coverage of protein interaction space. The predicted complexes are deposited in MODBASE.
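The true and false positive rates quoted in this abstract are computed by applying a score threshold to known positives and negatives. A minimal sketch, with a hypothetical function name, made-up scores, and the assumption that lower scores indicate a predicted interface:

```python
def classification_rates(positive_scores, negative_scores, threshold):
    """Return (true positive rate, false positive rate) at a score threshold.

    Assumes a score at or below the threshold is called an interface
    (an assumption of this sketch, not stated in the abstract).
    """
    true_pos = sum(s <= threshold for s in positive_scores)
    false_pos = sum(s <= threshold for s in negative_scores)
    return true_pos / len(positive_scores), false_pos / len(negative_scores)

# Toy scores: real interfaces versus sequence-randomized negatives.
native = [-3.1, -2.4, -1.9, -0.2]
randomized = [0.8, -1.2, 1.5, 2.0]

tpr, fpr = classification_rates(native, randomized, threshold=-1.0)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Moving the threshold trades one rate against the other, so a reported pair of rates corresponds to one particular cutoff.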


PLOS Computational Biology | 2005

Structural modeling of protein interactions by analogy: application to PSD-95.

Dmitry Korkin; Fred P. Davis; Frank Alber; Tinh N. Luong; Min-yi Shen; Vladan Lucic; Mary B. Kennedy; Andrej Sali

We describe comparative patch analysis for modeling the structures of multidomain proteins and protein complexes, and apply it to the PSD-95 protein. Comparative patch analysis is a hybrid of comparative modeling based on a template complex and protein docking, with a greater applicability than comparative modeling and a higher accuracy than docking. It relies on structurally defined interactions of each of the complex components, or their homologs, with any other protein, irrespective of its fold. For each component, its known binding modes with other proteins of any fold are collected and expanded by the known binding modes of its homologs. These modes are then used to restrain conventional molecular docking, resulting in a set of binary domain complexes that are subsequently ranked by geometric complementarity and a statistical potential. The method is evaluated by predicting 20 binary complexes of known structure. It is able to correctly identify the binding mode in 70% of the benchmark complexes compared with 30% for protein docking. We applied comparative patch analysis to model the complex of the third PSD-95, DLG, and ZO-1 (PDZ) domain and the SH3-GK domains in the PSD-95 protein, whose structure is unknown. In the first predicted configuration of the domains, PDZ interacts with SH3, leaving both the GMP-binding site of guanylate kinase (GK) and the C-terminus binding cleft of PDZ accessible, while in the second configuration PDZ interacts with GK, burying both binding sites. We suggest that the two alternate configurations correspond to the different functional forms of PSD-95 and provide a possible structural description for the experimentally observed cooperative folding transitions in PSD-95 and its homologs. More generally, we expect that comparative patch analysis will provide useful spatial restraints for the structural characterization of an increasing number of binary and higher-order protein complexes.

Collaboration


Top Co-Authors

Andrej Sali

University of California

David Eramian

University of California

Fred P. Davis

Howard Hughes Medical Institute

Ursula Pieper

University of California
