Featured Research

Biomolecules

Accurate Evaluation on the Interactions of SARS-CoV-2 with Its Receptor ACE2 and Antibodies CR3022/CB6

The spread of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global health crisis. The binding affinity of SARS-CoV-2 (in particular its receptor-binding domain, RBD) for its receptor, angiotensin-converting enzyme 2 (ACE2), and for antibodies is of great importance in understanding the infectivity of COVID-19 and in evaluating candidate therapeutics for COVID-19. In this work, we propose a new method based on molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) to accurately calculate the free energy of SARS-CoV-2 RBD binding to ACE2 and antibodies. The calculated binding free energy of SARS-CoV-2 RBD to ACE2 is -13.3 kcal/mol, and that of SARS-CoV RBD to ACE2 is -11.4 kcal/mol, which agree well with the experimental results (-11.3 kcal/mol and -10.1 kcal/mol, respectively). Moreover, we take two recently reported antibodies as examples and calculate the free energy of their binding to the SARS-CoV-2 RBD, which is also consistent with the experimental findings. Further, within the framework of the modified MM/PBSA, we determine the key residues and the main driving forces for the SARS-CoV-2 RBD/CB6 interaction by computational alanine scanning. The present study offers a computationally efficient and numerically reliable method to evaluate the free energy of SARS-CoV-2 binding to other proteins, which may stimulate the development of therapeutics against COVID-19 in real applications.
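As a rough aid to reading the free energies quoted above, a binding free energy can be converted to a dissociation constant with the standard relation ΔG = RT ln(Kd / c°). The sketch below is not the MM/PBSA method of the paper. The temperature (298.15 K) and the 1 M standard state are assumptions that the abstract does not state, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant in mol/L (1 M standard state assumed)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def dg_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


# Free energies quoted in the abstract (kcal/mol).
quoted = {
    "SARS-CoV-2 RBD/ACE2, calculated": -13.3,
    "SARS-CoV-2 RBD/ACE2, experiment": -11.3,
    "SARS-CoV RBD/ACE2, calculated": -11.4,
    "SARS-CoV RBD/ACE2, experiment": -10.1,
}

for label, dg in quoted.items():
    print(f"{label}: dG = {dg:6.1f} kcal/mol -> Kd ~ {kd_from_dg(dg):.1e} M")
```

At the assumed temperature, a difference of 2 kcal/mol corresponds to roughly a 30-fold change in Kd, which is useful context when comparing calculated and experimental values.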


Accurate Protein Structure Prediction by Embeddings and Deep Learning Representations

Proteins are the major building blocks of life and the actuators of almost all chemical and biophysical events in living organisms. Their native structures in turn enable their biological functions, which have a fundamental role in drug design. This motivates predicting the structure of a protein from its sequence of amino acids, a fundamental problem in computational biology. In this work, we demonstrate state-of-the-art protein structure prediction (PSP) results using embeddings and deep learning models for prediction of backbone atom distance matrices and torsion angles. We recover the 3D coordinates of backbone atoms and reconstruct the full-atom protein by optimization. We create a new gold-standard dataset of proteins that is comprehensive and easy to use. Our dataset consists of amino acid sequences, Q8 secondary structures, position-specific scoring matrices, multiple sequence alignment co-evolutionary features, backbone atom distance matrices, torsion angles, and 3D coordinates. We evaluate the quality of our structure prediction by RMSD on the latest Critical Assessment of Techniques for Protein Structure Prediction (CASP) test data, demonstrating results competitive with the winning teams and AlphaFold in CASP13 and surpassing the results of the winning teams in CASP12. We make our data, models, and code publicly available.


Accurate protein-folding transition-path statistics from a simple free-energy landscape

A central goal of protein-folding theory is to predict the stochastic dynamics of transition paths (the rare trajectories that transit between the folded and unfolded ensembles) using only thermodynamic information, such as a low-dimensional equilibrium free-energy landscape. However, commonly used one-dimensional landscapes typically fall short of this aim, because an empirical coordinate-dependent diffusion coefficient has to be fit to transition-path trajectory data in order to reproduce the transition-path dynamics. We show that an alternative, first-principles free-energy landscape predicts transition-path statistics that agree well with simulations and single-molecule experiments without requiring dynamical data as an input. This 'topological configuration' model assumes that distinct, native-like substructures assemble on a timescale that is slower than native-contact formation but faster than the folding of the entire protein. Using only equilibrium simulation data to determine the free energies of these coarse-grained intermediate states, we predict a broad distribution of transition-path transit times that agrees well with the transition-path durations observed in simulations. We further show that both the distribution of finite-time displacements on a one-dimensional order parameter and the ensemble of transition-path trajectories generated by the model are consistent with the simulated transition paths. These results indicate that a landscape based on transient folding intermediates, which are often hidden by one-dimensional projections, can form the basis of a predictive model of protein-folding transition-path dynamics.


Adaptive Markov State Model estimation using short reseeding trajectories

In the last decade, advances in molecular dynamics (MD) and Markov State Model (MSM) methodologies have made possible accurate and efficient estimation of kinetic rates and reactive pathways for complex biomolecular dynamics occurring on slow timescales. A promising approach to enhanced sampling of MSMs is to use so-called "adaptive" methods, in which new MD trajectories are "seeded" preferentially from previously identified states. Here, we investigate the performance of various MSM estimators applied to reseeding trajectory data, for both a simple 1D free energy landscape and mini-protein folding MSMs of the WW domain and NTL9(1-39). Our results reveal the practical challenges of reseeding simulations and suggest a simple way to reweight seeding trajectory data to better estimate both thermodynamic and kinetic quantities.
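To make the estimation problem concrete, the sketch below builds the simplest possible MSM from several short discrete-state trajectories: it counts transitions at a fixed lag, row-normalizes the counts, and finds the stationary distribution by power iteration. This is a plain maximum-likelihood estimate without detailed balance, not one of the estimators or the reweighting scheme studied in the paper. The toy trajectories and all function names are hypothetical.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Sum transition counts over all trajectories at the given lag."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1.0
    return counts


def row_normalize(counts):
    """Turn a count matrix into a row-stochastic transition matrix."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("state with no outgoing counts")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(T, max_iter=10000, tol=1e-12):
    """Power iteration for the left eigenvector of T with eigenvalue 1."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical short trajectories, each seeded from a different state.
trajectories = [[0, 0, 1, 1, 0], [1, 2, 2, 1], [2, 2, 2, 1, 1]]

T = row_normalize(count_transitions(trajectories, n_states=3))
pi = stationary_distribution(T)
print("transition matrix:", T)
print("stationary distribution:", pi)
```

Because seeded trajectories start from chosen states rather than from equilibrium, the raw counts reflect the seeding choice as well as the dynamics, which is the kind of bias that motivates comparing estimators on reseeding data.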


Advances to tackle backbone flexibility in protein docking

Computational docking methods can provide structural models of protein-protein complexes, but protein backbone flexibility upon association often thwarts accurate predictions. In recent blind challenges, medium- or high-accuracy models were submitted for fewer than 20% of the "difficult" targets (those with significant backbone change or uncertainty). Here, we describe recent developments in protein-protein docking and highlight advances that tackle backbone flexibility. In molecular dynamics and Monte Carlo approaches, enhanced sampling techniques have reduced time-scale limitations. Internal coordinate formulations can now capture realistic motions of monomers and complexes using harmonic dynamics. Finally, machine learning approaches adaptively guide docking trajectories or generate novel binding site predictions from deep neural networks trained on protein interfaces. These tools poise the field to break through the longstanding challenge of correctly predicting complex structures with significant conformational change.


Advancing Standards-Free Methods for the Identification of Small Molecules in Complex Samples

The current gold standard for unambiguous identification in metabolomics analysis is based on comparing two or more orthogonal properties from the analysis of authentic, pure reference materials (standards) to experimental data acquired in the same laboratory with the same analytical methods. This represents a significant limitation for comprehensive chemical identification of small molecules in complex samples, since the process is time-consuming and costly and the majority of molecules are not yet represented by standards, leading to a need for standards-free identification. To address this need, we are advancing chemical property calculations and developing multi-attribute scoring and matching algorithms that use data from multiple analytical platforms, through the development and use of the in silico Chemical Library Engine (ISiCLE) and the Multi-Attribute Matching Engine (MAME). Here, we describe our results in a blinded analysis of synthetic chemical mixtures as part of the U.S. Environmental Protection Agency's (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). The blinded false negative rate (FNR), false discovery rate (FDR), and accuracy were 57%, 77%, and 91%, respectively. For high-confidence identifications, the FDR was 35%. After unblinding of the sample compositions, we improved our approach by optimizing the scoring parameters used to increase confidence. The final FNR, FDR, and accuracy were 67%, 53%, and 96%, respectively. For high-confidence identifications, the FDR was 10%. This study demonstrates that standards-free small molecule identification and multi-attribute matching methods can significantly reduce reliance on standards.
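For readers less familiar with the three reported metrics, the sketch below computes them from confusion-matrix counts using the usual textbook definitions: FNR = FN / (FN + TP), FDR = FP / (FP + TP), and accuracy = (TP + TN) / total. The abstract does not spell out the exact definitions used in the study, so these are an assumption, and the counts are hypothetical, not ENTACT data.

```python
def identification_metrics(tp, fp, fn, tn):
    """Return FNR, FDR and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one actual positive and one reported positive")
    total = tp + fp + fn + tn
    return {
        "FNR": fn / (fn + tp),        # share of present compounds that were missed
        "FDR": fp / (fp + tp),        # share of reported compounds that were wrong
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts: 50 compounds present, 1000 library candidates in total.
metrics = identification_metrics(tp=30, fp=10, fn=20, tn=940)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these definitions, accuracy is dominated by true negatives when the candidate library is much larger than the mixture, which is one way a high accuracy can coexist with high FNR and FDR.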


Adverse effect of ethanol on insulin dimer stability

Alcohol is widely believed to have an effect on diabetes, often considered beneficial in small amounts but detrimental in excess. The reasons are not fully known, but questions have been asked about the stability of insulin oligomers in the presence of ethanol. We compute the free energy surface (FES) and the pathway of insulin dimer dissociation in water and in 5% and 10% water-ethanol mixtures. We find that in the presence of ethanol the energy barrier of the dissociation reaction decreases by about 40%, even in the 5% water-ethanol solution. In addition, ethanol induces a significant change in the reaction pathway. We obtain estimates of the reaction rate and binding energy for all three systems, and these agree well with previous experimental results for insulin dimer dissociation in water. The computed FES in water exhibits ruggedness due to the existence of a number of intermediate states surrounded by a high and broad transition-state region. However, the presence of ethanol smooths out the ruggedness. We extracted the conformations of the intermediate states along the minimum energy pathway in all three systems and analyzed the change in microscopic structures in the presence of ethanol. Interestingly, we discover a stable intermediate state in water-ethanol mixtures where the monomers are separated (center-to-center) by about 3 nm and the contact order parameter is close to zero. This intermediate is stabilized by the distribution of ethanol and water molecules at the interface and, significantly, serves to reduce the dissociation rate constant. The solvation of the two monomers during dissociation and the proteins' departure from the native-state configuration are analyzed to obtain insight into the dimer dissociation process.


Algebraic graph learning of protein-ligand binding affinity

Although models based on algebraic graph theory have been widely applied in physical modeling and molecular studies, they are typically not competitive in the analysis and prediction of biomolecular properties when compared with other quantitative approaches. There is a need to explore the capabilities and limitations of algebraic graph theory for molecular and biomolecular modeling, analysis, and prediction. In this work, we propose novel algebraic graph learning (AGL) models that encode high-dimensional physical and biological information into intrinsically low-dimensional representations. The proposed AGL model introduces multiscale weighted colored subgraphs to describe crucial molecular and biomolecular interactions via graph invariants associated with the graph Laplacian, its pseudo-inverse, and the adjacency matrix. Additionally, the AGL models are integrated with an advanced machine learning algorithm to connect the low-dimensional graph representation of biomolecular structures with their macroscopic properties. Three popular protein-ligand binding affinity benchmarks, namely CASF-2007, CASF-2013, and CASF-2016, are employed to validate the accuracy, robustness, and reliability of the present AGL model. Numerical results indicate that the proposed AGL method outperforms other state-of-the-art methods in binding affinity prediction for protein-ligand complexes.
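The idea of a weighted colored subgraph and its Laplacian invariants can be illustrated with a small NumPy sketch. It keeps only edges between a chosen pair of element types, weights them with an exponential kernel of the distance, forms the Laplacian L = D - A, and summarizes its eigenvalues. The kernel form, the scale parameter `eta`, the toy coordinates, and the choice of summary statistics are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def colored_laplacian(coords, elements, pair=("C", "N"), eta=3.0):
    """Laplacian of the subgraph whose edges join the two element types in `pair`."""
    coords = np.asarray(coords, dtype=float)
    n = len(elements)
    adjacency = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if {elements[i], elements[j]} == set(pair):
                distance = np.linalg.norm(coords[i] - coords[j])
                # Assumed kernel: weight decays exponentially with distance.
                adjacency[i, j] = adjacency[j, i] = np.exp(-distance / eta)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_descriptors(laplacian, zero_tol=1e-9):
    """A few eigenvalue statistics usable as a fixed-length feature vector."""
    eigenvalues = np.linalg.eigvalsh(laplacian)  # symmetric matrix, real spectrum
    return {
        "sum": float(eigenvalues.sum()),
        "max": float(eigenvalues.max()),
        "mean": float(eigenvalues.mean()),
        "std": float(eigenvalues.std()),
        "n_zero": int((np.abs(eigenvalues) < zero_tol).sum()),
    }


# Hypothetical four-atom fragment (coordinates in angstroms).
elements = ["C", "N", "C", "O"]
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 3.0, 0.0]]

L = colored_laplacian(coords, elements, pair=("C", "N"))
features = laplacian_descriptors(L)
print(features)
```

In this toy case the oxygen atom is isolated in the C-N subgraph, so the Laplacian has two zero eigenvalues, one per connected component. Repeating the construction over several element pairs and kernel scales would give the kind of multiscale descriptor set that a downstream regressor could consume.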


Aligning Multiple Protein Structures using Biochemical and Biophysical Properties

Aligning multiple protein structures can yield valuable information about structural similarities among related proteins, as well as provide insight into evolutionary relationships between proteins in a family. We have developed an algorithm (msTALI) for aligning multiple protein structures using biochemical and biophysical properties, including torsion angles, secondary structure, hydrophobicity, and surface accessibility. The algorithm is a progressive alignment algorithm motivated by popular techniques from multiple sequence alignment. It has demonstrated success in aligning the major structural regions of a set of proteins from the s/r kinase family, and it was also successful at aligning functional residues of these proteins. In addition, the algorithm succeeded in aligning seven members of the acyl carrier protein family, including both experimentally derived and computationally modeled structures.


Allostery and conformational changes upon binding as generic features of proteins: a high-dimension geometrical approach

A growing body of experimental evidence shows that ligand-binding proteins generally have a potential for allosteric regulation and for further evolution. In addition, such proteins generically change their conformation upon binding. O. Rivoire has recently proposed an evolutionary scenario that explains these properties as a generic byproduct of selection for exquisite discrimination between very similar ligands. The initial claim was supported by two classes of basic examples: continuous protein models with small numbers of degrees of freedom, on which the development of a conformational switch was established, and a 2-dimensional spin glass model supporting the rest of the statement. This work aims to clarify the implications of exquisite discrimination for smooth models with a large number of degrees of freedom, a situation closer to real biological systems. With the help of differential geometry, jet-space analysis, and transversality theorems, it is shown that the claim holds true for any generic flexible system that can be described in terms of smooth manifolds. The result suggests that evolutionary solutions to the exquisite discrimination problem, if they exist, are indeed located near a codimension-1 subspace of the appropriate genotypical space. This constraint, in turn, gives rise to a potential for the allosteric regulation of the discrimination via generic conformational changes upon binding.

