Featured Research

Biomolecules

Choice of adaptive sampling strategy impacts state discovery, transition probabilities, and the apparent mechanism of conformational changes

Interest in equilibrium-based sampling methods has grown with recent advances in computational hardware and Markov state modeling (MSM) methods, yet outstanding questions remain that hinder widespread adoption. Namely, how do sampling strategies explore conformational space and how might this influence predictions? Here, we seek to answer these questions for four commonly used sampling methods: 1) a long simulation, 2) many short simulations, 3) adaptive sampling, and 4) FAST. We first develop a theoretical framework for analytically calculating the probability of discovering states and uncover the drastic effects of varying the number and length of simulations. We then use kinetic Monte Carlo simulations on a variety of physically inspired landscapes to characterize state discovery and transition pathways. Consistently, we find that FAST simulations discover target states with the highest probability and traverse realistic pathways. Furthermore, we uncover the pathology that short parallel simulations sometimes predict an incorrect transition pathway by crossing large energy barriers that long simulations would typically circumnavigate, which we refer to as pathway tunneling. To protect against tunneling, we introduce FAST-string, which samples along the highest-flux transition paths to refine an MSM's transition probabilities and discriminate between competing pathways. Additionally, we compare MSM estimators in describing thermodynamics and kinetics. For adaptive sampling, we recommend normalizing the transition counts out of each state after adding pseudo-counts to avoid creating sources or sinks. Lastly, we evaluate our insights from simple landscapes with all-atom molecular dynamics simulations of the folding of the λ-repressor protein. We find that FAST-contacts predicts the same folding pathway as long simulations but with orders of magnitude less simulation time.
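The estimator recommendation in the abstract above (add pseudo-counts, then normalize the transition counts out of each state) can be illustrated with a minimal sketch. The function name, the pseudo-count value, and the example count matrix are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def row_normalized_transition_matrix(counts, pseudo_count=1.0):
    """Add a pseudo-count to every entry, then normalize each row.

    counts[i][j] is the number of observed transitions from state i to state j.
    The pseudo_count value is an illustrative assumption.
    """
    counts = np.asarray(counts, dtype=float)
    smoothed = counts + pseudo_count
    # Normalizing each row makes the outgoing probabilities of every state sum to 1.
    return smoothed / smoothed.sum(axis=1, keepdims=True)

# Hypothetical counts: state 2 was discovered but never simulated from,
# so it has no outgoing transitions before pseudo-counts are added.
counts = [[90, 10, 0],
          [5, 80, 15],
          [0, 0, 0]]
T = row_normalized_transition_matrix(counts, pseudo_count=1.0)
```

Because every entry is positive after smoothing, no state in this sketch ends up with an empty or purely self-absorbing row, which is one way a poorly sampled state could otherwise behave as a sink.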

Biomolecules

Circular spectropolarimetric sensing of higher plant and algal chloroplast structural variations

Photosynthetic eukaryotes show a remarkable variability in photosynthesis, including large differences in light harvesting proteins and pigment composition. In vivo circular spectropolarimetry enables us to probe the molecular architecture of photosynthesis in a non-invasive and non-destructive way and, as such, can offer a wealth of physiological and structural information. In the present study we have measured the circular polarizance of several multicellular green, red and brown algae and higher plants, which show large variations in circular spectropolarimetric signals with differences in both spectral shape and magnitude. Many of the algae display spectral characteristics not previously reported, indicating a larger variation in molecular organization than previously assumed. As the strengths of these signals vary by three orders of magnitude, these results also have important implications in terms of detectability for the use of circular polarization as a signature of life.

Biomolecules

Citrate stabilized gold nanoparticles interfere with amyloid fibril formation: D76N and ΔN6 β2-microglobulin variants

Protein aggregation, including the formation of dimers and multimers in solution, underlies an array of human diseases such as systemic amyloidosis, a fatal disease caused by misfolding of native globular proteins that damages the structure and function of affected organs. Different kinds of interactors can interfere with the formation of protein dimers and multimers in solution. A very special class of interactors are nanoparticles, thanks to the extremely efficient extension of their interaction surface. In particular, citrate-coated gold nanoparticles (cit-AuNPs) were recently investigated with the amyloidogenic protein β2-microglobulin (β2m). Here we present computational studies on two challenging models known for their enhanced amyloidogenic propensity, namely the naturally occurring ΔN6 and D76N β2m variants, and disclose the role of cit-AuNPs in their fibrillogenesis. The proposed interaction mechanism lies in the interference of the cit-AuNPs with the protein dimers at the early stages of aggregation, which induces dimer disassembly. As a consequence, natural fibril formation can be inhibited. Relying on the comparison between atomistic simulations at multiple levels (enhanced sampling molecular dynamics and Brownian dynamics) and protein structural characterisation by NMR, we demonstrate that the cit-AuNP interactors are able to inhibit protein dimer assembly. As a consequence, natural fibril formation is also inhibited, as found in experiment.

Biomolecules

Classification of crystallization outcomes using deep convolutional neural networks

The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications.

Biomolecules

Clustering Bioactive Molecules in 3D Chemical Space with Unsupervised Deep Learning

Unsupervised clustering has broad applications in data stratification, pattern investigation and new discovery beyond existing knowledge. In particular, clustering of bioactive molecules facilitates chemical space mapping, structure-activity studies, and drug discovery. These tasks, conventionally conducted by similarity-based methods, are complicated by data complexity and diversity. We explored the superior learning capability of deep autoencoders for unsupervised clustering of 1.39 million bioactive molecules into band-clusters in a 3-dimensional latent chemical space. These band-clusters, displayed by a space-navigation simulation software, band molecules of selected bioactivity classes into individual band-clusters possessing unique sets of common sub-structural features beyond structural similarity. These sub-structural features form the frameworks of the literature-reported pharmacophores and privileged fragments. Within each band-cluster, molecules are further banded into selected sub-regions with respect to their bioactivity target, sub-structural features and molecular scaffolds. Our method is potentially applicable for big data clustering tasks of different fields.

Biomolecules

Coarse-Grained Nucleic Acid-Protein Model for Hybrid Nanotechnology

The emerging field of hybrid DNA-protein nanotechnology brings with it the potential for many novel materials which combine the addressability of DNA nanotechnology with the versatility of protein interactions. However, the design and computational study of these hybrid structures is difficult due to the system sizes involved. To aid in the design and in silico analysis process, we introduce here a coarse-grained DNA/RNA-protein model that extends the oxDNA/oxRNA models of DNA/RNA with a coarse-grained model of proteins based on an anisotropic network model representation. Fully equipped with analysis scripts and visualization, our model aims to facilitate hybrid nanomaterial design towards eventual experimental realization, as well as enabling study of biological complexes. We further demonstrate its usage by simulating a DNA-protein nanocage, DNA wrapped around histones, and a nascent RNA in polymerase.

Biomolecules

Coarse-Grained Residue-Based Models of Disordered Protein Condensates: Utility and Limitations of Simple Charge Pattern Parameters

Biomolecular condensates undergirded by phase separations of proteins and nucleic acids serve crucial biological functions. To gain physical insights into their genetic basis, we study how liquid-liquid phase separation (LLPS) of intrinsically disordered proteins (IDPs) depends on their sequence charge patterns using a continuum Langevin chain model wherein each amino acid residue is represented by a single bead. Charge patterns are characterized by the 'blockiness' measure κ and the 'sequence charge decoration' (SCD) parameter. Consistent with random phase approximation (RPA) theory and lattice simulations, LLPS propensity as characterized by the critical temperature T*_cr increases with increasingly negative SCD for a set of sequences showing a positive correlation between κ and −SCD. Relative to RPA, the simulated sequence-dependent variation in T*_cr is often, though not always, smaller, whereas the simulated critical volume fractions are higher. However, for a set of sequences exhibiting an anti-correlation between κ and −SCD, the simulated T*_cr values are quite insensitive to either parameter. Additionally, we find that blocky sequences that allow for strong electrostatic repulsion can lead to coexistence curves with upward concavity as stipulated by RPA, but the LLPS propensity of a strictly alternating charge sequence was likely overestimated by RPA and lattice models because interchain stabilization of this sequence requires spatial alignments that are difficult to achieve in real space. These results help delineate the utility and limitations of the charge pattern parameters and of RPA, pointing to further efforts necessary for rationalizing the newly observed subtleties.
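For readers unfamiliar with the SCD parameter named in the abstract above, the following is a minimal sketch using the commonly cited definition, SCD = (1/N) Σ_{i<j} q_i q_j |j − i|^{1/2}, where q_i is the charge of residue i and N is the chain length. The function name and the two 50-residue example sequences are assumptions for illustration; they are not the sequences studied in the paper.

```python
import numpy as np

def sequence_charge_decoration(charges):
    """SCD = (1/N) * sum over pairs i < j of q_i * q_j * sqrt(j - i)."""
    q = np.asarray(charges, dtype=float)
    n = len(q)
    i, j = np.triu_indices(n, k=1)  # index arrays for all residue pairs with i < j
    return float(np.sum(q[i] * q[j] * np.sqrt(j - i)) / n)

# Hypothetical 50-residue sequences with zero net charge.
alternating = [+1, -1] * 25          # strictly alternating charges
diblock = [+1] * 25 + [-1] * 25      # all positive charges, then all negative

scd_alternating = sequence_charge_decoration(alternating)
scd_diblock = sequence_charge_decoration(diblock)
```

In this sketch the diblock sequence gives a much more negative SCD than the strictly alternating one, which matches the qualitative sense in which SCD tracks charge segregation along the chain.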

Biomolecules

Combinatorial Control through Allostery

Many instances of cellular signaling and transcriptional regulation involve switch-like molecular responses to the presence or absence of input ligands. To understand how these responses come about and how they can be harnessed, we develop a statistical mechanical model to characterize the types of Boolean logic that can arise from allosteric molecules following the Monod-Wyman-Changeux (MWC) model. Building upon previous work, we show how an allosteric molecule regulated by two inputs can elicit AND, OR, NAND and NOR responses, but is unable to realize XOR or XNOR gates. Next, we demonstrate the ability of an MWC molecule to perform ratiometric sensing - a response behavior where activity depends monotonically on the ratio of ligand concentrations. We then extend our analysis to more general schemes of combinatorial control involving either additional binding sites for the two ligands or an additional third ligand and show how these additions can cause a switch in the logic behavior of the molecule. Overall, our results demonstrate the wide variety of control schemes that biological systems can implement using simple mechanisms.
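As a rough illustration of how a two-input MWC molecule can behave like an AND gate, the sketch below evaluates the standard two-state MWC activity for one binding site per ligand. The sign convention for the energy difference, the function name, and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def mwc_p_active(c1, c2, k_a1, k_i1, k_a2, k_i2, delta_eps):
    """Probability that a two-ligand MWC molecule is in its active state.

    k_a*, k_i*: dissociation constants of each ligand in the active and
                inactive states (same units as the concentrations c1, c2).
    delta_eps:  inactive-state energy minus active-state energy, in units
                of kT (assumed convention; negative favors the inactive state).
    """
    active = (1 + c1 / k_a1) * (1 + c2 / k_a2)
    inactive = math.exp(-delta_eps) * (1 + c1 / k_i1) * (1 + c2 / k_i2)
    return active / (active + inactive)

# Hypothetical parameters: both ligands bind the active state 1000x more
# tightly, and the ligand-free molecule strongly prefers the inactive state.
params = dict(k_a1=1.0, k_i1=1e3, k_a2=1.0, k_i2=1e3, delta_eps=-10.0)
high = 1e6  # saturating ligand concentration in the same arbitrary units

p_none = mwc_p_active(0.0, 0.0, **params)
p_only_1 = mwc_p_active(high, 0.0, **params)
p_only_2 = mwc_p_active(0.0, high, **params)
p_both = mwc_p_active(high, high, **params)
```

With these assumed parameters, activity stays low unless both ligands are present at saturating concentration, which is the AND-like response; other choices of the energy difference and binding constants would be needed to sketch OR, NAND, or NOR behavior.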

Biomolecules

Combining Alchemical Transformation with Physical Pathway to Accurately Compute Absolute Binding Free Energy

We present a new method that combines alchemical transformation with a physical pathway to accurately and efficiently compute the absolute binding free energy of a receptor-ligand complex. Currently, the double decoupling method (DDM) and the potential of mean force (PMF) approach are widely used to compute the absolute binding free energy of biomolecules. The DDM relies on alchemically decoupling the ligand from its environments, which can be computationally challenging for large ligands and charged ligands because of the large magnitude of the decoupling free energies involved. On the other hand, the PMF approach uses a physical pathway to extract the ligand out of the binding site, thus avoiding the alchemical decoupling of the ligand. However, the PMF method has its own drawback because of its reliance on a ligand binding/unbinding pathway free of steric obstruction from the receptor atoms. Therefore, in the presence of deeply buried ligand functional groups, the convergence of the PMF calculation can be very slow, leading to large errors in the computed binding free energy. Here we develop a new method called AlchemPMF by combining alchemical transformation with a physical pathway to overcome the major drawback of the PMF method. We have tested the new approach on the binding of a charged ligand to an allosteric site on HIV-1 Integrase. After 20 ns of simulation per umbrella sampling window, the new method yields absolute binding free energies within ~1 kcal/mol of the experimental result, whereas the standard PMF approach and the DDM calculations result in errors of ~5 kcal/mol and > 2 kcal/mol, respectively. Furthermore, the binding free energy computed using the new method is associated with a smaller statistical error compared with those obtained from the existing methods.

Biomolecules

Combining docking pose rank and structure with deep learning improves protein-ligand binding mode prediction

We present a simple, modular graph-based convolutional neural network that takes structural information from protein-ligand complexes as input to generate models for activity and binding mode prediction. Complex structures are generated by a standard docking procedure and fed into a dual-graph architecture that includes separate sub-networks for the ligand bonded topology and the ligand-protein contact map. This network division allows contributions from ligand identity to be distinguished from effects of protein-ligand interactions on classification. We show, in agreement with recent literature, that dataset bias drives many of the promising results on virtual screening that have previously been reported. However, we also show that our neural network is capable of learning from protein structural information when, as in the case of binding mode prediction, an unbiased dataset is constructed. We develop a deep learning model for binding mode prediction that uses docking ranking as input in combination with docking structures. This strategy mirrors past consensus models and outperforms the baseline docking program in a variety of tests, including on cross-docking datasets that mimic real-world docking use cases. Furthermore, the magnitudes of network predictions serve as reliable measures of model confidence.
