Featured Research

Biomolecules

Iterative Annealing Mechanism Explains the Functions of the GroEL and RNA Chaperones

Molecular chaperones are ATP-consuming biological machines that facilitate the folding of proteins and RNA molecules that are kinetically trapped in misfolded states for long times. Unassisted folding occurs by the kinetic partitioning mechanism, according to which folding to the native state (with low probability) and misfolding to one of the many metastable states (with high probability) occur rapidly on similar time scales. GroEL is an all-purpose stochastic machine that assists misfolded substrate proteins (SPs) to fold. The RNA chaperones (CYT-19) help the folding of ribozymes that readily misfold. GroEL does not interact with folded proteins, but CYT-19 disrupts both folded and misfolded ribozymes. Despite this major difference, the Iterative Annealing Mechanism (IAM) quantitatively explains all the available experimental data for assisted folding of proteins and ribozymes. Driven by ATP binding and hydrolysis and GroES binding, GroEL undergoes a catalytic cycle during which it samples three allosteric states, referred to as T (apo), R (ATP bound), and R'' (ADP bound). In accord with the IAM predictions, analyses of the experimental data show that the efficiency of the GroEL-GroES machinery and mutants is determined by the resetting rate $k_{R'' \to T}$, which is largest for wild-type GroEL. The generalized IAM accurately predicts the folding kinetics of the Tetrahymena ribozyme and its variants. Chaperones maximize the product of the folding rate and the steady-state yield of the native state by driving the substrates out of equilibrium. Neither the absolute yield nor the folding rate is optimized.
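The iterative idea can be illustrated with a toy calculation. The sketch below is not the model from the paper: it assumes that every chaperone cycle is independent, that a fixed fraction `phi` of the still-misfolded substrate reaches the native state in each cycle, and that folded molecules are left untouched (the GroEL case). The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def native_yield(phi: float, n_cycles: int) -> float:
    """Fraction of substrate folded after n_cycles independent annealing cycles.

    phi is the (assumed constant) fraction of misfolded molecules that
    partitions to the native state in a single cycle.
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    misfolded = 1.0
    for _ in range(n_cycles):
        # Each cycle rescues a fraction phi of what is still misfolded.
        misfolded *= 1.0 - phi
    return 1.0 - misfolded


def cycles_needed(phi: float, target: float) -> int:
    """Smallest number of cycles for which the yield reaches target."""
    if not (0.0 < phi < 1.0 and 0.0 < target < 1.0):
        raise ValueError("phi and target must lie strictly between 0 and 1")
    # Solve 1 - (1 - phi)**n >= target for n.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - phi))


if __name__ == "__main__":
    phi = 0.05  # illustrative value, not taken from the abstract
    n = cycles_needed(phi, 0.9)
    print(n, native_yield(phi, n))
```

Under these assumptions a small per-cycle success probability still gives a high yield after many iterations, which is why the rate of resetting the machine for the next cycle matters.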

Biomolecules

Key biology you should have learned in physics class: Using ideal-gas mixtures to understand biomolecular machines

The biological cell exhibits a fantastic range of behaviors, but ultimately these are governed by a handful of physical and chemical principles. Here we explore a simple theory, known for decades and based on the thermodynamics of mixtures of ideal gases, which illuminates several key functions performed within the cell. Our focus is the free-energy-driven import and export of molecules, such as nutrients and other vital compounds, via transporter proteins. Complementary to the thermodynamic picture is a description of transporters via "mass-action" chemical kinetics, which lends further insight into biological machinery and free-energy use. Both thermodynamic and kinetic descriptions can shed light on the fundamental non-equilibrium aspects of transport. On the whole, our biochemical-physics discussion will remain agnostic to chemical details, but we will see how such details ultimately enter a physical description through the example of the cellular fuel ATP.
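As a rough illustration of the ideal-mixture picture, the sketch below computes the free-energy change per molecule for moving a solute from an outside concentration `c_out` to an inside concentration `c_in`, using the standard ideal-solution result Δμ = k_B T ln(c_in / c_out). The function names are hypothetical, and the 1:1 coupling between a driving process and one transported molecule is an assumption made here for simplicity, not a statement from the abstract.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def transport_free_energy(c_in: float, c_out: float, temperature: float = 298.15) -> float:
    """Ideal-mixture free-energy change (J) per molecule moved from outside to inside.

    Positive values mean the move is uphill and needs a free-energy source.
    Concentrations may be in any unit as long as both use the same one.
    """
    if c_in <= 0 or c_out <= 0:
        raise ValueError("concentrations must be positive")
    return K_B * temperature * math.log(c_in / c_out)


def max_concentration_ratio(drive_in_kT: float) -> float:
    """Largest c_in / c_out sustainable by a driving free energy given in units of k_B T.

    Assumes one transported molecule per driving event (hypothetical 1:1 coupling).
    """
    return math.exp(drive_in_kT)


if __name__ == "__main__":
    # Cost, in units of k_B T, of concentrating a solute tenfold.
    dg = transport_free_energy(10.0, 1.0)
    print(dg / (K_B * 298.15))
```

The same expression read in reverse gives the steepest gradient a given free-energy source can maintain, which is the sense in which the thermodynamic description bounds what a transporter can do.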

Biomolecules

Kinematic Flexibility Analysis: Hydrogen Bonding Patterns Impart a Spatial Hierarchy of Protein Motion

Elastic network models (ENM) and constraint-based, topological rigidity analysis are two distinct, coarse-grained approaches to study conformational flexibility of macromolecules. In the two decades since their introduction, both have contributed significantly to insights into protein molecular mechanisms and function. However, despite the shared purpose of these approaches, the topological nature of rigidity analysis, and thereby the absence of motion modes, has impeded a direct comparison. Here, we present an alternative, kinematic approach to rigidity analysis, which circumvents these drawbacks. We introduce a novel protein hydrogen bond network spectral decomposition, which provides an orthonormal basis for collective motions modulated by non-covalent interactions, analogous to the eigenspectrum of normal modes, and decomposes proteins into rigid clusters identical to those from topological rigidity. Our kinematic flexibility analysis bridges topological rigidity theory and ENM, and enables a detailed analysis of motion modes obtained from both approaches. Our analysis reveals that the collectivity of protein motions, reported by the Shannon entropy, is significantly lower for rigidity theory than for normal mode approaches. Strikingly, kinematic flexibility analysis suggests that the hydrogen bonding network encodes a protein-fold specific, spatial hierarchy of motions, which goes nearly undetected in ENM. This hierarchy reveals distinct motion regimes that rationalize protein stiffness changes observed in experiment and molecular dynamics simulations. A formal expression for changes in free energy derived from the spectral decomposition indicates that motions across nearly 40% of modes obey enthalpy-entropy compensation. Taken together, our analysis suggests that hydrogen bond networks have evolved to modulate protein structure and dynamics.
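The abstract does not spell out how collectivity is computed. A commonly used Shannon-entropy definition takes the normalized squared displacement of each atom in a mode as a probability and exponentiates the entropy; the sketch below assumes that definition, which may differ in detail from the one used by the authors. The function name is hypothetical.

```python
import math


def collectivity(mode):
    """Shannon-entropy collectivity of one motion mode.

    mode is a sequence of (x, y, z) displacements, one per atom.
    Returns a value between 1/N (one atom moves) and 1 (all atoms move equally).
    """
    # Squared displacement amplitude of each atom.
    weights = [x * x + y * y + z * z for x, y, z in mode]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("mode has zero amplitude")
    # Entropy of the normalized weights; zero-weight atoms contribute nothing.
    entropy = -sum((w / total) * math.log(w / total) for w in weights if w > 0.0)
    return math.exp(entropy) / len(weights)


if __name__ == "__main__":
    uniform = [(1.0, 0.0, 0.0)] * 4           # every atom moves equally
    localized = [(1.0, 0.0, 0.0)] + [(0.0, 0.0, 0.0)] * 3  # a single atom moves
    print(collectivity(uniform), collectivity(localized))
```

With this measure, a mode confined to a small rigid cluster scores close to 1/N, while a global breathing or hinge motion scores close to 1, so lower values indicate more localized motion.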

Biomolecules

Knoto-ID: a tool to study the entanglement of open protein chains using the concept of knotoids

The backbone of most proteins forms an open curve. To study their entanglement, a common strategy is to search for knots in their backbones using topological invariants. However, this approach requires closing the curve into a loop, which alters the geometry of the curve. Knoto-ID evaluates the entanglement of open curves without the need to close them, using the recent concept of knotoids, a generalization of classical knot theory to open curves. Knoto-ID can analyse the global topology of the full chain as well as the local topology, either by exhaustively studying all subchains or by determining only the knotted core. Knoto-ID makes it possible to localize topologically non-trivial protein folds that are not detected by informatics tools for detecting knotted protein folds.

Biomolecules

Knotted Proteins: Tie Etiquette in Structural Biology

A small fraction of all protein structures characterized so far are entangled. The challenge of understanding the properties of these knotted proteins, and the why and the how of their natural folding process, has been taken up in the past decade with different approaches, such as structural characterization, in vitro experiments, and simulations of protein models with varying levels of complexity. The simplest among these are the lattice Gō models, which belong to the class of structure-based models, i.e., models that are biased to the native structure by explicitly including structural data. In this review we highlight the contributions to the field made in the scope of lattice Gō models, putting them into perspective in the context of the main experimental and theoretical results and of other, more realistic, computational approaches.

Biomolecules

Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks

Due to the current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, there is an urgent need for novel therapies and drugs. We conducted a large-scale virtual screening for small molecules that are potential CoV-2 inhibitors. To this end, we utilized "ChemAI", a deep neural network trained on more than 220M data points across 3.6M molecules from three public drug-discovery databases. With ChemAI, we screened and ranked one billion molecules from the ZINC database for favourable effects against CoV-2. We then reduced the result to the 30,000 top-ranked compounds, which are readily accessible and purchasable via the ZINC database. Additionally, we screened the DrugBank using ChemAI to allow for drug repurposing, which would be a fast way towards a therapy. We provide these top-ranked compounds of ZINC and DrugBank as a library for further screening with bioassays at this https URL.

Biomolecules

Learning from Protein Structure with Geometric Vector Perceptrons

Learning on 3D structures of large biomolecules is emerging as a distinct area in machine learning, but there has yet to emerge a unifying network architecture that simultaneously leverages the graph-structured and geometric aspects of the problem domain. To address this gap, we introduce geometric vector perceptrons, which extend standard dense layers to operate on collections of Euclidean vectors. Graph neural networks equipped with such layers are able to perform both geometric and relational reasoning on efficient and natural representations of macromolecular structure. We demonstrate our approach on two important problems in learning from protein structure: model quality assessment and computational protein design. Our approach improves over existing classes of architectures, including state-of-the-art graph-based and voxel-based methods. We release our code at this https URL.
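To make the idea of a dense layer acting on Euclidean vectors concrete, the following is a simplified pure-Python sketch loosely following the published description of a geometric vector perceptron; it is not the released code. Vector channels are mixed only by linear combinations, scalar outputs see the vectors only through their norms, and the output vectors are rescaled by a gate computed from their norms. The class name `GVPSketch`, the helper names, the weight initialization and the choice of ReLU and sigmoid nonlinearities are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_norms(m):
    """Euclidean norm of each row."""
    return [math.sqrt(sum(x * x for x in row)) for row in m]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class GVPSketch:
    """Toy layer mapping (scalars, 3D vectors) to (scalars, 3D vectors)."""

    def __init__(self, n_scalar, n_vector, hidden, out_scalar, out_vector, seed=0):
        rng = random.Random(seed)
        self.w_h = random_matrix(hidden, n_vector, rng)       # mixes input vector channels
        self.w_mu = random_matrix(out_vector, hidden, rng)    # produces output vector channels
        self.w_m = random_matrix(out_scalar, hidden + n_scalar, rng)
        self.bias = [0.0] * out_scalar

    def __call__(self, scalars, vectors):
        v_h = matmul(self.w_h, vectors)     # hidden x 3
        v_mu = matmul(self.w_mu, v_h)       # out_vector x 3
        # Scalars only see rotation-invariant norms of the hidden vectors.
        features = row_norms(v_h) + list(scalars)
        pre = [sum(w * f for w, f in zip(row, features)) + b
               for row, b in zip(self.w_m, self.bias)]
        s_out = [max(0.0, p) for p in pre]                     # ReLU
        # Gate each output vector by a sigmoid of its own norm.
        gates = [1.0 / (1.0 + math.exp(-n)) for n in row_norms(v_mu)]
        v_out = [[g * x for x in row] for g, row in zip(gates, v_mu)]
        return s_out, v_out


def rotate_z(vectors, angle):
    """Rotate every 3D vector about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y, s * x + c * y, z] for x, y, z in vectors]


if __name__ == "__main__":
    layer = GVPSketch(n_scalar=2, n_vector=3, hidden=4, out_scalar=5, out_vector=2)
    s_out, v_out = layer([0.5, -1.0], [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 1.0]])
    print(len(s_out), len(v_out))
```

Because the vectors are only combined linearly and scaled by functions of their norms, rotating the input vectors leaves the scalar outputs unchanged and rotates the vector outputs by the same rotation, which is the geometric property such a layer is meant to preserve.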

Biomolecules

Ligand-induced oligomerization of the human GPCR neurotensin receptor 1 monitored in living HEK293T cells

The human neurotensin receptor 1 (NTSR1) is a G protein-coupled receptor that can be expressed in HEK293T cells by stable transfection. Its ligand is a 13-amino-acid peptide that binds to NTSR1 with nanomolar affinity from the extracellular side. Ligand binding induces conformational changes that trigger the intracellular signaling processes. Recent single-molecule studies revealed a dynamic monomer-dimer equilibrium of the receptor in vitro. Here we report on the oligomerization state of the human NTSR1 in the plasma membrane of HEK293T cells in vivo. We fused the fluorescent marker proteins mRuby3 or mNeonGreen to the C-terminus of NTSR1 and mutated a tetracysteine motif into intracellular loop 3 (ICL3) for subsequent FlAsH labeling. Oligomerization of NTSR1 was monitored before and after stimulation of the receptor with its ligand by FLIM and homoFRET microscopy (i.e., Förster resonance energy transfer between identical fluorophores detected by fluorescence anisotropy), by colocalization microscopy, and by time-lapse imaging using structured illumination microscopy (SIM).

Biomolecules

Ligand-protein interactions in lysozyme investigated through a dual-resolution model

A fully atomistic modelling of biological macromolecules at relevant length- and time-scales is often cumbersome or not even desirable, both in terms of the computational effort required and its a posteriori analysis. This difficulty can be overcome with the use of multi-resolution models, in which different regions of the same system are concurrently described at different levels of detail. In enzymes, computationally expensive atomistic detail is crucial in the modelling of the active site in order to capture, e.g., the chemically subtle process of ligand binding. In contrast, important yet more collective properties of the remainder of the protein can be reproduced with a coarser description. In the present work, we demonstrate the effectiveness of this approach through the calculation of the binding free energy of hen egg white lysozyme (HEWL) with the inhibitor di-N-acetylchitotriose. Particular attention is paid to the impact of the mapping, i.e., the selection of atomistic and coarse-grained residues, on the binding free energy. It is shown that, in spite of small variations of the binding free energy with respect to the active site resolution, the separate contributions coming from different energetic terms (such as electrostatic and van der Waals interactions) manifest a stronger dependence on the mapping, thus pointing to the existence of an optimal level of intermediate resolution.

Biomolecules

LinearDesign: Efficient Algorithms for Optimized mRNA Sequence Design

A messenger RNA (mRNA) vaccine has emerged as a promising direction to combat the current COVID-19 pandemic. This requires an mRNA sequence that is stable and highly productive in protein expression, features which have been shown to benefit from greater mRNA secondary structure folding stability and optimal codon usage. However, sequence design remains a hard problem due to the exponentially many synonymous mRNA sequences that encode the same protein. We show that this design problem can be reduced to a classical problem in formal language theory and computational linguistics that can be solved in O(n^3) time, where n is the mRNA sequence length. This algorithm could still be too slow for large n (e.g., n = 3,822 nucleotides for the spike protein of SARS-CoV-2), so we further developed a linear-time approximate version, LinearDesign, inspired by our recent work, LinearFold. This algorithm, LinearDesign, can compute the approximate minimum free energy mRNA sequence for this spike protein in just 11 minutes using beam size b = 1,000, with only 0.6% loss in free energy change compared to exact search (i.e., b = +infinity, which costs 1 hour). We also develop two algorithms for incorporating the codon optimality into the design, one based on k-best parsing to find alternative sequences and one directly incorporating codon optimality into the dynamic programming. Our work provides efficient computational tools to speed up and improve mRNA vaccine development.
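The size of the search space mentioned above is easy to quantify. The sketch below counts the synonymous mRNA coding sequences for a protein by multiplying the number of codons available for each residue in the standard genetic code. It illustrates only why exhaustive enumeration is infeasible and is not part of LinearDesign; the function names and the example peptide are hypothetical.

```python
import math

# Number of codons per amino acid in the standard genetic code ("*" = stop).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
    "*": 3,
}


def count_synonymous_mrnas(protein: str) -> int:
    """Exact number of coding sequences that translate to the given protein."""
    return math.prod(CODON_DEGENERACY[aa] for aa in protein.upper())


def log10_synonymous_mrnas(protein: str) -> float:
    """Base-10 logarithm of the count, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in protein.upper())


if __name__ == "__main__":
    peptide = "MLSRAG*"  # made-up example peptide with a stop codon
    print(count_synonymous_mrnas(peptide), log10_synonymous_mrnas(peptide))
```

Since most residues have two or more codons, the count grows exponentially with protein length, which is why the design problem calls for dynamic programming rather than enumeration.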
