Featured Research

Biomolecules

Development of antibacterial compounds that block evolutionary pathways to resistance

Antibiotic resistance is a worldwide challenge. A potential approach to blocking resistance is to simultaneously inhibit the wild type (WT) and known escape variants of the target bacterial protein. Here we applied an integrated computational and experimental approach to discover compounds that inhibit both WT and trimethoprim (TMP)-resistant mutants of E. coli dihydrofolate reductase (DHFR). We identified a novel compound (CD15-3) that inhibits WT DHFR and its TMP-resistant variants L28R, P21L and A26T, with IC50 values of 50-75 µM against WT and TMP-resistant strains. In in vitro evolution experiments, resistance to CD15-3 was dramatically delayed compared with TMP. Whole-genome sequencing of CD15-3-resistant strains showed no mutations in the target folA locus. Rather, gene duplication of several efflux pumps gave rise to weak resistance against CD15-3 (about a twofold increase in IC50). Altogether, our results demonstrate the promise of a strategy to develop evolution drugs: compounds that block evolutionary escape routes in pathogens.

Biomolecules

Dietary Restriction of Amino Acids for Cancer Therapy

The biosynthesis of proteins, nucleotides and fatty acids is essential for the malignant proliferation and survival of cancer cells. Accumulating research findings show that amino acid restriction is a potential strategy for cancer intervention. Meanwhile, dietary strategies are popular among cancer patients. However, a solid rationale is still lacking for which strategy is best, and why and how it works. Here, integrated analyses and comprehensive summaries of the abundances, signalling and functions of amino acids in proteomes, metabolism, immunity and food compositions suggest that intermittent fasting, or intermittent dietary lysine restriction with normal maize as an intermittent staple food for days or weeks, might have value and potential for cancer prevention or therapy. Dietary supplements for cancer cachexia, including dietary immunomodulation, are also discussed.

Biomolecules

Different approaches to unveil biomolecule configurations and their mutual interactions

A novel technique has been demonstrated that overcomes important drawbacks by crosslinking cells through irradiation with ultrashort UV laser pulses (L-crosslinking). To use this technique coupled with Chromatin ImmunoPrecipitation (ChIP) in a high-throughput context, a fast pre-screening method is needed to establish suitable irradiation conditions for efficient L-crosslinking of the cell sample without running the final, lengthy ChIP analysis. Here a fast method is reported in which living human cells are first transfected with a vector coding for Estrogen Receptor α (ERα) linked to Green Fluorescent Protein (ERα-GFP), so that the well-known interaction between the Estrogen Receptor Elements (ERE) region of the cell DNA and the ERα protein can be detected by studying the fluorometric response of the irradiated cells. The damage induced in cells by UV irradiation is characterized by examining DNA integrity, protein stability and cellular viability. A second novel approach is presented to analyze or revisit DNA and RNA sequences and their molecular configurations. This approach is based on methods derived from Chern-Simons supergravity, adapted to describe mutations in DNA/RNA strings as well as interactions between nucleic acids. As a preliminary case we analyze the human KRAS gene sequence and some of its mutations. Interestingly, our model shows how the Chern-Simons currents can characterize the mutations within a sequence, in particular giving a quantitative indication of mutation likelihood.

Biomolecules

Dihedral angle prediction using generative adversarial networks

Several dihedral angle prediction methods have been developed for protein structure prediction and other applications. However, the distribution of the predicted angles often does not resemble that of real angles. To address this, we employed generative adversarial networks (GANs). A GAN is composed of two adversarially trained networks: a discriminator and a generator. The discriminator distinguishes samples drawn from a dataset from generated samples, while the generator produces realistic samples. Although the discriminator of a GAN is trained to estimate density, the density of the GAN model itself is intractable. Noise-contrastive estimation (NCE), on the other hand, was introduced to estimate the normalization constant of an unnormalized statistical model and thus its density function. In this thesis, we introduce noise-contrastive estimation generative adversarial networks (NCE-GAN), which enable explicit density estimation of a GAN model, and we propose a new loss for the generator. We also propose residue-wise variants of the auxiliary classifier GAN (AC-GAN) and the Semi-supervised GAN to handle sequence information in a window. In our experiments, the conditional generative adversarial network (C-GAN), AC-GAN and Semi-supervised GAN were compared, and experiments under improved conditions were also investigated. We identified a phenomenon in AC-GAN whereby the distribution of its predicted angles is composed of unusual clusters. The distribution of the angles predicted by the Semi-supervised GAN was the most similar to the Ramachandran plot. We found that adding the output of the NCE as an additional input to the discriminator helps to stabilize GAN training and to capture detailed structures. Adding a regression loss, and using angles predicted by a regression-loss-only model, could improve the conditional generation performance of the C-GAN and AC-GAN.
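
For readers unfamiliar with the quantity being predicted, a backbone dihedral angle is the signed angle between the two planes defined by four consecutive atoms. The sketch below is not taken from the thesis; it is a minimal standard-library illustration of the usual atan2-based formula, and the function names (`dihedral` and the small vector helpers) are made up for this example.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points.

    The angle is 0 when p0 and p3 are eclipsed (cis) and +/-180 when
    they are anti (trans). The sign is positive when, looking from p1
    towards p2, the front bond must turn clockwise to eclipse the back bond.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane (p0, p1, p2)
    n2 = _cross(b2, b3)          # normal of the plane (p1, p2, p3)
    b2_len = math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / b2_len
    return math.degrees(math.atan2(y, x))

# Toy geometry (not protein data): central bond along the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the backbone atoms C(i-1), N(i), Cα(i), C(i), this kind of function gives φ; applied to N(i), Cα(i), C(i), N(i+1), it gives ψ. The (φ, ψ) pairs are what a Ramachandran plot displays.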

Biomolecules

Direct homologous dsDNA-dsDNA pairing: how, where and why?

The ability of homologous chromosomes (or selected chromosomal loci) to pair specifically in the apparent absence of DNA breakage and recombination is a prominent feature of eukaryotic biology. The mechanism of homology recognition underlying such recombination-independent pairing has remained elusive. A number of studies have supported the idea that sequence homology can be sensed between intact DNA double helices in vivo. In particular, recent analyses of two silencing phenomena in fungi, known as repeat-induced point mutation (RIP) and meiotic silencing by unpaired DNA (MSUD), have provided genetic evidence for the existence of direct homologous dsDNA-dsDNA pairing. Both RIP and MSUD likely rely on the same search strategy, by which dsDNA segments are matched as arrays of interspersed base-pair triplets. This process is general and very efficient, yet it proceeds normally without the RecA/Rad51/Dmc1 proteins. Further studies of RIP and MSUD may yield surprising insights into the function of DNA in the cell.

Biomolecules

Directed Non-Targeted Mass Spectrometry and Chemical Networking for Discovery of Eicosanoids

Eicosanoids and related species are critical, small bioactive mediators of human physiology and inflammation. While ~1100 distinct eicosanoids have been predicted to exist, to date fewer than 150 of these molecules have been measured in humans, limiting our understanding of eicosanoids and their role in human biology. Using a directed non-targeted mass spectrometry approach in conjunction with computational chemical networking of spectral fragmentation patterns, we find over 500 discrete chemical signals highly consistent with known and putative eicosanoids in human plasma, including 46 putative novel molecules not previously described, thereby greatly expanding the breadth of prior analytical strategies. In plasma samples from 1500 individuals, we find that members of this expanded eicosanoid library are closely associated with markers of inflammation, as well as with clinical characteristics linked with inflammation, including advancing age and obesity. These experimental and computational approaches enable the discovery of new chemical entities and will shed important light on the role of bioactive molecules in human disease.

Biomolecules

Discovering Synergistic Drug Combinations for COVID with Biological Bottleneck Models

Drug combinations play an important role in therapeutics due to their better efficacy and reduced toxicity. Recent approaches have applied machine learning to identify synergistic combinations for cancer, but they are not applicable to new diseases with limited combination data. Given that drug synergy is closely tied to biological targets, we propose a biological bottleneck model that jointly learns drug-target interaction and synergy. The model consists of two parts: a drug-target interaction module and a target-disease association module. This design enables the model to explain how a biological target affects drug synergy. By utilizing additional biological information, our model achieves 0.78 test AUC in drug synergy prediction using only 90 COVID drug combinations for training. We experimentally tested the model predictions in the U.S. National Center for Advancing Translational Sciences (NCATS) facilities and discovered two novel drug combinations (Remdesivir + Reserpine and Remdesivir + IQ-1S) with strong synergy in vitro.
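
The reported metric, test AUC, can be read as the probability that a randomly chosen synergistic combination receives a higher predicted score than a randomly chosen non-synergistic one. The sketch below illustrates that definition only. The function name `roc_auc` is hypothetical, and the labels and scores are made-up toy values, not data or predictions from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = positive, e.g. a synergistic pair).
    scores: iterable of predicted scores, higher = more likely positive.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example (illustrative values only).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.90, 0.85, 0.35, 0.80, 0.70, 0.10]
print(round(roc_auc(toy_labels, toy_scores), 3))
```

The pairwise form is quadratic in the number of samples, which is fine for a dataset on the scale of tens of combinations. A rank-based formulation would be the usual choice for larger sets.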

Biomolecules

Discovery of Self-Assembling π-Conjugated Peptides by Active Learning-Directed Coarse-Grained Molecular Simulation

Electronically active organic molecules have demonstrated great promise as novel soft materials for energy harvesting and transport. Self-assembled nanoaggregates formed from π-conjugated oligopeptides, composed of an aromatic core flanked by oligopeptide wings, offer emergent optoelectronic properties within a water-soluble and biocompatible substrate. Nanoaggregate properties can be controlled by tuning core chemistry and peptide composition, but the sequence-structure-function relations remain poorly characterized. In this work, we employ coarse-grained molecular dynamics simulations within an active learning protocol employing deep representational learning and Bayesian optimization to efficiently identify molecules capable of assembling pseudo-1D nanoaggregates with good stacking of the electronically active π-cores. We consider the DXXX-OPV3-XXXD oligopeptide family, where D is an Asp residue and OPV3 is an oligophenylene vinylene oligomer (1,4-distyrylbenzene), to identify the top-performing XXX tripeptides among all 20^3 = 8,000 possible sequences. By direct simulation of only 2.3% of this space, we identify molecules predicted to exhibit superior assembly relative to those reported in prior work. Spectral clustering of the top candidates reveals new design rules governing assembly. This work establishes a new understanding of DXXX-OPV3-XXXD assembly, identifies promising new candidates for experimental testing, and presents a computational design platform that can be generically extended to other peptide-based and peptide-like systems.
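
The size of the search space quoted above follows from simple counting: three variable positions, each drawn from the 20 standard amino acids. The sketch below enumerates that space and works out what 2.3% of it amounts to. It is illustrative bookkeeping only, not the authors' active learning code, and the names `AMINO_ACIDS`, `tripeptides` and `n_simulated` are made up for this example.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def tripeptides():
    """Return every XXX tripeptide as a three-letter string."""
    return ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]

space = tripeptides()

# 2.3% is the simulated fraction quoted in the abstract.
n_simulated = round(0.023 * len(space))

print(len(space))      # size of the candidate space
print(n_simulated)     # approximate number of candidates simulated directly
```

On these numbers, roughly 184 of the 8,000 candidates are simulated directly. The abstract credits the active learning loop with selecting them.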

Biomolecules

Disordered peptide chains in an α-C-based coarse-grained model

We construct a one-bead-per-residue coarse-grained dynamical model to describe intrinsically disordered proteins at significantly longer timescales than all-atom models can reach. In this model, inter-residue contacts form and disappear over the course of the time evolution. The contacts may arise between the sidechains, between the backbones, or between the sidechains and backbones of the interacting residues. The model yields results that are consistent with much of the all-atom and experimental data on these systems. We demonstrate that the geometrical properties of various homopeptides differ substantially in this model. In particular, the average radius of gyration scales with the sequence length in a residue-dependent manner.
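
In a one-bead-per-residue representation, the radius of gyration of a single conformation is the root-mean-square distance of the beads from their centroid. The sketch below shows that calculation under the assumption that all beads carry equal mass. The function name `radius_of_gyration` and the 3.8 Å bead spacing in the toy chain are illustrative choices, not values taken from the paper.

```python
import math

def radius_of_gyration(coords):
    """Radius of gyration of equally weighted beads.

    coords: list of (x, y, z) tuples, one per residue bead.
    Assumes equal bead masses, so the centre of mass is the centroid.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one bead")
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    mean_sq = sum(
        sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)

# Toy conformation: a fully extended chain of 5 beads spaced 3.8 A apart
# (an assumed spacing, roughly a typical C-alpha to C-alpha distance).
SPACING = 3.8
rod = [(i * SPACING, 0.0, 0.0) for i in range(5)]
print(radius_of_gyration(rod))
```

Averaging this quantity over many sampled conformations, for several chain lengths, is one way to obtain the kind of length-scaling curve the abstract refers to.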

Biomolecules

Distance-based Protein Folding Powered by Deep Learning

Contact-assisted protein folding has made very good progress, but two challenges remain. One is accurate contact prediction for proteins lacking many sequence homologs, and the other is that time-consuming folding simulation is often needed to predict good 3D models from predicted contacts. We show that a protein distance matrix can be predicted well by deep learning and then used directly to construct 3D models without any folding simulation. Using distance geometry to construct 3D models from our predicted distance matrices, we successfully folded 21 of the 37 CASP12 hard targets, which have a median family size of 58 effective sequence homologs, within 4 hours on a Linux computer with 20 CPUs. In contrast, contacts predicted by direct coupling analysis (DCA) cannot fold any of them in the absence of folding simulation, and the best CASP12 group folded 11 of them by integrating predicted contacts into complex, fragment-based folding simulation. Rigorous experimental validation on 15 CASP13 targets shows that, among the 3 hardest new-fold targets, our distance-based folding servers successfully folded 2 large ones with <150 sequence homologs while the other servers failed on all three, and that our ab initio folding server also predicted the best, high-quality 3D model for a large homology modeling target. Further experimental validation in CAMEO shows that our ab initio folding server predicted the correct fold for a new-fold membrane protein with 200 residues and 229 sequence homologs, while all the other servers failed. These results imply that deep learning offers an efficient and accurate solution for ab initio folding on a personal computer.
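
To make the idea of building 3D coordinates directly from a distance matrix concrete, the sketch below uses classical multidimensional scaling: the squared distances are double-centred into a Gram matrix, and the three leading eigenvectors give coordinates. This is a textbook embedding chosen for illustration and is not claimed to be the authors' pipeline. It assumes a complete, noise-free Euclidean distance matrix, whereas predicted matrices are noisy, and all function names are hypothetical.

```python
import math
import random

def gram_from_distances(D):
    """Double-centre squared distances: G = -1/2 * J * D^2 * J."""
    n = len(D)
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]      # equals column mean (symmetric)
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpairs(G, k=3, iters=500, seed=0):
    """Leading k eigenpairs of a symmetric positive semidefinite matrix,
    found by power iteration with deflation."""
    rng = random.Random(seed)
    n = len(G)
    A = [row[:] for row in G]
    pairs = []
    for _ in range(k):
        v = [rng.random() - 0.5 for _ in range(n)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                 # remaining spectrum is ~zero
                break
            v = [x / norm for x in w]
            lam = norm                       # |A v| for unit v
        pairs.append((lam, v))
        for i in range(n):                   # deflate the found component
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs

def embed(D, dim=3):
    """Coordinates (up to rotation/reflection) reproducing the distances D."""
    pairs = top_eigenpairs(gram_from_distances(D), k=dim)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(D))]

# Toy check with made-up points: distances are recovered after embedding.
points = [(0, 0, 0), (4, 0, 0), (0, 3, 0), (0, 0, 2), (1, 1, 1)]
D = [[math.dist(p, q) for q in points] for p in points]
X = embed(D)
max_err = max(abs(math.dist(X[i], X[j]) - D[i][j])
              for i in range(len(D)) for j in range(len(D)))
print(max_err < 1e-6)
```

Because distances carry no handedness, an embedding like this is determined only up to rotation and mirror image. Any real pipeline needs a separate step to pick the correct chirality and to cope with noisy or missing entries.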
