Publication


Featured research published by Benjamin Sanchez-Lengeling.


ACS Central Science | 2018

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

Rafael Gómez-Bombarelli; Jennifer Wei; David K. Duvenaud; José Miguel Hernández-Lobato; Benjamin Sanchez-Lengeling; Dennis Sheberla; Jorge Aguilera-Iparraguirre; Timothy D. Hirzel; Ryan P. Adams; Alán Aspuru-Guzik

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous representations also allow the use of powerful gradient-based optimization to efficiently guide the search for optimized functional compounds. We demonstrate our method in the domain of drug-like molecules and also in a set of molecules with fewer than nine heavy atoms.
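As an illustration of one of the latent-space operations named in the abstract above (interpolating between molecules), here is a minimal sketch. It assumes plain linear interpolation between two latent vectors; the abstract does not say which interpolation scheme the authors used, and the function name `interpolate` and the example vectors are hypothetical. The trained encoder and decoder are not modeled here: in practice the vectors would come from the encoder, and each intermediate point would be passed to the decoder.

```python
from typing import List


def interpolate(z_start: List[float], z_end: List[float], steps: int) -> List[List[float]]:
    """Return `steps` latent vectors evenly spaced on the straight line
    from z_start to z_end, endpoints included (linear interpolation)."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical 3-dimensional latent vectors standing in for two encoded molecules.
z_a = [0.0, 1.0, -2.0]
z_b = [1.0, 3.0, 2.0]
path = interpolate(z_a, z_b, 5)
```

Each entry of `path` is a candidate latent point; whether it decodes to a valid molecule depends entirely on the trained decoder.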


Scientific Reports | 2015

Quantum Chemical Approach to Estimating the Thermodynamics of Metabolic Reactions

Adrian Jinich; Dmitrij Rappoport; Ian F. Dunn; Benjamin Sanchez-Lengeling; Roberto Olivares-Amaya; Elad Noor; Arren Bar-Even; Alán Aspuru-Guzik

Thermodynamics plays an increasingly important role in modeling and engineering metabolism. We present the first nonempirical computational method for estimating standard Gibbs reaction energies of metabolic reactions based on quantum chemistry, which can help fill in the gaps in the existing thermodynamic data. When applied to a test set of reactions from core metabolism, the quantum chemical approach is comparable in accuracy to group contribution methods for isomerization and group transfer reactions and for reactions not including multiply charged anions. The errors in standard Gibbs reaction energy estimates are correlated with the charges of the participating molecules. The quantum chemical approach is amenable to systematic improvements and holds potential for providing thermodynamic data for all of metabolism.


Journal of Chemical Information and Modeling | 2018

Reinforced Adversarial Neural Computer for de Novo Molecular Design

Evgeny Putin; Arip Asadulaev; Yan Ivanenkov; Vladimir Aladinskiy; Benjamin Sanchez-Lengeling; Alán Aspuru-Guzik; Alex Zhavoronkov

In silico modeling is a crucial milestone in modern drug design and development. Although computer-aided approaches in this field are well-studied, the application of deep learning methods in this research area is at the beginning. In this work, we present an original deep neural network (DNN) architecture named RANC (Reinforced Adversarial Neural Computer) for the de novo design of novel small-molecule organic structures based on the generative adversarial network (GAN) paradigm and reinforcement learning (RL). As a generator RANC uses a differentiable neural computer (DNC), a category of neural networks, with increased generation capabilities due to the addition of an explicit memory bank, which can mitigate common problems found in adversarial settings. The comparative results have shown that RANC trained on the SMILES string representation of the molecules outperforms its first DNN-based counterpart ORGANIC by several metrics relevant to drug discovery: the number of unique structures, passing medicinal chemistry filters (MCFs), Muegge criteria, and high QED scores. RANC is able to generate structures that match the distributions of the key chemical features/descriptors (e.g., MW, logP, TPSA) and lengths of the SMILES strings in the training data set. Therefore, RANC can be reasonably regarded as a promising starting point to develop novel molecules with activity against different biological targets or pathways. In addition, this approach allows scientists to save time and covers a broad chemical space populated with novel and diverse compounds.


ACS Central Science | 2017

Learning More, with Less

Benjamin Sanchez-Lengeling; Alán Aspuru-Guzik

Practicing chemists solve problems via “chemical intuition”, a quality that lets them skip intermediate details and get to the essential result, even if the outcome is counterintuitive to the uninitiated. There is no human shortcut to building this intuition; chemists hone their skills through years of experience of learning and memorizing patterns of molecular structure and reactivity. It is in this spirit that Vijay Pande and co-workers propose in “Low Data Drug Discovery with One-Shot Learning” in this issue of ACS Central Science a computational approach for chemical prediction by learning from a low number of examples. The paper touches on many central themes that are relevant to the intersection of the three main components of computation in chemistry: molecular representation, chemical space exploration, and machine learning to accelerate computation. For discovering new molecules, the enormity of chemical space cannot be understated; the number of “small” to “medium” sized molecules is estimated to be in the range of 10^60 to 10^180, a number that is a hundred orders of magnitude larger than the number of atoms in the visible universe. With just a considerably small number of examples, chemists are able to distinguish and assess the potential function of a molecule for a given task. For example, we recently created a “Molecular Tinder” application that helped us in the design of molecules for organic displays. In analogy to the dating application, Molecular Tinder was a voting system that allowed us to harvest information from experimentalists who voted “Yes”, “No”, or “Maybe” on the synthesizability of molecules. Voting results allowed us to design algorithms that preferentially generated molecules with practical synthetic routes that were eventually synthesized and tested in real devices. Another very important aspect of human intuition is “transferability”, which enables the generalization of knowledge learned in a particular domain to untested domains.
Everyone who has passed an undergraduate organic chemistry test had to show that their brain is able to generalize from one domain to the other. This is a much more challenging task for a computer. We are sometimes able to predict with varying degrees of success these properties using quantum chemistry calculations, but when these simulations are involved, supralinear computational scaling laws hinder the application of most common algorithms to complex molecules. Therefore, to cover chemical space efficiently, we cannot go unaided by intuition if we ever hope to explore it for successful molecular design. It is often thought in the artificial intelligence (AI) community that any human decision that can be done in a matter of a few seconds can be, in theory, learned and automated by a computer. There have been many recent examples where deep learning is solving increasingly complex tasks and getting closer to the performance of humans, even surpassing it in certain tasks such as the game Go with AlphaGo. This progress has been propelled mainly by two factors: broader availability of data and cheaper


bioRxiv | 2018

The thermodynamic landscape of carbon redox biochemistry

Adrian Jinich; Benjamin Sanchez-Lengeling; Haniu Ren; Joshua E. Goldford; Elad Noor; Jacob N. Sanders; Daniel Segrè; Alán Aspuru-Guzik

Redox biochemistry plays a key role in the transduction of chemical energy in living systems. However, the compounds observed in metabolic redox reactions are a minuscule fraction of chemical space. It is not clear whether compounds that ended up being selected as metabolites display specific properties that distinguish them from non-biological compounds. Here we introduce a systematic approach for comparing the chemical space of all possible redox states of linear-chain carbon molecules to the corresponding metabolites that appear in biology. Using cheminformatics and quantum chemistry, we analyze the physicochemical and thermodynamic properties of the biological and non-biological compounds. We find that, among all compounds, aldose sugars have the highest possible number of redox connections to other molecules. Metabolites are enriched in carboxylic acid functional groups and depleted of carbonyls, and have higher solubility than non-biological compounds. Upon constructing the energy landscape for the full chemical space as a function of pH and electron donor potential, we find that over a large range of conditions metabolites tend to have lower Gibbs energies than non-biological molecules. Finally, we generate Pourbaix phase diagrams that serve as a thermodynamic atlas to indicate which compounds are local and global energy minima in redox chemical space across a set of pH values and electron donor potentials. Our work yields insight into the physicochemical principles governing redox metabolism, and suggests that thermodynamic stability in aqueous environments may have played an important role in early metabolic processes.

Redox biochemistry plays a key role in the transduction of chemical energy in all living systems. Observed redox reactions in metabolic networks represent only a minuscule fraction of the space of all possible redox reactions.
Here we ask what distinguishes observed, natural redox biochemistry from the space of all possible redox reactions between natural and non-natural compounds. We generate the set of all possible biochemical redox reactions involving linear chain molecules with a fixed number of carbon atoms. Using cheminformatics and quantum chemistry tools we analyze the physicochemical and thermodynamic properties of natural and non-natural compounds and reactions. We find that among all compounds, aldose sugars are the ones with the highest possible number of connections (reductions and oxidations) to other molecules. Natural metabolites are significantly enriched in carboxylic acid functional groups and depleted in carbonyls and have significantly higher solubilities than non-natural compounds. Upon constructing a thermodynamic landscape for the full set of reactions as a function of pH and of steady-state redox cofactor potential, we find that, over this whole range of conditions, natural metabolites have significantly lower energies than the non-natural compounds. For the set of 4-carbon compounds, we generate a Pourbaix phase diagram to determine which metabolites are local energetic minima in the landscape as a function of pH and redox potential. Our results suggest that, across a set of conditions, succinate and butyrate are local minima and would thus tend to accumulate at equilibrium. Our work suggests that metabolic compounds could have been selected for thermodynamic stability, and yields insight into thermodynamic and design principles governing nature's metabolic redox reactions.


bioRxiv | 2018

A mixed quantum chemistry/machine learning approach for the fast and accurate prediction of biochemical redox potentials and its large-scale application to 315,000 redox reactions

Adrian Jinich; Benjamin Sanchez-Lengeling; Haniu Ren; Rebecca Harman; Alán Aspuru-Guzik

A quantitative understanding of the thermodynamics of biochemical reactions is essential for accurately modeling metabolism. The group contribution method (GCM) is one of the most widely used approaches to estimating standard Gibbs energies and redox potentials of reactions for which no experimental measurements are available. Previous work has shown that quantum chemical predictions of biochemical thermodynamics are a promising approach to overcome the limitations of GCM. However, the quantum chemistry approach is significantly more expensive. Here we use a combination of quantum chemistry and machine learning to obtain a fast and accurate method for predicting the thermodynamics of biochemical redox reactions. We focus on predicting the redox potentials of carbonyl functional group reductions to alcohols and amines, one of the most ubiquitous carbon redox transformations in biology. The method relies on semi-empirical calculations calibrated with Gaussian Process (GP) regression against available experimental data. Our approach results in higher predictive power than the GCM. We demonstrate the high-throughput applicability of the method by predicting the standard potentials of more than 315,000 redox reactions involving ∼70,000 compounds, obtained with a network expansion algorithm that iteratively reduces and oxidizes a set of natural seed metabolites. We provide open access to all source code and data generated.
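For readers less familiar with why the abstract above treats standard Gibbs energies and redox potentials together, the two quantities are linked by the standard electrochemical relation sketched below. The symbols are not taken from the abstract and are introduced here for illustration: $n$ is the number of electrons transferred, $F$ is the Faraday constant, and the prime denotes the biochemical standard state at a fixed pH.

```latex
\Delta_r G'^{\circ} = -\, n \, F \, E'^{\circ}
```

Under this sign convention, a more positive reduction potential corresponds to a more negative (more favorable) standard Gibbs energy for the reduction.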


Science | 2018

Inverse molecular design using machine learning: Generative models for matter engineering

Benjamin Sanchez-Lengeling; Alán Aspuru-Guzik

The discovery of new materials can bring enormous societal and technological progress. In this context, exploring completely the large space of potential materials is computationally intractable. Here, we review methods for achieving inverse design, which aims to discover tailored materials from the starting point of a particular desired functionality. Recent advances from the rapidly growing field of artificial intelligence, mostly from the subfield of machine learning, have resulted in a fertile exchange of ideas, where approaches to inverse molecular design are being proposed and employed at a rapid pace. Among these, deep generative models have been applied to numerous classes of materials: rational design of prospective drugs, synthetic routes to organic compounds, and optimization of photovoltaics and redox flow batteries, as well as a variety of other solid-state materials.


PLOS Computational Biology | 2018

Quantum chemistry reveals thermodynamic principles of redox biochemistry

Adrian Jinich; Avi Flamholz; Haniu Ren; Sung-Jin Kim; Benjamin Sanchez-Lengeling; Charles A. R. Cotton; Elad Noor; Alán Aspuru-Guzik; Arren Bar-Even

Thermodynamics dictates the structure and function of metabolism. Redox reactions drive cellular energy and material flow. Hence, accurately quantifying the thermodynamics of redox reactions should reveal design principles that shape cellular metabolism. However, only few redox potentials have been measured, and mostly with inconsistent experimental setups. Here, we develop a quantum chemistry approach to calculate redox potentials of biochemical reactions and demonstrate our method predicts experimentally measured potentials with unparalleled accuracy. We then calculate the potentials of all redox pairs that can be generated from biochemically relevant compounds and highlight fundamental trends in redox biochemistry. We further address the question of why NAD/NADP are used as primary electron carriers, demonstrating how their physiological potential range fits the reactions of central metabolism and minimizes the concentration of reactive carbonyls. The use of quantum chemistry can revolutionize our understanding of biochemical phenomena by enabling fast and accurate calculation of thermodynamic values.


arXiv: Machine Learning | 2017

Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models.

Gabriel Lima Guimaraes; Benjamin Sanchez-Lengeling; Pedro Luis Cunha Farias; Alán Aspuru-Guzik


Joule | 2017

Design Principles and Top Non-Fullerene Acceptor Candidates for Organic Photovoltaics

Steven A. Lopez; Benjamin Sanchez-Lengeling; Julio de Goes Soares; Alán Aspuru-Guzik

Collaboration


Dive into Benjamin Sanchez-Lengeling's collaborations.

Top Co-Authors

Christoph J. Brabec
University of Erlangen-Nuremberg

José Darío Perea
University of Erlangen-Nuremberg

Stefan Langner
University of Erlangen-Nuremberg

Evgeny Putin
Johns Hopkins University

Vladimir Aladinskiy
Moscow Institute of Physics and Technology