[PDF] Generative network complex for the automated generation of druglike molecules

Abstract

Current drug discovery is expensive and time-consuming. It remains a challenging task to create a wide variety of novel compounds with desirable pharmacological properties and cheaply available to low-income people. In this work, we develop a generative network complex (GNC) to generate new drug-like molecules based on the multi-property optimization via the gradient descent in the latent space of an autoencoder. In our GNC, both multiple chemical properties and similarity scores are optimized to generate and predict drug-like molecules with desired chemical properties. To further validate the reliability of the predictions, these molecules are reevaluated and screened by independent 2D fingerprint-based predictors to come up with a few hundreds of new drug candidates. As a demonstration, we apply our GNC to generate a large number of new BACE1 inhibitors, as well as thousands of novel alternative drug candidates for eight existing market drugs, including Ceritinib, Ribociclib, Acalabrutinib, Idelalisib, Dabrafenib, Macimorelin, Enzalutamide, and Panobinostat.

Full PDF

GGenerative network complex for the automated generation ofdruglike molecules

Kaifu Gao , Duc Duy Nguyen , Meihua Tu , and Guo-Wei Wei , , , ∗ Department of Mathematics, Michigan State University, MI 48824, USA. Pﬁzer Medicine Design, 610 Main St, Cambridge, MA 02139, USA. Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA. Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA.June 1, 2020

Abstract

Current drug discovery is expensive and time-consuming. It remains a challenging task to create a widevariety of novel compounds with desirable pharmacological properties and cheaply available to low-incomepeople. In this work, we develop a generative network complex (GNC) to generate new drug-like moleculesbased on the multi-property optimization via the gradient descent in the latent space of an autoencoder. In ourGNC, both multiple chemical properties and similarity scores are optimized to generate and predict drug-likemolecules with desired chemical properties. To further validate the reliability of the predictions, these moleculesare reevaluated and screened by independent 2D ﬁngerprint-based predictors to come up with a few hundredsof new drug candidates. As a demonstration, we apply our GNC to generate a large number of new BACE1inhibitors, as well as thousands of novel alternative drug candidates for eight existing market drugs, includingCeritinib, Ribociclib, Acalabrutinib, Idelalisib, Dabrafenib, Macimorelin, Enzalutamide, and Panobinostat.

I Introduction

Drug discovery ultimately tests our understanding of molecular biology, medicinal chemistry, genetics, physiol-ogy, and pharmacology, the status of biotechnology, the utility of computational sciences, and the maturity ofmathematical biology. Technically, drug discovery involves target discovery, lead discovery, lead optimization,preclinical development, three phases of clinical trials, and ﬁnally, launching to the market only if a drug canbe demonstrated to be safe and effective in every stage. Among them, lead discovery, lead optimization, andpreclinical development disqualify tens of thousands of compounds based on their binding afﬁnities, solubility,partition coefﬁcients, clearances, permeability, toxicities, pharmacokinetics, etc., leaving only about tens of themto clinical trials. Therefore, drug discovery is expensive and time-consuming currently: it takes about $2.6 billiondollars and more than ten years, on average, to bring a new drug to the market. Reducing the expense andspeeding up the process is one of the top priorities of the pharmaceutical industry.One of the key challenges in small molecule drug discovery is to ﬁnd novel chemical compounds with desirableproperties. Much effort has been taken to optimize this critical step in the drug discovery pipeline. For example,the development of high-throughput screening (HTS) has led to an unprecedented increase in the number ofpotential targets and leads. HTS can quickly conduct millions of tests to identify active compounds of interestusing compound libraries rapidly. While there has been an increase in the number of potential targets andleads, the number of newly generated molecular entities has remained stable because of a high attrition rateby the elimination of leads with inappropriate physicochemical or pharmacological properties during preclinicaldevelopment and clinical phases.

Rational drug design (RDD) approaches are proposed to better identifycandidates with the highest probability of success. These methods aim at ﬁnding new medications based onthe knowledge of biologically druggable targets.

More recently, computer-aided drug design (CADD) has emerged as a useful approach in reducing the ex-pense and period of drug discovery. Computational techniques have been developed for both virtual screening ∗ Corresponding to Guo-Wei Wei. Email: [email protected] a r X i v : . [ q - b i o . B M ] M a y VS) and optimizing the ADME properties of lead compounds. Primarily, these methods are designed as in silico ﬁlters to eliminate compounds with undesirable properties. These ﬁlters are widely applied for the assembly ofcompound libraries using combinatorial chemistry. The integration of early ADME proﬁling of lead chemicalshas contributed to speeding up lead selection for phase-I trials without large amounts of revenue loss. Cur-rently, compounds are added in libraries based on target-focused design or diversity considerations. VS andHTS can screen compound libraries to a subset of compounds whose properties are in agreement with variouscriteria. Despite these efforts, current databases of chemical compounds remain small when compared with the chem-ical space spanned by all possible energetically stable stoichiometric combinations of atoms and topologies inmolecules. It is estimated that there are about 10 distinct molecules; among them, roughly 10 are drug-like. As a result, computational techniques are also being developed for de novo design of drug-like molecules andgenerating large virtual chemical libraries, which can be screened more efﬁciently for in silico drug discovery.Among the available computational techniques, deep neural networks (DNN) have gathered much interest fortheir ability to extract features and learn physical principles from training data. Currently, DNN-based architec-tures have been successfully applied in a wide variety of ﬁelds in the biological and biomedical sciences. More interestingly, many deep generative models based on sequence-to-sequence autoencoders (Seq2seqAEs), variational autoencoders (VAEs), adversarial autoencoders (AAEs), generative adversarial networks(GANs) or reinforcement learning, etc. have been proposed for exploring the vast drug-like chemical spaceand generating new drug-like molecules. Winter et al. performed the optimization based on particle swarmoptimization on the continuous latent space of a Seq2seq AE to generate new molecules with desired properties.Gomez-Bombarelli et al. used a VAE to encode a molecule in the continuous latent space for exploring asso-ciated properties. Skalic et al. combined a conditional VAE and a captioning network to generate previouslyunseen compounds from input voxelized molecular representations. Kadurin et al. built an AAE to create newcompounds. Sattarov et al. combined deep autoencoder RNNs with generative topographic mapping to carryout de novo molecule design. A policy-based reinforcement learning approach was proposed to tune RNNsfor episodic tasks, and extended to design desired molecules. Zhou et al. also proposed a strategyto optimize molecules by combining domain knowledge of chemistry and state-of-the-art reinforcement learningtechniques.However, the generative strategies mentioned above are not drug-speciﬁed. What is vital to drug discovery isto design potential drug candidates for speciﬁc drug targets. In a regular drug discovery procedure, the startingpoint is target identiﬁcation, followed by lead generation. Then, lead optimization is performed to make leadcompounds more drug-like. It is useful to ﬁnd new lead compounds to replace existing market drugs for several reasons. First, existingmarket drugs might not be optimal. For example, they might not be potent enough, be too toxic and harmfulto human health, or be too hard to synthesize. Additionally, they might have side effects, insufﬁcient aqueoussolubility (log S) that reduces the absorption of drugs, or higher partition coefﬁcients (log P) that lead toimproper drug distributions within the body. Moreover, for very expensive drugs, it is desirable to ﬁnd cheapalternatives.With the generation of new alternative lead compounds in mind, one can make use of existing drug datasetsto develop drug-speciﬁed generative models. In this process, it is critical to apply a similarity restraint to gen-erate hundreds or even thousands of new drug-like molecules inside the chemical space close to the referencemolecule. This similarity restraint enables us to generate new molecules that remain effective to the target.Moreover, from the viewpoint of machine learning, higher similarities to existing data always lead to more reli-able predictions in generating molecules. The generative model can also realize lead optimization: by incorpo-rating optimizers, generated molecules are designated to have one or more chemical properties better than thereference molecule. As a result, a large number of alternative drug candidates are created by drug-speciﬁedgenerative models. These candidates could be an effective and speciﬁed library to further screen for better orcheaper drug alternatives.Therefore, in this work, we develop a generative network complex (GNC) based on the multiple-propertyoptimization via gradient descent in the latent space to automatically generate new drug-like molecules. Oneworkﬂow of our GNC consists of three following stages1. The SMILES string of a seed molecule are encoded into a vector in the latent space by a pre-trainedencoder. 2. Starting with the latent vector of the seed molecule, a DNN-based drug-like molecule generator modiﬁesthe vector via gradient descent, so that new latent vectors that satisfy multiple property restraints includingchemical properties and similarity scores to the desired objectives are created.3. A pre-trained decoder decodes these new latent vectors into the SMILES strings of newly generatedmolecules.The rest of the paper is organized as follows. Section II introduces our new GNC framework formed by theseq2seq AE and drug-like molecule generator. Section III discusses its reliability test on the BACE1 target, andmore importantly, presents the performance of our GNC on a variety of existing drug targets. The insight into theroles of the multiple property restraints is offered in Section IV. The conclusion is given in Section V.

II MethodsII.A The sequence to sequence antoencoder (seq2seq AE)

The seq2seq model is an autoencoder architecture originated from natural language processing. It has beendemonstrated as a breakthrough in language translation. The basic strategy of the seq2seq AE is to map an inputsequence to a ﬁxed-sized vector in the latent space using a gated recurrent unit (GRU) or a long short-termmemory (LSTM) network, and then map the vector to a target sequence with another GRU or LSTM network.Thus the latent vector is an intermediate representation containing the “meaning" of the input sequence.In our case, input and output sequences are both SMILES strings – a one-dimensional "language" of chemicalstructures. The seq2seq AE is trained to have a high reconstruction ratio between inputs and outputs so thatthe latent vectors contain faithful information of the chemical structures (see the upper part of Figure 1). Herewe utilize a pre-trained seq2seq model from a recent work. II.B The drug-like molecule generator based on the multi-property optimization

Figure 1: A schematic illustration of our generative network complex 2.In our new GNC, we design a drug-like molecule generator elaborately, so that generated molecules not onlysatisfy desired properties but also share common pharmacophores with reference compounds. From a seedmolecule, one generative workﬂow of the GNC is depicted in Figure 1 and described as below.1. Randomly pick a low-binding-afﬁnity molecule from a target-speciﬁed training set as the seed, then theSMILES string of the seed molecule is encoded by a pre-trained encoder (in our case a GRU encoder) intoa latent vector.2. The latent vector of the seed is fed into our DNN molecule generator. In every epoch, the generator comesup with a new vector X ∈ R n , and the deep learning network is instructed to evaluate X via the following3oss function L ( X ) = 1 n n (cid:88) i =1 k i | ˆ y i ( X ) − y i | , (1)where, k i is the i th predeﬁned weight serving the purpose of emphasizing or deemphasizing differentproperty restraints, ˆ y i ( X ) is the predicted i th property value by a pre-trained predictor M i . Additionally, y i is the objective value of the i th property. The restrained properties can be binding afﬁnity (BA), thesimilarity score (Sim) to a reference molecule or others such as partition coefﬁcient (Log P), Lipinski’s ruleof ﬁve, etc. Some guideline for y i includes, in the BA restraint, one often sets y ∆ G < -9.6 kcal/mol, in theLog P condition, it is common to set y LogP < 5, etc.3. Gradient descent is used to minimize the loss function in Eq. (1) until the maximum number of epochs isreached.4. The generated latent vectors satisfying the desired restraints are decoded into SMILES strings through apre-trained decoder, as shown in Figure 1.To create a variety of novel drug-like molecules originated from leads or existing drugs (reference molecules),one can adopt different seed molecules as well as different objective values to achieve desired properties andsimilarity scores. The ultimate purpose of our molecule generator is to keep modifying the latent vector to satisfythe multiple druggable property restraints. Figure 2 illustrates the general mechanism of our generator. Notably,the pre-trained predictor weights inside our model stay intact throughout the backpropagation of the trainingprocess to the generator. The loss function converges when all conditions are met, and then, resulting latentvectors are decoded into SMILES strings.Figure 2: The illustration of the latent-space molecule generator.

II.C The parameters of the molecule generator

In our model, the dimension of the latent space is 512, so the input and output dimensions of the DNN moleculegenerator are also 512. The DNN generator has two hidden layers, with 1024 neurons in each layer. Theactivation function is tanh, the learning rate is 0.1, and the momentum is also 0.1. In this work, we are interestedin binding afﬁnity and similarity score restraints. The regularization coefﬁcients of these two restraints ( k ∆G and k Sim ) are set to 1 and 10, respectively. The similarity score restraints is determined via the Tanimoto coefﬁcientbetween a generated latent vector and the latent vector of a reference molecule.The binding afﬁnity restraints rely on pre-trained binding afﬁnity predictors. A pre-trained binding afﬁnity pre-dictor (LV-BP) takes latent vectors as its inputs and return predicted binding afﬁnities. Therefore, typically, theinput dimension of the predictor is 512, the output dimension is 1. The DNN predictor has three hidden layers4ith 1024, 1536, and 1024 neurons, respectively. The ReLU activation function is applied. The learning rate is0.001, the number of training epochs is 4000, the batch size is 4. The predictor network is trained on target-speciﬁed datasets carefully selected from public databases such as ChEMBL. The generator and predictor areboth programmed in the framework of PyTorch (Version 1.0.0). In the current work, for each generation task, we randomly pick 50 low-binding-afﬁnity molecules from thepreselected dataset as seeds. For each seed, the generative network is run in a total of 2000 epochs, whichtakes less than 10 minutes under the supercomputer equipped with one Nvidia K80 GPU card. In practice, toquickly ﬁll up the potential chemical search space, one can use more seeds and run more epochs for each seed.

II.D Binding afﬁnity reevaluation by the 2D-ﬁngerprint predictor

Besides generating new molecules, the LV-BP in our GNC also predicts their binding afﬁnities. However, noexperimental values are available to validate these predicted afﬁnities. Therefore, we cross-validate them usingalternative binding afﬁnity predictors. In the present work, we construct machine learning predictors based on2D ﬁngerprints (2DFP-BPs) to reevaluate the afﬁnities of generated compounds. The 2D ﬁngerprints computedfrom their SMILES strings are inputs to these 2DFP-BPs. If the predictions from the LV-BP and 2DFP-BPs areconsistent, we regard the predictions as reliable.According to our previous tests, the consensus of ECFP4, Estate1 and Estate2 ﬁngerprints performsbest on binding-afﬁnity prediction tasks. Therefore, this work also makes use of this consensus. We employthe RDKit software (version 2018.09.3) to generate 2D ﬁngerprints from SMILES strings. Since the trainingsets in our current cases are not so large, we apply gradient boosting decision tree (GBDT) model due toits accuracy and speed when handling small datasets. This GBDT predictor is constructed using the gradientboosting regressor module in scikit-learn (version 0.20.1) and the following parameters: n_estimators=10000,max_depth=7, min_samples_split=3, learning_rate=0.01, subsample=0.3, and max_features=sqrt.The criteria used in our reevaluation are the root mean square error (RMSE), Pearson correlation coefﬁcient( R p ), and active ratio. Here, the active ratio means the ratio of the number of the active molecules indicated bothby the 2DFP-BP and LV-BP to the number of the active ones indicated by the LV-BP. II.E Multitask DNN predictors

Multitask DNN predictors for both latent vectors and 2D ﬁngerprints are built for the drug Ribociclib with twodifferent targets.The latent-vector based model has three hidden layers with 1024, 1536, and 1024 neurons. For the 2D-ﬁngerprint based models, because the three different 2D ﬁngerprints ECFP4, Estate1, Estate2 have 2048, 79,and 79 features, respectively, two different network architectures are used. For ECFP4, the numbers of neuronsin the three hidden layers are 2500, 1500, and 500, respectively. For Estate1 and Estate2, their numbers ofneurons are 500, 1000, and 500. Other parameters are the same as those of our single task predictors. Thesemultitask models are also programmed in the framework of PyTorch (Version 1.0.0). II.F Drug-target interaction and common pharmacophore analysis

The interactions between drugs and their targets, as well as the pharmacophores of the drugs, are investigated.The purpose is to explore whether our generated molecules can still bind to the drug targets.Drug-target interactions are analyzed via the protein-ligand interaction proﬁles. It can identify drug-targetinteractions as well as their types such as hydrogen bonds, hydrophobic interactions, etc.However, the interaction analysis itself could not determine whether interactions are critical or not to the drug-target binding. By using the Phase module in Schrödinger (version 2018-4), we build pharmacophore modelsvia searching common pharmacophores in all the active compounds to the target. Since these pharmacophoresare widespread to all the active compounds, they are critical to the binding. Thus, if generated molecules stillcontain such pharmacophores, we can believe they are potential binders.It could be time-consuming to recognize common pharmacophores of hundreds of compounds. To avoid thisobstacle, we group compounds into 50 clusters via the k-means algorithm implemented by scikit-learn. We,then, collect the centroids of these 50 clusters for the common pharmacophore search.

II.G Datasets

In this work, ﬁrst, we explore the effect of different objective values in our generator on the binding-afﬁnityprediction reliability to generated molecules. We carry out this reliability test on the Beta-Secretase 1 (BACE1)dataset.BACE1 is a transmembrane aspartic-acid protease human protein encoded by the BACE1 gene. It is essentialfor the generation of beta-amyloid peptide in neural tissue, a component of amyloid plaques widely believed tobe critical in the development of Alzheimer’s, rendering BACE1 an attractive therapeutic target for this devastating5isease. We download 3916 BACE1 compounds from the ChEMBL database. In the seq2seq autoencoderwe utilized, there is a molecule ﬁlter that only selects organic molecules with more than 3 heavy atoms, theirweights between 12 and 600, and their Log P values between -5 and 7. As a result, a total of 3151 moleculesin the BACE1 dataset pass this ﬁlter and are used as the training set.

Drug name ChEMBLID Treatment FDA ap-provalyear ∆ G (kcal/mol) Filteredtraining setsize ∆ G range oftraining set(kcal/mol) Ceritinib 2403108 Non-small cell lungcancer 2014 -10.77 1203 -5.26 to -13.93Ribociclib 3545110 Breast cancer 2017 -10.98 &-10.16 918 & 289 -6.31 to -12.98 &-4.11 to -14.13Acalabrutinib 3707348 Mantle celllymphoma 2017 -11.67 1451 -4.21 to -14.05Idelalisib 2216870 Blood cancers 2014 -10.43 1959 -1.92 to -14.16Macimorelin 278623 Adult growthhormone deﬁciency 2017 -10.48 608 -4.18 to -14.68Dabrafenib 2028663 Cancers with amutated gene BRAF 2013 -12.35 2254 -5.49 to -14.68Enzalutamide 1082407 Prostate cancer 2012 -10.53 1386 -3.83 to -13.72Panobinostat 483254 Cancers 2015 -12.46 1645 -2.91 to -12.46Table 1: The information about the eight market drugs used in the present study.More importantly, we employ our GNC to design alternative promising drug candidates for the eight drugs onthe market. For each drug, we construct a dataset of the compounds that bind to the same drug target from theChEMBL database. The collected compounds are also ﬁltered by the ﬁlter in the seq2seq autoencoder. Table 1lists information regarding these eight drugs, namely drug name, ChEMBL ID, FDA approval year, experimentaldrug afﬁnity ( ∆ G ), ﬁltered training set size, and afﬁnity range of the training set.The SciFinder is one of the most comprehensive databases for chemical literature and substances ( https://scifinder-n.cas.org ). It contains more than 143 million organic and inorganic substances. Here we searchour generated molecules in this database to conﬁrm they are new and never existed before. III ExperimentsIII.A Designing BACE1 inhibitorsIII.A.1 The accuracy of the seq2seq AE and predictors

We ﬁrst test the accuracy of the seq2seq autoencoder and the LV-BP and 2DFP-BP predictors.When performing the seq2seq model on the ﬁltered BACE1 dataset with 3151 molecules, the reconstructionratio is 96.2%. This high ratio guarantees the essential information of these input molecules is encoded into thecorresponding latent vectors.Subsequently, these latent vectors are used as the features to train our latent-vector DNN binding afﬁnitypredictor (LV-BP), the labels are their corresponding experimental binding afﬁnities. In a 5-fold cross-validationtest on the BACE1 dataset, the LV-BP achieves an average Pearson correlation coefﬁcient ( R p ) of 0.871 and anaverage RMSE of 0.704 kcal/mol.The 2D-ﬁngerprint GBDT binding afﬁnity predictor (2DFP-BP) is used to reevaluate the predictions from theLV-BP. We also test this 2DFP-BP by the 5-fold cross-validation. The average R p and RMSE are 0.874 and 0.692kcal/mol, respectively, quite comparable to the LV-BP. III.A.2 Convergence analysis

Here we conduct our GNC to generate new molecules and analyze how these new molecules evolve over ageneration course. We start this experiment with a seed molecule picked from the BACE1 dataset; this moleculeis far from active with the binding free energy ( ∆ G ) = -6.81 kcal/mol. The reference molecule is also from theBACE1 dataset, it is highly active with ∆ G = -12.02 kcal/mol. The binding afﬁnity objective y ∆ G is set to be-12.02 kcal/mol, and the similarity score to the reference molecule is targeted to y sim = 1 . .Figure 3a depicts the loss function values computed at every epoch; Figure 3b and c illustrate the LV-BPpredicted binding afﬁnities and similarity scores, respectively. From these ﬁgures, one can observe that ourGNC produces new potential BACE1 inhibitors with desirable binding afﬁnities in less than 3000 epochs.6igure 3: The convergence of the loss function, the LV-BP predicted ∆ G s, and the similarity score to the refer-ence molecule during a molecule generation course. In this example, the ∆ G s of the seed molecule and thereference molecule are -6.81 kcal/mol and -12.02 kcal/mol, respectively. To force generated molecules evolvingtowards the reference molecule, we set the similarity score objective and the ∆ G objective to be 1.00 and -12.02kcal/mol, respectively.Figure 4 shows a series of generated molecules over the evolution from the seed to the reference molecule.The starting point is the seed molecule, its binding afﬁnity and similarity score to the reference molecule areas low as -6.81 kcal/mol and 0.01, respectively. By receiving the feedback from the gradient descent in thegenerator, the similarity score gradually rises to 1.0. The improvement to the binding afﬁnity is even faster: whilea created molecule has a similarity score of 0.28, its LV-BP predicted ∆ G already reaches -11.30 kcal/mol; whilethe similarity score is 0.90, the LV-BP predicted ∆ G is -12.00 kcal/mol, which is essentially the same as thereference molecule’s ∆ G of -12.02 kcal/mol. III.A.3 Reliability test on the designed BACE1 inhibitors

Using our GNC, we also generate millions of compounds targeting a wide range of binding afﬁnities and similarityscores. Then the prediction reliability with these different ranges of binding afﬁnities and similarity scores istested. Individually, the similarity score objectives, y sim , vary from 0.50 to 0.95 with an increment of 0.025; thebinding afﬁnity objectives, y ∆ G , receive values from -9.6 kcal/mol to -13.1 kcal/mol with an increment of -0.25kcal/mol. Here we select -9.6 kcal/mol as the starting point since this value is a widely accepted threshold toidentify active compounds; the endpoint of ∆ G = -13.1 kcal/mol is the highest binding afﬁnity value in the BACE1dataset (see Figure 5).The reliability test is based on reevaluating the LV-BP predicted binding afﬁnities by the 2DFP-BP. The reliabilitycriteria are the RMSE, active ratio, and R p between the LV-BP and 2DFP-BP prediction. The heatmaps in Figure6 shows these evaluation metric values corresponding to different similarity score and binding afﬁnity restraints.Few blank points are present in each heatmap due to no available generation meeting these speciﬁc restraints.Figure 6a plots the RMSE metrics. It reveals the most reliable region, i.e., having low RMSE, is y sim above0.925, and the y ∆ G between -10.1 kcal/mol and -12.1 kcal/mol. This is expectable since machine learningmodels can render accurate predictions if generated structures are highly similar to the training data (see Figure5). Besides, a large population of training samples have ∆ G s between -10.1 kcal/mol and -12.1 kcal/mol, leading7igure 4: A series of generated molecules over the evolution from the seed to the reference molecule.more reliable predictions to molecules generated inside this range. Outside this range, as both y ∆ G and y sim decreasing, the RMSEs between the LV-BP and 2DFP-BP predictions increase. Speciﬁcally, if y ∆ G < − . kcal/mol and y sim < . , the RMSEs are always over 3.2 kcal/mol.Figure 6c depicts the R p s between the LV-BP and 2DFP-BP predictions with respect to y ∆ G and y sim . Similarto the manner of the RMSE distribution, − ≤ y ∆ G ≤ . kcal/mol and . ≤ y sim ≤ . lead to the8igure 5: The experimental ∆ G distribution of the ﬁltered BACE1 dataset. R p values consistently higher than 0.8.The last component of our reliable analysis regards the loss function magnitudes of our GNC generator plottedin Figure 6d. In most cases, the loss function values are less than 0.05. However, in some special situations,our network cannot maintain the loss values lower than 0.15. At these points, we cannot ﬁnd any generatedmolecules subject to the multi-property restraints simultaneously, which renders the blank spots in the criteriaplots in Figures 6a, b, and c.In summary, to generate molecules with reliable predictions, one should set the binding afﬁnity objective y ∆ G in a region ﬁlled with a large population of training data. Besides, the similarity score restraint y sim should behigh. However, in some circumstances, the generated molecules that have high predicted afﬁnities should alsobe included in further consideration. III.B Designing alternative drug candidates

In this section, we utilize our GNC to produce alternative drug-like molecules with high binding afﬁnities to theexisting drugs’ targets, which provides effective libraries for further improvement or searching cheaper drug al-ternatives. This work discusses eight drugs and their targets with the information regarding the names, ChEMBLIDs, energies, etc. summarized in Table 1. All of these drugs were approved by the FDA in the recent decadeto treat critical diseases, especially a variety of cancers. Notably, the drug Ribociclib has two different targets(Cyclin-dependent kinase 4 and Cyclin-dependent kinase 6), so Ribociclib has two sets of ∆ G s and two sets oftraining compounds. III.B.1 The single-target drug: CeritinibThe statistics of the drug Ceritinib.

The brand name of Ceritinib (ChEMBL ID: CHEMBL2403108) is Zykadia.It was developed by Novartis and approved by the FDA in April 2014 to treat different types of non-small cell lungcancers. Ceritinib is extremely expensive, with the monthly cost of Ceritinib-based treatment in the US beingapproximately $11,428.Ceritinib inhibits the ALK tyrosine kinase receptor (ALK, CHEMBL ID: CHEMBL4247). The ChEMBL databaseprovides 1407 molecules with experimental binding afﬁnity labels to this target available. After going through theﬁlter in our model, 1203 molecules are left to train our generator and predictor. Figure 7a depicts the ∆ G distribution of this training set; it unveils that, hundreds of training samples have their binding afﬁnities close to oreven higher than Ceritinib. The similarity score distribution in Figure 7b indicates, the training set includes 355samples with their similarity scores to Ceritinib over 0.3, 56 samples with such scores over 0.6. These promisinganalytical assessments enable our GNC to design potential inhibitors to the target ALK. Designing new druglike molecules.

Here we use Ceritinib as the reference molecule to design alternativeCeritinib drugs. Section III.A.3 suggests, to generate new molecules with desirable properties, the bindingafﬁnity objectives should be inside a region with plenty of training data, and the similarity scores to the referencemolecule y sim should be restrained to be high. However, high similarity scores could lead to quite limited chemicalspace. Our solution to this drawback is, ﬁrst, extend similarity score restraint to a broader range, then reevaluategenerated compounds using the 2DFP-BP, and only pick the ones with low discrepancies between the LV-BP9igure 6: The reliability test to our GNC generator on the BACE1 dataset. The prediction reliability with differentbinding afﬁnity objectives ( y ∆ G ) and similarity score objectives ( y sim ) is tested. The discrepancies between theLV-BP and the 2DFP-BP predictions are evaluated by different criteria: (a) The RMSE; (b) The activate ratio; (c)The Pearson correlation; (d) The loss function values.and 2DFP-BP predicted afﬁnities.Following this strategy, we set the similarity score restraints varying from 0.3 to 0.9 with an increment of 0.025;this is because more than 300 training samples have their similarity scores over 0.3, which is supportive togenerate compounds in this similarity range. The binding afﬁnities are aimed at the interval from -10.5 kcal/molto -12.25 kcal/mol with an increment of -0.25 kcal/mol; this binding afﬁnity region covers hundreds of trainingsamples as well as the drug Ceritinib itself. After reevaluating by the 2DFP-BP, any generated molecules withthe relative errors between the LV-BP and 2DFP-BP predicted afﬁnities over 5 % are thrown away. The 2DFP-BP reevaluation.

The details of the 2DFP-BP are offered in Section II.D. With Ceritinib as the ref-erence, our GNC model creates 1095 novel active drug-like molecules, upon eliminating the ones with highdiscrepancies in their LV-BP and 2DFP-BP predictions, 629 molecules are left. Figure 8 indicates, for these 629molecules, the correlation between the two predictions is quite promising, with the RMSE = 0.28 kcal/mol, R p =0.80, and active ratio = 0.95, respectively. This statistical information endorses the drug-likeness potential of ourAI-generated molecules.The LV-BP and 2DFP-BP predicted binding afﬁnities of these 629 molecules are also averaged, and theirdistribution is shown in 9a. This ﬁgure reveals the preferred afﬁnities of the generated compounds is from -9.8kcal/mol to -10.8 kcal/mol, which is also the most popular afﬁnity region of the training samples. Figure 9b10igure 7: (a) The ∆ G distribution of the ﬁltered training set to the target ALK. The red bar indicates the intervalcontaining the ∆ G of Ceritinib. (b) The similarity scores of the other molecules in this set to Ceritinib.Figure 8: The correlation plot between the LV-BP and 2DFP-BP predicted ∆ G s of the generated molecules tothe target ALK, the ones having high relative errors between the two predictions (>5 %) are already eliminated.illustrates their similarity score distribution to the reference drug Ceritinib. Top 6 drug candidates.

Ranked by the average predicted binding afﬁnities of these 629 molecules, we selectthe top 6 drug candidates. Their 2D draws are plotted in Figure 10. The relative errors between their LV-BPand 2DFP-BP predictions are 1.3 %, 3.1 %, 0.1 %, 4.0 %, 3.5 %, 3.9 %, respectively. It is delighted to seethat their average predicted afﬁnities are much higher than that of the reference drug Ceritinib. Moreover, thesecandidates have similarity scores to the reference drug from 0.54 to 0.69. They are brand new and do not existin the SciFinder database. We also calculate their log P values and synthetic accessibility scores (SAS) usingRDKit, log S values by Alog PS 2.1, and report them in Figure 10; these values are comparable to that of thereference drug and in reasonable ranges. 11igure 9: The average predicted binding afﬁnities to the target ALK of the generated molecules and their similarityscores to the reference drug Ceritinib. (a) The distribution of the averages of the LV-BP and 2DFP-BP predicted ∆ G s to the target ALK. (b) The similarity score distribution to the reference drug Ceritinib. The interaction and pharmacophore analysis.

One can observe from Figure 10, even though the similarityscores are only between 0.56 and 0.69, these generated compounds share some moieties with the referencedrug Ceritinib. This designates these common moieties to possibly involve critical interactions with the bindingsite of the target.To verify our generated compounds still contain critical moieties to the binding, from experimental structures,we investigate the drug-target interactions, as well as the common pharmacophores among all the active com-pounds to the target. Figure 11a shows the interaction details between the drug Ceritinib and its target in the3D crystal structure of their complex (PDB ID 4MKC ). It reveals their interactions include, one hydrogen bondwith one N atom in the pyrimidine of the drug (marked by 1 in the Figure 11a) as the acceptor, one hydrogenbond with one N atom in the chain of the drug (marked by 2 in the Figure 11a) as the donor, and another hydro-gen bond with one O atom in the drug as the acceptor, one hydrophobic interaction between one benzene ring(marked by 3 in the Figure 11a) and the target, as well as other hydrophobic interactions.The common pharmacophore analysis in Figure 11b is consistent with the interaction analysis. The N atomin the pyrimidine of the drug (marked by 1 in the Figure 11a and 11b) and the N atom in the chain of the drug(marked by 2 in the Figure 11a and 11b)) are critical pharmacophores, they plays the roles of an acceptor anda donor of hydrogen bonds, respectively. Another pharmacophore is the benzene ring forming a hydrophobicinteraction with the target. The pharmacophore analysis also reveals more potential interaction modes, such asthe hydrophobic interaction between the Cl atom and target, and the hydrogen bond with the other N atom in thechain of the drug as its donor.As illustrated in Figure 11b, all these critical pharmacophores are retained in our top 6 generated compounds,which strongly supports that these compounds are potential inhibitors to the target. III.B.2 The double-target drug: RibociclibThe statistics of the drug Ribociclib.

Here, we test the generative power of our GNC on multi-target drugs. Inthis case study, the multi-property restraints consist of multiple binding afﬁnity criteria and the similarity score toa reference molecule. The drug we test here is Ribociclib (ChEMBL ID: CHEMBL3545110, brand name: Kisqali).It was developed by Novartis and Astex Pharmaceuticals and approved by the FDA in 2017 to treat certain kindsof breast cancers. The monthly cost of Ribociclib treatment is $10,950 in the US.Ribociclib inhibits two different targets, namely the Cyclin-dependent kinase 4 (CDK4, CHEMBL ID: CHEMBL331)and Cyclin-dependent kinase 6 (CDK6, CHEMBL ID: CHEMBL2508). In the ChEMBL database, 919 moleculeshave CDK4 binding data, 289 molecules have CDK6 binding data. After ﬁltered out, 918 and 289 molecules areretained, respectively, providing a small training set to the CDK6. Figure 12a and b plot the ∆ G distributions ofthese two training sets. It reveals, in both the CDK4 and CDK6 sets, hundreds of samples have binding afﬁnitiesclose to or even higher than Ribociclib. Figure 12c and d show the similarity score distributions to Ribociclib of12igure 10: The drug Ceritinib and its top 6 generated molecules. The predicted ∆ G s to the target ALK, similarityscores (Sim) to the drug, calculated log P, log S values, and synthetic accessibility scores (SASs) are alsopresent.the two training sets. Both these sets include more than 200 samples with the similarity scores to the drug over13igure 11: The pharmacophore analysis to the ALK tyrosine kinase’s binding site. (a) The interaction plotbetween the drug Ceritinib and ALK tyrosine kinase from the 3D experimental structure with the PDB ID 4MKC.(b) The 3D alignments of the common pharmacophores obtained from all the active compounds to the targetALK with the 3D experimental structure of the drug Ceritinib, the top 6 generated molecules are also aligned toit.0.3, more than 60 samples with such scores over 0.6. The multitask predictor.

Since the CDK6 training set is small, it is challenging to train an accurate predictor forthis target. However, the two targets, CDK4 and CDK6, are similar due to the calculation via SWISS-MODEL with the sequence identity being 71.1%. Therefore, multitask deep learning can enhance reliability.In our multitask architecture, the binding afﬁnity predictor in our generator offers two outputs, each for oneof the two targets, so the predictor is trained by the two training sets simultaneously. As a result, in a 5-foldcrossing-validation test, the multitask model signiﬁcantly improves the performance on the small dataset.Target CDK4 Target CDK6Single task Multitask Single task MultitaskLV-BP 0.791 0.804 0.524 0.8112FP-BP 0.824 0.836 0.485 0.779Table 2: The R p s of ∆ G predictions from the 5-fold cross-validation tests on the two targets of the drug Ribociclibby the single task and multitask predictors.As illustrated in Table 2, the target CDK4 with a 918-molecule training set does not beneﬁt so much from themultitask. However, to the target CDK6 only with a 289-molecule training set, the improvement is dramatic: the R p s rise from 0.524 to 0.811 by the LV-BP and from 0.485 to 0.779 by the 2FP-BP. These results demonstratethe efﬁciency of the multitask architecture. The generation of new druglike molecules.

To design alternative Ribociclib drugs in broader chemical space,we set the similarity score restraints to Ribociclib from 0.30 to 0.90, with an increment of 0.025. The bindingafﬁnities to the target CDK4 are aimed at the interval between -10.80 kcal/mol and -12.00 kcal/mol with anincrement of -0.2 kcal/mol; this interval covers the ∆ G s of Ribociclib as well as lots of other training samples.For the same reason, the objectives of the binding afﬁnities to the target CDK6 are set from -10.2 kcal/mol to-11.0 kcal/mol with an increment of -0.2 kcal/mol. Totally, following this scheme, we create 1080 novel molecules. The reevaluation by 2DFP-BP.

The predicted bind afﬁnities of these 1080 generated compounds are reeval-uated by the 2DFP-BP model incorporated with the multitask DNN. The parameters of this architecture areintroduced in Section II.E. After excluding the ones with high discrepancies between the LV-BP and 2DFP-BPpredictions, 271 molecules are left.Figure 13 depicts the correlation plots between their LV-BP and 2DFP-BP predicted ∆ G s to the two targets.The correlations of the ∆ G s to both the two targets are promising. Speciﬁcally, the ∆ G s to the target CDK4 canachieve an RMSE of 0.29 kcal/mol, a R p of 0.69, and an active ratio as high as 0.98; the ∆ G s to the target CDK6also have an RMSE of 0.27 kcal/mol, a R p of 0.65 and an active ratio of 0.78.Their LV-BP and 2DFP-BP predicted binding afﬁnities to the two targets are also averaged, and the distribu-14igure 12: (a) The experimental ∆ G distribution of the training set to the target CDK4 with the interval containingthe ∆ G of the drug Ribociclib to CDK4 marked in red. (b) The experimental ∆ G distribution of the CDK6 trainingset with the interval containing the ∆ G of Ribociclib to CDK4 in red. (c) The similarity score distribution of theother samples in the CDK4 training set to Ribociclib. (d) The similarity score distribution of the other samples inthe CDK6 set to Ribociclib.tions are shown in Figure 14a,b. Figure 14c illustrates their similarity score distribution to the reference drugRibociclib. Top 6 drug candidates.

According to the average predicted binding afﬁnities of these 271 molecules, we selectthe top 6 drug candidates. Figure 15 represents their 2D plots. To the target CDK4, the relative errors betweentheir LV-BP and 2DFP-BP predictions are 4.6 %, 3.0 %, 3.2 %, 0.2 %, 1.6 % and 0.7 %, respectively; to CDK6,the relative errors between the two predictions are 1.7 %, 1.2 %, 3.7 %, 3.4 %, 4.8 % and 1.7 %, respectively.Most of them have better binding afﬁnities than the reference drug Ribociclib. Such as, to CDK4, the afﬁnitiesof the ﬁrst and fourth compounds are predicted to be higher than that of Ribociclib, the other three have similarones with Ribociclib; to CDK6, all the top 6 candidates have better binding afﬁnities. Moreover, their similarityscores to the reference drug are between 0.65 and 0.75. They are not in the SciFinder database, which meansthese six candidates are novel. The log P, log S values and synthetic accessibility scores (SASs) calculated byRdKit and Alog PS are also showed in the ﬁgure; their log P, log S values and SASs are comparable to that ofthe reference drug.

The interaction and pharmacophore analysis.

Figure 16(a) shows the interaction details between the drugRibociclib and the target CDK6 in the 3D crystal structure of their complex (PDB ID 5L2T ). It indicates the maininteractions are hydrogen bonds and hydrophobic interactions. The pharmacophore analysis in Figure 16(b)15igure 13: The correlations between the LV-BP and 2DFP-BP predicted ∆ G s of the generated molecules to thetargets CDK4 and CDK6, the ones having high relative errors between the two predictions (>5 %) are alreadyeliminated.reveals that the critical pharmacophores are the pyrrolopyrimidine, pyridine, and cyclohexane, which can formhydrogen bonds and hydrophobic interactions with the binding site.Among the top 6 candidates, except the ﬁrst one, the others contain all these critical pharmacophores; theﬁrst one has most of them. This suggests all the six compounds are potential inhibitors to the targets. III.B.3 The tests on the other single-target drugs

We also apply our GNC to the rest six drugs listed in Table 1 and design novel drug candidates. The similarityscore restraints are from 0.30 to 0.90, with an increment of 0.025. The ∆ G objective ranges are chosen tocontain the ∆ G s of the drugs as well as lots of other training samples. We also only collect the generatedmolecules with the relative errors between the LV-BP and 2DFP-BP predicted ∆ G s below 5 %.Figure 17 indicates, since the ones with high discrepancies between the two predictions are eliminated, ourgenerated compounds to all these six drug targets have promising correlations.Now, we focus on whether highly active drug candidates are created or not. This relies on the binding afﬁnityand similarity score distributions of the training sets. Figure 18 provides the sizes and the ∆ G distributions ofthese six target-speciﬁed training sets; the red bars indicate the intervals containing the ∆ G s of the referencedrugs. Figure 19 plots their similarity score distributions.Figures 18a shows that the training set of the drug Acalabrutinib contains 1459 compounds. Among them,more than 400 compounds have higher binding afﬁnities than that of the drug, which is beneﬁcial to generatecompounds more potent than the drug. Also, Figure 19a reveals that in the training set, over 500 moleculeshave the similarity scores to the drug over 0.3, which is quite helpful to generate novel compounds with similarityscores also in this range. These promising statistical information helps to explain why our GNC can create asmany as 879 new compounds for this drug target with high conﬁdence, and 17 of them have higher or similarbinding afﬁnities with that of Acalabrutinib; the best ∆ G is -11.90 kcal/mol (see Figures S1 and S3).As revealed by 18b, the training set of the drug Idelalisib is larger than that of Acalabrutinib; and over 500molecules in this set have higher binding afﬁnities than that of Idelalisib. As a result, 73 of the 794 generatedcompounds are more potent than Idelalisib, with the best ∆ G being -11.27 kcal/mol (see Figures S1 and S4).Figure 18c exhibits that the dataset of Macimorelin is quite small. However, the correlation of the generatedmolecules is still satisfactory, and 13 generated molecules have similar ∆ G s with that of the drug (see FigureS1). The reason is, its training set contains 205 compounds with higher binding afﬁnities than that of the drug,and also 182 compounds with the similarity scores to the drug over 0.3.Figures 18d and e depict the datasets of the drugs Dabrafenib and Enzalutamide. The afﬁnity values of thesetwo drugs are close to the upper boundaries of their training sets’ afﬁnity domains. In other words, few moleculesin their training sets are more active than the drugs. Since the interpolation nature of machine learning models16igure 14: The average predicted ∆ G to the targets CDK4 and CDK6 of the generated molecules and theirsimilarity scores to the reference drug Ribociclib: (a, b) The distributions of the averages of the LV-BP and2DFP-BP predicted ∆ G s to the two targets, respectively. (c) The similarity score distribution to the referencedrug Ribociclib.tends to provide reliable predictions inside the populated range, it is tough to generate compounds more potentthan these drugs. Although their training set sizes and similarity scores are favorable, the numbers of generatedcompounds with better binding afﬁnities than that of the drugs are only 2 and 4, respectively (see Figures S1,S6 and S7).We perform our last experiment on the drug Panobinostat. Panobinostat is an extreme example: as illustratedin Figure 18f, its binding afﬁnity is the highest in its training set. Therefore, although our GNC model generates319 molecules, the top active one among them only reaches a ∆ G of -11.56 kcal/mol, which is far less potentthan the drug itself ( ∆ G =-12.46 kcal/mol).In the supporting information, Figures S3 to S8 provide the top six generated alternative compounds forthe drugs Acalabrutinib, Idelalisib, Macimorelin, Dabrafenib, Enzalutamide, and Panobinostat, as well as theirpredicted ∆ G s. IV Discussions

With the availability of deep learning technologies, more and more in silico molecule generation models havebeen proposed. These models can be classiﬁed into three categories: randomized output, controlled output, and17igure 15: The drug Ribociclib and its top 6 generated molecules. The predicted ∆ G s to the targets CDK4and CDK6, similarity scores (Sim) to the drug, calculated log P, log S values, and synthetic accessibility scores(SASs) are also reported.optimized output. One of the challenges is how to generate new molecules with desired chemical properties,especially drug-like molecules. Another challenge is how to improve the utility of in silico molecule generationwithout direct experimental validations. To address these challenges, we propose a new GNC to generate drug-18igure 16: (a) The illustration of the interactions between the drug Ribociclib and the target CDK6 extracted fromthe experimental structure (PDB ID 5L2T). (b) The 3D alignment of the common pharmacophores obtained fromall the active compounds to the target CDK6 with the 3D experimental structure of the drug Ribociclib, the top 6generated molecules are also aligned to it.like molecules based on multi-property optimizations via gradient descent.

IV.A The essential circumstances for reliable and desired molecule generation

Based on our experiments, there are two essential circumstances to generate molecules with reliable and desiredpredicted chemical properties:1. An objective property value should always be in a region with lots of training samples. Based on the natureof machine learning methods, the predictor can be built with high accuracy at an objective value if lots oftraining samples around it. It can be seen from Section III.A.3 that when a binding afﬁnity objective is in themiddle of the training set’s distribution, the latent space generator can always generate novel moleculeswith reliable predictions conﬁrmed by the 2DFP-BP. However, when an objective is close to or even at theedge of the training-set distribution, one may still generate many novel compounds, but the predictions tothem are quite risky, such as high discrepancies between the LV-BP and 2DFP-BP predictions.2. Generated compounds should have some high similarity scores to some existing molecules in the trainingset. If some molecules (not necessarily reference molecules) in the training set are similar to generatedcompounds, then the predictions are reliable and can be veriﬁed by the 2DFP-BP.

IV.B The necessity of both similarity and property value restraints

The two points above also explain why both similarity score and property value restraints should be included inour generator.The goal of property restraint is two-fold. First, it restricts property values to desired ones. Additionally, it canbe used to achieve high reliability. As discussed above, if the property value is limited to a populated region ofthe training-set distribution, the resulting generated molecules will most likely have reliable predictions.Similarity restraint is also to ensure prediction accuracy. High similarity scores to existing molecules makepredictions more accurate and reliable. Moreover, in a regular drug discovery procedure for a given target, onetypically starts from some lead compounds or even current drugs and then carries out lead optimizations to makecandidates more "drug-like", e.g., higher activity and lower side effects. Therefore, similarly, in a drug-speciﬁedgenerative model, the similarity to a reference compound or drug must be controlled to guarantee that newdrug-like compounds still bind to the target. For example, with similarity restraint, generated molecules sharepharmacophores with the reference drug.

IV.C Multiple property restraints

Drug design is sophisticated. To develop a drug, plenty of properties must be carefully studied, such as bindingafﬁnity, toxicity, partition coefﬁcient (log P), aqueous solubility (log S), off-target effect. The failure in any one ofthese properties can prevent drug candidates from the market. In other words, drug design is a multi-propertyoptimization process. 19igure 17: The correlation plots between the LV-BP and 2DFP-BP predicted ∆ G s of the generated molecules tothe six drug targets, the ones with high relative errors between the two predictions are already eliminated.Technically, our generator can handle this multi-property optimization. In our framework, the restraint to eachproperty is realized by one term in the loss function. Therefore, multi-property optimization can be satisﬁedsimultaneously in our GNC. In this work, multiple property restraints are tested on one drug with two targets(Ribociclib). It turns out, the generated new candidates have ideal binding afﬁnities to the two targets simulta-20igure 18: The sizes and experimental ∆ G distributions of the training sets to the six drug targets. The red barsindicate the intervals containing the ∆ G s of the six reference drugs.21igure 19: The similarity score distributions of other molecules in the six training sets to the reference drugs.22eously; this means, our model can work on the multiple property restraints. Multiple properties can also bespeciﬁed as toxicity, log P, log S, etc. To avoid side effects, one can also control drug candidates to have a highbinding afﬁnity to one target but low binding afﬁnities to other targets. V Conclusion

Searching alternative drugs is important for improving the quality of existing drugs and making new drugs cheaplyavailable to low-income people. In this work, we develop a new generative network complex (GNC) for auto-mated generation of drug-like molecules based on the multi-property optimization via gradient descent in thelatent space. In this new GNC, multiple chemical properties, particularly binding afﬁnity and similarity score, areoptimized to generate new molecules with desired chemical and drug properties. To ensure the prediction relia-bility of these new compounds, we reevaluate them by independent 2D-ﬁngerprint based predictors. Moleculeswithout consistent predictions from the latent-vector model and the 2D-ﬁngerprint model are not accepted. Afterthe consistent check, hundreds of potential potent drug candidates are reported. Performed on a supercom-puter, the generation process from one seed to a large number of new molecules takes less than 10 minutes.Therefore, our GNC is an efﬁcient new paradigm for discovering new drug candidates. To demonstrate the utilityof the present GNC, we ﬁrst test its reliability on the BACE1 target and then, further apply this model to gener-ate thousands of new alternative drug candidates for a few market existing drugs, namely, Ceritinib, Ribociclib,Acalabrutinib, Idelalisib, Dabrafenib, Macimorelin, Enzalutamide, and Panobinostat.We also discuss the keys to generate drug-like candidates with reliable predictions. First, an objective propertyvalue should be in a populated region of the training-set distribution. Second, generated molecules need to havegood similarity scores to some existing compounds in the training set.

Supplementary materials

Figures S1 to S8 are the ∆ G and similarity score distributions of the generated drug-like molecules for the drugsAcalabrutinib, Idelalisib, Dabrafenib, Macimorelin, Enzalutamide, and Panobinostat, as well as the 2D plots andpredicted ∆ G s of the top 6 among them. Acknowledgments

This work was supported in part by NSF Grants DMS-1721024, DMS-1761320, and IIS1900473, NIH grantGM126189, and Michigan Economic Development Corporation. DDN and GWW are also funded by Bristol-Myers Squibb and Pﬁzer. 23 eferences [1] Joseph A DiMasi, Henry G Grabowski, and Ronald W Hansen. Innovation in the pharmaceutical industry:new estimates of R&D costs.

Journal of Health Economics , 47:20–33, 2016.[2] James P Hughes, Stephen Rees, S Barrett Kalindjian, and Karen L Philpott. Principles of early drugdiscovery.

British Journal of Pharmacology , 162(6):1239–1249, 2011.[3] Ricardo Macarron, Martyn N Banks, Dejan Bojanic, David J Burns, Dragan A Cirovic, Tina Garyantes,Darren VS Green, Robert P Hertzberg, William P Janzen, Jeff W Paslay, et al. Impact of high-throughputscreening in biomedical research.

Nature Reviews Drug Discovery , 10(3):188, 2011.[4] Ann I Graul, Laura Revel, Esmeralda Rosa, and Elisabet Cruces. Overcoming the obstacles in thepharma/biotech industry: 2008 update.

Drug News Perspect , 22(1):39, 2009.[5] Mahmud Tareq Hassan Khan. Predictions of the admet properties of candidate drug molecules utilizingdifferent qsar/qspr modelling approaches.

Current Drug Metabolism , 11(4):285–295, 2010.[6] Michael J Waring, John Arrowsmith, Andrew R Leach, Paul D Leeson, Sam Mandrell, Robert M Owen,Garry Pairaudeau, William D Pennie, Stephen D Pickett, Jibo Wang, et al. An analysis of the attrition ofdrug candidates from four major pharmaceutical companies.

Nature Reviews Drug Discovery , 14(7):475,2015.[7] Kristian Strømgaard, Povl Krogsgaard-Larsen, and Ulf Madsen.

Textbook of drug design and discovery .CRC Press, 2017.[8] Saeed Alqahtani. In silico adme-tox modeling: progress and prospects.

Expert opinion on drug metabolism& toxicology , 13(11):1147–1158, 2017.[9] KV Balakin, YA Ivanenkov, and NP Savchuk. Compound library design for target families. In

Chemoge-nomics , pages 21–46. Springer, 2009.[10] IM Kapetanovic. Computer-aided drug discovery and development (CADDD): in silico-chemico-biologicalapproach.

Chemico-Biological Interactions , 171(2):165–176, 2008.[11] Renjie Huang and Ivanhoe Leung. Protein-directed dynamic combinatorial chemistry: a guide to proteinligand and inhibitor discovery.

Molecules , 21(7):910, 2016.[12] Paweł Szyma ´nski, Magdalena Markowicz, and El˙zbieta Mikiciuk-Olasik. Adaptation of high-throughputscreening in drug discovery-toxicological screening tests.

International Journal of Molecular Sciences ,13(1):427–452, 2012.[13] Gisbert Schneider and Uli Fechner. Computer-based de novo design of drug-like molecules.

Nature Re-views Drug Discovery , 4(8):649, 2005.[14] Seonwoo Min, Byunghan Lee, and Sungroh Yoon. Deep learning in bioinformatics.

Brieﬁngs in Bioinfor-matics , 18(5):851–869, 2017.[15] Polina Mamoshina, Armando Vieira, Evgeny Putin, and Alex Zhavoronkov. Applications of deep learning inbiomedicine.

Molecular Pharmaceutics , 13(5):1445–1454, 2016.[16] Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In

Advances in neural information processing systems , pages 3104–3112, 2014.[17] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 ,2013.[18] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. Adversarial autoen-coders. arXiv preprint arXiv:1511.05644 , 2015. 2419] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, AaronCourville, and Yoshua Bengio. Generative adversarial nets. In

Advances in neural information process-ing systems , pages 2672–2680, 2014.[20] Leslie Pack Kaelbling, Michael L Littman, and Andrew W Moore. Reinforcement learning: A survey.

Journalof Artiﬁcial Intelligence Research , 4:237–285, 1996.[21] Robin Winter, Floriane Montanari, Andreas Steffen, Hans Briem, Frank Noé, and Djork-Arné Clevert. Efﬁ-cient multi-objective molecular optimization in a continuous latent space.

Chemical Science , 10(34):8016–8024, 2019.[22] Robin Winter, Floriane Montanari, Frank Noé, and Djork-Arné Clevert. Learning continuous and data-drivenmolecular descriptors by translating equivalent chemical representations.

Chemical Science , 10(6):1692–1701, 2019.[23] Rafael Gómez-Bombarelli, Jennifer N Wei, David Duvenaud, José Miguel Hernández-Lobato, BenjamínSánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D Hirzel, Ryan P Adams, andAlán Aspuru-Guzik. Automatic chemical design using a data-driven continuous representation of molecules.

ACS Central Science , 4(2):268–276, 2018.[24] Miha Skalic, José Jiménez, Davide Sabbadin, and Gianni De Fabritiis. Shape-based generative modelingfor de novo drug design.

Journal of Chemical Information and Modeling , 59(3):1205–1214, 2019.[25] Artur Kadurin, Sergey Nikolenko, Kuzma Khrabrov, Alex Aliper, and Alex Zhavoronkov. drugan: an advancedgenerative adversarial autoencoder model for de novo generation of new molecules with desired molecularproperties in silico.

Molecular Pharmaceutics , 14(9):3098–3104, 2017.[26] Boris Sattarov, Igor I Baskin, Dragos Horvath, Gilles Marcou, Esben Jannik Bjerrum, and Alexandre Varnek.De novo molecular design by combining deep autoencoder recurrent neural networks with generative topo-graphic mapping.

Journal of Chemical Information and Modeling , 59(3):1182–1196, 2019.[27] Marcus Olivecrona, Thomas Blaschke, Ola Engkvist, and Hongming Chen. Molecular de-novo designthrough deep reinforcement learning.

Journal of Cheminformatics , 9(1):48, 2017.[28] Mariya Popova, Olexandr Isayev, and Alexander Tropsha. Deep reinforcement learning for de novo drugdesign.

Science Advances , 4(7):eaap7885, 2018.[29] Seokho Kang and Kyunghyun Cho. Conditional molecular design with deep generative models.

Journal ofChemical Information and Modeling , 59(1):43–52, 2018.[30] Zhenpeng Zhou, Steven Kearnes, Li Li, Richard N Zare, and Patrick Riley. Optimization of molecules viadeep reinforcement learning.

Scientiﬁc Reports , 9(1):10752, 2019.[31] Edward H Kerns, Li Di, and Guy T Carter. In vitro solubility assays in drug discovery.

Current DrugMetabolism , 9(9):879–885, 2008.[32] Albert Leo, Corwin Hansch, and David Elkins. Partition coefﬁcients and their uses.

Chemical Reviews ,71(6):525–616, 1971.[33] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, HolgerSchwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statis-tical machine translation. arXiv preprint arXiv:1406.1078 , 2014.[34] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory.

Neural Computation , 9(8):1735–1780,1997.[35] David Weininger. Smiles, a chemical language and information system. 1. introduction to methodology andencoding rules.

Journal of Chemical Information and Computer Sciences , 28(1):31–36, 1988.2536] Christopher A Lipinski, Franco Lombardo, Beryl W Dominy, and Paul J Feeney. Experimental and com-putational approaches to estimate solubility and permeability in drug discovery and development settings.

Advanced Drug Delivery Reviews , 23(1-3):3–25, 1997.[37] Anna Gaulton, Louisa J Bellis, A Patricia Bento, Jon Chambers, Mark Davies, Anne Hersey, Yvonne Light,Shaun McGlinchey, David Michalovich, Bissan Al-Lazikani, et al. Chembl: a large-scale bioactivity databasefor drug discovery.

Nucleic Acids Research , 40(D1):D1100–D1107, 2011.[38] Adam Paszke, Sam Gross, Soumith Chintala, and Gregory Chanan. Pytorch: Tensors and dynamic neuralnetworks in python with strong gpu acceleration.

PyTorch: Tensors and dynamic neural networks in Pythonwith strong GPU acceleration , 6, 2017.[39] Kaifu Gao, Duc Duy Nguyen, Vishnu Sresht, Alan M. Mathiowetz, Meihua Tu, and Guo-Wei Wei. Are 2dﬁngerprints still valuable for drug discovery?

Phys. Chem. Chem. Phys. , 22:8373–8390, 2020.[40] David Rogers and Mathew Hahn. Extended-connectivity ﬁngerprints.

Journal of Chemical Information andModeling , 50(5):742–754, 2010.[41] Lowell H Hall and Lemont B Kier. Electrotopological state indices for atom types: a novel combinationof electronic, topological, and valence state information.

Journal of Chemical Information and ComputerSciences , 35(6):1039–1045, 1995.[42] Greg Landrum et al. Rdkit: Open-source Cheminformatics, 2006.[43] Robert E Schapire. The boosting approach to machine learning: An overview. In

Nonlinear Estimation andClassiﬁcation , pages 149–171. Springer, 2003.[44] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel,Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning inpython.

Journal of Machine Learning Research , 12(Oct):2825–2830, 2011.[45] Rich Caruana. Multitask learning.

Machine Learning , 28(1):41–75, 1997.[46] Sebastian Salentin, Sven Schreiber, V Joachim Haupt, Melissa F Adasme, and Michael Schroeder. Plip:fully automated protein–ligand interaction proﬁler.

Nucleic Acids Research , 43(W1):W443–W447, 2015.[47] Steven L Dixon, Alexander M Smondyrev, Eric H Knoll, Shashidhar N Rao, David E Shaw, and Richard AFriesner. Phase: a new engine for pharmacophore perception, 3d qsar model development, and 3ddatabase screening: 1. methodology and preliminary results.

Journal of Computer-aided Molecular De-sign , 20(10-11):647–671, 2006.[48] Robert Vassar, Dora M Kovacs, Riqiang Yan, and Philip C Wong. The β -secretase enzyme bace in healthand alzheimer’s disease: regulation, cell biology, function, and therapeutic potential. Journal of Neuro-science , 29(41):12787–12794, 2009.[49] Federica Prati, Giovanni Bottegoni, Maria Laura Bolognesi, and Andrea Cavalli. Bace-1 inhibitors: Fromrecent single-target molecules to multitarget compounds for alzheimer’s disease: Miniperspective.

Journalof Medicinal Chemistry , 61(3):619–637, 2017.[50] Igor V Tetko, Johann Gasteiger, Roberto Todeschini, Andrea Mauri, David Livingstone, Peter Ertl, Vladimir APalyulin, Eugene V Radchenko, Nikolay S Zeﬁrov, Alexander S Makarenko, et al. Virtual computationalchemistry laboratory–design and description.

Journal of Computer-aided Molecular Design , 19(6):453–463, 2005.[51] Luc Friboulet, Nanxin Li, Ryohei Katayama, Christian C Lee, Justin F Gainor, Adam S Crystal, Pierre-YvesMichellys, Mark M Awad, Noriko Yanagitani, Sungjoon Kim, et al. The alk inhibitor ceritinib overcomescrizotinib resistance in non–small cell lung cancer.

Cancer Discovery , 4(6):662–673, 2014.[52] Torsten Schwede, Jurgen Kopp, Nicolas Guex, and Manuel C Peitsch. Swiss-model: an automated proteinhomology-modeling server.

Nucleic Acids Research , 31(13):3381–3385, 2003.2653] Ping Chen, Nathan V Lee, Wenyue Hu, Meirong Xu, Rose Ann Ferre, Hieu Lam, Simon Bergqvist, JamesSolowiej, Wade Diehl, You-Ai He, et al. Spectrum and degree of CDK drug interactions predicts clinicalperformance.

Molecular Cancer Therapeutics , 15(10):2273–2281, 2016.[54] Christopher Grow, Kaifu Gao, Duc Duy Nguyen, and Guo-Wei Wei. Generative network complex (gnc) fordrug discovery.