Journal of Chemical Information and Modeling | 2019

Prediction of CYP450 Enzyme-Substrate Selectivity Based on the Network-Based Label Space Division Method


Abstract


A drug may be metabolized by multiple CYP450 isoforms. Predicting the metabolic fate of drugs is important for preventing drug-drug interactions during the development of novel pharmaceuticals. In this study, prediction of CYP450 enzyme-substrate selectivity is formulated as a multi-label learning task. First, we compared the modeling performance of feature combinations drawn from four categories of features: physicochemical property descriptors (PC), mol2vec descriptors (M2V), Extended Connectivity Fingerprints (ECFP), and Molecular ACCess System (MACCS) keys fingerprints. After identifying the best feature combination, we applied seven multi-label models: ML-kNN, MLTSVM, and five methods based on Network-based Label Space Division (NLSD), namely NLSD-MLP, NLSD-XGB, NLSD-EXT, NLSD-RF, and NLSD-SVM. Six of these models (ML-kNN, NLSD-MLP, NLSD-XGB, NLSD-EXT, NLSD-RF, NLSD-SVM) outperform the previous work. NLSD-XGB performs best, with an average top-1 prediction success of 91.1%, an average top-2 prediction success of 96.2%, and an average top-3 prediction success of 98.2%. Compared with the previous work, NLSD-XGB shows a significant improvement of over 11% in top-1 success in a 10-times repeated 5-fold cross-validation test and of over 14% in top-1 success with a 10-times repeated hold-out method. To the best of our knowledge, this is the first time the Network-based Label Space Division model has been introduced in drug metabolism, and it performs well in this task.
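The top-k prediction success figures quoted above can be made concrete with a small sketch. The abstract does not spell out the metric, so the definition used here is an assumption: a compound counts as a top-k success when at least one of its k highest-scoring isoforms is a true metabolizing isoform. The function name `top_k_success`, the isoform labels, and all scores below are hypothetical illustration data, not values from the paper.

```python
from typing import Dict, List, Set


def top_k_success(
    scores: List[Dict[str, float]],
    truths: List[Set[str]],
    k: int,
) -> float:
    """Fraction of compounds with at least one true isoform among the
    k highest-scoring predictions (assumed definition, see lead-in)."""
    if len(scores) != len(truths):
        raise ValueError("scores and truths must have the same length")
    if not scores:
        raise ValueError("at least one compound is required")
    hits = 0
    for compound_scores, true_isoforms in zip(scores, truths):
        # Rank isoform labels by predicted score, highest first, keep the top k.
        ranked = sorted(compound_scores, key=compound_scores.get, reverse=True)[:k]
        if true_isoforms.intersection(ranked):
            hits += 1
    return hits / len(scores)


# Hypothetical per-isoform scores for three compounds (illustration only).
SCORES = [
    {"CYP3A4": 0.9, "CYP2D6": 0.4, "CYP2C9": 0.1},
    {"CYP3A4": 0.6, "CYP2D6": 0.5, "CYP2C9": 0.2},
    {"CYP3A4": 0.7, "CYP2D6": 0.2, "CYP2C9": 0.3},
]
# Hypothetical true isoform sets for the same three compounds.
TRUTHS = [{"CYP3A4"}, {"CYP2D6"}, {"CYP2D6"}]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"top-{k} success: {top_k_success(SCORES, TRUTHS, k):.3f}")
```

Under this definition the success rate can only stay the same or rise as k grows, which is consistent with the ordering of the reported top-1, top-2, and top-3 values.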

DOI 10.1021/acs.jcim.9b00749
Language English
Journal Journal of Chemical Information and Modeling
