Publication


Featured research published by Philip W. Lee.


Journal of Chemical Information and Modeling | 2012

admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties.

Feixiong Cheng; Weihua Li; Yadi Zhou; Jie Shen; Zengrui Wu; Guixia Liu; Philip W. Lee; Yun Tang

Absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties play key roles in the discovery and development of drugs, pesticides, food additives, consumer products, and industrial chemicals. This information is especially useful when conducting environmental and human hazard assessments. The most critical rate-limiting step in the chemical safety assessment workflow is the availability of high-quality data. This paper describes an ADMET structure-activity relationship database, abbreviated as admetSAR. It is an open-source, text- and structure-searchable, continually updated database that collects, curates, and manages available ADMET-associated property data from the published literature. In admetSAR, over 210,000 ADMET-annotated data points for more than 96,000 unique compounds with 45 kinds of ADMET-associated properties, proteins, species, or organisms have been carefully curated from a large body of diverse literature. The database provides a user-friendly interface for querying a specific chemical profile by CAS registry number, common name, or structure similarity. In addition, the database includes 22 qualitative classification models and 5 quantitative regression models with high predictive accuracy, allowing estimation of ecological and mammalian ADMET properties for novel chemicals. admetSAR is accessible free of charge at http://www.admetexp.org.


Journal of Chemical Information and Modeling | 2011

Classification of Cytochrome P450 Inhibitors and Noninhibitors Using Combined Classifiers

Feixiong Cheng; Yue Yu; Jie Shen; Lei Yang; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Adverse side effects of drug-drug interactions induced by human cytochrome P450 (CYP) inhibition are an important consideration, especially during the research phase of drug discovery. It is highly desirable to develop computational models that can predict the inhibitory effect of a compound against a specific CYP isoform. In this study, inhibitor prediction models were developed for five major CYP isoforms, namely 1A2, 2C9, 2C19, 2D6, and 3A4, using a combined classifier algorithm on a large data set containing more than 24,700 unique compounds extracted from PubChem. The combined classifier algorithm is an ensemble of independent machine learning classifiers, including support vector machine, C4.5 decision tree, k-nearest neighbor, and naïve Bayes, fused by a back-propagation artificial neural network (BP-ANN). All developed models were validated by 5-fold cross-validation and a validation set composed of about 9000 diverse unique compounds. Using the newly developed combined classifiers, the area under the receiver operating characteristic curve (AUC) for the validation sets ranged from 0.764 to 0.815 for CYP1A2, 0.837 to 0.861 for CYP2C9, 0.793 to 0.842 for CYP2C19, 0.839 to 0.886 for CYP2D6, and 0.754 to 0.790 for CYP3A4. The overall performance of the combined classifiers fused by BP-ANN was superior to that of three classic fusion techniques (Mean, Maximum, and Multiply). The chemical spaces of the data sets were explored by multidimensional scaling plots, and the use of an applicability domain improved the prediction accuracies of the models. In addition, some representative substructure fragments differentiating CYP inhibitors from noninhibitors were characterized by substructure fragment analysis. These classification models are applicable to virtual screening for inhibitors of the five major CYP isoforms or can be used as simple filters of potential chemicals in drug discovery.
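
The three classic fusion rules named in this abstract (Mean, Maximum, and Multiply) can be illustrated with a minimal sketch. It assumes each base classifier outputs a probability that a compound is an inhibitor; the function name `fuse` and the example probabilities are hypothetical, and the BP-ANN fusion that the paper reports as superior is not reproduced here.

```python
from math import prod


def fuse(probabilities, rule="mean"):
    """Combine per-classifier inhibitor probabilities for one compound.

    `probabilities` holds one value in [0, 1] per base classifier.
    """
    if not probabilities:
        raise ValueError("need at least one classifier output")
    if rule == "mean":
        return sum(probabilities) / len(probabilities)
    if rule == "maximum":
        return max(probabilities)
    if rule == "multiply":
        return prod(probabilities)
    raise ValueError(f"unknown fusion rule: {rule}")


# Hypothetical outputs of four base classifiers for a single compound
# (for example SVM, C4.5 decision tree, kNN, and naive Bayes).
base_outputs = [0.9, 0.6, 0.8, 0.5]
fused = {rule: fuse(base_outputs, rule) for rule in ("mean", "maximum", "multiply")}
```

The Multiply rule shrinks quickly as classifiers are added, so its fused scores are usually compared by rank or against a rule-specific threshold rather than a fixed 0.5 cutoff.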


Journal of Chemical Information and Modeling | 2012

In silico prediction of chemical Ames mutagenicity.

Congying Xu; Feixiong Cheng; Lei Chen; Zheng Du; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Mutagenicity is one of the most important end points of toxicity. Because experimental tests are costly and laborious, it is necessary to develop robust in silico methods to predict chemical mutagenicity. In this paper, a comprehensive database containing 7617 diverse compounds, including 4252 mutagens and 3365 nonmutagens, was constructed. On the basis of this data set, highly predictive models were then built using five machine learning methods, namely support vector machine (SVM), C4.5 decision tree (C4.5 DT), artificial neural network (ANN), k-nearest neighbors (kNN), and naïve Bayes (NB), along with five fingerprints, namely CDK fingerprint (FP), Estate fingerprint (Estate), MACCS keys (MACCS), PubChem fingerprint (PubChem), and Substructure fingerprint (SubFP). Performance was measured by cross-validation and an external test set containing 831 diverse chemicals. Information gain and substructure analysis were used to interpret the models. The accuracies of 5-fold cross-validation ranged from 0.808 to 0.841 for the top five models. The accuracy for the external validation set ranged from 0.904 to 0.980, which outperformed that of Toxtree. Three models (PubChem-kNN, MACCS-kNN, and PubChem-SVM) showed high and reliable predictive accuracy for both mutagens and nonmutagens and, hence, could be used in the prediction of chemical Ames mutagenicity.
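
Information gain, used in this abstract to interpret the models, measures how much knowing whether a substructure is present reduces uncertainty about the class label. The sketch below uses the standard Shannon-entropy definition on a toy data set; the function names and the example vectors are illustrative assumptions, not data or code from the paper.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * log2(p)
    return h


def information_gain(has_fragment, labels):
    """Entropy reduction from splitting compounds by fragment presence."""
    n = len(labels)
    present = [y for x, y in zip(has_fragment, labels) if x]
    absent = [y for x, y in zip(has_fragment, labels) if not x]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


# Toy labels: 1 = mutagen, 0 = nonmutagen (hypothetical compounds).
labels = [1, 1, 1, 0, 0, 0]
perfect_fragment = [1, 1, 1, 0, 0, 0]        # present only in mutagens
uninformative_fragment = [1, 0, 0, 1, 0, 0]  # same class mix on both sides

ig_perfect = information_gain(perfect_fragment, labels)
ig_uninformative = information_gain(uninformative_fragment, labels)
```

A fragment that separates the classes perfectly reaches the full label entropy (1 bit for a balanced binary set), while a fragment distributed identically across both classes scores zero.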


Chemosphere | 2011

In silico prediction of Tetrahymena pyriformis toxicity for diverse industrial chemicals with substructure pattern recognition and machine learning methods

Feixiong Cheng; Jie Shen; Yue Yu; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

There is an increasing need for the rapid safety assessment of chemicals by both industries and regulatory agencies throughout the world. In silico techniques are practical alternatives in environmental hazard assessment. This is especially true when addressing the persistence, bioaccumulation, and toxicity potentials of organic chemicals. Tetrahymena pyriformis toxicity is often used as a toxic endpoint. In this study, 1571 diverse unique chemicals were collected from the literature to compose the largest diverse data set for T. pyriformis toxicity. Classification models of T. pyriformis toxicity were developed by substructure pattern recognition and different machine learning methods, including support vector machine (SVM), C4.5 decision tree, k-nearest neighbors, and random forest. The results of a 5-fold cross-validation showed that the SVM method performed better than the other algorithms. The overall predictive accuracy of the SVM classification model with a radial basis function kernel was 92.2% for the 5-fold cross-validation and 92.6% for the external validation set. Furthermore, several representative substructure patterns for characterizing T. pyriformis toxicity were identified via information gain analysis.
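
The 5-fold cross-validation mentioned in this abstract can be sketched in plain Python. As a stated simplification, the example replaces the paper's SVM with a 1-nearest-neighbor classifier on Tanimoto similarity, and it represents fingerprints as sets of on-bit indices; all names and the toy data are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def predict_1nn(train_x, train_y, query):
    """Return the label of the most similar training fingerprint."""
    best = max(range(len(train_x)), key=lambda i: tanimoto(train_x[i], query))
    return train_y[best]


def cross_val_accuracy(x, y, k=5):
    """Fraction of compounds predicted correctly while held out."""
    correct = 0
    for fold in k_fold_indices(len(x), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(predict_1nn(train_x, train_y, x[i]) == y[i] for i in fold)
    return correct / len(x)


# Toy data: toxic compounds share bits {1, 2}; nontoxic ones share {7, 8}.
toxic = [{1, 2, 10 + i} for i in range(5)]
nontoxic = [{7, 8, 20 + i} for i in range(5)]
x = toxic + nontoxic
y = [1] * 5 + [0] * 5
accuracy = cross_val_accuracy(x, y, k=5)
```

Real data sets are normally shuffled (or stratified by class) before folding; the contiguous split here keeps the sketch deterministic.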


Journal of Chemical Information and Modeling | 2012

In Silico Assessment of Chemical Biodegradability

Feixiong Cheng; Yutaka Ikenaga; Yadi Zhou; Yue Yu; Weihua Li; Jie Shen; Zheng Du; Lei Chen; Congying Xu; Guixia Liu; Philip W. Lee; Yun Tang

Biodegradation is the principal environmental dissipation process. Because comprehensive experimental data are lacking and experimental studies are costly and time-consuming, in silico approaches for assessing the biodegradability profiles of chemicals are encouraged and are an active research topic. Here we developed in silico methods to estimate chemical biodegradability in the environment. First, 1440 diverse compounds tested under the Japanese Ministry of International Trade and Industry (MITI) protocol were used. Four different methods, namely support vector machine, k-nearest neighbor, naïve Bayes, and C4.5 decision tree, were used to build combinatorial classification probability models of ready versus not ready biodegradability, using physicochemical descriptors and fingerprints separately. The overall predictive accuracies of the best models were more than 80% for the external test set of 164 diverse compounds. Some privileged substructures were further identified for ready or not ready biodegradable chemicals by combining information gain and substructure fragment analysis. Moreover, 27 newly predicted chemicals were selected for experimental assay through the Japanese MITI test protocol, which confirmed that all 27 compounds were predicted correctly. The predictive accuracies of our models outperformed those of the commonly used EPI Suite software. Our study provides critical tools for the early assessment of the biodegradability of new organic chemicals in environmental hazard assessment.


Journal of Chemical Information and Modeling | 2011

Insights into Molecular Basis of Cytochrome P450 Inhibitory Promiscuity of Compounds

Feixiong Cheng; Yue Yu; Yadi Zhou; Zhonghua Shen; Wen Xiao; Guixia Liu; Weihua Li; Philip W. Lee; Yun Tang

Cytochrome P450 inhibitory promiscuity of a drug has potential effects on the occurrence of clinical drug-drug interactions. Understanding how a molecular property is related to P450 inhibitory promiscuity could help to avoid such adverse effects. In this study, an entropy-based index was defined to quantify the P450 inhibitory promiscuity of a compound based on a comprehensive data set containing more than 11,500 drug-like compounds with inhibition data against five major P450 isoforms: 1A2, 2C9, 2C19, 2D6, and 3A4. The results indicated that the P450 inhibitory promiscuity of a compound has a moderate correlation with molecular aromaticity, a minor correlation with molecular lipophilicity, and no relation to molecular complexity, hydrogen bonding ability, or TopoPSA. We also applied an index to quantify the susceptibilities of different P450 isoforms to inhibition based on the same data set. The results showed a surprising level of P450 inhibitory promiscuity even for substrate-specific P450s; susceptibility to inhibition followed the rank order 1A2 > 2C19 > 3A4 > 2C9 > 2D6. There was essentially no correlation between P450 inhibitory potency and specificity, and there were minor negative trade-offs between P450 inhibitory promiscuity and catalytic promiscuity. In addition, classification models were built to predict the P450 inhibitory promiscuity of new chemicals using a support vector machine algorithm with different fingerprints. The area under the receiver operating characteristic curve of the best model was about 0.9, as evaluated by 5-fold cross-validation. These findings should be helpful for understanding the mechanism of P450 inhibitory promiscuity and improving the P450 inhibitory selectivity of new chemicals in drug discovery.
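
The abstract does not state the formula of its entropy-based index. One plausible form, assumed here purely for illustration, is the Shannon entropy of a compound's relative inhibition strengths across the isoforms, normalized to the range 0 to 1. The function name and the input values below are hypothetical.

```python
from math import log


def promiscuity_index(inhibition):
    """Normalized Shannon entropy of non-negative inhibition strengths.

    Returns 0.0 when a single isoform accounts for all inhibition and
    1.0 when every isoform is inhibited equally. This is an assumed
    definition, not necessarily the index used in the paper.
    """
    if any(v < 0 for v in inhibition):
        raise ValueError("inhibition strengths must be non-negative")
    total = sum(inhibition)
    n = len(inhibition)
    if n < 2 or total <= 0:
        return 0.0
    h = 0.0
    for v in inhibition:
        if v > 0:
            p = v / total
            h -= p * log(p)
    return h / log(n)


# Hypothetical inhibition strengths against 1A2, 2C9, 2C19, 2D6, and 3A4.
selective = promiscuity_index([1.0, 0.0, 0.0, 0.0, 0.0])
promiscuous = promiscuity_index([1.0, 1.0, 1.0, 1.0, 1.0])
intermediate = promiscuity_index([4.0, 1.0, 1.0, 0.0, 0.0])
```

Dividing by the logarithm of the number of isoforms makes indices comparable across panels of different sizes.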


Chemosphere | 2015

In silico prediction of chemical toxicity on avian species using chemical category approaches

Chen Zhang; Feixiong Cheng; Lixia Sun; Shulin Zhuang; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Avian species are sensitive to pesticides and industrial chemicals, and are hence used as model species in the evaluation of chemical toxicity. In the present study, we assessed the toxicity of more than 663 diverse chemicals on 17 avian species. All the chemicals were classified into three categories, i.e., highly toxic, slightly toxic, and non-toxic, based on the toxicity classification criteria of the United States Environmental Protection Agency (EPA). To evaluate these chemicals, toxicity prediction models were built using chemical category approaches with molecular descriptors and five commonly used fingerprints, in which five machine learning methods were applied to two standard test species: the aquatic mallard duck and the terrestrial northern bobwhite quail. The support vector machine (SVM) method with the PubChem fingerprint performed best, as revealed by 5-fold cross-validation and the external validation set on Japanese quail. No species difference was found in our database, although several chemicals showed different toxicity on some avian species. The best model had an overall accuracy of 0.851 for the prediction of toxicity on avian species, which outperformed the work of Mazzatorta et al. Furthermore, several representative substructures for characterizing avian toxicity were identified via the information gain (IG) method. This study provides a new tool for chemical safety assessment.


Molecular Informatics | 2016

In silico Prediction of Drug Induced Liver Toxicity Using Substructure Pattern Recognition Method

Chen Zhang; Feixiong Cheng; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Drug-induced liver injury (DILI) is a leading cause of acute liver failure in the US and of less severe liver injury worldwide. It is also one of the major reasons for drug withdrawal from the market. Thus, DILI has become one of the most important concerns for drugs and should be predicted at a very early stage of the drug discovery process. In this study, a comprehensive data set containing 1317 diverse compounds was collected from publications. Then, high-accuracy classification models were built using five machine learning methods based on MACCS and FP4 fingerprints after evaluation by the substructure pattern recognition method. The best model was built using the SVM method together with the FP4 fingerprint at an IG value threshold of 0.0005. Its overall predictive accuracies were 79.7% and 64.5% for the training and test sets, respectively, and it yielded an overall accuracy of 75.0% for the external validation data set, consisting of 88 compounds collected from a benchmark DILI database, the Liver Toxicity Knowledge Base. This model could be used for drug-induced liver toxicity prediction. Moreover, some key substructure patterns correlated with drug-induced liver toxicity were also identified as structural alerts.


Toxicology Research | 2015

In silico prediction of chemical aquatic toxicity with chemical category approaches and substructural alerts

Lu Sun; Chen Zhang; Yingjie Chen; Xiao Li; Shulin Zhuang; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Aquatic toxicity is an important endpoint in the evaluation of the adverse effects of chemicals on ecosystems. In this study, in silico models were developed for the prediction of chemical aquatic toxicity in different fish species. First, a large data set containing 6422 data points on aquatic toxicity for 1906 diverse chemicals was constructed. Using molecular descriptors and fingerprints to represent the molecules, local and global models were then developed with five machine learning methods based on three fish species (rainbow trout, fathead minnow, and bluegill sunfish). For the local models, both binary and ternary classification models were obtained for each of the three fish species. For the global models, data for all three fish species were used together. The predictive accuracy of both the local and global models was around 0.8 for the test sets. Moreover, data for the sheepshead minnow were used as an external validation set. For the best local model (model 2), the predictive accuracy was 0.875 for the sheepshead minnow, while for the best global model (model 14), the predictive accuracy was 0.872. The numbers of false negative (FN) compounds in model 2 and model 14 were 18 and 10, respectively. Hence, model 14 was the best model and could thus be used to predict toxicity in other fish species. Furthermore, information gain and ChemoTyper methods were used to identify toxic substructures that correlate significantly with chemical aquatic toxicity. This study provides critical tools for the early evaluation of chemical aquatic toxicity in environmental hazard assessment.
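
This abstract prefers model 14 over model 2 despite a slightly lower accuracy because it produces fewer false negatives, that is, fewer toxic chemicals predicted as safe. The sketch below shows how both quantities come from the same confusion counts; the function name and the toy labels are illustrative assumptions and do not reproduce the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy and confusion counts for labels 1 (toxic) and 0 (non-toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "true_positives": tp,
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Hypothetical external validation labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

In hazard assessment a false negative is typically the costlier error, which is why two models with near-identical accuracy can still be ranked by their FN counts.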


Journal of Chemical Information and Modeling | 2018

Multiclassification Prediction of Enzymatic Reactions for Oxidoreductases and Hydrolases Using Reaction Fingerprints and Machine Learning Methods

Yingchun Cai; Hongbin Yang; Weihua Li; Guixia Liu; Philip W. Lee; Yun Tang

Drug metabolism is a complex process in the human body that involves a series of enzymatically catalyzed reactions. However, it is costly and time-consuming to investigate drug metabolism experimentally; computational methods have therefore been developed to predict drug metabolism and have shown great advantages. As a first step, classification of metabolic reactions and enzymes is highly desirable for drug metabolism prediction. In this study, we developed multiclassification models for the prediction of reaction types catalyzed by oxidoreductases and hydrolases, in which three reaction fingerprints were used to describe the reactions and seven machine learning algorithms were employed for model building. Data retrieved from KEGG, containing 1055 hydrolysis and 2510 redox reactions, were used to build the models. The external validation data consisted of 213 hydrolysis and 512 redox reactions extracted from the Rhea database. The best models were built by neural network or logistic regression with a 2048-bit transformation reaction fingerprint. The predictive accuracies of the main class, subclass, and superclass classification models on the external validation sets were all above 90%. This study will be very helpful for enzymatic reaction annotation and further study of metabolism prediction.

Collaboration

Top Co-Authors

Weihua Li, East China University of Science and Technology
Yun Tang, East China University of Science and Technology
Guixia Liu, East China University of Science and Technology
Jie Shen, East China University of Science and Technology
Chen Zhang, East China University of Science and Technology
Yadi Zhou, East China University of Science and Technology
Yue Yu, East China University of Science and Technology
Congying Xu, East China University of Science and Technology
Hongbin Yang, East China University of Science and Technology