Aixia Yan
Beijing University of Chemical Technology
Publication
Featured research published by Aixia Yan.
Journal of Chemical Information and Modeling | 2011
Zhi Wang; Yuanying Chen; Hu Liang; Andreas Bender; Robert C. Glen; Aixia Yan
P-glycoprotein (P-gp) is one of the major ABC transporters and is involved in many essential processes, such as lipid and steroid transport across cell membranes, and also in the uptake of drugs such as HIV protease and reverse transcriptase inhibitors. Despite its importance, reliable models predicting substrates of P-gp are scarce. In this study, we have built several computational models to predict whether or not a compound is a P-gp substrate, based on the largest data set yet published, employing 332 distinct structures. Each molecule is represented by ADRIANA.Code, MOE, and ECFP_4 fingerprint descriptors. The models are computed using a support vector machine on a training set of 131 substrates and 81 nonsubstrates and are evaluated by 5-fold, 10-fold, and leave-one-out (LOO) cross-validation. The best model gives a Matthews correlation coefficient of 0.73 and a prediction accuracy of 0.88 on the test set. Examination of the model based on ECFP_4 fingerprints revealed several substructures that could be significant in separating substrates and nonsubstrates of P-gp, such as the nitrile and sulfoxide functional groups, which have a higher frequency in nonsubstrates than in substrates. In addition, structural isomerism in sugars was found to result in remarkable differences in the likelihood of a compound being a substrate for P-gp.
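The two test-set metrics quoted in this abstract, prediction accuracy and the Matthews correlation coefficient, can both be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' code; the counts used in the example are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (substrate = positive class).
tp, tn, fp, fn = 40, 45, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when substrates and nonsubstrates are unevenly represented.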
SAR and QSAR in Environmental Research | 2013
Aixia Yan; H. Liang; Y. Chong; X. Nie; C. Yu
The ability to penetrate the blood–brain barrier is one of the significant properties of a drug or drug-like compound for the central nervous system (CNS), and is commonly expressed by log BB (log BB = log(C_brain/C_blood)). In this work, a dataset of 320 compounds with log BB values was split into a training set of 198 compounds and a test set of 122 compounds according to their structural properties by a Kohonen's self-organizing map (SOM). Each molecule was represented by global and shape descriptors, 2D autocorrelation descriptors and RDF descriptors calculated by ADRIANA.Code. Several quantitative models for the prediction of log BB were built by multilinear regression (MLR), support vector machine (SVM) and artificial neural network (ANN) analysis. The models show good prediction performance on the test set compounds.
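As a small worked illustration of the log BB definition given in this abstract, the sketch below computes the value from a pair of concentrations. It assumes that "log" means the base-10 logarithm, which is the usual convention for log BB, and that both concentrations are given in the same units; the function name and the example values are illustrative only.

```python
import math


def log_bb(c_brain, c_blood):
    """log BB = log10(C_brain / C_blood); concentrations share the same units."""
    if c_brain <= 0 or c_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_brain / c_blood)


# Illustrative values: a compound ten times more concentrated in blood
# than in brain has a negative log BB.
print(log_bb(c_brain=0.5, c_blood=5.0))
```

A positive log BB means the compound is more concentrated in brain than in blood, and a negative value means the opposite.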
Bioorganic & Medicinal Chemistry Letters | 2013
Min Zhong; Shouyi Xuan; Ling Wang; Xiaoli Hou; Maolin Wang; Aixia Yan; Bin Dai
Two quantitative structure-activity relationship (QSAR) models were developed to predict the inhibitory activity of 95 inhibitors of acyl-coenzyme A:cholesterol acyltransferase 2 (ACAT2). The whole data set was randomly split into a training set of 72 compounds and a test set of 23 compounds. The molecules were represented by 11 descriptors calculated with the ADRIANA.Code software. The inhibitory activity of the ACAT2 inhibitors was then predicted using multilinear regression (MLR) analysis and a support vector machine (SVM) method. The correlation coefficients on the test set were 0.90 for the MLR model and 0.91 for the SVM model. Y-randomization was employed to check the robustness of the SVM model. Descriptors related to atom charge and electronegativity were important for the interaction between the inhibitors and ACAT2.
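Y-randomization, mentioned in this abstract, checks whether a model's fit could have arisen by chance: the activity values are shuffled, the model is refitted, and the resulting fit quality is compared with that of the real data. The sketch below illustrates the idea with a one-descriptor linear fit, for which the training r² equals the squared Pearson correlation. This is a simplification chosen to keep the example self-contained; the paper applies the test to an SVM model, and the data, function names, and trial count here are assumptions.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. the training r^2 of a one-descriptor linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_trials=100, seed=0):
    """Return the real r^2 and the r^2 values obtained after shuffling y."""
    rng = random.Random(seed)
    real = r_squared(x, y)
    shuffled = []
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)  # break the descriptor-activity pairing
        shuffled.append(r_squared(x, y_perm))
    return real, shuffled


# Synthetic data for illustration only: activity depends linearly on the descriptor.
x = [float(i) for i in range(20)]
y = [2.0 * v + 1.0 for v in x]
real, shuffled = y_randomization(x, y)
print(f"real r^2 = {real:.3f}, mean shuffled r^2 = {sum(shuffled) / len(shuffled):.3f}")
```

If the shuffled models scored about as well as the real one, the original correlation would be suspect; a large gap supports the robustness of the model.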
Bioorganic & Medicinal Chemistry Letters | 2017
Zijian Qin; Maolin Wang; Aixia Yan
In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by using multiple linear regression (MLR) and a support vector machine (SVM) method. A dataset of 512 HCV NS3/4A protease inhibitors and their IC50 values, all determined by the same FRET assay, was collected from the literature. All the inhibitors were represented with nine selected global descriptors and 12 selected 2D property-weighted autocorrelation descriptors calculated with the program CORINA Symphony. The dataset was divided into a training set and a test set either randomly or by a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r²) of the training and test sets were 0.75 and 0.72 for the best MLR model and 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models was developed. All the best sub-dataset models performed better than the whole-dataset models. We believe that the combination of the best sub-dataset and whole-dataset SVM models can be used as a reliable lead design tool for new NS3/4A protease inhibitor scaffolds in a drug discovery pipeline.
SAR and QSAR in Environmental Research | 2016
X. Hou; X. Chen; M. Zhang; Aixia Yan
Plasmodium falciparum, the most fatal parasite that causes malaria, is responsible for over one million deaths per year. P. falciparum dihydroorotate dehydrogenase (PfDHODH) has been validated as a promising drug development target for antimalarial therapy since it catalyzes the rate-limiting step for DNA and RNA biosynthesis. In this study, we investigated the quantitative structure–activity relationships (QSAR) of the antimalarial activity of PfDHODH inhibitors by generating four computational models using multilinear regression (MLR) and a support vector machine (SVM) based on a dataset of 255 PfDHODH inhibitors. All the models display good prediction quality, with a leave-one-out q² >0.66, a correlation coefficient (r) >0.85 on both training sets and test sets, and a mean square error (MSE) <0.32 on training sets and <0.37 on test sets. The study indicated that hydrogen bonding ability, atom polarizabilities and ring complexity are the predominant factors for the inhibitors' antimalarial activity. The models are capable of predicting inhibitors' antimalarial activity, and the molecular descriptors used to build the models could be helpful in the development of new antimalarial drugs.
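The leave-one-out q² and the mean square error reported in this abstract are standard regression diagnostics. The sketch below shows one common way to compute them, using a one-descriptor least-squares line as a stand-in model. The paper's MLR and SVM models use several descriptors, and the exact q² convention the authors used is not stated, so the definition here (q² = 1 − PRESS / total sum of squares about the overall mean) is an assumption, as are the function names and data.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot, for x and y given as lists."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - press / ss_tot


def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic data for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.2, 4.9, 7.1, 8.8]
print(f"LOO q^2 = {loo_q2(x, y):.3f}")
```

Because each compound is predicted by a model that never saw it, q² is typically lower than the training r² and gives a more conservative view of predictive ability.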
SAR and QSAR in Environmental Research | 2014
Maolin Wang; Min Zhong; Aixia Yan; L. Li; Changyuan Yu
Several QSAR (quantitative structure–activity relationship) models for predicting the inhibitory activity of 333 hepatitis C virus (HCV) NS5B polymerase inhibitors were developed. All the inhibitors are HCV polymerase non-nucleoside analogue inhibitors (NNIs) fitting into the pocket of the NNI III binding site. For each molecule, global descriptors and 2D property autocorrelation descriptors were calculated with the program ADRIANA.Code. Pearson correlation analysis was used to select the significant descriptors for building models. The whole dataset was split into a training set and a test set either randomly or using a Kohonen’s self-organizing map (SOM). The inhibitory activity of the 333 HCV NS5B polymerase inhibitors was then predicted using multilinear regression (MLR) analysis and a support vector machine (SVM) method. For the test set of the best model (Model 2B), a correlation coefficient of 0.91 was achieved. Some molecular descriptors, such as molecular complexity (Complexity), the number of hydrogen bonding donors (HDon) and the solubility of the molecule in water (log S), were found to be very important factors in determining the bioactivity of the HCV NS5B inhibitors. Other molecular properties, such as electrostatic and charge properties, also played important roles in the interaction between the ligand and the protein. The selected molecular descriptors were further confirmed by analysing the interaction between two representative inhibitors and the polymerase in their crystal structures.
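The abstract states that Pearson correlation analysis was used to select descriptors but does not give the procedure. One widely used scheme, sketched below as an assumption rather than the authors' method, keeps descriptors that correlate with activity above a minimum threshold and discards any descriptor that is highly intercorrelated with one already kept. The thresholds, descriptor names, and data are all hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def select_descriptors(descriptors, activity, min_r=0.2, max_inter_r=0.9):
    """Greedy selection: strongest activity correlation first, skipping redundant descriptors."""
    ranked = sorted(
        ((abs(pearson(values, activity)), name) for name, values in descriptors.items()),
        reverse=True,
    )
    selected = []
    for r_activity, name in ranked:
        if r_activity < min_r:
            continue  # too weakly related to activity
        redundant = any(
            abs(pearson(descriptors[name], descriptors[kept])) > max_inter_r
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical descriptor table for six compounds.
activity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
descriptors = {
    "desc_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],    # tracks activity exactly
    "desc_b": [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],  # near-duplicate of desc_a
    "desc_c": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],    # moderately correlated
    "desc_d": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],    # weakly correlated
}
print(select_descriptors(descriptors, activity))
```

Removing intercorrelated descriptors matters most for MLR, where collinear inputs make the regression coefficients unstable and hard to interpret.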
Molecular Informatics | 2012
Zhi Wang; Hamse Y. Mussa; Robert Lowe; Robert C. Glen; Aixia Yan
The US Food and Drug Administration (FDA) requires in vitro human ether‐a‐go‐go‐related gene (hERG) ion channel affinity tests for all drug candidates prior to clinical trials. In this study, probabilistic methods were employed to develop models for predicting hERG inhibition. These differ from traditional QSAR models, which are mainly based on supervised ‘hard point’ (HP) classification approaches giving ‘yes/no’ answers. The obtained models can ‘ascertain’ whether or not a given set of compounds can block hERG ion channels. The results presented indicate that the proposed probabilistic method can be a valuable tool for ranking compounds with respect to their potential cardio‐toxicity and is promising for the prediction of other toxic properties.
Molecular Informatics | 2016
Maolin Wang; Li Li; Changyuan Yu; Aixia Yan; Zhongzhen Zhao; Ge Zhang; Miao Jiang; Aiping Lu; Johann Gasteiger
Chinese Herbal Medicines (CHMs) are typically mixtures of compounds and are often categorized as cold or hot according to the theory of Chinese Medicine. This classification is essential for guiding the clinical application of CHMs. In this study, three types of molecular descriptors were used to build models for the classification of 59 CHMs with typical cold/hot properties in the training set, taking the original records on properties in the China Pharmacopeia as reference. The accuracy and the Matthews correlation coefficient of the models were validated on a test set containing 56 other CHMs. The best model achieved accuracies of 94.92 % and 83.93 % on the training set and test set, respectively. The MACCS fingerprint model is robust in predicting the hot/cold properties of the CHMs from their major constituent compounds. This work shows how a classification model for data consisting of multi‐components can be developed. The derived model can be used for the application of Chinese herbal medicines.
Molecular Informatics | 2013
Shouyi Xuan; Maolin Wang; Hang Kang; Johannes Kirchmair; Lu Tan; Aixia Yan
Inhibition of the 3′ processing step of HIV‐1 integrase by small molecule inhibitors is one of the most promising strategies for the treatment of AIDS. Using a support vector machine (SVM) approach, we developed six classification models for predicting 3′P inhibitors. The models are based on up to 48 selected molecular descriptors and a comprehensive data set of 1253 molecules, with measured activities ranging from nanomolar to micromolar IC50 values. Model B2, the most robust SVM model, obtains a prediction accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC) of 93 %, 81 %, 94 % and 0.67 on the test set, respectively. The presence of hydrogen bonding features and hydrophilicity in general were identified as key determinants of inhibitory activity. Further important properties include molecular refractivity, π atom charge, total charge, lone pair electronegativity, and effective atom polarizability. Comparative fragment‐based analysis of the active and inactive molecules corroborated these observations and revealed several characteristic structural elements of 3′P inhibitors. The models built in this study can be obtained from the authors.
SAR and QSAR in Environmental Research | 2017
D. Qu; Aixia Yan; J. S. Zhang
In this paper, structure–activity relationship (SAR, classification) and quantitative structure–activity relationship (QSAR) models have been established to predict the bioactivity of human epidermal growth factor receptor-2 (HER2) inhibitors. For the SAR study, we established six SAR (or classification) models to distinguish highly and weakly active HER2 inhibitors. The dataset of 868 HER2 inhibitors was split into a training set of 580 inhibitors and a test set of 288 inhibitors by a Kohonen’s self-organizing map (SOM) or a random method. The SAR models were built using support vector machine (SVM), random forest (RF) and multilayer perceptron (MLP) methods. Among the six models, the SVM models obtained superior results compared with the other models. The prediction accuracy of the best model (model 1A) was 90.27% and the Matthews correlation coefficient (MCC) was 0.80 on the test set. For the QSAR study, we chose 286 HER2 inhibitors to establish six quantitative prediction models using MLR, SVM and MLP methods. The correlation coefficient (r) of the best model (model 4B) was 0.92 on the test set. The descriptor analysis showed that HAccN, lone pair electronegativity and π electronegativity were closely related to the bioactivity of HER2 inhibitors.