Publication


Featured research published by Botao Fan.


Molecular Diversity | 2006

Molecular similarity and diversity in chemoinformatics: From theory to applications

Ana G. Maldonado; Jean-Pierre Doucet; Michel Petitjean; Botao Fan

This review surveys molecular similarity and diversity. Key findings reported in recent investigations are selectively highlighted and summarized. Although this overview is mainly centered on chemoinformatics, applications in other areas (pharmaceutical and medical chemistry, combinatorial chemistry, chemical database management, etc.) are also introduced. The approaches used to define and describe the concepts of molecular similarity and diversity in the context of chemoinformatics are discussed in the first part of this review. The second and third parts describe and analyze different methods and techniques. Finally, current applications and problems are enumerated and discussed in the last part.


Chemometrics and Intelligent Laboratory Systems | 2002

Radial basis function neural network-based QSPR for the prediction of critical temperature

Xiaojun Yao; Yawei Wang; Xiaoyun Zhang; Ruisheng Zhang; Mancang Liu; Zhide Hu; Botao Fan

A QSPR study was performed to develop models that relate the structures of 856 organic compounds to their critical temperatures. Molecular descriptors derived solely from structure were used to represent molecular structures. A subset of the calculated descriptors, selected using forward stepwise regression, was used in developing the QSPR models. Multiple linear regression (MLR) and radial basis function neural networks (RBFNNs) were used to construct the linear and nonlinear QSPR models, respectively. The optimal QSPR model was based on a 10–33–1 radial basis function neural network architecture using molecular descriptors calculated from molecular structure alone. The root mean square errors in critical temperature predictions were 13.97 K for the whole set, 12.32 K for the training set, and 14.23 K for the prediction set. The predictions are in good agreement with the experimental values.
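
To make the "10–33–1" notation concrete, the following is a minimal sketch of the forward pass of a radial basis function network with Gaussian hidden units and one linear output. It is not the authors' implementation: the Gaussian kernel form, the function name `rbf_forward`, and all numeric values are assumptions chosen for illustration, and a toy 2-3-1 network stands in for the 10-33-1 model (10 descriptors, 33 hidden units, 1 output).

```python
import math

def rbf_forward(x, centers, widths, weights, bias):
    """Output of an RBF network: Gaussian hidden units, one linear output unit."""
    hidden = []
    for center, width in zip(centers, widths):
        # Squared Euclidean distance between the input and this unit's center.
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        # Gaussian activation (assumed kernel form): 1 at the center, decaying with distance.
        hidden.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    # Linear output layer: weighted sum of hidden activations plus a bias.
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Toy 2-3-1 network; every number here is made up for illustration.
centers = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
widths = [1.0, 1.0, 1.0]
weights = [0.5, -0.2, 0.1]
prediction = rbf_forward([0.0, 0.0], centers, widths, weights, bias=0.3)
print(prediction)
```

In a real model the centers, widths, and output weights would be fitted to training data; only the evaluation step is shown here.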


Talanta | 2007

Prediction of surface tension for common compounds based on novel methods using heuristic method and support vector machine.

Jie Wang; Hongying Du; Huanxiang Liu; Xiaojun Yao; Zhide Hu; Botao Fan

A support vector machine (SVM), a novel type of machine learning method, was used for the first time to develop a quantitative structure-property relationship (QSPR) model for the latest surface tension data of a diverse set of common liquid compounds. Each compound was represented by structural descriptors, which were calculated from the molecular structure by the CODESSA program. The heuristic method (HM) was used to search the descriptor space, select the descriptors responsible for surface tension, and give the best linear regression model using the selected descriptors. Using the same descriptors, a non-linear regression model was built with the support vector machine. Comparing the results of the two methods, the non-linear regression model gave better predictions than the heuristic method. Interpreting the molecular descriptors selected by the heuristic model gave some insight into the factors likely to govern the surface tension of the diverse compounds. This paper proposes a new, effective way of researching interface chemistry, and can be very helpful to industry.


Journal of Chemical Information and Modeling | 2006

Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores.

Igor V. Tetko; Vitaly P. Solov'ev; Alexey V. Antonov; Xiaojun Yao; Jean Pierre Doucet; Botao Fan; Frank Hoonakker; Denis Fourches; Piere Jost; Nicolas Lachiche; Alexandre Varnek

A benchmark of several popular methods, Associative Neural Networks (ANN), Support Vector Machines (SVM), k Nearest Neighbors (kNN), Maximal Margin Linear Programming (MMLP), Radial Basis Function Neural Network (RBFNN), and Multiple Linear Regression (MLR), is reported for quantitative structure-property relationships (QSPR) of stability constants logK1 for the 1:1 (M:L) and logβ2 for 1:2 complexes of metal cations Ag+ and Eu3+ with diverse sets of organic molecules in water at 298 K and ionic strength 0.1 M. The methods were tested on three types of descriptors: molecular descriptors including E-state values, counts of atoms determined for E-state atom types, and substructural molecular fragments (SMF). Comparison of the models was performed using a 5-fold external cross-validation procedure. Robust statistical tests (bootstrap and Kolmogorov-Smirnov statistics) were employed to evaluate the significance of calculated models. The Wilcoxon signed-rank test was used to compare the performance of methods. Individual structure-complexation property models obtained with nonlinear methods demonstrated significantly better performance than the models built using multilinear regression analysis (MLRA). However, averaging several MLRA models based on SMF descriptors provided as good a prediction as the most efficient nonlinear techniques. Support Vector Machines and Associative Neural Networks contributed to the largest number of significant models. Models based on fragments (SMF descriptors and E-state counts) had higher prediction ability than those based on E-state indices. The use of SMF descriptors and E-state counts provided similar results, whereas E-state indices led to less significant models. The current study illustrates the difficulties of quantitative comparison of different methods: conclusions based only on one data set without appropriate statistical tests could be wrong.
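
As an illustration of the 5-fold cross-validation procedure mentioned in this abstract, the sketch below splits sample indices into five disjoint test folds, each paired with the remaining samples as a training set. It is a generic k-fold split under assumed conventions (random shuffling with a fixed seed, the hypothetical helper name `k_fold_indices`), not the exact protocol used in the paper.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples once as test data."""
    indices = list(range(n_samples))
    # Shuffle reproducibly so that folds do not depend on the original ordering.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        # Training set: every fold except the one held out for testing.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits

# Hypothetical data set of 23 compounds.
splits = k_fold_indices(23, k=5)
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each model is fitted on a training set and evaluated only on the held-out fold, so every compound is predicted exactly once by a model that never saw it.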


Current Computer-Aided Drug Design | 2007

Nonlinear SVM Approaches to QSPR/QSAR Studies and Drug Design

Jean-Pierre Doucet; Florent Barbault; Hairong Xia; Annick Panaye; Botao Fan

Recently, a promising new nonlinear method, the support vector machine (SVM), was proposed by Vapnik. It rapidly found numerous applications in chemistry, biochemistry and pharmacochemistry. Several attempts to use SVM in drug design have been reported, and it has become an attractive nonlinear approach in this field. In this review, the theoretical basis of SVM in classification and regression is briefly described. Its applications in QSPR/QSAR studies, and particularly in drug design, are discussed. Comparative studies with some linear and other nonlinear methods show SVM's high performance in both classification and correlation.


Pharmaceutical Research | 2005

Prediction of pKa for Neutral and Basic Drugs Based on Radial Basis Function Neural Networks and the Heuristic Method

Feng Luan; Weiping Ma; Haixia Zhang; Xiaoyun Zhang; Mancang Liu; Zhide Hu; Botao Fan

Purpose. Quantitative structure–property relationships (QSPR) were developed to predict the pKa values of a set of neutral and basic drugs via linear and nonlinear methods. The ability of the models to predict pKa was assessed and compared. Methods. The descriptors of 74 neutral and basic drugs in this study were calculated by the software CODESSA, which can calculate constitutional, topological, geometrical, electrostatic, and quantum chemical descriptors. Linear and nonlinear QSPR models were developed based on the heuristic method (HM) and radial basis function neural networks (RBFNN), respectively. The heuristic method was also used for the preselection of appropriate molecular descriptors. Results. The obtained linear model had a correlation coefficient of r = 0.884, F = 37.72 with a root-mean-squared (RMS) error of 0.482 for the training set, and r = 0.693, F = 11.99, and RMS = 0.987 for the test set. The RMS error in predicting the overall data set is 0.619. The nonlinear model gave better results; for the training set, r = 0.886, F = 202.314, and RMS = 0.458, and for the test set r = 0.737, F = 15.41, and RMS = 0.613. The RMS error in prediction for the overall data set is 0.493. Prediction results from the nonlinear model are in good agreement with experimental values. Conclusions. In the present study, we developed a QSPR model to predict an important parameter (pKa) of neutral and basic drugs. The model is useful in predicting pKa during the discovery of new drugs when experimental data are unknown.
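
The correlation coefficient r and the RMS error quoted in this abstract can be computed as in the sketch below. This uses the common textbook definitions (Pearson correlation, and RMS error with division by the number of samples); the paper may use a slightly different convention, and the pKa values shown are made up for illustration.

```python
import math

def rms_error(observed, predicted):
    """Root-mean-squared error between observed and predicted values (divides by n)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical experimental and predicted pKa values (not taken from the paper).
experimental = [7.2, 8.1, 9.4, 6.5, 8.8]
model_output = [7.0, 8.5, 9.0, 6.9, 8.6]
print(pearson_r(experimental, model_output), rms_error(experimental, model_output))
```

Reporting both statistics separately for training and test sets, as the abstract does, shows how much predictive quality is lost on compounds the model was not fitted to.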


Analytica Chimica Acta | 2002

Radial basis function network-based quantitative structure–property relationship for the prediction of Henry’s law constant

Xiaojun Yao; Mancang Liu; Xiaoyun Zhang; Zhide Hu; Botao Fan

A quantitative structure-property relationship (QSPR) method is used to develop correlation models between the structures of a large number of organic compounds and their Henry's law constants in water. Molecular descriptors calculated from structure alone are used to represent molecular structures. A subset of the calculated descriptors, selected using forward stepwise regression, is used in developing the QSPR models. Multiple linear regression (MLR) and radial basis function networks (RBFNs) are used to construct the linear and non-linear prediction models, respectively. The optimal QSPR model was based on a 10-17-1 RBFN architecture using molecular descriptors calculated from molecular structure alone. The root mean square errors in log H predictions for the training, test and overall data sets are 0.3023, 0.3121, and 0.3038 log H units, respectively. The predictions are in agreement with the experimental values.


Chemometrics and Intelligent Laboratory Systems | 1994

Artificial neural network simulation of 13C NMR shifts for methyl substituted cyclohexanes

A. Panaye; J.P. Doucet; Botao Fan; E. Feuilleaubois; S. Rahali El Azzouzi

A feed-forward layered neural network is used for the calculation of the 13C chemical shifts in methyl substituted cyclohexanes. Structural descriptors used as input to the network specify only the position (α, β, γ, …) of the methyl substituents with respect to the resonating carbon and their orientation (axial, equatorial). Calculated shifts are in good agreement with experimental values, indicating that artificial neural network methodology gives slightly better results than the usual additive incremental models.


Talanta | 2005

QSPR prediction of GC retention indices for nitrogen-containing polycyclic aromatic compounds from heuristically computed molecular descriptors

Rongjing Hu; Huanxiang Liu; Ruisheng Zhang; Chunxia Xue; Xiaojun Yao; Mancang Liu; Zhide Hu; Botao Fan

Gas chromatographic retention indices of nitrogen-containing polycyclic aromatic compounds (N-PACs) have been predicted by quantitative structure-property relationship (QSPR) analysis based on the heuristic method (HM) implemented in CODESSA. To show the influence of different molecular descriptors on retention indices and to better understand the important structural factors affecting the experimental values, three multivariable linear models derived from three groups of different molecular descriptors were built. Moreover, each molecular descriptor in these models was discussed to better understand the relationship between molecular structures and their retention indices. The proposed models gave the following results: the squared correlation coefficient, R², for the models with one, two and three molecular descriptors was 0.9571, 0.9776 and 0.9846, respectively.


Talanta | 2002

Prediction of gas chromatographic retention indices by the use of radial basis function neural networks

Xiaojun Yao; Xiaoyun Zhang; Ruisheng Zhang; Mancang Liu; Zhide Hu; Botao Fan

A new method for the prediction of retention indices for a diverse set of compounds from their physicochemical parameters has been proposed. The two input parameters used to represent molecular properties are boiling point and molar volume. Models relating physicochemical parameters to the retention indices of compounds are constructed by means of radial basis function neural networks (RBFNNs). To get the best prediction results, some strategies are also employed to optimize the topology and learning parameters of the RBFNNs. For the test set, a predictive correlation coefficient R=0.9910 and root mean squared error of 14.1 are obtained. Results show that radial basis function networks give satisfactory prediction ability and that their optimization is less time-consuming and easy to implement.

Collaboration


Dive into Botao Fan's collaborations.

Top Co-Authors

Annick Panaye

Centre national de la recherche scientifique

Jean-Pierre Doucet

Centre national de la recherche scientifique

Hai-Feng Chen

Shanghai Jiao Tong University
