David J. Livingstone | Researchain

Publication

Featured research published by David J. Livingstone.

Journal of Computer-aided Molecular Design | 2005

Virtual computational chemistry laboratory - design and description

Igor V. Tetko; Johann Gasteiger; Roberto Todeschini; A. Mauri; David J. Livingstone; Peter Ertl; V. A. Palyulin; E. V. Radchenko; Nikolai S. Zefirov; Alexander Makarenko; Vsevolod Yu. Tanchuk; Volodymyr V. Prokopenko

Internet technology offers an excellent opportunity for the development of tools by the cooperative effort of various groups and institutions. We have developed a multi-platform software system, Virtual Computational Chemistry Laboratory, http://www.vcclab.org, allowing the computational chemist to perform a comprehensive series of molecular indices/properties calculations and data analysis. The implemented software is based on a three-tier architecture, one of the standard technologies for providing client-server services on the Internet. The developed software includes several popular programs: the indices generation program DRAGON; a 3D structure generator, CORINA; a program to predict lipophilicity and aqueous solubility of chemicals, ALOGPS; and others. All these programs run at host institutes located in five countries across Europe. In this article we review the main features and statistics of the developed system, which can be used as a prototype for academic and industry models.

Journal of Chemical Information and Computer Sciences | 1995

Neural Network Studies. 1. Comparison of Overfitting and Overtraining

Igor V. Tetko; David J. Livingstone; A. I. Luik

The application of feed-forward back-propagation artificial neural networks with one hidden layer (ANN) to perform the equivalent of multiple linear regression (MLR) has been examined using artificial structured data sets and real literature data. The predictive ability of the networks has been estimated using a training/test set protocol. The results have shown advantages of ANN over MLR analysis. The ANNs do not require high order terms or indicator variables to establish complex structure-activity relationships. Overfitting does not have any influence on network prediction ability when overtraining is avoided by cross-validation. Application of ANN ensembles has allowed the avoidance of chance correlations and satisfactory predictions of new data have been obtained for a wide range of numbers of neurons in the hidden layer.

Journal of Chemical Information and Computer Sciences | 2000

Unsupervised forward selection: a method for eliminating redundant variables

David C. Whitley; Martyn G. Ford; David J. Livingstone

An unsupervised learning method is proposed for variable selection and its performance assessed using three typical QSAR data sets. The aims of this procedure are to generate a subset of descriptors from any given data set in which the resultant variables are relevant, redundancy is eliminated, and multicollinearity is reduced. Continuum regression, an algorithm encompassing ordinary least squares regression, regression on principal components, and partial least squares regression, was used to construct models from the selected variables. The variable selection routine is shown to produce simple, robust, and easily interpreted models for the chosen data sets.
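The idea of selecting descriptors without reference to the response can be illustrated with a small sketch. This is a simplified illustration written for this page, not the authors' published algorithm: it assumes that redundancy is measured by the squared multiple correlation of a candidate column with the columns already chosen, and the function name `select_nonredundant` and the threshold `r2_max` are hypothetical.

```python
import math


def _unit_centred(col):
    """Mean-centre a column and scale it to unit length (None if constant)."""
    mean = sum(col) / len(col)
    centred = [v - mean for v in col]
    norm = math.sqrt(sum(v * v for v in centred))
    return [v / norm for v in centred] if norm > 0.0 else None


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _residual_unit(vec, basis):
    """Part of vec orthogonal to an orthonormal basis, rescaled to unit length."""
    resid = list(vec)
    for b in basis:
        proj = _dot(vec, b)
        resid = [r - proj * x for r, x in zip(resid, b)]
    norm = math.sqrt(_dot(resid, resid))
    return [r / norm for r in resid]


def select_nonredundant(columns, r2_max=0.9):
    """Greedily pick column indices; the response variable is never used.

    Hypothetical helper: `columns` is a list of equal-length numeric lists.
    """
    if not 0.0 <= r2_max < 1.0:
        raise ValueError("r2_max must lie in [0, 1)")
    scaled = {}
    for idx, col in enumerate(columns):
        unit = _unit_centred(col)
        if unit is not None:  # constant columns are dropped
            scaled[idx] = unit
    ids = sorted(scaled)
    if len(ids) < 2:
        return ids
    # Seed with the least-correlated pair of columns.
    r2, i, j = min(
        (_dot(scaled[a], scaled[b]) ** 2, a, b)
        for pos, a in enumerate(ids)
        for b in ids[pos + 1:]
    )
    if r2 > r2_max:
        return [i]
    selected = [i, j]
    basis = [scaled[i]]
    basis.append(_residual_unit(scaled[j], basis))
    remaining = set(ids) - {i, j}
    while remaining:
        # Squared multiple correlation with the selected columns equals the
        # sum of squared projections onto their orthonormal basis.
        r2_sel = {k: sum(_dot(scaled[k], b) ** 2 for b in basis) for k in remaining}
        remaining = {k for k in remaining if r2_sel[k] <= r2_max}
        if not remaining:
            break
        best = min(remaining, key=lambda k: (r2_sel[k], k))
        basis.append(_residual_unit(scaled[best], basis))
        selected.append(best)
        remaining.remove(best)
    return selected
```

A column that is nearly a linear combination of columns already selected is rejected, which is how a procedure of this kind reduces multicollinearity before any model is fitted.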

European Journal of Medicinal Chemistry | 1999

Neural networks in drug discovery: have they lived up to their promise?

David T. Manallack; David J. Livingstone

Over the last decade neural networks have become an efficient method for data analysis in the field of drug discovery. The early problems encountered with neural networks such as overfitting and overtraining have been addressed resulting in a technique that surpasses traditional statistical methods. Neural networks have thus largely lived up to their promise, which was to overcome QSAR statistical problems. The next revolution in QSAR will no doubt involve research into producing better descriptors used in these studies to improve our ability to relate chemical structure to biological activity. This review focuses on the applications of neural network methods and their development over the last five years.

Journal of Chemical Information and Computer Sciences | 1996

Neural Network Studies. 2. Variable Selection

Igor V. Tetko; A. E. P. Villa; David J. Livingstone

Quantitative structure-activity relationship (QSAR) studies usually require an estimation of the relevance of a very large set of initial variables. Determining the most important variables should, in theory, allow better generalization by all pattern recognition methods. This study introduces and investigates five pruning algorithms designed to estimate the importance of input variables in applications of feed-forward artificial neural networks trained by the back-propagation algorithm (ANNs) and to prune nonrelevant ones in a statistically reliable way. The analyzed algorithms gave similar estimates of variable importance for simulated data sets, but differences were detected for real QSAR examples. Improvement of ANN prediction ability was shown after the pruning of redundant input variables. The statistical coefficients computed by ANNs for QSAR examples were better than those of multiple linear regression. Restrictions of the proposed algorithms and the potential use of ANNs are discussed.

Journal of Computer-aided Molecular Design | 1997

Data modelling with neural networks: Advantages and limitations

David J. Livingstone; David T. Manallack; Igor V. Tetko

The origins and operation of artificial neural networks are briefly described and their early application to data modelling in drug design is reviewed. Four problems in the use of neural networks in data modelling are discussed, namely overfitting, chance effects, overtraining and interpretation, and examples are given of the means by which the first three of these may be avoided. The use of neural networks as a variable selection tool is shown and the advantage of networks as a nonlinear data modelling device is discussed. The display of multivariate data in two dimensions employing a neural network is illustrated using experimental and theoretical data for a set of charge transfer complexes.

Journal of Chemical Information and Computer Sciences | 1998

Neural Network Studies. 3. Variable Selection in the Cascade-Correlation Learning Architecture

Vasyl Kovalishyn; Igor V. Tetko; A. I. Luik; Vladyslav Kholodovych; A. E. P. Villa; David J. Livingstone

Pruning methods for feed-forward artificial neural networks trained by the cascade-correlation learning algorithm are proposed. The cascade-correlation algorithm starts with a small network and dynamically adds new nodes until the analyzed problem has been solved. This feature of the algorithm removes the requirement to predefine the architecture of the neural network prior to network training. The developed pruning methods are used to estimate the importance of large sets of initial variables for quantitative structure-activity relationship studies and simulated data sets. The calculated results are compared with the performance of fixed-size back-propagation neural networks and multiple regression analysis and are carefully validated using different training/test set protocols, such as leave-one-out and full cross-validation procedures. The results suggest that the pruning methods can be successfully used to optimize the set of variables for the cascade-correlation learning algorithm neural networks. Th...

Journal of Computer-aided Molecular Design | 2001

Simultaneous prediction of aqueous solubility and octanol/water partition coefficient based on descriptors derived from molecular structure

David J. Livingstone; Martyn G. Ford; Jarmo Huuskonen; David W. Salt

It has been shown that water solubility and octanol/water partition coefficient for a large diverse set of compounds can be predicted simultaneously using molecular descriptors derived solely from a two dimensional representation of molecular structure. These properties have been modelled using multiple linear regression, artificial neural networks and a statistical method known as canonical correlation analysis. The neural networks give slightly better models both in terms of fitting and prediction presumably due to the fact that they include non-linear terms. The statistical methods, on the other hand, provide information concerning the explanation of variance and allow easy interrogation of the models. Models were fitted using a training set of 552 compounds, a validation set and test set each containing 68 molecules and two separate literature test sets for solubility and partition.

European Journal of Medicinal Chemistry | 2000

Prediction of aqueous solubility for a diverse set of organic compounds based on atom-type electrotopological state indices

Jarmo Huuskonen; Jukka Rantanen; David J. Livingstone

We describe robust methods for estimating the aqueous solubility of a set of 734 organic compounds from different structural classes based on multiple linear regression (MLR) and artificial neural network (ANN) models. The structures were represented by atom-type electrotopological state (E-state) indices. The squared correlation coefficient and standard deviation for the MLR with 34 structural parameters were r² = 0.94 and s = 0.58 for the training set of 675 compounds. For the test set of 21 compounds, the equivalent statistics were r²(pred) = 0.80 and s = 0.87. Neural networks gave a significant improvement using the same set of parameters: the standard deviations were s = 0.52 for the training set and s = 0.75 for the test set when an artificial neural network with five neurons in the hidden layer was used. The results clearly show that accurate models can be rapidly calculated for the estimation of aqueous solubility for a large and diverse set of organic compounds using easily calculated structural parameters.
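For readers unfamiliar with the two statistics quoted in this abstract, the sketch below shows one common way to compute them. The abstract does not give the exact formulas, so this is an assumption: r² is taken as the squared Pearson correlation between observed and predicted values, and s as the residual standard deviation with n - p - 1 degrees of freedom. The function names are hypothetical.

```python
import math


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def residual_sd(observed, predicted, n_params):
    """Residual standard deviation, assuming n - n_params - 1 degrees of freedom.

    n_params is the number of fitted descriptors, excluding the intercept.
    """
    dof = len(observed) - n_params - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many parameters")
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sse / dof)
```

For an external test set no parameters are fitted to the test compounds, so a different denominator (for example n) may be more appropriate there; which convention the paper used is not stated in the abstract.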

Journal of Computer-aided Molecular Design | 1989

Pattern recognition display methods for the analysis of computed molecular properties

Brian D. Hudson; David J. Livingstone; Elizabeth Rahr

Summary: Pattern recognition methods, particularly the ‘unsupervised learning’ techniques, are well suited for the preliminary analysis of the large data sets produced by computer chemistry. The use of linear and non-linear display methods for such exploratory analysis is exemplified with the aid of two data sets of biologically active molecules. Advantages and disadvantages of these techniques are discussed.