Publication


Featured research published by Ralph A. Mansson.


Journal of Chemical Information and Modeling | 2006

Statistical Modeling of a Ligand Knowledge Base

Ralph A. Mansson; Alan Welsh; Natalie Fey; A. Guy Orpen

A range of different statistical models has been fitted to experimental data for the Tolman electronic parameter (TEP) based on a large set of calculated descriptors in a prototype ligand knowledge base (LKB) of phosphorus(III) donor ligands. The models have been fitted by ordinary least squares using subsets of descriptors, principal component regression, and partial least squares which use variables derived from the complete set of descriptors, least angle regression, and the least absolute shrinkage and selection operator. None of these methods is robust against outliers, so we also applied a robust estimation procedure to the linear regression model. Criteria for model evaluation and comparison have been discussed, highlighting the importance of resampling methods for assessing the robustness of models and the scope for making predictions in chemically intuitive models. For the ligands covered by this LKB, ordinary least squares models of descriptor subsets provide a good representation of the data, while partial least squares, principal component regression, and least angle regression models are less suitable for our dual aims of prediction and interpretation. A linear regression model with robustly fitted parameters achieves the best model performance over all classes of models fitted to TEP data, and the weightings assigned to ligands during the robust estimation procedure are chemically intuitive. The increased model complexity when compared to the ordinary least squares linear model is justified by the reduced influence of individual ligands on the model parameters and predictions of new ligands. Robust linear regression models therefore represent the best compromise for achieving statistical robustness in simple, chemically meaningful models.
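
The robust estimation step described above can be illustrated with a small sketch. The abstract does not say which robust estimator was used, so the code assumes Huber M-estimation fitted by iteratively reweighted least squares. It uses a single hypothetical descriptor and made-up data, not the LKB descriptors or TEP values. The names `weighted_line` and `huber_line` are illustrative only.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def huber_line(x, y, c=1.345, iterations=50):
    """Huber M-estimate via iteratively reweighted least squares.

    Returns (a, b, weights); the weights are computed from the residuals
    of the last fit, and small weights flag discrepant observations.
    """
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line(x, y, w)
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Huber weights: 1 inside the threshold, shrinking with |residual| outside.
        w = [min(1.0, c * scale / abs(r)) if r != 0 else 1.0 for r in resid]
    return a, b, w

# Made-up data: true line y = 2 + 3x, small noise, one gross outlier at the end.
x = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, 30.0]
y = [2.0 + 3.0 * xi + e for xi, e in zip(x, noise)]

a_ols, b_ols = weighted_line(x, y, [1.0] * len(x))
a_rob, b_rob, weights = huber_line(x, y)
print("OLS slope:", b_ols, "robust slope:", b_rob)
print("weights:", [round(wi, 3) for wi in weights])
```

In this toy setting the outlying point receives the smallest weight, which mirrors the idea that the weights attached to individual ligands can be inspected and interpreted.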


Journal of Chemical Information and Modeling | 2005

Prediction of properties from simulations: a re-examination with modern statistical methods

Ralph A. Mansson; Jeremy G. Frey; Jonathan W. Essex; Alan Welsh

We discuss models fitted to data collected by Duffy and Jorgensen to predict solvation free energies and partition equilibria of drugs, organic molecules, aromatic heterocycles, and other molecules. These data were originally examined using linear regression, but here more recently developed statistical models are applied. The data set is complicated by the presence of discrepant observations and by curvature in the response. In some cases it is possible to discard a small number of the observations to get a good fit to the data, but in others discarding an increasing proportion of the observations does not improve the fit. Our general preference is to use robust parameter estimation, which downweights discrepant observations to reduce their influence on the fitted models. Models are selected for four responses using linear or more complicated representations of the explanatory variables, such as cubic polynomials, B-splines, or smoothers via generalized additive models (GAMs). Variables are chosen using the traditional approach of formal tests to assess contribution to the fit of a model, and resampling methods including the bootstrap are also considered to assess the prediction error for given models. Results of our analysis indicate that GAMs are an improvement on linear models for describing the data and making predictions. In general, robust regression models and GAMs have the smallest conditional expected loss of prediction over the four responses. In addition, robust regression models offer the advantage of identifying molecules that perform poorly in the fit. In general, models were identified that yielded an improvement of approximately 50% in the conditional expected loss of prediction compared with the original parametrization of Duffy and Jorgensen. It was also found that the use of cross-validation to compare models was unreliable, and bootstrapping is preferred.
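
As a rough illustration of how a bootstrap can be used to assess prediction error, the sketch below refits a straight line to each bootstrap resample and scores it on the observations left out of that resample. This out-of-bag variant is an assumption: the abstract does not specify the estimator used for the conditional expected loss of prediction. The data are synthetic, and `fit_line` and `bootstrap_prediction_error` are hypothetical names.

```python
import random

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b

def bootstrap_prediction_error(x, y, n_boot=200, seed=0):
    """Mean squared prediction error on out-of-bag points, averaged over resamples."""
    rng = random.Random(seed)
    n = len(x)
    losses = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]      # resample with replacement
        chosen = set(idx)
        held_out = [i for i in range(n) if i not in chosen]
        # Skip degenerate resamples: nothing held out, or only one distinct x value.
        if not held_out or len({x[i] for i in idx}) < 2:
            continue
        a, b = fit_line([x[i] for i in idx], [y[i] for i in idx])
        losses.append(sum((y[i] - a - b * x[i]) ** 2 for i in held_out) / len(held_out))
    return sum(losses) / len(losses)

# Synthetic data: y = 1 + 0.5x plus unit-variance Gaussian noise.
data_rng = random.Random(1)
x = [float(i) for i in range(20)]
y = [1.0 + 0.5 * xi + data_rng.gauss(0, 1) for xi in x]

err = bootstrap_prediction_error(x, y)
print("bootstrap estimate of squared prediction error:", err)
```

Competing models (for example a linear fit against a smoother) could be compared by passing each fitting routine through the same resampling loop with the same seed.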


Journal of Statistical Planning and Inference | 2001

Robustness of balanced incomplete block designs to randomly missing observations

Philip Prescott; Ralph A. Mansson

Practical experimenters must always be aware of the possibility that some of their observations could become unavailable for analysis. In an experiment involving treatments and blocks, it could be desirable to select a design that is resistant to the loss of a complete block or treatment, or a small number of observations distributed at random throughout the initial design. In this paper, we examine the robustness of binary, variance-balanced, incomplete block designs using the eigenvalues of the associated information matrix when specific observations are missing. Results are presented for up to three missing observations and the procedure is illustrated using an example involving eight treatments arranged in 14 blocks of four treatments per block. On the basis of these considerations, it is recommended that, to guard against a substantial loss of efficiency due to a small number of randomly missing observations, it is preferable to use designs with as few treatments common to pairs of blocks as possible.
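
The effect of a single missing observation can be sketched numerically. The code below builds one possible design with eight treatments in 14 blocks of four (the planes of the affine geometry AG(3, 2)), which may not be the design used in the paper. It computes the average variance of pairwise treatment differences, in units of the error variance, from the information matrix C = R - N K^{-1} N'. For simplicity it inverts C + J/v exactly instead of computing eigenvalues; both routes give the same average variance for a connected design. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def information_matrix(blocks, v):
    """C = R - N K^{-1} N' for a binary block design on treatments 0..v-1."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        k = len(block)
        for i in block:
            C[i][i] += 1
            for j in block:
                C[i][j] -= Fraction(1, k)
    return C

def inverse(A):
    """Exact Gauss-Jordan inverse of a nonsingular matrix of Fractions."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [value / p for value in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def average_variance(blocks, v):
    """Average of var(tau_i - tau_j) / sigma^2 over all treatment pairs."""
    C = information_matrix(blocks, v)
    # (C + J/v)^{-1} acts as a generalized inverse of C for connected designs.
    G = inverse([[C[i][j] + Fraction(1, v) for j in range(v)] for i in range(v)])
    pairs = list(combinations(range(v), 2))
    return sum(G[i][i] + G[j][j] - 2 * G[i][j] for i, j in pairs) / len(pairs)

# Blocks are the 4-subsets of {0,...,7} whose labels XOR to zero.
full = [set(b) for b in combinations(range(8), 4) if b[0] ^ b[1] ^ b[2] ^ b[3] == 0]

# Lose one observation: drop a single treatment from the first block.
damaged = [set(b) for b in full]
damaged[0].remove(min(damaged[0]))

efficiency = average_variance(full, 8) / average_variance(damaged, 8)
print(len(full), average_variance(full, 8), efficiency)
```

For this particular design the sketch gives an average variance of 1/3 for the complete design and a relative efficiency of 35/36 after one observation is lost. Removing further observations from `damaged` allows different missing-value patterns to be compared.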


Computational Statistics & Data Analysis | 2002

Missing observations in Youden square designs

Ralph A. Mansson; Philip Prescott

The reduction in efficiency in estimating treatment differences in Youden square designs from which individual observations have been lost is considered. A simple generalised inverse of the associated information matrix is used to develop expressions for the variances of the pairwise treatment comparisons. Results on the robustness of Youden squares to the loss of one or two observations are given, and it is shown that for two observations missing there are eight possible cases of resulting design which need to be considered. The frequencies of these cases depend on the form of the initial design, as well as on the design parameters. Examples of similar designs are used to illustrate these different frequencies of resulting designs.


Computational Statistics & Data Analysis | 2004

Robustness of diallel cross designs to the loss of one or more observations

Philip Prescott; Ralph A. Mansson

The effects of missing observations on complete and partial diallel cross designs are examined. A-efficiencies, based on average variances of the elementary contrasts of the line-effects, suggest that these designs are fairly robust. Simple g-inverses may be found for the information matrices of the line effects which allow evaluation of expressions for the variances of the line-effect differences with and without the missing observations. It is shown that, for small designs or when the number of lines is large, the reduction in efficiency for individual line comparisons can be quite large. When these designs are employed, care should be taken to ensure that individual observations are not lost.


Communications in Statistics - Theory and Methods | 2002

Efficiency of pair-wise treatment comparisons in incomplete block experiments subject to the loss of a block of observations

Philip Prescott; Ralph A. Mansson

The robustness of incomplete block designs to the loss of all observations in a block is investigated in terms of the efficiency of the residual design. Previous results use the eigenvalues of the information matrix of the treatment effects to determine the overall efficiency of the implemented design, relative to the complete design, in terms of the average variance of pair-wise treatment comparisons. Here we use a simple generalized inverse of this information matrix to identify the variances of the individual pair-wise treatment comparisons and show the effects of the loss of a block on specific treatment comparisons. We determine results for balanced incomplete block (BIB) designs and Youden square designs and show that, although the overall efficiency remains high when a block is lost, comparisons of two treatments that appear in the missing block can be quite seriously affected. The efficiencies of individual treatment comparisons in a BIB design are shown to depend on the number of treatments, the number of blocks used in the initial design and the number of treatments in a block. However, we also show that, for a single replicate of a Youden square, the efficiencies of individual treatment comparisons depend only on the size of the block that is lost and not on the number of treatments being compared.
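
The contrast between overall and individual efficiencies can be seen in a small worked example. The sketch below takes a hypothetical BIB design with eight treatments in 14 blocks of four, deletes one whole block, and evaluates var(tau_i - tau_j) for individual pairs from a generalized inverse of the information matrix. The inverse of C + J/v is used here as a convenient generalized inverse; it is an assumption and not necessarily the one derived in the paper. Variances are in units of the error variance.

```python
from fractions import Fraction
from itertools import combinations

def information_matrix(blocks, v):
    """C = R - N K^{-1} N' for a binary block design on treatments 0..v-1."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        for i in block:
            C[i][i] += 1
            for j in block:
                C[i][j] -= Fraction(1, len(block))
    return C

def g_inverse(C):
    """Exact inverse of C + J/v, a generalized inverse of C for connected designs."""
    v = len(C)
    M = [[C[i][j] + Fraction(1, v) for j in range(v)]
         + [Fraction(int(i == j)) for j in range(v)] for i in range(v)]
    for col in range(v):  # Gauss-Jordan elimination on the augmented matrix
        pivot = next(r for r in range(col, v) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [value / p for value in M[col]]
        for r in range(v):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[v:] for row in M]

def pair_variance(G, i, j):
    """var(tau_i - tau_j) / sigma^2 from a generalized inverse G."""
    return G[i][i] + G[j][j] - 2 * G[i][j]

# Hypothetical design: 4-subsets of {0,...,7} whose labels XOR to zero.
full = [set(b) for b in combinations(range(8), 4) if b[0] ^ b[1] ^ b[2] ^ b[3] == 0]
lost = sorted(full[0])      # the block that is lost: treatments 0, 1, 2, 3
residual = full[1:]         # design that remains after the loss

G = g_inverse(information_matrix(residual, 8))
inside = pair_variance(G, lost[0], lost[1])   # both treatments in the lost block
outside = pair_variance(G, 4, 5)              # neither treatment in the lost block
mixed = pair_variance(G, lost[0], 4)          # one treatment in the lost block
print(inside, outside, mixed)
```

For this design every pairwise variance is 1/3 before the loss. Afterwards the sketch gives 2/5 for pairs inside the lost block (efficiency 5/6), 43/120 for mixed pairs, and an unchanged 1/3 for pairs outside it, consistent with the qualitative statement that comparisons within the missing block suffer most.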


Quality Engineering | 2004

Robustness of a class of partial diallel cross designs to the unavailability of a complete block of observations

Ralph A. Mansson; Philip Prescott

Complete and partial diallel cross designs are examined as to their construction and robustness against the loss of a block of observations. A simple generalized inverse is found for the information matrix of the line effects, which allows evaluation of expressions for the variances of the line-effect differences with and without the missing block. A-efficiencies, based on average variances of the elementary contrasts of the line-effects, suggest that these designs are fairly robust. The loss of efficiency is generally less than 10%, but it is shown that specific comparisons might suffer a loss of efficiency of as much as 40%.


Journal of Applied Statistics | 2001

Missing values in replicated Latin squares

Ralph A. Mansson; Philip Prescott

Designs based on any number of replicated Latin squares are examined for their robustness against the loss of up to three observations randomly scattered throughout the design. The information matrix for the treatment effects is used to evaluate the average variances of the treatment differences for each design in terms of the number of missing values and the size of the design. The resulting average variances are used to assess the overall robustness of the designs. In general, there are 16 different situations for the case of three missing values when there are at least three Latin square replicates in the design. Algebraic expressions may be determined for all possible configurations, but here the best and worst cases are given in detail. Numerical illustrations are provided for the average variances, relative efficiencies, minimum and maximum variances and the frequency counts, showing the effects of the missing values for a range of design sizes and levels of replication.


Chemistry: A European Journal | 2006

Development of a ligand knowledge base, part 1: computational descriptors for phosphorus donor ligands

Natalie Fey; Athanassios C. Tsipis; Stephanie E. Harris; Jeremy N. Harvey; A. Guy Orpen; Ralph A. Mansson


Chemometrics and Intelligent Laboratory Systems | 2005

Statistical analysis of second harmonic generation experiments: a phenomenological model

Alan Welsh; Ralph A. Mansson; Jeremy G. Frey; Lefteris Danos

Collaboration


Researchers who have collaborated with Ralph A. Mansson.

Top Co-Authors

Philip Prescott
University of Southampton

Alan Welsh
Australian National University

Jeremy G. Frey
University of Southampton

Lefteris Danos
University of Southampton

Jeremy N. Harvey
Katholieke Universiteit Leuven