Ross D. King | Researchain

Publication

Featured research published by Ross D. King.

Nature | 2004

Functional genomic hypothesis generation and experimentation by a robot scientist

Ross D. King; Kenneth Edward Whelan; Ffion M. Jones; Philip G. K. Reiser; Christopher H. Bryant; Stephen Muggleton; Douglas B. Kell; Stephen G. Oliver

The question of whether it is possible to automate the scientific process is of both great theoretical interest and increasing practical importance because, in many scientific areas, data are being generated much faster than they can be effectively analysed. We describe a physically implemented robotic system that applies techniques from artificial intelligence to carry out cycles of scientific experimentation. The system automatically originates hypotheses to explain observations, devises experiments to test these hypotheses, physically runs the experiments using a laboratory robot, interprets the results to falsify hypotheses inconsistent with the data, and then repeats the cycle. Here we apply the system to the determination of gene function using deletion mutants of yeast (Saccharomyces cerevisiae) and auxotrophic growth experiments. We built and tested a detailed logical model (involving genes, proteins and metabolites) of the aromatic amino acid synthesis pathway. In biological experiments that automatically reconstruct parts of this model, we show that an intelligent experiment selection strategy is competitive with human performance and significantly outperforms, with a cost decrease of 3-fold and 100-fold (respectively), both cheapest and random-experiment selection.

European Conference on Principles of Data Mining and Knowledge Discovery | 2001

Knowledge Discovery in Multi-label Phenotype Data

Amanda Clare; Ross D. King

The biological sciences are undergoing an explosion in the amount of available data. New data analysis methods are needed to deal with the data. We present work using KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions. The analysis of the data presented a number of challenges: multi-class labels, a large number of sparsely populated classes, the need to learn a set of accurate rules (not a complete classification), and a very large amount of missing values. We developed resampling strategies and modified the algorithm C4.5 to deal with these problems. Rules were learnt which are accurate and biologically meaningful. The rules predict function of 83 putative genes of currently unknown function at an estimated accuracy of ≥ 80%.

Science | 2009

The Automation of Science

Ross D. King; Jeremy John Rowland; Stephen G. Oliver; Michael Young; Wayne Aubrey; Emma Louise Byrne; Maria Liakata; Magdalena Markham; Pınar Pir; Larisa N. Soldatova; Andrew Sparkes; Kenneth Edward Whelan; Amanda Clare

The basis of science is the hypothetico-deductive method and the recording of experiments in sufficient detail to enable reproducibility. We report the development of Robot Scientist “Adam,” which advances the automation of both. Adam has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation. We have confirmed Adam's conclusions through manual experiments. To describe Adam's research, we have developed an ontology and logical language. The resulting formalization involves over 10,000 different research units in a nested treelike structure, 10 levels deep, that relates the 6.6 million biomass measurements to their logical description. This formalization describes how a machine contributed to scientific knowledge.

Artificial Intelligence | 1996

Theories for mutagenicity: a study in first-order and feature-based induction

Ashwin Srinivasan; Stephen Muggleton; Michael J. E. Sternberg; Ross D. King

A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with inductive logic programming (ILP) will significantly outperform those using simpler feature-based methods. One area that has long been associated with graph-based or structural representation and reasoning is organic chemistry. In this field, we consider the problem of predicting the mutagenic activity of small molecules: a property that is related to carcinogenicity, and an important consideration in developing less hazardous drugs. By providing an ILP system with progressively more structural information concerning the molecules, we compare the predictive power of the logical theories constructed against benchmarks set by regression, neural, and tree-based methods.

Applied Artificial Intelligence | 1995

StatLog: Comparison of Classification Algorithms on Large Real-World Problems

Ross D. King; Cao Feng; A. Sutherland

This paper describes work in the StatLog project comparing classification algorithms on large real-world problems. The algorithms compared were from symbolic learning (CART, C4.5, NewID, AC2, ITrule, Cal5, CN2), statistics (Naive Bayes, k-nearest neighbor, kernel density, linear discriminant, quadratic discriminant, logistic regression, projection pursuit, Bayesian networks), and neural networks (backpropagation, radial basis functions). Twelve datasets were used: five from image analysis, three from medicine, and two each from engineering and finance. We found that which algorithm performed best depended critically on the data set investigated. We therefore developed a set of data set descriptors to help decide which algorithms are suited to particular data sets. For example, data sets with extreme distributions (skew > 1 and kurtosis > 7) and with many binary/categorical attributes (>38%) tend to favor symbolic learning algorithms. We suggest how classification algorithms can be extended in a number of d...
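
The descriptor rule quoted in the abstract above (skew > 1, kurtosis > 7, more than 38% binary/categorical attributes) can be illustrated with a rough sketch. This is not the StatLog implementation: the function names, the use of population moments with non-excess kurtosis, the averaging of descriptors across attributes, and the combination of the three conditions with a logical AND are all assumptions made for illustration.

```python
from statistics import fmean


def skew_and_kurtosis(values):
    """Return (skewness, kurtosis) of one numeric attribute.

    Uses population central moments; kurtosis is the non-excess form
    (about 3 for a normal distribution). Both choices are assumptions.
    """
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:  # constant attribute: no spread, so no shape to measure
        return 0.0, 0.0
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def favors_symbolic(numeric_columns, categorical_fraction,
                    skew_limit=1.0, kurt_limit=7.0, cat_limit=0.38):
    """Hypothetical helper applying the thresholds quoted in the abstract.

    numeric_columns: list of lists, one per numeric attribute.
    categorical_fraction: share of attributes that are binary/categorical.
    Averaging per-attribute descriptors is an assumption of this sketch.
    """
    stats = [skew_and_kurtosis(col) for col in numeric_columns]
    mean_skew = fmean([abs(s) for s, _ in stats])
    mean_kurt = fmean([k for _, k in stats])
    return (mean_skew > skew_limit
            and mean_kurt > kurt_limit
            and categorical_fraction > cat_limit)


if __name__ == "__main__":
    heavy_tailed = [0] * 99 + [100]   # one extreme value among zeros
    symmetric = [1, 2, 3, 4, 5]
    print(favors_symbolic([heavy_tailed], categorical_fraction=0.5))
    print(favors_symbolic([symmetric], categorical_fraction=0.5))
```

A heavy-tailed attribute combined with a high share of categorical attributes satisfies the rule in this sketch, while a symmetric attribute does not. The abstract says only that such data sets "tend to favor" symbolic learners, so the output is a heuristic hint, not a prediction.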

Protein Engineering Design & Selection | 1992

Protein secondary structure prediction using logic-based machine learning

Stephen Muggleton; Ross D. King; Michael J. E. Sternberg

Many attempts have been made to solve the problem of predicting protein secondary structure from the primary sequence but the best performance results are still disappointing. In this paper, the use of a machine learning algorithm which allows relational descriptions is shown to lead to improved performance. The Inductive Logic Programming computer program, Golem, was applied to learning secondary structure prediction rules for alpha/alpha domain type proteins. The input to the program consisted of 12 non-homologous proteins (1612 residues) of known structure, together with a background knowledge describing the chemical and physical properties of the residues. Golem learned a small set of rules that predict which residues are part of the alpha-helices--based on their positional relationships and chemical and physical properties. The rules were tested on four independent non-homologous proteins (416 residues) giving an accuracy of 81% (+/- 2%). This is an improvement, on identical data, over the previously reported result of 73% by King and Sternberg (1990, J. Mol. Biol., 216, 441-457) using the machine learning program PROMIS, and of 72% using the standard Garnier-Osguthorpe-Robson method. The best previously reported result in the literature for the alpha/alpha domain type is 76%, achieved using a neural net approach. Machine learning also has the advantage over neural network and statistical methods in producing more understandable results.

Journal of the Royal Society Interface | 2006

An ontology of scientific experiments.

Larisa N. Soldatova; Ross D. King

The formal description of experiments for efficient analysis, annotation and sharing of results is a fundamental part of the practice of science. Ontologies are required to achieve this objective. A few subject-specific ontologies of experiments currently exist. However, despite the unity of scientific experimentation, no general ontology of experiments exists. We propose the ontology EXPO to meet this need. EXPO links the SUMO (the Suggested Upper Merged Ontology) with subject-specific ontologies of experiments by formalizing the generic concepts of experimental design, methodology and results representation. EXPO is expressed in the W3C standard ontology language OWL-DL. We demonstrate the utility of EXPO and its ability to describe different experimental domains, by applying it to two experiments: one in high-energy physics and the other in phylogenetics. The use of EXPO made the goals and structure of these experiments more explicit, revealed ambiguities, and highlighted an unexpected similarity. We conclude that EXPO is of general value in describing experiments and a step towards the formalization of science.

Inductive Logic Programming | 1997

Carcinogenesis Predictions Using ILP

Ashwin Srinivasan; Ross D. King; Stephen Muggleton; Michael J. E. Sternberg

Obtaining accurate structural alerts for the causes of chemical cancers is a problem of great scientific and humanitarian value. This paper follows up on earlier research that demonstrated the use of Inductive Logic Programming (ILP) for predictions for the related problem of mutagenic activity amongst nitroaromatic molecules. Here we are concerned with predicting carcinogenic activity in rodent bioassays using data from the U.S. National Toxicology Program conducted by the National Institute of Environmental Health Sciences. The 330 chemicals used here are significantly more diverse than the previous study, and form the basis for obtaining Structure-Activity Relationships (SARs) relating molecular structure to cancerous activity in rodents. We describe the use of the ILP system Progol to obtain SARs from this data. The rules obtained from Progol are comparable in accuracy to those from expert chemists, and more accurate than most state-of-the-art toxicity prediction methods. The rules can also be interpreted to give clues about the biological and chemical mechanisms of carcinogenesis, and make use of those learnt by Progol for mutagenesis. Finally, we present details of, and predictions for, an ongoing international blind trial aimed specifically at comparing prediction methods. This trial provides ILP algorithms an opportunity to participate at the leading-edge of scientific discovery.

Bioinformatics | 2001

The Predictive Toxicology Challenge 2000–2001

Christoph Helma; Ross D. King; Stefan Kramer; Ashwin Srinivasan

We initiated the Predictive Toxicology Challenge (PTC) to stimulate the development of advanced SAR techniques for predictive toxicology models. The goal of this challenge is to predict the rodent carcinogenicity of new compounds based on the experimental results of the US National Toxicology Program (NTP). Submissions will be evaluated on quantitative and qualitative scales to select the most predictive models and those with the highest toxicological relevance. Availability: http://www.informatik.uni-freiburg.de/∼ml/ptc/ Contact: [email protected].

New Generation Computing | 1995

Relating chemical activity to structure: An examination of ILP successes

Ross D. King; Michael J. E. Sternberg; Ashwin Srinivasan

Problems concerned with learning the relationships between molecular structure and activity have been important test-beds for Inductive Logic Programming (ILP) systems. In this paper we examine these applications and empirically evaluate the extent to which a first-order representation was required. We compared ILP theories with those constructed using standard linear regression and a decision-tree learner on a series of progressively more difficult problems. When a propositional encoding is feasible for the feature-based algorithms, we show that such algorithms are capable of matching the predictive accuracies of an ILP theory. However, as the complexity of the compounds considered increased, propositional encodings become intractable. In such cases, our results show that ILP programs can still continue to construct accurate, understandable theories. Based on this evidence, we propose future work to realise fully the potential of ILP in structure-activity problems.