Publication


Featured research published by Allan Tucker.


Genome Biology | 2004

Consensus clustering and functional interpretation of gene-expression data

Stephen Swift; Allan Tucker; Veronica Vinciotti; Nigel J. Martin; Christine A. Orengo; Xiaohui Liu; Paul Kellam

Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
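A common way to realise the co-association flavour of consensus clustering is sketched below. This is a hedged illustration rather than the paper's exact procedure: several base clusterings vote on whether two profiles belong together, and the resulting co-association matrix is re-clustered. The toy data, cluster counts and scikit-learn calls are assumptions made for the sketch.

```python
# Hedged sketch of co-association consensus clustering (not necessarily the paper's
# exact procedure). Toy data and parameter choices are illustrative.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                       # stand-in for expression profiles

# Base clusterings from different initialisations (different algorithms also work)
labelings = [KMeans(n_clusters=4, n_init=10, random_state=s).fit_predict(X)
             for s in range(5)]

# Co-association: fraction of base clusterings that put profiles i and j together
n = X.shape[0]
coassoc = np.zeros((n, n))
for labels in labelings:
    coassoc += (labels[:, None] == labels[None, :])
coassoc /= len(labelings)

# Consensus clusters by re-clustering the co-association matrix as a similarity
# ("metric" is called "affinity" in older scikit-learn versions)
consensus = AgglomerativeClustering(n_clusters=4, metric="precomputed",
                                    linkage="average")
consensus_labels = consensus.fit_predict(1.0 - coassoc)
```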


IEEE Transactions on Nanobioscience | 2008

Stochastic Dynamic Modeling of Short Gene Expression Time-Series Data

Zidong Wang; Fuwen Yang; Daniel W. C. Ho; Stephen Swift; Allan Tucker; Xiaohui Liu

In this paper, the expectation maximization (EM) algorithm is applied to model gene regulatory networks from gene time-series data. The gene regulatory network is viewed as a stochastic dynamic model that combines noisy microarray measurements of gene expression with a first-order autoregressive (AR) stochastic dynamic process for gene regulation. By using the EM algorithm, both the model parameters and the actual values of the gene expression levels can be identified simultaneously. Moreover, the algorithm deals efficiently with sparse parameter identification and noisy data. It is also shown that the EM algorithm can handle microarray gene expression data with a large number of variables but a small number of observations. Stochastic dynamic models are constructed for four real-world gene expression datasets to demonstrate the advantages of the introduced algorithm. Several indices are proposed to evaluate the inferred gene regulatory networks, and the relevant biological properties are discussed.
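For readers who want to experiment with the linear-Gaussian, first-order AR state-space idea described above, the pykalman package offers an off-the-shelf EM routine. The snippet below is an assumed, minimal sketch, not the authors' implementation; the toy data and dimensions are made up.

```python
# Hedged sketch: fit a linear-Gaussian state-space (first-order AR) model by EM and
# smooth the noisy measurements. pykalman is an assumed convenience, not the
# authors' code; Y is toy data.
import numpy as np
from pykalman import KalmanFilter

rng = np.random.default_rng(1)
T, n_genes = 12, 5                        # deliberately short series, as in the paper's setting
Y = rng.normal(size=(T, n_genes))         # stand-in for noisy microarray measurements

kf = KalmanFilter(n_dim_state=n_genes, n_dim_obs=n_genes)
kf = kf.em(Y, n_iter=20)                  # EM estimates transition and noise parameters
states, state_covs = kf.smooth(Y)         # de-noised expression level estimates

A = kf.transition_matrices                # first-order AR coefficients (regulatory influences)
```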


Bioinformatics | 2009

Literature-based priors for gene regulatory networks

Emma Steele; Allan Tucker; Peter A. C. 't Hoen; Martijn J. Schuemie

MOTIVATION: The use of prior knowledge to improve gene regulatory network modelling has often been proposed. In this article we present the first research on the massive incorporation of prior knowledge from literature for Bayesian network learning of gene networks. As the publication rate of scientific papers grows, keeping online databases up to date, which have been proposed as a source of prior knowledge in past research, becomes increasingly challenging. The novelty of our approach lies in the use of gene-pair association scores that describe the overlap in the contexts in which the genes are mentioned. These scores are generated from a large database of scientific literature, distilling the information contained in a huge number of documents into a simple, clear format. RESULTS: We present a method to transform such literature-based gene association scores into network prior probabilities, and apply it to learn gene sub-networks for yeast, Escherichia coli and human. We also investigate the effect of weighting the influence of the prior knowledge. Our findings show that literature-based priors can improve both the number of true regulatory interactions present in the network and the accuracy of expression value prediction on genes, in comparison to a network learnt solely from expression data. Networks learnt with priors also show an improved biological interpretation, with identified sub-networks that coincide with known biological pathways.
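The general idea can be sketched in a few lines, using an assumed logistic mapping and hypothetical helper names rather than the paper's exact transformation: literature association scores become prior edge probabilities, and a weight controls how strongly the prior influences the network score.

```python
# Hedged sketch: turn literature co-mention scores into edge priors and weight their
# influence. The logistic mapping and all names here are illustrative assumptions,
# not the paper's exact formulation.
import math

def edge_prior(assoc_score, midpoint=0.5, slope=10.0):
    """Map a literature association score in [0, 1] to a prior edge probability."""
    return 1.0 / (1.0 + math.exp(-slope * (assoc_score - midpoint)))

def weighted_log_prior(candidate_edges, assoc_scores, weight=1.0):
    """Log prior for a candidate edge set; weight=0 ignores the literature entirely."""
    total = 0.0
    for edge, present in candidate_edges.items():
        p = edge_prior(assoc_scores.get(edge, 0.0))
        total += math.log(p) if present else math.log(1.0 - p)
    return weight * total

# A structure search would then score each candidate network as, e.g.,
#   log_likelihood(data | network) + weighted_log_prior(network, scores, weight)
```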


Evolutionary Computation | 2005

RGFGA: An Efficient Representation and Crossover for Grouping Genetic Algorithms

Allan Tucker; Jason Crampton; Stephen Swift

There is substantial research into genetic algorithms used to group large numbers of objects into mutually exclusive subsets based upon some fitness function. However, nearly all methods involve a degree of degeneracy in their representation. We introduce a new representation for grouping genetic algorithms, the restricted growth function genetic algorithm, that effectively removes all degeneracy, resulting in a more efficient search. A new crossover operator is also described that exploits a measure of similarity between chromosomes in a population. Using several synthetic datasets, we compare the performance of our representation and crossover with another well-known state-of-the-art GA method, a strawman optimisation method and a well-established statistical clustering algorithm, with encouraging results.
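A short sketch of the restricted growth function idea itself (the paper's crossover and search are not reproduced): every partition has exactly one valid encoding, which is what removes the degeneracy of ordinary group labels.

```python
# Hedged sketch of restricted growth function (RGF) encodings for groupings: the
# first label is 0 and each label may exceed the running maximum by at most 1, so
# each partition has exactly one chromosome. Crossover/search details are omitted.
def is_rgf(chrom):
    """Check the restricted-growth property."""
    max_seen = -1
    for a in chrom:
        if a > max_seen + 1:
            return False
        max_seen = max(max_seen, a)
    return not chrom or chrom[0] == 0

def to_rgf(labels):
    """Canonicalise arbitrary group labels into the unique RGF form."""
    mapping, out = {}, []
    for lab in labels:
        mapping.setdefault(lab, len(mapping))
        out.append(mapping[lab])
    return out

print(to_rgf([7, 7, 2, 9, 2]))   # [0, 0, 1, 2, 1]
print(is_rgf([0, 2, 1]))         # False: label 2 skips ahead of the running maximum
```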


Journal of Biomedical Informatics | 2008

Consensus and Meta-analysis regulatory networks for combining multiple microarray gene expression datasets

Emma Steele; Allan Tucker

Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase in publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust, carry greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult, as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges, whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or from an aggregated dataset formed using standard scale-normalisation.
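The contrast between the two post-learning aggregation styles can be sketched with plain dictionaries and sets. This is a hedged simplification: the paper's statistical combination of confidences is more involved than the averaging used here.

```python
# Hedged sketch of post-learning aggregation. Consensus keeps edges found in every
# per-dataset network; meta-analysis combines per-dataset edge confidences (plain
# averaging here stands in for the paper's statistical combination).
def consensus_network(edge_sets):
    """Edges present in the network learnt from every dataset."""
    common = set(edge_sets[0])
    for edges in edge_sets[1:]:
        common &= set(edges)
    return common

def meta_analysis_network(confidences, threshold=0.5):
    """Average each edge's confidence across datasets and keep those above a threshold."""
    all_edges = set().union(*(c.keys() for c in confidences))
    combined = {e: sum(c.get(e, 0.0) for c in confidences) / len(confidences)
                for e in all_edges}
    return {e: p for e, p in combined.items() if p >= threshold}

# Example: two datasets agree on ('A', 'B') but only one supports ('B', 'C')
nets = [{('A', 'B'), ('B', 'C')}, {('A', 'B')}]
confs = [{('A', 'B'): 0.9, ('B', 'C'): 0.6}, {('A', 'B'): 0.8}]
print(consensus_network(nets))          # {('A', 'B')}
print(meta_analysis_network(confs))     # roughly {('A', 'B'): 0.85}
```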


Intelligent Data Analysis | 2003

Learning Dynamic Bayesian Networks from Multivariate Time Series with Changing Dependencies

Allan Tucker; Xiaohui Liu

Many examples exist of multivariate time series where dependencies between variables change over time. If these changing dependencies are not taken into account, any model that is learnt from the data will average over the different dependency structures. Paradigms that try to explain underlying processes and observed events in multivariate time series must explicitly model these changes in order to allow non-experts to analyse and understand such data. In this paper we have developed a method for generating explanations in multivariate time series that takes into account changing dependency structure. We make use of a dynamic Bayesian network model with hidden nodes. We introduce a representation and search technique for learning such models from data and test it on synthetic time series and real-world data from an oil refinery, both of which contain changing underlying structure. We compare our method to an existing EM-based method for learning structure. Results are very promising for our method and we include sample explanations, generated from models learnt from the refinery dataset.
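A tiny simulation below illustrates only the modelling assumption, not the paper's representation or search: a hidden regime variable occasionally switches which first-order dependency structure generates the observations.

```python
# Hedged sketch of the modelling idea only: a hidden regime node occasionally switches
# the first-order dependency structure of the series. The paper's representation and
# search for learning such DBNs from data are not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
T = 200
A = {0: np.array([[0.9, 0.0], [0.5, 0.4]]),    # regime 0: variable 0 drives variable 1
     1: np.array([[0.4, 0.5], [0.0, 0.9]])}    # regime 1: variable 1 drives variable 0

x = np.zeros((T, 2))
regime = 0
for t in range(1, T):
    if rng.random() < 0.02:                    # rare, unobserved change in dependencies
        regime = 1 - regime
    x[t] = A[regime] @ x[t - 1] + rng.normal(scale=0.1, size=2)
```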


International Journal of Intelligent Systems | 2001

Evolutionary learning of dynamic probabilistic models with large time lags

Allan Tucker; Xiaohui Liu; Andrew Ogden-Swift

In this paper, we explore the automatic explanation of multivariate time series (MTS) through learning dynamic Bayesian networks (DBNs). We have developed an evolutionary algorithm which exploits certain characteristics of MTS in order to generate good networks as quickly as possible. We compare this algorithm to other standard learning algorithms that have traditionally been used for static Bayesian networks but are adapted for DBNs in this paper. These are extensively tested on both synthetic and real-world MTS for various aspects of efficiency and accuracy. By proposing a simple representation scheme, an efficient learning methodology, and several useful heuristics, we have found that the proposed method is more efficient for learning DBNs from MTS with large time lags, especially in time-demanding situations.
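One plausible (assumed, not the paper's exact) chromosome representation makes the large-time-lag aspect concrete: each edge is a (parent, lag, child) triple and mutation perturbs the lags; fitness evaluation against the MTS is only stubbed here.

```python
# Hedged sketch of one plausible DBN chromosome for large-lag problems: a list of
# (parent, lag, child) triples plus a lag-perturbing mutation. This illustrates the
# search space; it is not the paper's exact representation or operators.
import random

N_VARS, MAX_LAG = 5, 30

def random_network(n_edges=8):
    return [(random.randrange(N_VARS), random.randint(1, MAX_LAG), random.randrange(N_VARS))
            for _ in range(n_edges)]

def mutate(network, p=0.2, step=3):
    out = []
    for parent, lag, child in network:
        if random.random() < p:
            lag = max(1, min(MAX_LAG, lag + random.randint(-step, step)))
        out.append((parent, lag, child))
    return out

candidate = mutate(random_network())   # fitness would score how well the edges explain the MTS
```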


Archive | 2012

Advances in Intelligent Data Analysis XI

Jaakko Hollmén; Frank Klawonn; Allan Tucker

Over-fitting is a ubiquitous problem in machine learning, and a variety of techniques to avoid over-fitting the training sample have proven highly effective, including early stopping, regularization, and ensemble methods. However, while over-fitting in training is widely appreciated and its avoidance now a standard element of best practice, over-fitting can also occur in model selection. This form of over-fitting can significantly degrade generalization performance, but has thus far received little attention. For example, the kernel and regularization parameters of a support vector machine are often tuned by optimizing a cross-validation based model selection criterion. However, the cross-validation estimate of generalization performance will inevitably have a finite variance, such that its minimizer depends on the particular sample on which it is evaluated, and this will generally differ from the minimizer of the true generalization error. Therefore, if the cross-validation error is aggressively minimized, generalization performance may be substantially degraded. In general, the smaller the amount of data available, the higher the variance of the model selection criterion, and hence the more likely over-fitting in model selection will be a significant problem. Similarly, the more hyper-parameters to be tuned in model selection, the more easily the variance of the model selection criterion can be exploited, which again increases the likelihood of over-fitting in model selection. Over-fitting in model selection is empirically demonstrated to pose a substantial pitfall in the application of kernel learning methods and Gaussian process classifiers. Furthermore, the evaluation of machine learning methods can easily be significantly biased unless the evaluation protocol properly accounts for this type of over-fitting. Fortunately, the common solutions to avoiding over-fitting in training also appear to be effective in avoiding over-fitting in model selection. Three examples are presented, based on regularization of the model selection criterion, early stopping in model selection, and minimizing the number of hyper-parameters to be tuned during model selection.
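A standard way to keep evaluation honest in the presence of model-selection over-fitting is nested cross-validation. The short illustration below is not taken from the chapter; it tunes an RBF SVM in an inner loop and evaluates it in an outer loop, with toy data and an assumed parameter grid.

```python
# Hedged illustration (not from the chapter): nested cross-validation keeps the
# performance estimate unbiased when hyper-parameters are themselves chosen by
# cross-validation. Data and parameter grid are toy choices.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

param_grid = {"C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1e-1]}
inner = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)   # model selection loop
outer_scores = cross_val_score(inner, X, y, cv=5)           # unbiased evaluation loop

print(outer_scores.mean())
```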


International Conference of the IEEE Engineering in Medicine and Biology Society | 2010

The Pseudotemporal Bootstrap for Predicting Glaucoma From Cross-Sectional Visual Field Data

Allan Tucker; David F. Garway-Heath

Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including visual field (VF) tests, retinal images, and frequent intraocular pressure measurements. Like the progression of many biological and medical processes, VF progression is inherently temporal in nature. However, many datasets associated with the study of such processes are cross-sectional, and the time dimension is not measured because of the expense of longitudinal studies. In this paper, we address this issue by developing a method to build artificial time series, which we call pseudo time series, from cross-sectional data. This involves building trajectories through all of the data that can then, in turn, be used to build temporal models for forecasting (which would otherwise be impossible without longitudinal data). Glaucoma, like many diseases, is a family of conditions, and it is therefore likely that a number of key trajectories are important in understanding the disease. In order to deal with such situations, we extend the idea of pseudo time series by using resampling techniques to build multiple sequences prior to model building. This approach naturally handles outliers and multiple possible disease trajectories. We demonstrate some key properties of our approach on synthetic data and present very promising results on VF data for predicting glaucoma.
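The resampling idea can be sketched in a few lines. This is a deliberately simplified stand-in (bootstrap resamples ordered by distance from an assumed "healthy" anchor) rather than the paper's trajectory construction.

```python
# Hedged, simplified sketch of pseudo time series with resampling: bootstrap the
# cross-sectional data and order each resample by distance from an assumed "healthy"
# anchor. The paper's trajectory-building method is more sophisticated than this.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))             # toy cross-sectional visual-field features
anchor = X.mean(axis=0)                   # illustrative choice of trajectory start

def pseudo_time_series(X, anchor, n_series=10, size=80):
    series = []
    for _ in range(n_series):
        idx = rng.choice(len(X), size=size, replace=True)   # bootstrap resample
        sample = X[idx]
        order = np.argsort(np.linalg.norm(sample - anchor, axis=1))
        series.append(sample[order])                        # ordering stands in for time
    return series

trajectories = pseudo_time_series(X, anchor)                # inputs for a temporal model
```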


Genetic and Evolutionary Computation Conference | 2004

Clustering with Niching Genetic K-means Algorithm

Weiguo Sheng; Allan Tucker; Xiaohui Liu

GA-based clustering algorithms often employ either a simple GA, a steady-state GA or their variants, and fail to consistently and efficiently identify high-quality solutions (the best known optima) for clustering problems that involve large datasets with many local optima. To circumvent this problem, we propose the Niching Genetic K-means Algorithm (NGKA), which is based on modified deterministic crowding and embeds the computationally attractive k-means. Our experiments show that NGKA can consistently and efficiently identify high-quality solutions. Experiments use both simulated and real data of varying size and with varying numbers of local optima. The significance of NGKA is also shown on the experimental datasets by comparison through simulations with the Genetically Guided Algorithm (GGA) and the Genetic K-means Algorithm (GKA).
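The two ingredients named in the abstract, deterministic-crowding-style replacement and a k-means refinement of each chromosome, can be sketched roughly as below; NGKA's actual recombination, mutation and crowding details are simplified away, and the data and parameters are toy choices.

```python
# Hedged sketch combining deterministic-crowding-style replacement with k-means
# refinement of chromosomes (sets of cluster centres). NGKA's actual operators are
# more elaborate; data and parameters here are toy choices.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 2))
K, POP = 3, 10

def fitness(centres):                     # negative k-means distortion (higher is better)
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return -d.min(axis=1).sum()

def refine(centres):                      # a short k-means run started from these centres
    km = KMeans(n_clusters=K, init=centres, n_init=1, max_iter=5).fit(X)
    return km.cluster_centers_

pop = [refine(X[rng.choice(len(X), K, replace=False)]) for _ in range(POP)]
for _ in range(20):
    i, j = rng.choice(POP, 2, replace=False)
    child = refine((pop[i] + pop[j]) / 2 + rng.normal(scale=0.05, size=(K, 2)))
    # deterministic crowding: the child competes only with its more similar parent
    target = i if np.linalg.norm(child - pop[i]) < np.linalg.norm(child - pop[j]) else j
    if fitness(child) > fitness(pop[target]):
        pop[target] = child

best = max(pop, key=fitness)
```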

Collaboration


Dive into Allan Tucker's collaborations.

Top Co-Authors

Stephen Swift (Brunel University London)

Steve Counsell (Brunel University London)

Paul Kellam (Imperial College London)

Yuanxi Li (Brunel University London)

Peter A. C. 't Hoen (Leiden University Medical Center)