Publication


Featured research published by Mark Eastwood.


International Conference on Knowledge-Based and Intelligent Information and Engineering Systems | 2009

A Non-sequential Representation of Sequential Data for Churn Prediction

Mark Eastwood; Bogdan Gabrys

We investigate the event-sequence length that gives the best predictions when using a continuous HMM approach to churn prediction from sequential data. Motivated by the observation that predictions based on only the few most recent events seem to be the most accurate, a non-sequential dataset is constructed from customer event histories by averaging the features of the last few events. A simple K-nearest-neighbour algorithm on this dataset is found to give significantly improved performance. It is intuitive that most people react only to events in the fairly recent past: telecommunications events occurring months or years ago are unlikely to have a large impact on a customer's future behaviour, and these results bear this out. Methods that deal with sequential data also tend to be much more complex than those for simple non-temporal data, an added benefit of expressing the recent information in a non-sequential manner.
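The core construction above is simple enough to sketch: average the feature vectors of the last few events per customer, then apply plain K-nearest-neighbour. The data, feature count, and `k` values below are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from collections import Counter

def last_k_average(event_history, k=3):
    """Collapse an event sequence into one fixed-length vector by
    averaging the feature vectors of the k most recent events
    (the non-sequential representation described above)."""
    recent = event_history[-k:]
    return np.mean(recent, axis=0)

def knn_predict(train_X, train_y, query, n_neighbors=3):
    """Plain K-nearest-neighbour majority vote on the averaged vectors."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:n_neighbors]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: each customer is a sequence of 2-feature events; label 1 = churned.
histories = [
    np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [1.0, 0.9]]),  # churner
    np.array([[0.1, 0.1], [0.2, 0.2], [0.1, 0.3]]),              # loyal
    np.array([[0.8, 0.9], [0.9, 1.0]]),                          # churner
]
labels = np.array([1, 0, 1])

X = np.vstack([last_k_average(h, k=2) for h in histories])
pred = knn_predict(X, labels,
                   last_k_average(np.array([[0.9, 0.9], [1.0, 1.0]]), k=2),
                   n_neighbors=1)
```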


Signal Processing Systems | 2007

The Dynamics of Negative Correlation Learning

Mark Eastwood; Bogdan Gabrys

In this paper we combine two points made in two previous papers on negative correlation learning (NC) by different authors, which have theoretical implications for the optimal setting of λ, a parameter of the method whose correct choice is critical for stability and good performance. An expression for the optimal λ is derived whose value λ* depends only on the number of classifiers in the ensemble. This result arises from the form of the ambiguity decomposition of the ensemble error, and the close links between this and the error function used in NC. By analyzing the dynamics of the outputs we find dramatically different behavior for λ < λ*, λ = λ* and λ > λ*, providing further motivation for our choice of λ and theoretical explanations for some empirical observations in other papers on NC. These results are illustrated using well-known synthetic and medical datasets.
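For concreteness, a minimal sketch of the NC update that λ controls, using the commonly cited form of the NC error (the paper's derived λ*, a function of ensemble size alone, would simply be substituted for `lam`; the simplified gradient below is the standard textbook form, not taken from this paper):

```python
import numpy as np

def nc_gradients(outputs, target, lam):
    """Per-member gradients for negative correlation learning on one
    example. Commonly cited NC error for member i:
        e_i = 0.5*(f_i - d)**2 + lam*(f_i - fbar)*sum_{j!=i}(f_j - fbar)
    whose penalty equals -lam*(f_i - fbar)**2. With other members'
    outputs treated as fixed, the usual simplified gradient is:
        d e_i / d f_i = (f_i - d) - lam*(f_i - fbar)."""
    fbar = outputs.mean()
    return (outputs - target) - lam * (outputs - fbar)

outs = np.array([0.2, 0.6, 0.7])
g = nc_gradients(outs, target=0.5, lam=0.5)
```

Note the penalty terms sum to zero across the ensemble (since Σ(f_i − f̄) = 0), so λ redistributes error among members rather than changing the ensemble-level gradient, which is why its setting governs stability.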


Intelligent Data Analysis | 2014

From Sensor Readings to Predictions: On the Process of Developing Practical Soft Sensors

Marcin Budka; Mark Eastwood; Bogdan Gabrys; Petr Kadlec; Manuel Martin Salvador; Stephanie Schwan; Athanasios Tsakonas; Indrė Žliobaitė

Automatic data acquisition systems provide large amounts of streaming data generated by physical sensors. This data forms the input to computational models (soft sensors) routinely used for monitoring and control of industrial processes, traffic patterns, the environment, natural hazards, and more. The majority of these models assume that the data arrives in a cleaned and pre-processed form, ready to be fed directly into a predictive model. In practice, to ensure appropriate data quality, most of the modelling effort is spent preparing raw sensor readings for use as model inputs. This study analyzes the process of data preparation for predictive models with streaming sensor data. We present data preparation as a four-step process, identify the key challenges in each step, and provide recommendations for handling them. The discussion focuses on approaches that are less commonly used but which, in our experience, contribute particularly well to solving practical soft-sensor tasks. Our arguments are illustrated with a case study from the chemical production industry.
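The abstract does not enumerate the four steps, so the routine below is only an illustrative cleaning pass over one window of raw readings, covering steps that commonly appear in such pipelines (missing-value filling, outlier clipping, standardisation); it is not the paper's recipe.

```python
import numpy as np

def prepare_window(readings, clip_sigma=3.0):
    """Illustrative preparation of one window of raw sensor readings
    into model-ready inputs. Assumed steps:
    1) forward-fill missing values (NaN),
    2) clip outliers beyond clip_sigma standard deviations,
    3) standardise to zero mean / unit variance."""
    x = np.asarray(readings, dtype=float)
    # 1) forward-fill NaNs (leading NaNs, if any, are left untouched)
    for i in range(1, len(x)):
        if np.isnan(x[i]):
            x[i] = x[i - 1]
    # 2) clip outliers relative to the window's mean and spread
    mu, sigma = x.mean(), x.std()
    x = np.clip(x, mu - clip_sigma * sigma, mu + clip_sigma * sigma)
    # 3) standardise for the downstream predictive model
    return (x - x.mean()) / (x.std() + 1e-12)

# A dropped reading (NaN) and a spike (50.0) in an otherwise ~1.0 signal:
cleaned = prepare_window([1.0, np.nan, 1.2, 50.0, 0.9], clip_sigma=1.0)
```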


Neurocomputing | 2014

Evaluation of hyperbox neural network learning for classification

Mark Eastwood; Chrisina Jayne

This paper evaluates the performance of a number of novel extensions of the hyperbox neural network algorithm, a method which uses different modes of learning for supervised classification problems. One hyperbox per class is defined that covers the full range of attribute values in the class. Each hyperbox has one or more neurons associated with it, which model the class distribution. During prediction, points falling into only one hyperbox can be classified immediately, with the neural outputs used only when points lie in overlapping regions of hyperboxes. Decomposing the learning problem into easier and harder regions allows extremely efficient classification. We introduce an unsupervised clustering stage in each hyperbox followed by supervised learning of one neuron per cluster. Both random and heuristic-driven initialisation of the cluster centres and initial weight vectors are considered. We also consider an adaptive activation function for use in the neural mode. The performance and computational efficiency of the hyperbox methods are evaluated on artificial datasets and publicly available real datasets, and compared with results obtained on the same datasets using Support Vector Machine, decision tree, K-nearest-neighbour, and Multilayer Perceptron (with backpropagation) classifiers. We conclude that the method performs competitively and is computationally efficient, and we provide recommendations for its best use based on the results on artificial datasets and an evaluation of its sensitivity to initialisation.
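The easy-region/hard-region split can be sketched directly: one axis-aligned box per class from the training ranges, immediate classification when a point falls in exactly one box, and a fallback model otherwise. The fallback here is 1-nearest-neighbour purely for a self-contained illustration; the paper uses neurons per hyperbox.

```python
import numpy as np

class HyperboxClassifier:
    """Minimal sketch of the hyperbox idea: one box per class covering
    that class's attribute ranges; points inside exactly one box are
    classified immediately, others fall back to a secondary model
    (1-NN here, standing in for the paper's neural mode)."""
    def fit(self, X, y):
        self.X, self.y = X, y
        self.boxes = {c: (X[y == c].min(axis=0), X[y == c].max(axis=0))
                      for c in np.unique(y)}
        return self

    def predict_one(self, x):
        inside = [c for c, (lo, hi) in self.boxes.items()
                  if np.all(x >= lo) and np.all(x <= hi)]
        if len(inside) == 1:           # easy region: the box decides
            return inside[0]
        # overlapping or empty region: defer to the fallback model
        return self.y[np.argmin(np.linalg.norm(self.X - x, axis=1))]

X = np.array([[0., 0.], [1., 1.], [5., 5.], [6., 6.]])
y = np.array([0, 0, 1, 1])
clf = HyperboxClassifier().fit(X, y)
```

The efficiency claim follows from the structure: the box test is a handful of comparisons per class, so only the minority of points in overlap (or uncovered) regions pay for the more expensive model.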


Expert Systems With Applications | 2012

Generalised bottom-up pruning

Mark Eastwood; Bogdan Gabrys

Highlights:
- Bottom-up pruning is extended to the multiple-tree context.
- Suitable pruning criteria are proposed.
- The method is tested on a number of UCI datasets.
- The method produces single trees with good performance and compactness.
- Applied to a churn prediction problem, it produces interpretable trees.

A generalisation of bottom-up pruning is proposed as a model-level combination method for a decision tree ensemble. Bottom-up pruning on a single tree involves choosing between a subtree rooted at a node and a leaf, dependent on a pruning criterion. A natural extension to an ensemble of trees is to allow subtrees from other ensemble trees to be grafted onto a node, in addition to the operations of pruning to a leaf and leaving the existing subtree intact. Suitable pruning criteria are proposed and tested for this multi-tree pruning context. Gains in performance and, in particular, compactness over individually pruned trees are observed in tests on a number of datasets from the UCI repository. The method is further illustrated on a churn prediction problem in the telecommunications domain.
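The per-node decision the generalisation adds can be sketched as a selection among candidate sub-models. The criterion below (error on validation data reaching the node) and the toy candidates are illustrative assumptions; the paper proposes criteria suited specifically to the multi-tree setting.

```python
import numpy as np

def choose_replacement(node_val_X, node_val_y, candidates):
    """Core selection step of generalised bottom-up pruning, sketched.
    At a node we choose among: the node's own subtree, a single leaf,
    or (the generalisation) a subtree grafted from another ensemble
    tree, by lowest error on the validation data reaching the node."""
    errors = {name: np.mean(predict(node_val_X) != node_val_y)
              for name, predict in candidates.items()}
    return min(errors, key=errors.get)

# Hypothetical candidates at one node, as prediction callables:
X_val = np.array([[0.], [1.], [2.], [3.]])
y_val = np.array([0, 0, 1, 1])
candidates = {
    "own_subtree":     lambda X: (X[:, 0] > 0.5).astype(int),  # 1 mistake
    "leaf_majority":   lambda X: np.zeros(len(X), dtype=int),  # 2 mistakes
    "grafted_subtree": lambda X: (X[:, 0] > 1.5).astype(int),  # 0 mistakes
}
best = choose_replacement(X_val, y_val, candidates)
```

Applying this choice bottom-up over one tree, with grafts drawn from the rest of the ensemble, collapses the ensemble toward a single compact tree, which is where the interpretability gains come from.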


International Conference on Data Mining | 2011

Interpretable, Online Soft-Sensors for Process Control

Mark Eastwood; Petr Kadlec

When building a soft sensor for control purposes, it is essential that information regarding the dependence of the soft sensor on the input variables can be extracted from the underlying model. We present an adaptive soft sensor capable of providing online feedback on this dependence through an online contribution plot. Two core methods (recursive PLS and adaptive decision trees), both producing highly interpretable models, are used within a modification of a previously established soft-sensor framework. This framework is used to build a soft sensor on real-world industrial data.
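For a linear model such as the recursive PLS component, one simple form a contribution plot can take is the per-variable product of coefficient and mean-centred input; this decomposition is a common illustration, assumed here rather than taken from the paper.

```python
import numpy as np

def contribution_plot_values(coefs, x, x_mean):
    """Sketch of contribution values for a linear soft sensor: each
    input's contribution to the prediction's deviation from its mean
    is its (mean-centred) value times its regression coefficient,
    so an operator can see which variables drive the output now."""
    return coefs * (x - x_mean)

coefs = np.array([2.0, -1.0, 0.5])           # hypothetical PLS coefficients
x_now = np.array([1.5, 1.0, 4.0])            # current sensor readings
x_mean = np.array([1.0, 1.0, 2.0])           # running means
contrib = contribution_plot_values(coefs, x_now, x_mean)
```

The contributions sum to the prediction's deviation from its baseline, which is what makes the plot a faithful decomposition of the model output rather than a heuristic.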


International Symposium on Neural Networks | 2014

Dual Deep Neural Network approach to matching data in different modes

Mark Eastwood; Chrisina Jayne

This paper investigates the application of a novel Deep Neural Network (DNN) architecture to the problem of matching data in different modes. Initially one DNN is pre-trained as a feature extractor using several stacked Restricted Boltzmann Machine (RBM) blocks on the entire training data using unsupervised learning. This DNN is duplicated and each net is fine-tuned by training on the data represented in a specific mode using supervised learning. The target of each DNN is linked to the output of the other DNN, ensuring that matching features are learnt which take the differing representations into account. These features are used with a distance metric to determine matches. The expected benefit of this approach is utilizing the capability of DNNs to learn higher-level features which better capture the information contained in the input data's structure, while ensuring the differences in data representation are accounted for. The architecture is applied to the problem of matching faces and sketches, and the results are compared to traditional approaches employing Principal Component Analysis (PCA) or Linear Discriminant Analysis (LDA).
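The final matching stage described above can be sketched independently of the networks: given feature vectors produced by the two fine-tuned nets (one per mode), pair each item with its nearest cross-mode neighbour under a distance metric. Cosine distance is assumed here for illustration; the abstract does not fix the metric.

```python
import numpy as np

def match_by_cosine(feats_a, feats_b):
    """Pair each mode-A feature vector (e.g. from the photo network)
    with its most similar mode-B vector (e.g. from the sketch network)
    under cosine similarity. The networks themselves are omitted."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sims = a @ b.T                  # cosine similarity matrix
    return sims.argmax(axis=1)      # best mode-B match per mode-A item

# Toy 2-D features standing in for the two networks' outputs:
photos = np.array([[1.0, 0.1], [0.1, 1.0]])
sketches = np.array([[0.2, 0.9], [0.9, 0.2]])
matches = match_by_cosine(photos, sketches)
```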


International Conference on Neural Information Processing | 2016

Selective Dropout for Deep Neural Networks

Erik Barrow; Mark Eastwood; Chrisina Jayne

Dropout has been proven to be an effective method for reducing overfitting in deep artificial neural networks. We present 3 new alternative methods for performing dropout on a deep neural network which improve the effectiveness of the dropout method over the same training period. These methods select neurons to be dropped using statistics calculated from a neuron's change in weight, the average size of a neuron's weights, and the output variance of a neuron. We found that increasing the probability of dropping neurons with smaller values of these statistics and decreasing the probability for those with larger values gave an improved result in training over 10,000 epochs. The most effective of these was found to be the Output Variance method, giving an average improvement of 1.17% accuracy over traditional dropout methods.
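A minimal sketch of the output-variance idea: rank neurons by the variance of their outputs over a batch and drop low-variance neurons with higher probability, keeping the mean drop rate near the base rate. The linear rank-to-probability mapping below is our assumption, not the paper's exact formula.

```python
import numpy as np

def selective_dropout_mask(activations, base_rate=0.5):
    """activations: (batch, neurons) array of a layer's outputs.
    Neurons whose outputs vary little across the batch get a higher
    drop probability, high-variance neurons a lower one, with the
    mean drop probability equal to base_rate."""
    var = activations.var(axis=0)           # per-neuron output variance
    rank = np.argsort(np.argsort(var))      # 0 = smallest variance
    # linear map: rank 0 -> highest p(drop), top rank -> lowest
    p_drop = base_rate * 2 * (1 - (rank + 0.5) / len(var))
    p_drop = np.clip(p_drop, 0.0, 1.0)
    keep = np.random.rand(len(var)) >= p_drop
    return keep, p_drop

# Columns scaled so output variance rises from neuron 0 to neuron 9:
acts = np.random.RandomState(0).randn(32, 10) * np.linspace(0.1, 2.0, 10)
keep, p_drop = selective_dropout_mask(acts)
```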


International Conference on Neural Information Processing | 2015

Deep Dropout Artificial Neural Networks for Recognising Digits and Characters in Natural Images

Erik Barrow; Chrisina Jayne; Mark Eastwood

Recognising images using computers is a traditionally hard problem in computing, and one that becomes particularly difficult when the images come from the real world, due to the large variations in them. This paper investigates the problem of recognising digits and characters in natural images using a deep neural network approach. The experiments explore the use of the recently introduced dropout method, which reduces overfitting. A number of networks with different configurations are trained. It is found that the majority of networks give better accuracy when trained using the dropout method. This indicates that dropout is an effective method for improving the training of deep neural networks applied to recognising natural images of digits and characters.


International Symposium on Neural Networks | 2013

Restricted Boltzmann machines for pre-training deep Gaussian networks

Mark Eastwood; Chrisina Jayne

A Restricted Boltzmann Machine (RBM) is proposed with an energy function which we show results in hidden node activation probabilities that match the activation rule of neurons in a Gaussian synapse neural network. This makes the proposed RBM a potential tool for pre-training a Gaussian synapse network with a deep architecture, in a similar way to how RBMs have been used in a greedy layer-wise pre-training procedure for deep neural networks with scalar synapses. Using experimental examples, we investigate the training characteristics of this form of RBM and discuss its suitability for pre-training a deep Gaussian synapse network. While this is the most direct route to a deep Gaussian synapse network, we explain and discuss a number of issues found in using the proposed form of RBM in this way, and suggest possible solutions.
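For reference, the greedy layer-wise procedure the paper builds on uses contrastive-divergence updates per layer. The sketch below is one CD-1 step for a standard binary RBM; the proposed variant replaces the energy (and hence the hidden activation rule) with a Gaussian-synapse form, but the training loop is analogous.

```python
import numpy as np

def cd1_step(v0, W, b, c, lr=0.1, rng=np.random):
    """One contrastive-divergence (CD-1) update for a standard binary
    RBM with visible bias b, hidden bias c, and weights W.
    Illustrative only: the RBM proposed above uses a different energy
    whose hidden activations match Gaussian-synapse neurons."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    ph0 = sig(v0 @ W + c)                      # hidden probabilities
    h0 = (rng.rand(*ph0.shape) < ph0) * 1.0    # sampled hidden states
    pv1 = sig(h0 @ W.T + b)                    # visible reconstruction
    ph1 = sig(pv1 @ W + c)                     # hidden given reconstruction
    # positive-phase minus negative-phase statistics
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

rng = np.random.RandomState(0)
v = (rng.rand(16, 6) < 0.5) * 1.0              # toy binary visible data
W = 0.01 * rng.randn(6, 4)
b, c = np.zeros(6), np.zeros(4)
W, b, c = cd1_step(v, W, b, c, rng=rng)
```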

Collaboration


Dive into Mark Eastwood's collaborations.

Top Co-Authors


Chrisina Jayne

Robert Gordon University


Bo Tan

University College London


Petr Kadlec

Bournemouth University


Yanguo Jing

London Metropolitan University
