Publication


Featured research published by Niall M. Adams.


Knowledge Discovery and Data Mining | 1999

The impact of changing populations on classifier performance

Mark Kelly; David J. Hand; Niall M. Adams

An assumption fundamental to almost all work on supervised classification is that the probabilities of class membership, conditional on the feature vectors, are stationary. However, in many situations this assumption is untenable. We give examples of such population drift, examine its nature, show how the impact of population drift depends on the chosen measure of classification performance, and propose a strategy for dynamically updating classification rules.
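A toy numerical illustration of the point (all distributions, the threshold rule, and the drift scenario are invented for this sketch, not taken from the paper): a fixed decision threshold fitted under one population performs markedly worse after the positive class drifts toward the boundary, even though the rule itself never changed.

```python
# Toy illustration of population drift degrading a fixed classification rule.

def error_rate(threshold, negatives, positives):
    """Fraction misclassified when scores above threshold are called positive."""
    fp = sum(x > threshold for x in negatives)   # negatives wrongly flagged
    fn = sum(x <= threshold for x in positives)  # positives missed
    return (fp + fn) / (len(negatives) + len(positives))

# Training-time populations: negatives around 0, positives around 4.
neg_old = [x / 10 for x in range(-20, 21)]        # spread over [-2.0, 2.0]
pos_old = [4 + x / 10 for x in range(-20, 21)]    # spread over [2.0, 6.0]
threshold = 2.0                                   # separates them almost perfectly

# After drift, the positive class has moved toward the decision boundary.
pos_new = [1.5 + x / 10 for x in range(-20, 21)]  # spread over [-0.5, 3.5]

err_old = error_rate(threshold, neg_old, pos_old)  # near zero
err_new = error_rate(threshold, neg_old, pos_new)  # substantially worse
```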


Pattern Recognition | 1999

Comparing classifiers when the misallocation costs are uncertain

Niall M. Adams; David J. Hand

Receiver Operating Characteristic (ROC) curves are popular ways of summarising the performance of two class classification rules. In fact, however, they are extremely inconvenient. If the relative severity of the two different kinds of misclassification is known, then an awkward projection operation is required to deduce the overall loss. At the other extreme, when the relative severity is unknown, the area under an ROC curve is often used as an index of performance. However, this essentially assumes that nothing whatsoever is known about the relative severity – a situation which is very rare in real problems. We present an alternative plot which is more revealing than an ROC plot and we describe a comparative index which allows one to take advantage of anything that may be known about the relative severity of the two kinds of misclassification.
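The underlying calculation can be sketched numerically (the operating points, prior, and cost grid below are invented for illustration; the paper's actual plot and index are richer than this): for each candidate relative severity, compute each classifier's minimum expected loss over its thresholds, rather than collapsing everything into a single area.

```python
# Sketch: comparing two classifiers across a range of misallocation costs.

def min_expected_loss(operating_points, c, prior_pos=0.5):
    """operating_points: (fpr, fnr) pairs along a classifier's ROC curve.
    c is the cost of a false negative relative to a false positive (whose
    cost is fixed at 1). Returns the loss at the best threshold for c."""
    return min(
        (1 - prior_pos) * fpr + prior_pos * c * fnr
        for fpr, fnr in operating_points
    )

# Two hypothetical classifiers: A is stronger at low FPR, B at low FNR.
clf_a = [(0.0, 1.0), (0.05, 0.5), (0.2, 0.3), (1.0, 0.0)]
clf_b = [(0.0, 1.0), (0.3, 0.15), (0.5, 0.05), (1.0, 0.0)]

# Which classifier wins depends on the assumed relative severity c.
losses = {
    c: (min_expected_loss(clf_a, c), min_expected_loss(clf_b, c))
    for c in (0.25, 1.0, 4.0)
}
```

Even partial knowledge, such as "false negatives cost at least four times as much as false positives", restricts the comparison to part of the cost range, which a single AUC value cannot exploit.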


Pattern Recognition Letters | 2012

Exponentially weighted moving average charts for detecting concept drift

Gordon J. Ross; Niall M. Adams; Dimitris K. Tasoulis; David J. Hand

Classifying streaming data requires the development of methods which are computationally efficient and able to cope with changes in the underlying distribution of the stream, a phenomenon known in the literature as concept drift. We propose a new method for detecting concept drift which uses an exponentially weighted moving average (EWMA) chart to monitor the misclassification rate of a streaming classifier. Our approach is modular and can hence be run in parallel with any underlying classifier to provide an additional layer of concept drift detection. Moreover, our method is computationally efficient with overhead O(1) and works in a fully online manner with no need to store data points in memory. Unlike many existing approaches to concept drift detection, our method allows the rate of false positive detections to be controlled and kept constant over time.
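The core mechanism can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, the fixed control-limit width, and the 30-point burn-in are assumptions, and the paper derives time-varying control limits to hold the false positive rate constant, where this sketch uses a fixed multiple of the approximate EWMA standard deviation.

```python
# Minimal EWMA chart over a 0/1 misclassification stream.

class EwmaDriftDetector:
    """Monitor a stream of classification errors with an EWMA chart."""

    def __init__(self, lam=0.1, limit=3.0):
        self.lam = lam      # smoothing weight given to the newest error
        self.limit = limit  # control-limit width, in standard deviations
        self.p_hat = 0.0    # running estimate of the (pre-change) error rate
        self.z = 0.0        # EWMA of the error stream
        self.n = 0          # observations seen so far

    def update(self, error):
        """error is 1 for a misclassified point, 0 otherwise.
        Returns True when drift is signalled. O(1) time and memory."""
        self.n += 1
        self.p_hat += (error - self.p_hat) / self.n
        self.z = (1 - self.lam) * self.z + self.lam * error
        # Approximate EWMA variance under a Bernoulli(p_hat) error stream.
        var = self.p_hat * (1 - self.p_hat) * self.lam / (2 - self.lam)
        sigma = var ** 0.5
        return self.n > 30 and self.z > self.p_hat + self.limit * sigma

detector = EwmaDriftDetector()
# A clean run of correct classifications, then an abrupt jump in errors
# (simulating concept drift degrading the underlying classifier).
stream = [0] * 100 + [1] * 20
signals = [detector.update(e) for e in stream]
```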


Technometrics | 2011

Nonparametric Monitoring of Data Streams for Changes in Location and Scale

Gordon J. Ross; Dimitris K. Tasoulis; Niall M. Adams

The analysis of data streams requires methods which can cope with a very high volume of data points. Under the requirement that algorithms must have constant computational complexity and a fixed amount of memory, we develop a framework for detecting changes in data streams when the distributional form of the stream variables is unknown. We consider the general problem of detecting a change in the location and/or scale parameter of a stream of random variables, and adapt several nonparametric hypothesis tests to create a streaming change detection algorithm. This algorithm uses a test statistic with a null distribution independent of the data. This allows a desired rate of false alarms to be maintained for any stream even when its distribution is unknown. Our method is based on hypothesis tests which involve ranking data points, and we propose a method for calculating these ranks online in a manner which respects the constraints of data stream analysis.
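The rank-based idea can be sketched offline as a Mann-Whitney style scan over candidate change points (illustrative only: the function names, the fixed minimum segment length, and the tie-free null variance are assumptions, and the paper computes ranks online under streaming constraints rather than re-ranking a window like this).

```python
# Rank-based change detection: distribution-free because the statistic
# depends only on the ordering of the data, not its distribution.

def ranks(values):
    """Rank values from 1..n, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def max_mann_whitney_z(window, min_seg=5):
    """Scan candidate change points; return the largest standardised
    rank-sum statistic over all admissible splits of the window."""
    n = len(window)
    r = ranks(window)
    best = 0.0
    for tau in range(min_seg, n - min_seg):
        n1, n2 = tau, n - tau
        w = sum(r[:tau])                      # rank sum of the first segment
        mu = n1 * (n + 1) / 2                 # null mean of the rank sum
        sd = (n1 * n2 * (n + 1) / 12) ** 0.5  # null std-dev (ignoring ties)
        best = max(best, abs(w - mu) / sd)
    return best

# Deterministic toy streams: stationary vs. an abrupt location shift.
stable = [(i * 7) % 13 for i in range(60)]
shifted = [(i * 7) % 13 for i in range(30)] + [20 + (i * 7) % 13 for i in range(30)]
```

Because the null distribution of the statistic does not depend on the data's distribution, a threshold chosen for a desired false alarm rate carries over to any stream.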


Data Mining and Knowledge Discovery | 2009

Transaction aggregation as a strategy for credit card fraud detection

Christopher Whitrow; David J. Hand; Piotr Juszczak; David John Weston; Niall M. Adams

The problem of preprocessing transaction data for supervised fraud classification is considered. It is impractical to present an entire series of transactions to a fraud detection system, partly because of the very high dimensionality of such data but also because of the heterogeneity of the transactions. Hence, a framework for transaction aggregation is considered and its effectiveness is evaluated against transaction-level detection, using a variety of classification methods and a realistic cost-based performance measure. These methods are applied in two case studies using real data. Transaction aggregation is found to be advantageous in many but not all circumstances. Also, the length of the aggregation period has a large impact upon performance. Aggregation seems particularly effective when a random forest is used for classification. Moreover, random forests were found to perform better than other classification methods, including SVMs, logistic regression and KNN. Aggregation also has the advantage of not requiring precisely labeled data and may be more robust to the effects of population drift.
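The aggregation step can be illustrated with a toy window summary. The field names, the 7-day window, and this particular feature set are assumptions for illustration; the paper evaluates a range of aggregation choices and period lengths.

```python
# Transaction aggregation: summarise each account's recent activity into
# fixed-length features instead of classifying raw transactions.

from collections import defaultdict

def aggregate(transactions, window_days=7, now=100):
    """transactions: dicts with 'account', 'day', 'amount' keys.
    Returns per-account summary features over the last window_days."""
    recent = defaultdict(list)
    for t in transactions:
        if now - window_days < t["day"] <= now:
            recent[t["account"]].append(t["amount"])
    features = {}
    for account, amounts in recent.items():
        features[account] = {
            "n_tx": len(amounts),                 # transaction count
            "total": sum(amounts),                # total spend in window
            "max": max(amounts),                  # largest transaction
            "mean": sum(amounts) / len(amounts),  # average spend
        }
    return features

tx = [
    {"account": "A", "day": 95, "amount": 20.0},
    {"account": "A", "day": 99, "amount": 500.0},
    {"account": "B", "day": 60, "amount": 10.0},  # falls outside the window
]
feats = aggregate(tx)
```

Whatever the exact feature set, the resulting fixed-length vectors sidestep both the high dimensionality and the heterogeneity of raw transaction sequences, and a label is only needed at the account-window level rather than per transaction.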


Journal of Quality Technology | 2012

Two Nonparametric Control Charts for Detecting Arbitrary Distribution Changes

Gordon J. Ross; Niall M. Adams

Most traditional control charts used for sequential monitoring assume that full knowledge is available regarding the prechange distribution of the process. This assumption is unrealistic in many situations, where insufficient data are available to allow this distribution to be accurately estimated. This creates the need for nonparametric charts that do not assume any specific form for the process distribution, yet are able to maintain a specified level of performance regardless of its true nature. Although several nonparametric Phase II control charts have been developed, these are generally only able to detect changes in a location parameter, such as the mean or median, rather than more general changes. In this work, we present two distribution-free charts that can detect arbitrary changes to the process distribution during Phase II monitoring. Our charts are formed by integrating the omnibus Kolmogorov-Smirnov and Cramér-von Mises tests into the widely researched change-point model framework.
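The Kolmogorov-Smirnov ingredient can be sketched as a change-point scan (an offline illustration with assumed names and segment lengths, not the authors' sequential charts, which control the in-control run length): compare the empirical CDFs of the two segments at every candidate split, so any distributional change, not just a location shift, widens the gap.

```python
# Omnibus change detection via the two-sample Kolmogorov-Smirnov distance.

def ks_statistic(x, y):
    """Largest gap between the empirical CDFs of two samples."""
    grid = sorted(set(x) | set(y))
    d = 0.0
    for g in grid:
        fx = sum(v <= g for v in x) / len(x)
        fy = sum(v <= g for v in y) / len(y)
        d = max(d, abs(fx - fy))
    return d

def max_ks_over_splits(window, min_seg=10):
    """Treat every admissible point as a candidate change point and
    return the largest KS distance between the two segments."""
    return max(
        ks_statistic(window[:tau], window[tau:])
        for tau in range(min_seg, len(window) - min_seg)
    )

# A scale change with no mean shift: both segments are centred near 0, so
# a location-only chart sees nothing, but the CDFs pull apart.
narrow = [i / 20 - 1 for i in range(40)]      # spread over [-1.0, 0.95]
wide = [3 * (i / 20 - 1) for i in range(40)]  # spread over [-3.0, 2.85]
stat = max_ks_over_splits(narrow + wide)
```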


Intelligent Data Analysis | 2009

Advances in Intelligent Data Analysis VIII

Niall M. Adams; Céline Robardet; Arno Siebes; Jean-François Boulicaut

Invited Papers:
Intelligent Data Analysis in the 21st Century
Analyzing the Localization of Retail Stores with Complex Systems Tools

Selected Contributions 1 (Long Talks):
Change (Detection) You Can Believe in: Finding Distributional Shifts in Data Streams
Exploiting Data Missingness in Bayesian Network Modeling
DEMScale: Large Scale MDS Accounting for a Ridge Operator and Demographic Variables
How to Control Clustering Results? Flexible Clustering Aggregation
Compensation of Translational Displacement in Time Series Clustering Using Cross Correlation
Context-Based Distance Learning for Categorical Data Clustering
Semi-supervised Text Classification Using RBF Networks
Improving k-NN for Human Cancer Classification Using the Gene Expression Profiles
Subgroup Discovery for Test Selection: A Novel Approach and Its Application to Breast Cancer Diagnosis
Trajectory Voting and Classification Based on Spatiotemporal Similarity in Moving Object Databases
Leveraging Call Center Logs for Customer Behavior Prediction
Condensed Representation of Sequential Patterns According to Frequency-Based Measures
ART-Based Neural Networks for Multi-label Classification
Two-Way Grouping by One-Way Topic Models
Selecting and Weighting Data for Building Consensus Gene Regulatory Networks
Incremental Bayesian Network Learning for Scalable Feature Selection
Feature Extraction and Selection from Vibration Measurements for Structural Health Monitoring
Zero-Inflated Boosted Ensembles for Rare Event Counts

Selected Contributions 2 (Short Talks):
Mining the Temporal Dimension of the Information Propagation
Adaptive Learning from Evolving Data Streams
An Application of Intelligent Data Analysis Techniques to a Large Software Engineering Dataset
Which Distance for the Identification and the Differentiation of Cell-Cycle Expressed Genes?
Ontology-Driven KDD Process Composition
Mining Frequent Gradual Itemsets from Large Databases
Selecting Computer Architectures by Means of Control-Flow-Graph Mining
Visualization-Driven Structural and Statistical Analysis of Turbulent Flows
Distributed Algorithm for Computing Formal Concepts Using Map-Reduce Framework
Multi-Optimisation Consensus Clustering
Improving Time Series Forecasting by Discovering Frequent Episodes in Sequences
Measure of Similarity and Compactness in Competitive Space
Bayesian Solutions to the Label Switching Problem
Efficient Vertical Mining of Frequent Closures and Generators
Isotonic Classification Trees


Journal of the Operational Research Society | 2008

Performance criteria for plastic card fraud detection tools

David J. Hand; Christopher Whitrow; Niall M. Adams; Piotr Juszczak; David John Weston

In predictive data mining, algorithms will be both optimized and compared using a measure of predictive performance. Different measures will yield different results, and it follows that it is crucial to match the measure to the true objectives. In this paper, we explore the desirable characteristics of measures for constructing and evaluating tools for mining plastic card data to detect fraud. We define two measures, one based on minimizing the overall cost to the card company, and the other based on minimizing the amount of fraud given the maximum number of investigations the card company can afford to make. We also describe a plot, analogous to the standard ROC, for displaying the performance trace of an algorithm as the relative costs of the two different kinds of misclassification—classing a fraudulent transaction as legitimate or vice versa—are varied.
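The second style of measure can be sketched directly (the data, function name, and budget below are invented for illustration): rank transactions by suspicion score and ask how much of the total fraud falls inside a fixed investigation budget.

```python
# Fraud captured within a fixed investigation budget.

def fraud_caught_at_budget(scores, labels, budget):
    """scores: suspicion scores; labels: 1 = fraud, 0 = legitimate.
    Returns the fraction of all fraud found among the `budget`
    highest-scoring transactions."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    caught = sum(label for _, label in ranked[:budget])
    total = sum(labels)
    return caught / total

# Six hypothetical transactions with suspicion scores and true labels.
scores = [0.9, 0.8, 0.2, 0.7, 0.1, 0.6]
labels = [1,   0,   1,   1,   0,   0]
rate = fraud_caught_at_budget(scores, labels, budget=2)
```

Unlike a cost-based measure, this criterion needs no estimate of misclassification costs, only the number of investigations the card company can afford.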


Advances in Data Analysis and Classification | 2008

Plastic card fraud detection using peer group analysis

David John Weston; David J. Hand; Niall M. Adams; Christopher Whitrow; Piotr Juszczak

Peer group analysis is an unsupervised method for monitoring behaviour over time. In the context of plastic card fraud detection, this technique can be used to find anomalous transactions. These are transactions that deviate strongly from their peer group and are flagged as potentially fraudulent. Time alignment, the quality of the peer groups and the timeliness of assigning fraud flags to transactions are described. We demonstrate the ability to detect fraud using peer groups with real credit card transaction data and define a novel method for evaluating performance.
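A toy sketch of the peer-group idea follows. The account data, peer count, deviation score, and standard-deviation floor are all invented for illustration, not the authors' procedure: each account is compared with the accounts whose past behaviour was most similar, and flagged when its current behaviour departs sharply from that peer group's.

```python
# Peer group analysis: unsupervised anomaly flags from peer comparison.

def peer_group_scores(history, current, k=2):
    """history: {account: past spend}; current: {account: latest spend}.
    An account's peers are the k accounts with the closest past spend;
    its score is the gap between its current spend and the peer mean,
    in units of the peer standard deviation (floored to avoid dividing
    by a near-zero spread)."""
    scores = {}
    for acc, past in history.items():
        peers = sorted(
            (a for a in history if a != acc),
            key=lambda a: abs(history[a] - past),
        )[:k]
        peer_now = [current[a] for a in peers]
        mean = sum(peer_now) / len(peer_now)
        var = sum((v - mean) ** 2 for v in peer_now) / len(peer_now)
        sd = max(var ** 0.5, 1.0)
        scores[acc] = abs(current[acc] - mean) / sd
    return scores

history = {"A": 100, "B": 105, "C": 95, "D": 500, "E": 480}
current = {"A": 102, "B": 98, "C": 900, "D": 510, "E": 505}  # C breaks pattern
scores = peer_group_scores(history, current)
```

Note that C is flagged without any labelled fraud examples: the signal comes purely from deviating relative to accounts that used to behave alike, which is what lets the method work without precisely labelled data.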


Urban Education | 2010

Reading Proficiency and Mathematics Problem Solving by High School English Language Learners

Carole R. Beal; Niall M. Adams; Paul R. Cohen

The study focused on the relationship of English proficiency and math performance in a sample of high school students, including 47% English language learners (ELLs). Data sources included state math test scores, study-specific pre- and posttest scores, problem solving in an online math tutorial, and responses to a self-report assessment of mathematics self-concept. English conversational and reading proficiency data were available for the ELLs. Results indicated that math performance for the ELLs increased with English-reading proficiency in a nonlinear manner. ELLs’ English-reading proficiency predicted math test scores, progress in the online math tutorial, and math self-concept.
