Dimitris K. Tasoulis
Imperial College London
Publication
Featured research published by Dimitris K. Tasoulis.
IEEE Transactions on Evolutionary Computation | 2011
Michael G. Epitropakis; Dimitris K. Tasoulis; Nicos G. Pavlidis; Vassilis P. Plagianakos; Michael N. Vrahatis
Differential evolution is a very popular optimization algorithm and considerable research has been devoted to the development of efficient search operators. Motivated by the different manner in which various search operators behave, we propose a novel framework based on the proximity characteristics among the individual solutions as they evolve. Our framework incorporates information of neighboring individuals, in an attempt to efficiently guide the evolution of the population toward the global optimum, without sacrificing the search capabilities of the algorithm. More specifically, the random selection of parents during mutation is modified, by assigning to each individual a probability of selection that is inversely proportional to its distance from the mutated individual. The proposed framework can be applied to any mutation strategy with minimal changes. In this paper, we incorporate this framework in the original differential evolution algorithm, as well as other recently proposed differential evolution variants. Through an extensive experimental study, we show that the proposed framework results in enhanced performance for the majority of the benchmark problems studied.
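The parent selection rule described in this abstract can be illustrated with a minimal Python sketch. It assumes Euclidean distance and a weight of 1/distance that is then normalised; the abstract only states that the probability is inversely proportional to distance, so the exact formula, the function names `selection_probabilities` and `pick_parents`, and the `eps` guard are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def selection_probabilities(population, i, eps=1e-12):
    """Probability of picking each individual j != i as a parent for i.

    Assumption: weight_j = 1 / distance(i, j), then normalised to sum to 1.
    """
    target = population[i]
    weights = {}
    for j, candidate in enumerate(population):
        if j != i:
            # eps avoids division by zero when two individuals coincide.
            weights[j] = 1.0 / (math.dist(target, candidate) + eps)
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def pick_parents(population, i, k=3, rng=random):
    """Draw k distinct parent indices for individual i, nearer ones more often."""
    probs = selection_probabilities(population, i)
    chosen = []
    for _ in range(k):
        # Sample without replacement: drop indices that were already drawn.
        candidates = [j for j in probs if j not in chosen]
        weights = [probs[j] for j in candidates]
        chosen.append(rng.choices(candidates, weights=weights, k=1)[0])
    return chosen
```

The indices returned by `pick_parents` would replace the uniformly random parents in whichever mutation strategy is used, which is consistent with the abstract's claim that the framework needs only minimal changes to the strategy itself.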
Pattern Recognition Letters | 2012
Gordon J. Ross; Niall M. Adams; Dimitris K. Tasoulis; David J. Hand
Classifying streaming data requires the development of methods which are computationally efficient and able to cope with changes in the underlying distribution of the stream, a phenomenon known in the literature as concept drift. We propose a new method for detecting concept drift which uses an exponentially weighted moving average (EWMA) chart to monitor the misclassification rate of a streaming classifier. Our approach is modular and can hence be run in parallel with any underlying classifier to provide an additional layer of concept drift detection. Moreover, our method is computationally efficient with overhead O(1) and works in a fully online manner with no need to store data points in memory. Unlike many existing approaches to concept drift detection, our method allows the rate of false positive detections to be controlled and kept constant over time.
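As a rough illustration of monitoring a misclassification rate with an EWMA chart, the sketch below keeps a running error-rate estimate and an EWMA of the 0/1 error indicators, and raises an alarm when the EWMA exceeds the estimate by `L` standard deviations. The class name, the fixed control limit `L`, the `warmup` period and the zero initial value are assumptions made for this example. The abstract's constant false positive rate depends on how the control limit is chosen, which is not reproduced here.

```python
import math


class EwmaDriftDetector:
    """O(1) time and memory per observation; no data points are stored."""

    def __init__(self, lam=0.2, L=3.0, warmup=30):
        self.lam = lam        # EWMA smoothing weight (assumed value)
        self.L = L            # control-limit width (assumed fixed here)
        self.warmup = warmup  # observations to see before alarms are allowed
        self.t = 0
        self.p_hat = 0.0      # running estimate of the error rate
        self.z = 0.0          # EWMA of the error indicators

    def update(self, error):
        """Feed one 0/1 error indicator; return True if drift is flagged."""
        self.t += 1
        self.p_hat += (error - self.p_hat) / self.t
        self.z = (1.0 - self.lam) * self.z + self.lam * error
        # Variance of a Bernoulli error, scaled to the variance of the EWMA.
        var_x = self.p_hat * (1.0 - self.p_hat)
        shrink = 1.0 - (1.0 - self.lam) ** (2 * self.t)
        var_z = var_x * self.lam / (2.0 - self.lam) * shrink
        limit = self.p_hat + self.L * math.sqrt(var_z)
        return self.t > self.warmup and self.z > limit
```

Because the detector only consumes error indicators, it can sit beside any classifier: after each labelled example arrives, pass `int(prediction != label)` to `update`.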
Technometrics | 2011
Gordon J. Ross; Dimitris K. Tasoulis; Niall M. Adams
The analysis of data streams requires methods which can cope with a very high volume of data points. Under the requirement that algorithms must have constant computational complexity and a fixed amount of memory, we develop a framework for detecting changes in data streams when the distributional form of the stream variables is unknown. We consider the general problem of detecting a change in the location and/or scale parameter of a stream of random variables, and adapt several nonparametric hypothesis tests to create a streaming change detection algorithm. This algorithm uses a test statistic with a null distribution independent of the data. This allows a desired rate of false alarms to be maintained for any stream even when its distribution is unknown. Our method is based on hypothesis tests which involve ranking data points, and we propose a method for calculating these ranks online in a manner which respects the constraints of data stream analysis.
Archive | 2008
Vassilis P. Plagianakos; Dimitris K. Tasoulis; Michael N. Vrahatis
In this chapter we present an overview of the major application areas of differential evolution. In particular, we highlight the strengths of DE algorithms in tackling many difficult problems from diverse scientific areas, including single and multiobjective function optimization, neural network training, clustering, and real-life DNA microarray classification. To improve the speed and performance of the algorithm we employ distributed computing architectures and demonstrate how parallel, multi-population DE architectures can be utilised in single and multiobjective optimization. Using data mining we present a methodology that allows the simultaneous discovery of multiple local and global minimizers of an objective function. As a next step, we present applications of DE to real-life problems, including the training of integer weight neural networks and the selection of genes of DNA microarrays in order to boost the predictive accuracy of classification models. The chapter concludes with a discussion on promising future extensions of the algorithm, and presents novel mutation operators, which are the result of a genetic programming procedure, as a very interesting future research direction.
Statistical Analysis and Data Mining | 2012
Christoforos Anagnostopoulos; Dimitris K. Tasoulis; Niall M. Adams; Nicos G. Pavlidis; David J. Hand
Advances in data technology have enabled streaming acquisition of real-time information in a wide range of settings, including consumer credit, electricity consumption, and internet user behavior. Streaming data consist of transiently observed, temporally evolving data sequences, and pose novel challenges to statistical analysis. Foremost among these challenges are the need for online processing, and temporal adaptivity in the face of unforeseen changes, both smooth and abrupt, in the underlying data generation mechanism. In this paper, we develop streaming versions of two widely used parametric classifiers, namely quadratic and linear discriminant analysis. We rely on computationally efficient, recursive formulations of these classifiers. We additionally equip them with exponential forgetting factors that enable temporal adaptivity via smoothly down-weighting the contribution of older data. Drawing on ideas from adaptive filtering, we develop an online method for self-tuning forgetting factors on the basis of an approximate gradient scheme. We provide extensive simulation and real data analysis that demonstrate the effectiveness of the proposed method in handling diverse types of change, while simultaneously offering monitoring capabilities via interpretable behavior of the adaptive forgetting factors.
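To make the idea of exponential forgetting concrete, the following sketch tracks a recursively updated mean and variance for a one-dimensional stream with a fixed forgetting factor `lam`. It is a simplified illustration under stated assumptions: the class name `ForgettingMoments` is hypothetical, the factor is fixed rather than self-tuned, and the data are scalar, whereas the paper works with discriminant analysis classifiers and adaptive factors.

```python
class ForgettingMoments:
    """Exponentially weighted mean and variance, updated in O(1) per point."""

    def __init__(self, lam=0.95):
        self.lam = lam  # forgetting factor in (0, 1]; 1 means no forgetting
        self.w = 0.0    # effective sample size (sum of weights)
        self.s1 = 0.0   # weighted sum of observations
        self.s2 = 0.0   # weighted sum of squared observations

    def update(self, x):
        # Old contributions are multiplied by lam, so their weight decays
        # geometrically with age; the new point enters with weight 1.
        self.w = self.lam * self.w + 1.0
        self.s1 = self.lam * self.s1 + x
        self.s2 = self.lam * self.s2 + x * x

    @property
    def mean(self):
        return self.s1 / self.w

    @property
    def variance(self):
        # Weighted (biased) variance; clamp tiny negatives from rounding.
        return max(0.0, self.s2 / self.w - self.mean ** 2)
```

With `lam=1.0` the estimates reduce to the ordinary sample mean and population variance; smaller values make the estimates follow recent data more closely at the cost of higher variability.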
Pattern Recognition | 2011
Nicos G. Pavlidis; Dimitris K. Tasoulis; Niall M. Adams; David J. Hand
Streaming data introduce challenges mainly due to changing data distributions (population drift). To accommodate population drift we develop a novel linear adaptive online classification method motivated by ideas from adaptive filtering. Our approach allows the impact of past data on parameter estimates to be gradually removed, a process termed forgetting, yielding completely online adaptive algorithms. Extensive experimental results show that this approach adjusts the forgetting mechanism to maintain performance. Moreover, it might be possible to exploit the information in the evolution of the forgetting mechanism to obtain information about the type and speed of the underlying population drift process.
Pattern Recognition | 2010
Sotiris K. Tasoulis; Dimitris K. Tasoulis; Vassilis P. Plagianakos
While data clustering has a long history and a large amount of research has been devoted to the development of numerous clustering techniques, significant challenges still remain. One of the most important of them is associated with high data dimensionality. A particular class of clustering algorithms has been very successful in dealing with such datasets, utilising information driven by principal component analysis. In this work, we try to deepen our understanding of what can be achieved by this kind of approach. We attempt to theoretically discover the relationship between true clusters in the data and the distribution of their projection onto the principal components. Based on such findings, we propose appropriate criteria for the various steps involved in hierarchical divisive clustering and develop compilations of them into new algorithms. The proposed algorithms require minimal user-defined parameters and have the desirable feature of being able to provide approximations for the number of clusters present in the data. The experimental results indicate that the proposed techniques are effective in simulated as well as real data scenarios.
International Conference on Data Mining | 2006
Dimitris K. Tasoulis; Niall M. Adams; David J. Hand
Tools for automatically clustering streaming data are becoming increasingly important as data acquisition technology continues to advance. In this paper we present an extension of conventional kernel density clustering to a spatio-temporal setting, and also develop a novel algorithmic scheme for clustering data streams. Experimental results demonstrate both the high efficiency and other benefits of this new approach.
Intelligent Data Analysis | 2007
Dimitris K. Tasoulis; Gordon J. Ross; Niall M. Adams
The increasing availability of streaming data is a consequence of the continuing advancement of data acquisition technology. Such data provide new challenges to the various data analysis communities. Clustering has long been a fundamental procedure for acquiring knowledge from data, and new tools are emerging that allow the clustering of data streams. However, the dynamic, temporal components of streaming data provide extra challenges to the development of stream clustering and associated visualisation techniques. In this work we combine a streaming clustering framework with an extension of a static cluster visualisation method, in order to construct a surface that graphically represents the clustering structure of the data stream. The proposed method, OpticsStream, provides intuitive representations of the clustering structure as well as the manner in which this structure changes through time.
Journal of the Operational Research Society | 2012
Nicos G. Pavlidis; Dimitris K. Tasoulis; Niall M. Adams; David J. Hand
Credit scoring methods for predicting creditworthiness have proven very effective in consumer finance. In light of the present financial crisis, such methods will become even more important. One of the outstanding issues in credit risk classification is population drift. This term refers to changes occurring in the population due to unexpected changes in economic conditions and other factors. In this paper, we propose a novel methodology for the classification of credit applications that has the potential to adapt to population drift as it occurs. This provides the opportunity to update the credit risk classifier as new labelled data arrives. Assorted experimental results suggest that the proposed method has the potential to yield significant performance improvement over standard approaches, without sacrificing the classifier's descriptive capabilities.