Publication


Featured research published by Bogdan Gabrys.


Computers & Chemical Engineering | 2009

Data-driven Soft Sensors in the process industry

Petr Kadlec; Bogdan Gabrys; Sibylle Strandt

In the last two decades Soft Sensors have established themselves as a valuable alternative to traditional means for the acquisition of critical process variables, process monitoring and other tasks related to process control. This paper discusses characteristics of process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, such as the chemical, bioprocess and steel industries. The focus of this work is on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though not yet completely realised, potential. The main contributions of this work are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of open issues in Soft Sensor development and maintenance together with their possible solutions.
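As a concrete illustration of the data-driven approach the paper surveys, the sketch below trains a simple regression-based soft sensor on synthetic data; the variable names, the choice of ridge regression and the data itself are placeholders, not the paper's case studies.

```python
# Minimal soft-sensor sketch (illustrative, not code from the paper):
# a regression model is trained on easily measured process variables to
# infer a hard-to-measure quality variable. All data is synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_sensors = 500, 8

# "Easy" online measurements (temperatures, pressures, flows, ...)
X = rng.normal(size=(n_samples, n_sensors))
# "Hard" target (e.g. a lab-analysed concentration), assumed here to
# depend linearly on the online measurements plus noise.
true_coef = rng.normal(size=n_sensors)
y = X @ true_coef + 0.1 * rng.normal(size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

soft_sensor = Ridge(alpha=1.0).fit(X_train, y_train)
print("R^2 on held-out data:", soft_sensor.score(X_test, y_test))
```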


Information Fusion | 2005

Classifier selection for majority voting

Dymitr Ruta; Bogdan Gabrys

Individual classification models have recently been challenged by combined pattern recognition systems, which often show better performance. In such systems the optimal set of classifiers is first selected and then combined by a specific fusion method. For a small number of classifiers optimal ensembles can be found exhaustively, but the exponential complexity of such a search limits its practical applicability for larger systems. As a result, simpler search algorithms and/or selection criteria are needed to reduce the complexity. This work provides a revision of the classifier selection methodology and evaluates the practical applicability of diversity measures in the context of combining classifiers by majority voting. A number of search algorithms are proposed and adjusted to work properly with a number of selection criteria, including the majority voting error and various diversity measures. Extensive experiments carried out with 15 classifiers on 27 datasets indicate that diversity measures are inappropriate as selection criteria, favouring instead a search based directly on the combiner error. Furthermore, the results prompted a novel design of multiple classifier systems in which selection and fusion are recurrently applied to a population of the best combinations of classifiers rather than the individual best. The improvement in the generalisation performance of such a system is demonstrated experimentally.
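The following sketch, assuming simulated 0/1 classifier outputs rather than the paper's experimental setup, shows the kind of exhaustive subset search scored directly by majority-voting error that the abstract refers to; it is only tractable for a small number of classifiers, which is exactly the limitation the paper discusses.

```python
# Illustrative sketch (not the paper's code): exhaustive search over all
# odd-sized classifier subsets, scored directly by majority-voting error.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
n_classifiers, n_samples = 7, 200

y_true = rng.integers(0, 2, size=n_samples)
# Simulated 0/1 predictions of the individual classifiers (placeholder data,
# each classifier correct with probability 0.75).
preds = np.array([np.where(rng.random(n_samples) < 0.75, y_true, 1 - y_true)
                  for _ in range(n_classifiers)])

def majority_vote_error(subset):
    votes = preds[list(subset)].sum(axis=0)
    majority = (votes * 2 > len(subset)).astype(int)
    return np.mean(majority != y_true)

best = min((s for k in range(1, n_classifiers + 1, 2)
            for s in combinations(range(n_classifiers), k)),
           key=majority_vote_error)
print("best ensemble:", best, "error:", majority_vote_error(best))
```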


IEEE Transactions on Neural Networks | 2000

General fuzzy min-max neural network for clustering and classification

Bogdan Gabrys; Andrzej Bargiela

This paper describes a general fuzzy min-max (GFMM) neural network which is a generalisation and extension of the fuzzy min-max clustering and classification algorithms developed by Simpson. The GFMM method combines supervised and unsupervised learning within a single training algorithm. The fusion of clustering and classification resulted in an algorithm that can be used for pure clustering, pure classification, or hybrid clustering and classification. This hybrid system exhibits an interesting property of finding decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used as a representation of clusters and classes. Learning is usually completed in a few passes through the data and consists of placing and adjusting hyperboxes in the pattern space, which is referred to as an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
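A minimal sketch of hyperbox fuzzy-set membership in the spirit of GFMM is given below; the exact membership and threshold functions used by the authors may differ, so treat the formula, the parameter gamma and the example hyperbox as illustrative assumptions.

```python
# Simplified sketch of hyperbox fuzzy-set membership (not the authors'
# implementation). A hyperbox is a min point V and a max point W; membership
# decays linearly, at rate gamma, with the distance by which an input
# interval [a_l, a_u] falls outside the box.
import numpy as np

def ramp(x, gamma):
    # Threshold function: 0 inside the box, linear ramp up to 1 outside.
    return np.clip(x * gamma, 0.0, 1.0)

def hyperbox_membership(a_l, a_u, V, W, gamma=4.0):
    # a_l, a_u: lower/upper bounds of the (possibly interval-valued) input.
    per_dim = np.minimum(1.0 - ramp(a_u - W, gamma),
                         1.0 - ramp(V - a_l, gamma))
    return per_dim.min()

V = np.array([0.2, 0.3])            # hyperbox min point
W = np.array([0.5, 0.6])            # hyperbox max point
x = np.array([0.55, 0.4])           # crisp input: lower bound == upper bound
print(hyperbox_membership(x, x, V, W))  # slightly below 1: x is just outside
```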


Computers & Chemical Engineering | 2011

Review of adaptation mechanisms for data-driven soft sensors

Petr Kadlec; Ratko Grbić; Bogdan Gabrys

In this article, we review and discuss algorithms for adaptive data-driven soft sensing. In order to provide a comprehensive overview of the adaptation techniques, adaptive soft sensing methods are reviewed from the perspective of machine learning theory for adaptive learning systems. In particular, concept drift theory is exploited to classify the algorithms into three different types: (i) moving window techniques; (ii) recursive adaptation techniques; and (iii) ensemble-based methods. The most significant algorithms are described in some detail and critically reviewed in this work. We also provide a comprehensive list of publications where adaptive soft sensors were proposed and applied to practical problems. Furthermore, in order to enable the comparison of different methods on standard soft sensor applications, a list of publicly available data sets for the development of data-driven soft sensors is presented.
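To make the first adaptation type concrete, here is a minimal moving-window sketch on synthetic drifting data; the window size, retraining step and linear model are arbitrary choices, not recommendations from the review.

```python
# Moving-window adaptation sketch (illustrative only): the soft sensor is
# periodically retrained on the most recent samples so it can track slow
# concept drift. Data, window size and step are placeholder assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n, d, window, step = 2000, 5, 300, 50

X = rng.normal(size=(n, d))
# The true input-output relationship drifts slowly over time.
drift = np.linspace(0, 2, n)[:, None]
y = (X * (np.ones(d) + drift)).sum(axis=1) + 0.1 * rng.normal(size=n)

model = LinearRegression().fit(X[:window], y[:window])
errors = []
for t in range(window, n - step, step):
    # Predict the next block, then slide the window and retrain.
    pred = model.predict(X[t:t + step])
    errors.append(np.mean((pred - y[t:t + step]) ** 2))
    model.fit(X[t + step - window:t + step], y[t + step - window:t + step])

print("mean block MSE with moving-window retraining:", np.mean(errors))
```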


Applied Soft Computing | 2006

Genetic algorithms in classifier fusion

Bogdan Gabrys; Dymitr Ruta

Intense research on classifier fusion in recent years has revealed that combining performance strongly depends on careful selection of the classifiers to be combined. Classifier performance depends, in turn, on careful selection of features, which could be further restricted by subspaces of the data domain. On the other hand, a number of classifier fusion techniques are already available, and the choice of the most suitable method in turn depends on the selections made within the classifier, feature and data spaces. In all these multidimensional selection tasks genetic algorithms (GA) appear to be one of the most suitable techniques, providing a reasonable balance between search complexity and the performance of the solutions found. In this work, an attempt is made to revisit the capability of genetic algorithms to be applied to selection across many dimensions of the classifier fusion process, including data, features, classifiers and even classifier combiners. In the first of the discussed models, the potential for combined classification improvement by GA-selected weights for the soft combining of classifier outputs has been investigated. The second of the proposed models describes a more general system where a specifically designed GA is applied to selection carried out simultaneously along many dimensions of the classifier fusion process. Both the weighted soft combiners and the prototype of the three-dimensional fusion-classifier-feature selection model have been developed and tested using typical benchmark datasets, and some comparative experimental results are also presented.
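The toy sketch below illustrates the first of the two models, a genetic search over weights for soft combining of classifier outputs; the simulated classifier outputs, population size, crossover and mutation settings are all placeholder assumptions rather than the paper's configuration.

```python
# Toy genetic algorithm searching for soft-combining weights (illustrative,
# not the authors' implementation). Classifier outputs are simulated
# class-probability vectors; GA settings are arbitrary.
import numpy as np

rng = np.random.default_rng(3)
n_classifiers, n_samples, n_classes = 5, 300, 3

y_true = rng.integers(0, n_classes, size=n_samples)
logits = rng.normal(size=(n_classifiers, n_samples, n_classes))
# Bias each classifier towards the true class, with varying strength.
logits[:, np.arange(n_samples), y_true] += rng.uniform(0.5, 2.0, size=(n_classifiers, 1))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

def fitness(w):
    combined = np.tensordot(w, probs, axes=1)      # weighted soft combiner
    return np.mean(combined.argmax(axis=-1) == y_true)

pop = rng.random((30, n_classifiers))
for _ in range(40):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-10:]]        # keep the fittest weights
    idx_a, idx_b = rng.integers(0, 10, 30), rng.integers(0, 10, 30)
    mask = rng.random((30, n_classifiers)) < 0.5   # uniform crossover
    children = np.where(mask, parents[idx_a], parents[idx_b])
    pop = np.clip(children + 0.05 * rng.normal(size=(30, n_classifiers)),
                  0, None)                         # mutate, keep weights >= 0

best = pop[np.argmax([fitness(w) for w in pop])]
print("best weights:", np.round(best / best.sum(), 2), "accuracy:", fitness(best))
```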


Neurocomputing | 2010

Meta-learning for time series forecasting and forecast combination

Christiane Lemke; Bogdan Gabrys

In time series forecasting research, considerable uncertainty still surrounds the task of selecting an appropriate forecasting method for a problem. Not only are individual algorithms available in great quantities; combination approaches have been equally popular in recent decades. Even the question of whether to choose the most promising individual method or a combination is not straightforward to answer. Usually, expert knowledge is needed to make an informed decision; however, in many cases this is not feasible due to a lack of resources such as time, money and manpower. This work identifies an extensive feature set describing both the time series and the pool of individual forecasting methods. The applicability of different meta-learning approaches is investigated, first to gain knowledge on which model works best in which situation, and later to improve forecasting performance. Results show the superiority of a ranking-based combination of methods over simple model selection approaches.
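A minimal sketch of the ranking-based combination idea follows; the three candidate forecasters, the validation split and the inverse-rank weighting are illustrative assumptions, not the feature set or method pool used in the paper.

```python
# Ranking-based forecast combination sketch (placeholder methods and data):
# candidate forecasters are ranked by their error on a validation part of
# the series, and the final forecast is a weighted average with weights
# derived from those ranks.
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(120)
series = 10 + 0.1 * t + np.sin(2 * np.pi * t / 12) + 0.3 * rng.normal(size=t.size)
train, valid = series[:96], series[96:108]

def forecast_naive(history, h):      # repeat the last observation
    return np.full(h, history[-1])

def forecast_mean(history, h):       # overall mean
    return np.full(h, history.mean())

def forecast_drift(history, h):      # extrapolate the average increment
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope * np.arange(1, h + 1)

methods = [forecast_naive, forecast_mean, forecast_drift]
errors = [np.mean(np.abs(m(train, len(valid)) - valid)) for m in methods]
ranks = np.argsort(np.argsort(errors)) + 1           # rank 1 = best method
weights = (1.0 / ranks) / (1.0 / ranks).sum()        # inverse-rank weights

h = 12
combined = sum(w * m(series[:108], h) for w, m in zip(weights, methods))
print("weights:", np.round(weights, 2))
print("combined forecast:", np.round(combined, 2))
```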


International Journal of Approximate Reasoning | 2002

Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems

Bogdan Gabrys

An approach to dealing with missing data, both during the design and during normal operation of a neuro-fuzzy classifier, is presented in this paper. Missing values are processed within a general fuzzy min-max neural network architecture utilising hyperbox fuzzy sets as input data cluster prototypes. An emphasis is put on ways of quantifying the uncertainty which missing data might have caused. This takes the form of a classification procedure whose primary objective is the reduction of the number of viable alternatives rather than attempting to produce one winning class without supporting evidence. If required, ways of selecting the most probable class among the viable alternatives found during the primary classification step, based on utilising data frequency information, are also proposed. The reliability of the classification and the completeness of information are communicated by producing upper and lower classification membership values, similar in essence to plausibility and belief measures found in the theory of evidence, or possibility and necessity values found in fuzzy set theory. Similarities and differences between the proposed method and various fuzzy, neuro-fuzzy and probabilistic algorithms are also discussed. A number of simulation results for well-known data sets are provided in order to illustrate the properties and performance of the proposed approach.
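The sketch below illustrates one common way to encode the core idea, assuming data normalised to [0, 1]: a missing feature is given the widest possible interval so it never reduces hyperbox membership and every class consistent with the observed features stays viable. The specific [1, 0] encoding and the membership function are assumptions for illustration, not necessarily the paper's exact formulation.

```python
# Hedged sketch of missing-value handling with hyperbox memberships
# (illustrative assumptions, not the paper's code).
import numpy as np

def membership(a_l, a_u, V, W, gamma=4.0):
    ramp = lambda x: np.clip(x * gamma, 0.0, 1.0)
    return np.minimum(1.0 - ramp(a_u - W), 1.0 - ramp(V - a_l)).min()

V = np.array([0.2, 0.3]); W = np.array([0.5, 0.6])   # one hyperbox

x = np.array([0.55, np.nan])                         # second feature missing
a_l = np.where(np.isnan(x), 1.0, x)                  # encode the missing
a_u = np.where(np.isnan(x), 0.0, x)                  # feature so it never
                                                     # constrains the match
print(membership(a_l, a_u, V, W))                    # decided by feature 0 only
```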


International Journal of Approximate Reasoning | 2004

Combining labelled and unlabelled data in the design of pattern classification systems

Bogdan Gabrys; Lina Petrakieva

There has been much interest in applying techniques that incorporate knowledge from unlabelled data into a supervised learning system, but less effort has been made to compare the effectiveness of different approaches and to analyse the behaviour of the learning system when using different ratios of labelled to unlabelled data. In this paper various methods for learning from labelled and unlabelled data are first discussed and categorised into one of three major groups: pre-labelling, post-labelling and semi-supervised approaches. Their generalised formal description and extensive experimental analysis are then provided. The experimental results show that when supported by unlabelled samples, much less labelled data is generally required to build a classifier without compromising the classification performance. If only a very limited amount of labelled data is available, the results based on random selection of labelled samples show high variability, and the performance of the final classifier depends more on how reliable the labelled data samples are than on the use of additional unlabelled data. In response to this finding, three types of static (one-step) selection methods guided by clustering information, with various options for allocating a number of samples within clusters and their distributions, have been proposed and analysed. A significant improvement compared to random selection of the labelled samples has been observed when using these selective sampling techniques.
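Below is a sketch of one static, cluster-guided selection strategy in the spirit of the proposed methods; the clustering algorithm, the one-sample-per-cluster allocation and the 1-NN classifier are placeholder choices, not the paper's exact setup.

```python
# Cluster-guided selection of samples to label (illustrative sketch):
# instead of labelling a random subset, label the samples closest to the
# cluster centres so the small labelled set covers the data distribution.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=5)
labelling_budget = 15

km = KMeans(n_clusters=labelling_budget, n_init=10, random_state=5).fit(X)
# Index of the sample nearest to each cluster centre.
chosen = np.array([np.argmin(np.linalg.norm(X - c, axis=1))
                   for c in km.cluster_centers_])

clf = KNeighborsClassifier(n_neighbors=1).fit(X[chosen], y[chosen])
print("accuracy with", labelling_budget, "cluster-selected labels:",
      round(clf.score(X, y), 3))
```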


Pattern Analysis and Applications | 2002

A Theoretical Analysis of the Limits of Majority Voting Errors for Multiple Classifier Systems

Dymitr Ruta; Bogdan Gabrys

The robust character of combining diverse classifiers using majority voting has recently been illustrated in the pattern recognition literature. Furthermore, negatively correlated classifiers have turned out to offer further improvement of the majority voting performance, even compared to the idealised model with independent classifiers. However, negatively correlated classifiers represent a very unlikely situation in real-world classification problems, and their benefits usually remain out of reach. Nevertheless, it is theoretically possible to obtain a 0% majority voting error using a finite number of classifiers with individual error levels lower than 50%. We attempt to show that structuring classifiers into relevant multistage organisations can widen this boundary, as well as the limits of majority voting error, even further. Introducing discrete error distributions for the analysis, we show how majority voting errors and their limits depend upon the parameters of a multiple classifier system with hardened binary outputs (correct/incorrect). Moreover, we investigate the sensitivity of boundary distributions of classifier outputs to small discrepancies, modelled by random changes of votes, and propose new, more stable patterns of boundary distributions. Finally, we show how organising classifiers into different structures can be used to widen the limits of majority voting errors, and how this phenomenon can be effectively exploited.
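As a worked numerical illustration of the classical baseline this analysis starts from, the snippet below computes the majority-voting error of n independent classifiers with individual error p as a binomial tail probability; the multistage organisations and boundary distributions studied in the paper go beyond this independent case.

```python
# Majority-voting error for n independent classifiers, each with error p:
# the probability that more than half of them err (a binomial tail).
from math import comb

def majority_error(n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n // 2) + 1, n + 1))

for n in (1, 3, 5, 9, 15, 31):
    print(f"n={n:2d}  majority voting error = {majority_error(n, 0.3):.4f}")
```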


Artificial Intelligence Review | 2015

Metalearning: a survey of trends and technologies

Christiane Lemke; Marcin Budka; Bogdan Gabrys

Metalearning has attracted considerable interest in the machine learning community in recent years. Yet, some disagreement remains on what does and does not constitute a metalearning problem and in which contexts the term is used. This survey aims to give an all-encompassing overview of the research directions pursued under the umbrella of metalearning, reconciling different definitions given in the scientific literature, listing the choices involved when designing a metalearning system and identifying some of the future research challenges in this domain.

Collaboration


Dive into Bogdan Gabrys's collaborations.

Top Co-Authors

Petr Kadlec
Bournemouth University

Edward Apeh
Bournemouth University