Jerzy Stefanowski | Researchain

Publication

Featured research published by Jerzy Stefanowski.

computational intelligence | 2001

Incomplete Information Tables and Rough Classification

Jerzy Stefanowski; Alexis Tsoukiàs

The rough set theory, based on the original definition of the indiscernibility relation, is not useful for analysing incomplete information tables where some values of attributes are unknown. In this paper we distinguish two different semantics for incomplete information: the “missing value” semantics and the “absent value” semantics. The already known approaches, e.g. based on the tolerance relations, deal with the missing value case. We introduce two generalisations of the rough sets theory to handle these situations. The first generalisation introduces the use of a non symmetric similarity relation in order to formalise the idea of absent value semantics. The second proposal is based on the use of valued tolerance relations. A logical analysis and the computational experiments show that for the valued tolerance approach it is possible to obtain more informative approximations and decision rules than using the approach based on the simple tolerance relation.
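The simple tolerance relation that the abstract uses as its baseline can be sketched in a few lines. This is only an illustration under stated assumptions: the `*` marker for unknown values, the toy table, and the function names are all hypothetical and do not come from the paper. Two objects are treated as tolerant when they agree on every attribute where both values are known.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """True if x and y agree on every attribute where both values are known."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def approximations(table, target):
    """Return (lower, upper) approximations of `target`, a set of object ids."""
    lower, upper = set(), set()
    for i, x in table.items():
        # Tolerance class of x: all objects that cannot be told apart from x.
        tol_class = {j for j, y in table.items() if tolerant(x, y)}
        if tol_class <= target:
            lower.add(i)   # every object tolerant with x belongs to the target
        if tol_class & target:
            upper.add(i)   # at least one object tolerant with x belongs to it
    return lower, upper


# Hypothetical incomplete information table with two attributes.
table = {
    1: ("high", "yes"),
    2: ("high", MISSING),
    3: ("low", "no"),
    4: (MISSING, "no"),
}
lower, upper = approximations(table, target={1, 2})
```

On this toy table only object 1 lands in the lower approximation, because the unknown values make objects 2 and 4 tolerant with each other. Coarse approximations of this kind are what the paper's generalisations aim to improve on.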

soft computing | 1999

On the Extension of Rough Sets under Incomplete Information

Jerzy Stefanowski; Alexis Tsoukiàs

The rough set theory, based on the conventional indiscernibility relation, is not useful for analysing incomplete information. We introduce two generalizations of this theory. The first proposal is based on non symmetric similarity relations, while the second one uses valued tolerance relation. Both approaches provide more informative results than the previously known approach employing simple tolerance relation.
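A minimal sketch of a non-symmetric similarity relation follows. It assumes one common reading of the idea: an object x is similar to y when every value that is present in x also appears in y. The `*` marker and the function name are hypothetical, and the valued tolerance relation is not shown.

```python
ABSENT = "*"  # assumed marker for a value that does not exist for the object


def similar(x, y):
    """True if x is similar to y: each value present in x is matched by y."""
    return all(a == ABSENT or a == b for a, b in zip(x, y))


partial = ("high", ABSENT)   # described by one attribute only
full = ("high", "yes")       # described by both attributes

forward = similar(partial, full)    # the partial description fits the full one
backward = similar(full, partial)   # the full description is not covered
```

The relation is deliberately one-directional: the partially described object is similar to the fully described one, but not the other way round.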

intelligent information systems | 2004

Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition

Stanisław Osiński; Jerzy Stefanowski; Dawid Weiss

The search results clustering problem is defined as the automatic, on-line grouping of similar documents in a search results list returned from a search engine. In this paper we present Lingo, a novel algorithm for clustering search results that emphasizes cluster description quality. We describe the methods used in the algorithm: algebraic transformations of the term-document matrix and frequent phrase extraction using suffix arrays. Finally, we discuss results acquired from an empirical evaluation of the algorithm.
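The frequent phrase extraction step can be illustrated with a small sketch. It is not Lingo's implementation: the suffix array is built naively by sorting, the function name and parameters are hypothetical, and the stop-word handling and completeness checks a real system would need are left out. The idea it shows is that, once suffixes are sorted, all occurrences of the same phrase sit next to each other and can be counted in one pass.

```python
from itertools import groupby


def frequent_phrases(docs, min_count=2, max_len=3):
    """Count word phrases (up to max_len words) that occur at least min_count times."""
    tokens = []
    for d, text in enumerate(docs):
        # A unique separator per document keeps phrases from spanning documents.
        tokens += text.lower().split() + [f"$<{d}>"]

    # Naive suffix array: start positions sorted by the suffix they begin.
    suffix_array = sorted(range(len(tokens)), key=lambda i: tokens[i:])

    result = {}
    for n in range(1, max_len + 1):
        # Suffixes sharing the same first n tokens are adjacent in sorted order.
        for phrase, group in groupby(suffix_array, key=lambda i: tuple(tokens[i:i + n])):
            count = sum(1 for _ in group)
            if len(phrase) == n and count >= min_count:
                result[" ".join(phrase)] = count
    return result


docs = ["rough set theory", "rough set approach", "data stream mining"]
phrases = frequent_phrases(docs)
```

Here `phrases` contains "rough", "set" and "rough set", each with a count of 2. In the approach described by the abstract, such phrases are candidates for cluster descriptions.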

IEEE Transactions on Neural Networks | 2014

Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm

Dariusz Brzezinski; Jerzy Stefanowski

Data stream mining has been receiving increased attention due to its presence in a wide range of applications, such as sensor networks, banking, and telecommunication. One of the most important challenges in learning from data streams is reacting to concept drift, i.e., unforeseen changes of the stream's underlying data distribution. Several classification algorithms that cope with concept drift have been put forward; however, most of them specialize in one type of change. In this paper, we propose a new data stream classifier, called the Accuracy Updated Ensemble (AUE2), which aims at reacting equally well to different types of drift. AUE2 combines accuracy-based weighting mechanisms known from block-based ensembles with the incremental nature of Hoeffding Trees. The proposed algorithm is experimentally compared with 11 state-of-the-art stream methods, including single classifiers, block-based and online ensembles, and hybrid approaches in different drift scenarios. Out of all the compared algorithms, AUE2 provided the best average classification accuracy while proving to be less memory consuming than other ensemble approaches. Experimental results show that AUE2 can be considered suitable for scenarios involving many types of drift as well as static environments.
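The abstract does not state the weighting formula, so the sketch below assumes a common form of accuracy-based weighting for block-based ensembles. A member's weight on the newest data block is the inverse of its mean squared error plus the error of a classifier that guesses according to the class distribution. Function names, the `eps` constant, and the toy predictions are hypothetical.

```python
from collections import Counter


def mse_random(labels):
    """Error of guessing according to the class distribution of the block."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) ** 2 for c in Counter(labels).values())


def mse_member(probas, labels):
    """Mean squared error of one member; probas[k] maps class -> probability."""
    return sum((1 - p.get(y, 0.0)) ** 2 for p, y in zip(probas, labels)) / len(labels)


def weight(probas, labels, eps=1e-9):
    """Assumed weighting: inverse of (baseline error + member error + eps)."""
    return 1.0 / (mse_random(labels) + mse_member(probas, labels) + eps)


# Toy block of four labelled examples and two hypothetical ensemble members.
labels = ["a", "a", "b", "b"]
good = [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.2},
        {"a": 0.2, "b": 0.8}, {"a": 0.1, "b": 0.9}]
poor = [{"a": 0.5, "b": 0.5}] * 4

w_good = weight(good, labels)
w_poor = weight(poor, labels)
```

The member that predicts the newest block well receives the larger weight. This lets an ensemble shift influence toward classifiers that match the current concept.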

Lecture Notes in Computer Science | 1998

ROSE - Software Implementation of the Rough Set Theory

Bartłomiej Prędki; Roman Słowiński; Jerzy Stefanowski; Robert Susmaga; Szymon Wilk

This paper briefly describes the ROSE software package. It is an interactive, modular system designed for analysis and knowledge discovery based on rough set theory in 32-bit operating systems on PC computers. It implements classical rough set theory as well as its extension based on the variable precision model. It includes generation of decision rules for classification systems and knowledge discovery.

multiple criteria decision making | 1989

Rough classification in incomplete information systems

Roman Słowiński; Jerzy Stefanowski

The paper is concerned with the problems of rough sets theory and rough classification of objects. It is a new approach to problems from the field of decision-making, data analysis, knowledge representation, expert systems etc. Several applications (particularly in medical diagnosis and engineering control) confirm the usefulness of the rough sets idea. Rough classification concerns objects described by multiple attributes in a so-called information system. Traditionally, the information system is assumed to be complete, i.e. the descriptors are not missing and are supposed to be precise. In this paper we investigate the case of incomplete information systems, and present a generalization of the rough sets approach which deals with missing and imprecise descriptors.

Lecture Notes in Computer Science | 2000

Variable Consistency Model of Dominance-Based Rough Sets Approach

Salvatore Greco; Benedetto Matarazzo; Roman Słowiński; Jerzy Stefanowski

Consideration of preference-orders requires the use of an extended rough set model called Dominance-based Rough Set Approach (DRSA). The rough approximations defined within DRSA are based on consistency in the sense of dominance principle. It requires that objects having not-worse evaluation with respect to a set of considered criteria than a referent object cannot be assigned to a worse class than the referent object. However, some inconsistencies may decrease the cardinality of lower approximations to such an extent that it is impossible to discover strong patterns in the data, particularly when data sets are large. Thus, a relaxation of the strict dominance principle is worthwhile. The relaxation introduced in this paper to the DRSA model admits some inconsistent objects to the lower approximations; the range of this relaxation is controlled by an index called consistency level. The resulting model is called variable-consistency model (VC-DRSA). We concentrate on the new definitions of rough approximations and their properties, and we propose a new syntax of decision rules characterized by a confidence degree not less than the consistency level. The use of VC-DRSA is illustrated by an example of customer satisfaction analysis referring to an airline company.
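The relaxed lower approximation can be sketched as follows. The sketch assumes that all criteria are of the gain type (higher is better), that classes are ordered integers, and that an object enters the lower approximation of "class at least t" when the share of its dominating objects that belong to that union reaches the consistency level. All names and the toy data are hypothetical.

```python
def dominates(y, x):
    """True if y is at least as good as x on every criterion (gain type assumed)."""
    return all(b >= a for a, b in zip(x, y))


def vc_lower_upward(objects, t, level):
    """Lower approximation of the union 'class >= t' at a given consistency level."""
    union = {i for i, (_, cls) in objects.items() if cls >= t}
    lower = set()
    for i in union:
        # Dominating cone: all objects evaluated at least as well as object i.
        cone = {j for j, (crit, _) in objects.items() if dominates(crit, objects[i][0])}
        if len(cone & union) / len(cone) >= level:
            lower.add(i)
    return lower


# Hypothetical data: id -> ((criterion 1, criterion 2), class).
objects = {
    "a": ((3, 3), 2),
    "b": ((2, 2), 2),
    "c": ((3, 2), 1),  # dominates b but has a worse class: inconsistent with b
    "d": ((1, 1), 1),
    "e": ((2, 3), 2),
}

strict = vc_lower_upward(objects, t=2, level=1.0)
relaxed = vc_lower_upward(objects, t=2, level=0.75)
```

With the strict level of 1.0, object b is excluded because c dominates it while belonging to a worse class. Lowering the level to 0.75 readmits b, which is the kind of relaxation the abstract describes.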

Sigkdd Explorations | 2014

Open challenges for data stream mining research

Georg Krempl; Indre Žliobaite; Dariusz Brzezinski; Eyke Hüllermeier; Vincent Lemaire; Tino Noack; Ammar Shaker; Sonja Sievi; Myra Spiliopoulou; Jerzy Stefanowski

Every day, huge volumes of sensory, transactional, and web data are continuously generated as streams, which need to be analyzed online as they arrive. Streaming data can be considered as one of the main sources of what is called big data. While predictive modeling for data streams and big data has received a lot of attention over the last decade, many research approaches are typically designed for well-behaved controlled problem settings, overlooking important challenges imposed by real-world applications. This article presents a discussion on eight open challenges for data stream mining. Our goal is to identify gaps between current research and meaningful applications, highlight open problems, and define new application-relevant research directions for data stream mining. The identified challenges cover the full cycle of knowledge discovery and involve such problems as: protecting data privacy, dealing with legacy systems, handling incomplete and delayed information, analysis of complex data, and evaluation of stream mining algorithms. The resulting analysis is illustrated by practical applications and provides general suggestions concerning lines of future research in data stream mining.

Lecture Notes in Computer Science | 2000

An Algorithm for Induction of Decision Rules Consistent with the Dominance Principle

Salvatore Greco; Benedetto Matarazzo; Roman Słowiński; Jerzy Stefanowski

Induction of decision rules within the dominance-based rough set approach to the multiple-criteria sorting decision problem is discussed in this paper. We introduce an algorithm called DOMLEM that induces a minimal set of generalized decision rules consistent with the dominance principle. An extension of this algorithm for a variable consistency model of the dominance-based rough set approach is also presented.

RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing | 2010

Learning from imbalanced data in presence of noisy and borderline examples

Krystyna Napierala; Jerzy Stefanowski; Szymon Wilk

In this paper we studied re-sampling methods for learning classifiers from imbalanced data. We carried out a series of experiments on artificial data sets to explore the impact of noisy and borderline examples from the minority class on the classifier performance. Results showed that if data was sufficiently disturbed by these factors, then the focused re-sampling methods (NCR and our SPIDER2) strongly outperformed the oversampling methods. They were also better for real-life data, where PCA visualizations suggested the possible existence of noisy examples and large overlapping areas between classes.
