Krzysztof Dembczyński

Publication

Featured research published by Krzysztof Dembczyński.

Machine Learning | 2012

On label dependence and loss minimization in multi-label classification

Krzysztof Dembczyński; Willem Waegeman; Weiwei Cheng; Eyke Hüllermeier

Most of the multi-label classification (MLC) methods proposed in recent years intended to exploit, in one way or the other, dependencies between the class labels. Compared to simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that binary relevance ignores such dependencies. Without questioning the correctness of such studies, one has to admit that a blanket explanation of that kind hides many subtle details, and indeed, the underlying mechanisms and true reasons for the improvements reported in experimental studies are rarely laid bare. Rather than proposing yet another MLC algorithm, the aim of this paper is to elaborate more closely on the idea of exploiting label dependence, thereby contributing to a better understanding of MLC. Adopting a statistical perspective, we claim that two types of label dependence should be distinguished, namely conditional and marginal dependence. Subsequently, we present three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier. In this regard, a close connection with loss minimization is established, showing that the benefit of exploiting label dependence also depends on the type of loss to be minimized. Concrete theoretical results are presented for two representative loss functions, namely the Hamming loss and the subset 0/1 loss. In addition, we give an overview of state-of-the-art decomposition algorithms for MLC and we try to reveal the reasons for their effectiveness. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data.
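The link between label dependence and the loss being minimized can be illustrated with a small sketch. It assumes a hypothetical conditional distribution over two binary labels for one fixed instance (the numbers and function names are illustrative, not taken from the paper) and compares the marginal-mode prediction, which minimizes expected Hamming loss, with the joint-mode prediction, which minimizes expected subset 0/1 loss.

```python
# Hypothetical conditional distribution P(y | x) over two binary labels
# for a single fixed instance x. Illustrative numbers only.
joint = {
    (0, 0): 0.4,
    (0, 1): 0.0,
    (1, 0): 0.3,
    (1, 1): 0.3,
}


def marginal_modes(joint):
    """Predict each label independently from its marginal probability."""
    m = len(next(iter(joint)))
    prediction = []
    for i in range(m):
        p_one = sum(p for y, p in joint.items() if y[i] == 1)
        prediction.append(1 if p_one > 0.5 else 0)
    return tuple(prediction)


def joint_mode(joint):
    """Predict the single most probable label vector."""
    return max(joint, key=joint.get)


def expected_hamming(pred, joint):
    """Expected fraction of wrongly predicted labels."""
    m = len(pred)
    return sum(
        p * sum(a != b for a, b in zip(pred, y)) / m
        for y, p in joint.items()
    )


def expected_subset01(pred, joint):
    """Expected subset 0/1 loss: probability the whole vector is wrong."""
    return 1.0 - joint.get(pred, 0.0)


for name, pred in [("marginal modes", marginal_modes(joint)),
                   ("joint mode", joint_mode(joint))]:
    print(name, pred,
          "Hamming:", round(expected_hamming(pred, joint), 3),
          "subset 0/1:", round(expected_subset01(pred, joint), 3))
```

In this toy distribution the two predictions differ: the marginal modes give (1, 0), while the joint mode is (0, 0). Each is better under its own loss, which is why a method that ignores conditional dependence can still be adequate for Hamming loss but not for subset 0/1 loss.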

Information Sciences | 2008

Stochastic dominance-based rough set model for ordinal classification

Wojciech Kotłowski; Krzysztof Dembczyński; Salvatore Greco; Roman Słowiński

In order to discover interesting patterns and dependencies in data, an approach based on rough set theory can be used. In particular, dominance-based rough set approach (DRSA) has been introduced to deal with the problem of ordinal classification with monotonicity constraints (also referred to as multicriteria classification in decision analysis). However, in real-life problems, in the presence of noise, the notions of rough approximations were found to be excessively restrictive. In this paper, we introduce a probabilistic model for ordinal classification problems with monotonicity constraints. Then, we generalize the notion of lower approximations to the stochastic case. We estimate the probabilities with the maximum likelihood method which leads to the isotonic regression problem for a two-class (binary) case. The approach is easily generalized to a multi-class case. Finally, we show the equivalence of the variable consistency rough sets to the specific empirical risk-minimizing decision rule in the statistical decision theory.
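The isotonic regression step mentioned above can be sketched for the simplest setting. The sketch assumes a single criterion, so that objects are totally ordered and the pool-adjacent-violators algorithm applies; the paper's setting uses the dominance relation, which is only a partial order and needs a more general solver. The function name `pava` and the toy labels are hypothetical.

```python
def pava(values, weights=None):
    """Nondecreasing least-squares fit via pool-adjacent-violators.

    `values` must already be sorted by the (single) criterion.
    For 0/1 labels, the fitted values are monotone estimates of
    P(y = 1 | x).
    """
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total_weight, number_of_points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Binary labels of six objects sorted by increasing criterion value.
labels = [0, 1, 0, 0, 1, 1]
probabilities = pava(labels)
print(probabilities)
```

Objects whose estimated probability is far from the majority threshold can then be treated as consistently assigned, while the pooled middle block reflects the noisy, inconsistent region.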

European Journal of Operational Research | 2009

Rough set approach to multiple criteria classification with imprecise evaluations and assignments

Krzysztof Dembczyński; Salvatore Greco; Roman Słowiński

Dominance-based Rough Set Approach (DRSA) has been introduced to deal with multiple criteria classification (also called multiple criteria sorting, or ordinal classification with monotonicity constraints), where assignments of objects may be inconsistent with respect to the dominance principle. In this paper, we consider an extension of DRSA to the context of imprecise evaluations of objects on condition criteria and imprecise assignments of objects to decision classes. The imprecisions are given in the form of intervals of possible values. In order to solve the problem, we reformulate the dominance principle and introduce second-order rough approximations. The presented methodology preserves well-known properties of rough approximations, such as rough inclusion, complementarity, identity of boundaries and precisiation. Moreover, the meaning of the precisiation property is extended to the considered case. The paper also presents a way to reduce decision tables and to induce decision rules from rough approximations.

Electronic Notes in Theoretical Computer Science | 2003

Generation of Exhaustive Set of Rules within Dominance-based Rough Set Approach

Krzysztof Dembczyński; Roman Pindur; Robert Susmaga

Rough set theory has proved to be a useful mathematical tool for the analysis of a vague description of objects. One of the extensions of the classic theory is the Dominance-based Rough Set Approach (DRSA), which allows analysing preference-ordered data. The analysis ends with a set of decision rules induced from rough approximations of decision classes. The role of the decision rules is to explain the analysed phenomena, but they may also be applied in classifying new, unseen objects. There are several strategies of decision rule induction. One of them consists in generating the exhaustive set of minimal rules. In this paper we present an algorithm based on Boolean reasoning techniques that follows this strategy within DRSA.

Machine Learning | 2012

Learning monotone nonlinear models using the Choquet integral

Ali Fallah Tehrani; Weiwei Cheng; Krzysztof Dembczyński; Eyke Hüllermeier

The learning of predictive models that guarantee monotonicity in the input variables has received increasing attention in machine learning in recent years. Generally, the difficulty of ensuring monotonicity increases with the flexibility, or nonlinearity, of a model. In this paper, we advocate the so-called Choquet integral as a tool for learning monotone nonlinear models. While being widely used as a flexible aggregation operator in different fields, such as multiple criteria decision making, the Choquet integral is so far much less known in machine learning. Apart from combining monotonicity and flexibility in a mathematically sound and elegant manner, the Choquet integral has additional features making it attractive from a machine learning point of view. Notably, it offers measures for quantifying the importance of individual predictor variables and the interaction between groups of variables. Analyzing the Choquet integral from a classification perspective, we provide upper and lower bounds on its VC-dimension. Moreover, as a methodological contribution, we propose a generalization of logistic regression. The basic idea of our approach, referred to as choquistic regression, is to replace the linear function of predictor variables, which is commonly used in logistic regression to model the log odds of the positive class, by the Choquet integral. First experimental results are quite promising and suggest that the combination of monotonicity and flexibility offered by the Choquet integral facilitates strong performance in practical applications.
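A minimal sketch of the discrete Choquet integral and of a logistic link on top of it may help make the idea concrete. The capacity values below are hypothetical, and the scaling and threshold parameters `gamma` and `beta` are assumptions made for illustration; the sketch does not fit anything to data.

```python
import math


def choquet_integral(x, mu):
    """Discrete Choquet integral of nonnegative values x.

    `mu` is a capacity: a dict mapping frozensets of criterion indices
    to values in [0, 1], monotone with respect to set inclusion.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    total, previous = 0.0, 0.0
    for position, i in enumerate(order):
        # Coalition of all criteria whose value is at least x[i].
        coalition = frozenset(order[position:])
        total += (x[i] - previous) * mu[coalition]
        previous = x[i]
    return total


def choquistic_probability(x, mu, gamma, beta):
    """Logistic link applied to the Choquet integral.

    gamma (scale) and beta (threshold) are illustrative parameters.
    """
    return 1.0 / (1.0 + math.exp(-gamma * (choquet_integral(x, mu) - beta)))


# Hypothetical capacity on two criteria with positive interaction:
# together they count for more than the sum of their individual weights.
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.4,
    frozenset({0, 1}): 1.0,
}

print(choquet_integral([0.5, 0.8], mu))
print(choquistic_probability([0.5, 0.8], mu, gamma=5.0, beta=0.5))
```

When the capacity is additive, the integral reduces to an ordinary weighted sum, so logistic regression appears as a special case; a non-additive capacity adds interaction between criteria while keeping the model monotone in each input.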

Electronic Notes in Theoretical Computer Science | 2003

Dominance-based Rough Set Classifier without Induction of Decision Rules

Krzysztof Dembczyński; Roman Pindur; Robert Susmaga

Rough Sets Theory is often applied to the task of classification and prediction, in which objects are assigned to some pre-defined decision classes. When the classes are preference-ordered, the process of classification is referred to as sorting. To deal with the specificity of sorting problems an extension of the Classic Rough Sets Approach, called the Dominance-based Rough Sets Approach, was introduced. The final result of the analysis is a set of decision rules induced from what is called rough approximations of decision classes. The main role of the induced decision rules is to discover regularities in the analyzed data set, but the same rules, when combined with a particular classification method, may also be used to classify/sort new objects (i.e. to assign the objects to appropriate classes). There exist many different rule induction strategies, including induction of an exhaustive set of rules. This strategy produces the most comprehensive knowledge base on the analyzed data set, but it requires a considerable amount of computing time, as the complexity of the process is exponential. In this paper we present a shortcut that allows classifying new objects without generating the rules. The presented approach bears some resemblance to the idea of lazy learning.

Data Mining and Knowledge Discovery | 2010

ENDER: a statistical framework for boosting decision rules

Krzysztof Dembczyński; Wojciech Kotłowski; Roman Słowiński

Induction of decision rules plays an important role in machine learning. The main advantage of decision rules is their simplicity and human-interpretable form. Moreover, they are capable of modeling complex interactions between attributes. In this paper, we thoroughly analyze a learning algorithm, called ENDER, which constructs an ensemble of decision rules. This algorithm is tailored for regression and binary classification problems. It uses the boosting approach for learning, which can be treated as generalization of sequential covering. Each new rule is fitted by focusing on examples which were the hardest to classify correctly by the rules already present in the ensemble. We consider different loss functions and minimization techniques often encountered in the boosting framework. The minimization techniques are used to derive impurity measures which control construction of single decision rules. Properties of four different impurity measures are analyzed with respect to the trade-off between misclassification (discrimination) and coverage (completeness) of the rule. Moreover, we consider regularization consisting of shrinking and sampling. Finally, we compare the ENDER algorithm with other well-known decision rule learners such as SLIPPER, LRI and RuleFit.

European Conference on Principles of Data Mining and Knowledge Discovery | 2007

Statistical model for rough set approach to multicriteria classification

Krzysztof Dembczyński; Salvatore Greco; Wojciech Kotłowski; Roman Słowiński

In order to discover interesting patterns and dependencies in data, an approach based on rough set theory can be used. In particular, Dominance-based Rough Set Approach (DRSA) has been introduced to deal with the problem of multicriteria classification. However, in real-life problems, in the presence of noise, the notions of rough approximations were found to be excessively restrictive, which led to the proposal of the Variable Consistency variant of DRSA. In this paper, we introduce a new approach to variable consistency that is based on maximum likelihood estimation. For two-class (binary) problems, it leads to the isotonic regression problem. The approach is easily generalized for the multi-class case. Finally, we show the equivalence of the variable consistency rough sets to the specific risk-minimizing decision rule in statistical decision theory.

European Conference on Artificial Intelligence | 2012

An analysis of chaining in multi-label classification

Krzysztof Dembczyński; Willem Waegeman; Eyke Hüllermeier

The idea of classifier chains has recently been introduced as a promising technique for multi-label classification. However, despite being intuitively appealing and showing strong performance in empirical studies, still very little is known about the main principles underlying this type of method. In this paper, we provide a detailed probabilistic analysis of classifier chains from a risk minimization perspective, thereby helping to gain a better understanding of this approach. As a main result, we clarify that the original chaining method seeks to approximate the joint mode of the conditional distribution of label vectors in a greedy manner. As a result of a theoretical regret analysis, we conclude that this approach can perform quite poorly in terms of subset 0/1 loss. Therefore, we present an enhanced inference procedure for which the worst-case regret can be upper-bounded far more tightly. In addition, we show that a probabilistic variant of chaining, which can be utilized for any loss function, becomes tractable by using Monte Carlo sampling. Finally, we present experimental results confirming the validity of our theoretical findings.
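The gap between greedy chaining and the true joint mode can be shown on a toy example. The sketch assumes the chain's conditional probabilities are known exactly and derives them from a hypothetical joint table (in practice they would come from trained probabilistic classifiers); the function names are illustrative.

```python
from itertools import product

# Hypothetical conditional distribution P(y | x) over two binary labels
# for one fixed instance x. Illustrative numbers only.
joint = {
    (0, 0): 0.40,
    (0, 1): 0.00,
    (1, 0): 0.25,
    (1, 1): 0.35,
}


def conditional(joint, prefix, value):
    """P(y_k = value | y_1..y_{k-1} = prefix), computed from the table."""
    k = len(prefix)
    denom = sum(p for y, p in joint.items() if y[:k] == prefix)
    numer = sum(p for y, p in joint.items()
                if y[:k] == prefix and y[k] == value)
    return numer / denom if denom > 0 else 0.0


def greedy_chain(joint, m):
    """Fix each label to its most probable value given earlier choices."""
    prefix = ()
    for _ in range(m):
        p_one = conditional(joint, prefix, 1)
        prefix += (1 if p_one > 0.5 else 0,)
    return prefix


def exhaustive_chain(joint, m):
    """Score every label vector with the product rule and take the best."""
    def chain_probability(y):
        prob = 1.0
        for k in range(m):
            prob *= conditional(joint, y[:k], y[k])
        return prob
    return max(product((0, 1), repeat=m), key=chain_probability)


print("greedy:", greedy_chain(joint, 2))
print("exhaustive:", exhaustive_chain(joint, 2))
```

Here the greedy pass commits to the first label being 1 because its marginal probability is 0.6, and ends at (1, 1), while the most probable vector is (0, 0). Exhaustive search costs exponentially many evaluations in the number of labels, which is the motivation for cheaper approximate inference such as sampling.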

International Conference on Machine Learning | 2008

Maximum likelihood rule ensembles

Krzysztof Dembczyński; Wojciech Kotłowski; Roman Słowiński

We propose a new rule induction algorithm for solving classification problems via probability estimation. The main advantage of decision rules is their simplicity and good interpretability. While the early approaches to rule induction were based on sequential covering, we follow an approach in which a single decision rule is treated as a base classifier in an ensemble. The ensemble is built by greedily minimizing the negative log-likelihood, which results in estimating the class conditional probability distribution. The introduced approach is compared with other decision rule induction algorithms such as SLIPPER, LRI and RuleFit.
