Eyke Hüllermeier | Researchain

Publication

Featured research published by Eyke Hüllermeier.

Machine Learning | 2008

Multilabel classification via calibrated label ranking

Johannes Fürnkranz; Eyke Hüllermeier; Eneldo Loza Mencía; Klaus Brinker

Label ranking studies the problem of learning a mapping from instances to rankings over a predefined set of labels. Hitherto existing approaches to label ranking implicitly operate on an underlying (utility) scale which is not calibrated in the sense that it lacks a natural zero point. We propose a suitable extension of label ranking that incorporates the calibrated scenario and substantially extends the expressive power of these approaches. In particular, our extension suggests a conceptually novel technique for extending the common learning by pairwise comparison approach to the multilabel scenario, a setting that was previously not amenable to the pairwise decomposition technique. The key idea of the approach is to introduce an artificial calibration label that, in each example, separates the relevant from the irrelevant labels. We show that this technique can be viewed as a combination of pairwise preference learning and the conventional relevance classification technique, where a separate classifier is trained to predict whether a label is relevant or not. Empirical results in the area of text categorization, image classification and gene analysis underscore the merits of the calibrated model in comparison to state-of-the-art multilabel learning methods.
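The prediction step described in this abstract can be illustrated with a minimal Python sketch. It assumes a trained pairwise model is already available as a function; the names `CALIBRATION`, `calibrated_ranking`, and `prefers`, along with the toy scores, are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

CALIBRATION = "__calibration__"  # hypothetical name for the artificial label


def calibrated_ranking(labels, prefers):
    """Rank the labels plus the calibration label by pairwise votes.

    prefers(a, b) stands in for a trained pairwise model and returns True
    if a is preferred over b for the instance at hand (assumption).
    Ties in the vote count are broken by input order.
    """
    candidates = list(labels) + [CALIBRATION]
    votes = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        votes[a if prefers(a, b) else b] += 1
    ranking = sorted(candidates, key=lambda c: -votes[c])
    cut = ranking.index(CALIBRATION)
    # Labels ranked above the calibration label are predicted as relevant.
    return ranking, ranking[:cut]


# Toy stand-in for a pairwise model: compare made-up utility scores.
toy_scores = {"sports": 0.9, "politics": 0.7, "art": 0.2,
              "science": 0.1, CALIBRATION: 0.5}
ranking, relevant = calibrated_ranking(
    ["sports", "politics", "art", "science"],
    lambda a, b: toy_scores[a] > toy_scores[b],
)
print(ranking)
print(relevant)
```

The calibration label acts as the zero point of the ranking: the position where it lands splits the labels into a relevant and an irrelevant part, so no separate threshold has to be chosen.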

Artificial Intelligence | 2008

Label ranking by learning pairwise preferences

Eyke Hüllermeier; Johannes Fürnkranz; Weiwei Cheng; Klaus Brinker

Preference learning is an emerging topic that appears in different guises in the recent literature. This work focuses on a particular learning scenario called label ranking, where the problem is to learn a mapping from instances to rankings over a finite number of labels. Our approach for learning such a mapping, called ranking by pairwise comparison (RPC), first induces a binary preference relation from suitable training data using a natural extension of pairwise classification. A ranking is then derived from the preference relation thus obtained by means of a ranking procedure, whereby different ranking methods can be used for minimizing different loss functions. In particular, we show that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation. We compare RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels. Both empirically and theoretically, it is shown that RPC is superior in terms of computational efficiency, and at least competitive in terms of accuracy.
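The weighted voting step mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: `rank_by_weighted_voting`, `pref_prob`, and the toy probabilities are hypothetical, and the pairwise model is assumed to output a value in [0, 1] for "a is preferred over b".

```python
def rank_by_weighted_voting(labels, pref_prob):
    """Order labels by the sum of their pairwise preference degrees.

    pref_prob(a, b) is assumed to be the output of a trained pairwise
    model, with pref_prob(b, a) == 1 - pref_prob(a, b).
    """
    scores = {
        a: sum(pref_prob(a, b) for b in labels if b != a)
        for a in labels
    }
    return sorted(labels, key=lambda a: -scores[a])


# Made-up pairwise outputs for three labels.
toy = {("A", "B"): 0.8, ("A", "C"): 0.6, ("B", "C"): 0.3}


def toy_prob(a, b):
    # Look up the stored direction, or take the complement of the reverse.
    return toy[(a, b)] if (a, b) in toy else 1.0 - toy[(b, a)]


rpc_ranking = rank_by_weighted_voting(["A", "B", "C"], toy_prob)
print(rpc_ranking)
```

Each label only needs its summed votes, so the ranking step is a single sort once the pairwise outputs are available.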

European Conference on Machine Learning | 2009

Combining instance-based learning and logistic regression for multilabel classification

Weiwei Cheng; Eyke Hüllermeier

Multilabel classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Recent research has shown that, just like for conventional classification, instance-based learning algorithms relying on the nearest neighbor estimation principle can be used quite successfully in this context. However, since hitherto existing algorithms do not take correlations and interdependencies between labels into account, their potential has not yet been fully exploited. In this paper, we propose a new approach to multilabel classification, which is based on a framework that unifies instance-based learning and logistic regression, comprising both methods as special cases. This approach allows one to capture interdependencies between labels and, moreover, to combine model-based and similarity-based inference for multilabel classification. As will be shown by experimental studies, our approach is able to improve predictive accuracy in terms of several evaluation criteria for multilabel prediction.

Data and Knowledge Engineering | 2006

Online clustering of parallel data streams

Jürgen Beringer; Eyke Hüllermeier

In recent years, the management and processing of so-called data streams has become a topic of active research in several fields of computer science, such as distributed systems, database systems, and data mining. A data stream can roughly be thought of as a transient, continuously increasing sequence of time-stamped data. In this paper, we consider the problem of clustering parallel streams of real-valued data, that is to say, continuously evolving time series. In other words, we are interested in grouping data streams whose evolution over time is similar in a specific sense. In order to maintain an up-to-date clustering structure, it is necessary to analyze the incoming data in an online manner, tolerating not more than a constant time delay. For this purpose, we develop an efficient online version of the classical K-means clustering algorithm. Our method's efficiency is mainly due to a scalable online transformation of the original data which allows for a fast computation of approximate distances between streams.

Data Mining and Knowledge Discovery | 2009

FURIA: an algorithm for unordered fuzzy rule induction

Jens Christian Hühn; Eyke Hüllermeier

This paper introduces a novel fuzzy rule-based classification method called FURIA, which is short for Fuzzy Unordered Rule Induction Algorithm. FURIA extends the well-known RIPPER algorithm, a state-of-the-art rule learner, while preserving its advantages, such as simple and comprehensible rule sets. In addition, it includes a number of modifications and extensions. In particular, FURIA learns fuzzy rules instead of conventional rules and unordered rule sets instead of rule lists. Moreover, to deal with uncovered examples, it makes use of an efficient rule stretching method. Experimental results show that FURIA significantly outperforms the original RIPPER, as well as other classifiers such as C4.5, in terms of classification accuracy.

Data Mining and Knowledge Discovery | 2006

A systematic approach to the assessment of fuzzy association rules

Didier Dubois; Eyke Hüllermeier; Henri Prade

In order to allow for the analysis of data sets including numerical attributes, several generalizations of association rule mining based on fuzzy sets have been proposed in the literature. While the formal specification of fuzzy associations is more or less straightforward, the assessment of such rules by means of appropriate quality measures is less obvious. Particularly, it assumes an understanding of the semantic meaning of a fuzzy rule. This aspect has been ignored by most existing proposals, which must therefore be considered as ad-hoc to some extent. In this paper, we develop a systematic approach to the assessment of fuzzy association rules. To this end, we proceed from the idea of partitioning the data stored in a database into examples of a given rule, counterexamples, and irrelevant data. Evaluation measures are then derived from the cardinalities of the corresponding subsets. The problem of finding a proper partition has a rather obvious solution for standard association rules but becomes less trivial in the fuzzy case. Our results not only provide a sound justification for commonly used measures but also suggest a means for constructing meaningful alternatives.
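For standard (non-fuzzy) association rules, the partition into examples, counterexamples, and irrelevant data can be sketched in a few lines of Python. The function name `assess_rule` and the toy transactions are assumptions made for illustration, and the fuzzy case treated in the paper is deliberately not covered here.

```python
def assess_rule(transactions, antecedent, consequent):
    """Partition transactions for the crisp rule antecedent -> consequent.

    Each transaction, the antecedent, and the consequent are sets of items.
    Assumes a non-empty list of transactions.
    """
    examples = counterexamples = irrelevant = 0
    for t in transactions:
        if not antecedent <= t:
            irrelevant += 1          # rule does not apply
        elif consequent <= t:
            examples += 1            # antecedent and consequent both hold
        else:
            counterexamples += 1     # antecedent holds, consequent fails
    covered = examples + counterexamples
    return {
        "examples": examples,
        "counterexamples": counterexamples,
        "irrelevant": irrelevant,
        # Support and confidence follow from the three cardinalities.
        "support": examples / len(transactions),
        "confidence": examples / covered if covered else 0.0,
    }


toy_transactions = [
    {"bread", "butter"},
    {"bread"},
    {"milk"},
    {"bread", "butter", "milk"},
]
result = assess_rule(toy_transactions, {"bread"}, {"butter"})
print(result)
```

In the crisp case every transaction falls into exactly one of the three subsets, which is why the partition is unambiguous; with fuzzy membership degrees that assignment is no longer obvious.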

European Conference on Machine Learning | 2003

Pairwise preference learning and ranking

Johannes Fürnkranz; Eyke Hüllermeier

We consider supervised learning of a ranking function, which is a mapping from instances to total orders over a set of labels (options). The training information consists of examples with partial (and possibly inconsistent) information about their associated rankings. From these, we induce a ranking function by reducing the original problem to a number of binary classification problems, one for each pair of labels. The main objective of this work is to investigate the trade-off between the quality of the induced ranking function and the computational complexity of the algorithm, both depending on the amount of preference information given for each example. To this end, we present theoretical results on the complexity of pairwise preference learning, and experimentally investigate the predictive performance of our method for different types of preference information, such as top-ranked labels and complete rankings. The domain of this study is the prediction of a rational agent's ranking of actions in an uncertain environment.
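The reduction to one binary problem per label pair can be illustrated with a short Python sketch of the training-data decomposition. The function `decompose`, the encoding of preferences as `(preferred, other)` tuples, and the toy examples are assumptions chosen for illustration.

```python
from itertools import combinations


def decompose(examples, labels):
    """Split preference data into one binary training set per label pair.

    examples: list of (x, prefs), where prefs is a set of (a, b) tuples
    meaning "a is preferred over b" for instance x. Preferences may be
    partial, so an example contributes only to the pairs it mentions.
    """
    datasets = {pair: [] for pair in combinations(labels, 2)}
    for x, prefs in examples:
        for a, b in prefs:
            if (a, b) in datasets:
                datasets[(a, b)].append((x, 1))   # first label of pair wins
            elif (b, a) in datasets:
                datasets[(b, a)].append((x, 0))   # second label of pair wins
    return datasets


toy_examples = [
    ([1.0], {("a", "b"), ("c", "b")}),
    ([2.0], {("b", "a")}),
]
pairwise_data = decompose(toy_examples, ["a", "b", "c"])
for pair, data in pairwise_data.items():
    print(pair, data)
```

With n labels this yields n(n-1)/2 binary problems, and the amount of training data for each grows with the amount of preference information given per example, which is the trade-off the abstract refers to.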

Fuzzy Sets and Systems | 2005

Fuzzy methods in machine learning and data mining: Status and prospects

Eyke Hüllermeier

Over the past years, methods for the automated induction of models and the extraction of interesting patterns from empirical data have attracted considerable attention in the fuzzy set community. This paper briefly reviews some typical applications and highlights potential contributions that fuzzy set theory can make to machine learning, data mining, and related fields. The paper concludes with a critical consideration of recent developments and some suggestions for future research directions.

Machine Learning | 2012

On label dependence and loss minimization in multi-label classification

Krzysztof Dembczyński; Willem Waegeman; Weiwei Cheng; Eyke Hüllermeier

Most of the multi-label classification (MLC) methods proposed in recent years are intended to exploit, in one way or the other, dependencies between the class labels. Compared to simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that the baseline ignores such dependencies. Without questioning the correctness of such studies, one has to admit that a blanket explanation of that kind is hiding many subtle details, and indeed, the underlying mechanisms and true reasons for the improvements reported in experimental studies are rarely laid bare. Rather than proposing yet another MLC algorithm, the aim of this paper is to elaborate more closely on the idea of exploiting label dependence, thereby contributing to a better understanding of MLC. Adopting a statistical perspective, we claim that two types of label dependence should be distinguished, namely conditional and marginal dependence. Subsequently, we present three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier. In this regard, a close connection with loss minimization is established, showing that the benefit of exploiting label dependence does also depend on the type of loss to be minimized. Concrete theoretical results are presented for two representative loss functions, namely the Hamming loss and the subset 0/1 loss. In addition, we give an overview of state-of-the-art decomposition algorithms for MLC and we try to reveal the reasons for their effectiveness. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data.
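The dependence on the loss function can be made concrete with a small worked example in Python. It assumes the conditional joint distribution over label vectors for one instance is known, which is never the case in practice; the distribution and the function names are invented for illustration. The Hamming loss is minimized by predicting each label's marginal mode, while the subset 0/1 loss is minimized by the mode of the joint distribution.

```python
def hamming_optimal(joint):
    """Predict each label from its marginal probability (threshold 0.5)."""
    m = len(next(iter(joint)))
    marginals = [
        sum(p for y, p in joint.items() if y[i] == 1) for i in range(m)
    ]
    return tuple(int(q > 0.5) for q in marginals)


def subset_optimal(joint):
    """Predict the single most probable label vector (joint mode)."""
    return max(joint, key=joint.get)


# Made-up conditional distribution P(y1, y2 | x) for one instance.
toy_joint = {(0, 0): 0.4, (0, 1): 0.3, (1, 1): 0.3}

hamming_pred = hamming_optimal(toy_joint)
subset_pred = subset_optimal(toy_joint)
print(hamming_pred)
print(subset_pred)
```

Here the marginals are P(y1 = 1) = 0.3 and P(y2 = 1) = 0.6, so the Hamming-optimal prediction is (0, 1), whereas the most probable label vector is (0, 0). The two losses call for different predictions on the same instance.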

Bioinformatics | 2005

Clustering of gene expression data using a local shape-based similarity measure

Rajarajeswari Balasubramaniyan; Eyke Hüllermeier; Nils Weskamp; Jörg Kämper

MOTIVATION: Microarray technology enables the study of gene expression on a large scale. The application of methods for data analysis then allows for grouping genes that show a similar expression profile and that are thus likely to be co-regulated. A relationship among genes at the biological level often presents itself by locally similar and potentially time-shifted patterns in their expression profiles.

RESULTS: Here, we propose a new method (CLARITY; Clustering with Local shApe-based similaRITY) for the analysis of microarray time course experiments that uses a local shape-based similarity measure based on Spearman rank correlation. This measure does not require a normalization of the expression data and is comparably robust towards noise. It is also able to detect similar and even time-shifted sub-profiles. To this end, we implemented an approach motivated by the BLAST algorithm for sequence alignment. We used CLARITY to cluster the time series of gene expression data during the mitotic cell cycle of the yeast Saccharomyces cerevisiae. The obtained clusters were related to the MIPS functional classification to assess their biological significance. We found that several clusters were significantly enriched with genes that share similar or related functions.
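The idea of a local, rank-based similarity that tolerates time shifts can be sketched in Python as follows. This is a brute-force illustration and not the CLARITY algorithm, which the abstract describes as motivated by BLAST; the function names, the window length `k`, and the toy profiles are assumptions, and the rank formula used here assumes no tied values within a window.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for equal-length sequences without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def best_local_similarity(x, y, k):
    """Highest Spearman correlation over all pairs of length-k windows.

    Comparing every window of x with every window of y allows matching
    sub-profiles that are shifted in time. Requires k >= 2.
    """
    best = -1.0
    for i in range(len(x) - k + 1):
        for j in range(len(y) - k + 1):
            best = max(best, spearman(x[i:i + k], y[j:j + k]))
    return best


# Made-up profiles: the first four points of profile_a reappear in shape,
# on a different scale and two time steps later, in profile_b.
profile_a = [0, 1, 5, 2, 8, 3, 9]
profile_b = [7, 6, 10, 11, 15, 12, 18]

global_rho = spearman(profile_a, profile_b)
local_rho = best_local_similarity(profile_a, profile_b, k=4)
print(global_rho, local_rho)
```

Because only ranks enter the computation, the two profiles do not need to be on the same scale, which matches the abstract's remark that no normalization of the expression data is required.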
