Hendrik Blockeel | Researchain

Publication

Featured research published by Hendrik Blockeel.

SIGKDD Explorations | 2000

Web mining research: a survey

Raymondus Kosala; Hendrik Blockeel

With the huge amount of information available online, the World Wide Web is a fertile area for data mining research. Web mining research is at the crossroads of several research communities, such as databases, information retrieval, and, within AI, especially the sub-areas of machine learning and natural language processing. However, there is a lot of confusion when comparing research efforts from different points of view. In this paper, we survey the research in the area of Web mining, point out some confusion regarding the usage of the term Web mining, and suggest three Web mining categories. Then we situate some of the research with respect to these three categories. We also explore the connection between the Web mining categories and the related agent paradigm. For the survey, we use as criteria the representation issues, the process, the learning algorithm, and the application of recent work. We conclude the paper with some research issues.

Artificial Intelligence | 1998

Top-down induction of first-order logical decision trees

Hendrik Blockeel; Luc De Raedt

Although top-down induction of decision trees is a very popular induction method, up till now it has mainly been used for propositional learning; relational decision tree learners are scarce. This dissertation discusses the application domain of decision tree learning and extends it towards the first-order logic context of Inductive Logic Programming.

Machine Learning | 2008

Decision trees for hierarchical multi-label classification

Celine Vens; Jan Struyf; Leander Schietgat; Sašo Džeroski; Hendrik Blockeel

Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. This article presents several approaches to the induction of decision trees for HMC, as well as an empirical study of their use in functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to two approaches that learn a set of regular classification trees (one for each class). The first approach defines an independent single-label classification task for each class (SC). Obviously, the hierarchy introduces dependencies between the classes. While they are ignored by the first approach, they are exploited by the second approach, named hierarchical single-label classification (HSC). Depending on the application at hand, the hierarchy of classes can be such that each class has at most one parent (tree structure) or such that classes may have multiple parents (DAG structure). The latter case has not been considered before and we show how the HMC and HSC approaches can be modified to support this setting. We compare the three approaches on 24 yeast data sets using as classification schemes MIPS’s FunCat (tree structure) and the Gene Ontology (DAG structure). We show that HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time. We conclude that HMC trees should definitely be considered in HMC tasks where interpretable models are desired.
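As a rough illustration of the label structure described in this abstract, the sketch below encodes one instance's labels over a DAG-shaped class hierarchy. It assumes the usual hierarchy constraint (an instance that belongs to a class also belongs to all ancestors of that class), which the abstract does not spell out. The class names and the `parents` mapping are hypothetical, and the code is not taken from the paper.

```python
from typing import Dict, Iterable, List, Set

def with_ancestors(labels: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Close a label set under the (assumed) hierarchy constraint."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        cls = stack.pop()
        if cls not in closed:
            closed.add(cls)
            # A class may have several parents (DAG) or at most one (tree).
            stack.extend(parents.get(cls, []))
    return closed

def to_vector(labels: Set[str], classes: List[str]) -> List[int]:
    """Binary target vector: one component per class, in a fixed order."""
    return [1 if cls in labels else 0 for cls in classes]

# Hypothetical hierarchy: "c" has two parents, so this is a DAG, not a tree.
parents = {"a1": ["a"], "b1": ["b"], "c": ["a1", "b"]}
classes = ["a", "a1", "b", "b1", "c"]

closed = with_ancestors({"c"}, parents)
vector = to_vector(closed, classes)
print(sorted(closed), vector)
```

Under this encoding, a single HMC tree would predict the whole vector at once, whereas the SC setting would train one binary tree per vector component.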

Archive | 2003

Knowledge Discovery in Databases: PKDD 2003

Nada Lavrač; Dragan Gamberger; Ljupčo Todorovski; Hendrik Blockeel

This paper describes the Robosail project. It started in 1997 with the aim of building a self-learning autopilot for a single-handed sailing yacht. The goal was to make an adaptive system that would help a single-handed sailor to go faster on average in a race. Presently, after five years of development and a number of sea trials, we have a commercial system available (www.robosail.com). It is a hybrid system using agent technology, machine learning, data mining and rule-based reasoning. Apart from describing the system, we try to generalize our findings, and argue that sailing is an interesting paradigm for a class of hybrid systems that one could call Skill-based Systems.

Journal of Artificial Intelligence Research | 2002

Improving the efficiency of inductive logic programming through the use of query packs

Hendrik Blockeel; Luc Dehaspe; Bart Demoen; Gerda Janssens; Jan Ramon; Henk Vandecasteele

Inductive logic programming, or relational learning, is a powerful paradigm for machine learning or data mining. However, in order for ILP to become practically useful, the efficiency of ILP systems must improve substantially. To this end, the notion of a query pack is introduced: it structures sets of similar queries. Furthermore, a mechanism is described for executing such query packs. A complexity analysis shows that considerable efficiency improvements can be achieved through the use of this query pack execution mechanism. This claim is supported by empirical results obtained by incorporating support for query pack execution in two existing learning systems.
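The idea of structuring similar queries so that shared work is done once can be sketched as follows. This is a heavily simplified, propositional illustration and not the mechanism from the paper: it assumes each query is a conjunction of named boolean tests with no variable bindings or backtracking, and all names (`PackNode`, `build_pack`, `run_pack`, `run_naive`) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Example = Dict[str, int]
Tests = Dict[str, Callable[[Example], bool]]

class PackNode:
    """Trie node: queries sharing a prefix share the path to this node."""
    def __init__(self) -> None:
        self.children: Dict[str, "PackNode"] = {}
        self.queries: List[Tuple[str, ...]] = []  # queries ending here

def build_pack(queries: List[List[str]]) -> PackNode:
    root = PackNode()
    for query in queries:
        node = root
        for name in query:
            node = node.children.setdefault(name, PackNode())
        node.queries.append(tuple(query))
    return root

def run_pack(node: PackNode, example: Example, tests: Tests, stats: Dict[str, int]):
    """Evaluate each shared test once; descend only where the test succeeds."""
    succeeded = list(node.queries)
    for name, child in node.children.items():
        stats["evaluations"] += 1
        if tests[name](example):
            succeeded += run_pack(child, example, tests, stats)
    return succeeded

def run_naive(queries: List[List[str]], example: Example, tests: Tests, stats: Dict[str, int]):
    """Baseline: evaluate every query from scratch, stopping at the first failure."""
    succeeded = []
    for query in queries:
        ok = True
        for name in query:
            stats["evaluations"] += 1
            if not tests[name](example):
                ok = False
                break
        if ok:
            succeeded.append(tuple(query))
    return succeeded

tests: Tests = {
    "a": lambda e: e["x"] > 0,
    "b": lambda e: e["y"] > 0,
    "c": lambda e: e["z"] > 0,
    "d": lambda e: e["w"] > 0,
}
queries = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
example = {"x": 1, "y": 1, "z": 0, "w": 1}

pack_stats = {"evaluations": 0}
naive_stats = {"evaluations": 0}
pack_result = run_pack(build_pack(queries), example, tests, pack_stats)
naive_result = run_naive(queries, example, tests, naive_stats)
print(pack_result, pack_stats["evaluations"], naive_stats["evaluations"])
```

On this toy input both strategies report the same successful query, but the pack evaluates 5 tests where the naive loop evaluates 8, because the shared prefix `a, b` is checked only once.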

BMC Bioinformatics | 2010

Predicting gene function using hierarchical multi-label decision tree ensembles

Leander Schietgat; Celine Vens; Jan Struyf; Hendrik Blockeel; Dragi Kocev; Sašo Džeroski

Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

Data Mining and Knowledge Discovery | 1999

Scaling Up Inductive Logic Programming by Learning from Interpretations

Hendrik Blockeel; Luc De Raedt; Nico Jacobs; Bart Demoen

When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which allows existing ILP algorithms to be scaled up. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system TILDE (Top-down Induction of Logical DEcision trees): TILDEclassic, which loads all data in main memory, and TILDELDS, which loads the examples one by one. We experimentally compare the implementations, showing that TILDELDS can handle large data sets (in the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.

European Conference on Principles of Data Mining and Knowledge Discovery | 2006

Decision trees for hierarchical multilabel classification: a case study in functional genomics

Hendrik Blockeel; Leander Schietgat; Jan Struyf; Sašo Džeroski; Amanda Clare

Hierarchical multilabel classification (HMC) is a variant of classification where instances may belong to multiple classes organized in a hierarchy. The task is relevant for several application domains. This paper presents an empirical study of decision tree approaches to HMC in the area of functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to learning a set of regular classification trees (one for each class). Interestingly, on all 12 datasets we use, the HMC tree wins on all fronts: it is faster to learn and to apply, easier to interpret, and has similar or better predictive performance than the set of regular trees. It turns out that HMC tree learning is more robust to overfitting than regular tree learning.

Archive | 2003

Machine Learning: ECML 2003

Nada Lavrač; Dragan Gamberger; Hendrik Blockeel; Ljupčo Todorovski

European Conference on Machine Learning | 2001

Speeding Up Relational Reinforcement Learning through the Use of an Incremental First Order Decision Tree Learner

Kurt Driessens; Jan Ramon; Hendrik Blockeel

Relational reinforcement learning (RRL) is a learning technique that combines standard reinforcement learning with inductive logic programming to enable the learning system to exploit structural knowledge about the application domain. This paper discusses an improvement of the original RRL. We introduce a fully incremental first-order decision tree learning algorithm, TG, and integrate this algorithm in the RRL system to form RRL-TG. We demonstrate the performance gain on experiments similar to those that were used to demonstrate the behaviour of the original RRL system.
