
Anton Dries

Katholieke Universiteit Leuven

Publication

Featured research published by Anton Dries.

conference on information and knowledge management | 2009

A query language for analyzing networks

Anton Dries; Siegfried Nijssen; Luc De Raedt

With more and more large networks becoming available, mining and querying such networks are increasingly important tasks that are not supported by current database models and query languages. This paper aims to alleviate this situation by proposing a data model and a query language that facilitate the analysis of networks. Key features include support for executing external tools on the networks, flexible contexts on the network, each resulting in a different graph, and primitives for querying subgraphs (including paths) and transforming graphs. The data model provides a closure property: the output of every query can be stored in the database and used for further querying.

international conference on data mining | 2013

Dominance Programming for Itemset Mining

Benjamin Negrevergne; Anton Dries; Tias Guns; Siegfried Nijssen

Finding small sets of interesting patterns is an important challenge in pattern mining. In this paper, we argue that several well-known approaches that address this challenge are based on performing pairwise comparisons between patterns. Examples include finding closed patterns, free patterns, relevant subgroups and skyline patterns. Although progress has been made on each of these individual problems, a generic approach for solving these problems (and more) is still lacking. This paper tackles this challenge. It proposes a novel, generic approach for handling pattern mining problems that involve pairwise comparisons between patterns. Our key contributions are the following. First, we propose a novel algebra for programming pattern mining problems. This algebra extends relational algebras in a novel way towards pattern mining. It allows for the generic combination of constraints on individual patterns with dominance relations between patterns. Second, we introduce a modified generic constraint satisfaction system to evaluate these algebraic expressions. Experiments show that this generic approach can indeed effectively identify patterns expressed in the algebra.
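To make the idea of a dominance relation concrete, here is a minimal Python sketch, not the algebra or the solver from the paper. It treats closed itemsets as the frequent itemsets that are not dominated, where an itemset is dominated by a strict superset with the same support. The function names, the brute-force enumeration and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    supports = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                supports[itemset] = support
    return supports


def closed_dominates(y, x, supports):
    """y dominates x if y is a strict superset of x with equal support."""
    return x < y and supports[x] == supports[y]


def non_dominated(supports, dominates):
    """Keep the patterns that no other pattern dominates (pairwise comparison)."""
    return {
        x: s
        for x, s in supports.items()
        if not any(dominates(y, x, supports) for y in supports)
    }


# Hypothetical toy data: four transactions over the items a, b, c.
TRANSACTIONS = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("a")]
FREQUENT = frequent_itemsets(TRANSACTIONS, min_support=2)
CLOSED = non_dominated(FREQUENT, closed_dominates)
```

Swapping `closed_dominates` for another pairwise relation changes the problem being solved while `non_dominated` stays the same, which is the kind of genericity the abstract describes.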

siam international conference on data mining | 2012

Mining patterns in networks using homomorphism

Anton Dries; Siegfried Nijssen

In recent years many algorithms have been developed for finding patterns in graphs and networks. A disadvantage of these algorithms is that they use subgraph isomorphism to determine the support of a graph pattern; subgraph isomorphism is a well-known NP-complete problem. In this paper, we propose an alternative approach which mines tree patterns in networks by using subgraph homomorphism. The advantage of homomorphism is that it can be computed in polynomial time, which allows us to develop an algorithm that mines tree patterns in arbitrary graphs in incremental polynomial time. Homomorphism, however, entails two problems not found when using isomorphism: (1) two patterns of different size can be equivalent; (2) patterns of unbounded size can be frequent. In this paper we formalize these problems and study solutions that easily fit within our algorithm.
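The following Python sketch illustrates why homomorphism is cheap for tree patterns and why pattern size becomes unbounded. It is a standard dynamic-programming check, not the mining algorithm from the paper, and all names and the toy graph are assumptions. A rooted pattern node can map to a graph vertex if the labels agree and every child can map to some neighbour of that vertex. Because the mapping need not be injective, the number of (pattern node, vertex) pairs bounds the work.

```python
from functools import lru_cache


def tree_homomorphism_exists(tree_children, tree_labels, root, graph_adj, graph_labels):
    """Return True if the rooted, labelled tree maps homomorphically into the graph."""

    @lru_cache(maxsize=None)
    def can_map(p, v):
        # Labels must agree, and each child needs some neighbour of v to map to.
        if tree_labels[p] != graph_labels[v]:
            return False
        return all(
            any(can_map(c, u) for u in graph_adj[v])
            for c in tree_children.get(p, ())
        )

    return any(can_map(root, v) for v in graph_adj)


def path_pattern(labels):
    """Build a path-shaped tree pattern rooted at node 0 from a label string."""
    children = {i: [i + 1] for i in range(len(labels) - 1)}
    return children, dict(enumerate(labels))


# Hypothetical toy graph: a single undirected edge between an 'a' and a 'b' vertex.
GRAPH_ADJ = {0: [1], 1: [0]}
GRAPH_LABELS = {0: "a", 1: "b"}

# A six-node path still maps into the two-vertex graph by walking back and forth.
LONG_CHILDREN, LONG_LABELS = path_pattern("ababab")
LONG_PATH_MATCHES = tree_homomorphism_exists(
    LONG_CHILDREN, LONG_LABELS, 0, GRAPH_ADJ, GRAPH_LABELS
)

# A pattern that needs two adjacent 'a' vertices has no image in this graph.
AA_CHILDREN, AA_LABELS = path_pattern("aa")
AA_PATH_MATCHES = tree_homomorphism_exists(
    AA_CHILDREN, AA_LABELS, 0, GRAPH_ADJ, GRAPH_LABELS
)
```

The long path matching a single edge is a small instance of problem (2) in the abstract: under homomorphism, arbitrarily large patterns can occur in a small graph.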

IEEE Transactions on Knowledge and Data Engineering | 2010

Mining Predictive k-CNF Expressions

Anton Dries; Luc De Raedt; Siegfried Nijssen

We adapt Mitchell's version space algorithm for mining k-CNF formulas. Advantages of this algorithm are that it runs in a single pass over the data, is conceptually simple, can be used for missing value prediction, and has interesting theoretical properties, while an empirical evaluation on classification tasks yields competitive predictive results.
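As a rough illustration of single-pass learning of k-CNF formulas, the sketch below uses the classical clause-elimination idea: start from every clause of at most k literals and drop each clause that some positive example violates. This is an assumed, simplified stand-in and not necessarily the algorithm of the paper; the function names and the toy examples are hypothetical.

```python
from itertools import combinations, product


def all_clauses(n_vars, k):
    """All disjunctions of at most k literals; a literal is (variable, sign)."""
    clauses = set()
    for size in range(1, k + 1):
        for variables in combinations(range(n_vars), size):
            for signs in product((True, False), repeat=size):
                clauses.add(frozenset(zip(variables, signs)))
    return clauses


def satisfies(example, clause):
    """An example satisfies a clause if at least one literal agrees with it."""
    return any(example[var] == sign for var, sign in clause)


def most_specific_kcnf(examples, n_vars, k):
    """Single pass: keep only the clauses satisfied by every example seen so far."""
    clauses = all_clauses(n_vars, k)
    for example in examples:
        clauses = {c for c in clauses if satisfies(example, c)}
    return clauses


def predict(example, clauses):
    """Classify as positive only if the example satisfies every remaining clause."""
    return all(satisfies(example, c) for c in clauses)


# Hypothetical positive examples over three Boolean variables, with k = 2.
POSITIVES = [(True, True, False), (True, False, True)]
MODEL = most_specific_kcnf(POSITIVES, n_vars=3, k=2)
```

Each example is read once and then discarded, so memory depends on the number of candidate clauses rather than on the size of the data.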

european conference on machine learning | 2015

ProbLog2: Probabilistic Logic Programming

Anton Dries; Angelika Kimmig; Wannes Meert; Joris Renkens; Guy Van den Broeck; Jonas Vlasselaer; Luc De Raedt

We present ProbLog2, the state-of-the-art implementation of the probabilistic programming language ProbLog. The ProbLog language allows the user to intuitively build programs that encode not only complex interactions between large sets of heterogeneous components but also the inherent uncertainties that are present in real-life situations. The system provides efficient algorithms for querying such models as well as for learning their parameters from data. It is available as an online tool on the web and for download. The offline version offers both command line access to inference and learning and a Python library for building statistical relational learning applications from the system's components.

mining and learning with graphs | 2010

Analyzing graph databases by aggregate queries

Anton Dries; Siegfried Nijssen

An important step in data analysis is the exploration of data. For traditional relational data, one of the most powerful tools for such analysis is the relational database, together with the aggregates and rankings it can compute: for instance, simple statistics such as the average number of links between two types of entities (relations) are easily computed using a query on a relational database and may already provide valuable information. However, for the exploration of graph data, relational databases may not be the most practical or scalable option. For instance, a statistic such as the shortest path between two given nodes cannot be computed by a relational database. Surprisingly, however, tools for querying graph and network databases are much less well developed than those for relational data, and only recently has an increasing number of studies been devoted to graph or network databases. Our position is that the development of such graph databases is important both to make basic graph mining easier and to prepare data for more complex types of analysis. An important component of such databases is the language that is used to enable aggregating queries, such as shortest path queries. In this paper, we propose an extension to a previously proposed query language. This extension allows for querying and analyzing databases by using aggregates and ranking. A notable feature of our language is that it also supports probabilistic graph queries by conceiving of such queries as aggregating queries. We demonstrate its value on a simple data analysis task.
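A shortest path query can be read as an aggregate: the minimum length over all paths between two nodes. The Python sketch below computes that value with breadth-first search on an unweighted graph. It does not use the query language proposed in the paper; the function name and the toy network are assumptions.

```python
from collections import deque


def shortest_path_length(adjacency, source, target):
    """Minimum number of edges between source and target, or None if unreachable."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return None


# Hypothetical undirected network, stored with both directions of every edge.
NETWORK = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "e"],
    "d": ["a", "e"],
    "e": ["d", "c"],
    "f": [],
}
A_TO_C = shortest_path_length(NETWORK, "a", "c")
A_TO_F = shortest_path_length(NETWORK, "a", "f")
```

Here two routes connect `a` and `c` (via `b`, or via `d` and `e`), and the aggregate keeps only the shorter one.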

Bisociative Knowledge Discovery | 2012

BiQL: a query language for analyzing information networks

Anton Dries; Siegfried Nijssen; Luc De Raedt

One of the key steps in data analysis is the exploration of data. For traditional relational data, this process is facilitated by relational database management systems and the aggregates and rankings they can compute. However, for the exploration of graph data, relational databases may not be the most practical or scalable option. Many tasks related to the exploration of information networks involve computation and analysis of connections (e.g. paths) between concepts. Traditional relational databases offer no specific support for performing such tasks. For instance, a statistic such as the shortest path between two given nodes cannot be computed by a relational database. Surprisingly, tools for querying graph and network databases are much less well developed than those for relational data, and only recently has an increasing number of studies been devoted to graph or network databases. Our position is that the development of such graph databases is important both to make basic graph mining easier and to prepare data for more complex types of analysis. In this chapter we present the BiQL data model for representing and manipulating information networks. The BiQL data model consists of two parts: a data model describing objects, links, domains and networks, and a query language describing basic network manipulations. The main focus here lies on data preparation and data analysis, and less on data mining or knowledge discovery tasks directly.

inductive logic programming | 2009

Towards clausal discovery for stream mining

Anton Dries; Luc De Raedt

With the increasing popularity of data streams it has become time to adapt logical and relational learning techniques for dealing with streams. In this note, we present our preliminary results on upgrading the clausal discovery paradigm towards the mining of streams. In this setting, there is a stream of interpretations and the goal is to learn a clausal theory that is satisfied by these interpretations. Furthermore, in data streams the interpretations can be read (and processed) only once.

international joint conference on artificial intelligence | 2017

Solving Probability Problems in Natural Language

Anton Dries; Angelika Kimmig; Jesse Davis; Vaishak Belle; Luc De Raedt

The ability to solve probability word problems, such as those found in introductory discrete mathematics textbooks, is an important cognitive and intellectual skill. In this paper, we develop a two-step, end-to-end, fully automated approach for solving such questions that is able to automatically provide answers to exercises about probability formulated in natural language. In the first step, a question formulated in natural language is analysed and transformed into a high-level model specified in a declarative language. In the second step, a solution to the high-level model is computed using a probabilistic programming system. On a dataset of 2160 probability problems, our solver is able to correctly answer 97.5% of the questions given a correct model. On the end-to-end evaluation, we are able to answer 12.5% of the questions (or 31.1% if we exclude examples not supported by design).

international conference on data mining | 2013

The MiningZinc Framework for Constraint-Based Itemset Mining

Tias Guns; Anton Dries; Guido Tack; Siegfried Nijssen; Luc De Raedt

We present MiningZinc, a novel system for constraint-based pattern mining. It provides a declarative approach to data mining, where a user specifies a problem in terms of constraints and the system employs advanced techniques to efficiently find solutions. Declarative programming and modeling are common in artificial intelligence and in database systems, but not so much in data mining. By building on ideas from these communities, MiningZinc advances the state of the art of declarative data mining significantly. Key components of the MiningZinc system are (1) a high-level and natural language for formalizing constraint-based itemset mining problems in models, and (2) an infrastructure for executing these models, which supports both specialized mining algorithms and generic constraint solving systems. A use case demonstrates the generality of the language, as well as its flexibility towards adding and modifying constraints and data, and the use of different solution methods.
