Jérémy Besson

Institut national des sciences Appliquées de Lyon

Publication

Featured research published by Jérémy Besson.

ACM Transactions on Knowledge Discovery From Data | 2009

Closed patterns meet n-ary relations

Loïc Cerf; Jérémy Besson; Céline Robardet; Jean-François Boulicaut

Set pattern discovery from binary relations has been extensively studied during the last decade. In particular, many complete and efficient algorithms for frequent closed set mining are now available. Generalizing such a task to n-ary relations (n ≥ 2) appears as a timely challenge. It may be important for many applications, for example, when adding the time dimension to the popular objects × features binary case. The generality of the task (no assumption being made on the relation arity or on the size of its attribute domains) makes it computationally challenging. We introduce an algorithm called Data-Peeler. From an n-ary relation, it extracts all closed n-sets satisfying given piecewise (anti)monotonic constraints. This new class of constraints generalizes both monotonic and antimonotonic constraints. Considering the special case of ternary relations, Data-Peeler outperforms the state-of-the-art algorithms CubeMiner and Trias by orders of magnitude. This performance is due to a new enumeration strategy that efficiently enforces the closedness property. The relevance of the extracted closed n-sets is assessed on real-life 3- and 4-ary relations. Beyond natural 3- or 4-ary relations, expanding a relation with an additional attribute can help in enforcing rather abstract constraints such as the robustness with respect to binarization. Furthermore, a collection of closed n-sets is shown to be an excellent starting point to compute a tiling of the dataset.
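The notion of a closed n-set mentioned in this abstract can be illustrated with a minimal brute-force check. This is only a sketch of the definition as commonly stated (every tuple of the Cartesian product belongs to the relation, and no set can be enlarged without breaking that); it is not Data-Peeler's enumeration strategy, and the function names and toy relation are hypothetical.

```python
from itertools import product

def is_n_set(relation, sets):
    """True if every tuple in the Cartesian product of `sets` is in `relation`."""
    return all(t in relation for t in product(*sets))

def is_closed_n_set(relation, domains, sets):
    """True if `sets` is an n-set that cannot be extended on any dimension."""
    if not is_n_set(relation, sets):
        return False
    for d, (domain, current) in enumerate(zip(domains, sets)):
        for element in domain - current:
            extended = list(sets)
            extended[d] = current | {element}
            if is_n_set(relation, extended):
                return False  # a strictly larger n-set exists
    return True

# Hypothetical ternary relation: (object, feature, time) triples.
RELATION = {
    ("a", "x", 1), ("a", "x", 2),
    ("b", "x", 1), ("b", "x", 2),
    ("a", "y", 1),
}
DOMAINS = [{"a", "b"}, {"x", "y"}, {1, 2}]

print(is_closed_n_set(RELATION, DOMAINS, [{"a", "b"}, {"x"}, {1, 2}]))
print(is_closed_n_set(RELATION, DOMAINS, [{"a"}, {"x"}, {1, 2}]))
```

The check is polynomial per candidate, but enumerating all candidates this way would be exponential, which is why a dedicated enumeration strategy matters.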

International Conference on Data Mining | 2007

ORIGAMI: Mining Representative Orthogonal Graph Patterns

M. Al Hasan; Vineet Chaoji; Saeed Salem; Jérémy Besson; Mohammed Javeed Zaki

In this paper, we introduce the concept of alpha-orthogonal patterns to mine a representative set of graph patterns. Intuitively, two graph patterns are alpha-orthogonal if their similarity is bounded above by alpha. Each alpha-orthogonal pattern is also a representative for those patterns that are at least beta similar to it. Given user-defined alpha, beta ∈ [0,1], the goal is to mine an alpha-orthogonal, beta-representative set that minimizes the set of unrepresented patterns. We present ORIGAMI, an effective algorithm for mining the set of representative orthogonal patterns. ORIGAMI first uses a randomized algorithm to traverse the pattern space, seeking previously unexplored regions, to return a set of maximal patterns. ORIGAMI then extracts an alpha-orthogonal, beta-representative set from the mined maximal patterns. We show the effectiveness of our algorithm on a number of real and synthetic datasets. In particular, we show that our method is able to extract high quality patterns even in cases where existing enumerative graph mining methods fail to do so.
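The two definitions in this abstract can be sketched in a few lines. The sketch below makes several assumptions that are not from the source: patterns are reduced to sets of edge labels, similarity is the Jaccard index, and the selection is a simple greedy pass rather than the procedure ORIGAMI actually uses.

```python
def jaccard(a, b):
    """Assumed stand-in similarity: Jaccard index of two edge-label sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def is_alpha_orthogonal(patterns, alpha, sim=jaccard):
    """True if every pair of patterns has similarity at most alpha."""
    return all(
        sim(p, q) <= alpha
        for i, p in enumerate(patterns)
        for q in patterns[i + 1:]
    )

def unrepresented(selected, universe, beta, sim=jaccard):
    """Patterns outside `selected` with no representative at similarity >= beta."""
    return [
        q for q in universe
        if q not in selected and not any(sim(p, q) >= beta for p in selected)
    ]

def greedy_orthogonal_set(universe, alpha, sim=jaccard):
    """Greedy stand-in: keep a pattern if it is alpha-orthogonal to all kept so far."""
    selected = []
    for p in universe:
        if all(sim(p, q) <= alpha for q in selected):
            selected.append(p)
    return selected

# Hypothetical maximal patterns, each given as a set of edge labels.
PATTERNS = [frozenset("abc"), frozenset("abd"), frozenset("xyz"), frozenset("xy")]

chosen = greedy_orthogonal_set(PATTERNS, alpha=0.2)
print(chosen)
print(unrepresented(chosen, PATTERNS, beta=0.6))
```

Lowering beta makes each selected pattern represent more of its neighbours, so the unrepresented list shrinks; lowering alpha forces the selected patterns further apart.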

International Conference on Management of Data | 2003

Using transposition for pattern discovery from microarray data

François Rioult; Jean-François Boulicaut; Bruno Crémilleux; Jérémy Besson

We analyze expression matrices to identify a priori interesting sets of genes, e.g., genes that are frequently co-regulated. Such matrices provide expression values for given biological situations (the lines) and given genes (the columns). Frequent itemset (sets of columns) extraction makes it possible to process difficult cases (millions of lines, hundreds of columns) provided that the data is not too dense. However, expression matrices can be dense and generally have only a few lines relative to the number of columns. Known algorithms, including the recent algorithms that compute the so-called condensed representations, can fail. Thanks to the properties of Galois connections, we propose an original technique that processes the transposed matrices while computing the sets of genes. We validate the potential of this framework by looking for the closed sets in two microarray data sets.
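The transposition idea can be illustrated on a toy boolean matrix. The sketch below is a brute-force illustration under stated assumptions, not the authors' algorithm: it enumerates closed sets of lines on the transposed matrix (cheap when there are few lines) and maps each one to a closed set of genes through the Galois connection. The matrix and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy matrix: lines are biological situations, columns are genes.
MATRIX = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def rows_supporting(matrix, cols):
    """Rows that have a 1 in every column of `cols`."""
    return frozenset(i for i, row in enumerate(matrix) if all(row[j] for j in cols))

def cols_shared_by(matrix, rows):
    """Columns that have a 1 in every row of `rows`."""
    return frozenset(j for j in range(len(matrix[0])) if all(matrix[i][j] for i in rows))

def closed_column_sets(matrix):
    """Brute force: closure of every column subset (exponential in the column count)."""
    n_cols = len(matrix[0])
    closed = set()
    for k in range(n_cols + 1):
        for cols in combinations(range(n_cols), k):
            closed.add(cols_shared_by(matrix, rows_supporting(matrix, cols)))
    return closed

def closed_column_sets_via_transposition(matrix):
    """Enumerate closed row sets on the transposed matrix, then map them to column sets."""
    closed_row_sets = closed_column_sets(transpose(matrix))
    return {cols_shared_by(matrix, rows) for rows in closed_row_sets}

direct = closed_column_sets(MATRIX)
transposed = closed_column_sets_via_transposition(MATRIX)
print(direct == transposed)
```

On this toy matrix the direct enumeration visits 16 column subsets while the transposed one visits 8 line subsets; the gap grows exponentially as the number of genes exceeds the number of situations.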

International Conference on Conceptual Structures | 2006

Mining a new fault-tolerant pattern type as an alternative to formal concept discovery

Jérémy Besson; Céline Robardet; Jean-François Boulicaut

Formal concept analysis has proved useful to support knowledge discovery from boolean matrices. In many applications, such 0/1 data have to be computed from experimental data, and it is common for some 1 values to be missing. Therefore, we extend formal concepts towards fault-tolerance. We define the DR-bi-set pattern domain by allowing some zero values to be inside the pattern. Crucial properties of formal concepts are preserved (number of zero values bounded on objects and attributes, maximality and availability of functions which “connect” the set components). DR-bi-sets are defined by constraints which are actively used by our correct and complete algorithm. Experimentation on both synthetic and real data validates the added value of the DR-bi-sets.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2004

Constraint-Based Mining of Formal Concepts in Transactional Data

Jérémy Besson; Céline Robardet; Jean-François Boulicaut

We are designing new data mining techniques on boolean contexts to identify a priori interesting concepts, i.e., closed sets of objects (or transactions) and associated closed sets of attributes (or items). We propose a new algorithm D-Miner for mining concepts under constraints. We provide an experimental comparison with previous algorithms and an application to an original microarray dataset for which D-Miner is the only one that can mine all the concepts.

KDID'05 Proceedings of the 4th international conference on Knowledge Discovery in Inductive Databases | 2005

Constraint-Based Mining of Fault-Tolerant Patterns from Boolean Data

Jérémy Besson; Ruggero G. Pensa; Céline Robardet; Jean-François Boulicaut

Thanks to an important research effort during the last few years, inductive queries on local patterns (e.g., set patterns) and their associated complete solvers have proved extremely useful to support knowledge discovery. The more we use such queries on real-life data, e.g., biological data, the more we are convinced that inductive queries should return fault-tolerant patterns. This is obviously the case when considering formal concept discovery from noisy datasets. Therefore, we study various extensions of this kind of bi-set towards fault-tolerance. We compare three declarative specifications of fault-tolerant bi-sets by means of a constraint-based mining approach. Our framework enables a better understanding of the needed trade-off between extraction feasibility, completeness, relevance, and ease of interpretation of these fault-tolerant patterns. An original empirical evaluation on both synthetic and real-life medical data is given. It enables a comparison of the various proposals and motivates further directions of research.

International Conference on Formal Concept Analysis | 2008

Actionability and formal concepts: a data mining perspective

Jean-François Boulicaut; Jérémy Besson

Over the last few years, we have studied different techniques for mining set patterns from binary data, including the computation of formal concepts to support various knowledge discovery processes. For instance, when considering post-genomics, we can exploit Boolean data sets that encode a relation between some genes and the proteins that may regulate them. In such a context, it appears interesting to exploit the analogy between a putative transcriptional module (i.e., a typically important hypothesis for gene regulation understanding) and a formal concept that holds within such data. In this paper, we assume that knowledge nuggets can be captured by collections of formal concepts and we discuss the challenging issue of mining/selecting actionable patterns from these collections, i.e., looking for relevant patterns that really support knowledge discovery. Therefore, a major issue concerns the computation of complete collections of formal concepts that satisfy user-defined constraints. This is useful not only to avoid the computation of patterns that are too small and might be due to noise (e.g., using size constraints on both their intents and extents) but also to introduce some fault-tolerance. We discuss the pros and cons of some recent proposals in that direction.

Discovery Science | 2004

A Methodology for Biologically Relevant Pattern Discovery from Gene Expression Data

Ruggero G. Pensa; Jérémy Besson; Jean-François Boulicaut

One of the most exciting scientific challenges in functional genomics concerns the discovery of biologically relevant patterns from gene expression data. For instance, it is extremely useful to provide putative synexpression groups or transcription modules to molecular biologists. We propose a methodology that has proved useful in real cases. It is described as a prototypical KDD scenario which runs from raw expression data selection to the delivery of useful patterns. Our conceptual contribution is (a) to emphasize how to make the most of recent progress in constraint-based mining of set patterns, and (b) to propose a generic approach for gene expression data enrichment. The methodology has been validated on real data sets.

KDID'04 Proceedings of the Third international conference on Knowledge Discovery in Inductive Databases | 2004

Mining formal concepts with a bounded number of exceptions from transactional data

Jérémy Besson; Céline Robardet; Jean-François Boulicaut

We are designing new data mining techniques on boolean contexts to identify a priori interesting bi-sets (i.e., sets of objects or transactions associated with sets of attributes or items). A typical important case concerns formal concept mining (i.e., maximal rectangles of true values or associated closed sets by means of the so-called Galois connection). It has been applied with some success to, e.g., gene expression data analysis where objects denote biological situations and attributes denote gene expression properties. However, in such real-life application domains, it turns out that the Galois association is too strong when considering intrinsically noisy data. Strong associations that nevertheless accept a bounded number of exceptions would be extremely useful. We study the new pattern domain of α/β concepts, i.e., consistent maximal bi-sets with fewer than α false values per row and fewer than β false values per column. We provide a complete algorithm that computes all the α/β concepts based on the generation of concept unions pruned thanks to anti-monotonic constraints. An experimental validation on synthetic data is given. It illustrates that more relevant associations can be discovered in noisy data. We also discuss a practical application in molecular biology that illustrates an incomplete but quite useful extraction when all the concepts that are needed beforehand cannot be discovered.
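The bounded-exception condition in this abstract can be checked directly on a boolean matrix. The sketch below covers only the per-row and per-column bounds on false values; the consistency and maximality conditions of α/β concepts are not modelled, and the function name and toy data are hypothetical.

```python
def satisfies_exception_bounds(matrix, rows, cols, alpha, beta):
    """Check the bi-set (rows, cols): fewer than alpha false values in each
    selected row and fewer than beta false values in each selected column."""
    rows_ok = all(
        sum(1 for j in cols if not matrix[i][j]) < alpha for i in rows
    )
    cols_ok = all(
        sum(1 for i in rows if not matrix[i][j]) < beta for j in cols
    )
    return rows_ok and cols_ok

# Hypothetical noisy boolean context: rows are objects, columns are attributes.
CONTEXT = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

# The full matrix has at most one false value per row and per column.
print(satisfies_exception_bounds(CONTEXT, {0, 1, 2}, {0, 1, 2}, alpha=2, beta=2))
# With alpha = beta = 1 no exception is allowed, as for an exact rectangle of true values.
print(satisfies_exception_bounds(CONTEXT, {0, 1, 2}, {0, 1, 2}, alpha=1, beta=1))
```

Setting both bounds to 1 recovers the exact case, which is one way to see the pattern domain as a relaxation of formal concepts.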

Lecture Notes in Computer Science | 2006

Mining bi-sets in numerical data

Jérémy Besson; Céline Robardet; Luc De Raedt; Jean-François Boulicaut

Thanks to an important research effort over the last few years, inductive queries on set patterns and complete solvers which can evaluate them on large 0/1 data sets have proved extremely useful. However, for many application domains, the raw data is numerical (matrices of real numbers whose dimensions denote objects and properties). Therefore, using efficient 0/1 mining techniques requires tedious Boolean property encoding phases. This is the case, e.g., when considering microarray data mining and its impact on knowledge discovery in molecular biology. We consider the possibility of mining numerical data directly to extract collections of relevant bi-sets, i.e., couples of associated sets of objects and attributes which satisfy some user-defined constraints. Not only do we propose a new pattern domain, but we also introduce a complete solver for computing the so-called numerical bi-sets. Preliminary experimental validation is given.
