Yves Bastide
Blaise Pascal University
Publication
Featured research published by Yves Bastide.
International Conference on Database Theory | 1999
Nicolas Pasquier; Yves Bastide; Rafik Taouil; Lotfi Lakhal
In this paper, we address the problem of finding frequent itemsets in a database. Using the closed itemset lattice framework, we show that this problem can be reduced to the problem of finding frequent closed itemsets. Based on this result, we can construct efficient data mining algorithms by limiting the search space to the closed itemset lattice rather than the subset lattice. Moreover, we show that the set of all frequent closed itemsets suffices to determine a reduced set of association rules, thus addressing another important data mining problem: limiting the number of rules produced without information loss. We propose a new algorithm, called A-Close, using a closure mechanism to find frequent closed itemsets. We carried out experiments to compare our approach to the commonly used frequent itemset search approach. Those experiments showed that our approach is very valuable for dense and/or correlated data, which represent an important part of existing databases.
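To make the reduction concrete, here is a minimal brute-force sketch in Python. It is not the A-Close algorithm from the paper; it only illustrates the underlying property that the support of a frequent itemset equals the support of its closure, so the frequent closed itemsets carry all the support information. The toy database, the threshold, and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [frozenset(t) for t in ("ACD", "BCE", "ABCE", "BE", "ABCE")]
MINSUP = 2  # absolute support threshold (assumed)


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions that contain `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Convention: an itemset found in no transaction closes to all items.
        return frozenset().union(*transactions)
    return frozenset.intersection(*covering)


def frequent_itemsets(transactions, minsup):
    """Enumerate every non-empty itemset and keep the frequent ones."""
    items = sorted(frozenset().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            count = support(s, transactions)
            if count >= minsup:
                result[s] = count
    return result


def frequent_closed_itemsets(transactions, minsup):
    """Keep only the frequent itemsets that equal their own closure."""
    frequent = frequent_itemsets(transactions, minsup)
    return {s: c for s, c in frequent.items() if closure(s, transactions) == s}


frequent = frequent_itemsets(TRANSACTIONS, MINSUP)
closed = frequent_closed_itemsets(TRANSACTIONS, MINSUP)
for s, c in frequent.items():
    # The support of each frequent itemset is recoverable from its closure.
    assert closed[closure(s, TRANSACTIONS)] == c
print(len(frequent), "frequent itemsets,", len(closed), "of them closed")
```

The exhaustive enumeration is exponential in the number of items and is only meant for tiny examples; a practical miner would explore the lattice level by level.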
Information Systems | 1999
Nicolas Pasquier; Yves Bastide; Rafik Taouil; Lotfi Lakhal
Discovering association rules is one of the most important tasks in data mining. Many efficient algorithms have been proposed in the literature. The most notable are Apriori, Mannila's algorithm, Partition, Sampling and DIC, which are all based on the Apriori mining method: pruning the subset lattice (itemset lattice). In this paper we propose an efficient algorithm, called Close, based on a new mining method: pruning the closed set lattice (closed itemset lattice). This lattice, which is a sub-order of the subset lattice, is closely related to Wille's concept lattice in formal concept analysis. Experiments comparing Close to an optimized version of Apriori showed that Close is very efficient for mining dense and/or correlated data such as census-style data, and performs reasonably well for market-basket-style data.
Data and Knowledge Engineering | 2002
Gerd Stumme; Rafik Taouil; Yves Bastide; Nicolas Pasquier; Lotfi Lakhal
We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as a starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data.
SIGKDD Explorations | 2000
Yves Bastide; Rafik Taouil; Nicolas Pasquier; Gerd Stumme; Lotfi Lakhal
In this paper, we propose the algorithm PASCAL, which introduces a novel optimization of the well-known algorithm Apriori. This optimization is based on a new strategy called pattern counting inference that relies on the concept of key patterns. We show that the support of frequent non-key patterns can be inferred from frequent key patterns without accessing the database. Experiments comparing PASCAL to the three algorithms Apriori, Close and Max-Miner show that PASCAL is among the most efficient algorithms for mining frequent patterns.
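The counting-inference idea can be sketched as a simplified level-wise miner in Python. This is an illustration under stated assumptions, not the published PASCAL implementation: it treats a pattern as a key pattern when its support is strictly smaller than the support of each of its subsets with one item removed, and it infers the support of any candidate that has a non-key subset as the minimum support of those subsets. The toy database and all names are hypothetical.

```python
# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [frozenset(t) for t in ("ACD", "BCE", "ABCE", "BE", "ABCE")]


def count_support(itemset, transactions):
    """Scan the database and count the transactions containing `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_with_inference(transactions, minsup):
    """Level-wise mining; returns (supports, is_key, number of database counts)."""
    items = sorted(frozenset().union(*transactions))
    empty = frozenset()
    supports = {empty: len(transactions)}  # frequent itemsets found so far
    is_key = {empty: True}
    db_counts = 0
    level = [empty]
    while level:
        # Candidates: one-item extensions whose one-item-smaller subsets are all frequent.
        candidates = set()
        for s in level:
            for i in items:
                if i not in s:
                    c = s | {i}
                    if all(c - {x} in supports for x in c):
                        candidates.add(c)
        next_level = []
        for c in sorted(candidates, key=sorted):
            subsets = [c - {x} for x in c]
            bound = min(supports[s] for s in subsets)
            if all(is_key[s] for s in subsets):
                # Every subset is a key pattern: the database must be consulted.
                supp = count_support(c, transactions)
                db_counts += 1
                key = supp < bound
            else:
                # Some subset is not a key pattern: infer the support, no scan.
                supp = bound
                key = False
            if supp >= minsup:
                supports[c] = supp
                is_key[c] = key
                next_level.append(c)
        level = next_level
    return supports, is_key, db_counts


supports, is_key, db_counts = mine_with_inference(TRANSACTIONS, minsup=2)
print(len(supports) - 1, "frequent non-empty itemsets;", db_counts, "counted in the database")
```

On dense or correlated data many candidates have a non-key subset, which is where skipping the database scan is expected to pay off.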
Lecture Notes in Computer Science | 2000
Yves Bastide; Nicolas Pasquier; Rafik Taouil; Gerd Stumme; Lotfi Lakhal
The problem of the relevance and usefulness of extracted association rules is of primary importance because, in the majority of cases, real-life databases lead to several thousand association rules with high confidence, many of which are redundant. Using the closure of the Galois connection, we define two new bases for association rules whose union is a generating set for all valid association rules with support and confidence. These bases are characterized using frequent closed itemsets and their generators; they consist of the non-redundant exact and approximate association rules having minimal antecedents and maximal consequents, i.e. the most relevant association rules. Algorithms for extracting these bases are presented, and results of experiments carried out on real-life databases show that the proposed bases are useful and that their generation is not time-consuming.
Intelligent Information Systems | 2005
Nicolas Pasquier; Rafik Taouil; Yves Bastide; Gerd Stumme; Lotfi Lakhal
Association rule extraction from operational datasets often produces several tens of thousands, and even millions, of association rules. Moreover, many of these rules are redundant and thus useless. Using a semantics based on the closure of the Galois connection, we define a condensed representation for association rules. This representation is characterized by frequent closed itemsets and their generators. It contains the non-redundant association rules having minimal antecedent and maximal consequent, called min-max association rules. We think that these rules are the most relevant since they are the most general non-redundant association rules. Furthermore, this representation is a basis, i.e., a generating set for all association rules, their supports and their confidences, and all of them can be retrieved without accessing the data. We introduce algorithms for extracting this basis and for reconstructing all association rules. Results of experiments carried out on real datasets show the usefulness of this approach. In order to generate this basis when an algorithm for extracting frequent itemsets (such as Apriori) is used, we also present an algorithm for deriving frequent closed itemsets and their generators from frequent itemsets without using the dataset.
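The last step, deriving closed itemsets and generators from frequent itemsets alone, can be illustrated with a small Python sketch. It is not the algorithm from the paper; it assumes a complete table of frequent itemsets with their supports (including the empty itemset) and uses two standard characterizations: an itemset is closed if no proper superset has the same support, and it is a generator if no proper subset has the same support. The exact min-max rules are then read off as generator → closure minus generator. The toy data and all function names are hypothetical.

```python
from itertools import combinations


def toy_frequent_itemsets(minsup=2):
    """Build {itemset: support} once from a small hypothetical database."""
    transactions = [frozenset(t) for t in ("ACD", "BCE", "ABCE", "BE", "ABCE")]
    items = sorted(frozenset().union(*transactions))
    supports = {}
    for k in range(0, len(items) + 1):  # k = 0 adds the empty itemset
        for combo in combinations(items, k):
            s = frozenset(combo)
            count = sum(1 for t in transactions if s <= t)
            if count >= minsup:
                supports[s] = count
    return supports


def closed_and_generators(supports):
    """Classify itemsets using only the support table, never the dataset."""
    closed = [s for s in supports
              if not any(s < t and supports[t] == supports[s] for t in supports)]
    generators = [s for s in supports
                  if not any(t < s and supports[t] == supports[s] for t in supports)]
    return closed, generators


def exact_min_max_rules(supports):
    """Rules g -> closure(g) minus g with confidence 1, for generators g that are not closed."""
    closed, generators = closed_and_generators(supports)
    rules = []
    for g in generators:
        # The closure of g is the closed superset of g with the same support.
        c = next(c for c in closed if g <= c and supports[c] == supports[g])
        if c != g:
            rules.append((g, c - g))
    return rules


supports = toy_frequent_itemsets()
for antecedent, consequent in exact_min_max_rules(supports):
    print("".join(sorted(antecedent)), "->", "".join(sorted(consequent)))
```

The pairwise comparisons are quadratic in the number of frequent itemsets, which is acceptable for a sketch but would need indexing by size or support on realistic inputs.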
Lecture Notes in Computer Science | 2001
Gerd Stumme; Rafik Taouil; Yves Bastide; Nicolas Pasquier; Lotfi Lakhal
Association rules are used to investigate large databases. The analyst is usually confronted with large lists of such rules and has to find the most relevant ones for his purpose. Based on results about knowledge representation within the theoretical framework of Formal Concept Analysis, we present relatively small bases for association rules from which all rules can be deduced. We also provide algorithms for their calculation.
International Conference on Data Engineering | 2000
Rafik Taouil; Nicolas Pasquier; Yves Bastide; Lotfi Lakhal
We address the problem of the usefulness and the relevance of the set of discovered association rules. Using the frequent closed itemset groundwork, we propose to generate bases for association rules, that are non-redundant generating sets for all association rules.
7th International Workshop on Knowledge Representation meets Databases - KRDB'2000 | 2000
Gerd Stumme; Rafik Taouil; Yves Bastide; Nicolas Pasquier; Lotfi Lakhal
BDA'1999 international conference on Advanced Databases | 1999
Nicolas Pasquier; Yves Bastide; Rafik Taouil; Lotfi Lakhal