Elie Prudhomme
University of Lyon
Publications
Featured research published by Elie Prudhomme.
Quality Measures in Data Mining | 2007
Stéphane Lallich; Olivier Teytaud; Elie Prudhomme
The search for interesting Boolean association rules is an important topic in knowledge discovery in databases. The set of admissible rules for the selected support and confidence thresholds can easily be extracted by algorithms based on support and confidence, such as Apriori. However, these algorithms may produce a large number of rules, many of which are uninteresting. One has to resolve a two-tier problem: choosing the measures best suited to the problem at hand, then validating the interesting rules against the selected measures. First, the usual measures suggested in the literature will be reviewed, and criteria for assessing the qualities of these measures will be proposed. Statistical validation of the most interesting rules requires performing a large number of tests. Thus, controlling for false discoveries (type I errors) is of prime importance. An original bootstrap-based validation method is proposed which controls, for a given level, the number of false discoveries. The value of this method for the selection of interesting association rules will be illustrated by several examples.
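As a rough illustration of the support and confidence thresholds mentioned in this abstract, the sketch below computes both measures for a single rule on a toy transaction list. The data and the helper names (`support`, `confidence`, `admissible`) are assumptions made for this example; it does not reproduce Apriori or the bootstrap-based validation method of the paper.

```python
# Toy transactions (illustrative data, not taken from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): supp(A and C) / supp(A)."""
    supp_a = support(antecedent, transactions)
    if supp_a == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / supp_a


def admissible(antecedent, consequent, transactions, min_supp, min_conf):
    """True if the rule antecedent -> consequent meets both thresholds."""
    both = set(antecedent) | set(consequent)
    return (
        support(both, transactions) >= min_supp
        and confidence(antecedent, consequent, transactions) >= min_conf
    )


rule_supp = support({"bread", "milk"}, transactions)
rule_conf = confidence({"bread"}, {"milk"}, transactions)
```

Here the rule bread -> milk has support 2/4 and confidence 2/3, so it is admissible for thresholds such as `min_supp=0.5, min_conf=0.6` but not for `min_conf=0.7`.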
Archive | 2006
Stéphane Lallich; Olivier Teytaud; Elie Prudhomme
Data mining is characterized by its ability to process large amounts of data. Among those data are the “features”: variables, or association rules that can be derived from them. Selecting the most interesting features is a classical data mining problem. That selection requires a large number of tests, which give rise to a number of false discoveries. An original nonparametric control method is proposed in this paper. A new criterion, UAFWER, defined as the risk of exceeding a pre-set number of false discoveries, is controlled by BS FD, a bootstrap-based algorithm that can be used on one- or two-sided problems. The usefulness of the procedure is illustrated by the selection of differentially interesting association rules on genetic data.
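To make the criterion concrete: if V is the number of false discoveries and V0 the pre-set tolerated number, the abstract's definition reads as the probability that V exceeds V0. The sketch below only estimates that probability empirically from replicate counts, as one could in a simulation where the true null hypotheses are known. The function name, the variable `v0`, and the counts are assumptions for illustration; this is not the BS FD algorithm.

```python
def empirical_uafwer(false_discovery_counts, v0):
    """Share of replicates whose false-discovery count exceeds v0."""
    if not false_discovery_counts:
        raise ValueError("need at least one replicate")
    exceed = sum(1 for v in false_discovery_counts if v > v0)
    return exceed / len(false_discovery_counts)


# Hypothetical false-discovery counts from 10 simulated replicates.
counts = [0, 1, 0, 3, 2, 0, 1, 5, 0, 1]
risk = empirical_uafwer(counts, v0=2)  # 2 of 10 replicates exceed v0
```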
International Symposium on Methodologies for Intelligent Systems | 2008
Elie Prudhomme; Stéphane Lallich
In many practical cases, only a few labels are available on the data. Algorithms must then take advantage of the unlabeled data to ensure efficient learning. This type of learning is called semi-supervised learning (SSL). In this article, we propose a methodology adapted to both the representation and the prediction of large datasets in that situation. For that purpose, groups of non-correlated attributes are created in order to overcome problems related to high-dimensional spaces. An ensemble is then set up to learn each group with a self-organizing map (SOM). Besides prediction, these maps also aim to provide a relevant representation of the data, which could be used in semi-supervised learning. Finally, the prediction is achieved by a vote of the different maps. Experiments are performed in both supervised and semi-supervised learning. They show the relevance of this approach.
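The ensemble structure described in this abstract (one learner per attribute group, then a vote) can be sketched as follows. This is a simplified illustration under stated assumptions: a nearest-centroid classifier stands in for the self-organizing map, the attribute groups are given by hand instead of being built from correlations, and only the supervised case is shown. All names and data are hypothetical.

```python
from collections import Counter


def fit_centroids(rows, labels, columns):
    """Mean vector per class, restricted to the given attribute columns."""
    sums, counts = {}, Counter()
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(columns))
        for i, c in enumerate(columns):
            acc[i] += row[c]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict_one(centroids, row, columns):
    """Label of the closest centroid (squared Euclidean distance)."""
    def dist(center):
        return sum((row[c] - m) ** 2 for c, m in zip(columns, center))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def ensemble_predict(rows, labels, groups, row):
    """Train one learner per attribute group, then take a majority vote."""
    votes = [predict_one(fit_centroids(rows, labels, g), row, g) for g in groups]
    return Counter(votes).most_common(1)[0][0]


# Toy labeled data with four attributes (illustrative only).
rows = [
    [0.0, 0.1, 0.0, 5.0],
    [0.2, 0.0, 0.1, 4.0],
    [1.0, 0.9, 1.0, 5.0],
    [0.9, 1.0, 0.8, 4.0],
]
labels = ["a", "a", "b", "b"]
groups = [[0], [1], [2, 3]]  # hand-picked attribute groups (assumption)
prediction = ensemble_predict(rows, labels, groups, [0.1, 0.2, 0.9, 4.5])
```

On this toy input, the first two groups vote "a" and the third votes "b", so the majority vote returns "a".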
EGC (Ateliers) | 2005
Benoît Vaillant; Patrick Meyer; Elie Prudhomme; Stéphane Lallich; Philippe Lenca; Sébastien Bigaret
DMIN | 2008
Elie Prudhomme; Stéphane Lallich
EGC | 2007
Elie Prudhomme; Stéphane Lallich
AAFD | 2008
Elie Prudhomme; Stéphane Lallich
International Workshop on Intelligent Information Access --- IIIA 2006 | 2005
Olivier Teytaud; Sylvain Gelly; Stéphane Lallich; Elie Prudhomme
EGC | 2005
Elie Prudhomme; Stéphane Lallich
Archive | 2004
Stéphane Lallich; Elie Prudhomme; Olivier Teytaud