
Publication


Featured research published by Elie Prudhomme.


Quality Measures in Data Mining | 2007

Association Rule Interestingness: Measure and Statistical Validation

Stéphane Lallich; Olivier Teytaud; Elie Prudhomme

The search for interesting Boolean association rules is an important topic in knowledge discovery in databases. The set of admissible rules for the selected support and confidence thresholds can easily be extracted by algorithms based on support and confidence, such as Apriori. However, these algorithms may produce a large number of rules, many of which are uninteresting. One has to resolve a two-tier problem: choosing the measures best suited to the problem at hand, then validating the interesting rules against the selected measures. First, the usual measures suggested in the literature are reviewed, and criteria for assessing the qualities of these measures are proposed. Statistical validation of the most interesting rules requires performing a large number of tests, so controlling false discoveries (type I errors) is of prime importance. An original bootstrap-based validation method is proposed that controls, at a given level, the number of false discoveries. The value of this method for selecting interesting association rules is illustrated by several examples.
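As a rough illustration of the support and confidence measures mentioned in this abstract, the sketch below computes them, together with lift, for a single rule on a toy transaction list. The function name `rule_measures` and the sample data are assumptions made for this example; they do not come from the paper.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    transactions: list of frozensets of items (toy data, assumed for this sketch).
    """
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_b = sum(1 for t in transactions if consequent <= t)
    n_ab = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_ab / n                       # share of rows containing A and B
    confidence = n_ab / n_a if n_a else 0.0  # share of A rows that also contain B
    lift = confidence / (n_b / n) if n_b else 0.0  # confidence relative to P(B)
    return {"support": support, "confidence": confidence, "lift": lift}


transactions = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
]
measures = rule_measures(transactions, frozenset({"bread"}), frozenset({"butter"}))
print(measures)
```

An Apriori-style algorithm would keep only the rules whose support and confidence exceed the chosen thresholds; measures such as lift are one of many candidates for ranking what remains.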


Archive | 2006

Statistical inference and data mining: false discoveries control

Stéphane Lallich; Olivier Teytaud; Elie Prudhomme

Data mining is characterized by its ability to process large amounts of data. Among those are the data "features": variables, or association rules that can be derived from them. Selecting the most interesting features is a classical data mining problem. That selection requires a large number of tests, which give rise to a number of false discoveries. An original nonparametric control method is proposed in this paper. A new criterion, UAFWER, defined as the risk of exceeding a pre-set number of false discoveries, is controlled by BS FD, a bootstrap-based algorithm that can be used on one- or two-sided problems. The usefulness of the procedure is illustrated by the selection of differentially interesting association rules on genetic data.
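The general idea of choosing a rejection threshold by bootstrap, so that more than a pre-set number of false discoveries is unlikely, can be sketched as follows. This is a simplified, one-sided illustration based on column means and is not the authors' BS FD algorithm; the function `bootstrap_select`, its parameters (`v0`, `delta`, `n_boot`) and the simulated data are all assumptions made for this example.

```python
import random
from statistics import mean


def bootstrap_select(columns, v0=0, delta=0.05, n_boot=200, seed=0):
    """Select features whose observed mean exceeds a bootstrap threshold.

    columns: list of m lists, each holding n observations (rows are aligned).
    v0:      tolerated number of false discoveries (assumed parameter name).
    delta:   target risk of exceeding v0 false discoveries.
    """
    rng = random.Random(seed)
    n = len(columns[0])
    observed = [mean(col) for col in columns]

    exceed = []
    for _ in range(n_boot):
        rows = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        # Centre each bootstrap statistic on its observed value to mimic the null.
        centered = sorted(
            (mean(col[i] for i in rows) - obs for col, obs in zip(columns, observed)),
            reverse=True,
        )
        exceed.append(centered[v0])  # (v0 + 1)-th largest centred statistic

    exceed.sort()
    threshold = exceed[min(n_boot - 1, int((1 - delta) * n_boot))]
    selected = [j for j, obs in enumerate(observed) if obs > threshold]
    return selected, threshold


# Simulated data: feature 0 has a real effect, the other nine are pure noise.
data_rng = random.Random(1)
columns = [[data_rng.gauss(5.0, 1.0) for _ in range(50)]]
columns += [[data_rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(9)]

selected, threshold = bootstrap_select(columns)
print(selected, threshold)
```

Resampling whole rows keeps the dependence between features intact, which is one reason a bootstrap threshold can be less conservative than a Bonferroni-style correction.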


International Symposium on Methodologies for Intelligent Systems | 2008

Maps ensemble for semi-supervised learning of large high dimensional datasets

Elie Prudhomme; Stéphane Lallich

In many practical cases, only a few labels are available for the data. Algorithms must then take advantage of the unlabeled data to learn efficiently. This type of learning is called semi-supervised learning (SSL). In this article, we propose a methodology suited to both the representation and the prediction of large datasets in that situation. To that end, groups of non-correlated attributes are created in order to overcome problems related to high-dimensional spaces. An ensemble is then set up in which each group is learned by a self-organizing map (SOM). Besides prediction, these maps also aim to provide a relevant representation of the data that can be used in semi-supervised learning. Finally, the prediction is obtained by a vote of the different maps. Experiments are performed in both supervised and semi-supervised learning. They show the relevance of this approach.
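The final voting step described in this abstract can be illustrated with a plain majority vote. The sketch below covers only that step: the per-map predictions are hypothetical placeholders, training the self-organizing maps is not shown, and the tie-breaking rule is an assumption rather than something stated in the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the ensemble members' predictions."""
    counts = Counter(predictions)
    best = max(counts.values())
    # Assumed tie-break: pick the smallest label among the most frequent ones.
    return min(label for label, c in counts.items() if c == best)


# Hypothetical outputs of three maps (one per attribute group) on three examples.
votes_per_example = [
    ["a", "a", "b"],
    ["b", "c", "b"],
    ["a", "b", "c"],
]
final = [majority_vote(votes) for votes in votes_per_example]
print(final)  # ['a', 'b', 'a']
```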


EGC (Ateliers) | 2005

Mesurer l'intérêt des règles d'association.

Benoît Vaillant; Patrick Meyer; Elie Prudhomme; Stéphane Lallich; Philippe Lenca; Sébastien Bigaret


DMIN | 2008

Optimization of Self-Organizing Maps Ensemble in Prediction.

Elie Prudhomme; Stéphane Lallich


EGC | 2007

Ensemble prédicteur fondé sur les cartes auto-organisatrices adapté aux données volumineuses.

Elie Prudhomme; Stéphane Lallich


AAFD | 2008

Représentation des données par un comité de cartes auto-organisatrices : une application aux données bruitées.

Elie Prudhomme; Stéphane Lallich


International Workshop on Intelligent Information Access --- IIIA 2006 | 2005

Quasi-Random resamplings, with applications to rule-sampling, cross-validation and (su-)bagging

Olivier Teytaud; Sylvain Gelly; Stéphane Lallich; Elie Prudhomme


EGC | 2005

Validation statistique des cartes de Kohonen en apprentissage supervisé.

Elie Prudhomme; Stéphane Lallich


Archive | 2004

Contrôle du risque multiple en sélection de règles d'association significatives

Stéphane Lallich; Elie Prudhomme; Olivier Teytaud

Collaboration



Top Co-Authors


Philippe Lenca

Institut Mines-Télécom


Benoît Vaillant

Centre national de la recherche scientifique
