Benoît Vaillant
Centre national de la recherche scientifique
Publication
Featured research published by Benoît Vaillant.
European Journal of Operational Research | 2008
Philippe Lenca; Patrick Meyer; Benoît Vaillant; Stéphane Lallich
Data mining algorithms, especially those used for unsupervised learning, generate a large quantity of rules. In particular this applies to the Apriori family of algorithms for the determination of association rules. It is hence impossible for an expert in the field being mined to assess all of these rules. To help carry out the task, many measures which evaluate the interestingness of rules have been developed. They make it possible to automatically filter and sort a set of rules with respect to given goals. Since these measures may produce different results, and as experts have different understandings of what a good rule is, we propose in this article a new direction to select the best rules: a two-step solution to the problem of the recommendation of one or more user-adapted interestingness measures. First, a description of interestingness measures, based on meaningful classical properties, is given. Second, a multicriteria decision aid process is applied to this analysis and illustrates the benefit that a user, who is not a data mining expert, can achieve with such methods.
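To make the point about measures producing different results concrete, here is a minimal sketch that is not taken from the article. It uses three standard measures (support, confidence and lift); the toy transactions, the two rules and the helper `rank_rules` are assumptions made up for illustration only.

```python
from typing import Callable, FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Rule = Tuple[Itemset, Itemset]  # (antecedent A, consequent B)


def support(itemset: Itemset, transactions: List[Itemset]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(a: Itemset, b: Itemset, transactions: List[Itemset]) -> float:
    """Estimate of P(B | A): support(A and B) / support(A)."""
    return support(a | b, transactions) / support(a, transactions)


def lift(a: Itemset, b: Itemset, transactions: List[Itemset]) -> float:
    """Confidence divided by support(B); 1.0 corresponds to independence."""
    return confidence(a, b, transactions) / support(b, transactions)


def rank_rules(
    rules: List[Rule],
    measure: Callable[[Itemset, Itemset, List[Itemset]], float],
    transactions: List[Itemset],
) -> List[Tuple[Rule, float]]:
    """Hypothetical helper: score each rule and sort by decreasing score."""
    scored = [(rule, measure(rule[0], rule[1], transactions)) for rule in rules]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (invented for this sketch): milk is very frequent, honey is rare.
transactions = [frozenset(t) for t in (
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"},
    {"milk"}, {"milk"}, {"milk", "tea", "honey"}, {"tea", "honey"},
    {"tea"}, {"bread"},
)]
bread_milk: Rule = (frozenset({"bread"}), frozenset({"milk"}))
tea_honey: Rule = (frozenset({"tea"}), frozenset({"honey"}))
rules = [bread_milk, tea_honey]

by_confidence = rank_rules(rules, confidence, transactions)
by_lift = rank_rules(rules, lift, transactions)
print("best by confidence:", by_confidence[0])
print("best by lift:", by_lift[0])
```

On this toy data, confidence puts bread → milk first while lift puts tea → honey first, because lift discounts consequents that are frequent anyway. This is the kind of disagreement that motivates choosing a measure suited to the user.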
Discovery Science | 2004
Benoît Vaillant; Philippe Lenca; Stéphane Lallich
It is a common issue that KDD processes may generate a large number of patterns depending on the algorithm used and its parameters. It is hence impossible for an expert to assess all of these patterns. This may be the case with the well-known Apriori algorithm. One of the methods used to cope with such an amount of output depends on the use of interestingness measures. Stating that selecting interesting rules also means using an adapted measure, we present an experimental study of the behaviour of 20 measures on 10 datasets. This study is compared to a previous analysis of formal and meaningful properties of the measures, by means of two clusterings. One of the goals of this study is to enhance our previous approach. Both approaches seem to be complementary and could be profitable for the problem of a user's choice of a measure.
Quality Measures in Data Mining | 2007
Philippe Lenca; Benoît Vaillant; Patrick Meyer; Stéphane Lallich
It is a common problem that KDD processes may generate a large number of patterns depending on the algorithm used and its parameters. It is hence impossible for an expert to assess these patterns. This is the case with the well-known Apriori algorithm. One of the methods used to cope with such an amount of output depends on using association rule interestingness measures. Stating that selecting interesting rules also means using an adapted measure, we present a formal and an experimental study of 20 measures. The experimental studies carried out on 10 data sets lead to an experimental classification of the measures. This study is compared to an analysis of the formal and meaningful properties of the measures. Finally, the properties are used in a multi-criteria decision analysis in order to select amongst the available measures the one or those that best take into account the user’s needs. These approaches seem to be complementary and could be useful in solving the problem of a user’s choice of measure.
Communications in Statistics - Theory and Methods | 2010
Philippe Lenca; Stéphane Lallich; Benoît Vaillant
In supervised learning, especially in decision tree induction, many measures are based on the concept of entropy. A major characteristic of entropies is that they take their maximal value when the distribution of the modalities of the class variable is uniform. To deal with the case when the a priori frequencies of the class variable modalities are imbalanced, we propose an off-centered entropy which takes its maximum value for a distribution fixed by the user. This distribution can be the overall distribution of the class variable or a distribution taking into account the costs of misclassification. Other authors have proposed an asymmetric entropy. In this article, we propose an adaptive strategy of induction of decision trees based on these non-centered entropies. The first experiments on ten databases using the C4.5 induction tree show that both non-centered entropies are promising in comparison with the classical Shannon entropy.
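As an illustration of an entropy whose maximum sits at a user-chosen distribution, here is a minimal two-class sketch. The piecewise-linear rescaling used here is an assumption chosen for this example and should not be read as the exact construction of the article; the names `theta` and `off_centered_entropy` are likewise illustrative.

```python
import math


def shannon_entropy(p: float) -> float:
    """Binary Shannon entropy in bits, with 0 * log(0) taken as 0."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


def off_centered_entropy(p: float, theta: float) -> float:
    """Binary entropy that peaks at p == theta instead of p == 0.5.

    Assumed construction: map [0, theta] linearly onto [0, 0.5] and
    [theta, 1] linearly onto [0.5, 1], then apply Shannon entropy.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p <= theta:
        rescaled = p / (2.0 * theta)
    else:
        rescaled = (p + 1.0 - 2.0 * theta) / (2.0 * (1.0 - theta))
    return shannon_entropy(rescaled)


# With a minority class expected 10% of the time, a node holding exactly
# 10% minority examples is treated as maximally uncertain.
theta = 0.1
for p in (0.0, 0.05, 0.1, 0.5, 1.0):
    print(p, round(off_centered_entropy(p, theta), 4))
```

With `theta = 0.5` the rescaling is the identity, so the function reduces to the usual Shannon entropy. With an imbalanced `theta`, a balanced node (p = 0.5) is no longer the most uncertain one.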
Methodology and Computing in Applied Probability | 2007
Stéphane Lallich; Benoît Vaillant; Philippe Lenca
IEEE Conference on Cybernetics and Intelligent Systems | 2006
Philippe Lenca; Benoît Vaillant; Stéphane Lallich
Modeling Decisions for Artificial Intelligence | 2006
Jean-Pierre Barthélemy; Angélique Legrain; Philippe Lenca; Benoît Vaillant
One of the concerns of knowledge discovery in databases is the production of association rules. An association rule A → B defines a relationship between two sets of attributes A and B, characterising the data studied. Such a rule means that objects sharing attributes of A will “likely” have those contained in B. Yet, this notion of “likeliness” depends on the data mining context. Many interestingness measures have been introduced in order to quantify this likeliness. This panel of measures is heterogeneous and the ranking of extracted rules, according to measures, may differ widely. This contribution explores a new approach for assessing the quality of rules: aggregating valued relations. For each measure, a valued relation is built out of the numerical values it takes on the rules, and represents the preference of a rule over another. Using such tools takes into account the intensity of preference expressed by various measures, and should reduce incomparability issues related to differences in their co-domains. It also has the advantage of retaining the numerical nature of measures, unlike purely binary approaches. We studied several aggregation operators. In this contribution we discuss results obtained on a toy example using the simplest of them.
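The idea of aggregating valued relations can be sketched as follows. Everything specific here is an assumption made for illustration: the abstract does not say how the valued relation is derived from a measure or which aggregation operator is "the simplest", so this sketch uses a range-normalised positive score gap and an arithmetic mean.

```python
from typing import List

Relation = List[List[float]]


def valued_relation(scores: List[float]) -> Relation:
    """Build r[i][j] in [0, 1]: how strongly rule i is preferred to rule j.

    Assumed definition: the positive part of the score gap, divided by the
    observed range of the measure so that co-domains become comparable.
    """
    n = len(scores)
    span = max(scores) - min(scores)
    if span == 0.0:
        return [[0.0] * n for _ in range(n)]
    return [[max(0.0, (si - sj) / span) for sj in scores] for si in scores]


def aggregate_mean(relations: List[Relation]) -> Relation:
    """Aggregate several valued relations entry by entry (arithmetic mean)."""
    n = len(relations[0])
    k = len(relations)
    return [
        [sum(r[i][j] for r in relations) / k for j in range(n)]
        for i in range(n)
    ]


# Hypothetical scores of three rules under two measures with different ranges.
measure_1 = [0.9, 0.6, 0.3]   # e.g. a measure valued in [0, 1]
measure_2 = [1.0, 4.0, 2.0]   # e.g. an unbounded ratio-style measure

aggregated = aggregate_mean(
    [valued_relation(measure_1), valued_relation(measure_2)]
)
for row in aggregated:
    print([round(x, 3) for x in row])
```

Unlike a binary "i beats j" relation, each entry keeps the intensity of the preference, and the normalisation keeps a measure with a large numerical range from dominating the aggregate.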
Lecture Notes in Computer Science | 2006
Jean-Pierre Barthélemy; Angélique Legrain; Philippe Lenca; Benoît Vaillant
DMIN | 2006
Benoît Vaillant; Stéphane Lallich; Philippe Lenca
EGC (Ateliers) | 2005
Benoît Vaillant; Patrick Meyer; Elie Prudhomme; Stéphane Lallich; Philippe Lenca; Sébastien Bigaret