Simon Marcellin
University of Lyon
Publication
Featured research published by Simon Marcellin.
Intelligent Information Systems | 2010
Djamel A. Zighed; Gilbert Ritschard; Simon Marcellin
Many machine learning algorithms use an entropy measure as their optimization criterion. Among the widely used entropy measures, Shannon's is one of the most popular. In some real-world applications, using such entropy measures without precautions can lead to inconsistent results, because these measures rest on assumptions that are not fulfilled in many real cases. For instance, in supervised learning such as decision trees, the misclassification cost of each class is not explicitly taken into account in the tree-growing process; the costs are implicitly assumed to be equal for all classes. When the costs differ across classes, the maximum of the entropy should lie elsewhere than at the uniform probability distribution. Likewise, when the classes do not have the same a priori probability distribution, the worst case (the maximum of the entropy) should lie elsewhere than at the uniform distribution. In this paper, starting from real-world problems, we show that classical entropy measures are not suitable for building a predictive model. We then examine the main axioms that define an entropy and discuss their inadequacy for machine learning. This leads us to propose a new entropy measure with more suitable properties. Finally, we carry out evaluations on data sets that illustrate the performance of the new entropy measure.
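The shifted-maximum idea can be illustrated with a short sketch. The Python snippet below is a toy illustration under assumed class priors, not necessarily the exact measure the paper proposes: it contrasts Shannon entropy, whose peak sits at the uniform distribution, with a quadratic (Gini-style) entropy rescaled so that its peak sits at a chosen reference distribution theta.

```python
# Toy comparison: Shannon entropy peaks at the uniform distribution,
# while the rescaled quadratic entropy below peaks at a reference
# distribution theta (here, assumed 1:9 class priors).
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits; maximal when p is uniform."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

def shifted_max_entropy(p, theta):
    """Quadratic (Gini-style) entropy rescaled so each per-class term
    reaches its maximum (value 1) at p_i == theta_i instead of at
    p_i == 0.5. Illustrative only; the paper's measure may differ."""
    p = np.asarray(p, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.sum(p * (1 - p) / ((1 - 2 * theta) * p + theta ** 2))

theta = [0.1, 0.9]  # hypothetical imbalanced class priors
print(shifted_max_entropy([0.1, 0.9], theta))  # 2.0: the peak sits at theta
print(shifted_max_entropy([0.5, 0.5], theta))  # ~1.22: uniform is not the worst case
print(shannon_entropy([0.5, 0.5]))             # 1.0: Shannon's peak is at uniform
```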
International Symposium on Methodologies for Intelligent Systems | 2008
Simon Marcellin; Djamel A. Zighed; Gilbert Ritschard
We propose to evaluate the quality of decision trees grown on imbalanced datasets with a splitting criterion based on an asymmetric entropy measure. To deal with the class-imbalance problem in machine learning, especially with decision trees, several authors have proposed such asymmetric splitting criteria. After the tree is grown, a decision rule has to be assigned to each leaf. The classical Bayesian rule that selects the most frequent class is irrelevant when the dataset is strongly imbalanced; a better-suited assignment rule that takes the asymmetry into account must be adopted. But how can the resulting prediction model then be evaluated? The usual error rate is likewise irrelevant when the classes are strongly imbalanced, so appropriate evaluation measures are required. We consider ROC curves and recall/precision graphs for evaluating the performance of decision trees grown from imbalanced datasets, and we use these criteria to compare trees obtained with an asymmetric splitting criterion against those grown with a symmetric one. In this paper we consider only the two-class case.
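The evaluation protocol described here is easy to reproduce in outline. The sketch below uses scikit-learn with its default (symmetric) splitting criterion as a stand-in, since the paper's asymmetric criterion is not a library option; the synthetic dataset, tree depth, and imbalance ratio are illustrative assumptions.

```python
# Sketch of the evaluation protocol: on a strongly imbalanced two-class
# dataset, accuracy is uninformative (the majority-class rule already
# scores ~95%), so performance is read from ROC and precision/recall
# curves instead.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, precision_recall_curve, auc

# Assumed 5% positive class for illustration.
X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
scores = tree.predict_proba(X_te)[:, 1]  # leaf class frequencies as scores

print("ROC AUC:", roc_auc_score(y_te, scores))
precision, recall, _ = precision_recall_curve(y_te, scores)
print("PR  AUC:", auc(recall, precision))
```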
Archive | 2006
Simon Marcellin; Djamel Abdelkader Zighed; Gilbert Ritschard
EGC | 2007
Djamel A. Zighed; Simon Marcellin; Gilbert Ritschard
Archive | 2006
Simon Marcellin; Djamel A. Zighed; Gilbert Ritschard
Archive | 2007
Gilbert Ritschard; Djamel A. Zighed; Simon Marcellin
Archive | 2007
Gilbert Ritschard; Djamel Abdelkader Zighed; Simon Marcellin
Archive | 2009
Gilbert Ritschard; Simon Marcellin; Djamel Abdelkader Zighed
Analyse Statistique Implicative | 2009
Gilbert Ritschard; Simon Marcellin; Djamel Abdelkader Zighed
Archive | 2008
Simon Marcellin; Djamel Abdelkader Zighed; Gilbert Ritschard