Ricco Rakotomalala
University of Lyon
Publication
Featured research published by Ricco Rakotomalala.
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems | 1998
Djamel A. Zighed; Sabine Rabaséda; Ricco Rakotomalala
In induction graph methods such as C4.5 or SIPINA, taking continuous attributes into account requires specific discretization procedures. In this paper, we propose, on the one hand, an axiomatic framework leading to a set of criteria that can be used to discretize continuous attributes and, on the other hand, a discretization method called FUSINTER. The results obtained by FUSINTER are compared with those obtained by the techniques developed by Fayyad and Irani and by Kerber, and they proved better on the majority of the examples studied.
Review of Scientific Instruments | 2005
Frederic Clerc; Mourad Lengliz; David Farrusseng; C. Mirodatos; Silvia R. M. Pereira; Ricco Rakotomalala
This study reports a detailed investigation of catalyst library design by genetic algorithm (GA). A methodology for assessing GA configurations is described. Operators that improve optimization speed while remaining robust to noise and outliers are identified through statistical studies. The genetic algorithms were implemented in a GA software platform called OptiCat, which enables the construction of custom-made workflows from a toolbox of operators. Two separate studies were carried out: (i) on a virtual benchmark and (ii) on a real response surface derived from HT screening. Additionally, we report a methodology for modeling a complex response surface by binning the search space into small zones that are then independently modeled by linear regression. In contrast to artificial neural networks, this approach yields an explicit model in an analogical form that can be further used in Excel or entered in OptiCat to perform simulations. While speeding the implementation of a hybrid algorithm...
international conference on information and communication technologies | 2004
Faouzi Mhamdi; Mourad Elloumi; Ricco Rakotomalala
This study addresses the classification of proteins based on their primary structures. The protein sequences are collected in a file, and a text-mining technique is proposed for extracting features. An algorithm is developed that extracts all the n-grams present in the data file and produces a learning file. The algorithm supplies three files: a Boolean file, which records the presence or absence of each n-gram, a frequency file, and an occurrence file. Forward selection and backward elimination are then applied to obtain a learning file with an acceptable number of features.
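The n-gram extraction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the toy sequences, and the reading of "occurrences" as raw counts and "frequencies" as counts divided by the number of n-grams in the sequence are all assumptions.

```python
from collections import Counter


def extract_ngrams(sequence, n):
    """Return every overlapping n-gram of a sequence, in order."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def build_tables(sequences, n):
    """Build Boolean, occurrence and frequency tables, one row per sequence."""
    counts = [Counter(extract_ngrams(s, n)) for s in sequences]
    # Columns are the n-grams seen in at least one sequence, in sorted order.
    vocabulary = sorted(set().union(*counts))
    occurrences = [[c[g] for g in vocabulary] for c in counts]
    boolean = [[int(v > 0) for v in row] for row in occurrences]
    # Assumption: frequency = count / number of n-grams in that sequence.
    frequencies = [[v / max(sum(row), 1) for v in row] for row in occurrences]
    return vocabulary, boolean, occurrences, frequencies


# Hypothetical toy sequences, not taken from the paper.
sequences = ["MKVLA", "KVKV"]
vocabulary, boolean, occurrences, frequencies = build_tables(sequences, 2)
```

Each of the three tables can then be written out as a separate learning file, with the class label of each protein appended as a final column.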
european conference on principles of data mining and knowledge discovery | 2000
Stéphane Lallich; Ricco Rakotomalala
We propose a fast feature selection method in supervised learning for multi-valued attributes. The main idea is to rewrite the multi-valued problem in the space of examples as a Boolean problem in the space of pairs of examples. On the basis of this approach, we can use the point correlation coefficient, which is null in the case of conditional independence and satisfies a formula connecting partial coefficients with marginal coefficients. This property considerably reduces computing time, because a single pass over the database is enough to compute all the coefficients. We test our algorithm on benchmark databases.
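One possible reading of the pairwise recoding is sketched below. It assumes that each pair of examples is described by two Boolean indicators ("same attribute value" and "same class") and that the point correlation is the usual phi coefficient of the resulting 2x2 table. The function name is hypothetical, and the partial-coefficient formula mentioned in the abstract is not reproduced here.

```python
from collections import Counter
from math import comb, sqrt


def pairwise_phi(x, y):
    """Phi coefficient between 'same x value' and 'same class' over all pairs.

    Pair counts are derived from the contingency table of (x, y), so the
    pairs themselves are never enumerated.
    """
    total = comb(len(x), 2)  # number of unordered pairs of examples
    same_x = sum(comb(c, 2) for c in Counter(x).values())
    same_y = sum(comb(c, 2) for c in Counter(y).values())
    same_both = sum(comb(c, 2) for c in Counter(zip(x, y)).values())
    # Cells of the 2x2 table over pairs.
    a = same_both                              # same x, same class
    b = same_x - same_both                     # same x, different class
    c = same_y - same_both                     # different x, same class
    d = total - same_x - same_y + same_both    # different x, different class
    denom = sqrt(same_x * (total - same_x) * same_y * (total - same_y))
    return (a * d - b * c) / denom if denom else 0.0


# Hypothetical toy data: the attribute separates the two classes exactly.
score = pairwise_phi(["a", "a", "b", "b"], [0, 0, 1, 1])
```

Because only cell and margin counts are needed, the counts for every candidate attribute can be accumulated during one scan of the data.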
Information Sciences | 1996
Sabine Rabaséda; Ricco Rakotomalala; Marc Sebban
In graph-based induction methods such as C4.5 or SIPINA, taking continuous attributes into account requires specific discretization procedures. In this paper, we propose, on the one hand, an axiom leading to a set of criteria that can be used to discretize continuous attributes and, on the other hand, a discretization method called FUSINTER. The results obtained by FUSINTER are compared with those obtained by techniques developed by other researchers, and they proved better on the majority of the examples studied. We also discuss other approaches based on statistical concepts that rely either on inertia criteria or on nonparametric tests such as Mood's runs test.
Archive | 2001
Jean-Hugues Chauchat; Ricco Rakotomalala
We propose a fast and efficient sampling strategy to build decision trees from a very large database, even when there are many continuous attributes that must be discretized at each step. Successive samples are used, one at each tree node. After a brief description of two fast sequential simple random sampling methods, we apply elements of statistical theory to determine the sample size that is sufficient at each step to obtain a decision tree as efficient as one built on the whole database. Applying the method to a simulated database (of virtually infinite size) and to five standard benchmarks confirms that, when the database is large and contains many numerical attributes, our strategy of fast sampling at each node (with a sample size of about n = 300 or 500) speeds up the mining process while maintaining the accuracy of the classifier.
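The abstract does not name the two sequential sampling methods. As an illustration of the general idea only, the sketch below uses the classic selection-sampling scheme (Knuth's Algorithm S), which draws a simple random sample without replacement in one pass when the population size is known. The function name and parameters are hypothetical.

```python
import random


def selection_sample(records, population_size, sample_size, rng=random):
    """Draw a simple random sample without replacement in one sequential pass."""
    chosen = []
    remaining = population_size  # records not yet examined
    needed = sample_size         # records still to be selected
    for record in records:
        if needed == 0:
            break
        # Select the current record with probability needed / remaining.
        if rng.random() * remaining < needed:
            chosen.append(record)
            needed -= 1
        remaining -= 1
    return chosen


# Example: 300 records drawn from the 1000 records reaching a tree node.
node_sample = selection_sample(range(1000), 1000, 300, random.Random(0))
```

The sample keeps the original record order and always has exactly the requested size, provided the stated population size is correct and at least as large as the sample size.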
european conference on principles of data mining and knowledge discovery | 1999
Ricco Rakotomalala; Stéphane Lallich; S. Di Palma
This paper studies splitting criteria in decision trees from three original points of view. First, we propose a unified formalization for association measures based on entropy of type beta. This formalization includes popular measures such as the Gini index and Shannon entropy. Second, we generate artificial data from M-of-N concepts whose complexity and class distribution are controlled. Third, our experiments allow us to study the behavior of the measures on datasets of growing complexity. The results show that the differences in performance between measures, which are significant when there is no noise in the data, disappear as the level of noise increases.
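To illustrate how a single parameter can cover both measures, the sketch below assumes that "entropy of type beta" refers to Daróczy's definition, H_beta(p) = 2^(beta-1) / (2^(beta-1) - 1) * (1 - sum_i p_i^beta). Under that assumption, beta = 2 gives twice the Gini index and the limit beta -> 1 gives Shannon entropy in bits. The function name is hypothetical and the paper's own formalization may differ.

```python
from math import log2


def entropy_beta(probabilities, beta):
    """Daroczy entropy of type beta (beta > 0); beta = 1 is the Shannon limit."""
    if abs(beta - 1.0) < 1e-12:
        # Limit case: Shannon entropy in bits.
        return -sum(p * log2(p) for p in probabilities if p > 0)
    scale = 2 ** (beta - 1) / (2 ** (beta - 1) - 1)
    return scale * (1 - sum(p ** beta for p in probabilities))


distribution = [0.2, 0.8]  # hypothetical class distribution at a node
gini = 1 - sum(p ** 2 for p in distribution)
quadratic = entropy_beta(distribution, 2)  # equals 2 * gini
shannon = entropy_beta(distribution, 1)
```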
computer recognition systems | 2005
Faouzi Mhamdi; Ricco Rakotomalala; Mourad Elloumi
In this paper, a knowledge discovery framework is used for protein classification. The processing is carried out in three steps: feature extraction, feature ranking, and feature selection. Inspired by text mining results, we use n-gram descriptors in the first step; in the second step, the descriptors are ranked according to chi-2 statistical indices; and in the final step, we select the subset of descriptors that minimizes the prediction error rate of a k-nearest neighbor classifier. Experiments show that this framework gives good results: the dimensionality reduction is effective and improves classifier performance.
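The ranking step can be sketched as below, assuming the chi-2 index is the Pearson chi-square statistic of each descriptor-by-class contingency table. The function names and toy data are hypothetical, and the final k-nearest-neighbor selection step is not shown.

```python
from collections import Counter


def chi2_statistic(feature, labels):
    """Pearson chi-square statistic of the feature-by-class contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_total in feature_totals.items():
        for c, c_total in label_totals.items():
            expected = f_total * c_total / n  # count expected under independence
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


def rank_features(columns, labels):
    """Return feature names sorted by decreasing chi-square, plus the scores."""
    scores = {name: chi2_statistic(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical Boolean n-gram descriptors for six proteins in two classes.
labels = ["A", "A", "A", "B", "B", "B"]
columns = {
    "KV": [1, 1, 1, 0, 0, 0],
    "LA": [1, 0, 1, 0, 1, 0],
    "MK": [1, 1, 1, 1, 1, 1],
}
ranking, scores = rank_features(columns, labels)
```

A selection step would then evaluate growing prefixes of the ranking with the chosen classifier and keep the prefix with the lowest estimated error rate.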
european conference on principles of data mining and knowledge discovery | 2000
Jean-Hugues Chauchat; Ricco Rakotomalala; Didier Robert
This paper presents various balanced sampling strategies for building decision trees in order to target rare groups. A new coefficient for comparing the targeting performance of various learning strategies is introduced. A real-life application, targeting a specific group of bank customers for marketing actions, is described. Results show that local sampling on the nodes while constructing the tree requires only small samples to match the performance obtained by processing the complete database, with dramatically reduced computing times.
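The abstract does not detail the sampling strategies. One common reading of "balanced" is an equal-size draw from each class, sketched below under that assumption. The function name, parameters, and toy data are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(rows, label_of, per_class, rng=random):
    """Draw up to per_class rows from each class, without replacement."""
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    sample = []
    for label in sorted(groups):
        members = groups[label]
        # A class smaller than per_class is taken in full.
        sample.extend(rng.sample(members, min(per_class, len(members))))
    return sample


# Hypothetical customer base with a rare target group (5%).
rows = [("rare", i) for i in range(5)] + [("common", i) for i in range(95)]
sample = balanced_sample(rows, lambda row: row[0], 5, random.Random(1))
```

Applied locally, the same draw would be repeated on the records reaching each node as the tree grows, rather than once on the whole database.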
Archive | 2000
Djamel Abdelkader Zighed; Ricco Rakotomalala