M. do Carmo Nicoletti
Federal University of São Carlos
Publication
Featured research published by M. do Carmo Nicoletti.
hybrid intelligent systems | 2008
Estevam R. Hruschka; M. do Carmo Nicoletti; Vilma A. Oliveira; Gláucia M. Bressan
A Bayesian network (BN) is a formalism for representing and reasoning about uncertain domains. In BNs, knowledge is represented by a combination of a graph-based structure and probability theory. A particular type of BN known as a Bayesian Classifier (BC) aims at classifying a given instance into a discrete class. BCs have been used intensively for knowledge modeling in many different applications and have been the focus of many works related to data mining. Data mining tasks are usually applied to real domains with a large number of variables. In such domains, the classifiers tend to be large and complex and consequently are not easily understood by human beings. This paper proposes an approach for promoting the understandability of the knowledge represented by a BC by translating it into a more convenient and easily understandable form of representation, that of classification rules. The proposed method, named BayesRule, uses the concept of the Markov blanket to obtain a reduced set of rules with respect to both the number of rules and the number of conditions in the antecedent of a rule. Experiments using seven knowledge domains show that the reduced set of rules extracted from a BC can be smaller and still maintain the BC classification accuracy.
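To make the Markov blanket idea concrete: the blanket of a node consists of its parents, its children, and the other parents of its children, and given those variables the node is conditionally independent of the rest of the network. The sketch below only computes that set for a toy graph; it is not the BayesRule procedure itself, and the network, the variable names, and the `markov_blanket` helper are all hypothetical.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], node: str) -> Set[str]:
    """Return the parents, children, and children's other parents of `node`.

    `parents` maps every variable to the list of its parent variables.
    """
    blanket = set(parents.get(node, []))      # parents of the node
    for child, child_parents in parents.items():
        if node in child_parents:
            blanket.add(child)                # a child of the node
            blanket.update(child_parents)     # the child's other parents
    blanket.discard(node)                     # the node is not in its own blanket
    return blanket


# Hypothetical toy network: A -> Class -> B <- C, and B -> D.
toy_parents = {
    "A": [],
    "C": [],
    "Class": ["A"],
    "B": ["Class", "C"],
    "D": ["B"],
}

class_blanket = markov_blanket(toy_parents, "Class")
print(sorted(class_blanket))  # D is outside the blanket of Class
```

In this toy network only A, B and C would need to appear in rule antecedents for Class, which is the intuition behind a reduced number of conditions per rule.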
systems, man and cybernetics | 2005
D.M. Santoro; E.R. Hruschka; M. do Carmo Nicoletti
As a step preceding the induction of classifiers by machine learning (ML), attribute subset selection methods have become an efficient way of reducing the dimensionality of the search space, with obvious benefits to the learning techniques used. This paper investigates the problem of feature subset selection using a committee of filter, wrapper and embedded methods. The wrappers were implemented using two different search mechanisms, a genetic algorithm and a best-first procedure, as well as three different machine learning paradigms: instance-based (nearest neighbor, NN), neural network (DistAl) and symbolic (C4.5). The two filter methods used are based on consistency and correlation measures. The goals of the experiments were to identify the most suitable attribute subsets for inducing a classifier and to investigate whether the combination of the different results given by the committee members can outperform any machine learning method using the original training set.
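One simple way to combine the outputs of several selectors is to keep every attribute chosen by at least a given number of committee members. The abstract does not say how the committee results were combined, so the voting rule, the attribute names, and the `committee_vote` helper below are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set


def committee_vote(subsets: Iterable[Set[str]], min_votes: int) -> List[str]:
    """Keep the attributes selected by at least `min_votes` committee members."""
    votes = Counter(attr for subset in subsets for attr in subset)
    return sorted(attr for attr, count in votes.items() if count >= min_votes)


# Hypothetical outputs of three selectors (e.g. a filter and two wrappers).
selector_outputs = [
    {"age", "income"},
    {"income", "height"},
    {"age", "income", "zip"},
]

selected = committee_vote(selector_outputs, min_votes=2)
print(selected)
```

Raising `min_votes` makes the committee more conservative; setting it to 1 yields the union of all subsets.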
international conference hybrid intelligent systems | 2007
Estevam R. Hruschka; M. do Carmo Nicoletti; V.A. de Oliveira; Gláucia M. Bressan
A Bayesian network (BN) is a formalism for representing and reasoning about uncertain domains. In a BN, knowledge is represented by a combination of a graph-based structure and probability theory. A particular type of BN known as a Bayesian Classifier (BC) aims at classifying a given instance into a discrete class. BCs have been used extensively for modeling knowledge in many different applications and have been the focus of many works related to data mining. Depending on the size of a BC, understanding the knowledge it represents is not an easy task. This paper proposes an approach to help the process of understanding the knowledge represented by a BC by translating it into a more convenient and easily understandable form of representation, that of classification rules. The proposed method, named BayesRule (BR), uses the concept of the Markov Blanket (MB) to obtain a reduced set of rules with respect to both the number of rules and the number of antecedents in rules. Experiments using the ALARM network showed that the reduced set of rules extracted from the BC can be smaller than the set of rules representing a decision tree generated by C4.5 and still maintain a high accuracy rate.
congress on evolutionary computation | 2007
E.B. dos Santos; Estevam R. Hruschka; M. do Carmo Nicoletti
This work proposes a genetic strategy for learning a Bayesian classifier using an algorithm based on conditional independence and the information given by a variable ordering. The strategy has been implemented as the system VOGAC-PC. The paper presents and analyses the results of experiments in various domains using VOGAC-PC as well as a previous system, named VOGA-K2, based on algorithm K2.
intelligent systems design and applications | 2007
Marcos Evandro Cintra; H. de Arruda Camargo; Estevam R. Hruschka; M. do Carmo Nicoletti
The definition of the fuzzy rule base is one of the most important and difficult tasks when designing fuzzy systems. This paper discusses the results of two different hybrid methods, investigated earlier, for the automatic generation of fuzzy rules from numerical data. One of the methods creates fuzzy rule bases using genetic algorithms in association with a heuristic for preselecting candidate rules. The other, named Bayes fuzzy, induces a Bayes classifier using a dataset previously granulated by fuzzy partitions and then translates this classifier into a fuzzy rule base. A comparative analysis of both approaches is carried out, focusing on their main characteristics, strengths and weaknesses, and ease of use. The reliability of both methods is also compared by analyzing their results in a few knowledge domains.
ieee international conference on fuzzy systems | 2007
Estevam R. Hruschka; H. de Arruda Camargo; Marcos Evandro Cintra; M. do Carmo Nicoletti
Traditional algorithms for learning Bayesian classifiers (BCs) from data are known to induce accurate classification models. However, when using these algorithms, two main concerns should be considered: i) they require qualitative data and ii) the induced models are generally not easily comprehensible by human beings. This paper addresses both issues by proposing a hybrid method named BayesFuzzy that learns from quantitative data and induces a fuzzy rule-based model that enhances comprehensibility. BayesFuzzy has been implemented as an automatic system that combines a fuzzy strategy, for transforming numerical data into qualitative information, with a Bayes-based approach for inducing rules. Promising empirical results of the use of the BayesFuzzy system in four knowledge domains are presented and discussed.
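The step of transforming numerical data into qualitative information can be illustrated with a fuzzy partition: each numeric value receives a membership degree in several linguistic terms and can then be replaced by the term with the highest degree. The sketch below assumes evenly spaced triangular membership functions and hypothetical labels; the abstract does not state which partitions BayesFuzzy actually uses.

```python
from typing import Dict, List, Tuple


def granulate(x: float, lo: float, hi: float,
              labels: List[str]) -> Tuple[str, Dict[str, float]]:
    """Map a numeric value to the linguistic label with the highest membership.

    Assumes at least two labels whose triangular membership functions have
    peaks evenly spaced over [lo, hi], each overlapping its neighbours.
    """
    step = (hi - lo) / (len(labels) - 1)
    memberships = {}
    for i, label in enumerate(labels):
        peak = lo + i * step
        # Membership falls linearly from 1 at the peak to 0 one step away.
        memberships[label] = max(0.0, 1.0 - abs(x - peak) / step)
    best = max(memberships, key=memberships.get)
    return best, memberships


# Hypothetical attribute ranging over [0, 10] with three linguistic terms.
label, degrees = granulate(2.0, 0.0, 10.0, ["low", "medium", "high"])
print(label, degrees)
```

A value of 2.0 belongs mostly to "low" and partly to "medium", so a crisp learner that needs qualitative data would see it as "low".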
ieee international conference on fuzzy systems | 2000
Arthur Ramer; B. Bouchon-Meunier; M. do Carmo Nicoletti; C. Marsala; M. Rifqi
The standard model of fuzzy computations is based on the extension principle. It is known to work well, in practice, only for continuous fuzzy numbers, while producing unintuitive results when one or more arguments are discrete. It is also computationally cumbersome for all but linear operations. Another model was proposed for trapezoidal numbers only. Its operations amount to computing on the four vertices of the trapezoids and then spanning a new trapezoid on the four resulting vertices. It is efficient, but produces fairly crude approximations for curvilinear fuzzy numbers; moreover, it is not applicable when discrete arguments are present. A model based on approximating fuzzy numbers, whether continuous or discrete, by multitrapezoidal curves and then performing coordinate-wise computations was first proposed by Ramer. It was applied to economic decision problems by his doctoral student James Wang. In this paper we place this computational method in the context of fuzzy interpolation. We show how interpolation can bring quite disparate arguments into a standardized form, thus permitting efficient computations and avoiding unintuitive results. Here we use the model of multiple trapezoids, but other classes of curves can be considered.
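The single-trapezoid model described above can be sketched in a few lines: a fuzzy number is stored as four vertices, and an operation is applied vertex by vertex before spanning a new trapezoid on the results. The `Trapezoid` type and function names are hypothetical; the multitrapezoidal and interpolation-based method of the paper is not reproduced here.

```python
from typing import NamedTuple


class Trapezoid(NamedTuple):
    """Trapezoidal fuzzy number with support [a, d] and core [b, c]."""
    a: float
    b: float
    c: float
    d: float


def add(p: Trapezoid, q: Trapezoid) -> Trapezoid:
    """Vertex-wise addition of two trapezoidal fuzzy numbers."""
    return Trapezoid(p.a + q.a, p.b + q.b, p.c + q.c, p.d + q.d)


def membership(t: Trapezoid, x: float) -> float:
    """Membership degree of x in the trapezoidal fuzzy number t."""
    if x < t.a or x > t.d:
        return 0.0
    if t.b <= x <= t.c:
        return 1.0
    if x < t.b:
        return (x - t.a) / (t.b - t.a)   # rising edge
    return (t.d - x) / (t.d - t.c)       # falling edge


about_two_to_three = Trapezoid(1.0, 2.0, 3.0, 4.0)
about_one = Trapezoid(0.0, 1.0, 1.0, 2.0)   # triangular: core is a single point
total = add(about_two_to_three, about_one)
print(total, membership(total, 2.0))
```

For addition the vertex-wise result is exact; for nonlinear operations the same recipe yields only the crude approximation that the abstract mentions.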
ieee international conference on fuzzy systems | 2003
F.O.S. Sa Lisboa; M. do Carmo Nicoletti; Arthur Ramer
The NGE model (implemented by EACH) is an incremental form of inductive learning from examples that generalizes a given training set into hypotheses represented as a set of hyperrectangles in an n-dimensional Euclidean space. The NGE algorithm can be considered a descendant of either the NN or the KNN algorithm. This paper focuses on a fuzzy version of the NGE algorithm, aiming at comparing its performance with fuzzy versions of the NN and KNN algorithms.
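What distinguishes hyperrectangle-based learning from plain NN is the distance computation: a query is compared with axis-parallel boxes rather than with single points, and the distance is zero when the query falls inside a box. The sketch below assumes plain unweighted Euclidean distance and hypothetical names; it leaves out the feature and exemplar weights of NGE as well as the fuzzy extension studied in the paper.

```python
import math
from typing import Sequence


def distance_to_hyperrectangle(point: Sequence[float],
                               lower: Sequence[float],
                               upper: Sequence[float]) -> float:
    """Euclidean distance from a point to an axis-parallel hyperrectangle.

    `lower` and `upper` hold the per-dimension bounds of the rectangle.
    The distance is 0.0 when the point lies inside or on the boundary.
    """
    total = 0.0
    for x, lo, hi in zip(point, lower, upper):
        if x < lo:
            gap = lo - x      # point lies below the rectangle in this dimension
        elif x > hi:
            gap = x - hi      # point lies above the rectangle in this dimension
        else:
            gap = 0.0         # within the bounds in this dimension
        total += gap * gap
    return math.sqrt(total)


# Hypothetical rectangle covering [0, 2] x [0, 2].
inside = distance_to_hyperrectangle((1.0, 1.0), (0.0, 0.0), (2.0, 2.0))
outside = distance_to_hyperrectangle((5.0, 6.0), (0.0, 0.0), (2.0, 2.0))
print(inside, outside)
```

When lower and upper bounds coincide, the rectangle degenerates to a point and the function reduces to the ordinary NN distance.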
hybrid intelligent systems | 2014
Edimilson Batista dos Santos; Estevam R. Hruschka; M. do Carmo Nicoletti; Nelson F. F. Ebecken
Variable Ordering (VO) plays an important role when inducing Bayesian Networks (BNs) and Bayesian Classifiers (BCs). Previous works in the literature suggest that it is worth pursuing the use of genetic/evolutionary algorithms for identifying a suitable VO when learning a BN structure from data. This paper proposes a collaborative Evolutionary-Bayes algorithm named VOEA (Variable Ordering Evolutionary Algorithm) aimed at inducing BCs from data. The two VOEA versions presented in the paper refine a previously proposed algorithm named VOGA by employing only a single evolutionary operator (either crossover or mutation) as well as by using information about the class variable when defining the most suitable variable ordering for learning a BC. Experiments performed on a number of datasets revealed that the VOEA approach is promising and tends to generate suitable and representative BCs, particularly in its version VOEA_M, which only implements the mutation operator.
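A variable ordering is a permutation, so a mutation operator has to produce another valid permutation. The sketch below uses swap mutation, a common choice for permutation encodings; the abstract does not specify which mutation VOEA_M applies, so the operator, the variable names, and the `swap_mutation` helper are assumptions.

```python
import random
from typing import List


def swap_mutation(ordering: List[str], rng: random.Random) -> List[str]:
    """Return a copy of `ordering` with two distinct positions exchanged.

    Requires at least two variables in the ordering.
    """
    mutated = list(ordering)
    i, j = rng.sample(range(len(mutated)), 2)   # two distinct indices
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(0)                          # seeded for repeatability
parent = ["X1", "X2", "X3", "X4", "Class"]      # hypothetical variable ordering
child = swap_mutation(parent, rng)
print(child)
```

Because the operator only exchanges positions, every offspring is still a complete ordering and can be passed directly to an ordering-dependent structure learner such as K2.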
world congress on computational intelligence | 2008
Luciana Montera; M. do Carmo Nicoletti; F.H. da Silva; Pablo Moscato
The DNA shuffling process has been successfully used in many experiments of Directed Molecular Evolution. In a shuffling experiment, genes are recombined by an iterative procedure of PCR cycles aiming at obtaining new genes, hopefully with some of the original functions improved. The optimization of the parameters involved in the process, as well as the characteristics of the parental sequences, are of extreme importance to the success of a shuffling experiment. This paper proposes a new measure, based on the number of bases between existing mutations in the parental sequences, for evaluating the suitability of two sequences to be submitted to a DNA shuffling experiment. In order to investigate the usefulness of the proposed mutation-based measure versus two commonly used measures, a family of 37 DNA gene sequences coding for snake venom metallopeptidases was evaluated using the three measures. The parental sequences identified by each of the three measures were validated by simulating the DNA shuffling process using the software eShuffle. The eShuffle results illustrate the benefits of the mutation-based measure proposed in this paper.