Márcio P. Basgalupp | Researchain

Publication

Featured researches published by Márcio P. Basgalupp.

Systems, Man, and Cybernetics | 2012

A Survey of Evolutionary Algorithms for Decision-Tree Induction

Rodrigo C. Barros; Márcio P. Basgalupp; A. de Carvalho; Alex Alves Freitas

This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.

Information Sciences | 2011

Evolutionary model trees for handling continuous classes in machine learning

Rodrigo C. Barros; Duncan D. Ruiz; Márcio P. Basgalupp

Model trees are a particular case of decision trees employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end-user to gain more confidence in the prediction and providing the basis for the end-user to gain new insight about the data, confirming or rejecting hypotheses previously formed. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-Complete problem, traditional model tree induction algorithms make use of a greedy top-down divide-and-conquer strategy, which may not converge to the globally optimal solution. In this paper, we propose a novel algorithm based on the use of the evolutionary algorithms paradigm as an alternative heuristic to generate model trees in order to improve the convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model tree induction algorithms, as well as to other evolutionary approaches. Results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications.

International Journal of Bio-inspired Computation | 2009

Lexicographic multi-objective evolutionary induction of decision trees

Márcio P. Basgalupp; André Carlos Ponce Leon Ferreira de Carvalho; Rodrigo C. Barros; Duncan D. Ruiz; Alex Alves Freitas

Among the several tasks in which evolutionary algorithms have been successfully employed, the induction of classification rules and decision trees has been shown to be a relevant approach for several application domains. Decision tree induction algorithms represent one of the most popular techniques for dealing with classification problems. However, conventionally used decision tree induction algorithms present limitations due to the strategy they usually implement: recursive top-down data partitioning through a greedy split evaluation. The main problem with this strategy is quality loss during the partitioning process, which can lead to statistically insignificant rules. In this paper, we propose a new GA-based algorithm for decision tree induction. The proposed algorithm aims to avoid the greedy strategy and convergence to local optima. To this end, it is based on a lexicographic multi-objective approach. To evaluate the proposed algorithm, we compare it with a well-known and frequently used decision tree induction algorithm using different public datasets. According to the experimental results, the proposed algorithm is able to avoid the previously described problems, reporting accuracy gains. Even more importantly, the proposed algorithm induced models with a significant reduction in complexity, as measured by tree size.
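As an illustration of what a lexicographic multi-objective comparison can look like, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the objectives, their priority order, or any tolerance thresholds, so the choice of accuracy followed by tree size, the tolerance values, and the function name `lexicographic_better` are all assumptions made for this example.

```python
from typing import Sequence


def lexicographic_better(
    a: Sequence[float],
    b: Sequence[float],
    tolerances: Sequence[float],
    maximize: Sequence[bool],
) -> bool:
    """Return True if candidate `a` beats candidate `b`.

    Objectives are listed in priority order. A lower-priority objective is
    consulted only when the candidates differ by no more than the tolerance
    on every higher-priority objective. (Hypothetical helper for illustration.)
    """
    for x, y, tol, is_max in zip(a, b, tolerances, maximize):
        # Positive `gain` means `a` is better on this objective.
        gain = (x - y) if is_max else (y - x)
        if abs(gain) > tol:
            return gain > 0
    # Equivalent on every objective within tolerance: neither is better.
    return False


# Assumed objectives: (accuracy, tree size); accuracy is maximized first,
# tree size is minimized as a tie-breaker. Values are made up.
TOLERANCES = (0.01, 0.0)
MAXIMIZE = (True, False)

small_tree = (0.900, 15)
large_tree = (0.895, 40)
accurate_tree = (0.950, 40)

print(lexicographic_better(small_tree, large_tree, TOLERANCES, MAXIMIZE))
print(lexicographic_better(accurate_tree, small_tree, TOLERANCES, MAXIMIZE))
```

In this sketch, two trees whose accuracies differ by less than the assumed tolerance are treated as tied on accuracy, so the smaller tree wins; a tree with a clearly higher accuracy wins regardless of size.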

ACM Symposium on Applied Computing | 2009

LEGAL-tree: a lexicographic multi-objective genetic algorithm for decision tree induction

Márcio P. Basgalupp; Rodrigo C. Barros; André Carlos Ponce Leon Ferreira de Carvalho; Alex Alves Freitas; Duncan D. Ruiz

Decision trees are widely disseminated as an effective solution for classification tasks. Decision tree induction algorithms have some limitations though, due to the typical strategy they implement: recursive top-down partitioning through a greedy split evaluation. This strategy is limiting in the sense that there is quality loss while the partitioning process occurs, creating statistically insignificant rules. In order to prevent the greedy strategy and to avoid converging to local optima, we present a novel Genetic Algorithm for decision tree induction based on a lexicographic multi-objective approach, and we compare it with the most well-known algorithm for decision tree induction, J48, over distinct public datasets. The results show the feasibility of using this technique as a means to avoid the previously described problems, reporting not only a comparable accuracy but also, importantly, a significantly simpler classification model in the employed datasets.

IEEE Transactions on Evolutionary Computation | 2014

Evolutionary Design of Decision-Tree Algorithms Tailored to Microarray Gene Expression Data Sets

Rodrigo C. Barros; Márcio P. Basgalupp; Alex Alves Freitas; André Carlos Ponce Leon Ferreira de Carvalho

Decision-tree induction algorithms are widely used in machine learning applications in which the goal is to extract knowledge from data and present it in a graphically intuitive way. The most successful strategy for inducing decision trees is the greedy top-down recursive approach, which has been continuously improved by researchers over the past 40 years. In this paper, we propose a paradigm shift in the research of decision trees: instead of proposing a new manually designed method for inducing decision trees, we propose automatically designing decision-tree induction algorithms tailored to a specific type of classification data set (or application domain). Following recent breakthroughs in the automatic design of machine learning algorithms, we propose a hyper-heuristic evolutionary algorithm called hyper-heuristic evolutionary algorithm for designing decision-tree algorithms (HEAD-DT) that evolves design components of top-down decision-tree induction algorithms. By the end of the evolution, we expect HEAD-DT to generate a new and possibly better decision-tree algorithm for a given application domain. We perform extensive experiments in 35 real-world microarray gene expression data sets to assess the performance of HEAD-DT, and compare it with very well known decision-tree algorithms such as C4.5, CART, and REPTree. Results show that HEAD-DT is capable of generating algorithms that significantly outperform the baseline manually designed decision-tree algorithms regarding predictive accuracy and F-measure.

ACM Symposium on Applied Computing | 2013

Software effort prediction: a hyper-heuristic decision-tree based approach

Márcio P. Basgalupp; Rodrigo C. Barros; Tiago Silva da Silva; André Carlos Ponce Leon Ferreira de Carvalho

Software effort prediction is an important task within software engineering. In particular, machine learning algorithms have been widely employed for this task, bearing in mind their capability of providing accurate predictive models for the analysis of project stakeholders. Nevertheless, none of these algorithms has become the de facto standard for metrics prediction given the particularities of different software projects. Among these intelligent strategies, decision trees and evolutionary algorithms have been continuously employed for software metrics prediction, though mostly independently of each other. A recent work has proposed evolving decision trees through an evolutionary algorithm, and applying the resulting tree in the context of software maintenance effort prediction. In this paper, we raise the search-space level of an evolutionary algorithm by proposing the evolution of a decision-tree algorithm instead of the decision tree itself, an approach known as a hyper-heuristic. Our findings show that the decision-tree algorithm automatically generated by a hyper-heuristic is capable of statistically outperforming state-of-the-art top-down and evolution-based decision-tree algorithms, as well as traditional logistic regression. The ability to generate a highly accurate, comprehensible predictive model is crucial in software projects, considering that it allows the stakeholder to properly manage the team's resources with improved confidence in the model predictions.

Evolutionary Computation | 2013

Automatic design of decision-tree algorithms with evolutionary algorithms

Rodrigo C. Barros; Márcio P. Basgalupp; André Carlos Ponce Leon Ferreira de Carvalho; Alex Alves Freitas

This study reports the empirical analysis of a hyper-heuristic evolutionary algorithm that is capable of automatically designing top-down decision-tree induction algorithms. Top-down decision-tree algorithms are of great importance, considering their ability to provide an intuitive and accurate knowledge representation for classification problems. The automatic design of these algorithms seems timely, given the large literature accumulated over more than 40 years of research in the manual design of decision-tree induction algorithms. The proposed hyper-heuristic evolutionary algorithm, HEAD-DT, is extensively tested using 20 public UCI datasets and 10 microarray gene expression datasets. The algorithms automatically designed by HEAD-DT are compared with traditional decision-tree induction algorithms, such as C4.5 and CART. Experimental results show that HEAD-DT is capable of generating algorithms which are significantly more accurate than C4.5 and CART.

Genetic and Evolutionary Computation Conference | 2012

A hyper-heuristic evolutionary algorithm for automatically designing decision-tree algorithms

Rodrigo C. Barros; Márcio P. Basgalupp; André Carlos Ponce Leon Ferreira de Carvalho; Alex Alves Freitas

Decision tree induction is one of the most employed methods to extract knowledge from data, since the representation of knowledge is very intuitive and easily understandable by humans. The most successful strategy for inducing decision trees, the greedy top-down approach, has been continuously improved by researchers over the years. This work, following recent breakthroughs in the automatic design of machine learning algorithms, proposes a hyper-heuristic evolutionary algorithm for automatically generating decision-tree induction algorithms, named HEAD-DT. We perform extensive experiments in 20 public data sets to assess the performance of HEAD-DT, and we compare it to traditional decision-tree algorithms such as C4.5 and CART. Results show that HEAD-DT can generate algorithms that significantly outperform C4.5 and CART regarding predictive accuracy and F-Measure.

BMC Bioinformatics | 2012

Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data

Rodrigo C. Barros; Ana T. Winck; Karina S. Machado; Márcio P. Basgalupp; André Carlos Ponce Leon Ferreira de Carvalho; Duncan D. Ruiz; Osmar Norberto de Souza

Background: This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance.

Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application.

Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor.

Genetic and Evolutionary Computation Conference | 2011

Towards the automatic design of decision tree induction algorithms

Rodrigo C. Barros; Márcio P. Basgalupp; André Carlos Ponce Leon Ferreira de Carvalho; Alex Alves Freitas

Decision tree induction is one of the most employed methods to extract knowledge from data, since the representation of knowledge is very intuitive and easily understandable by humans. The most successful strategy for inducing decision trees, the greedy top-down approach, has been continuously improved by researchers over the years. This work, following recent breakthroughs in the automatic design of machine learning algorithms, proposes two different approaches for automatically generating generic decision tree induction algorithms. Both approaches are based on the evolutionary algorithms paradigm, which improves solutions based on metaphors of biological processes. We also propose guidelines to design interesting fitness functions for these evolutionary algorithms, which take into account the requirements and needs of the end-user.
