Alex Alves Freitas | Researchain

Publication

Featured research published by Alex Alves Freitas.

IEEE Transactions on Evolutionary Computation | 2002

Data mining with an ant colony optimization algorithm

Rafael Stubs Parpinelli; Heitor S. Lopes; Alex Alves Freitas

The paper proposes an algorithm for data mining called Ant-Miner (ant-colony-based data miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is inspired by both research on the behavior of real ant colonies and some data mining concepts as well as principles. We compare the performance of Ant-Miner with CN2, a well-known data mining algorithm for classification, in six public domain data sets. The results provide evidence that: 1) Ant-Miner is competitive with CN2 with respect to predictive accuracy, and 2) the rule lists discovered by Ant-Miner are considerably simpler (smaller) than those discovered by CN2.

Systems, Man, and Cybernetics | 2009

A Survey of Evolutionary Algorithms for Clustering

Eduardo R. Hruschka; Ricardo J. G. B. Campello; Alex Alves Freitas; A. de Carvalho

This paper presents a survey of evolutionary algorithms designed for clustering tasks. It tries to reflect the profile of this area by focusing more on those subjects that have been given more importance in the literature. In this context, most of the paper is devoted to partitional algorithms that look for hard clusterings of data, though overlapping (i.e., soft and fuzzy) approaches are also covered. The paper is original in two main respects. First, it provides an up-to-date overview that is fully devoted to evolutionary algorithms for clustering, is not limited to any particular kind of evolutionary approach, and comprises advanced topics like multiobjective and ensemble-based evolutionary clustering. Second, it provides a taxonomy that highlights some very important aspects in the context of evolutionary data clustering, namely, fixed or variable number of clusters, cluster-oriented or nonoriented operators, context-sensitive or context-insensitive operators, guided or unguided operators, binary, integer, or real encodings, centroid-based, medoid-based, label-based, tree-based, or graph-based representations, among others. A number of references are provided that describe applications of evolutionary algorithms for clustering in different domains, such as image processing, computer security, and bioinformatics. The paper ends by addressing some important issues and open questions that can be the subject of future research.

Advances in evolutionary computing | 2003

A survey of evolutionary algorithms for data mining and knowledge discovery

Alex Alves Freitas

This chapter discusses the use of evolutionary algorithms, particularly genetic algorithms and genetic programming, in data mining and knowledge discovery. We focus on the data mining task of classification. In addition, we discuss some preprocessing and postprocessing steps of the knowledge discovery process, focusing on attribute selection and pruning of an ensemble of classifiers. We show how the requirements of data mining and knowledge discovery influence the design of evolutionary algorithms. In particular, we discuss how individual representation, genetic operators and fitness functions have to be adapted for extracting high-level knowledge from data.

Data Mining and Knowledge Discovery | 2011

A survey of hierarchical classification across different application domains

Carlos Nascimento Silla; Alex Alves Freitas

In this survey we discuss the task of hierarchical classification. The literature about this field is scattered across very different application domains, and for that reason research in one domain is often done without awareness of methods developed in other domains. We define the task of hierarchical classification and discuss why some related tasks should not be considered hierarchical classification. We also present a new perspective about some existing hierarchical classification approaches, and based on that perspective we propose a new unifying framework to classify the existing approaches. We also present a review of empirical comparisons of the existing methods reported in the literature as well as a conceptual comparison of those methods at a high level of abstraction, discussing their advantages and disadvantages.

Knowledge-Based Systems | 1999

On rule interestingness measures

Alex Alves Freitas

This paper discusses several factors influencing the evaluation of the degree of interestingness of rules discovered by a data mining algorithm. This article aims at: (1) drawing attention to several factors related to rule interestingness that have been somewhat neglected in the literature; (2) showing some ways of modifying rule interestingness measures to take these factors into account; (3) introducing a new criterion to measure attribute surprisingness, as a factor influencing the interestingness of discovered rules.

Systems, Man, and Cybernetics | 2012

A Survey of Evolutionary Algorithms for Decision-Tree Induction

Rodrigo C. Barros; Márcio P. Basgalupp; A. de Carvalho; Alex Alves Freitas

This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.

Artificial Intelligence Review | 2001

Understanding the Crucial Role of Attribute Interaction in Data Mining

Alex Alves Freitas

This is a review paper, whose goal is to significantly improve our understanding of the crucial role of attribute interaction in data mining. The main contributions of this paper are as follows. Firstly, we show that the concept of attribute interaction has a crucial role across different kinds of problem in data mining, such as attribute construction, coping with small disjuncts, induction of first-order logic rules, detection of Simpson's paradox, and finding several types of interesting rules. Hence, a better understanding of attribute interaction can lead to a better understanding of the relationship between these kinds of problems, which are usually studied separately from each other. Secondly, we draw attention to the fact that most rule induction algorithms are based on a greedy search which does not cope well with the problem of attribute interaction, and point out some alternative kinds of rule discovery methods which tend to cope better with this problem. Thirdly, we discuss several algorithms and methods for discovering interesting knowledge that, implicitly or explicitly, are based on the concept of attribute interaction.

Congress on Evolutionary Computation | 2000

Discovering comprehensible classification rules with a genetic algorithm

M.V. Fidelis; Heitor S. Lopes; Alex Alves Freitas

This paper presents a classification algorithm based on genetic algorithms (GAs) that discovers comprehensible IF-THEN rules, in the spirit of data mining. The proposed GA has a flexible chromosome encoding, where each chromosome corresponds to a classification rule. Although the number of genes (the genotype) is fixed, the number of rule conditions (the phenotype) is variable. The GA also has specific mutation operators for this chromosome encoding. The algorithm was evaluated on two public-domain real-world data sets (in the medical domains of dermatology and breast cancer).
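
One way to obtain a fixed-length genotype with a variable-length rule is to give every gene an activation value, so that only some genes contribute a condition. The sketch below illustrates that idea; the gene layout (`weight`, `op`, `value`), the threshold of 0.5, and the attribute names are assumptions for illustration, since the abstract does not specify the encoding.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# Comparison operators a gene may carry (an assumed, illustrative set).
OPS: Dict[str, Callable[[float, float], bool]] = {
    "==": operator.eq,
    ">=": operator.ge,
    "<": operator.lt,
}

ACTIVATION_THRESHOLD = 0.5  # assumed value, not taken from the paper


@dataclass
class Gene:
    attribute: str  # attribute this gene refers to (one gene per attribute)
    weight: float   # activation value in [0, 1]
    op: str         # key into OPS
    value: float    # constant the attribute is compared against


def decode(chromosome: List[Gene]) -> List[Gene]:
    """Genotype -> phenotype: keep only genes whose weight exceeds the threshold."""
    return [g for g in chromosome if g.weight > ACTIVATION_THRESHOLD]


def covers(chromosome: List[Gene], example: Dict[str, float]) -> bool:
    """True if the example satisfies every active condition of the rule."""
    return all(OPS[g.op](example[g.attribute], g.value) for g in decode(chromosome))


# Three genes (fixed genotype length), but only two active conditions.
rule = [
    Gene("age", 0.8, ">=", 40),
    Gene("itching", 0.2, "==", 1),   # inactive: weight below threshold
    Gene("scaling", 0.9, "==", 2),
]

print([g.attribute for g in decode(rule)])
print(covers(rule, {"age": 45, "itching": 0, "scaling": 2}))
```

Under this assumed encoding, a mutation that nudges a gene's `weight` across the threshold adds or removes a condition without changing the chromosome length.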

Information Sciences | 2004

A hybrid decision tree/genetic algorithm method for data mining

Deborah Ribeiro Carvalho; Alex Alves Freitas

This paper addresses the well-known classification task of data mining, where the objective is to predict the class which an example belongs to. Discovered knowledge is expressed in the form of high-level, easy-to-interpret classification rules. In order to discover classification rules, we propose a hybrid decision tree/genetic algorithm method. The central idea of this hybrid method involves the concept of small disjuncts in data mining, as follows. In essence, a set of classification rules can be regarded as a logical disjunction of rules, so that each rule can be regarded as a disjunct. A small disjunct is a rule covering a small number of examples. Due to their nature, small disjuncts are error prone. However, although each small disjunct covers just a few examples, the set of all small disjuncts can cover a large number of examples, so that it is important to develop new approaches to cope with the problem of small disjuncts. In our hybrid approach, we have developed two genetic algorithms (GAs) specifically designed for discovering rules covering examples belonging to small disjuncts, whereas a conventional decision tree algorithm is used to produce rules covering examples belonging to large disjuncts. We present results evaluating the performance of the hybrid method in 22 real-world data sets.
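
The point that individually small disjuncts can jointly cover many examples can be illustrated with a short sketch. It assumes a rule counts as a small disjunct when it covers at most `max_small` examples; that threshold, the function name `split_disjuncts`, and the coverage counts are hypothetical and are not results from the paper.

```python
from typing import Dict, List, Tuple


def split_disjuncts(
    coverage: Dict[str, int], max_small: int = 10
) -> Tuple[List[str], List[str]]:
    """Partition rules into small and large disjuncts by examples covered."""
    small = [rule for rule, n in coverage.items() if n <= max_small]
    large = [rule for rule, n in coverage.items() if n > max_small]
    return small, large


# Hypothetical number of training examples covered by each rule.
coverage = {"r1": 120, "r2": 80, "r3": 7, "r4": 5, "r5": 9, "r6": 4, "r7": 10}

small, large = split_disjuncts(coverage)
covered_by_small = sum(coverage[r] for r in small)
share = covered_by_small / sum(coverage.values())

print(small, large)
print(f"small disjuncts cover {covered_by_small} examples ({share:.1%})")
```

In a hybrid scheme of the kind the abstract describes, examples falling into the `small` group would be handed to a GA, while the `large` group keeps the decision-tree rules.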

SIGKDD Explorations | 2004

A critical review of multi-objective optimization in data mining: a position paper

Alex Alves Freitas

This paper addresses the problem of how to evaluate the quality of a model built from the data in a multi-objective optimization scenario, where two or more quality criteria must be simultaneously optimized. A typical example is a scenario where one wants to maximize both the accuracy and the simplicity of a classification model or a candidate attribute subset in attribute selection. The paper reviews three very different approaches to cope with this problem, namely: (a) transforming the original multi-objective problem into a single-objective problem by using a weighted formula; (b) the lexicographic approach, where the objectives are ranked in order of priority; and (c) the Pareto approach, which consists of finding as many non-dominated solutions as possible and returning the set of non-dominated solutions to the user. It also presents a critical review of the case for and against each of these approaches. The general conclusions are that the weighted formula approach -- which is by far the most used in the data mining literature -- is to a large extent an ad-hoc approach for multi-objective optimization, whereas the lexicographic and the Pareto approaches are more principled, and therefore deserve more attention from the data mining community.
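
The three approaches can be contrasted on a toy example. The sketch below assumes each candidate model is scored on two objectives to be maximized (accuracy and simplicity); the function names, the equal weights, the tie tolerance `tol`, and the scores are illustrative assumptions rather than anything prescribed by the paper.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, accuracy, simplicity), both maximized


def weighted_best(cands: List[Candidate], w_acc: float = 0.5, w_simp: float = 0.5) -> str:
    """(a) Weighted formula: collapse both objectives into a single score."""
    return max(cands, key=lambda c: w_acc * c[1] + w_simp * c[2])[0]


def lexicographic_best(cands: List[Candidate], tol: float = 0.01) -> str:
    """(b) Lexicographic: accuracy has priority; simplicity breaks near-ties."""
    top_acc = max(c[1] for c in cands)
    near_ties = [c for c in cands if top_acc - c[1] <= tol]
    return max(near_ties, key=lambda c: c[2])[0]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and better on at least one."""
    return a[1] >= b[1] and a[2] >= b[2] and (a[1] > b[1] or a[2] > b[2])


def pareto_front(cands: List[Candidate]) -> List[str]:
    """(c) Pareto: return every candidate that no other candidate dominates."""
    return [c[0] for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical candidate models.
candidates = [
    ("A", 0.90, 0.20),
    ("B", 0.895, 0.60),
    ("C", 0.70, 0.90),
    ("D", 0.65, 0.50),
]

print(weighted_best(candidates))
print(lexicographic_best(candidates))
print(pareto_front(candidates))
```

On this data the three approaches disagree, and the weighted result depends entirely on the chosen weights, which is the sense in which the abstract calls that approach ad hoc.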