Fernando E. B. Otero
University of Kent
Publication
Featured research published by Fernando E. B. Otero.
European Conference on Genetic Programming | 2003
Fernando E. B. Otero; Monique M. S. Silva; Alex Alves Freitas; Julio Cesar Nievola
For a given data set, its set of attributes defines its data space representation. The quality of a data space representation is one of the most important factors influencing the performance of a data mining algorithm. The attributes defining the data space can be inadequate, making it difficult to discover high-quality knowledge. In order to solve this problem, this paper proposes a Genetic Programming algorithm developed for attribute construction. This algorithm constructs new attributes out of the original attributes of the data set, performing an important preprocessing step for the subsequent application of a data mining algorithm.
IEEE Transactions on Evolutionary Computation | 2013
Fernando E. B. Otero; Alex Alves Freitas; Colin G. Johnson
Ant colony optimization (ACO) algorithms have been successfully applied to discover a list of classification rules. In general, these algorithms follow a sequential covering strategy, where a single rule is discovered at each iteration of the algorithm in order to build a list of rules. The sequential covering strategy has the drawback of not coping with the problem of rule interaction, i.e., the outcome of a rule affects the rules that can be discovered subsequently since the search space is modified due to the removal of examples covered by previous rules. This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules. Our experiments using 18 publicly available data sets show that the predictive accuracy obtained by a new ACO classification algorithm implementing the proposed sequential covering strategy is statistically significantly higher than the predictive accuracy of state-of-the-art rule induction classification algorithms.
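The conventional sequential covering loop described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the one-term rule format and the greedy `best_single_term_rule` helper are hypothetical stand-ins for the ACO rule-construction step. It shows why rules interact: each rule is learned only from the examples left uncovered by earlier rules.

```python
from typing import Dict, List, Optional, Tuple

Example = Tuple[Dict[str, str], str]   # (attribute values, class label)
Rule = Tuple[str, str, str]            # hypothetical one-term rule: (attribute, value, predicted class)


def covers(rule: Rule, example: Example) -> bool:
    """A rule covers an example when the example has the rule's attribute value."""
    attribute, value, _ = rule
    return example[0].get(attribute) == value


def best_single_term_rule(examples: List[Example]) -> Optional[Rule]:
    """Greedy stand-in for rule construction (an ACO search in the papers)."""
    best, best_key = None, None
    candidates = sorted({(a, v) for attrs, _ in examples for a, v in attrs.items()}, key=repr)
    for attribute, value in candidates:
        covered = [label for attrs, label in examples if attrs.get(attribute) == value]
        majority = max(sorted(set(covered)), key=covered.count)
        correct = covered.count(majority)
        key = (correct / len(covered), correct)   # prefer accurate rules, then high coverage
        if best_key is None or key > best_key:
            best, best_key = (attribute, value, majority), key
    return best


def sequential_covering(examples: List[Example], max_uncovered: int = 0) -> List[Rule]:
    """Discover one rule per iteration and remove the examples it covers."""
    rules: List[Rule] = []
    remaining = list(examples)
    while len(remaining) > max_uncovered:
        rule = best_single_term_rule(remaining)
        if rule is None:          # no attribute values left to build a rule from
            break
        rules.append(rule)
        # Removing covered examples changes the search space seen by later rules.
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules
```

Because every later rule sees a reduced training set, an early rule choice constrains all subsequent ones; the strategy proposed in the paper instead guides the search by the quality of a complete candidate list.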
Applied Soft Computing | 2012
Fernando E. B. Otero; Alex Alves Freitas; Colin G. Johnson
Decision trees have been widely used in data mining and machine learning as a comprehensible knowledge representation. While ant colony optimization (ACO) algorithms have been successfully applied to extract classification rules, decision tree induction with ACO algorithms remains an almost unexplored research area. In this paper we propose a novel ACO algorithm to induce decision trees, combining commonly used strategies from both traditional decision tree induction algorithms and ACO. The proposed algorithm is compared against three decision tree induction algorithms, namely C4.5, CART and cACDT, on 22 publicly available data sets. The results show that the predictive accuracy of the proposed algorithm is statistically significantly higher than the accuracy of C4.5 and CART, which are well-known conventional algorithms for decision tree induction, and than the accuracy of the ACO-based cACDT decision tree algorithm.
Applied Soft Computing | 2013
Khalid M. Salama; Ashraf M. Abdelbar; Fernando E. B. Otero; Alex Alves Freitas
The cAnt-Miner algorithm is an Ant Colony Optimization (ACO) based technique for classification rule discovery in problem domains which include continuous attributes. In this paper, we propose several extensions to cAnt-Miner. The main extension is based on the use of multiple pheromone types, one for each class value to be predicted. In the proposed μcAnt-Miner algorithm, an ant first selects a class value to be the consequent of a rule and the terms in the antecedent are selected based on the pheromone levels of the selected class value; pheromone update occurs on the corresponding pheromone type of the class value. The pre-selection of a class value also allows the use of more precise measures for the heuristic function and the dynamic discretization of continuous attributes, and further allows for the use of a rule quality measure that directly takes into account the confidence of the rule. Experimental results on 20 benchmark datasets show that our proposed extension improves classification accuracy to a statistically significant extent compared to cAnt-Miner, and has classification accuracy similar to the well-known Ripper and PART rule induction algorithms.
Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics | 2009
Fernando E. B. Otero; Alex Alves Freitas; Colin G. Johnson
This paper proposes a novel Ant Colony Optimisation algorithm for the hierarchical problem of predicting protein functions using the Gene Ontology (GO). The GO structure represents a challenging case of hierarchical classification, since its terms are organised as a directed acyclic graph in which a term can have more than one parent, in contrast to the single parent allowed in tree structures. The proposed method discovers an ordered list of classification rules which is able to predict all GO terms independently of their level. We have compared the proposed method against a baseline method, which consists of training a classifier for each GO term individually, on five different ion-channel data sets, and the results obtained are promising.
Genetic and Evolutionary Computation Conference | 2013
Fernando E. B. Otero; Alex Alves Freitas
The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules in a one-at-a-time fashion. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorithm to discover a set of rules (unordered rules). The main motivation for discovering a set of rules is to improve the interpretation of individual rules and evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms and the cAnt-MinerPB producing ordered rules are also presented.
Genetic and Evolutionary Computation Conference | 2012
Fernando E. B. Otero; Tom Castle; Colin G. Johnson
EpochX is a Genetic Programming (GP) framework written in Java. It allows the creation of tree-based and grammar-based GP systems. It has been created to reflect typical ways in which Java programmers work, for example, borrowing from the Java event model and taking inspiration from the Java collections framework. This paper presents EpochX in general, and gives particular attention to the event model and the statistics analysis framework.
Congress on Evolutionary Computation | 2012
Alberto Moraglio; Fernando E. B. Otero; Colin G. Johnson; Simon J. Thompson; Alex Alves Freitas
Genetic programming has proven capable of evolving solutions to a wide variety of problems. However, the successes have largely been with programs without iteration or recursion; evolving recursive programs has turned out to be particularly challenging. The main obstacle to evolving recursive programs seems to be that they are particularly fragile to the application of search operators: a small change in a correct recursive program generally produces a completely wrong program. In this paper, we present a simple and general method that allows us to pass back and forth from a recursive program to an associated non-recursive program. Finding a recursive program can be reduced to evolving non-recursive programs followed by converting the optimum non-recursive program found to the associated optimum recursive program. This avoids the fragility problem above, as evolution does not search the space of recursive programs. We present promising experimental results on a test-bed of recursive problems.
International Joint Conference on Computational Intelligence | 2014
Khalid M. Salama; Fernando E. B. Otero
Ant Colony Optimization (ACO) is a meta-heuristic for solving combinatorial optimization problems, inspired by the behaviour of biological ant colonies. One of the successful applications of ACO is learning classification models (classifiers). A classifier encodes the relationships between the input attribute values and the values of a class attribute in a given set of labelled cases and it can be used to predict the class value of new unlabelled cases. Decision trees have been widely used as a type of classification model that represents comprehensible knowledge to the user. In this paper, we propose the use of ACO-based algorithms for learning an extended multi-tree classification model, which consists of multiple decision trees, one for each class value. Each class-based decision tree is responsible for discriminating between its class value and all other values available in the class domain. Our proposed algorithms are empirically evaluated against well-known decision tree induction algorithms, as well as the ACO-based Ant-Tree-Miner algorithm. The results show an overall improvement in predictive accuracy over 32 benchmark datasets. We also discuss how the new multi-tree models can provide the user with more understanding and knowledge-interpretability in a given domain.
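The "one tree per class value" decomposition mentioned in this abstract can be illustrated with a small relabelling sketch. The function name `one_vs_rest_labels` is hypothetical, and the sketch covers only how the training labels could be split per class; it makes no claim about how the paper trains or combines the trees.

```python
from typing import Dict, List


def one_vs_rest_labels(labels: List[str]) -> Dict[str, List[bool]]:
    """Build one binary target per class value: True for that class, False for all others."""
    classes = sorted(set(labels))
    return {c: [label == c for label in labels] for c in classes}


# Each binary target would be used to train the tree responsible for that class value.
targets = one_vs_rest_labels(["a", "b", "a", "c"])
```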
Genetic and Evolutionary Computation Conference | 2016
Luiz Otávio Vilas Boas Oliveira; Fernando E. B. Otero; Gisele L. Pappa
Recent advances in geometric semantic genetic programming (GSGP) have shown that the results obtained by these methods can outperform those obtained by classical genetic programming algorithms, in particular in the context of symbolic regression. However, there are still many open issues on how to improve their search mechanism. One of these issues is how to get around the fact that the GSGP crossover operator cannot generate solutions that are placed outside the convex hull formed by the individuals of the current population. Although the mutation operator alleviates this problem, we cannot guarantee it will find promising regions of the search space within feasible computational time. In this direction, this paper proposes a new geometric dispersion operator that uses multiplicative factors to move individuals to less dense areas of the search space around the target solution before applying semantic genetic operators. Experiments on sixteen datasets show that the results obtained by the proposed operator are statistically significantly better than those produced by GSGP and that the operator does indeed spread the solutions around the target solution.
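The convex-hull limitation can be seen directly in semantic space. The sketch below assumes the common formulation of geometric semantic crossover for real-valued outputs, in which the offspring's output vector is a convex combination of the parents' output vectors with a random constant r in [0, 1]; the function name and sample values are illustrative, and the dispersion operator proposed in the paper is not reproduced here.

```python
import random
from typing import List


def semantic_crossover(sem_a: List[float], sem_b: List[float], rng: random.Random) -> List[float]:
    """Offspring semantics as r * a + (1 - r) * b for a random constant r in [0, 1)."""
    r = rng.random()
    return [r * a + (1.0 - r) * b for a, b in zip(sem_a, sem_b)]


rng = random.Random(0)
parent_a = [1.0, 4.0, -2.0]   # outputs of parent A on three training cases
parent_b = [3.0, 0.0, -1.0]   # outputs of parent B on the same cases
child = semantic_crossover(parent_a, parent_b, rng)

# Every offspring output lies between the corresponding parent outputs, so a
# target outside the region spanned by the population cannot be reached by
# crossover alone.
for c, a, b in zip(child, parent_a, parent_b):
    assert min(a, b) - 1e-9 <= c <= max(a, b) + 1e-9
```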