Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Alberto Suárez is active.

Publication


Featured research published by Alberto Suárez.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999

Globally optimal fuzzy decision trees for classification and regression

Alberto Suárez; James F. Lutsko

A fuzzy decision tree is constructed by allowing the possibility of partial membership of a point in the nodes that make up the tree structure. This extension of its expressive capabilities transforms the decision tree into a powerful functional approximant that incorporates features of connectionist methods, while remaining easily interpretable. Fuzzification is achieved by superimposing a fuzzy structure over the skeleton of a CART decision tree. A training rule for fuzzy trees, similar to backpropagation in neural networks, is designed. This rule corresponds to a global optimization algorithm that fixes the parameters of the fuzzy splits. The method developed for the automatic generation of fuzzy decision trees is applied to both classification and regression problems. In regression problems, it is seen that the continuity constraint imposed by the function representation of the fuzzy tree leads to substantial improvements in the quality of the regression and limits the tendency to overfitting. In classification, fuzzification provides a means of uncovering the structure of the probability distribution for the classification errors in attribute space. This allows the identification of regions for which the error rate of the tree is significantly lower than the average error rate, sometimes even below the Bayes misclassification rate.
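The soft-split idea can be illustrated with a minimal sketch (the sigmoid parameterisation and all names here are illustrative assumptions; the paper superimposes the fuzzy structure on a full CART skeleton and fits the split parameters with a global optimization rule):

```python
import math

def soft_membership(x, threshold, width):
    """Sigmoid gate: degree to which x belongs to the 'right' child.

    `width` controls the fuzziness of the split; as width -> 0 the gate
    approaches the crisp CART split (hypothetical parameterisation)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / width))

def fuzzy_tree_predict(x, threshold, width, left_value, right_value):
    """Two-leaf fuzzy 'tree': the prediction is the membership-weighted
    average of the leaf values, which makes the output continuous in x
    (the continuity constraint mentioned in the abstract)."""
    mu_right = soft_membership(x, threshold, width)
    return (1.0 - mu_right) * left_value + mu_right * right_value
```

At the threshold the two leaves contribute equally, and narrowing `width` recovers a near-crisp step, which is the sense in which fuzzification extends, rather than replaces, the underlying tree.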


International Conference on Machine Learning | 2006

Pruning in ordered bagging ensembles

Gonzalo Martínez-Muñoz; Alberto Suárez

We present a novel ensemble pruning method based on reordering the classifiers obtained from bagging and then selecting a subset for aggregation. Ordering the classifiers generated in bagging makes it possible to build subensembles of increasing size by including first those classifiers that are expected to perform best when aggregated. Ensemble pruning is achieved by halting the aggregation process before all the classifiers generated are included into the ensemble. Pruned subensembles containing between 15% and 30% of the initial pool of classifiers, besides being smaller, improve the generalization performance of the full bagging ensemble in the classification problems investigated.
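A minimal sketch of the ordering idea, assuming each classifier is represented by its cached predictions on a validation set; the selection rule below is a simple greedy error minimiser, which is a stand-in for, not a reproduction of, the paper's ordering criterion:

```python
from collections import Counter

def majority_vote(preds_subset, i):
    """Majority vote of a subensemble on example i."""
    return Counter(p[i] for p in preds_subset).most_common(1)[0][0]

def error(preds_subset, y):
    """Fraction of examples the voted subensemble gets wrong."""
    n = len(y)
    return sum(majority_vote(preds_subset, i) != y[i] for i in range(n)) / n

def ordered_pruning(all_preds, y, keep_fraction=0.3):
    """Greedily order the pool: repeatedly add the classifier whose
    inclusion yields the lowest subensemble error, then keep only the
    first keep_fraction of the ordered pool (the early-stopping step)."""
    remaining = list(range(len(all_preds)))
    order = []
    while remaining:
        best = min(remaining,
                   key=lambda j: error([all_preds[k] for k in order] + [all_preds[j]], y))
        order.append(best)
        remaining.remove(best)
    keep = max(1, int(keep_fraction * len(all_preds)))
    return order[:keep]
```

The nested structure is the point: every prefix of the ordering is itself a candidate subensemble, so halting the aggregation early prunes the ensemble at no extra cost.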


Pattern Recognition Letters | 2007

Using boosting to prune bagging ensembles

Gonzalo Martínez-Muñoz; Alberto Suárez

Boosting is used to determine the order in which classifiers are aggregated in a bagging ensemble. Early stopping in the aggregation of the classifiers in the ordered bagging ensemble allows the identification of subensembles that require less memory for storage, classify faster and can improve the generalization accuracy of the original bagging ensemble. In all the classification problems investigated pruned ensembles with 20% of the original classifiers show statistically significant improvements over bagging. In problems where boosting is superior to bagging, these improvements are not sufficient to reach the accuracy of the corresponding boosting ensembles. However, ensemble pruning preserves the performance of bagging in noisy classification tasks, where boosting often has larger generalization errors. Therefore, pruned bagging should generally be preferred to complete bagging and, if no information about the level of noise is available, it is a robust alternative to AdaBoost.


Pattern Recognition | 2005

Switching class labels to generate classification ensembles

Gonzalo Martínez-Muñoz; Alberto Suárez

Ensembles that combine the decisions of classifiers generated by using perturbed versions of the training set where the classes of the training examples are randomly switched can produce a significant error reduction, provided that large numbers of units and high class switching rates are used. The classifiers generated by this procedure have statistically uncorrelated errors in the training set. Hence, the ensembles they form exhibit a similar dependence of the training error on ensemble size, independently of the classification problem. In particular, for binary classification problems, the classification performance of the ensemble on the training data can be analysed in terms of a Bernoulli process. Experiments on several UCI datasets demonstrate the improvements in classification accuracy that can be obtained using these class-switching ensembles.
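The label-switching perturbation can be sketched as follows; a 1-NN rule stands in for the decision trees used in the paper, and all names are illustrative:

```python
import random
from collections import Counter

def switch_labels(y, classes, rate, rng):
    """Return a copy of y in which a fraction `rate` of the labels,
    chosen at random, is switched to a different class."""
    y = list(y)
    n_switch = int(rate * len(y))
    for i in rng.sample(range(len(y)), n_switch):
        y[i] = rng.choice([c for c in classes if c != y[i]])
    return y

def nearest_neighbour_learner(X, y):
    """A deliberately simple base learner (1-NN on scalar inputs)
    standing in for the trees used in the paper."""
    def predict(x):
        j = min(range(len(X)), key=lambda i: abs(X[i] - x))
        return y[j]
    return predict

def class_switching_ensemble(X, y, classes, n_learners=25, rate=0.2, seed=0):
    """Train each learner on an independently label-switched copy of the
    training set and combine them by majority vote."""
    rng = random.Random(seed)
    learners = [nearest_neighbour_learner(X, switch_labels(y, classes, rate, rng))
                for _ in range(n_learners)]
    def predict(x):
        return Counter(h(x) for h in learners).most_common(1)[0][0]
    return predict
```

Because each learner's label flips are drawn independently, the per-learner training errors behave like independent coin tosses, which is what licenses the Bernoulli-process analysis of the ensemble's training error in the binary case.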


IEEE Computational Intelligence Magazine | 2010

Hybrid Approaches and Dimensionality Reduction for Portfolio Selection with Cardinality Constraints

Rubén Ruiz-Torrubiano; Alberto Suárez

A novel memetic algorithm that combines evolutionary algorithms, quadratic programming, and specially devised pruning heuristics is proposed for the selection of cardinality-constrained optimal portfolios. The framework used is the standard Markowitz mean-variance formulation for portfolio optimization with constraints of practical interest, such as minimum and maximum investments per asset and/or on groups of assets. Imposing limits on the number of different assets that can be included in the investment transforms portfolio selection into an NP-complete mixed-integer quadratic optimization problem that is difficult to solve by standard methods. An implementation of the algorithm that employs a genetic algorithm with a set representation, an appropriately defined mutation operator and Random Assortment Recombination for crossover (RAR-GA) is compared with implementations using Simulated Annealing (SA) and various Estimation of Distribution Algorithms (EDAs). An empirical investigation of the performance of the portfolios selected with these different methods using financial data shows that RAR-GA and SA are superior to the implementations with EDAs in terms of both accuracy and efficiency. The use of pruning heuristics that effectively reduce the dimensionality of the problem by identifying and eliminating from the investment universe those assets that are not expected to appear in the optimal portfolio leads to significant improvements in performance and makes EDAs competitive with RAR-GA and SA.
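The set representation can be illustrated with a toy mutation operator that preserves the cardinality constraint by construction; the quadratic-programming step and the RAR crossover of the paper are omitted, and the function name is hypothetical:

```python
import random

def mutate_portfolio(selected, universe, rng):
    """Set-representation mutation: swap one selected asset for one
    outside the set, so the cardinality K = |selected| is preserved by
    construction (a toy stand-in for the paper's mutation operator)."""
    selected = set(selected)
    outside = list(set(universe) - selected)
    if not outside or not selected:
        return selected
    selected.remove(rng.choice(sorted(selected)))  # drop a held asset
    selected.add(rng.choice(sorted(outside)))      # bring in a new one
    return selected
```

Encoding the constraint in the operator, rather than penalising violations in the fitness function, is what lets the evolutionary search stay entirely inside the feasible region of the mixed-integer problem.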


Annals of Operations Research | 2009

A hybrid optimization approach to index tracking

Rubén Ruiz-Torrubiano; Alberto Suárez

Index tracking consists in reproducing the performance of a stock-market index by investing in a subset of the stocks included in the index. A hybrid strategy that combines an evolutionary algorithm with quadratic programming is designed to solve this NP-hard problem: Given a subset of assets, quadratic programming yields the optimal tracking portfolio that invests only in the selected assets. The combinatorial problem of identifying the appropriate assets is solved by a genetic algorithm that uses the output of the quadratic optimization as fitness function. This hybrid approach allows the identification of quasi-optimal tracking portfolios at a reduced computational cost.
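A toy version of the hybrid loop, with the inner quadratic program replaced by equal weighting of the selected assets to keep the sketch dependency-free (the paper instead solves a QP for the optimal tracking weights; all names are illustrative):

```python
import random

def tracking_error(subset, returns, index_returns):
    """Mean squared deviation between the equal-weighted subset
    portfolio and the index over the observation window."""
    k = len(subset)
    te = 0.0
    for t in range(len(index_returns)):
        port = sum(returns[a][t] for a in subset) / k
        te += (port - index_returns[t]) ** 2
    return te / len(index_returns)

def hybrid_index_tracking(returns, index_returns, k,
                          generations=50, pop_size=12, seed=0):
    """Outer combinatorial search: evolve fixed-size asset subsets,
    using the inner optimiser's tracking error as the fitness."""
    rng = random.Random(seed)
    n = len(returns)
    pop = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: tracking_error(s, returns, index_returns))
        survivors = pop[:pop_size // 2]            # keep the fitter half
        children = []
        for s in survivors:                        # swap-one mutation
            child = set(s)
            out = rng.choice(sorted(set(range(n)) - child))
            child.remove(rng.choice(sorted(child)))
            child.add(out)
            children.append(frozenset(child))
        pop = survivors + children
    return min(pop, key=lambda s: tracking_error(s, returns, index_returns))
```

The division of labour mirrors the paper: the evolutionary layer only decides *which* assets enter the portfolio, while the inner optimiser (here trivially equal weights) decides *how much* to invest in each.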


IEEE International Conference on Evolutionary Computation | 2006

Selection of Optimal Investment Portfolios with Cardinality Constraints

Rafael Moral-Escudero; Rubén Ruiz-Torrubiano; Alberto Suárez

We consider the problem of selecting an optimal portfolio within the standard mean-variance framework extended to include constraints of practical interest, such as limits on the number of assets that can be included in the portfolio and on the minimum and maximum investments per asset and/or groups of assets. The introduction of these realistic constraints transforms the selection of the optimal portfolio into a mixed integer quadratic programming problem. This optimization problem, which we prove to be NP-hard, is difficult to solve, even approximately, by standard optimization techniques. A hybrid strategy that makes use of genetic algorithms and quadratic programming is designed to provide an accurate and efficient solution to the problem.


Neurocomputing | 2011

Empirical analysis and evaluation of approximate techniques for pruning regression bagging ensembles

Daniel Hernández-Lobato; Gonzalo Martínez-Muñoz; Alberto Suárez

Identifying the optimal subset of regressors in a regression bagging ensemble is a difficult task that has exponential cost in the size of the ensemble. In this article we analyze two approximate techniques especially devised to address this problem. The first strategy constructs a relaxed version of the problem that can be solved using semidefinite programming. The second one is based on modifying the order of aggregation of the regressors. Ordered aggregation is a simple forward selection algorithm that incorporates at each step the regressor that reduces the training error of the current subensemble the most. Both techniques can be used to identify subensembles that are close to the optimal ones, which can be obtained by exhaustive search at a larger computational cost. Experiments in a wide variety of synthetic and real-world regression problems show that pruned ensembles composed of only 20% of the initial regressors often have better generalization performance than the original bagging ensembles. These improvements are due to a reduction in the bias and the covariance components of the generalization error. Subensembles obtained using either SDP or ordered aggregation generally outperform subensembles obtained by other ensemble pruning methods and ensembles generated by the Adaboost.R2 algorithm, negative correlation learning or regularized linear stacked generalization. Ordered aggregation has a slightly better overall performance than SDP in the problems investigated. However, the difference is not statistically significant. Ordered aggregation has the further advantage that it produces a nested sequence of near-optimal subensembles of increasing size with no additional computational cost.
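Ordered aggregation reduces to a short forward-selection loop once each regressor is represented by its vector of cached training predictions; this is a simplified sketch, not the paper's implementation:

```python
def mse(pred, y):
    """Mean squared error of a prediction vector."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def average(preds_subset):
    """Pointwise average of the subensemble's prediction vectors."""
    m = len(preds_subset)
    return [sum(p[i] for p in preds_subset) / m
            for i in range(len(preds_subset[0]))]

def ordered_aggregation(all_preds, y):
    """Forward selection: at each step add the regressor that most
    reduces the training MSE of the running average. Every prefix of
    the returned order is a near-optimal subensemble of that size."""
    remaining = list(range(len(all_preds)))
    order = []
    while remaining:
        best = min(remaining,
                   key=lambda j: mse(average([all_preds[k]
                                              for k in order + [j]]), y))
        order.append(best)
        remaining.remove(best)
    return order
```

The nested-sequence property the abstract highlights falls out directly: truncating the order at any length gives the subensemble of that size, with no additional computation.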


Neurocomputing | 2008

Class-switching neural network ensembles

Gonzalo Martínez-Muñoz; Aitor Sánchez-Martínez; Daniel Hernández-Lobato; Alberto Suárez

This article investigates the properties of class-switching ensembles composed of neural networks and compares them to class-switching ensembles of decision trees and to standard ensemble learning methods, such as bagging and boosting. In a class-switching ensemble, each learner is constructed using a modified version of the training data. This modification consists in switching the class labels of a fraction of training examples that are selected at random from the original training set. Experiments on 20 benchmark classification problems, including real-world and synthetic data, show that class-switching ensembles composed of neural networks can obtain significant improvements in the generalization accuracy over single neural networks and bagging and boosting ensembles. Furthermore, it is possible to build medium-sized ensembles (~200 networks) whose classification performance is comparable to larger class-switching ensembles (~1000 learners) of unpruned decision trees.


Systems, Man and Cybernetics | 2004

Using all data to generate decision tree ensembles

Gonzalo Martínez-Muñoz; Alberto Suárez

This paper develops a new method to generate ensembles of classifiers that uses all available data to construct every individual classifier. The base algorithm builds a decision tree in an iterative manner: The training data are divided into two subsets. In each iteration, one subset is used to grow the decision tree, starting from the decision tree produced by the previous iteration. This fully grown tree is then pruned by using the other subset. The roles of the data subsets are interchanged in every iteration. This process converges to a final tree that is stable with respect to the combined growing and pruning steps. To generate a variety of classifiers for the ensemble, we randomly create the subsets needed by the iterative tree construction algorithm. The method exhibits good performance in several standard datasets at low computational cost.
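The alternating grow/prune loop can be caricatured with a one-split stump in place of a full decision tree. Note the simplification: in the paper, growing continues from the previous iteration's tree, whereas here the grow step simply refits, so this only illustrates the role-swapping loop, and all names are hypothetical:

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def stump_error(stump, data):
    """Misclassifications of a (threshold, left_label, right_label) stump."""
    thr, yl, yr = stump
    return sum((yl if x <= thr else yr) != y for x, y in data)

def grow_stump(data):
    """Grow: choose the split threshold minimising error on the grow set."""
    best = None
    for thr, _ in data:
        left = [y for x, y in data if x <= thr]
        right = [y for x, y in data if x > thr]
        if not left or not right:
            continue
        cand = (thr, majority(left), majority(right))
        if best is None or stump_error(cand, data) < stump_error(best, data):
            best = cand
    return best

def prune_stump(stump, data):
    """Prune: collapse the split if, on the prune set, a single
    majority-class leaf does at least as well as the stump."""
    leaf = majority([y for _, y in data])
    if stump is None or stump_error(stump, data) >= sum(y != leaf for _, y in data):
        return (float('inf'), leaf, leaf)  # collapsed: always predicts leaf
    return stump

def iterative_grow_prune(subset_a, subset_b, max_iters=10):
    """Alternate growing on one subset and pruning on the other,
    swapping the roles each iteration, until the model stabilises."""
    stump = prune_stump(grow_stump(subset_a), subset_b)
    for _ in range(max_iters):
        subset_a, subset_b = subset_b, subset_a
        new = prune_stump(grow_stump(subset_a), subset_b)
        if new == stump:
            return stump
        stump = new
    return stump
```

Randomising how the data are split into the two subsets then yields the diversity needed for the ensemble, while every individual classifier still sees all of the data.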

Collaboration


Dive into Alberto Suárez's collaborations.

Top Co-Authors

Gonzalo Martínez-Muñoz
Autonomous University of Madrid

Daniel Hernández-Lobato
Autonomous University of Madrid

Rubén Ruiz-Torrubiano
Autonomous University of Madrid

Maryam Sabzevari
Autonomous University of Madrid

Pablo Morales-Mombiela
Autonomous University of Madrid

Sergio García-Moratilla
Autonomous University of Madrid

Aitor Sánchez-Martínez
Autonomous University of Madrid