Gonzalo Martínez-Muñoz
Autonomous University of Madrid
Publications
Featured research published by Gonzalo Martínez-Muñoz.
International Conference on Machine Learning | 2006
Gonzalo Martínez-Muñoz; Alberto Suárez
We present a novel ensemble pruning method based on reordering the classifiers obtained from bagging and then selecting a subset for aggregation. Ordering the classifiers generated in bagging makes it possible to build subensembles of increasing size by including first those classifiers that are expected to perform best when aggregated. Ensemble pruning is achieved by halting the aggregation process before all the classifiers generated are included in the ensemble. Pruned subensembles containing between 15% and 30% of the initial pool of classifiers, besides being smaller, improve the generalization performance of the full bagging ensemble in the classification problems investigated.
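A minimal sketch of the ordering-and-early-stopping idea described above, built on scikit-learn components. The greedy accuracy-based ordering criterion, the separate selection set, and all names are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: order a bagging ensemble greedily by the majority-vote accuracy
# of the growing subensemble on a selection set, then keep only the first
# fraction of classifiers. Assumes integer-coded class labels.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

def order_and_prune(X_train, y_train, X_sel, y_sel, n_estimators=100, keep_fraction=0.2):
    bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=n_estimators)
    bag.fit(X_train, y_train)
    preds = np.array([est.predict(X_sel) for est in bag.estimators_]).astype(int)  # (T, n_sel)
    remaining, ordered = list(range(n_estimators)), []
    while remaining:
        best, best_acc = None, -1.0
        for i in remaining:  # classifier whose addition maximizes subensemble accuracy
            votes = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds[ordered + [i]])
            acc = np.mean(votes == y_sel)
            if acc > best_acc:
                best, best_acc = i, acc
        ordered.append(best)
        remaining.remove(best)
    keep = ordered[: max(1, int(keep_fraction * n_estimators))]  # early stopping: keep the head
    return bag, keep
```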
Pattern Recognition Letters | 2007
Gonzalo Martínez-Muñoz; Alberto Suárez
Boosting is used to determine the order in which classifiers are aggregated in a bagging ensemble. Early stopping in the aggregation of the classifiers in the ordered bagging ensemble allows the identification of subensembles that require less memory for storage, classify faster and can improve the generalization accuracy of the original bagging ensemble. In all the classification problems investigated pruned ensembles with 20% of the original classifiers show statistically significant improvements over bagging. In problems where boosting is superior to bagging, these improvements are not sufficient to reach the accuracy of the corresponding boosting ensembles. However, ensemble pruning preserves the performance of bagging in noisy classification tasks, where boosting often has larger generalization errors. Therefore, pruned bagging should generally be preferred to complete bagging and, if no information about the level of noise is available, it is a robust alternative to AdaBoost.
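A hedged sketch of how AdaBoost-style reweighting can be used purely to order a fixed pool of bagging classifiers, as the abstract describes; no new classifiers are trained, and the weight-update details and names are assumptions for illustration.

```python
# Sketch: reorder an existing pool of classifiers with AdaBoost-style weights.
# pool_predictions has shape (T, N): training-set predictions of each classifier.
import numpy as np

def boosting_based_order(pool_predictions, y, eps=1e-10):
    T, N = pool_predictions.shape
    w = np.full(N, 1.0 / N)                       # example weights, as in AdaBoost
    remaining, order = list(range(T)), []
    while remaining:
        errors = [np.sum(w * (pool_predictions[i] != y)) for i in remaining]
        idx = int(np.argmin(errors))              # classifier with lowest weighted error
        j = remaining.pop(idx)
        err = float(np.clip(errors[idx], eps, 1 - eps))
        alpha = 0.5 * np.log((1 - err) / err)
        miss = pool_predictions[j] != y
        w *= np.where(miss, np.exp(alpha), np.exp(-alpha))   # up-weight mistakes
        w /= w.sum()
        order.append(j)
    return order    # aggregate in this order and stop early (e.g. at 20% of the pool)
```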
Pattern Recognition | 2005
Gonzalo Martínez-Muñoz; Alberto Suárez
Ensembles that combine the decisions of classifiers generated by using perturbed versions of the training set where the classes of the training examples are randomly switched can produce a significant error reduction, provided that large numbers of units and high class switching rates are used. The classifiers generated by this procedure have statistically uncorrelated errors in the training set. Hence, the ensembles they form exhibit a similar dependence of the training error on ensemble size, independently of the classification problem. In particular, for binary classification problems, the classification performance of the ensemble on the training data can be analysed in terms of a Bernoulli process. Experiments on several UCI datasets demonstrate the improvements in classification accuracy that can be obtained using these class-switching ensembles.
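A sketch of the Bernoulli argument the abstract alludes to, under the assumptions that each base learner fits its switched training set exactly and that each label is switched independently with probability $p < 1/2$: a single classifier then errs on a fixed training example (with respect to its original label) exactly when that label was switched, independently across the $T$ ensemble members, so the majority vote errs on it with probability

$$\Pr\big[\mathrm{Bin}(T, p) \ge T/2\big] \;=\; \sum_{k \ge T/2} \binom{T}{k}\, p^{k} (1-p)^{T-k},$$

which depends only on $p$ and $T$, not on the particular classification problem, and vanishes as $T$ grows.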
Computer Vision and Pattern Recognition | 2009
Gonzalo Martínez-Muñoz; Natalia Larios; Eric N. Mortensen; Wei Zhang; Asako Yamamuro; Robert Paasch; Nadia Payet; David A. Lytle; Linda G. Shapiro; Sinisa Todorovic; Andrew R. Moldenke; Thomas G. Dietterich
Current work in object categorization discriminates among objects that typically possess gross differences which are readily apparent. However, many applications require making much finer distinctions. We address an insect categorization problem that is so challenging that even trained human experts cannot readily categorize images of insects considered in this paper. The state of the art that uses visual dictionaries, when applied to this problem, yields mediocre results (16.1% error). Three possible explanations for this are (a) the dictionaries are unsupervised, (b) the dictionaries lose the detailed information contained in each keypoint, and (c) these methods rely on hand-engineered decisions about dictionary size. This paper presents a novel, dictionary-free methodology. A random forest of trees is first trained to predict the class of an image based on individual keypoint descriptors. A unique aspect of these trees is that they do not make decisions but instead merely record evidence, i.e., the number of descriptors from training examples of each category that reached each leaf of the tree. We provide a mathematical model showing that voting evidence is better than voting decisions. To categorize a new image, descriptors for all detected keypoints are “dropped” through the trees, and the evidence at each leaf is summed to obtain an overall evidence vector. This is then sent to a second-level classifier to make the categorization decision. We achieve excellent performance (6.4% error) on the 9-class STONEFLY9 data set. Also, our method achieves an average AUC of 0.921 on the PASCAL06 VOC, which places it fifth out of 21 methods reported in the literature and demonstrates that the method also works well for generic object categorization.
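A hedged sketch of the evidence-voting idea using scikit-learn components; the forest parameters, the second-level learner, and all names are illustrative assumptions rather than the paper's implementation. Class labels are assumed to be integers in [0, n_classes).

```python
# Sketch: a random forest is trained on individual keypoint descriptors; each
# leaf stores the per-class counts of training descriptors that reached it.
# A test image's descriptors accumulate those counts into an evidence vector.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_evidence_forest(desc_train, img_class_of_desc, n_classes, n_trees=100):
    forest = RandomForestClassifier(n_estimators=n_trees).fit(desc_train, img_class_of_desc)
    leaf_ids = forest.apply(desc_train)                    # (n_descriptors, n_trees) leaf indices
    leaf_hist = []                                         # per tree: leaf id -> class-count vector
    for t in range(n_trees):
        hist = {}
        for leaf, c in zip(leaf_ids[:, t], img_class_of_desc):
            hist.setdefault(leaf, np.zeros(n_classes))[c] += 1
        leaf_hist.append(hist)
    return forest, leaf_hist

def image_evidence(forest, leaf_hist, image_descriptors, n_classes):
    ev = np.zeros(n_classes)
    leaves = forest.apply(image_descriptors)               # drop every descriptor through every tree
    for t, hist in enumerate(leaf_hist):
        for leaf in leaves[:, t]:
            ev += hist.get(leaf, 0)                        # sum stored evidence at reached leaves
    return ev

# A second-level classifier (e.g. scikit-learn's LogisticRegression) is then
# fit on the per-image evidence vectors to make the final categorization.
```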
Journal of the North American Benthological Society | 2010
David A. Lytle; Gonzalo Martínez-Muñoz; Wei Zhang; Natalia Larios; Linda G. Shapiro; Robert Paasch; Andrew R. Moldenke; Eric N. Mortensen; Sinisa Todorovic; Thomas G. Dietterich
Abstract We present a visually based method for the taxonomic identification of benthic invertebrates that automates image capture, image processing, and specimen classification. The BugID system automatically positions and images specimens with minimal user input. Images are then processed with interest operators (machine-learning algorithms for locating informative visual regions) to identify informative pattern features, and this information is used to train a classifier algorithm. Naïve Bayes modeling of stacked decision trees is used to determine whether a specimen is an unknown distractor (taxon not in the training data set) or one of the species in the training set. When tested on images from 9 larval stonefly taxa, BugID correctly identified 94.5% of images, even though small or damaged specimens were included in testing. When distractor taxa (10 common invertebrates not present in the training set) were included to make classification more challenging, overall accuracy decreased but generally was close to 90%. At the equal error rate (EER), 89.5% of stonefly images were correctly classified and the accuracy of nonrejected stoneflies increased to 96.4%, a result suggesting that many difficult-to-identify or poorly imaged stonefly specimens had been rejected prior to classification. BugID is the first system of its kind that allows users to select thresholds for rejection depending on the required use. Rejected images of distractor taxa or difficult specimens can be identified later by a taxonomic expert, and new taxa ultimately can be incorporated into the training set of known taxa. BugID has several advantages over other automated insect classification systems, including automated handling of specimens, the ability to isolate nontarget and novel species, and the ability to identify specimens across different stages of larval development.
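A small hedged sketch of selecting a rejection threshold at the equal error rate mentioned in the abstract; the confidence scores and names are illustrative, not BugID internals.

```python
# Sketch: pick the confidence threshold at which the fraction of rejected known
# specimens equals the fraction of accepted distractors (the equal error rate).
import numpy as np

def eer_threshold(scores_known, scores_distractor):
    candidates = np.sort(np.concatenate([scores_known, scores_distractor]))
    best_t, best_gap = candidates[0], np.inf
    for t in candidates:
        false_reject = np.mean(scores_known < t)        # known taxa wrongly rejected
        false_accept = np.mean(scores_distractor >= t)  # distractors wrongly accepted
        gap = abs(false_reject - false_accept)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Specimens whose top-class confidence falls below the threshold are rejected
# and routed to a taxonomic expert, as described in the abstract.
```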
Neurocomputing | 2011
Daniel Hernández-Lobato; Gonzalo Martínez-Muñoz; Alberto Suárez
Identifying the optimal subset of regressors in a regression bagging ensemble is a difficult task that has exponential cost in the size of the ensemble. In this article we analyze two approximate techniques especially devised to address this problem. The first strategy constructs a relaxed version of the problem that can be solved using semidefinite programming (SDP). The second one is based on modifying the order of aggregation of the regressors. Ordered aggregation is a simple forward selection algorithm that incorporates at each step the regressor that reduces the training error of the current subensemble the most. Both techniques can be used to identify subensembles that are close to the optimal ones, which can be obtained by exhaustive search at a larger computational cost. Experiments on a wide variety of synthetic and real-world regression problems show that pruned ensembles composed of only 20% of the initial regressors often have better generalization performance than the original bagging ensembles. These improvements are due to a reduction in the bias and the covariance components of the generalization error. Subensembles obtained using either SDP or ordered aggregation generally outperform subensembles obtained by other ensemble pruning methods and ensembles generated by the AdaBoost.R2 algorithm, negative correlation learning or regularized linear stacked generalization. Ordered aggregation has a slightly better overall performance than SDP in the problems investigated. However, the difference is not statistically significant. Ordered aggregation has the further advantage that it produces a nested sequence of near-optimal subensembles of increasing size with no additional computational cost.
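A minimal sketch of ordered aggregation for regression as described above: forward selection of the regressor that most reduces the training MSE of the averaged subensemble, followed by keeping a fixed fraction. The 20% cutoff, the base learner and all names are illustrative assumptions.

```python
# Sketch: ordered aggregation for a regression bagging ensemble.
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

def ordered_aggregation(X, y, n_estimators=100, keep_fraction=0.2):
    y = np.asarray(y, dtype=float)
    bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=n_estimators).fit(X, y)
    preds = np.array([est.predict(X) for est in bag.estimators_])   # (T, N)
    remaining, order = list(range(n_estimators)), []
    running_sum = np.zeros(len(y))
    while remaining:
        # include the regressor that minimizes the averaged subensemble's training MSE
        mses = [np.mean(((running_sum + preds[i]) / (len(order) + 1) - y) ** 2) for i in remaining]
        j = remaining.pop(int(np.argmin(mses)))
        running_sum += preds[j]
        order.append(j)
    keep = order[: max(1, int(keep_fraction * n_estimators))]       # nested subensemble of fixed size
    return [bag.estimators_[i] for i in keep]
```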
Neurocomputing | 2008
Gonzalo Martínez-Muñoz; Aitor Sánchez-Martínez; Daniel Hernández-Lobato; Alberto Suárez
This article investigates the properties of class-switching ensembles composed of neural networks and compares them to class-switching ensembles of decision trees and to standard ensemble learning methods, such as bagging and boosting. In a class-switching ensemble, each learner is constructed using a modified version of the training data. This modification consists of switching the class labels of a fraction of training examples that are selected at random from the original training set. Experiments on 20 benchmark classification problems, including real-world and synthetic data, show that class-switching ensembles composed of neural networks can obtain significant improvements in the generalization accuracy over single neural networks and bagging and boosting ensembles. Furthermore, it is possible to build medium-sized ensembles (~200 networks) whose classification performance is comparable to larger class-switching ensembles (~1000 learners) of unpruned decision trees.
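A hedged sketch of the class-switching perturbation with multilayer perceptrons as base learners. The switching scheme shown (each example switched with a fixed probability to a uniformly chosen different class) is a simplification, and all parameters and names are illustrative.

```python
# Sketch: train each network on its own label-switched copy of the training set
# and combine them by majority vote. Assumes y is an integer-coded label array.
import numpy as np
from sklearn.neural_network import MLPClassifier

def class_switching_ensemble(X, y, n_learners=200, switch_rate=0.3, random_state=0):
    rng = np.random.default_rng(random_state)
    classes = np.unique(y)
    ensemble = []
    for _ in range(n_learners):
        y_sw = y.copy()
        flip = rng.random(len(y)) < switch_rate                 # examples whose label is switched
        for i in np.flatnonzero(flip):
            y_sw[i] = rng.choice(classes[classes != y[i]])      # switch to a different class at random
        ensemble.append(MLPClassifier(max_iter=500).fit(X, y_sw))
    return ensemble

def majority_vote(ensemble, X):
    votes = np.array([m.predict(X) for m in ensemble]).astype(int)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```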
Systems, Man and Cybernetics | 2004
Gonzalo Martínez-Muñoz; Alberto Suárez
This paper develops a new method to generate ensembles of classifiers that uses all available data to construct every individual classifier. The base algorithm builds a decision tree in an iterative manner: The training data are divided into two subsets. In each iteration, one subset is used to grow the decision tree, starting from the decision tree produced by the previous iteration. This fully grown tree is then pruned by using the other subset. The roles of the data subsets are interchanged in every iteration. This process converges to a final tree that is stable with respect to the combined growing and pruning steps. To generate a variety of classifiers for the ensemble, we randomly create the subsets needed by the iterative tree construction algorithm. The method exhibits good performance in several standard datasets at low computational cost.
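The iterative grow-on-one-subset, prune-with-the-other procedure cannot be expressed directly with scikit-learn, which does not resume growing a fitted tree; the hedged sketch below approximates a single grow/prune round per random split, with ensemble diversity coming from the random partitions, as in the paper. All parameters and names are illustrative.

```python
# Sketch: grow an unpruned tree on one half of the data and use the other half
# to choose the cost-complexity pruning level; repeat over random splits.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def grow_and_prune_member(X, y, seed):
    X_grow, X_prune, y_grow, y_prune = train_test_split(X, y, test_size=0.5, random_state=seed)
    # pruning levels of the tree grown on the growing subset
    path = DecisionTreeClassifier(random_state=seed).cost_complexity_pruning_path(X_grow, y_grow)
    # keep the pruning level that performs best on the held-out pruning subset
    return max(
        (DecisionTreeClassifier(random_state=seed, ccp_alpha=a).fit(X_grow, y_grow)
         for a in path.ccp_alphas),
        key=lambda t: t.score(X_prune, y_prune),
    )

def build_ensemble(X, y, n_trees=50):
    return [grow_and_prune_member(X, y, seed) for seed in range(n_trees)]
```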
International Joint Conference on Neural Networks | 2006
Daniel Hernández-Lobato; Gonzalo Martínez-Muñoz; Alberto Suárez
An efficient procedure for pruning regression ensembles is introduced. Starting from a bagging ensemble, pruning proceeds by ordering the regressors in the original ensemble and then selecting a subset for aggregation. Ensembles of increasing size are built by including first the regressors that perform best when aggregated. This strategy gives an approximate solution to the problem of extracting from the original ensemble the minimum error subensemble, which we prove to be NP-hard. Experiments show that pruned ensembles with only 20% of the initial regressors achieve better generalization accuracies than the complete bagging ensembles. The performance of pruned ensembles is analyzed by means of the bias-variance decomposition of the error.
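For reference, one standard form of the bias-variance-covariance decomposition for an averaging ensemble of $M$ regressors, $\bar f(x) = \frac{1}{M}\sum_m f_m(x)$, with expectations taken over training sets (the notation here is generic, not the paper's):

$$\mathbb{E}\big[(\bar f(x) - y)^2\big] \;=\; \overline{\mathrm{bias}}^{\,2} \;+\; \frac{1}{M}\,\overline{\mathrm{var}} \;+\; \Big(1 - \frac{1}{M}\Big)\,\overline{\mathrm{covar}},$$

where $\overline{\mathrm{bias}} = \frac{1}{M}\sum_m \big(\mathbb{E}[f_m(x)] - y\big)$, $\overline{\mathrm{var}} = \frac{1}{M}\sum_m \operatorname{Var}[f_m(x)]$, and $\overline{\mathrm{covar}} = \frac{1}{M(M-1)}\sum_{m \neq m'} \operatorname{Cov}\big[f_m(x), f_{m'}(x)\big]$. The variance term is suppressed by the $1/M$ factor, so improvements from pruning are typically driven by reductions in the bias and covariance terms, of the kind reported above.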
International Journal of Nanomedicine | 2012
V. Torres-Costa; Gonzalo Martínez-Muñoz; Vanessa Sánchez-Vaquero; Álvaro Muñoz-Noval; Laura González-Méndez; E. Punzón-Quijorna; Darío Gallach-Pérez; M. Manso-Silván; A. Climent-Font; Josefa P. García-Ruiz; Raúl J. Martín-Palma
The engineering of surface patterns is a powerful tool for analyzing cellular communication factors involved in the processes of adhesion, migration, and expansion, which can have a notable impact on therapeutic applications including tissue engineering. In this regard, the main objective of this research was to fabricate patterned and textured surfaces at micron- and nanoscale levels, respectively, with very different chemical and topographic characteristics to control cell–substrate interactions. For this task, one-dimensional (1-D) and two-dimensional (2-D) patterns combining silicon and nanostructured porous silicon were engineered by ion beam irradiation and subsequent electrochemical etch. The experimental results show that under the influence of chemical and morphological stimuli, human mesenchymal stem cells polarize and move directionally toward or away from the particular stimulus. Furthermore, a computational model was developed aiming at understanding cell behavior by reproducing the surface distribution and migration of human mesenchymal stem cells observed experimentally.