Mikel Galar

Universidad Pública de Navarra

Publication

Featured research published by Mikel Galar.

Systems, Man, and Cybernetics | 2012

A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches

Mikel Galar; Alberto Fernández; Edurne Barrenechea; Humberto Bustince; Francisco Herrera

Learning classifiers from data-sets with imbalanced class distributions is a challenging problem in the data mining community. The issue arises when the number of examples representing one class is much lower than that of the other classes. Its presence in many real-world applications has drawn growing attention from researchers. In machine learning, ensembles of classifiers are known to increase the accuracy of single classifiers by combining several of them, but none of these learning techniques alone solves the class imbalance problem; to deal with this issue, ensemble learning algorithms have to be designed specifically. In this paper, our aim is to review the state of the art on ensemble techniques in the framework of imbalanced data-sets, with a focus on two-class problems. We propose a taxonomy for ensemble-based methods that address class imbalance, in which each proposal is categorized according to the inner ensemble methodology on which it is based. In addition, we develop a thorough empirical comparison of the most significant published approaches within the families of the proposed taxonomy, to show whether any of them makes a difference. This comparison has shown the good behavior of the simplest approaches, which combine random undersampling techniques with bagging or boosting ensembles. In addition, the positive synergy between sampling techniques and bagging has stood out. Furthermore, our results show empirically that ensemble-based algorithms are worthwhile, since they outperform the mere use of preprocessing techniques before learning the classifier, thereby justifying the increase in complexity by means of a significant enhancement of the results.
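The combination of random undersampling with bagging mentioned in this abstract can be illustrated with a minimal sketch. It is not the exact procedure evaluated in the paper: the function names `undersampled_bags` and `majority_vote` are hypothetical, the problem is assumed to have two classes, and training the base classifiers is left out.

```python
import random
from collections import Counter


def undersampled_bags(labels, n_bags, seed=0):
    """Return n_bags lists of example indices, each balanced by random undersampling.

    Assumes a two-class problem; the minority class is the less frequent label.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, y in enumerate(labels) if y == minority]
    maj_idx = [i for i, y in enumerate(labels) if y != minority]
    bags = []
    for _ in range(n_bags):
        # Bootstrap the minority class (sampling with replacement) ...
        bag = [rng.choice(min_idx) for _ in min_idx]
        # ... and draw an equally sized random subset of the majority class.
        bag += rng.sample(maj_idx, len(min_idx))
        bags.append(bag)
    return bags


def majority_vote(predictions):
    """Combine the base classifiers' predicted labels by simple voting."""
    return Counter(predictions).most_common(1)[0][0]
```

Each bag would be used to train one base classifier, and `majority_vote` would combine their predictions for a new instance. Because every bag sees a different random subset of the majority class, the ensemble as a whole uses far more majority examples than any single undersampled training set.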

Pattern Recognition | 2013

EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling

Mikel Galar; Alberto Fernández; Edurne Barrenechea; Francisco Herrera

Classification with imbalanced data-sets has become one of the most challenging problems in Data Mining. Having one class much more represented than the other produces undesirable effects in both the learning and classification processes, mainly regarding the minority class. Such a problem needs accurate tools to be tackled; lately, ensembles of classifiers have emerged as a possible solution. Among ensemble proposals, the combination of Bagging and Boosting with preprocessing techniques has proved its ability to enhance the classification of the minority class. In this paper, we develop a new ensemble construction algorithm (EUSBoost) based on RUSBoost, one of the simplest and most accurate ensembles, which combines random undersampling with the Boosting algorithm. Our methodology aims to improve the existing proposals by enhancing the performance of the base classifiers through the use of the evolutionary undersampling approach. Besides, we promote diversity by favoring the use of different subsets of majority class instances to train each base classifier. Focusing on two-class highly imbalanced problems, we will prove, supported by the proper statistical analysis, that EUSBoost is able to outperform the state-of-the-art methods based on ensembles. We will also analyze its advantages using kappa-error diagrams, which we adapt to the imbalanced scenario.

Knowledge Based Systems | 2013

Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches

Alberto Fernández; Victoria López; Mikel Galar; María José del Jesús; Francisco Herrera

The imbalanced class problem is related to the real-world application of classification in engineering. It is characterised by a very different distribution of examples among the classes. The condition of multiple imbalanced classes is more restrictive when the aim of the final system is to obtain the most accurate precision for each of the concepts of the problem. The goal of this work is to provide a thorough experimental analysis that will allow us to determine the behaviour of the different approaches proposed in the specialised literature. First, we will make use of binarization schemes, i.e., one versus one and one versus all, in order to apply the standard approaches to solving binary class imbalanced problems. Second, we will apply several ad hoc procedures which have been designed for the scenario of imbalanced data-sets with multiple classes. This experimental study will include several well-known algorithms from the literature such as decision trees, support vector machines and instance-based learning, with the intention of obtaining global conclusions from different classification paradigms. The extracted findings will be supported by a statistical comparative analysis using more than 20 data-sets from the KEEL repository.

IEEE Transactions on Fuzzy Systems | 2013

A New Approach to Interval-Valued Choquet Integrals and the Problem of Ordering in Interval-Valued Fuzzy Set Applications

Humberto Bustince; Mikel Galar; Benjamín R. C. Bedregal; Anna Kolesárová; Radko Mesiar

We consider the problem of choosing a total order between intervals in multiexpert decision making problems. To do so, we first start researching the additivity of interval-valued aggregation functions. Next, we briefly treat the problem of preserving admissible orders by linear transformations. We study the construction of interval-valued ordered weighted aggregation operators by means of admissible orders and discuss their properties. In this setting, we present the definition of an interval-valued Choquet integral with respect to an admissible order based on an admissible pair of aggregation functions. The importance of the definition of the Choquet integral, which is introduced by us here, lies in the fact that if the considered data are pointwise (i.e., if they are not proper intervals), then it recovers the classical concept of this aggregation. Next, we show that if we make use of intervals in multiexpert decision making problems, then the solution at which we arrive may depend on the total order between intervals that has been chosen. For this reason, we conclude the paper by proposing two new algorithms such that the second one allows us, by means of the Shapley value, to pick up the best alternative from a set of winning alternatives provided by the first algorithm.
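For reference, the classical (real-valued) discrete Choquet integral, which according to the abstract is recovered when the data are pointwise, can be sketched as follows. This is only the standard real-valued case, not the interval-valued construction of the paper; the function name `choquet` and the representation of the fuzzy measure as a dictionary keyed by `frozenset` are assumptions made for illustration.

```python
def choquet(x, measure):
    """Discrete Choquet integral of non-negative inputs x w.r.t. a fuzzy measure.

    `measure` maps frozensets of input indices to values in [0, 1]; it is
    assumed to be monotone, with the full index set mapped to 1.
    """
    # Visit the inputs in increasing order of value.
    order = sorted(range(len(x)), key=lambda i: x[i])
    total, previous = 0.0, 0.0
    for k, i in enumerate(order):
        # Indices whose value is at least x[i] (ties follow the sort order).
        coalition = frozenset(order[k:])
        total += (x[i] - previous) * measure[coalition]
        previous = x[i]
    return total
```

With an additive measure the result reduces to a weighted arithmetic mean: for example, weights 0.3 and 0.7 on two inputs give `0.3 * x[0] + 0.7 * x[1]`.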

Applied Soft Computing | 2014

Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system

José Antonio Sanz; Mikel Galar; Aranzazu Jurio; Antonio Brugos; Miguel Pagola; Humberto Bustince

This work was partially supported by the Spanish Ministry of Science and Technology under project TIN2010-15055 and the Research Services of the Universidad Publica de Navarra.

Knowledge and Information Systems | 2014

Analyzing the presence of noise in multi-class problems: alleviating its influence with the One-vs-One decomposition

José A. Sáez; Mikel Galar; Julián Luengo; Francisco Herrera

The presence of noise in data is a common problem that produces several negative consequences in classification problems. In multi-class problems, these consequences are aggravated in terms of accuracy, building time, and complexity of the classifiers. In these cases, an interesting approach to reduce the effect of noise is to decompose the problem into several binary subproblems, reducing the complexity and, consequently, dividing the effects caused by noise into each of these subproblems. This paper analyzes the usage of decomposition strategies, and more specifically the One-vs-One scheme, to deal with noisy multi-class datasets. In order to investigate whether the decomposition is able to reduce the effect of noise or not, a large number of datasets are created introducing different levels and types of noise, as suggested in the literature. Several well-known classification algorithms, with or without decomposition, are trained on them in order to check when decomposition is advantageous. The results obtained show that methods using the One-vs-One strategy lead to better performances and more robust classifiers when dealing with noisy data, especially with the most disruptive noise schemes.

IEEE Transactions on Image Processing | 2011

Interval-Valued Fuzzy Sets Applied to Stereo Matching of Color Images

Mikel Galar; Javier Fernandez; Gleb Beliakov; Humberto Bustince

The stereo matching problem consists of finding corresponding locations between pairs of displaced images of the same scene. Correspondence estimation between pixels suffers from occlusions, noise, and bias. This paper introduces a novel approach to represent images by means of interval-valued fuzzy sets. These sets allow one to overcome the uncertainty due to the aforementioned problems. The aim is to take advantage of the new representation to develop a stereo matching algorithm. The interval-valued fuzzification process for images that is proposed here is based on image segmentation. Interval-valued fuzzy similarities are introduced to compare windows whose pixels are represented by intervals. To make use of color information, the similarities of the RGB channels are aggregated using the luminance formula. The experimental analysis compares the proposal with other methods. The proposed representation, together with the new similarity measure, shows a better overall behavior, providing more accurate correspondences, mainly near depth discontinuities and for images with a large amount of color.

Information Sciences | 2015

A survey on fingerprint minutiae-based local matching for verification and identification

Daniel Peralta; Mikel Galar; Isaac Triguero; Daniel Paternain; Salvador García; Edurne Barrenechea; José Manuel Benítez; Humberto Bustince; Francisco Herrera

A background and exhaustive survey on fingerprint matching methods in the literature is presented. A taxonomy of fingerprint minutiae-based methods is proposed. An extensive experimental study shows the performance of the state-of-the-art.

Fingerprint recognition has found a reliable application for verification or identification of people in biometrics. Globally, fingerprints can be viewed as valuable traits due to several perceptions observed by the experts, such as the distinctiveness and the permanence on humans and the performance in real applications. Among the main stages of fingerprint recognition, the automated matching phase has received much attention from the early years up to the present. This paper is devoted to reviewing and categorizing the vast number of fingerprint matching methods proposed in the specialized literature. In particular, we focus on local minutiae-based matching algorithms, which provide good performance with an excellent trade-off between efficacy and efficiency. We identify the main properties and differences of existing methods. Then, we include an experimental evaluation involving the most representative local minutiae-based matching models in both verification and evaluation tasks. The results obtained will be discussed in detail, supporting the description of future directions.

IEEE Transactions on Fuzzy Systems | 2015

Enhancing Multiclass Classification in FARC-HD Fuzzy Classifier: On the Synergy Between

Mikel Elkano; Mikel Galar; José Antonio Sanz; Alberto Fernández; Edurne Barrenechea; Francisco Herrera; Humberto Bustince

There are many real-world classification problems involving multiple classes, e.g., in bioinformatics, computer vision, or medicine. These problems are generally more difficult than their binary counterparts. In this scenario, decomposition strategies usually improve the performance of classifiers. Hence, in this paper, we aim to improve the behavior of the fuzzy association rule-based classification model for high-dimensional problems (FARC-HD) fuzzy classifier in multiclass classification problems using decomposition strategies, and more specifically the One-versus-One (OVO) and One-versus-All (OVA) strategies. However, when these strategies are applied to FARC-HD, a problem emerges due to the low confidence values provided by the fuzzy reasoning method. This undesirable condition comes from the application of the product t-norm when computing the matching and association degrees, which yields low values that also depend on the number of antecedents of the fuzzy rules. As a result, robust aggregation strategies in OVO, such as weighted voting, obtain poor results with this fuzzy classifier. In order to solve these problems, we propose to adapt the inference system of FARC-HD by replacing the product t-norm with overlap functions. To do so, we define n-dimensional overlap functions. The usage of these new functions allows one to obtain more adequate outputs from the base classifiers for the subsequent aggregation in OVO and OVA schemes. Furthermore, we propose a new aggregation strategy for OVO to deal with the problem of the weighted voting derived from the inappropriate confidences provided by FARC-HD for this aggregation method. The quality of our new approach is analyzed using 20 datasets and the conclusions are supported by a proper statistical analysis. In order to check the usefulness of our proposal, we carry out a comparison against some of the state-of-the-art fuzzy classifiers. Experimental results show the competitiveness of our method.
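The dependence of the product t-norm on the number of antecedents can be seen in a small numerical sketch. The geometric mean is used here as one example of an n-dimensional function satisfying the overlap conditions; it is chosen for illustration and is not necessarily the function the paper favors. The function names are hypothetical.

```python
import math


def product_tnorm(degrees):
    """Matching degree of a rule as the product of its antecedent memberships."""
    return math.prod(degrees)


def geometric_mean_overlap(degrees):
    """Geometric mean: 0 if any membership is 0, 1 only if all memberships are 1."""
    return math.prod(degrees) ** (1.0 / len(degrees))


# A rule whose antecedents all match with degree 0.8:
for n in (1, 2, 4):
    degrees = [0.8] * n
    print(n, product_tnorm(degrees), geometric_mean_overlap(degrees))
```

With the product, the matching degree shrinks as antecedents are added even though every antecedent matches equally well; the geometric mean stays at 0.8 regardless of rule length, which makes confidences from rules of different lengths comparable before aggregation.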

Pattern Recognition | 2015

n

Mikel Galar; Alberto Fernández; Edurne Barrenechea; Francisco Herrera

One-vs-One strategy is a common and established technique in Machine Learning to deal with multi-class classification problems. It consists of dividing the original multi-class problem into easier-to-solve binary subproblems considering each possible pair of classes. Since several classifiers are learned, their combination becomes crucial in order to predict the class of new instances. Due to the division procedure a series of difficulties emerge at this stage, such as the non-competence problem. Each classifier is learned using only the instances of its corresponding pair of classes, and hence, it is not competent to classify instances belonging to the rest of the classes; nevertheless, at classification time all the outputs of the classifiers are taken into account because the competence cannot be known a priori (the classification problem would be solved). On this account, we develop a distance-based combination strategy, which weights the competence of the outputs of the base classifiers depending on the closeness of the query instance to each one of the classes. Our aim is to reduce the effect of the non-competent classifiers, enhancing the results obtained by the state-of-the-art combinations for One-vs-One strategy. We carry out a thorough experimental study, supported by the proper statistical analysis, showing that the results obtained by the proposed method outperform, both in terms of accuracy and kappa measures, the previous combinations for One-vs-One strategy.

Highlights

The non-competence is an important problem in One-vs-One strategy.
We develop a distance-based combination strategy, based on Dynamic Classifier Weighting strategies.
Weights are settled depending on the closeness of the test instance to each one of the classes.
The effect of the non-competent classifiers is reduced.
The new strategy enhances the results obtained w.r.t. the state-of-the-art aggregations.
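The idea of weighting One-vs-One outputs by the closeness of the query instance to each class can be sketched as below. The weighting formula, the function name `weighted_ovo_predict`, and the input layout are assumptions made for illustration and should not be read as the exact method of the paper: `score[i][j]` is taken to be the confidence of the (i, j) classifier in favor of class i, and `dist[i]` a strictly positive distance from the query to class i (for instance, an average distance to its nearest neighbors in that class).

```python
def weighted_ovo_predict(score, dist):
    """Return the index of the class with the largest distance-weighted vote.

    score[i][j]: confidence for class i given by the classifier of pair (i, j).
    dist[i]: positive distance from the query instance to class i.
    """
    n = len(score)
    totals = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            # The vote for class i counts more when the query is closer to
            # class i than to class j (assumed relative-closeness weight).
            weight = dist[j] ** 2 / (dist[i] ** 2 + dist[j] ** 2)
            total += weight * score[i][j]
        totals.append(total)
    return max(range(n), key=totals.__getitem__)


score = [[0.0, 0.6, 0.4],
         [0.4, 0.0, 0.9],
         [0.6, 0.1, 0.0]]
print(weighted_ovo_predict(score, [1.0, 1.0, 1.0]))  # equal distances: plain weighted voting
print(weighted_ovo_predict(score, [1.0, 5.0, 1.2]))  # query far from class 1
```

In the second call the strong output of the classifier for classes 1 and 2 is heavily discounted, because the query lies far from class 1 and that output is therefore likely to come from a non-competent classifier.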
