Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Salvador García is active.

Publication


Featured research published by Salvador García.


Information Sciences | 2015

A survey on fingerprint minutiae-based local matching for verification and identification

Daniel Peralta; Mikel Galar; Isaac Triguero; Daniel Paternain; Salvador García; Edurne Barrenechea; José Manuel Benítez; Humberto Bustince; Francisco Herrera

Highlights: A background and exhaustive survey of fingerprint matching methods in the literature is presented. A taxonomy of fingerprint minutiae-based methods is proposed. An extensive experimental study shows the performance of the state of the art.

Fingerprint recognition is a reliable means of verifying or identifying people in biometrics. Fingerprints are widely regarded as valuable traits because of several properties observed by experts, such as their distinctiveness, their permanence on humans and their performance in real applications. Among the main stages of fingerprint recognition, the automated matching phase has received much attention from the early years up to the present. This paper reviews and categorizes the vast number of fingerprint matching methods proposed in the specialized literature. In particular, we focus on local minutiae-based matching algorithms, which provide good performance with an excellent trade-off between efficacy and efficiency. We identify the main properties and differences of existing methods. We then include an experimental evaluation involving the most representative local minutiae-based matching models in both verification and identification tasks. The results obtained are discussed in detail, supporting the description of future directions.
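
To give a flavor of the kind of local minutiae-based matching the survey covers, the Python sketch below builds a simple local descriptor for each minutia from its nearest neighbours and scores two fingerprints by counting compatible descriptors. It is a generic illustration, not any specific method from the paper; the function names, the k value and the tolerances are hypothetical choices.

# Illustrative sketch of local minutiae-based matching (not a method from the survey).
# A minutia is (x, y, theta); its local descriptor encodes distances and relative
# angles to its k nearest neighbouring minutiae, which makes it tolerant to global shifts.
import numpy as np

def local_descriptors(minutiae, k=3):
    """minutiae: array of shape (n, 3) with columns x, y, theta (radians)."""
    xy, theta = minutiae[:, :2], minutiae[:, 2]
    descs = []
    for i in range(len(minutiae)):
        d = np.linalg.norm(xy - xy[i], axis=1)
        neigh = np.argsort(d)[1:k + 1]                   # k nearest neighbours (skip itself)
        rel_dist = d[neigh]
        rel_ang = (theta[neigh] - theta[i]) % (2 * np.pi)
        descs.append(np.concatenate([rel_dist, rel_ang]))
    return np.array(descs)

def match_score(minutiae_a, minutiae_b, dist_tol=15.0, ang_tol=0.35, k=3):
    """Count minutiae in A whose local structure has a compatible counterpart in B."""
    da, db = local_descriptors(minutiae_a, k), local_descriptors(minutiae_b, k)
    matches = 0
    for a in da:
        dist_ok = np.all(np.abs(db[:, :k] - a[:k]) < dist_tol, axis=1)
        ang_diff = np.abs((db[:, k:] - a[k:] + np.pi) % (2 * np.pi) - np.pi)
        ang_ok = np.all(ang_diff < ang_tol, axis=1)
        matches += np.any(dist_ok & ang_ok)
    return matches / max(len(da), len(db))              # normalised similarity in [0, 1]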


Knowledge Based Systems | 2016

Tutorial on practical tips of the most influential data preprocessing algorithms in data mining

Salvador García; Julián Luengo; Francisco Herrera

Data preprocessing is a major and essential stage whose main goal is to obtain final data sets that can be considered correct and useful for subsequent data mining algorithms. This paper summarizes the most influential data preprocessing algorithms according to their usage, popularity and the extensions proposed in the specialized literature. For each algorithm, we provide a description, a discussion of its impact, and a review of current and future research on it. These influential algorithms cover missing values imputation, noise filtering, dimensionality reduction (including feature selection and space transformations), instance reduction (including selection and generation), discretization and the treatment of imbalanced data. Together they constitute some of the most important topics in data preprocessing research and development. The paper emphasizes the best-known preprocessing methods and their practical study, selected on the basis of a recent, general book on data preprocessing that does not examine them in depth. It also presents an illustrative study, in two sections with different data sets, that provides useful tips for the use of preprocessing algorithms. First, we graphically show the effects of the preprocessing methods on two benchmark data sets, from which the reader may draw useful insights into the different characteristics and outcomes they generate. Second, we use a real-world problem presented in the ECBDL'2014 Big Data competition to provide a thorough analysis of the application of some preprocessing techniques, their combination and their performance. As a result, five different cases are analyzed, providing tips that may be useful for readers.
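
To make the families of preprocessing steps discussed in the tutorial concrete, the following scikit-learn sketch chains missing-value imputation, feature selection and discretization before a classifier. It is a minimal illustration of a preprocessing pipeline under the assumption that the data have at least ten numeric features; the chosen components, estimators and parameters are arbitrary and are not the specific algorithms analysed in the paper.

# Minimal preprocessing pipeline: imputation -> feature selection -> discretization
# -> classifier. Components and parameters are illustrative, not those from the paper.
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),          # missing values imputation
    ("select", SelectKBest(mutual_info_classif, k=10)),    # feature selection (assumes >= 10 features)
    ("discretize", KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")),
    ("model", DecisionTreeClassifier(random_state=0)),
])

# X, y would be the raw data set; the preprocessing is fit only on the training folds.
# scores = cross_val_score(pipeline, X, y, cv=5)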


Information Sciences | 2016

Evolutionary fuzzy k-nearest neighbors algorithm using interval-valued fuzzy sets

Joaquín Derrac; Francisco Chiclana; Salvador García; Francisco Herrera

Highlights: EF-kNN-IVFS, a new fuzzy nearest neighbor classification algorithm based on interval-valued fuzzy sets and evolutionary algorithms, is presented. Interval-valued fuzzy sets provide a way of representing several configurations for the parameters of fuzzy-kNN. Those configurations are set up adaptively: an evolutionary method (CHC) searches for the best possible configuration according to the available training data. An extensive experimental study demonstrates the good behavior of EF-kNN-IVFS compared with other state-of-the-art algorithms.

One of the best-known and most effective methods in supervised classification is the k-nearest neighbors classifier. Several approaches have been proposed to enhance its precision, the fuzzy k-nearest neighbors (fuzzy-kNN) classifier being among the most successful. However, despite its good behavior, fuzzy-kNN lacks a method for properly defining several mechanisms regarding the representation of the relationship between the instances and the classes of the classification problem. Such a method would be very desirable, since it could lead to an improvement in the precision of the classifier. In this work we present a new approach, the evolutionary fuzzy k-nearest neighbors classifier using interval-valued fuzzy sets (EF-kNN-IVFS), which incorporates interval-valued fuzzy sets to compute the memberships of training instances in fuzzy-kNN. It is based on representing multiple choices of two key parameters of fuzzy-kNN: one is applied in the definition of the membership function, and the other is used in the computation of the voting rule. In addition, evolutionary search techniques are incorporated into the model as a self-optimization procedure for setting these parameters. An experimental study has been carried out to assess the capabilities of our approach. The study has been validated using nonparametric statistical tests and highlights the strong performance of EF-kNN-IVFS compared with several state-of-the-art techniques in fuzzy nearest neighbor classification.
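
For readers unfamiliar with the baseline that EF-kNN-IVFS builds on, the sketch below implements the classical fuzzy-kNN idea: fuzzy class memberships for the training instances followed by distance-weighted fuzzy voting. It does not include the interval-valued memberships or the CHC evolutionary search described in the paper; k and m (the two parameters the paper optimizes) are given arbitrary example values, and class labels are assumed to be integers 0..n_classes-1.

# Sketch of classical fuzzy-kNN, the baseline extended by EF-kNN-IVFS.
# The interval-valued memberships and CHC-based parameter search are NOT reproduced.
import numpy as np

def fuzzy_memberships(X_train, y_train, n_classes, k=3):
    """Assign each training instance a fuzzy membership to every class, based on the
    class labels of its k nearest training neighbours (labels are ints 0..n_classes-1)."""
    n = len(X_train)
    U = np.zeros((n, n_classes))
    for i in range(n):
        d = np.linalg.norm(X_train - X_train[i], axis=1)
        neigh = np.argsort(d)[1:k + 1]
        counts = np.bincount(y_train[neigh], minlength=n_classes) / k
        U[i] = 0.49 * counts
        U[i, y_train[i]] += 0.51             # an instance mostly belongs to its own class
    return U

def fuzzy_knn_predict(X_train, U, X_test, k=3, m=2.0):
    """Fuzzy voting: neighbour memberships weighted by inverse distance**(2/(m-1))."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        neigh = np.argsort(d)[:k]
        w = 1.0 / np.maximum(d[neigh], 1e-12) ** (2.0 / (m - 1.0))
        votes = (U[neigh] * w[:, None]).sum(axis=0) / w.sum()
        preds.append(int(np.argmax(votes)))
    return np.array(preds)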


Knowledge Based Systems | 2015

A survey of fingerprint classification Part II: experimental analysis and ensemble proposal

Mikel Galar; Joaquín Derrac; Daniel Peralta; Isaac Triguero; Daniel Paternain; Carlos Lopez-Molina; Salvador García; José Manuel Benítez; Miguel Pagola; Edurne Barrenechea; Humberto Bustince; Francisco Herrera

In the first part of this paper we reviewed the fingerprint classification literature from two different perspectives: feature extraction and classifier learning. Aiming to answer which of the reviewed methods would perform better in a real implementation, we ended up with a discussion that showed the difficulty of answering this question: no previous comparison exists in the literature, and comparisons across papers are made with different experimental frameworks. Moreover, published methods are difficult to implement because of the lack of detail in their descriptions and parameters and the fact that no source code is shared. For this reason, in this paper we carry out a deep experimental study following the proposed double perspective. To do so, we have carefully implemented some of the most relevant feature extraction methods according to the explanations found in the corresponding papers, and we have tested their performance with different classifiers, including the specific proposals made by the authors. Our aim is to develop an objective experimental study in a common framework, which has not been done before and which can serve as a baseline for future work on the topic. In this way, we not only test their quality but also their reusability by other researchers, and we are able to indicate which proposals could be considered for future developments. Furthermore, we show that combining different feature extraction models in an ensemble can lead to superior performance, significantly improving the results obtained by individual models.
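
The ensemble idea tested in the paper, training one classifier per feature extraction method and fusing their outputs, can be sketched roughly as follows. The extractors here are placeholders rather than the reviewed fingerprint feature extraction methods, the base classifier and the majority-vote rule are arbitrary choices, and class labels are assumed to be non-negative integers.

# Rough sketch of combining several feature extraction methods in an ensemble:
# one classifier is trained per feature representation and predictions are fused
# by majority vote. The extractors are placeholders, not the reviewed methods.
import numpy as np
from sklearn.svm import SVC

class FeatureEnsemble:
    def __init__(self, extractors, base_classifier=lambda: SVC(kernel="rbf")):
        self.extractors = extractors          # callables: raw sample -> feature vector
        self.make_clf = base_classifier
        self.members = []

    def fit(self, raw_samples, y):
        self.members = []
        for extract in self.extractors:
            X = np.array([extract(s) for s in raw_samples])
            clf = self.make_clf().fit(X, y)
            self.members.append((extract, clf))
        return self

    def predict(self, raw_samples):
        votes = []
        for extract, clf in self.members:
            X = np.array([extract(s) for s in raw_samples])
            votes.append(clf.predict(X))
        votes = np.array(votes)               # shape: (n_members, n_samples)
        # Majority vote per sample over the ensemble members (integer class labels assumed).
        return np.array([np.bincount(col).argmax() for col in votes.T])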


Applied Soft Computing | 2016

Evolutionary wrapper approaches for training set selection as preprocessing mechanism for support vector machines

Nele Verbiest; Joaquín Derrac; Chris Cornelis; Salvador García; Francisco Herrera

Highlights: Support vector machines (SVMs) are popular and accurate classifiers. We study whether SVMs can be further improved using training set selection (TSS). We adapt wrapper TSS techniques for SVMs. The experimental evaluation shows that filter TSS techniques cannot improve the accuracy of SVMs, while evolutionary wrapper TSS techniques significantly improve them.

One of the most powerful, popular and accurate classification techniques is the support vector machine (SVM). In this work, we evaluate whether the accuracy of SVMs can be further improved using training set selection (TSS), where only a subset of training instances is used to build the SVM model. In contrast to existing approaches, we focus on wrapper TSS techniques, in which candidate subsets of training instances are evaluated using the SVM training accuracy. We consider five wrapper TSS strategies and show that those based on evolutionary approaches can significantly improve the accuracy of SVMs.
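
A minimal sketch of the wrapper idea evaluated here, evolving binary masks over the training instances and scoring each candidate subset by the accuracy of an SVM trained on it, is shown below. It uses a plain generational GA rather than the specific evolutionary strategies compared in the paper, and all parameter values are illustrative.

# Minimal wrapper TSS sketch: a simple generational GA evolves binary masks over the
# training instances; fitness is the training accuracy of an SVM built on the selected
# subset. This is a generic GA, not the exact evolutionary methods from the paper.
import numpy as np
from sklearn.svm import SVC

def fitness(mask, X_tr, y_tr):
    if mask.sum() < 2 or len(np.unique(y_tr[mask])) < 2:
        return 0.0                                   # degenerate subsets get zero fitness
    clf = SVC(kernel="rbf").fit(X_tr[mask], y_tr[mask])
    return clf.score(X_tr, y_tr)                     # wrapper fitness: accuracy on the training data

def evolve_subset(X_tr, y_tr, pop_size=20, generations=30, p_mut=0.02, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_tr)
    pop = rng.random((pop_size, n)) < 0.5            # random initial instance masks
    for _ in range(generations):
        scores = np.array([fitness(ind, X_tr, y_tr) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][:pop_size // 2]]      # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])               # one-point crossover
            child ^= rng.random(n) < p_mut                           # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind, X_tr, y_tr) for ind in pop])
    return pop[np.argmax(scores)]                    # best instance mask found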


Archive | 2018

Learning from Imbalanced Data Sets

Alberto Fernández; Salvador García; Mikel Galar; Ronaldo C. Prati; Bartosz Krawczyk; Francisco Herrera



Information Fusion | 2016

DPD-DFF

Daniel Peralta; Isaac Triguero; Salvador García; Francisco Herrera; José Manuel Benítez

Highlights: We present a double-fingerprint, double-matcher AFIS: DPD-DFF. Eight variants of DPD-DFF are defined. Extensive experiments are carried out over seven databases. DPD-DFF improves both accuracy and identification time with respect to reference AFIS.

Nowadays, many companies and institutions need fast and reliable identification systems that are able to deal with very large databases. Fingerprints are among the most widely used biometric traits for identification. In the current literature, some fingerprint matching algorithms focus on efficiency, whilst others focus on accuracy. In this paper we propose a flexible dual-phase identification method, called DPD-DFF, that combines two fingers and two matchers within a hybrid fusion scheme to obtain both fast and accurate results. Different alternatives are designed to find a trade-off between runtime and accuracy that can be further tuned with a single parameter. The experiments show that DPD-DFF obtains very competitive results in comparison with state-of-the-art score fusion techniques, especially when dealing with large databases or impostor fingerprints.
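
Without reproducing the actual DPD-DFF design, the general shape of a dual-phase, two-finger identification scheme can be sketched as follows. The fast_match and accurate_match functions are hypothetical placeholder matchers, and the shortlist size and the averaging fusion rule are only one possible arrangement, not the method's.

# Very rough sketch of a two-phase, two-finger identification scheme in the spirit of
# hybrid score fusion AFIS designs. The matchers return a similarity score; shortlist
# size and fusion rule are illustrative choices, not those of DPD-DFF.
def identify(query_fingers, database, fast_match, accurate_match, shortlist_size=50):
    """query_fingers: tuple of two query fingerprints.
    database: dict mapping identity -> tuple of two enrolled fingerprints."""
    # Phase 1: cheap matcher on both fingers to shortlist candidate identities.
    coarse = {
        identity: sum(fast_match(q, e) for q, e in zip(query_fingers, enrolled))
        for identity, enrolled in database.items()
    }
    shortlist = sorted(coarse, key=coarse.get, reverse=True)[:shortlist_size]

    # Phase 2: expensive matcher only on the shortlist; fuse scores from both fingers.
    fused = {
        identity: sum(accurate_match(q, e)
                      for q, e in zip(query_fingers, database[identity])) / 2
        for identity in shortlist
    }
    return max(fused, key=fused.get)                 # best-scoring identity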


Swarm and Evolutionary Computation | 2018

A distributed evolutionary multivariate discretizer for Big Data processing on Apache Spark

Sergio Ramírez-Gallego; Salvador García; José Manuel Benítez; Francisco Herrera

Nowadays, the phenomenon of Big Data is overwhelming our capacity to extract relevant knowledge through classical machine learning techniques. Discretization (as part of data reduction) is presented as a real solution to reduce this complexity. However, standard discretizers are not designed to perform well with such amounts of data. This paper proposes a distributed discretization algorithm for Big Data analytics based on evolutionary optimization. After comparing it with a distributed discretizer based on the Minimum Description Length Principle, we find that our solution yields more accurate and simpler discretizations in reasonable time.
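
As a single-machine illustration of evolutionary discretization (the paper's contribution is a distributed, multivariate version of this idea on Apache Spark, which is not reproduced here), a binary chromosome can encode which candidate cut points are kept and the fitness can trade off class purity of the resulting bins against the number of cuts. Everything below is a simplified, hypothetical sketch; class labels are assumed to be non-negative integers and the parameters are arbitrary.

# Single-machine sketch of evolutionary discretization of one numeric attribute.
# The distributed, multivariate Spark implementation from the paper is not reproduced.
import numpy as np

def candidate_cuts(values, y):
    """Boundary points: midpoints between consecutive sorted values where the class changes."""
    order = np.argsort(values)
    v, labels = values[order], y[order]
    change = labels[:-1] != labels[1:]
    return (v[:-1][change] + v[1:][change]) / 2.0

def fitness(chromosome, cuts, values, y, alpha=0.01):
    kept = cuts[chromosome]                           # chromosome: boolean mask over cut points
    bins = np.digitize(values, kept)
    purity = 0.0
    for b in np.unique(bins):
        counts = np.bincount(y[bins == b])
        purity += counts.max()                        # points agreeing with the bin's majority class
    purity /= len(y)
    return purity - alpha * len(kept)                 # accuracy-vs-simplicity trade-off

def evolve_cuts(values, y, pop_size=30, generations=50, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    cuts = candidate_cuts(values, y)
    pop = rng.random((pop_size, len(cuts))) < 0.2     # start with few cuts selected
    for _ in range(generations):
        scores = np.array([fitness(c, cuts, values, y) for c in pop])
        parents = pop[np.argsort(scores)[::-1][:pop_size // 2]]
        children = parents.copy()
        children ^= rng.random(children.shape) < p_mut          # bit-flip mutation only
        pop = np.vstack([parents, children])
    scores = np.array([fitness(c, cuts, values, y) for c in pop])
    return cuts[pop[np.argmax(scores)]]               # selected cut points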


Archive | 2018

Dimensionality Reduction for Imbalanced Learning

Alberto Fernández; Salvador García; Mikel Galar; Ronaldo C. Prati; Bartosz Krawczyk; Francisco Herrera

One of the most successful data preprocessing techniques is the reduction of the data dimensionality by means of feature selection and/or feature extraction. The key idea is to simplify the data by replacing the original features with newly created ones that extract the main information, or by simply selecting a subset of the original set. Although this topic has been carefully studied in the specialized literature for classical predictive problems, there are also several approaches specifically devised to deal with imbalanced learning scenarios. Again, their main purpose is to exploit the most informative features so as to preserve as much as possible the concept related to the minority class. This chapter describes the best-known feature selection and feature extraction techniques developed to tackle imbalanced data sets. We consider these two main families of techniques separately, and we also cover recent advances in feature selection and in feature extraction by non-linear methods. In addition, we mention a recently proposed discretization approach that reduces numeric features to categories. The chapter is organized as follows. After a short introduction in Sect. 9.1, we review in Sect. 9.2 the straightforward solutions devised in feature selection for tackling imbalanced classification. Next, we delve deeper into more advanced techniques for feature selection in Sect. 9.3. Section 9.4 is devoted to explaining the redefined feature extraction techniques based on linear models. In Sects. 9.5 and 9.6, a non-linear feature extraction technique based on autoencoders and a discretization method are outlined, respectively. Finally, Sect. 9.7 concludes the chapter.
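
One simple way to make a dimensionality reduction step sensitive to the minority class, in the spirit (but not the letter) of the techniques surveyed in this chapter, is to pick the feature subset size using a minority-aware metric such as the F1 score of the positive class. The sketch below uses standard scikit-learn components; the candidate subset sizes and estimators are arbitrary, and the minority class is assumed to be encoded as label 1 in a binary problem.

# Illustrative sketch: choose how many features to keep by cross-validated F1 of the
# minority class, so the reduction step does not wash out the minority concept.
# This is not a method from the chapter; components and candidate k values are arbitrary.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def select_k_for_imbalance(X, y, candidate_ks=(5, 10, 20)):
    best_k, best_score = None, -np.inf
    for k in candidate_ks:
        pipe = Pipeline([
            ("select", SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))),
            ("model", LogisticRegression(max_iter=1000, class_weight="balanced")),
        ])
        # scoring="f1" is the F1 of the positive (assumed minority) class.
        score = cross_val_score(pipe, X, y, cv=5, scoring="f1").mean()
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score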


Archive | 2018

Introduction to KDD and Data Science

Alberto Fernández; Salvador García; Mikel Galar; Ronaldo C. Prati; Bartosz Krawczyk; Francisco Herrera

Nowadays, large volumes of data and tools for the proper extraction of knowledge from them are widely available, especially in large corporations. This has transformed data analysis, orienting it towards specialized techniques grouped under the umbrella of Data Science. In summary, Data Science can be considered a discipline for discovering new and significant relationships, patterns and trends by examining large amounts of data. Data Science techniques therefore pursue the automatic discovery of the knowledge contained in the information stored in large databases. These techniques aim to uncover patterns, profiles and trends through the analysis of data using technologies such as clustering, classification, predictive analysis and association mining, among others. For this reason, we are witnessing the development of many software solutions for data processing that integrate a wealth of Data Science algorithms. In order to better understand the nature of Data Science, this chapter is organized as follows. Sections 1.2 and 1.3 define the Data Science terminology and its workflow. Then, in Sect. 1.4 the standard problems in Data Science are introduced. Section 1.5 describes some standard data mining algorithms. Finally, in Sect. 1.6 some of the non-standard problems in Data Science are mentioned.

Collaboration


Dive into Salvador García's collaborations.

Top Co-Authors

Mikel Galar (Universidad Pública de Navarra)

Francisco Herrera (Information Technology University)

Ronaldo C. Prati (Universidade Federal do ABC)

Bartosz Krawczyk (Virginia Commonwealth University)