Iago Porto-Díaz | Researchain

Publication

Featured research published by Iago Porto-Díaz.

Pattern Recognition | 2014

A framework for cost-based feature selection

Verónica Bolón-Canedo; Iago Porto-Díaz; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos

Over the last few years, the dimensionality of datasets involved in data mining applications has increased dramatically. In this situation, feature selection becomes indispensable as it allows for dimensionality reduction and relevance detection. The research proposed in this paper broadens the scope of feature selection by taking into consideration not only the relevance of the features but also their associated costs. A new general framework is proposed, which consists of adding a new term to the evaluation function of a filter feature selection method so that the cost is taken into account. Although the proposed methodology could be applied to any feature selection filter, in this paper the approach is applied to two representative filter methods: Correlation-based Feature Selection (CFS) and Minimal-Redundancy-Maximal-Relevance (mRMR), as an example of use. The behavior of the proposed framework is tested on 17 heterogeneous classification datasets, employing a Support Vector Machine (SVM) as a classifier. The results of the experimental study show that the approach is sound and that it allows the user to reduce the cost without compromising the classification error.
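As a rough illustration of the idea of adding a cost term to a filter's evaluation function, the following minimal sketch performs a greedy, mRMR-style selection in which each candidate's score is penalized by its cost. It is not the paper's implementation: the function name, the weight parameter `lam`, the use of precomputed relevance and redundancy values, and the toy numbers are all assumptions made for this example.

```python
def cost_aware_greedy_selection(relevance, redundancy, cost, k, lam):
    """Greedy mRMR-style selection with a cost penalty (illustrative sketch).

    relevance[i]     : relevance of feature i to the class (assumed precomputed)
    redundancy[i][j] : redundancy between features i and j (assumed precomputed)
    cost[i]          : cost of acquiring feature i
    k                : number of features to select
    lam              : hypothetical weight of the cost term; lam = 0 ignores cost
    """
    selected = []
    remaining = set(range(len(relevance)))

    while remaining and len(selected) < k:
        def score(i):
            # Mean redundancy with the features chosen so far (0 if none yet).
            if selected:
                mean_red = sum(redundancy[i][j] for j in selected) / len(selected)
            else:
                mean_red = 0.0
            # Relevance minus redundancy, minus the added cost term.
            return relevance[i] - mean_red - lam * cost[i]

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data: feature 0 is slightly more relevant than feature 1 but far costlier.
relevance = [0.9, 0.85, 0.3]
cost = [10.0, 1.0, 1.0]
redundancy = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]

without_cost = cost_aware_greedy_selection(relevance, redundancy, cost, k=2, lam=0.0)
with_cost = cost_aware_greedy_selection(relevance, redundancy, cost, k=2, lam=0.1)
print(without_cost, with_cost)
```

With the cost weight set to zero the expensive feature is picked first; with a positive weight the cheaper, almost equally relevant feature replaces it, which is the trade-off the abstract describes.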

Knowledge Based Systems | 2017

Ensemble feature selection: Homogeneous and heterogeneous approaches

Borja Seijo-Pardo; Iago Porto-Díaz; Verónica Bolón-Canedo; Amparo Alonso-Betanzos

In the last decade, ensemble learning has become a prolific discipline in pattern recognition, based on the assumption that the combination of the output of several models obtains better results than the output of any individual model. On the basis that the same principle can be applied to feature selection, we describe two approaches: (i) homogeneous, i.e., using the same feature selection method with different training data and distributing the dataset over several nodes; and (ii) heterogeneous, i.e., using different feature selection methods with the same training data. Both approaches are based on combining rankings of features that contain all the ordered features. The results of the base selectors are combined using different combination methods, also called aggregators, and a practical subset is selected according to several different threshold values (traditional values based on fixed percentages, and more novel automatic methods based on data complexity measures). In testing using a Support Vector Machine as a classifier, ensemble results for seven datasets demonstrate performance that is at least comparable to, and often better than, the performance of individual feature selection methods.
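The combine-then-threshold step can be sketched as follows. This is only an illustration under stated assumptions: the aggregator shown (mean rank position) and the fixed-percentage threshold are simple choices picked for the example, and the function names and toy rankings are hypothetical rather than taken from the paper. The same code applies whether the input rankings come from one method run on different data partitions (homogeneous) or from different methods run on the same data (heterogeneous).

```python
import math


def aggregate_rankings(rankings):
    """Combine full rankings by mean position (lower mean position is better).

    Each ranking is a list of feature names ordered best first and is assumed
    to contain every feature exactly once.
    """
    features = sorted(rankings[0])
    mean_position = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # Ties are broken by feature name so the result is deterministic.
    return sorted(features, key=lambda f: (mean_position[f], f))


def keep_top_fraction(ranking, fraction):
    """Fixed-percentage threshold: keep the top `fraction` of the ranking."""
    n = max(1, math.ceil(fraction * len(ranking)))
    return ranking[:n]


# Three hypothetical base rankings over four features.
rankings = [
    ["a", "b", "c", "d"],
    ["b", "a", "d", "c"],
    ["a", "c", "b", "d"],
]

combined = aggregate_rankings(rankings)
subset = keep_top_fraction(combined, 0.5)
print(combined, subset)
```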

International Conference on Artificial Neural Networks | 2009

Combining Feature Selection and Local Modelling in the KDD Cup 99 Dataset

Iago Porto-Díaz; David Martínez-Rego; Amparo Alonso-Betanzos; Oscar Fontenla-Romero

In this work, a new approach for intrusion detection in computer networks is introduced. Using the KDD Cup 99 dataset as a benchmark, the proposed method consists of a combination of feature selection methods and a novel local classification method. This classification method, called FVQIT (Frontier Vector Quantization using Information Theory), uses a modified clustering algorithm to split up the feature space into several local models, in each of which the classification task is performed independently. The method is applied to the KDD Cup 99 dataset, with the objective of improving on the performance achieved by previous authors. Experimental results obtained indicate the adequacy of the proposed approach.

Neural Networks | 2011

2011 Special Issue: A study of performance on microarray data sets for a classifier based on information theoretic learning

Iago Porto-Díaz; Verónica Bolón-Canedo; Amparo Alonso-Betanzos; Oscar Fontenla-Romero

Gene-expression microarray is a novel technology that allows the examination of tens of thousands of genes at a time. For this reason, manual observation is not feasible and machine learning methods are progressing to face these new data. Specifically, since the number of genes is very high, feature selection methods have proven valuable to deal with these unbalanced (high dimensionality and low cardinality) data sets. In this work, the FVQIT (Frontier Vector Quantization using Information Theory) classifier is employed to classify twelve DNA gene-expression microarray data sets of different kinds of cancer. A comparative study with other well-known classifiers is performed. The proposed approach shows competitive results, outperforming all other classifiers.

International Symposium on Neural Networks | 2009

A new supervised local modelling classifier based on information theory

David Martínez-Rego; Oscar Fontenla-Romero; Iago Porto-Díaz; Amparo Alonso-Betanzos

In this paper, a novel supervised architecture for binary classification based on local modelling and information theory is described. The architecture is composed of two steps: in the first one, a separating borderline between the two classes is piecewise constructed by a set of centroids calculated by a modified clustering algorithm, based on information theory; each of these centroids defines a region where, in the second step of the proposed architecture, a hyperplane is constructed and adjusted by means of one-layer neural networks. This new method allows for binary classification while maintaining adequate use of computational resources, a common problem for machine learning methods. The proposed architecture is applied to classical benchmark classification problems and data sets, and its results are compared with those obtained by other well-known statistical and machine learning classifiers.

International Work-Conference on Artificial and Natural Neural Networks | 2015

Ensemble Feature Selection for Rankings of Features

Borja Seijo-Pardo; Verónica Bolón-Canedo; Iago Porto-Díaz; Amparo Alonso-Betanzos

In the last few years, ensemble learning has been the focus of much attention, mainly in classification tasks, based on the assumption that combining the output of multiple experts is better than the output of any single expert. This idea of ensemble learning can be adapted for feature selection, in which different feature selection algorithms act as different experts. In this paper we propose an ensemble for feature selection based on combining rankings of features, trying to overcome the problem of selecting an appropriate ranker method for each problem at hand. The results of the individual rankings are combined with SVM Rank, and the adequacy of the ensemble was subsequently tested using SVM as classifier. Results on five UCI datasets showed that the proposed ensemble gives performance better than or comparable to that of the individual feature selection methods.

International Conference on Artificial Neural Networks | 2010

Local modeling classifier for microarray gene-expression data

Iago Porto-Díaz; Verónica Bolón-Canedo; Amparo Alonso-Betanzos; Oscar Fontenla-Romero

Gene-expression microarray is a novel technology that allows the examination of tens of thousands of genes at a time. For this reason, manual observation is no longer feasible and machine learning methods are progressing to analyze these new data. Specifically, since the number of genes is very high, feature selection methods have proven valuable to deal with these unbalanced (high dimensionality and low cardinality) datasets. Our method is composed of a discretizer, a filter and the FVQIT (Frontier Vector Quantization using Information Theory) classifier. It is employed to classify eight DNA gene-expression microarray datasets of different kinds of cancer. A comparative study with other classifiers such as Support Vector Machine (SVM), C4.5, naive Bayes and k-Nearest Neighbor is performed. Our approach shows excellent results, outperforming all other classifiers.

Progress in Artificial Intelligence | 2012

Information Theoretic Learning and local modeling for binary and multiclass classification

Iago Porto-Díaz; David Martínez-Rego; Amparo Alonso-Betanzos; Oscar Fontenla-Romero

In this paper, a learning model for binary and multiclass classification based on local modeling and Information Theoretic Learning (ITL) is described. The training algorithm for the model works in two stages: first, a set of nodes is placed on the frontiers between classes using a modified clustering algorithm based on ITL. Each of these nodes defines a local model. Second, several one-layer neural networks, associated with these local models, are trained to locally classify the points in their proximity. The method is successfully applied to problems with a large number of instances and high dimension, such as intrusion detection and microarray gene expression.

International Conference on Knowledge Based and Intelligent Information and Engineering Systems | 2010

A log analyzer agent for intrusion detection in a multi-agent system

Iago Porto-Díaz; Óscar Fontenla-Romero; Amparo Alonso-Betanzos

In this work, the design and implementation of a log analyzer agent is described. This agent is conceived to act as a part of a multi-agent Intrusion Detection System. The agent analyzes log files of services, applications or operating systems, contrasting every log line with a set of security rules defined by experts. These rules can be created using a new, easy-to-use XML-based format founded on an object-oriented model. Whenever a security match is found, the agent sends a security report to the next level of the multi-agent system using the IDMEF (Intrusion Detection Message Exchange Format) and the IDXP (Intrusion Detection Exchange Protocol).
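The rule-matching loop described above might look roughly like the sketch below. The XML layout (`rules`, `rule`, `pattern`, and the `id` and `severity` attributes) is a hypothetical schema invented for this example, since the abstract does not specify the paper's rule format; the regular-expression matching and the dictionary-shaped reports are likewise assumptions. Emitting real IDMEF messages over IDXP is left out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule file; this schema is NOT the format defined in the paper.
RULES_XML = r"""
<rules>
  <rule id="ssh-failed-login" severity="medium">
    <pattern>Failed password for .* from (\S+)</pattern>
  </rule>
  <rule id="sudo-auth-failure" severity="high">
    <pattern>sudo: .* authentication failure</pattern>
  </rule>
</rules>
"""


def load_rules(xml_text):
    """Parse the rule file into (id, severity, compiled regex) tuples."""
    root = ET.fromstring(xml_text.strip())
    return [
        (rule.get("id"), rule.get("severity"), re.compile(rule.findtext("pattern")))
        for rule in root.iter("rule")
    ]


def analyze(lines, rules):
    """Check every log line against every rule and collect one report per match."""
    reports = []
    for number, line in enumerate(lines, start=1):
        for rule_id, severity, pattern in rules:
            if pattern.search(line):
                reports.append({"line": number, "rule": rule_id, "severity": severity})
    return reports


# Made-up log lines for illustration only.
log_lines = [
    "Jan  1 00:00:01 host sshd[101]: Failed password for root from 10.0.0.5 port 22",
    "Jan  1 00:00:02 host cron[202]: job started",
]

reports = analyze(log_lines, load_rules(RULES_XML))
print(reports)
```

In a full agent, each collected report would be serialized and forwarded to the next level of the system instead of being printed.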

Knowledge Based Systems | 2017