Diego Peteiro-Barral
University of A Coruña
Publications
Featured research published by Diego Peteiro-Barral.
International Journal of Neural Systems | 2015
Enrique Castillo; Diego Peteiro-Barral; Bertha Guijarro-Berdiñas; Oscar Fontenla-Romero
This paper presents a novel distributed one-class classification approach based on an extension of the ν-SVM method, permitting its application to Big Data sets. The method considers several one-class classifiers, each one determined using a given local data partition on a processor, with the goal of finding a global model. Its cornerstone is a novel mathematical formulation that makes the optimization problem separable whilst avoiding some data points considered as outliers in the final solution. This is particularly important because the decision region generated by the method is unaffected by the position of the outliers and fits the form of the data more precisely. Another interesting property is that, although built in parallel, the classifiers exchange data during learning in order to improve their individual specialization. Experimental results using different datasets demonstrate the good accuracy of the decision regions of the proposed method in comparison with other well-known classifiers, while saving training time due to its distributed nature.
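The partition-then-combine idea behind the distributed scheme can be illustrated with a minimal sketch. The paper's actual method extends ν-SVM and exchanges data between processors during learning; here a toy centroid-plus-radius model per partition (a hypothetical stand-in, not the paper's formulation) plays the role of each local one-class classifier, and the global decision is the union of the local acceptance regions.

```python
import math
import random

def train_local(points, nu=0.1):
    """Fit a toy one-class model on one partition: a centroid plus a radius
    chosen so that roughly a fraction nu of the points fall outside (outliers)."""
    d = len(points[0])
    centroid = [sum(p[i] for p in points) / len(points) for i in range(d)]
    dists = sorted(math.dist(p, centroid) for p in points)
    radius = dists[max(0, int(math.ceil((1 - nu) * len(dists))) - 1)]
    return centroid, radius

def global_predict(models, x):
    """A point is accepted if any local model accepts it (union of regions)."""
    return any(math.dist(x, c) <= r for c, r in models)

random.seed(0)
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
partitions = [data[i::3] for i in range(3)]           # 3 simulated processors
models = [train_local(part, nu=0.1) for part in partitions]

print(global_predict(models, (0.0, 0.0)))   # near the bulk of the data: True
print(global_predict(models, (8.0, 8.0)))   # far from all partitions: False
```

In the real method the local models would additionally share selected points so each classifier specializes; this sketch only shows how local partitions yield a global decision region.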
Progress in Artificial Intelligence | 2013
Diego Peteiro-Barral; Bertha Guijarro-Berdiñas
Traditionally, a bottleneck preventing the development of more intelligent systems was the limited amount of data available. Nowadays, the total amount of information is almost incalculable and automatic data analyzers are more necessary than ever. However, the limiting factor is now the inability of learning algorithms to use all the data within a reasonable time. To handle this problem, a new field in machine learning has emerged: large-scale learning. In this context, distributed learning seems a promising line of research, since allocating the learning process among several workstations is a natural way of scaling up learning algorithms. Moreover, it allows dealing with data sets that are naturally distributed, a frequent situation in many real applications. This study provides some background regarding the advantages of distributed environments as well as an overview of distributed learning for dealing with “very large” data sets.
IEEE Journal of Biomedical and Health Informatics | 2014
Beatriz Remeseiro; Verónica Bolón-Canedo; Diego Peteiro-Barral; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; A. Mosquera; Manuel G. Penedo; Noelia Sánchez-Maroño
Dry eye is a symptomatic disease which affects a wide range of the population and has a negative impact on their daily activities. Its diagnosis can be achieved by analyzing the interference patterns of the tear film lipid layer and classifying them into one of the Guillon categories. The manual process performed by experts is not only affected by subjective factors but is also very time-consuming. In this paper we propose a general methodology for the automatic classification of the tear film lipid layer, using color and texture information to characterize the images and feature selection methods to reduce processing time. The adequacy of the proposed methodology was demonstrated: it achieves classification rates over 97% while maintaining robustness and providing unbiased results. It can also be applied in real time, allowing important time savings for the experts.
Expert Systems With Applications | 2013
Diego Peteiro-Barral; Verónica Bolón-Canedo; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; Noelia Sánchez-Maroño
In the past few years, the bottleneck for machine learning developers has no longer been the limited data available but the algorithms' inability to use all the data in the available time. For this reason, researchers are now interested not only in the accuracy but also in the scalability of machine learning algorithms. To deal with large-scale databases, feature selection can help reduce their dimensionality, turning an impracticable algorithm into a practical one. In this research, the influence of several feature selection methods on the scalability of four of the most well-known training algorithms for feedforward artificial neural networks (ANNs) is analyzed over both classification and regression tasks. The results demonstrate that feature selection is an effective tool to improve scalability.
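The dimensionality-reduction step described above can be sketched with a minimal filter-style selector. This is not one of the specific methods evaluated in the paper; it is a simple correlation-based ranking, shown only to illustrate how a filter shrinks the input dimension before ANN training. The data and informative-feature indices are synthetic.

```python
import random
import statistics

def pearson(xs, ys):
    """Sample Pearson correlation between two sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = (sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys)) ** 0.5
    return num / den if den else 0.0

def select_top_k(X, y, k):
    """Rank features by |correlation with the target| and keep the best k."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    keep = sorted(range(len(scores)), key=lambda j: -scores[j])[:k]
    return sorted(keep)

random.seed(1)
n, d = 200, 20
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
# Only features 3 and 7 actually drive the target; the rest are noise.
y = [row[3] + 0.5 * row[7] + random.gauss(0, 0.1) for row in X]

print(select_top_k(X, y, 2))   # → [3, 7]
```

Training an ANN on the 2 selected columns instead of all 20 is what improves scalability: the cost of each training step grows with the input dimension.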
Expert Systems With Applications | 2016
Verónica Bolón-Canedo; Diego Fernández-Francos; Diego Peteiro-Barral; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; Noelia Sánchez-Maroño
Highlights: a pipeline for online feature selection is proposed, covering discretization, feature selection and classification; classical algorithms (the k-means discretizer, the χ² filter and artificial neural networks) were modified to work online; results show that the classification error decreases, adapting to the arrival of new data.
With the advent of Big Data, data is being collected at an unprecedented pace and needs to be processed in a short time. To deal with data streams that flow continuously, classical batch learning algorithms cannot be applied and it is necessary to employ online approaches. Online learning consists of continuously revising and refining a model by incorporating new data as they arrive, and it allows important problems such as concept drift or the management of extremely high-dimensional datasets to be solved. In this paper, we present a unified pipeline for online learning which covers online discretization, feature selection and classification. Three classical methods (the k-means discretizer, the χ² filter and a one-layer artificial neural network) have been reimplemented to be able to tackle online data, showing promising results on both synthetic and real datasets.
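The feature-selection stage of such a pipeline can be sketched with an incrementally maintained χ² statistic: contingency counts are updated per sample as the stream arrives, so the score is available at any time without revisiting old data. This is a simplified illustration of the filter stage only (the paper's pipeline also includes a k-means discretizer and a one-layer ANN); the stream below is synthetic, with one feature perfectly tied to the label and one independent of it.

```python
from collections import defaultdict

class OnlineChi2:
    """Incrementally maintained chi-square score of one discrete feature
    against the class label; counts are updated as each sample arrives."""
    def __init__(self):
        self.counts = defaultdict(int)   # (feature_value, label) -> n
        self.f_tot = defaultdict(int)    # feature_value -> n
        self.l_tot = defaultdict(int)    # label -> n
        self.n = 0

    def update(self, value, label):
        self.counts[(value, label)] += 1
        self.f_tot[value] += 1
        self.l_tot[label] += 1
        self.n += 1

    def score(self):
        chi2 = 0.0
        for v in self.f_tot:
            for l in self.l_tot:
                expected = self.f_tot[v] * self.l_tot[l] / self.n
                chi2 += (self.counts[(v, l)] - expected) ** 2 / expected
        return chi2

relevant, noise = OnlineChi2(), OnlineChi2()
# Each sample is (feature_a, feature_b, label): feature_a equals the label,
# feature_b is independent of it.
stream = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)] * 25
for a, b, label in stream:
    relevant.update(a, label)
    noise.update(b, label)

print(relevant.score(), noise.score())   # → 100.0 0.0
```

A streaming selector would periodically keep the features whose running χ² score exceeds a threshold, adapting the selected subset as new data arrive.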
Expert Systems With Applications | 2013
Diego Peteiro-Barral; Bertha Guijarro-Berdiñas; Beatriz Pérez-Sánchez; Oscar Fontenla-Romero
Until recently, the most common criterion in machine learning for evaluating the performance of algorithms was accuracy. However, the unrestrainable growth of the volume of data in recent years in fields such as bioinformatics, intrusion detection or engineering has raised new challenges in machine learning regarding not simply accuracy but also scalability. In this research, we are concerned with the scalability of one of the most well-known paradigms in machine learning, artificial neural networks (ANNs), and particularly with the training algorithm Sensitivity-Based Linear Learning Method (SBLLM). SBLLM is a learning method for two-layer feedforward ANNs based on sensitivity analysis, which calculates the weights by solving a linear system of equations. The results show that the training algorithm SBLLM performs better in terms of scalability than five of the most popular and efficient training algorithms for ANNs.
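The core idea of obtaining weights by solving a linear system, rather than by iterative gradient descent, can be shown on the simplest possible case: a single linear unit fitted by the normal equations. This is not SBLLM itself (which handles two-layer networks via sensitivity analysis), only a minimal sketch of the "weights as the solution of a linear system" principle on made-up data.

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]          # roughly y = 2x + 1

# Normal equations for the weights (w, b) of y ≈ w*x + b: one linear solve,
# no iterative epochs over the data.
A = [[sum(x * x for x in xs), sum(xs)],
     [sum(xs),                len(xs)]]
rhs = [sum(x * y for x, y in zip(xs, ys)), sum(ys)]
w, b = solve2(A, rhs)

print(round(w, 2), round(b, 2))    # → 1.98 1.03
```

The scalability advantage follows from the same structure: a single pass builds the system's coefficients, so training cost grows linearly with the number of samples.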
International Conference on Tools with Artificial Intelligence | 2012
Verónica Bolón-Canedo; Diego Peteiro-Barral; Beatriz Remeseiro; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; A. Mosquera; Manuel G. Penedo; Noelia Sánchez-Maroño
Dry eye is a symptomatic disease which affects a wide range of the population and has a negative impact on their daily activities, such as driving or working with computers. Its diagnosis can be achieved by several clinical tests, one of which is the analysis of the interference pattern and its classification into one of the Guillon categories. Existing methodologies for automatic classification obtain promising results, but at the expense of a long processing time. In this research, feature selection techniques are used to reduce this time whilst maintaining performance, paving the way for a novel tool for the automatic classification of the tear film lipid layer. This tool achieves classification rates over 96% compared with the annotations of the optometrists and provides unbiased results. It also works in real time, allowing important time savings for the experts.
CAEPIA'11: Proceedings of the 14th International Conference on Advances in Artificial Intelligence, Spanish Association for Artificial Intelligence | 2011
Verónica Bolón-Canedo; Diego Peteiro-Barral; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; Noelia Sánchez-Maroño
The advent of high-dimensionality problems has brought new challenges for machine learning researchers, who are now interested not only in the accuracy but also in the scalability of algorithms. In this context, machine learning can take advantage of feature selection methods to deal with large-scale databases. Feature selection is able to reduce the temporal and spatial complexity of learning, turning an impracticable algorithm into a practical one. In this work, the influence of feature selection on the scalability of four of the most well-known training algorithms for feedforward artificial neural networks (ANNs) is studied. Six different measures are considered to evaluate scalability, allowing a final score to be established to compare the algorithms. Results show that, when a feature selection step is included, ANN algorithms perform much better in terms of scalability.
Knowledge and Information Systems | 2018
Verónica Bolón-Canedo; D. Rego-Fernández; Diego Peteiro-Barral; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; Noelia Sánchez-Maroño
Lately, driven by the explosion of high dimensionality, researchers in machine learning have become interested not only in accuracy but also in scalability. Although the scalability of learning methods is a trending issue, the scalability of feature selection methods has not received the same amount of attention. This research analyzes the scalability of state-of-the-art feature selection methods belonging to the filter, embedded and wrapper approaches. For this purpose, several new measures are presented, based not only on accuracy but also on execution time and stability. Results on seven classical artificial datasets are presented and discussed, as well as two case studies analyzing the particularities of microarray data and the effect of redundancy. To check whether the results can be generalized, we included experiments with two real datasets. As expected, filters are the most scalable feature selection approach, with INTERACT, ReliefF and mRMR being the most accurate methods.
International Conference on Artificial Intelligence and Soft Computing | 2013
Diego Peteiro-Barral; Bertha Guijarro-Berdiñas
In recent years, the unrestrainable growth of the volume of data has raised new challenges in machine learning regarding scalability. Scalability comprises not simply accuracy but several other measures of computational resources. In order to compare the scalability of algorithms, it is necessary to establish a method that integrates all these measures into a single ranking. Such a method should be able to (i) merge results of the algorithms under comparison across different benchmark data sets, (ii) quantitatively measure the difference between algorithms, and (iii) weight some measures against others if necessary. To manage these issues, this research proposes the use of TOPSIS as a multiple-criteria decision-making method to rank algorithms. The use of the method is illustrated with a study on the scalability of five of the most well-known training algorithms for artificial neural networks (ANNs).
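TOPSIS itself is a standard, well-defined procedure, so a compact sketch is possible: normalize and weight the decision matrix, locate the ideal and anti-ideal solutions, and rank alternatives by relative closeness to the ideal. The algorithms, measures and weights below are hypothetical placeholders, not the paper's experimental values.

```python
def topsis(matrix, weights, benefit):
    """Rank alternatives by closeness to the ideal solution (TOPSIS).
    matrix[i][j]: score of alternative i on criterion j.
    benefit[j]:   True if higher is better for criterion j, False if lower is."""
    m, n = len(matrix), len(matrix[0])
    # Vector-normalize each column, then apply the criterion weights.
    norms = [sum(matrix[i][j] ** 2 for i in range(m)) ** 0.5 for j in range(n)]
    V = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    ideal = [max(V[i][j] for i in range(m)) if benefit[j]
             else min(V[i][j] for i in range(m)) for j in range(n)]
    worst = [min(V[i][j] for i in range(m)) if benefit[j]
             else max(V[i][j] for i in range(m)) for j in range(n)]
    scores = []
    for i in range(m):
        d_pos = sum((V[i][j] - ideal[j]) ** 2 for j in range(n)) ** 0.5
        d_neg = sum((V[i][j] - worst[j]) ** 2 for j in range(n)) ** 0.5
        scores.append(d_neg / (d_pos + d_neg))   # closeness coefficient in [0, 1]
    return scores

# Hypothetical scalability measures for three training algorithms:
# columns = accuracy (benefit), training time in s (cost), memory in MB (cost).
algorithms = ["alg_a", "alg_b", "alg_c"]
M = [[0.92, 120.0, 512.0],
     [0.88,  40.0, 256.0],
     [0.90,  60.0, 300.0]]
scores = topsis(M, weights=[0.5, 0.3, 0.2], benefit=[True, False, False])
ranking = sorted(zip(algorithms, scores), key=lambda t: -t[1])

print(ranking[0][0])   # → alg_b
```

Point (iii) of the text corresponds to the `weights` argument: shifting weight toward accuracy versus time or memory changes which algorithm ends up ranked first.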