David Martínez-Rego
University of A Coruña
Publication
Featured research published by David Martínez-Rego.
Computers & Industrial Engineering | 2013
Diego Fernández-Francos; David Martínez-Rego; Oscar Fontenla-Romero; Amparo Alonso-Betanzos
Rolling-element bearings are among the most used elements in industrial machinery, thus an early detection of a defect in these components is necessary to avoid major machine failures. Vibration analysis is a widely used condition monitoring technique for high-speed rotating machinery. Using the information contained in the vibration signals, an automatic method for bearing fault detection and diagnosis is presented in this work. Initially, a one-class ν-SVM is used to discriminate between normal and faulty conditions. In order to build a model of normal operation regime, only data extracted under normal conditions is used. Band-pass filters and Hilbert Transform are then used sequentially to obtain the envelope spectrum of the original raw signal that will finally be used to identify the location of the problem. In order to check the performance of the method, two different data sets are used: (a) real data from a laboratory test-to-failure experiment and (b) data obtained from a fault-seeded bearing test. The results showed that the method was able not only to detect the failure in an incipient stage but also to identify the location of the defect and qualitatively assess its evolution over time.
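The envelope-spectrum step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal band-pass applied directly on DFT bins, builds the analytic signal (the frequency-domain form of the Hilbert transform), and uses a naive O(n²) DFT so that it needs only the standard library. The function names and the `band` parameter are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def envelope_spectrum(signal, band):
    """Return the magnitude spectrum (bins 0 .. n/2 - 1) of the signal envelope.

    band is a (low_bin, high_bin) pair of DFT bins to keep (hypothetical
    parameter; an ideal band-pass is assumed instead of a designed filter).
    """
    n = len(signal)
    low, high = band
    spectrum = dft(signal)

    # Analytic signal: keep only positive-frequency bins inside the pass band
    # and double them; DC, Nyquist and negative frequencies are set to zero.
    analytic = [0j] * n
    for k in range(1, n // 2):
        if low <= k <= high:
            analytic[k] = 2 * spectrum[k]

    # The envelope is the magnitude of the time-domain analytic signal.
    envelope = [abs(z) for z in dft(analytic, inverse=True)]

    # Remove the mean so the DC component does not dominate the spectrum.
    mean = sum(envelope) / n
    centered = [e - mean for e in envelope]
    return [abs(c) / n for c in dft(centered)[: n // 2]]
```

For an amplitude-modulated test signal, the largest bin of the returned spectrum sits at the modulation frequency. In a bearing application that role would be played by the defect's characteristic frequency, although mapping a peak to a fault location requires bearing geometry that is not covered here.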
Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery | 2016
Sergio Ramírez-Gallego; Salvador García; Héctor Mouriño-Talín; David Martínez-Rego; Verónica Bolón-Canedo; Amparo Alonso-Betanzos; José Manuel Benítez; Francisco Herrera
Discretization of numerical data is one of the most influential data preprocessing tasks in knowledge discovery and data mining. The purpose of attribute discretization is to find concise data representations as categories which are adequate for the learning task, retaining as much information in the original continuous attribute as possible. In this article, we present an updated overview of discretization techniques in conjunction with a complete taxonomy of the leading discretizers. Despite the great impact of discretization as a data preprocessing technique, few elementary approaches have been developed in the literature for Big Data. The purpose of this article is twofold: first, a comprehensive taxonomy of discretization techniques is presented to help practitioners in the use of the algorithms; second, the article aims to demonstrate that standard discretization methods can be parallelized in Big Data platforms such as Apache Spark, boosting both performance and accuracy. We thus propose a distributed implementation of one of the most well‐known discretizers based on Information Theory, obtaining better results than those produced by the entropy minimization discretizer proposed by Fayyad and Irani. Our scheme goes beyond a simple parallelization and it is intended to be the first to face the Big Data challenge. WIREs Data Mining Knowl Discov 2016, 6:5–21. doi: 10.1002/widm.1173
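As a rough illustration of entropy-based discretization, the sketch below finds the single cut point that minimizes the weighted class entropy of the two resulting intervals. It is a sequential toy version under stated assumptions: the full Fayyad and Irani method applies this step recursively with an MDL-based stopping criterion, and the distributed Spark implementation described in the abstract is not reproduced. The function names are hypothetical.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels in one interval."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut_point(values, labels):
    """Return (cut, weighted_entropy) for the best binary split of one attribute.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, inf) when all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_entropy = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal attribute values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * class_entropy(left)
                    + len(right) * class_entropy(right)) / n
        if weighted < best_entropy:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_entropy = weighted
    return best_cut, best_entropy
```

Calling `best_cut_point` on a numeric column and its class labels yields one threshold; repeating the search inside each resulting interval would produce a multi-interval discretization.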
International Journal of Intelligent Systems | 2017
Sergio Ramírez-Gallego; Iago Lastra; David Martínez-Rego; Verónica Bolón-Canedo; José Manuel Benítez; Francisco Herrera; Amparo Alonso-Betanzos
With the advent of large‐scale problems, feature selection has become a fundamental preprocessing step to reduce input dimensionality. The minimum‐redundancy‐maximum‐relevance (mRMR) selector is considered one of the most relevant methods for dimensionality reduction due to its high accuracy. However, it is a computationally expensive technique, sharply affected by the number of features. This paper presents fast‐mRMR, an extension of mRMR, which tries to overcome this computational burden. Associated with fast‐mRMR, we include a package with three implementations of this algorithm in several platforms, namely, CPU for sequential execution, GPU (graphics processing units) for parallel computing, and Apache Spark for distributed computing using big data technologies.
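For context, the baseline mRMR selection loop can be sketched as follows. This is plain greedy mRMR on discrete features using the difference form of the criterion (relevance minus mean redundancy), which is an assumption on my part; it does not include the optimizations that distinguish fast‐mRMR, and it is not taken from the package mentioned above. All names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr(features, target, k):
    """Greedily pick k feature names from a dict {name: discrete column}.

    Each step selects the feature maximizing relevance to the target minus
    the mean redundancy with the features already selected.
    """
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    remaining = list(features)
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each iteration recomputes pairwise mutual information against the selected set, so the cost grows with the number of features, which is the burden the abstract says fast‐mRMR targets.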
International Symposium on Neural Networks | 2011
David Martínez-Rego; Oscar Fontenla-Romero; Amparo Alonso-Betanzos
Vibration analysis is one of the most used techniques for predictive maintenance in high-speed rotating machinery. Using the information contained in the vibration signals, a system for alarm detection and diagnosis of failures in mechanical components of power windmills is devised. As collecting previous failure data is infeasible in real-life scenarios, the method to be employed should be capable of discerning between failure and normal data while being trained only with the latter type. Another useful capability of such a method is the possibility of measuring the evolution of the failure. Taking into account these restrictions, a method that uses the one-class-ν-SVM paradigm is employed. In order to test its adequacy, three different scenarios are tested: (a) a simulated scenario, (b) a controlled experimental scenario with real vibrational data, and (c) a real scenario using vibrational data captured from a windmill power machine installed in a wind farm in North West Spain. The results showed not only the capability of the method to detect the failure ahead of the breakpoint of the component in all three scenarios, but also its capacity to give a qualitative indication of the evolution of the defect. Finally, the results of the SVM paradigm are compared to one of the most used novelty detection methods, obtaining more accurate results under noisy circumstances.
International Conference on Artificial Neural Networks | 2009
Iago Porto-Díaz; David Martínez-Rego; Amparo Alonso-Betanzos; Oscar Fontenla-Romero
In this work, a new approach for intrusion detection in computer networks is introduced. Using the KDD Cup 99 dataset as a benchmark, the proposed method consists of a combination of feature selection methods and a novel local classification method. This classification method, called FVQIT (Frontier Vector Quantization using Information Theory), uses a modified clustering algorithm to split up the feature space into several local models, in each of which the classification task is performed independently. The method is applied over the KDD Cup 99 dataset, with the objective of improving the performance achieved by previous authors. The experimental results obtained indicate the adequacy of the proposed approach.
Systems, Man and Cybernetics | 2018
Sergio Ramírez-Gallego; Héctor Mouriño-Talín; David Martínez-Rego; Verónica Bolón-Canedo; José Manuel Benítez; Amparo Alonso-Betanzos; Francisco Herrera
With the advent of extremely high dimensional datasets, dimensionality reduction techniques are becoming mandatory. Of the many techniques available, feature selection (FS) is of growing interest for its ability to identify both relevant features and frequently repeated instances in huge datasets. We aim to demonstrate that standard FS methods can be parallelized in big data platforms like Apache Spark so as to boost both performance and accuracy. We propose a distributed implementation of a generic FS framework that includes a broad group of well-known information theory-based methods. Experimental results for a broad set of real-world datasets show that our distributed framework is capable of rapidly dealing with ultrahigh-dimensional datasets as well as those with a huge number of samples, outperforming the sequential version in all the cases studied.
International Symposium on Neural Networks | 2009
David Martínez-Rego; Oscar Fontenla-Romero; Iago Porto-Díaz; Amparo Alonso-Betanzos
In this paper, a novel supervised architecture for binary classification based on local modelling and information theory is described. The architecture is composed of two steps: in the first one, a separating borderline between the two classes is piecewise constructed by a set of centroids calculated by a modified clustering algorithm, based on information theory; each of these centroids defines a region where, in the second step of the proposed architecture, a hyperplane is constructed and adjusted by means of one-layer neural networks. This new method allows for binary classification while maintaining adequate use of computational resources, a common problem for machine learning methods. The proposed architecture is applied over classical benchmark classification problems and data sets, and its results are compared with those obtained by other well-known statistical and machine learning classifiers.
Distributed Computing and Artificial Intelligence | 2009
Bertha Guijarro-Berdiñas; David Martínez-Rego; Santiago Fernández-Lorenzo
In recent years, Machine Learning (ML) has witnessed a great increase in the storage capacity of computer systems and an enormous growth of available information to work with thanks to the WWW. This has raised an opportunity for new real-life applications of ML methods and also new cutting-edge ML challenges such as handling massive databases, Distributed Learning and Privacy-preserving Classification. In this paper a new method capable of dealing with these three problems is presented. The method is based on Artificial Neural Networks with incremental learning and Genetic Algorithms. As supported by the experimental results, this method is able to quickly obtain an accurate model based on the information of distributed databases without exchanging any data during the training process and without degrading its classification accuracy when compared with other non-distributed classical ML methods. This makes the proposed method very efficient and adequate for Privacy-Preserving Learning applications.
Expert Systems With Applications | 2013
Beatriz Pérez-Sánchez; Oscar Fontenla-Romero; Bertha Guijarro-Berdiñas; David Martínez-Rego
Many real scenarios in machine learning are of dynamic nature. Learning in these types of environments represents an important challenge for learning systems. In this context, the model used for learning should work in real time and have the ability to act and react by itself, adjusting its controlling parameters, even its structures, depending on the requirements of the process. In a previous work, the authors presented an online learning algorithm for two-layer feedforward neural networks that includes a factor that weights the errors committed in each of the samples. This method is effective in dynamic environments as well as in stationary contexts. Given this method’s incremental nature, we raise the possibility of adapting the network topology according to the learning needs. In this paper, we demonstrate and justify the suitability of the online learning algorithm to work with adaptive structures without significantly degrading its performance. The theoretical basis for the method is given and its performance is illustrated by means of its application to different system identification problems. The results confirm that the proposed method is able to incorporate units into its hidden layer, during the learning process, without high performance degradation.
International Conference on Artificial Neural Networks | 2010
David Martínez-Rego; Oscar Fontenla-Romero; Beatriz Pérez-Sánchez; Amparo Alonso-Betanzos
Predictive maintenance of industrial machinery has steadily emerged as an important topic of research. With accurate automatic diagnosis and prognosis of faults, savings on the current expenses devoted to maintenance can be obtained. The aim of this work is to develop an automatic prognosis system based on vibration data. An on-line version of the Sensitivity-based Linear Learning Model algorithm for neural networks is applied over real vibrational data in order to assess its forecasting capabilities. Moreover, the behavior of the method is compared with that of an efficient and fast method, the On-line Sequential Extreme Learning Machine. The accurate predictions of the proposed method pave the way for future development of a complete prognosis system.