Amparo Alonso-Betanzos
University of A Coruña
Publication
Featured research published by Amparo Alonso-Betanzos.
Knowledge and Information Systems | 2013
Verónica Bolón-Canedo; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos
With the advent of high dimensionality, adequate identification of relevant features of the data has become indispensable in real-world scenarios. In this context, the importance of feature selection is beyond doubt and different methods have been developed. However, with such a vast body of algorithms available, choosing an adequate feature selection method is not an easy question, and it is necessary to check their effectiveness in different situations. Nevertheless, the assessment of relevant features is difficult in real datasets, so an interesting option is to use artificial data. In this paper, several synthetic datasets are employed for this purpose, aiming at reviewing the performance of feature selection methods in the presence of a growing number of irrelevant features, noise in the data, redundancy and interaction between attributes, as well as a small ratio between number of samples and number of features. Seven filters, two embedded methods, and two wrappers are applied over eleven synthetic datasets, tested by four classifiers, so as to be able to choose a robust method, paving the way for its application to real datasets.
Information Sciences | 2014
Verónica Bolón-Canedo; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos; José Manuel Benítez; Francisco Herrera
Microarray data classification is a difficult challenge for machine learning researchers due to its high number of features and small sample sizes. Feature selection was soon considered a de facto standard in this field after its introduction, and a huge number of feature selection methods have been used to reduce the input dimensionality while improving classification performance. This paper is devoted to reviewing the most up-to-date feature selection methods developed in this field and the microarray databases most frequently used in the literature. We also make the interested reader aware of the problems posed by data characteristics in this domain, such as the imbalance of the data, their complexity, or the so-called dataset shift. Finally, an experimental evaluation on the most representative datasets using well-known feature selection methods is presented, bearing in mind that the aim is not to provide the best feature selection method, but to facilitate their comparative study by the research community.
Expert Systems With Applications | 2011
Verónica Bolón-Canedo; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos
Research highlights: A combination of discretizers, filters and classifiers is presented. This combination is applied to binary and multiple class classification problems. Its performance is compared to the results of the KDD Cup winner and other methods. It achieves better performance while significantly reducing the number of features. In this work, a new method consisting of a combination of discretizers, filters and classifiers is presented. Its aim is to improve the performance of classifiers while using a significantly reduced set of features. The method has been applied to a binary and to a multiple class classification problem. Specifically, the KDD Cup 99 benchmark was used for testing its effectiveness. A comparative study with other methods and the KDD winner was carried out. The results obtained showed the adequacy of the proposed method, achieving better performance in most cases while reducing the number of features by more than 80%.
Pattern Recognition | 2012
Verónica Bolón-Canedo; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos
In this paper a new framework for feature selection consisting of an ensemble of filters and classifiers is described. Five filters, based on different metrics, were employed. Each filter selects a different subset of features which is used to train and to test a specific classifier. The outputs of these five classifiers are combined by simple voting. In this study three well-known classifiers were employed for the classification task: C4.5, naive-Bayes and IB1. The rationale of the ensemble is to reduce the variability of the features selected by filters in different classification domains. Its adequacy was demonstrated by employing 10 microarray data sets.
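The voting scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the representation of a classifier as a plain callable, and the tie-breaking rule (first label cast wins) are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label; on a tie, the tied label cast first wins.

    The tie-breaking rule is an assumption of this sketch.
    """
    counts = Counter(votes)
    best = max(counts.values())
    for label in votes:
        if counts[label] == best:
            return label


def ensemble_predict(classifiers, feature_subsets, sample):
    """Combine one classifier per filter by simple voting.

    classifiers     -- hypothetical callables mapping a feature list to a label
    feature_subsets -- the feature indices selected by each filter
    sample          -- the full feature vector of one instance
    """
    votes = []
    for classify, subset in zip(classifiers, feature_subsets):
        projected = [sample[i] for i in subset]  # keep only that filter's features
        votes.append(classify(projected))
    return majority_vote(votes)
```

Each classifier sees only the features its own filter selected, so disagreement between filters shows up as disagreement between votes rather than being resolved before training.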
Knowledge Based Systems | 2015
Verónica Bolón-Canedo; Noelia Sánchez-Maroño; Amparo Alonso-Betanzos
The explosion of big data has posed important challenges to researchers. Feature selection is paramount when dealing with high-dimensional datasets. We review the state-of-the-art and recent contributions in feature selection. The emerging challenges in feature selection are identified and discussed. In an era of growing data complexity and volume and the advent of big data, feature selection has a key role to play in helping reduce high-dimensionality in machine learning problems. We discuss the origins and importance of feature selection and outline recent contributions in a range of applications, from DNA microarray analysis to face recognition. Recent years have witnessed the creation of vast datasets and it seems clear that these will only continue to grow in size and number. This new big data scenario offers both opportunities and challenges to feature selection researchers, as there is a growing need for scalable yet efficient feature selection methods, given that existing methods are likely to prove inadequate.
Neural Computation | 2002
Enrique Castillo; Oscar Fontenla-Romero; Bertha Guijarro-Berdiñas; Amparo Alonso-Betanzos
The article presents a method for learning the weights in one-layer feed-forward neural networks minimizing either the sum of squared errors or the maximum absolute error, measured in the input scale. This leads to the existence of a global optimum that can be easily obtained solving linear systems of equations or linear programming problems, using much less computational power than the one associated with the standard methods. Another version of the method allows computing a large set of estimates for the weights, providing robust, mean or median, estimates for them, and the associated standard errors, which give a good measure for the quality of the fit. Later, the standard one-layer neural network algorithms are improved by learning the neural functions instead of assuming them known. A set of examples of applications is used to illustrate the methods. Finally, a comparison with other high-performance learning algorithms shows that the proposed methods are at least 10 times faster than the fastest standard algorithm used in the comparison.
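One way to read the idea in this abstract is that the error is measured before the output nonlinearity, so the desired outputs are mapped back through the inverse activation and the weights follow from an ordinary linear least-squares problem. The sketch below assumes a single logistic output unit and the sum-of-squared-errors criterion only; the function names and the hand-written Gaussian elimination are illustrative choices, not taken from the article.

```python
import math


def logit(d):
    """Inverse of the logistic function; d must lie strictly between 0 and 1."""
    return math.log(d / (1.0 - d))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def fit_one_layer(X, d):
    """Return [bias, w1, ..., wk] minimizing sum((w . x - logit(d))^2).

    Assumes one logistic output unit; the problem is linear in the weights,
    so the normal equations give the global optimum directly.
    """
    Z = [[1.0] + list(x) for x in X]  # prepend a constant input for the bias
    t = [logit(v) for v in d]         # targets moved to the pre-activation scale
    n = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(n)] for i in range(n)]
    b = [sum(z[i] * ti for z, ti in zip(Z, t)) for i in range(n)]
    return solve_linear_system(A, b)


def predict(w, x):
    """Logistic output of the fitted unit for input vector x."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-s))
```

No iterative training is involved: the cost is one pass over the data to build the normal equations plus one small linear solve, which is the source of the speed advantage the abstract reports.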
Intelligent Data Engineering and Automated Learning | 2007
Noelia Sánchez-Maroño; Amparo Alonso-Betanzos; María Tombilla-Sanromán
Adequate selection of features may improve the accuracy and efficiency of classifier methods. There are two main approaches for feature selection: wrapper methods, in which the features are selected using the classifier, and filter methods, in which the selection of features is independent of the classifier used. Although the wrapper approach may obtain better performance, it requires greater computational resources. For this reason, a new paradigm, the hybrid approach, which combines both filter and wrapper methods, has lately emerged. One of its problems is to select the filter method that gives the best relevance index for each case, and this is not an easy question. Different approaches to relevance evaluation lead to a large number of indices for ranking and selection. In this paper, several filter methods are applied over artificial data sets with different numbers of relevant features, levels of noise in the output, interaction between features and increasing numbers of samples. The results obtained for the four filters studied (ReliefF, Correlation-based Feature Selection, Fast Correlation-Based Filter and INTERACT) are compared and discussed. The final aim of this study is to select a filter to construct a hybrid method for feature selection.
Expert Systems With Applications | 2003
Amparo Alonso-Betanzos; Oscar Fontenla-Romero; Bertha Guijarro-Berdiñas; Elena Hernández-Pereira; María Inmaculada Paz Andrade; E. Jiménez; José Luis Legido Soto; T. Carballas
Over the last two decades in southern Europe, more than 10 million hectares of forest have been damaged by fire. Due to the costs and complications of fire-fighting, a number of technical developments in the field have appeared in recent years. This paper describes a system developed for the region of Galicia in NW Spain, one of the regions of Europe most affected by fires. This system fulfills three main aims: it acts as a preventive tool by predicting forest fire risks, it backs up the forest fire monitoring and extinction phase, and it assists in planning the recuperation of the burned areas. The forest fire prediction model is based on a neural network whose output is classified into four symbolic risk categories, obtaining an accuracy of 0.789. The other two main tasks are carried out by a knowledge-based system developed following the CommonKADS methodology. Currently we are working on the trial of the system in a controlled real environment. This will provide results on real behaviour that can be used to fine-tune the system to the point where it is considered suitable for installation in a real application environment.
Artificial Intelligence in Medicine | 2005
Oscar Fontenla-Romero; Bertha Guijarro-Berdiñas; Amparo Alonso-Betanzos; Vicente Moret-Bonillo
OBJECTIVES This paper presents a novel approach for sleep apnea classification. The goal is to classify each apnea into one of three basic types: obstructive, central and mixed. MATERIALS AND METHODS Three different supervised learning methods using a neural network were tested. The inputs of the neural network are the first level-5-detail coefficients obtained from a discrete wavelet transformation of the samples (previously detected as apnea) in the thoracic effort signal. In order to train and test the systems, 120 events from six different patients were used. The true error rate was estimated using 10-fold cross-validation. The results presented in this work were averaged over 100 different simulations and a multiple comparison procedure was used for model selection. RESULTS The method finally selected is based on a feedforward neural network trained using the Bayesian framework and a cross-entropy error function. The mean classification accuracy obtained over the test set was 83.78 ± 1.90%. CONCLUSION To the best of the authors' knowledge, the proposed classifier surpasses previous results. Finally, a scheme to maintain and improve this system during its clinical use is also proposed.
Computers & Industrial Engineering | 2013
Diego Fernández-Francos; David Martínez-Rego; Oscar Fontenla-Romero; Amparo Alonso-Betanzos
Rolling-element bearings are among the most used elements in industrial machinery, so early detection of a defect in these components is necessary to avoid major machine failures. Vibration analysis is a widely used condition monitoring technique for high-speed rotating machinery. Using the information contained in the vibration signals, an automatic method for bearing fault detection and diagnosis is presented in this work. Initially, a one-class ν-SVM is used to discriminate between normal and faulty conditions. In order to build a model of the normal operation regime, only data extracted under normal conditions is used. Band-pass filters and the Hilbert transform are then used sequentially to obtain the envelope spectrum of the original raw signal, which is finally used to identify the location of the problem. In order to check the performance of the method, two different data sets are used: (a) real data from a laboratory test-to-failure experiment and (b) data obtained from a fault-seeded bearing test. The results showed that the method was able not only to detect the failure at an incipient stage but also to identify the location of the defect and qualitatively assess its evolution over time.
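The envelope-spectrum step mentioned in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the band-pass filtering stage is omitted, a naive O(n²) DFT stands in for an FFT, and the test signal (a carrier at bin 32 amplitude-modulated at bin 4) is a made-up stand-in for a bearing vibration record.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform; adequate for short illustrative signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def envelope(x):
    """Magnitude of the analytic signal (Hilbert-transform envelope).

    Assumes len(x) is even. Positive frequencies are doubled and negative
    frequencies zeroed; the DC and Nyquist bins are left unchanged.
    """
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n // 2:
            spectrum[k] *= 2
        elif k > n // 2:
            spectrum[k] = 0
    return [abs(v) for v in dft(spectrum, inverse=True)]


def envelope_spectrum(x):
    """Magnitude spectrum of the mean-removed envelope, bins 0 .. n/2 - 1."""
    env = envelope(x)
    mean = sum(env) / len(env)
    centred = [e - mean for e in env]
    return [abs(v) for v in dft(centred)[: len(x) // 2]]


# Hypothetical test signal: carrier at bin 32, amplitude-modulated at bin 4.
n = 128
signal = [
    (1 + 0.5 * math.cos(2 * math.pi * 4 * t / n)) * math.cos(2 * math.pi * 32 * t / n)
    for t in range(n)
]
spectrum = envelope_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```

The dominant bin of the envelope spectrum corresponds to the modulation rate rather than the carrier, which is why this representation is useful for locating a defect: in a bearing, the modulation rate relates to how often the damaged element is struck.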