Ana Cernea
University of Oviedo
Publications
Featured research published by Ana Cernea.
International Journal of Pattern Recognition and Artificial Intelligence | 2015
Juan Luis Fernández-Martínez; Ana Cernea
In this paper, we present a supervised ensemble learning algorithm, called SCAV1, and its application to face recognition. This algorithm exploits the uncertainty space of the ensemble classifiers. Its design includes six different nearest-neighbor (NN) classifiers based on different and diverse image attributes: histogram, variogram, texture analysis, edges, bidimensional discrete wavelet transform, and Zernike moments. In this approach, each attribute, together with its corresponding type of analysis (local or global) and the distance criterion (p-norm), induces a different individual NN classifier. The ensemble classifier SCAV1 depends on a set of parameters: the number of candidate images used by each individual method to perform the final classification and the weight given to each individual classifier. The SCAV1 parameters are optimized/sampled using a supervised approach via the regressive particle swarm optimization algorithm (RR-PSO). The final classifier exploits the uncertainty space of SCAV1 and uses majority voting (Borda count) as the final decision rule. We show the application of this algorithm to the ORL and PUT image databases, obtaining very high and stable accuracies (100% median accuracy and almost null interquartile range). In conclusion, exploring the uncertainty space of ensemble classifiers provides optimum results and seems to be the appropriate strategy to adopt for face recognition and other classification problems.
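The abstract does not include an implementation, but the consensus step it describes can be illustrated with a minimal sketch: several nearest-neighbor classifiers, each built on a different feature representation, rank the gallery identities, and the ranks are combined by a weighted Borda count. The feature extractors, weights, and toy data below are placeholders for illustration, not the actual SCAV1 attributes or parameters.

```python
import numpy as np

def rank_identities(train_feats, train_labels, query_feat, p=2):
    """Rank gallery identities by their closest p-norm distance to the query."""
    dists = np.linalg.norm(train_feats - query_feat, ord=p, axis=1)
    labels = np.unique(train_labels)
    best = np.array([dists[train_labels == c].min() for c in labels])
    order = np.argsort(best)                  # position 0 = closest identity
    ranks = np.empty(len(labels), dtype=int)
    ranks[order] = np.arange(len(labels))
    return labels, ranks

def borda_ensemble(members, weights, query_image):
    """Weighted Borda-count consensus over several NN classifiers.

    Each member is (feature_fn, train_feats, train_labels, p); feature_fn maps
    an image into the feature space in which that member's distances are computed.
    """
    total, labels = None, None
    for (feature_fn, feats, y, p), w in zip(members, weights):
        labels, ranks = rank_identities(feats, y, feature_fn(query_image), p)
        points = len(labels) - 1 - ranks      # Borda points: best rank gets most
        total = w * points if total is None else total + w * points
    return labels[np.argmax(total)]

# Toy usage with two illustrative "attributes": raw pixels and a gray-level histogram.
rng = np.random.default_rng(0)
images = rng.random((20, 8, 8)); ids = np.repeat(np.arange(5), 4)
pix = lambda im: im.reshape(-1)
hist = lambda im: np.histogram(im, bins=16, range=(0, 1))[0].astype(float)
members = [(pix, np.array([pix(i) for i in images]), ids, 2),
           (hist, np.array([hist(i) for i in images]), ids, 1)]
print(borda_ensemble(members, weights=[0.6, 0.4], query_image=images[7]))
```

In the paper, the per-classifier weights and the number of candidates are the parameters tuned by RR-PSO; here they are simply fixed by hand.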
International Journal of Pattern Recognition and Artificial Intelligence | 2014
Ana Cernea; Juan Luis Fernández-Martínez
In this paper, we propose different ensemble learning algorithms and their application to the face recognition problem. Three types of attributes are used for image representation: statistical features, spectral features, and segmentation features and regional descriptors. Classification is performed by nearest neighbor using different p-norms defined in the corresponding spaces of attributes. In this approach, each attribute, together with its corresponding type of analysis (local or global) and the distance criterion (norm or cosine), defines a different classifier. The classification is unsupervised, since no class information is used to improve the design of the different classifiers. Three different versions of ensemble classifiers are proposed in this paper: CAV1, CAV2, and CBAG; the main difference among them is how the candidate images that take part in the consensus are selected. The main results shown in this paper are the following: 1. The statistical attributes (local histogram and percentiles) are the individual classifiers that provide the highest accuracies, followed by the spectral methods (DWT) and the regional features (texture analysis). 2. No single attribute is able to systematically provide 100% accuracy over the ORL database. 3. The accuracy and stability of the classification are increased by consensus classification (ensemble learning techniques). 4. Optimum results are obtained by reducing the number of classifiers taking into account their diversity, and by optimizing the parameters of these classifiers using a member of the Particle Swarm Optimization (PSO) family. These results accord with the conclusions presented in the literature on ensemble learning methodologies: it is possible to build strong classifiers by assembling different weak (or simple) classifiers based on different and diverse image attributes. Given these encouraging results, future research will be devoted to the use of supervised ensemble techniques in face recognition and in other important biometric problems.
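As a rough illustration of the candidate-based consensus described here (assuming, as a simplification, that each attribute classifier proposes its top-k identities and the ensemble takes a plain majority vote over those candidates; k and the member definitions are placeholders):

```python
import numpy as np
from collections import Counter

def top_candidates(train_feats, train_labels, query_feat, p=2, k=3):
    """Return the k identities whose nearest sample is closest to the query (p-norm)."""
    dists = np.linalg.norm(train_feats - query_feat, ord=p, axis=1)
    labels = np.unique(train_labels)
    best = np.array([dists[train_labels == c].min() for c in labels])
    return labels[np.argsort(best)[:k]]

def consensus_vote(members, query_image, k=3):
    """Majority vote over the candidate lists proposed by each attribute classifier."""
    votes = Counter()
    for feature_fn, feats, y, p in members:
        votes.update(top_candidates(feats, y, feature_fn(query_image), p, k))
    return votes.most_common(1)[0][0]
```

The CAV1, CAV2, and CBAG variants differ precisely in how these candidate lists are built; the sketch above only shows the simplest common pattern.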
International Journal of Pattern Recognition and Artificial Intelligence | 2014
Juan Luis Fernández-Martínez; Ana Cernea
Face recognition is a challenging problem in computer vision and artificial intelligence. One of the main challenges consists in establishing a low-dimensional feature representation of the images with enough discriminatory power to perform high-accuracy classification. Different methods of supervised and unsupervised classification can be found in the literature, but few numerical comparisons among them have been performed on the same computing platform. In this paper, we perform this kind of comparison, revisiting the main spectral decomposition methods for face recognition. We also introduce, for the first time, the use of noncentered PCA and the 2D discrete Chebyshev transform for biometric applications. Faces are represented by their spectral features, that is, their projections onto the different spectral bases. Classification is performed using different norms and/or the cosine defined by the Euclidean scalar product in the space of spectral attributes. Although this constitutes a simple algorithm of unsupervised classification, several important conclusions arise from this analysis: (1) All the spectral methods provide approximately the same accuracy when they are used with the same energy cutoff. This is an important conclusion, since many publications try to promote one specific spectral method over the others. Nevertheless, there exist small variations in the highest median accuracy rates: PCA, 2DPCA, and DWT perform better in this case. Also, all the covariance-free spectral decomposition techniques based on single images (DCT, DST, DCHT, DWT, DWHT, DHT) are very interesting, since they provide high accuracies and are not computationally expensive compared to covariance-based techniques. (2) The use of local spectral features generally provides higher accuracies than global features for the spectral methods that use the whole training database (PCA, NPCA, 2DPCA, Fisher's LDA, ICA). For the methods based on orthogonal transformations of single images, global features calculated over the whole size of the images appear to perform better. (3) The distance criterion generally provides a higher accuracy than the cosine criterion. The use of other p-norms (p > 2) provides similar results to the Euclidean norm; nevertheless, some methods perform better. (4) No spectral method can provide 100% accuracy by itself. Therefore, other kinds of attributes and supervised learning algorithms are needed. These results are coherent for the ORL and FERET databases. Finally, although this comparison has been performed for the face recognition problem, it could be generalized to other biometric authentication problems.
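A minimal sketch of the covariance-free idea, using the 2D DCT as one representative transform: keep a low-frequency block of coefficients as the spectral feature vector and classify by Euclidean distance or by the cosine criterion. The cutoff value and the choice of transform are illustrative, not the paper's tuned settings.

```python
import numpy as np
from scipy.fft import dctn

def dct_features(image, cutoff=8):
    """Keep the low-frequency cutoff x cutoff block of 2D DCT coefficients."""
    coeffs = dctn(image, norm="ortho")
    return coeffs[:cutoff, :cutoff].reshape(-1)

def classify(train_feats, train_labels, query_feat, criterion="distance"):
    """Nearest neighbor in the spectral space, by distance or cosine."""
    if criterion == "distance":                       # Euclidean norm
        scores = -np.linalg.norm(train_feats - query_feat, axis=1)
    else:                                             # cosine of the angle
        scores = (train_feats @ query_feat) / (
            np.linalg.norm(train_feats, axis=1) * np.linalg.norm(query_feat))
    return train_labels[np.argmax(scores)]
```

Covariance-based methods (PCA, 2DPCA, LDA, ICA) would replace the per-image transform with a basis learned from the whole training set; the classification step stays the same.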
swarm evolutionary and memetic computing | 2013
Juan Luis Fernández-Martínez; Ana Cernea; Esperanza García-Gonzalo; Julian Velasco; Bijaya Ketan Panigrahi
This paper presents the stochastic stability analysis of a novel PSO version, the aligned PSO, and its application to the face recognition problem using supervised learning techniques. Its application to the ORL database provides 100% median identification accuracy over 100 independent runs.
international conference on bioinformatics and biomedical engineering | 2018
Juan Luis Fernández-Martínez; Ana Cernea; Enrique J. deAndrés-Galiana; Francisco Javier Fernández-Ovies; Zulima Fernández-Muñiz; Oscar Alvarez-Machancoses; Leorey N. Saligan; Stephen T. Sonis
In this paper, we introduce the holdout sampler to find the defective pathways in highly underdetermined phenotype prediction problems. This sampling algorithm is inspired by the bootstrapping procedure used in regression analysis to establish confidence bounds. We show that working with partial information (data bags) serves to sample the linear uncertainty region in a simple regression problem, mainly along the axis of greatest uncertainty, which corresponds to the smallest singular value of the system matrix. This procedure, applied to a phenotype prediction problem, considered as a generalized prediction problem between the set of genetic signatures and the set of classes into which the phenotype is divided, serves to unravel the ensemble of altered pathways in the transcriptome that are involved in the disease development. The algorithm looks for the minimum-scale genetic signature in each random holdout, and the likelihood (predictive accuracy) is established on the validation dataset via a nearest-neighbor classifier. The posterior analysis serves to identify the header genes that most frequently appear in the different holdouts and are therefore robust to a partial lack of samples. These genes are used to establish the genetic pathways and the biological processes involved in the disease progression. This algorithm is much faster, more robust, and simpler than Bayesian Networks. We show its application to a microarray dataset concerning a type of breast cancer with poor prognosis (TNBC).
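A schematic of the holdout idea, under stated assumptions: repeatedly split the samples into a training bag and a validation holdout, pick a small high-ranking gene signature on the bag, keep it if a nearest-neighbor classifier validates well, and count how often each gene appears across the accepted holdouts. The gene-ranking score, signature size, and accuracy threshold below are stand-ins, not the paper's minimum-scale signature construction.

```python
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def holdout_sampler(X, y, n_holdouts=200, sig_size=20, acc_threshold=0.85, seed=0):
    """Count how often each gene enters a small, well-validating signature.

    X: (samples x genes) expression matrix; y: binary phenotype labels.
    """
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(n_holdouts):
        Xtr, Xval, ytr, yval = train_test_split(
            X, y, test_size=0.25, random_state=int(rng.integers(1 << 31)), stratify=y)
        # Simple class-separation score per gene (placeholder ranking criterion).
        m0, m1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
        s0, s1 = Xtr[ytr == 0].std(0), Xtr[ytr == 1].std(0)
        score = (m0 - m1) ** 2 / (s0 ** 2 + s1 ** 2 + 1e-12)
        genes = np.argsort(score)[::-1][:sig_size]
        clf = KNeighborsClassifier(n_neighbors=1).fit(Xtr[:, genes], ytr)
        if clf.score(Xval[:, genes], yval) >= acc_threshold:   # likelihood filter
            counts.update(genes.tolist())
    return counts   # the most common entries play the role of "header genes"
```

The posterior analysis in the paper then maps the most frequent genes to pathways and biological processes; that annotation step is outside the scope of this sketch.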
Entropy | 2018
J. L. G. Pallero; María Fernández-Muñiz; Ana Cernea; Oscar Alvarez-Machancoses; L.M. Pedruelo-González; Sylvain Bonvalot; Juan Luis Fernández-Martínez
Most inverse problems in industry (and particularly in geophysical exploration) are highly underdetermined because the number of model parameters is too high to achieve accurate data predictions and because the sampling of the data space is scarce and incomplete; it is always affected by different kinds of noise. Additionally, the physics of the forward problem is a simplification of reality. All these facts result in the inverse problem solution not being unique; that is, there are different inverse solutions (called equivalent), compatible with the prior information, that fit the observed data within similar error bounds. In the case of nonlinear inverse problems, these equivalent models are located in disconnected flat curvilinear valleys of the cost-function topography. The uncertainty analysis consists of obtaining a representation of this complex topography via different sampling methodologies. In this paper, we focus on the use of a particle swarm optimization (PSO) algorithm to sample the region of equivalence in nonlinear inverse problems. Although this methodology has a general purpose, we show its application to the uncertainty assessment of the solution of a geophysical problem concerning gravity inversion in sedimentary basins, showing that it is possible to efficiently perform this task in a sampling-while-optimizing mode. In particular, we explain how to use and analyze the geophysical models sampled by exploratory PSO family members to infer different descriptors of nonlinear uncertainty.
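A minimal illustration of the sampling-while-optimizing idea, under simple assumptions: run a plain PSO on a data-misfit function and keep every visited model whose misfit falls within a tolerance; the retained models then approximate the region of equivalence. The misfit function, bounds, tolerance, and PSO constants are placeholders, not the exploratory PSO variants used in the paper.

```python
import numpy as np

def pso_sample_equivalence(misfit, bounds, tol, n_particles=40, n_iter=200, seed=0):
    """Plain PSO that also collects every visited model with misfit <= tol."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    x = rng.uniform(lo, hi, size=(n_particles, lo.size))
    v = np.zeros_like(x)
    p_best, p_val = x.copy(), np.array([misfit(m) for m in x])
    g_best = p_best[p_val.argmin()]
    equivalent = [m.copy() for m, f in zip(x, p_val) if f <= tol]
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (p_best - x) + 1.5 * r2 * (g_best - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([misfit(m) for m in x])
        improved = f < p_val
        p_best[improved], p_val[improved] = x[improved], f[improved]
        g_best = p_best[p_val.argmin()]
        equivalent += [m.copy() for m, fi in zip(x, f) if fi <= tol]  # sampling while optimizing
    return g_best, np.array(equivalent)

# Toy usage: sample the equivalence region of a simple quadratic misfit.
best, eq_models = pso_sample_equivalence(
    lambda m: float(np.sum((m - 1.0) ** 2)), ([-5, -5], [5, 5]), tol=0.1)
```

The collected `eq_models` would then be summarized (e.g. per-parameter spreads or model medians) to obtain the nonlinear uncertainty descriptors mentioned in the abstract.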
international conference on bioinformatics and biomedical engineering | 2018
Ana Cernea; Juan Luis Fernández-Martínez; Enrique J. deAndrés-Galiana; Francisco Javier Fernández-Ovies; Zulima Fernández-Muñiz; Oscar Alvarez-Machancoses; Leorey N. Saligan; Stephen T. Sonis
In this paper, we introduce the Fisher's ratio sampler, which serves to unravel the defective pathways in highly underdetermined phenotype prediction problems. This sampling algorithm first selects the most discriminatory genes, which are at the same time differentially expressed, and samples the high-discriminatory genetic networks with a prior probability that is proportional to their individual Fisher's ratios. The number of genes in the different networks is randomly established, taking into account the length of the minimum-scale signature of the phenotype prediction problem, which is the one that contains the most discriminatory genes with the maximum predictive power. The likelihood of the different networks is established via leave-one-out cross-validation. Finally, the posterior analysis of the most frequently sampled genes serves to establish the defective biological pathways. This novel sampling algorithm is much faster and simpler than Bayesian Networks. We show its application to a microarray dataset concerning a type of breast cancer with very poor prognosis (TNBC). In this kind of cancer, the breast cancer cells have tested negative for human epidermal growth factor receptor 2 (HER2), estrogen receptors (ER), and progesterone receptors (PR). As a consequence, common treatments like hormone therapy and drugs that target estrogen, progesterone, and HER2 are ineffective. We believe that the genetic pathways identified via the Fisher's ratio sampler, which are mainly related to signaling pathways, provide new insights into the molecular mechanisms that are involved in this complex disease. The Fisher's ratio sampler can also be applied to the genetic analysis of other complex diseases.
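A minimal sketch of the sampler as described: compute a per-gene Fisher's ratio, draw gene networks with probability proportional to it, score each network by leave-one-out cross-validation with a nearest-neighbor classifier, and count the genes that recur in well-performing networks. Network count, maximum size, and the acceptance threshold are illustrative parameters, not the paper's settings.

```python
import numpy as np
from collections import Counter
from sklearn.model_selection import cross_val_score, LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

def fishers_ratio(X, y):
    """Per-gene Fisher's ratio for a two-class expression matrix X (samples x genes)."""
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    v0, v1 = X[y == 0].var(0), X[y == 1].var(0)
    return (m0 - m1) ** 2 / (v0 + v1 + 1e-12)

def fisher_ratio_sampler(X, y, n_networks=500, max_size=30, acc_threshold=0.9, seed=0):
    """Sample gene networks with prior probability proportional to Fisher's ratio,
    keep those with high LOOCV accuracy (1-NN), and count gene frequencies."""
    rng = np.random.default_rng(seed)
    fr = fishers_ratio(X, y)
    prior = fr / fr.sum()
    counts = Counter()
    for _ in range(n_networks):
        size = int(rng.integers(2, max_size + 1))
        genes = rng.choice(X.shape[1], size=size, replace=False, p=prior)
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                              X[:, genes], y, cv=LeaveOneOut()).mean()
        if acc >= acc_threshold:
            counts.update(genes.tolist())
    return counts
```

As in the holdout sketch above, the pathway-level interpretation of the most frequently sampled genes is a separate annotation step not shown here.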
international conference on bioinformatics and biomedical engineering | 2018
Ana Cernea; Juan Luis Fernández-Martínez; Enrique J. deAndrés-Galiana; Francisco Javier Fernández-Ovies; Zulima Fernández-Muñiz; Oscar Alvarez-Machancoses; Leorey N. Saligan; Stephen T. Sonis
In this paper, we compare different sampling algorithms used for identifying the defective pathways in highly underdetermined phenotype prediction problems. The first algorithm (Fisher's ratio sampler) selects the most discriminatory genes and samples the high-discriminatory genetic networks according to a prior probability that is proportional to their individual Fisher's ratios. The second one (holdout sampler) is inspired by the bootstrapping procedure used in regression analysis and uses the minimum-scale signatures found in different random holdouts to establish the most frequently sampled genes. The third one is a pure random sampler, which randomly builds networks of differentially expressed genes. In all these algorithms, the likelihood of the different networks is established via leave-one-out cross-validation (LOOCV), and the posterior analysis of the most frequently sampled genes serves to establish the altered biological pathways. These algorithms are compared to the results obtained via Bayesian Networks (BNs). We show the application of these algorithms to a microarray dataset concerning Triple Negative Breast Cancers. This comparison shows that the Random, Fisher's ratio, and Holdout samplers are more effective than BNs, and all provide similar insights about the genetic mechanisms that are involved in this disease. Therefore, it can be concluded that all these samplers are good alternatives to Bayesian Networks, with much lower computational demands. Besides, this analysis confirms the insight that the altered pathways should be independent of the sampling methodology and the classifier used to infer them.
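The Fisher's ratio and holdout samplers are sketched after the two previous abstracts; the remaining member of the comparison, a pure random sampler, is even simpler. A hedged sketch, assuming the differentially expressed genes are given as a boolean mask and each sampled network is scored afterwards by LOOCV exactly as in the Fisher's ratio sketch:

```python
import numpy as np

def random_sampler(de_mask, n_networks=500, max_size=30, seed=0):
    """Pure random sampler: build gene networks by drawing uniformly
    among the differentially expressed genes flagged in de_mask."""
    rng = np.random.default_rng(seed)
    de_idx = np.flatnonzero(de_mask)
    return [rng.choice(de_idx, size=int(rng.integers(2, max_size + 1)), replace=False)
            for _ in range(n_networks)]
```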
international conference on artificial intelligence and soft computing | 2018
Óscar Álvarez; Juan Luis Fernández-Martínez; Celia Fernández-Brillet; Ana Cernea; Zulima Fernández-Muñiz; Andrzej Kloczkowski
We discuss the applicability of Principal Component Analysis and Particle Swarm Optimization to protein tertiary structure prediction. The proposed algorithm is based on establishing a low-dimensional space where the sampling (and optimization) is carried out via a Particle Swarm Optimizer (PSO). The reduced space is found via Principal Component Analysis (PCA) performed on a set of previously found low-energy protein models. A high-frequency term is added to this expansion by projecting the best decoy onto the PCA basis set and calculating the residual model. Our results show that PSO improves the energy of the best decoy used in the PCA, provided that an adequate number of PCA terms is considered.
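A sketch of the reduced-space construction under stated assumptions: fit PCA to a set of flattened low-energy decoy coordinate vectors, project the best decoy onto the retained components, and keep the projection residual as the extra high-frequency term; PSO would then search over the PCA coefficients. The number of components and the reconstruction scaling are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA

def build_reduced_space(decoys, best_decoy, n_components=10):
    """PCA basis from low-energy decoys plus a residual 'high-frequency' term.

    decoys: (n_decoys, n_coords) flattened coordinate vectors.
    Returns (pca, residual) so that a model is reconstructed as
    pca.inverse_transform(coeffs) + alpha * residual.
    """
    pca = PCA(n_components=n_components).fit(decoys)
    projected = pca.inverse_transform(pca.transform(best_decoy[None, :]))[0]
    residual = best_decoy - projected      # part of the best decoy outside the basis
    return pca, residual

def reconstruct(pca, residual, coeffs, alpha=1.0):
    """Map reduced-space coefficients (the variables searched by PSO) back to coordinates."""
    return pca.inverse_transform(coeffs[None, :])[0] + alpha * residual
```

The energy function minimized by PSO in the reduced space is problem-specific and is not part of this sketch.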
Archive | 2017
Juan Carlos Beltrán Vargas; Enrique J. deAndrés-Galiana; Ana Cernea; Juan Luis Fernández-Martínez
Searching for new biomarkers, biological networks, and pathways is crucial in addressing neurodegenerative diseases. In this research, we compared three different algorithms and resampling techniques to find possible genetic causes in patients with Alzheimer's and Parkinson's diseases, providing some interesting insights into the main causes involved in these diseases.