Rosa Blanco
University of the Basque Country
Publication
Featured research published by Rosa Blanco.
Artificial Intelligence in Medicine | 2004
Iñaki Inza; Pedro Larrañaga; Rosa Blanco; Antonio José Cerrolaza
DNA microarray experiments, which generate thousands of gene expression measurements, are used to collect information from tissue and cell samples regarding gene expression differences that could be useful for disease diagnosis, distinction of the specific tumor type, etc. One important application of gene expression microarray data is the classification of samples into known categories. As DNA microarray technology measures gene expression en masse, the resulting data have a number of features (genes) far exceeding the number of samples. As the predictive accuracy of supervised classifiers that try to discriminate between the classes of the problem decays in the presence of irrelevant and redundant features, a dimensionality reduction process is essential. We propose the application of a gene selection process, which also enables the biology researcher to focus on promising gene candidates that actively contribute to classification in these large-scale microarrays. Two basic approaches for feature selection appear in the machine learning and pattern recognition literature: the filter and wrapper techniques. Filter procedures are used in most of the works in the area of DNA microarrays. In this work, a comparison between a group of different filter metrics and a wrapper sequential search procedure is carried out. The comparison is performed on two well-known DNA microarray datasets using four classic supervised classifiers. The study is carried out over the original continuous gene expression data and over the same data discretized into three intervals. While two well-known filter metrics are proposed for continuous data, four classic filter measures are used over discretized data. The same wrapper approach is used for both continuous and discretized data.
The application of filter and wrapper gene selection procedures leads to considerably better accuracy results in comparison to the non-gene selection approach, coupled with notable dimensionality reductions. Although the wrapper approach is in most cases more accurate than the filter metrics, this improvement comes at a considerable computational cost. We note that most of the genes selected by the proposed filter and wrapper procedures in discrete and continuous microarray data appear in the lists of relevant, informative genes detected by previous studies over these datasets. The aim of this work is to contribute to the field of gene selection in DNA microarray datasets. Through an extensive comparison with the more popular filter techniques, we would like to contribute to the expansion and study of the wrapper approach in this type of domain.
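The contrast between the two families can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure or the metrics used in the paper: the data are synthetic (`make_toy_data` is hypothetical), the filter score is a simple signal-to-noise ratio, and the wrapper is a greedy forward search around a nearest-centroid classifier evaluated by leave-one-out accuracy.

```python
import random
import statistics

def make_toy_data(n_per_class=10, n_genes=10, seed=0):
    """Hypothetical data: gene 0 separates the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += 10.0 * label
            X.append(row)
            y.append(label)
    return X, y

def filter_rank(X, y):
    """Filter: score each gene on its own, without using any classifier."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [statistics.mean(r[j] for r in rows) for j in features]
        point = [X[i][j] for j in features]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)

def wrapper_forward(X, y, max_features=3):
    """Wrapper: grow the subset greedily, scoring each candidate with the classifier."""
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(len(X[0])) if j not in selected]
        scored = [(loo_accuracy(X, y, selected + [j]), j) for j in candidates]
        acc, j = max(scored, key=lambda t: (t[0], -t[1]))  # ties go to the lowest index
        if acc <= best:  # stop when no candidate improves the estimated accuracy
            break
        selected.append(j)
        best = acc
    return selected, best
```

The filter ranks genes in a single pass, whereas the wrapper retrains and re-evaluates the classifier for every candidate subset, which is where the extra computational cost of the wrapper approach comes from.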
Biodata Mining | 2008
Rubén Armañanzas; Iñaki Inza; Roberto Santana; Yvan Saeys; Jose Luis Flores; José Antonio Lozano; Yves Van de Peer; Rosa Blanco; Víctor Robles; Concha Bielza; Pedro Larrañaga
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
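As a rough illustration of the idea of learning a probabilistic model from promising solutions, the following Python sketch implements one of the simplest EDA variants, a univariate model with truncation selection, on a toy bit-counting objective. The objective, the parameter values, and the function name `umda_onemax` are assumptions chosen for the example and do not come from the paper.

```python
import random

def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=0):
    """Univariate EDA sketch: maximize the number of ones in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # the model: one independent probability per bit
    best = None
    for _ in range(generations):
        # Sample a new population from the current probabilistic model.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        population.sort(key=sum, reverse=True)
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        # Keep the promising solutions and re-estimate the model from them.
        selected = population[:n_select]
        probs = [
            # Clamp the marginals so that no bit value becomes impossible to sample.
            min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
            for i in range(n_bits)
        ]
    return best, probs
```

More complex EDA variants replace the vector of independent marginals with a model that captures dependencies between variables; the sample, select, re-estimate loop stays the same.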
International Journal of Pattern Recognition and Artificial Intelligence | 2004
Rosa Blanco; Pedro Larrañaga; Iñaki Inza; Basilio Sierra
Although cancer classification has improved considerably, a general method for classifying known types of cancer has not yet been developed. In this work, we propose the use of supervised classification techniques, coupled with feature subset selection algorithms, to automatically perform this classification in gene expression datasets. Due to the large number of features in gene expression datasets, the search for a highly accurate combination of features is done by means of the new Estimation of Distribution Algorithms paradigm. In order to assess the accuracy level of the proposed approach, the naive-Bayes classification algorithm is employed in a wrapper form. Promising results are achieved, in addition to a considerable reduction in the number of genes. By stating the optimal selection of genes as a search task, an automatic and robust choice of the genes finally selected is performed, in contrast to previous works that research the same types of problems.
International Journal of Intelligent Systems | 2003
Rosa Blanco; Iñaki Inza; Pedro Larrañaga
The induction of the optimal Bayesian network structure is NP‐hard, justifying the use of search heuristics. Two novel population‐based stochastic search approaches, univariate marginal distribution algorithm (UMDA) and population‐based incremental learning (PBIL), are used to learn a Bayesian network structure from a database of cases in a score + search framework. A comparison with a genetic algorithm (GA) approach is performed using three different scores: penalized maximum likelihood, marginal likelihood, and information‐theory–based entropy. Experimental results show the interesting capabilities of both novel approaches with respect to the score value and the number of generations needed to converge.
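The update rule behind PBIL can be sketched as follows. This is a minimal illustration and not the paper's setup: the network score is replaced by a toy bit-counting objective, and the encoding, parameter values, and the name `pbil_onemax` are assumptions made for the example.

```python
import random

def pbil_onemax(n_bits=20, pop_size=40, learning_rate=0.1, generations=100, seed=0):
    """PBIL sketch: shift a probability vector towards the best sampled individual."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    best = []
    for _ in range(generations):
        # Sample a population from the current probability vector.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        leader = max(population, key=sum)  # toy score: number of ones
        if sum(leader) > sum(best):
            best = leader
        # Move each probability a small step towards the leader's bit value.
        probs = [
            (1 - learning_rate) * p + learning_rate * bit
            for p, bit in zip(probs, leader)
        ]
    return best, probs
```

In a structure-learning setting, each individual would encode a candidate network and `sum` would be replaced by one of the scores mentioned above; UMDA differs in that it re-estimates the probabilities from a set of selected individuals instead of nudging them towards a single leader.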
Journal of Biomedical Informatics | 2005
Rosa Blanco; Iñaki Inza; Marisa Merino; Jorge Quiroga; Pedro Larrañaga
The transjugular intrahepatic portosystemic shunt (TIPS) is a treatment for cirrhotic patients with portal hypertension. A subgroup of patients dies in the first 6 months and another subgroup lives for a long period of time. To date, no risk factors have been identified to determine how long a patient will survive. An empirical study for predicting survival within the first 6 months after TIPS placement is conducted using a clinical database with 107 cases and 77 variables. Applications of Bayesian classification models, based on Bayesian networks, to medical problems have become popular in recent years. Feature subset selection is useful due to the heterogeneity of medical databases, where not all the variables are required to perform the classification. In this paper, filter and wrapper approaches based on feature subset selection are adapted to induce Bayesian classifiers (naive Bayes, selective naive Bayes, semi naive Bayes, tree augmented naive Bayes, and k-dependence Bayesian classifier) and are applied to distinguish between the two subgroups of cirrhotic patients. The estimated accuracies obtained tally with the results of previous studies. Moreover, the medical significance of the subset of variables selected by the classifiers, along with the comprehensibility of Bayesian models, is greatly appreciated by physicians.
Archive | 2002
Rosa Blanco; José Antonio Lozano
In this paper we present an empirical comparison between different implementations of Estimation of Distribution Algorithms in discrete domains. The empirical comparison is carried out in relation to three different criteria: the convergence velocity, the convergence reliability and the scalability. Different function sets are optimized depending on the aspect to evaluate.
Archive | 2004
Rosa Blanco; Iñaki Inza; Pedro Larrañaga
In this work, two novel sequential algorithms for learning Bayesian networks are proposed. The presented sequential search methods are an adaptation of a pair of algorithms proposed for feature subset selection: Sequential Forward Floating Selection and Sequential Backward Floating Selection. As far as we know, these algorithms have never been used for learning Bayesian networks. An empirical comparison between the results of the proposed algorithms and the results of two sequential algorithms (the classical B-algorithm and its extension, the B3 algorithm) is carried out over four databases from the literature. The experiments show promising results for the floating approach to the problem of learning Bayesian networks.
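For readers unfamiliar with floating search, the sketch below shows Sequential Forward Floating Selection in its original feature-subset form, not the Bayesian network adaptation proposed in this work. The `score` callable and the toy scoring function in the usage note are hypothetical; they only stand in for whatever subset quality measure is being optimized.

```python
def sffs(items, score, target_size):
    """Sequential Forward Floating Selection over subsets of `items`.

    Returns a dict mapping subset size to (best score found, subset).
    """
    current = []
    best_by_size = {}
    while len(current) < target_size:
        # Forward step: add the item that gives the best-scoring enlarged subset.
        candidates = [x for x in items if x not in current]
        add = max(candidates, key=lambda x: score(current + [x]))
        current = current + [add]
        size, value = len(current), score(current)
        if size not in best_by_size or value > best_by_size[size][0]:
            best_by_size[size] = (value, list(current))
        else:
            current = list(best_by_size[size][1])  # fall back to the best known subset
        # Floating step: drop items while that beats the best smaller subset seen.
        while len(current) > 2:
            drop = max(current, key=lambda x: score([c for c in current if c != x]))
            reduced = [c for c in current if c != drop]
            reduced_value = score(reduced)
            if reduced_value > best_by_size[len(reduced)][0]:
                current = reduced
                best_by_size[len(reduced)] = (reduced_value, list(reduced))
            else:
                break
    return best_by_size

def toy_score(subset):
    """Hypothetical score: items 1 and 2 are strong together, but item 0 spoils them."""
    weights = {0: 3, 1: 2, 2: 2, 3: 1}
    value = sum(weights[x] for x in subset)
    if 1 in subset and 2 in subset:
        value += 5
        if 0 in subset:
            value -= 6
    return value
```

With `toy_score`, a plain forward search commits to item 0 first and never recovers the pair (1, 2). Calling `sffs([0, 1, 2, 3], toy_score, 3)` lets the floating step discard item 0 once the pair has been assembled, which is the behavior that motivates the floating variants.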
Journal of Intelligent and Fuzzy Systems | 2002
Iñaki Inza; Basilio Sierra; Rosa Blanco; Pedro Larrañaga
Probabilistic Graphical Models | 2002
Rosa Blanco; Iñaki Inza; Pedro Larrañaga
Archive | 2001
Rosa Blanco; Pedro Larrañaga; Basilio Sierra