José Antonio Lozano
University of the Basque Country
Publication
Featured research published by José Antonio Lozano.
Archive | 2002
Pedro Larrañaga; José Antonio Lozano
Partial abductive inference in Bayesian networks is intended as the process of generating the K most probable configurations for a distinguished subset of the network variables (explanation set), given some observations (evidence). This problem, also known as the Maximum a Posteriori Problem, is known to be NP-hard, so exact computation is not always possible. As partial abductive inference in Bayesian networks can be viewed as a combinatorial optimization problem, Genetic Algorithms have been successfully applied to give an approximate algorithm for it (de Campos et al., 1999). In this work we approach the problem by means of Estimation of Distribution Algorithms, and an empirical comparison between the results obtained by Genetic Algorithms and Estimation of Distribution Algorithms is carried out.
Pattern Recognition Letters | 1999
Jose M. Peña; José Antonio Lozano; Pedro Larrañaga
In this paper, we aim to compare empirically four initialization methods for the K-Means algorithm: random, Forgy, MacQueen and Kaufman. Although this algorithm is known for its robustness, it is widely reported in the literature that its performance depends upon two key points: initial clustering and instance order. We conduct a series of experiments to characterize (in terms of mean, maximum, minimum and standard deviation) the probability distribution of the square-error values of the final clusters returned by the K-Means algorithm, independently of any initial clustering and of any instance order, when each of the four initialization methods is used. The results of our experiments illustrate that the random and the Kaufman initialization methods outperform the rest of the compared methods, as they make the K-Means algorithm more effective and less dependent on initial clustering and on instance order. In addition, we compare the convergence speed of the K-Means algorithm when using each of the four initialization methods. Our results suggest that the Kaufman initialization method gives the K-Means algorithm more desirable behaviour with respect to convergence speed than the random initialization method.
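The dependence on the initial clustering described above can be illustrated with a minimal sketch. It assumes common textbook forms of two of the four methods (Forgy: pick K instances as seeds; random: partition the instances at random and average each part) and is not the experimental code from the paper. The function names, the round-robin dealing that keeps clusters non-empty, and the `max_iter` parameter are all assumptions made for illustration.

```python
import random

def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    dim = len(cluster[0])
    return [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def forgy_init(points, k, rng):
    """Forgy-style start: use k distinct instances as the initial centroids."""
    return [list(p) for p in rng.sample(points, k)]

def random_partition_init(points, k, rng):
    """Random start: split the instances into k random groups and average them.
    Shuffled indices are dealt round-robin so that no group is empty
    (an assumption of this sketch; it needs len(points) >= k)."""
    order = list(range(len(points)))
    rng.shuffle(order)
    clusters = [[] for _ in range(k)]
    for pos, idx in enumerate(order):
        clusters[pos % k].append(points[idx])
    return [mean(c) for c in clusters]

def kmeans(points, centroids, max_iter=100):
    """Batch K-Means; returns the final centroids and the square-error value."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    error = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, error
```

Running `kmeans` many times with different seeds for each initializer, and summarizing the returned square-error values by mean, maximum, minimum and standard deviation, gives the kind of comparison the abstract describes.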
Archive | 2004
Xin Yao; Edmund K. Burke; José Antonio Lozano; Jim Smith; Juan J. Merelo-Guervós; John A. Bullinaria; Jonathan E. Rowe; Peter Tiňo; Ata Kabán; Hans-Paul Schwefel
Two ideas taken from Bayesian optimization and classifier systems are presented for personnel scheduling based on choosing a suitable scheduling rule from a set for each person’s assignment. Unlike our previous work of using genetic algorithms whose learning is implicit, the learning in both approaches is explicit, i.e. we are able to identify building blocks directly. To achieve this target, the Bayesian optimization algorithm builds a Bayesian network of the joint probability distribution of the rules used to construct solutions, while the adapted classifier system assigns each rule a strength value that is constantly updated according to its usefulness in the current situation. Computational results from 52 real data instances of nurse scheduling demonstrate the success of both approaches. It is also suggested that the learning mechanism in the proposed approaches might be suitable for other scheduling problems.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010
Juan Diego Rodríguez; Aritz Pérez; José Antonio Lozano
In the machine learning field, the performance of a classifier is usually measured in terms of prediction error. In most real-world problems, the error cannot be exactly calculated and it must be estimated. Therefore, it is important to choose an appropriate estimator of the error. This paper analyzes the statistical properties, bias and variance, of the k-fold cross-validation classification error estimator (k-cv). Our main contribution is a novel theoretical decomposition of the variance of the k-cv considering its sources of variance: sensitivity to changes in the training set and sensitivity to changes in the folds. The paper also compares the bias and variance of the estimator for different values of k. The experimental study has been performed in artificial domains because they allow the exact computation of the implied quantities and we can rigorously specify the conditions of experimentation. The experimentation has been performed for two classifiers (naive Bayes and nearest neighbor), different numbers of folds, sample sizes, and training sets coming from assorted probability distributions. We conclude by including some practical recommendations on the use of k-fold cross-validation.
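As a rough illustration of the estimator discussed above, the following sketch computes a k-fold cross-validation error for a 1-nearest-neighbor classifier on one-dimensional data, and then measures how much the estimate moves when only the partition into folds changes. It is a minimal sketch and not the paper's decomposition: the function names, the `(x, label)` data layout and the `repeats` parameter are assumptions.

```python
import random
import statistics

def one_nn_predict(train, x):
    """Label of the training instance whose feature value is closest to x."""
    return min(train, key=lambda inst: abs(inst[0] - x))[1]

def k_fold_cv_error(data, k, rng):
    """Fraction of instances misclassified when each fold is held out once."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k disjoint folds
    errors = 0
    for i, test in enumerate(folds):
        train = [inst for j, fold in enumerate(folds) if j != i for inst in fold]
        errors += sum(one_nn_predict(train, x) != y for x, y in test)
    return errors / len(data)

def fold_sensitivity(data, k, repeats=20, seed=0):
    """Variance of the k-cv estimate over re-partitions of one fixed data set."""
    rng = random.Random(seed)
    estimates = [k_fold_cv_error(data, k, rng) for _ in range(repeats)]
    return statistics.pvariance(estimates)
```

Here `fold_sensitivity` holds the data set fixed and varies only the folds, so it isolates one of the two sources of variance named in the abstract; the other source would require drawing fresh training sets from the underlying distribution.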
Archive | 2006
José Antonio Lozano; Pedro Larrañaga; Iñaki Inza; Endika Bengoetxea
Linking Entropy to Estimation of Distribution Algorithms.- Entropy-based Convergence Measurement in Discrete Estimation of Distribution Algorithms.- Real-coded Bayesian Optimization Algorithm.- The CMA Evolution Strategy: A Comparing Review.- Estimation of Distribution Programming: EDA-based Approach to Program Generation.- Multi-objective Optimization with the Naive MIDEA.- A Parallel Island Model for Estimation of Distribution Algorithms.- GA-EDA: A New Hybrid Cooperative Search Evolutionary Algorithm.- Bayesian Classifiers in Optimization: An EDA-like Approach.- Feature Ranking Using an EDA-based Wrapper Approach.- Learning Linguistic Fuzzy Rules by Using Estimation of Distribution Algorithms as the Search Engine in the COR Methodology.- Estimation of Distribution Algorithm with 2-opt Local Search for the Quadratic Assignment Problem.
grid computing | 2014
Tania Lorido-Botran; José Miguel-Alonso; José Antonio Lozano
Cloud computing environments allow customers to dynamically scale their applications. The key problem is how to lease the right amount of resources, on a pay-as-you-go basis. Application re-dimensioning can be implemented effortlessly, adapting the resources assigned to the application to the incoming user demand. However, the identification of the right amount of resources to lease in order to meet the required Service Level Agreement, while keeping the overall cost low, is not an easy task. Many techniques have been proposed for automating application scaling. We propose a classification of these techniques into five main categories: static threshold-based rules, control theory, reinforcement learning, queuing theory and time series analysis. Then we use this classification to carry out a literature review of proposals for auto-scaling in the cloud.
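The first category in the classification above, static threshold-based rules, can be sketched in a few lines. This is a hypothetical rule, not one taken from the survey: the metric (average CPU utilization), the thresholds, the VM limits and the function name are all assumptions chosen for illustration.

```python
def scale_decision(cpu_utilization, current_vms,
                   upper=0.8, lower=0.3, min_vms=1, max_vms=10):
    """Return the new number of VMs under a static threshold rule.

    All thresholds and limits are illustrative assumptions.
    """
    if cpu_utilization > upper and current_vms < max_vms:
        return current_vms + 1  # scale out: demand exceeds the upper threshold
    if cpu_utilization < lower and current_vms > min_vms:
        return current_vms - 1  # scale in: resources are under-used
    return current_vms          # inside the band, or at a limit: no change
```

A rule like this is purely reactive and depends entirely on hand-picked thresholds, which is one reason the other four categories (control theory, reinforcement learning, queuing theory and time series analysis) exist.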
PLOS ONE | 2009
David Otaegui; Sergio E. Baranzini; Rubén Armañanzas; Borja Calvo; Maider Muñoz-Culla; Puya Khankhanian; Iñaki Inza; José Antonio Lozano; Tamara Castillo-Triviño; Ana Asensio; Javier Olaskoaga; Adolfo López de Munain
Differences in gene expression patterns have been documented not only in Multiple Sclerosis patients versus healthy controls but also in the relapse of the disease. Recently a new gene expression modulator has been identified: the microRNA or miRNA. The aim of this work is to analyze the possible role of miRNAs in multiple sclerosis, focusing on the relapse stage. We have analyzed the expression patterns of 364 miRNAs in PBMC obtained from multiple sclerosis patients in relapse status, in remission status and healthy controls. The expression patterns of the miRNAs with significantly different expression were validated in an independent set of samples. In order to determine the effect of the miRNAs, the expression of some of their predicted target genes was studied by qPCR. Gene interaction networks were constructed in order to obtain a co-expression and multivariate view of the experimental data. The data analysis and later validation reveal that two miRNAs (hsa-miR-18b and hsa-miR-599) may be relevant at the time of relapse and that another miRNA (hsa-miR-96) may be involved in remission. The genes targeted by hsa-miR-96 are involved in immunological pathways such as interleukin signaling and in other pathways such as Wnt signaling. This work highlights the importance of miRNA expression in the molecular mechanisms implicated in the disease. Moreover, the proposed involvement of these small molecules in multiple sclerosis opens up a new therapeutic approach to explore and highlights some candidate biomarker targets in MS.
IEEE Transactions on Evolutionary Computation | 2008
Roberto Santana; Pedro Larrañaga; José Antonio Lozano
Simplified lattice models have played an important role in protein structure prediction and protein folding problems. These models can be useful for an initial approximation of the protein structure, and for the investigation of the dynamics that govern the protein folding process. Estimation of distribution algorithms (EDAs) are efficient evolutionary algorithms that can learn and exploit the search space regularities in the form of probabilistic dependencies. This paper introduces the application of different variants of EDAs to the solution of the protein structure prediction problem in simplified models, and proposes their use as a simulation tool for the analysis of the protein folding process. We develop new ideas for the application of EDAs to the bidimensional and tridimensional (2-d and 3-d) simplified protein folding problems. This paper analyzes the rationale behind the application of EDAs to these problems, and elucidates the relationship between our proposal and other population-based approaches proposed for the protein folding problem. We argue that EDAs are an efficient alternative for many instances of the protein structure prediction problem and are indeed appropriate for a theoretical analysis of search procedures in lattice models. All the algorithms introduced are tested on a set of difficult 2-d and 3-d instances from lattice models. Some of the results obtained with EDAs are superior to the ones obtained with other well-known population-based optimization algorithms.
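To make the idea of a simplified lattice model concrete, here is a minimal sketch of a fitness function for the 2-d case. It assumes the hydrophobic-polar (HP) model, a widely used simplified lattice model that the abstract itself does not name, and an encoding of conformations as absolute moves (`U`, `D`, `L`, `R`); both are assumptions of this example, as are the function names.

```python
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(moves):
    """Map a string of absolute moves to lattice coordinates.
    Returns None if the chain visits the same lattice site twice."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        nxt = (coords[-1][0] + dx, coords[-1][1] + dy)
        if nxt in coords:
            return None
        coords.append(nxt)
    return coords

def hp_energy(sequence, moves):
    """Energy of a conformation: -1 for every pair of H residues that are
    lattice neighbours but not consecutive in the sequence.
    Expects len(moves) == len(sequence) - 1."""
    coords = fold(moves)
    if coords is None:
        return None  # invalid (self-intersecting) conformation
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, `hp_energy("HPPH", "RUL")` folds the chain into a unit square, bringing the two H residues into contact, and returns `-1`. An EDA or any other population-based optimizer would search the space of move strings for conformations that minimize this value.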
Biodata Mining | 2008
Rubén Armañanzas; Iñaki Inza; Roberto Santana; Yvan Saeys; Jose Luis Flores; José Antonio Lozano; Yves Van de Peer; Rosa Blanco; Víctor Robles; Concha Bielza; Pedro Larrañaga
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
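The loop described above (sample from a probabilistic model, select promising solutions, re-learn the model) can be sketched with the simplest kind of EDA, one that models each variable independently, in the style of UMDA. This is an illustrative sketch on the toy OneMax problem (maximize the number of ones), not an algorithm from the review; the parameter values and the clamping of the marginals to [0.05, 0.95] are assumptions.

```python
import random

def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=0):
    """Univariate EDA on OneMax; returns the best bit string sampled."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # model: one independent marginal per bit
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # 2. Select the promising solutions (truncation selection).
        pop.sort(key=sum, reverse=True)
        selected = pop[:n_select]
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
        # 3. Re-estimate the marginals from the selected solutions,
        #    clamped so that no bit value becomes impossible to sample.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
                 for i in range(n_bits)]
    return best
```

More complex EDA variants keep the same loop and replace the independent marginals with richer models, such as trees or Bayesian networks, which is the axis along which the taxonomy in the abstract is organized.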
IEEE Transactions on Evolutionary Computation | 2014
Josu Ceberio; Ekhine Irurozki; Alexander Mendiburu; José Antonio Lozano
The aim of this paper is two-fold. First, we introduce a novel general estimation of distribution algorithm to deal with permutation-based optimization problems. The algorithm is based on the use of a probabilistic model for permutations called the generalized Mallows model. In order to demonstrate the potential of the proposed algorithm, our second aim is to solve the permutation flowshop scheduling problem. A hybrid approach consisting of the new estimation of distribution algorithm and a variable neighborhood search is proposed. The experiments conducted demonstrate that the proposed algorithm is able to outperform the state-of-the-art approaches. Moreover, of the 220 benchmark instances tested, the proposed hybrid approach obtains new best known results in 152 cases. An in-depth study of the results suggests that the successful performance of the introduced approach is due to the ability of the generalized Mallows estimation of distribution algorithm to discover promising regions in the search space.
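As background for the probabilistic model mentioned above, the sketch below implements the plain (single-parameter) Mallows model under the Kendall tau distance, where the probability of a permutation decays exponentially with its distance to a central permutation. The generalized Mallows model used in the paper has one spread parameter per position rather than a single `theta`; that extension is not shown. Normalizing by brute-force enumeration, as done here, is an assumption that only works for very small permutations.

```python
import itertools
import math

def kendall_tau(sigma, center):
    """Number of item pairs that sigma and center rank in opposite order."""
    pos = {item: i for i, item in enumerate(center)}
    ranks = [pos[item] for item in sigma]
    return sum(1
               for i in range(len(ranks))
               for j in range(i + 1, len(ranks))
               if ranks[i] > ranks[j])

def mallows_distribution(center, theta):
    """Probability of every permutation: P(sigma) ~ exp(-theta * d(sigma, center)).
    The normalizing constant is computed by enumerating all permutations."""
    perms = list(itertools.permutations(center))
    weights = [math.exp(-theta * kendall_tau(p, center)) for p in perms]
    z = sum(weights)
    return {p: w / z for p, w in zip(perms, weights)}
```

With `theta > 0` the central permutation is the mode, and larger values of `theta` concentrate the probability mass around it; an EDA built on this model learns the central permutation and the spread from the selected solutions and then samples new permutations from the fitted distribution.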