Federico Divina | Researchain

Publication

Featured research published by Federico Divina.

IEEE Transactions on Knowledge and Data Engineering | 2006

Biclustering of expression data with evolutionary computation

Federico Divina; Jesús S. Aguilar-Ruiz

Microarray techniques are leading to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In this work, we address the biclustering of gene expression data with evolutionary computation. Our approach is based on evolutionary algorithms, which have been proven to perform well on complex problems, and searches for biclusters following a sequential covering strategy. The goal is to find biclusters of maximum size with mean squared residue lower than a given δ. In addition, we pay special attention to finding high-quality biclusters with large variation, i.e., with a relatively high row variance, and with a low level of overlap among biclusters. The quality of the biclusters found by our evolutionary approach is discussed, and the results are compared to those reported by Cheng and Church, and Yang et al. In general, our approach, named SEBI, shows excellent performance at finding patterns in gene expression data.
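The two quality criteria named in this abstract can be illustrated with a short sketch. It uses the standard Cheng and Church definition of the mean squared residue and the mean per-row variance; the function names, the toy matrix and the threshold value are assumptions made for illustration and are not taken from SEBI itself.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster selected by rows x cols."""
    n_r, n_c = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(row_mean.values()) / n_r
    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue ** 2
    return total / (n_r * n_c)


def mean_row_variance(matrix, rows, cols):
    """Average over the selected rows of the variance across selected columns."""
    n_c = len(cols)
    total = 0.0
    for i in rows:
        mean_i = sum(matrix[i][j] for j in cols) / n_c
        total += sum((matrix[i][j] - mean_i) ** 2 for j in cols) / n_c
    return total / len(rows)


def is_delta_bicluster(matrix, rows, cols, delta):
    """True if the residue is lower than delta, as required in the abstract."""
    return mean_squared_residue(matrix, rows, cols) < delta


# Toy expression matrix (hypothetical values): rows 0 and 1 follow the same
# trend shifted by a constant, row 2 does not.
data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 1.0, 9.0],
]
coherent = mean_squared_residue(data, [0, 1], [0, 1, 2])
noisy = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
variance = mean_row_variance(data, [0, 1], [0, 1, 2])
```

A search procedure would prefer the first bicluster: its residue is below any positive δ while its row variance is non-trivial, so it is not a flat, uninteresting pattern.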

Genetic and Evolutionary Computation Conference | 2007

A multi-objective approach to discover biclusters in microarray data

Federico Divina; Jesús S. Aguilar-Ruiz

The main motivation for using a multi-objective evolutionary algorithm to find biclusters in gene expression data is that, when looking for biclusters in a gene expression matrix, several objectives have to be optimized simultaneously, and these objectives are often in conflict with each other. Moreover, the use of evolutionary computation is justified by the huge dimensionality of the search space, since evolutionary algorithms are known to have great exploration power. We focus our attention on finding biclusters of high quality with large variation. This is because, in expression data analysis, the most important goal may not be finding biclusters containing many genes and conditions, as it might be more interesting to find a set of genes showing similar behavior under a set of conditions. Experimental results confirm the validity of the proposed technique.
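Conflicting objectives are usually compared through Pareto dominance. The sketch below shows that general idea only; the objective tuple (size, negated residue, row variance) and the numbers are hypothetical and are not claimed to be the paper's exact formulation.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical biclusters scored as (size, -mean squared residue, row variance).
# The residue is negated so that every objective is maximized.
scores = [
    (120, -250.0, 40.0),
    (80, -100.0, 65.0),
    (60, -300.0, 30.0),   # worse than the first on every objective
    (80, -100.0, 50.0),   # same as the second but with lower variance
]
front = pareto_front(scores)
```

The first two candidates remain on the front because neither is better on all objectives: one is larger, the other is more coherent and has higher variance.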

Bioinformatics | 2012

Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features

Jaume Bacardit; Paweł Widera; Alfonso E. Márquez-Chamorro; Federico Divina; Jesús S. Aguilar-Ruiz; Natalio Krasnogor

MOTIVATION: The prediction of a protein's contact map has become, in recent years, a crucial stepping stone for the prediction of the complete 3D structure of a protein. In this article, we describe a methodology for this problem that was shown to be successful in CASP8 and CASP9. The methodology is based on (i) the fusion of the prediction of a variety of structural aspects of protein residues, (ii) an ensemble strategy used to facilitate the training process and (iii) a rule-based machine learning system from which we can extract human-readable explanations of the predictor and derive useful information about the contact map representation. RESULTS: The main part of the evaluation is the comparison against the sequence-based contact prediction methods from CASP9, where our method achieved the best rank in five out of the six evaluated metrics. We also assess the impact of the size of the ensemble used in our predictor to show the trade-off between performance and training time of our method. Finally, we study the rule sets generated by our machine learning system. From this analysis, we are able to estimate the contribution of the attributes in our representation and how these interact to derive contact predictions. AVAILABILITY: http://icos.cs.nott.ac.uk/servers/psp.html. CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics | 2007

Virtual error: a new measure for evolutionary biclustering

Beatriz Pontes; Federico Divina; Raúl Giráldez; Jesús S. Aguilar-Ruiz

Many heuristics used for finding biclusters in microarray data use the mean squared residue as a way of evaluating the quality of biclusters. This has led to the discovery of interesting biclusters. Recently it has been proven that the mean squared residue may fail to identify some interesting biclusters. This motivates us to introduce a new measure, called Virtual Error, for assessing the quality of biclusters in microarray data. In order to test the validity of the proposed measure, we include it within an evolutionary algorithm. Experimental results show that the use of this novel measure is effective for finding interesting biclusters, which could not have been discovered with the use of the mean squared residue.

Genetic and Evolutionary Computation Conference | 2003

A method for handling numerical attributes in GA-based inductive concept learners

Federico Divina; Maarten Keijzer; Elena Marchiori

This paper proposes a method for dealing with numerical attributes in inductive concept learning systems based on genetic algorithms. The method uses constraints for restricting the range of values of the attributes and novel stochastic operators for modifying the constraints. These operators exploit information on the distribution of the values of an attribute. The method is embedded into a GA-based system for inductive logic programming. Results of experiments on various data sets indicate that the method provides an effective local discretization tool for GA-based inductive concept learners.

Computers in Biology and Medicine | 2012

An effective measure for assessing the quality of biclusters

Federico Divina; Beatriz Pontes; Raúl Giráldez; Jesús S. Aguilar-Ruiz

Biclustering is becoming a popular technique for the study of gene expression data. This is mainly due to the capability of biclustering to address the data using various dimensions simultaneously, as opposed to clustering, which can use only one dimension at a time. Different heuristics have been proposed in order to discover interesting biclusters in data. Such heuristics have one common characteristic: they are guided by a measure that determines the quality of biclusters. It follows that defining such a measure is probably the most important aspect. One of the most popular quality measures is the mean squared residue (MSR). However, it has been proven that MSR fails to identify some kinds of patterns. This motivates us to introduce a novel measure, called virtual error (VE), that overcomes this limitation. Results obtained by using VE confirm that it can identify interesting patterns that could not be found by MSR.
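The abstract does not say which patterns MSR misses. One well-known case, used here purely as an illustration, is a scaling pattern, where rows are multiples of a common profile rather than shifted copies of it. The sketch below uses the standard MSR definition and made-up values; it does not implement VE, whose definition is not given in this listing.

```python
def msr(block):
    """Mean squared residue of a complete block (list of equal-length rows)."""
    n_r, n_c = len(block), len(block[0])
    row_mean = [sum(row) / n_c for row in block]
    col_mean = [sum(block[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    all_mean = sum(row_mean) / n_r
    total = sum(
        (block[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in range(n_r)
        for j in range(n_c)
    )
    return total / (n_r * n_c)


base = [1.0, 2.0, 4.0]  # hypothetical expression profile

# Shifting pattern: every row is the base profile plus a constant.
shifted = [[v + s for v in base] for s in (0.0, 3.0, 5.0)]

# Scaling pattern: every row is the base profile times a constant.
scaled = [[v * k for v in base] for k in (1.0, 2.0, 4.0)]

msr_shifted = msr(shifted)  # zero up to floating-point rounding
msr_scaled = msr(scaled)    # strictly positive despite the perfect co-regulation
```

Both blocks contain perfectly correlated rows, yet only the shifted one scores as a perfect bicluster under MSR, which is the kind of limitation a different quality measure would need to address.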

Lecture Notes in Computer Science | 2005

Evolutionary biclustering of microarray data

Jesús S. Aguilar-Ruiz; Federico Divina

In this work, we address the biclustering of gene expression data with evolutionary computation, which has been proven to perform well on complex problems. In expression data analysis, the most important goal may not be finding the maximum bicluster, as it might be more interesting to find a set of genes showing similar behavior under a set of conditions. Our approach is based on evolutionary algorithms and searches for biclusters following a sequential covering strategy. In addition, we pay special attention to finding high-quality biclusters with large variation. The quality of the biclusters found by our approach is discussed by means of an analysis of yeast and colon cancer datasets.

International Journal of Intelligent Computing and Cybernetics | 2009

Improved biclustering on expression data through overlapping control

Beatriz Pontes; Federico Divina; Raúl Giráldez; Jesús S. Aguilar-Ruiz

The purpose of this paper is to present a novel control mechanism for avoiding overlapping among biclusters in expression data. Biclustering is a technique used in the analysis of microarray data. One of the most popular biclustering algorithms was introduced by Cheng and Church (2000) (Ch&Ch). Although this heuristic is successful at finding interesting biclusters, it has several drawbacks. The main shortcoming is that it introduces random values into the expression matrix to control overlapping. The overlapping control method presented in this paper is based on a matrix of weights that is used to estimate the overlap of a bicluster with those already found. In this way, the algorithm always works on real data, so the biclusters it discovers contain only original data. The paper shows that the original algorithm wrongly estimates the quality of the biclusters after some iterations, due to the random values that it introduces. The empirical results show that the proposed approach effectively improves the heuristic. It is also important to highlight that many interesting biclusters found using our approach would not have been obtained using the original algorithm. The original algorithm proposed by Ch&Ch is one of the most successful algorithms for discovering biclusters in microarray data. However, it has some limitations, the most relevant being the substitution phase adopted to avoid overlapping among biclusters. The modified version of the algorithm proposed in this paper improves on the original one, as shown in the experiments.
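The weight-matrix idea can be sketched as follows. The abstract does not give the weighting scheme, so this example assumes the simplest one: each cell's weight counts how many previously found biclusters contain it, and a candidate's overlap is the mean weight of its cells. All function names are hypothetical.

```python
def make_weights(n_rows, n_cols):
    """Weight matrix with one counter per cell of the expression matrix."""
    return [[0] * n_cols for _ in range(n_rows)]


def register_bicluster(weights, rows, cols):
    """Record a found bicluster by incrementing the weight of each of its cells."""
    for i in rows:
        for j in cols:
            weights[i][j] += 1


def overlap_estimate(weights, rows, cols):
    """Mean weight over the candidate's cells (0.0 means no overlap at all)."""
    total = sum(weights[i][j] for i in rows for j in cols)
    return total / (len(rows) * len(cols))


weights = make_weights(4, 4)
register_bicluster(weights, rows=[0, 1], cols=[0, 1])

# Candidate sharing a single cell, (1, 1), with the bicluster found earlier.
partial = overlap_estimate(weights, rows=[1, 2], cols=[1, 2])

# Candidate in an untouched region of the matrix.
disjoint = overlap_estimate(weights, rows=[2, 3], cols=[2, 3])
```

Because the expression matrix itself is never modified, any quality measure computed on a later candidate still uses only original data; the overlap estimate can be added as a penalty term alongside it.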

Genetic and Evolutionary Computation Conference | 2003

Non-universal suffrage selection operators favor population diversity in genetic algorithms

Federico Divina; Maarten Keijzer; Elena Marchiori

State-of-the-art concept learning systems based on genetic algorithms evolve a redundant population of individuals, where an individual is a partial solution that covers some instances of the learning set. In this context, it is fundamental that the population be diverse and that as many instances as possible be covered. The universal suffrage selection (US) operator is a powerful selection mechanism that addresses these two requirements. In this paper we compare experimentally the US operator with two variants, called Weighted US (WUS) and Exponentially Weighted US (EWUS), of this operator in the system ECL [1].
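A minimal sketch of the basic US idea follows, under assumptions: training instances act as voters drawn uniformly at random, and each voter picks, by fitness-proportional roulette, one of the individuals that cover it. The weighting used by WUS and EWUS is not defined in the abstract and is not shown. The data layout and function name are hypothetical, and uncovered voters are simply skipped here, whereas a real system may handle them differently.

```python
import random


def universal_suffrage_select(population, instances, n_select, rng):
    """Return indices of selected individuals.

    population: list of (fitness, covered_instances) pairs, fitness > 0.
    instances:  list of training instance identifiers (the voters).
    """
    selected = []
    for _ in range(n_select):
        voter = rng.choice(instances)  # every instance has the same right to vote
        candidates = [k for k, (fit, covered) in enumerate(population)
                      if voter in covered]
        if not candidates:
            continue  # assumption: an uncovered voter casts no vote in this sketch
        fitnesses = [population[k][0] for k in candidates]
        winner = rng.choices(candidates, weights=fitnesses, k=1)[0]
        selected.append(winner)
    return selected


# Hypothetical population: three partial solutions covering different instances.
population = [
    (1.0, {0, 1}),
    (2.0, {1, 2}),
    (0.5, {3}),
]
rng = random.Random(0)
picks = universal_suffrage_select(population, [0, 1, 2, 3], 20, rng)
```

Since individuals only compete with others that cover the same instance, a weak rule that is the sole cover of some instance, like the third one above, still gets selected, which is how the operator promotes both diversity and coverage.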

Genetic and Evolutionary Computation Conference | 2004

Experimental Evaluation of Discretization Schemes for Rule Induction

Jesús S. Aguilar-Ruiz; Jaume Bacardit; Federico Divina

This paper proposes an experimental evaluation of various discretization schemes in three different evolutionary systems for inductive concept learning. The various discretization methods are used in order to obtain a number of discretization intervals, which represent the basis for the methods adopted by the systems for dealing with numerical values. Basically, for each rule and attribute, one or many intervals are evolved, by means of ad-hoc operators. These operators, depending on the system, can add/subtract intervals found by a discretization method to/from the intervals described by the rule, or split/merge these intervals. In this way the discretization intervals are evolved along with the rules. The aim of this experimental evaluation is to determine for an evolutionary-based system the discretization method that allows the system to obtain the best results. Moreover we want to verify if there is a discretization scheme that can be considered as generally good for evolutionary-based systems. If such a discretization method exists, it could be adopted by all the systems for inductive concept learning using a similar strategy for dealing with numerical values. Otherwise, it would be interesting to extract relationships between the performance of a system and the discretizer used.
