Cristina Rubio-Escudero
University of Seville
Publication
Featured research published by Cristina Rubio-Escudero.
Expert Systems With Applications | 2010
Antonio Morales-Esteban; Francisco Martínez-Álvarez; Alicia Troncoso; J.L. Justo; Cristina Rubio-Escudero
Earthquakes arrive without warning and can destroy a whole city in a few seconds, causing numerous deaths and economic losses. Nowadays, a great effort is being made to develop techniques that forecast these unpredictable natural disasters in order to take precautionary measures. In this paper, clustering techniques are used to obtain patterns which model the behavior of seismic temporal data and can help to predict medium-large earthquakes. First, earthquakes are classified into different groups and the optimal number of groups, a priori unknown, is determined. Then, patterns are discovered when medium-large earthquakes happen. Results from the Spanish seismic temporal data provided by the Spanish Geographical Institute and non-parametric statistical tests are presented and discussed, showing a remarkable performance and the significance of the obtained results.
Knowledge Based Systems | 2013
Francisco Martínez-Álvarez; Jorge Reyes; Antonio Morales-Esteban; Cristina Rubio-Escudero
This work explores the use of different seismicity indicators as inputs for artificial neural networks. It proposes combining multiple indicators that have already been used successfully in different seismic zones, by applying feature selection techniques. These techniques evaluate every input and propose the best combination of them in terms of information gain. Once these sets have been obtained, artificial neural networks are applied to four zones of Chile (the most seismic country in the world) and to two zones of the Iberian Peninsula (a moderate seismicity area). To make the comparison to other models possible, the prediction problem has been turned into one of classification, thus allowing the application of other machine learning classifiers. Comparisons with original sets of inputs and different classifiers are reported to support the degree of success achieved. Statistical tests have also been applied to confirm that the results are significantly different from those of other classifiers. The main novelty of this work stems from the use of feature selection techniques for improving earthquake prediction methods. To this end, the information gain of different seismic indicators has been determined. Seismic indicators with a low-ranked or null contribution have been removed, optimizing the method. The optimized prediction method proposed has a high performance. Finally, four Chilean zones and two zones of the Iberian Peninsula have been characterized by means of an information gain analysis obtained from different seismic indicators. The results confirm the methodology proposed, as the best features in terms of information gain are the same for both regions.
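The ranking of inputs by information gain described in this abstract can be illustrated with a minimal sketch. It assumes the indicators have already been discretized and that the target is a binary class label; the indicator names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 means an event above the threshold followed, 0 means it did not.
labels = [1, 1, 0, 0, 1, 0, 1, 0]
indicators = {
    "informative": ["high", "high", "low", "low", "high", "low", "high", "low"],
    "uninformative": ["a", "b", "a", "b", "b", "a", "a", "b"],
}

# Rank indicators from most to least informative; null-gain inputs are candidates for removal.
ranking = sorted(
    indicators,
    key=lambda name: information_gain(indicators[name], labels),
    reverse=True,
)
```

In this toy case the first indicator separates the classes perfectly and the second carries no information about them, so the second is the one a selection step of this kind would discard.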
IEEE Transactions on Evolutionary Computation | 2008
Rocío Romero-Zaliz; Cristina Rubio-Escudero; J. P. Cobb; Francisco Herrera; Oscar Cordón; Igor Zwir
Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proven appropriate for uncovering relationships between features that characterize objects in structural data. However, typical conceptual clustering approaches normally recover the most obvious relations, but fail to discover the less frequent but more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.
Bioinformatics | 2007
Rafael Navajas-Pérez; Cristina Rubio-Escudero; José Luis Aznarte; Manuel Ruiz Rejón; Manuel A. Garrido-Ramos
satDNA Analyzer is a program, implemented in C++, for the analysis of the patterns of variation at each nucleotide position considered independently amongst all units of a given satellite-DNA family when comparing it between a pair of species. The program classifies each site as monomorphic or polymorphic, discriminates shared from non-shared polymorphisms and classifies each non-shared polymorphism according to the model proposed by Strachan et al. in six different stages of transition during the spread of a variant repeat unit toward its fixation. Furthermore, this program implements several other utilities for the analysis of satellite-DNA evolution, such as the design of the average consensus sequences, the average base pair contents, the distribution of variant sites, the transition to transversion ratio and different estimates of intra-specific variation and inter-specific variation. Aprioristic hypotheses on factors influencing the molecular drive process and the rates and biases of concerted evolution can be tested with this program. Additionally, satDNA Analyzer generates an output file containing a sequence alignment without shared polymorphisms to be used for further evolutionary analysis with different phylogenetic software. Availability: satDNA Analyzer is freely available at http://satdna.sourceforge.net/. satDNA Analyzer has been designed to operate on Windows, Linux and Mac OS X.
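Two of the simpler quantities mentioned in this abstract, the monomorphic or polymorphic status of each site and the counts behind a transition to transversion ratio, can be sketched as follows. This is an illustrative Python sketch and not the program's C++ code; the function names and the toy repeat units are assumptions, and the staging of non-shared polymorphisms after Strachan et al. is deliberately left out.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_sites(sequences):
    """Label each column of equal-length aligned sequences as monomorphic or polymorphic."""
    return [
        "monomorphic" if len(set(column)) == 1 else "polymorphic"
        for column in zip(*sequences)
    ]


def transition_transversion_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Positions that match, or that hold a gap or an ambiguous base, are skipped.
    """
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


# Hypothetical aligned repeat units from one species.
units = ["ACGTAC", "ACGTGC", "ACTTAC"]
site_labels = classify_sites(units)
ts, tv = transition_transversion_counts("ACGTAC", "GCTTAC")
```

The ratio itself is `ts / tv` when `tv` is non-zero; how the program aggregates these counts across units and species is not stated in the abstract.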
Neurocomputing | 2014
David Gutiérrez-Avilés; Cristina Rubio-Escudero; Francisco Martínez-Álvarez; José C. Riquelme
Analyzing microarray data represents a computational challenge due to the characteristics of these data. Clustering techniques are widely applied to create groups of genes that exhibit a similar behavior under the conditions tested. Biclustering emerges as an improvement of classical clustering since it relaxes the constraints for grouping genes to be evaluated only under a subset of the conditions and not under all of them. However, this technique is not appropriate for the analysis of longitudinal experiments in which the genes are evaluated under certain conditions at several time points. We present the TriGen algorithm, a genetic algorithm that finds triclusters of gene expression that take into account the experimental conditions and the time points simultaneously. We have used TriGen to mine datasets related to synthetic data, yeast (Saccharomyces cerevisiae) cell cycle and human inflammation and host response to injury experiments. TriGen has proved to be capable of extracting groups of genes with similar patterns in subsets of conditions and times, and these groups have been shown to be related in terms of their functional annotations extracted from the Gene Ontology.
Entropy | 2015
Francisco Martínez-Álvarez; David Gutiérrez-Avilés; Antonio Morales-Esteban; Jorge Reyes; José L. Amaro-Mellado; Cristina Rubio-Escudero
A prior definition of seismogenic zones is required to do a probabilistic seismic hazard analysis for areas of spread and low seismic activity. Traditional zoning methods are based on the available seismic catalog and the geological structures. It is accepted that thermal and resistant parameters of the crust provide better criteria for zoning. Nonetheless, working out the rheological profiles introduces great uncertainty. This has generated inconsistencies, as different zones have been proposed for the same area. A new method for seismogenic zoning by means of triclustering is proposed in this research. The main advantage is that it is solely based on seismic data. Almost no human decision is made, and therefore, the method is nearly non-biased. To assess its performance, the method has been applied to the Iberian Peninsula, which is characterized by the occurrence of small to moderate magnitude earthquakes. The catalog of the National Geographic Institute of Spain has been used. The output map is checked for validity with the geology. Moreover, a geographic information system has been used for two purposes. First, the obtained zones have been depicted within it. Second, the data have been used to calculate the seismic parameters (b-value, annual rate). Finally, the results have been compared to Kohonen’s self-organizing maps.
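The abstract does not say how the seismic parameters were estimated. As a hedged illustration only, the sketch below uses the widely known Aki maximum-likelihood estimator of the Gutenberg-Richter b-value, with the usual half-bin correction for binned magnitudes, and a plain events-per-year rate. The function names, the completeness magnitude and the toy catalog are assumptions.

```python
from math import e, log10


def b_value_aki(magnitudes, completeness_magnitude, bin_width=0.0):
    """Aki maximum-likelihood b-value for events at or above the completeness magnitude.

    bin_width applies the common half-bin correction for catalogs with rounded magnitudes.
    """
    events = [m for m in magnitudes if m >= completeness_magnitude]
    if not events:
        raise ValueError("no events at or above the completeness magnitude")
    mean_magnitude = sum(events) / len(events)
    denominator = mean_magnitude - (completeness_magnitude - bin_width / 2)
    if denominator <= 0:
        raise ValueError("mean magnitude must exceed the corrected completeness magnitude")
    return log10(e) / denominator


def annual_rate(magnitudes, completeness_magnitude, years):
    """Average number of events per year at or above the completeness magnitude."""
    return sum(1 for m in magnitudes if m >= completeness_magnitude) / years


# Hypothetical catalog for one zone, covering four years.
catalog = [3.0, 3.2, 3.5, 3.1, 4.0, 3.3, 3.6, 3.8, 2.5]
b = b_value_aki(catalog, completeness_magnitude=3.0)
rate = annual_rate(catalog, completeness_magnitude=3.0, years=4)
```

With real catalogs the choice of completeness magnitude dominates the result, so it would need to be estimated per zone rather than fixed as it is here.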
The Scientific World Journal | 2014
David Gutiérrez-Avilés; Cristina Rubio-Escudero
Microarrays have revolutionized biotechnological research. The analysis of new data generated represents a computational challenge due to the characteristics of these data. Clustering techniques are applied to create groups of genes that exhibit a similar behavior. Biclustering emerges as a valuable tool for microarray data analysis since it relaxes the constraints for grouping, allowing genes to be evaluated only under a subset of the conditions. However, if a third dimension appears in the data, triclustering is the appropriate tool for the analysis. This occurs in longitudinal experiments in which the genes are evaluated under conditions at several time points. All clustering, biclustering, and triclustering techniques guide their search for solutions by a measure that evaluates the quality of clusters. We present an evaluation measure for triclusters called Mean Square Residue 3D. This measure is based on the classic biclustering measure Mean Square Residue. Mean Square Residue 3D has been applied to both synthetic and real data and it has proved to be capable of extracting groups of genes with homogeneous patterns in subsets of conditions and times, and these groups have shown a high correlation level and they are also related to their functional annotations extracted from the Gene Ontology project.
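For background, the classic two-dimensional Mean Square Residue that the abstract names as the basis of the measure can be sketched as below. This is the biclustering measure only: the three-dimensional extension is not defined in the abstract, so it is not reproduced here. The function name and the toy matrices are assumptions.

```python
def mean_square_residue(matrix):
    """Classic Mean Square Residue of a bicluster given as a list of equal-length rows.

    The residue of a cell is value - row mean - column mean + overall mean,
    and the score is the mean of the squared residues (0 for a purely additive pattern).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(column) / n_rows for column in zip(*matrix)]
    overall_mean = sum(row_means) / n_rows
    squared_residues = [
        (matrix[i][j] - row_means[i] - col_means[j] + overall_mean) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    ]
    return sum(squared_residues) / (n_rows * n_cols)


# Rows differ only by a constant shift, so the pattern is coherent.
coherent = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
# Rows move in opposite directions, so the pattern is incoherent.
incoherent = [[1, 0], [0, 1]]

coherent_score = mean_square_residue(coherent)
incoherent_score = mean_square_residue(incoherent)
```

Lower scores indicate more homogeneous behavior, which is why a search guided by this kind of measure favors groups of genes whose profiles shift together.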
International Conference on Hybrid Intelligent Systems | 2008
Cristina Rubio-Escudero; Francisco Martínez-Álvarez; Rocío Romero-Zaliz; Igor Zwir
Biomedical research has been revolutionized by high-throughput techniques and the enormous amount of biological data they are able to generate. In particular, microarray technology has the capacity to monitor changes in RNA abundance for thousands of genes simultaneously. Interest in microarray analysis methods has risen rapidly. Clustering is widely used in the analysis of microarray data to group genes of interest targeted from microarray experiments on the basis of similarity of expression patterns. In this work we apply two clustering algorithms, K-means and expectation maximization, to a particular problem and we compare the groupings obtained on the basis of the cohesiveness of the gene products associated with the genes in each cluster.
Evolutionary Bioinformatics | 2015
David Gutiérrez-Avilés; Cristina Rubio-Escudero
Microarray technology is widely used in biological research environments due to its ability to monitor RNA concentration levels. The analysis of the data generated represents a computational challenge due to the characteristics of these data. Clustering techniques are widely applied to create groups of genes that exhibit a similar behavior. Biclustering relaxes the constraints for grouping, allowing genes to be evaluated only under a subset of the conditions. Triclustering appears for the analysis of longitudinal experiments in which the genes are evaluated under certain conditions at several time points. These triclusters provide hidden information in the form of behavior patterns from temporal experiments with microarrays relating subsets of genes, experimental conditions, and time points. We present an evaluation measure for triclusters called Multi Slope Measure, based on the similarity among the angles of the slopes formed by each profile formed by the genes, conditions, and times of the tricluster.
Bioinformatics and Biomedicine | 2014
David Gutiérrez-Avilés; Cristina Rubio-Escudero
Microarray technology has led to a great advance in biological studies due to its ability to monitor the RNA levels of a vast number of genes under certain experimental conditions. The use of computational techniques to mine hidden knowledge from these data is of great interest in research fields such as Data Mining and Bioinformatics. Finding patterns of genetic behavior that take into account not only the experimental conditions but also the time condition is a very challenging task nowadays. Clustering, biclustering and novel triclustering techniques offer a very suitable framework to solve the suggested problem. In this work we present LSL, a measure to evaluate the quality of triclusters found in 3D data.