Sandra Gómez Canaval

Publication

Featured research published by Sandra Gómez Canaval.

Language and Automata Theory and Applications | 2014

Networks of Polarized Evolutionary Processors Are Computationally Complete

Fernando Arroyo; Sandra Gómez Canaval; Victor Mitrana; Ştefan Popescu

In this paper, we consider the computational power of a new variant of networks of evolutionary processors which seems to be more suitable for a software and hardware implementation. Each processor as well as the data navigating throughout the network are now considered to be polarized. While the polarization of every processor is predefined, the data polarization is dynamically computed by means of a valuation mapping. Consequently, the protocol of communication is naturally defined by means of this polarization. We show that tag systems can be simulated by these networks with a constant number of nodes, while Turing machines can be simulated, in a time-efficient way, by these networks with a number of nodes depending linearly on the tape alphabet of the Turing machine.
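The polarization-driven communication described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' construction: it assumes the valuation of a word is the sum of integer values assigned to its symbols and that a word's polarity is the sign of that sum. The symbol values, the node names and the helper `route` are hypothetical, and the network topology is ignored.

```python
from typing import Dict, List

# Hypothetical valuation of symbols (assumption: integer values per symbol).
VALUATION: Dict[str, int] = {"a": 1, "b": -1, "c": 0}


def valuation(word: str, phi: Dict[str, int] = VALUATION) -> int:
    """Extend the symbol valuation to a word by summing the symbol values."""
    return sum(phi[symbol] for symbol in word)


def polarity(word: str, phi: Dict[str, int] = VALUATION) -> int:
    """Return the sign of the word's valuation: -1, 0 or +1."""
    value = valuation(word, phi)
    return (value > 0) - (value < 0)


def route(words: List[str], node_polarity: Dict[str, int],
          phi: Dict[str, int] = VALUATION) -> Dict[str, List[str]]:
    """Deliver each word to every node whose fixed polarity matches the word's.

    Simplification: every node is treated as reachable from every other node.
    """
    inbox: Dict[str, List[str]] = {node: [] for node in node_polarity}
    for word in words:
        word_polarity = polarity(word, phi)
        for node, fixed_polarity in node_polarity.items():
            if fixed_polarity == word_polarity:
                inbox[node].append(word)
    return inbox


if __name__ == "__main__":
    nodes = {"n_plus": 1, "n_zero": 0, "n_minus": -1}
    print(route(["aab", "ab", "bbc"], nodes))
```

The node polarities are fixed in advance, while each word's polarity is recomputed from its current content, so rewriting a word can change where it is delivered next.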

Information & Computation | 2017

On the computational power of networks of polarized evolutionary processors

Fernando Arroyo; Sandra Gómez Canaval; Victor Mitrana; Stefan Popescu

We consider a new variant of networks of evolutionary processors which seems more suitable for a software and hardware implementation. Each processor as well as the data navigating throughout the network are now considered to be polarized. While the polarization of every processor is predefined, the data polarization is dynamically computed. Consequently, the protocol of communication is naturally defined by this polarization. We show that tag systems can be simulated by these networks with a constant number of nodes, while Turing machines can be efficiently simulated by these networks with a number of nodes depending linearly on the tape alphabet of the Turing machine. We also propose a simulation of Turing machines by networks with a constant number of nodes, which is reflected in an increase of the computation time. Finally, we show that every network can be simulated by a Turing machine and discuss the time complexity of this simulation.
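For readers unfamiliar with tag systems, the model that these networks are shown to simulate, here is a minimal sequential interpreter. It is a sketch of the standard m-tag system definition, not of the network simulation in the paper. The function name `run_tag_system`, the `max_steps` guard and the convention that a symbol without a production halts the run are assumptions made for illustration.

```python
from typing import Dict, List


def run_tag_system(word: str, productions: Dict[str, str],
                   m: int = 2, max_steps: int = 1000) -> List[str]:
    """Run an m-tag system and return every word visited, in order.

    Each step reads the first symbol, appends its production to the end of
    the word and deletes the first m symbols. The run stops when the word is
    shorter than m, when the first symbol has no production (treated here as
    a halting symbol), or after max_steps steps.
    """
    history = [word]
    for _ in range(max_steps):
        if len(word) < m or word[0] not in productions:
            break
        word = word[m:] + productions[word[0]]
        history.append(word)
    return history


if __name__ == "__main__":
    # A small 2-tag system in which "H" acts as the halting symbol.
    rules = {"a": "ccbaH", "b": "cca", "c": "cc"}
    for step in run_tag_system("baa", rules):
        print(step)
```

Starting from `baa`, this run visits `acca`, `caccbaH`, `ccbaHcc`, `baHcccc` and stops at `Hcccccca`.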

International Conference on Data Mining | 2016

A Fast Iterative Algorithm for Improved Unsupervised Feature Selection

Bruno Ordozgoiti; Sandra Gómez Canaval; Alberto Mozo

Dimensionality reduction is often a crucial step for the successful application of machine learning and data mining methods. One way to achieve said reduction is feature selection. Due to the impossibility of labelling many data sets, unsupervised approaches are frequently the only option. The column subset selection problem translates naturally to this purpose, and has received considerable attention over the last few years, as it provides simple linear models for data reconstruction. Existing methods, however, often achieve approximation errors that are far from the optimum. In this paper we present a novel algorithm for column subset selection that consistently outperforms state-of-the-art methods in approximation error. We present a series of key derivations that allow an efficient implementation, making it comparable in speed and in some cases faster than other algorithms. We also prove results that make it possible to deal with huge matrices, which has strong implications for other algorithms of this type in the big data field. We validate our claims through experiments on a wide variety of well-known data sets.
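For reference, the column subset selection problem is usually stated as below. The abstract does not fix any notation, so the symbols \(A\), \(C\), \(S\) and \(k\), and the choice of the Frobenius norm, are assumptions following the common formulation in the literature:

\[
\min_{S \subseteq \{1,\dots,n\},\ |S| = k} \; \lVert A - C C^{+} A \rVert_F, \qquad C = A_{:,S},
\]

where \(A \in \mathbb{R}^{m \times n}\) is the data matrix with one feature per column, \(C\) collects the \(k\) selected columns, and \(C^{+}\) is the Moore-Penrose pseudoinverse of \(C\). The product \(C C^{+} A\) is the projection of \(A\) onto the span of the selected columns, so the objective measures how well the chosen features reconstruct the full data set through a linear model.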

Advances in Databases and Information Systems | 2015

NPEPE: Massive Natural Computing Engine for Optimally Solving NP-complete Problems in Big Data Scenarios

Sandra Gómez Canaval; Bruno Ordozgoiti Rubio; Alberto Mozo

Networks of Evolutionary Processors (NEP) is a bio-inspired computational model defining theoretical computing devices able to solve NP-complete problems in an efficient manner. Networks of Polarized Evolutionary Processors (NPEP) is an evolution of the NEP model that presents a simpler and more natural filtering strategy to simulate the communication between cells. Up to now, it has not been possible to implement these models either in vivo or in vitro. Therefore, the only way to analyze and execute NPEP devices is by means of ultra-scalable simulators able to encapsulate the inherent parallelism in their computations. At present, there is a lack of such simulators able to handle the size of non-trivial problems in a massively distributed computing environment. We propose NPEPE, a highly scalable engine that runs NPEP descriptions using Apache Giraph on top of Hadoop platforms. Giraph is the open source counterpart of Google Pregel, an iterative graph processing system built for high scalability. NPEPE takes advantage of the inherent parallelism and scalability of Giraph and Hadoop to deploy and run massive networks of NPEPs. We show several experiments to demonstrate that NPEP descriptions can be easily deployed and run using an NPEPE engine on a Giraph+Hadoop platform. To this end, the well-known NP-complete 3-colorability problem is described as a network of NPEPs and run on a 10-node cluster.
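To make the benchmark problem concrete, the sketch below checks 3-colorability by plain sequential enumeration of all colour assignments. It is only a reference definition of the problem and says nothing about how NPEPE encodes it as a network. The function name `three_colouring` and the edge-list input format are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple


def three_colouring(n_vertices: int,
                    edges: List[Tuple[int, int]]) -> Optional[Tuple[int, ...]]:
    """Return a valid 3-colouring as a tuple of colours, or None if none exists.

    Vertices are numbered 0..n_vertices-1 and colours are 0, 1, 2.
    All 3**n_vertices assignments are tried, so this is exponential time.
    """
    for colouring in product(range(3), repeat=n_vertices):
        # A colouring is valid when no edge joins two equally coloured vertices.
        if all(colouring[u] != colouring[v] for u, v in edges):
            return colouring
    return None


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    complete_4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
    print(three_colouring(3, triangle))    # a triangle is 3-colourable
    print(three_colouring(4, complete_4))  # the complete graph on 4 vertices is not
```

The exponential number of candidate assignments is what makes a massively parallel engine attractive for instances of non-trivial size.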

Advances in Databases and Information Systems | 2015

Massively Parallel Unsupervised Feature Selection on Spark

Bruno Ordozgoiti; Sandra Gómez Canaval; Alberto Mozo

High-dimensional data sets pose important challenges such as the curse of dimensionality and increased computational costs. Dimensionality reduction is therefore a crucial step for most data mining applications. Feature selection techniques allow us to achieve said reduction. However, it is now common to deal with huge data sets, and most existing feature selection algorithms are designed to function in a centralized fashion, which makes them non-scalable. Moreover, some of them require the selection process to be validated according to some target, which constrains their applicability to the supervised learning setting. In this paper we propose a parallel, scalable, exact implementation of an existing centralized, unsupervised feature selection algorithm on Spark, an efficient big data framework for large-scale distributed computation that outperforms MapReduce when applied to multi-pass algorithms. We validate the efficiency of the implementation using 1GB of real Internet traffic captured at a medium-sized ISP.

International Work-Conference on Artificial and Natural Neural Networks | 2015

Distributed Simulation of NEPs Based On-Demand Cloud Elastic Computation

Sandra Gómez Canaval; Alfonso Ortega de la Puente; Pablo Orgaz González

Networks of Evolutionary Processors (NEP) are a bio-inspired computational model able to solve NP-complete problems in an efficient manner. Up to now, the only way to analyze and execute these devices is through hardware and software simulators able to encapsulate the inherent parallelism and the efficiency in their computations. At present, simulators for these models cover only software applications developed under sequential/parallel architectures over multicore desktop computers or clusters of computers. Most of them are not able to handle the size of non-trivial problems within a massively parallel environment. We consider that cloud computation offers an interesting and promising option to overcome the drawbacks of these solutions. In this paper, we propose a novel parallel distributed architecture to simulate NEPs using on-demand cloud elastic computation. A flexible and extensible simulator is developed in order to demonstrate the suitability and scalability of our architecture with several variants of NEP.

Lecture Notes in Computer Science: Unconventional Computation and Natural Computation: 12th International Conference, UCNC 2013, Milan, Italy, July 1-5, 2013. Proceedings, ISBN 978-3-642-39073-9, 2013, Vol. 7956 | 2013

Simulating Metabolic Processes Using an Architecture Based on Networks of Bio-inspired Processors

Sandra Gómez Canaval; José Ramón Sánchez; Fernando Arroyo

In this work, we propose the Networks of Evolutionary Processors (NEP) [2] as a computational model to solve problems related to biological phenomena. As a first approximation, we simulate biological processes related to cellular signaling and their implications for metabolism, by using an architecture based on NEP (NEP architecture) and its specializations: Networks of Polarized Evolutionary Processors (NPEP) [1] and NEP Transducers (NEPT) [3]. In particular, we use this architecture to simulate the interplay between cellular processes related to metabolism, such as the Krebs cycle and the malate-aspartate shuttle pathway (MAS), both of which are altered by calcium signaling.

Knowledge and Information Systems | 2018

Iterative column subset selection

Bruno Ordozgoiti; Sandra Gómez Canaval; Alberto Mozo

Dimensionality reduction is often a crucial step for the successful application of machine learning and data mining methods. One way to achieve said reduction is feature selection. Due to the impossibility of labelling many data sets, unsupervised approaches are frequently the only option. The column subset selection problem translates naturally to this purpose and has received considerable attention over the last few years, as it provides simple linear models for low-rank data reconstruction. Recently, it was empirically shown that an iterative algorithm, which can be implemented efficiently, provides better subsets than other state-of-the-art methods. In this paper, we describe this algorithm and provide a more in-depth analysis. We carry out numerous experiments to gain insights on its behaviour and derive a simple bound for the norm recovered by the resulting matrix. To the best of our knowledge, this is the first theoretical result of this kind for this algorithm.

International Work-Conference on Artificial and Natural Neural Networks | 2015

How to Search Optimal Solutions in Big Spaces with Networks of Bio-Inspired Processors

José Ramón Sánchez Couso; Sandra Gómez Canaval; David Batard Lorenzo

Finding new efficient and exact heuristic optimization algorithms for big search spaces remains an open problem. The search space increases exponentially with the problem size, making it impossible to find a solution through a mere blind search. Several heuristic approaches inspired by nature have been adopted as suitable algorithms to solve complex optimization problems in many different areas. Networks of Bio-inspired Processors (NBP) is a formal framework formed of highly parallel and distributed computing models inspired by and abstracted from biological evolution. From a theoretical point of view, NBP has been broadly proved to solve NP-complete problems efficiently. The aim of this paper is to explore the expressive power of NBP to solve hard optimization problems with a big search space, using massively parallel architectures. We use the basic concepts and principles of some metaheuristic approaches to propose an extension of the NBP model, which is able to solve actual problems in the optimization field from a practical point of view.

International Work-Conference on Artificial and Natural Neural Networks | 2017

Probabilistic Leverage Scores for Parallelized Unsupervised Feature Selection

Bruno Ordozgoiti; Sandra Gómez Canaval; Alberto Mozo

Dimensionality reduction is often crucial for the application of machine learning and data mining. Feature selection methods can be employed for this purpose, with the advantage of preserving interpretability. There exist unsupervised feature selection methods based on matrix factorization algorithms, which can help choose the most informative features in terms of approximation error. Randomized methods have been proposed recently to provide better theoretical guarantees and better approximation errors than their deterministic counterparts, but their computational costs can be significant when dealing with big, high-dimensional data sets. Some existing randomized and deterministic approaches require the computation of the singular value decomposition in \(O(mn\min(m,n))\) time (for \(m\) samples and \(n\) features) for providing leverage scores. This compromises their applicability to domains of even moderately high dimensionality. In this paper we propose the use of Probabilistic PCA to compute the leverage scores in \(O(mnk)\) time, enabling the applicability of some of these randomized methods to large, high-dimensional data sets. We show that using this approach, we can rapidly provide an approximation of the leverage scores that works well in this context. In addition, we offer a parallelized version over the emerging Resilient Distributed Datasets paradigm (RDD) on Apache Spark, making it horizontally scalable for enormous numbers of data instances. We validate the performance of our approach on different data sets composed of real-world and synthetic data.
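As background, the leverage scores mentioned above are commonly defined from the truncated singular value decomposition. The notation below is an assumption, since the abstract does not state it, and the paper's Probabilistic PCA approximation is not reproduced here. With \(A \in \mathbb{R}^{m \times n}\) and target rank \(k\),

\[
A \approx U_k \Sigma_k V_k^{\top}, \qquad \ell_j = \bigl\lVert (V_k)_{j,:} \bigr\rVert_2^2, \qquad \sum_{j=1}^{n} \ell_j = k,
\]

where \(V_k \in \mathbb{R}^{n \times k}\) holds the top \(k\) right singular vectors and \(\ell_j\) is the leverage score of feature \(j\). Because the columns of \(V_k\) are orthonormal, the scores sum to \(k\), so \(p_j = \ell_j / k\) is a probability distribution over features that randomized selection methods can sample from.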
