Ginés D. Guerrero
University of Murcia
Publications
Featured research published by Ginés D. Guerrero.
Briefings in Bioinformatics | 2010
José M. Cecilia; José M. García; Ginés D. Guerrero; Miguel A. Martínez-del-Amor; Ignacio Pérez-Hurtado; Mario J. Pérez-Jiménez
P systems or membrane systems provide a high-level computational modeling framework that combines the structural and dynamic aspects of biological systems in a relevant and understandable way. P systems are massively parallel, distributed, and non-deterministic systems. In this paper, we describe the implementation of a simulator for the class of recognizer P systems with active membranes using the GPU (Graphics Processing Unit). We compare the high-performance parallel simulator for the GPU with a simulator developed for a single CPU (Central Processing Unit), and we show that the GPU is better suited than the CPU to simulate P systems due to its highly parallel nature.
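To make the GPU mapping concrete, here is a minimal, hypothetical CUDA sketch of the kind of data-parallel rule application such a simulator performs: one thread block per membrane and one thread per object type. The rule encoding, names, and data layout are illustrative assumptions, not the simulator's actual code.

```cuda
#include <cuda_runtime.h>

struct Rule {       // "[a -> b^n]"-style evolution rule, one per object type
    int produced;   // object type appearing on the right-hand side
    int copies;     // multiplicity produced per consumed object
};

__global__ void applyEvolutionRules(const Rule *rules,
                                    const int *multIn,  // [membranes x objects]
                                    int *multOut,       // same layout, zeroed
                                    int numObjects)
{
    int membrane = blockIdx.x;      // one thread block per membrane
    int obj = threadIdx.x;          // one thread per object type
    if (obj >= numObjects) return;

    int count = multIn[membrane * numObjects + obj];
    if (count == 0) return;         // nothing to rewrite in this membrane

    Rule r = rules[obj];            // rule selected for this object type
    // Different objects may produce the same type, so accumulate atomically.
    atomicAdd(&multOut[membrane * numObjects + r.produced], count * r.copies);
}
```

Because every membrane evolves independently in a step, this mapping keeps all multiprocessors busy, which is the property the paper exploits.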
Parallel, Distributed and Network-Based Processing | 2012
Moisés Hernández; Ginés D. Guerrero; José M. Cecilia; José M. García; Alberto Inuggi; Stamatios N. Sotiropoulos
Diffusion Weighted Magnetic Resonance Imaging (DW-MRI) and tractography approaches are the only tools that can be used to estimate structural connections between different brain areas non-invasively and in vivo. A common first step in these techniques is the estimation of the underlying fibre orientations and their uncertainty in each voxel of the image. A popular method for this is implemented in the FSL software, provided by the FMRIB Centre at the University of Oxford, and is based on a Bayesian inference framework. Despite its popularity, the approach has high computational demands, normally taking more than 24 hours to analyze a single subject. In this paper, we present a GPU-optimized version of the FSL tool that estimates fibre orientations. We report a speed-up factor of up to 85x for the GPU version over its sequential CPU-based counterpart.
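The voxel-level parallelism described above can be sketched as one Metropolis-Hastings chain per CUDA thread. The single-parameter model and logLikelihood below are hypothetical stand-ins for the real multi-parameter ball-and-stick model in FSL's tool; only the thread-per-voxel mapping reflects the described approach.

```cuda
#include <curand_kernel.h>

// Placeholder likelihood: the real model predicts the diffusion signal from
// fibre orientations, volume fractions and diffusivities in each voxel.
__device__ float logLikelihood(float theta, const float *signal, int n)
{
    float ll = 0.0f;
    for (int i = 0; i < n; ++i) {
        float d = signal[i] - cosf(theta);
        ll -= 0.5f * d * d;
    }
    return ll;
}

__global__ void mcmcPerVoxel(const float *signals,  // [voxels x measurements]
                             int nMeas, int numVoxels, int burnIn,
                             int iterations, unsigned long long seed,
                             float *posteriorMean)   // assumed zero-initialized
{
    int voxel = blockIdx.x * blockDim.x + threadIdx.x;
    if (voxel >= numVoxels) return;

    curandState rng;
    curand_init(seed, voxel, 0, &rng);  // independent RNG stream per voxel

    const float *signal = signals + voxel * nMeas;
    float theta = 0.0f;
    float ll = logLikelihood(theta, signal, nMeas);

    for (int it = 0; it < iterations; ++it) {
        float prop = theta + 0.1f * curand_normal(&rng);   // random-walk step
        float llNew = logLikelihood(prop, signal, nMeas);
        if (logf(curand_uniform(&rng)) < llNew - ll) {     // accept/reject
            theta = prop; ll = llNew;
        }
        if (it >= burnIn)   // accumulate posterior mean after burn-in
            posteriorMean[voxel] += theta / (iterations - burnIn);
    }
}
```

Since a typical brain volume contains hundreds of thousands of voxels, this mapping saturates the GPU even with one modest chain per thread.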
The Journal of Logic and Algebraic Programming | 2010
José M. Cecilia; José M. García; Ginés D. Guerrero; Miguel A. Martínez-del-Amor; Ignacio Pérez-Hurtado; Mario J. Pérez-Jiménez
P systems are inherently parallel and non-deterministic theoretical computing devices defined within the field of Membrane Computing. Many P system simulators have been presented in this area, but they are inefficient because they cannot handle the parallelism of these devices. Nowadays, we are witnessing the consolidation of GPUs as a parallel framework for general-purpose computing. In this paper, we analyse GPUs as an alternative parallel architecture for improving the performance of P system simulation, illustrated by the case study of a family of P systems that provides an efficient and uniform solution to the SAT problem. First, we develop a simulator that fully simulates the computation of the P system, demonstrating that GPUs are well suited to simulate them. Then, we adapt this simulator to the idiosyncrasies of the GPU architecture, improving on the performance of the previous simulator.
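Since the P system family solves SAT by membrane division, producing one membrane per truth assignment, a natural GPU view (sketched hypothetically below) assigns one thread per assignment and checks all clauses in parallel. The 3-literal clause encoding and all names are illustrative assumptions, not the paper's simulator.

```cuda
#include <cuda_runtime.h>

// Each clause holds up to 3 literals: v > 0 means variable v-1 must be true,
// v < 0 means variable -v-1 must be false, 0 marks an unused slot.
__global__ void checkAssignments(const int *clauses, int numClauses,
                                 int numVars, int *satisfiable)
{
    unsigned long long assignment =
        (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;
    if (assignment >= (1ULL << numVars)) return;  // 2^numVars "membranes"

    for (int c = 0; c < numClauses; ++c) {
        bool clauseTrue = false;
        for (int k = 0; k < 3; ++k) {
            int lit = clauses[c * 3 + k];
            if (lit == 0) continue;
            int var = (lit > 0 ? lit : -lit) - 1;
            bool value = (assignment >> var) & 1ULL;
            if ((lit > 0) == value) { clauseTrue = true; break; }
        }
        if (!clauseTrue) return;    // this assignment falsifies some clause
    }
    atomicExch(satisfiable, 1);     // every clause satisfied: formula is SAT
}
```

The exponential workspace of the P system becomes an exponential thread grid here, which is exactly what makes memory handling the hard part of a faithful simulation.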
Soft Computing | 2012
José M. Cecilia; José M. García; Ginés D. Guerrero; Miguel A. Martínez-del-Amor; Mario J. Pérez-Jiménez; Manuel Ujaldon
Membrane Computing is a discipline that aims to abstract formal computing models, called membrane systems or P systems, from the structure and functioning of living cells as well as from the cooperation of cells in tissues, organs, and other higher-order structures. This framework provides polynomial-time solutions to NP-complete problems by trading space for time, and its efficient simulation poses challenges in three different aspects: the intrinsic massive parallelism of P systems, an exponential computational workspace, and a computation that is not floating-point intensive. In this paper, we analyze the simulation of a family of recognizer P systems with active membranes that solves the Satisfiability problem in linear time on different instances of Graphics Processing Units (GPUs). For efficient handling of the exponential workspace created by the P system computation, we enable different data policies to increase memory bandwidth and exploit data locality through tiling and dynamic queues. The parallelism inherent to the target P system is also managed to demonstrate that GPUs offer a valid alternative for high-performance computing at a considerably lower cost. Furthermore, scalability is demonstrated up to the largest problem size we were able to run, and on Nvidia's new hardware generation, Fermi, we obtain a total speed-up exceeding four orders of magnitude when running our simulations on the Tesla S2050 server.
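The dynamic-queue idea can be illustrated with a short, hypothetical compaction kernel: active membranes claim dense queue slots via atomicAdd, so subsequent kernels launch only over live work rather than the full exponential space. Names and layout are assumptions.

```cuda
#include <cuda_runtime.h>

__global__ void enqueueActiveMembranes(const int *isActive, int numMembranes,
                                       int *queue, int *queueSize)
{
    int m = blockIdx.x * blockDim.x + threadIdx.x;
    if (m >= numMembranes || !isActive[m]) return;

    int slot = atomicAdd(queueSize, 1);  // claim the next dense queue slot
    queue[slot] = m;                     // membrane id for the next pass
}
```

The next kernel is then launched over *queueSize elements only, and each block can stage its queued membrane's objects in shared memory (the tiling policy mentioned above) before applying rules.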
PACBB | 2011
Ginés D. Guerrero; Horacio Pérez-Sánchez; Wolfgang Wenzel; José M. Cecilia; José M. García
In this work we discuss the benefits of using massively parallel architectures for the optimization of Virtual Screening methods. We empirically demonstrate that GPUs are a well-suited architecture for the acceleration of non-bonded interaction kernels, obtaining a sustained speedup of up to 260x over the sequential counterpart.
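A minimal sketch of the kind of non-bonded electrostatics kernel being accelerated is shown below: each thread sums softened Coulomb terms between one ligand atom and all receptor atoms. The float4 layout (xyz plus charge) and all names are assumptions for illustration.

```cuda
#include <cuda_runtime.h>

__global__ void coulombKernel(const float4 *ligand,   // x, y, z, charge
                              const float4 *receptor, // x, y, z, charge
                              int numLigand, int numReceptor,
                              float *energy)          // one term per ligand atom
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numLigand) return;

    float4 a = ligand[i];
    float e = 0.0f;
    for (int j = 0; j < numReceptor; ++j) {
        float4 b = receptor[j];
        float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        // Softened distance avoids division by zero for coincident atoms.
        float r = sqrtf(dx * dx + dy * dy + dz * dz) + 1e-6f;
        e += a.w * b.w / r;   // q_i * q_j / r_ij, constants folded elsewhere
    }
    energy[i] = e;
}
```

A reduction over energy[] then yields the total electrostatic term; the all-pairs structure with no branching is what makes this kernel map so well onto GPUs.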
International Parallel and Distributed Processing Symposium | 2012
Juan M. Cebrián; Ginés D. Guerrero; José M. García
In the last few years, Graphics Processing Units (GPUs) have become a great tool for massively parallel computing. GPUs are specifically designed for throughput and face several design challenges, especially what are known as the Power and Memory Walls. In these devices, available resources should be used to enhance performance and throughput, as the performance per watt is very high. For massively parallel applications or kernels, using the available silicon resources for power management was unproductive, as the main objective of the unit was to execute the kernel as fast as possible. However, not all the applications currently being ported to GPUs can make use of all the available resources, whether due to data dependencies, bandwidth requirements, or legacy software on new hardware, reducing the performance per watt. This new scenario requires new designs and optimizations to make these GPGPUs more energy efficient. First things first, we should begin by analyzing the applications we run on these processors, looking for bottlenecks and opportunities to optimize for energy efficiency. In this paper we analyze several kernels taken from the CUDA SDK in order to discover resource underutilization. Results show that this underutilization is present, and that resource optimization can increase the energy efficiency of GPU-based computation. We then discuss different strategies and proposals to increase energy efficiency in future GPU designs.
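One concrete way to probe such underutilization, sketched here under our own assumptions rather than as the paper's method, is CUDA's occupancy API, which reports how many blocks of a given kernel can be resident per multiprocessor; kernels that leave SMs underpopulated are natural candidates for energy optimization.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for any CUDA SDK kernel under study.
__global__ void dummyKernel(float *data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = data[i] * 2.0f;
}

int main()
{
    int device = 0, numSMs = 0, blocksPerSM = 0;
    const int blockSize = 256;

    cudaDeviceGetAttribute(&numSMs, cudaDevAttrMultiProcessorCount, device);
    // How many blocks of dummyKernel fit on one SM, given its register and
    // shared-memory footprint at this block size.
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, dummyKernel,
                                                  blockSize, 0);
    printf("SMs: %d, resident blocks/SM at blockSize=%d: %d\n",
           numSMs, blockSize, blocksPerSM);
    return 0;
}
```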
Parallel, Distributed and Network-Based Processing | 2012
Ginés D. Guerrero; Horacio E. Pérez-Sánchez; José M. Cecilia; José M. García
The current trend in medical research for the discovery of new drugs is the use of Virtual Screening (VS) methods. In these methods, the calculation of the non-bonded interactions, such as electrostatics or van der Waals forces, plays an important role, representing up to 80% of the total execution time. These kernels are computationally intensive and massively parallel in nature, and thus they are well suited to acceleration on parallel architectures. In this work, we discuss the effective parallelization of the non-bonded electrostatic interactions kernel for VS on three different parallel architectures: a shared memory system, a distributed memory system, and a Graphics Processing Unit (GPU). To handle the computational intensity and massive parallelism of this kernel efficiently, we enable different data policies on those architectures to take advantage of all the computational resources they offer. Four implementations are provided, based on the MPI, OpenMP, hybrid MPI-OpenMP, and CUDA programming models. All parallel implementations outperform the sequential implementation by a wide margin, obtaining up to a 72x speed-up factor on the shared memory system through OpenMP, up to 60x and 229x speed-up factors on the distributed memory system for the MPI and hybrid MPI-OpenMP implementations respectively, and finally up to a 213x speed-up factor for the CUDA implementation on the GPU architecture, which offers the best alternative in terms of performance/cost ratio.
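A hypothetical skeleton of the hybrid MPI-CUDA decomposition might look as follows: each rank owns a contiguous slice of the workload, runs the kernel on a locally bound GPU, and the partial energies are reduced at rank 0. sliceKernel stands in for the real electrostatics kernel; all names are illustrative. Compile with nvcc plus an MPI compiler wrapper.

```cuda
#include <mpi.h>
#include <cuda_runtime.h>

// Stand-in for the electrostatics kernel over this rank's slice of the data.
__global__ void sliceKernel(float *slice, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) slice[i] *= 2.0f;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1, numDevs = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    cudaGetDeviceCount(&numDevs);
    cudaSetDevice(rank % numDevs);        // bind each rank to a local GPU

    const int total = 1 << 20;
    int count = total / size;             // even split across ranks
    float *d_slice = nullptr;
    cudaMalloc(&d_slice, count * sizeof(float));
    cudaMemset(d_slice, 0, count * sizeof(float));

    sliceKernel<<<(count + 255) / 256, 256>>>(d_slice, count);
    cudaDeviceSynchronize();

    // Sum this rank's partial energies on the host, then reduce globally.
    float *h = new float[count];
    cudaMemcpy(h, d_slice, count * sizeof(float), cudaMemcpyDeviceToHost);
    float localSum = 0.0f, globalSum = 0.0f;
    for (int i = 0; i < count; ++i) localSum += h[i];
    delete[] h;
    MPI_Reduce(&localSum, &globalSum, 1, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);

    cudaFree(d_slice);
    MPI_Finalize();
    return 0;
}
```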
Computational Methods in Systems Biology | 2011
Irene Sánchez-Linares; Horacio Pérez-Sánchez; Ginés D. Guerrero; José M. Cecilia; José M. García
The completion of the human genome project has brought new and still unprocessed information about potential targets for the treatment of human diseases with drugs. The efficacy of a drug can be vastly improved through interaction with multiple targets, although undesirable side effects must also be studied. Experimental approaches for this purpose are very expensive and time consuming, while in-silico approaches can efficiently propose accurate predictions that drastically reduce testing procedures in the laboratory. Nevertheless, in-silico approaches for multiple-target identification have not yet been fully explored, and most of them still deal with rigid receptor models. It has been shown recently that the docking program FlexScreen efficiently incorporates protein flexibility. However, processing large databases of target proteins is a very time consuming process. Massively parallel architectures like GPUs can greatly alleviate these limitations. In this study we report our efforts to parallelize FlexScreen using CUDA.
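One plausible CUDA mapping for docking-style scoring, a sketch under our own assumptions rather than FlexScreen's actual code, assigns one thread block per candidate ligand pose and reduces per-atom score contributions in shared memory:

```cuda
#include <cuda_runtime.h>

// Launch as: scorePoses<<<numPoses, 256, 256 * sizeof(float)>>>(...)
// blockDim.x must be a power of two for the tree reduction below.
__global__ void scorePoses(const float *atomScores,  // [poses x atoms]
                           int atomsPerPose, float *poseScore)
{
    extern __shared__ float partial[];
    int pose = blockIdx.x;                 // one block per candidate pose

    // Each thread accumulates a strided subset of the pose's atom scores.
    float sum = 0.0f;
    for (int a = threadIdx.x; a < atomsPerPose; a += blockDim.x)
        sum += atomScores[pose * atomsPerPose + a];
    partial[threadIdx.x] = sum;
    __syncthreads();

    // Shared-memory tree reduction to a single score per pose.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) poseScore[pose] = partial[0];
}
```

The host then keeps the best-scoring poses; per-pose independence is what lets thousands of poses be scored concurrently.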
Concurrency and Computation: Practice and Experience | 2014
Ginés D. Guerrero; Richard M. Wallace; José Luis Vázquez-Poletti; José M. Cecilia; José M. García; Daniel Mozos; Horacio Pérez-Sánchez
Virtual Screening (VS) methods can considerably aid drug discovery research by predicting how ligands interact with drug targets. BINDSURF is an efficient and fast blind VS methodology for the determination of protein binding sites, depending on the ligand, that uses the massively parallel architecture of graphics processing units (GPUs) for fast unbiased prescreening of large ligand databases. In this contribution, we provide a performance/cost model for the execution of this application on both local systems and public cloud infrastructures. With our model, it is possible to determine the best infrastructure to use, in terms of execution time and cost, for any given problem to be solved by BINDSURF. Conclusions obtained from our study can be extrapolated to other GPU-based VS methodologies.
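The flavor of such a performance/cost model can be conveyed with a toy calculation: given a database size, a measured per-GPU throughput, and an hourly instance price, compare the owned-hardware runtime against the cloud's runtime and cost. All numbers below are made-up placeholders, not the paper's fitted parameters.

```cuda
#include <cstdio>

int main()
{
    const double ligands = 1e6;           // database size to prescreen
    const double ligandsPerHourGPU = 5e4; // measured throughput per GPU
    const double cloudPricePerHour = 2.1; // $/hour per GPU instance (assumed)
    const int cloudInstances = 16;        // horizontal scaling in the cloud

    double localHours = ligands / ligandsPerHourGPU;   // one owned GPU
    double cloudHours = localHours / cloudInstances;   // ideal scaling
    double cloudCost = cloudHours * cloudInstances * cloudPricePerHour;

    printf("local: %.1f h (hardware already owned)\n", localHours);
    printf("cloud: %.1f h at $%.2f total\n", cloudHours, cloudCost);
    return 0;
}
```

The real model additionally accounts for data-transfer time, instance start-up, and non-ideal scaling, which is where the local/cloud break-even point comes from.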
Concurrency and Computation: Practice and Experience | 2014
Ginés D. Guerrero; Juan M. Cebrián; Horacio Pérez-Sánchez; José M. García; Manuel Ujaldon; José M. Cecilia
The integration of the latest breakthroughs in computational modeling and high performance computing (HPC) has leveraged advances in the fields of healthcare and drug discovery, among others. By integrating all these developments, scientists are creating exciting new personal therapeutic strategies for living longer that were unimaginable not that long ago. However, we are witnessing the biggest revolution in HPC in the last decade. Several graphics processing unit architectures have established their niche in the HPC arena, but at the expense of excessive power consumption and heat. A solution to this important problem is based on heterogeneity. In this paper, we analyze power consumption on heterogeneous systems, benchmarking a bioinformatics kernel within the framework of virtual screening methods. Cores and frequencies are tuned to further improve the performance or energy efficiency of those architectures. Our experimental results show that targeted low-cost systems are the lowest power consumption platforms, although the most energy efficient platform, and the best suited for performance improvement, is Nvidia's Kepler GK110 graphics processing unit using the Compute Unified Device Architecture (CUDA). Finally, the Open Computing Language (OpenCL) version of the virtual screening code shows a remarkable performance penalty compared with its CUDA counterpart.
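Energy-efficiency comparisons of this kind rest on integrating sampled power over the run rather than timing alone. As a hypothetical sketch (not the paper's measurement setup), NVML can be polled while a kernel executes; link against -lnvidia-ml.

```cuda
#include <cstdio>
#include <unistd.h>
#include <nvml.h>

int main()
{
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);

    double joules = 0.0;
    const double dt = 0.1;                 // 100 ms sampling period
    for (int s = 0; s < 100; ++s) {        // ~10 s window around the kernel
        unsigned int mw = 0;
        nvmlDeviceGetPowerUsage(dev, &mw); // board power in milliwatts
        joules += (mw / 1000.0) * dt;      // integrate power over time
        usleep((useconds_t)(dt * 1e6));
    }
    printf("energy over window: %.1f J\n", joules);

    nvmlShutdown();
    return 0;
}
```

Energy (or energy-delay product) computed this way is what allows a low-power platform and a fast GPU to be compared on equal footing.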