José M. Mantas | Researchain

Publication

Featured researches published by José M. Mantas.

The Journal of Supercomputing | 2011

Simulation of one-layer shallow water systems on multicore and CUDA architectures

Marc de la Asunción; José M. Mantas; Manuel J. Castro

The numerical solution of shallow water systems is useful for several applications related to geophysical flows, but the large size of the domains calls for powerful accelerators to obtain numerical results in reasonable times. This paper addresses how to speed up the numerical solution of a first order well-balanced finite volume scheme for 2D one-layer shallow water systems by using modern Graphics Processing Units (GPUs) supporting the NVIDIA CUDA programming model. An algorithm which exploits the potential data parallelism of this method is presented and implemented using the CUDA model in single and double floating point precision. Numerical experiments show the high efficiency of this CUDA solver in comparison with a CPU parallel implementation of the solver and with respect to a previously existing GPU solver based on a shading language.

Mathematics and Computers in Simulation | 2009

Simulation of shallow-water systems using graphics processing units

Miguel Lastra; José M. Mantas; Carlos Ureña; Manuel J. Castro; José A. García-Rodríguez

This paper addresses the speedup of the numerical solution of shallow-water systems in 2D domains by using modern graphics processing units (GPUs). A first order well-balanced finite volume numerical scheme for 2D shallow-water systems is considered. The potential data parallelism of this method is identified and the scheme is efficiently implemented on GPUs for one-layer shallow-water systems. Numerical experiments performed on several GPUs show the high efficiency of the GPU solver in comparison with a highly optimized implementation of a CPU solver.

International Journal of Approximate Reasoning | 2006

Extraction of similarity based fuzzy rules from artificial neural networks

Carlos Javier Mantas; José Manuel Puche; José M. Mantas

A method to extract a fuzzy rule based system from a trained artificial neural network for classification is presented. The fuzzy system obtained is equivalent to the corresponding neural network. In the antecedents of the fuzzy rules, it uses the similarity between the input datum and the weight vectors. This yields highly understandable rules. Thus, the fuzzy system and a simple analysis of the weight vectors are together enough to discern the hidden knowledge learnt by the neural network. Several classification problems are presented to illustrate this method of knowledge discovery by using artificial neural networks.

Journal of Parallel and Distributed Computing | 2012

An MPI-CUDA implementation of an improved Roe method for two-layer shallow water systems

Marc de la Asunción; José M. Mantas; Manuel J. Castro; Enrique D. Fernández-Nieto

The numerical solution of two-layer shallow water systems is required to accurately simulate stratified fluids, which are ubiquitous in nature: they appear in atmospheric flows, ocean currents, oil spills, etc. Moreover, the implementation of the numerical schemes to solve these models in realistic scenarios imposes huge demands on computing power. In this paper, we tackle the acceleration of these simulations in triangular meshes by exploiting the combined power of several CUDA-enabled GPUs in a GPU cluster. For that purpose, an improvement of a path conservative Roe-type finite volume scheme which is especially suitable for GPU implementation is presented, and a distributed implementation of this scheme which uses CUDA and MPI to exploit the potential of a GPU cluster is developed. This implementation overlaps MPI communication with CPU-GPU memory transfers and GPU computation to increase efficiency. Several numerical experiments, performed on a cluster of modern CUDA-enabled GPUs, show the efficiency of the distributed solver.

European Conference on Parallel Processing | 2010

Programming CUDA-based GPUs to simulate two-layer shallow water flows

Marc de la Asunción; José M. Mantas; Manuel J. Castro

The two-layer shallow water system is used as the numerical model to simulate several phenomena related to geophysical flows, such as the steady exchange of two different water flows, as occurs in the Strait of Gibraltar, or the tsunamis generated by underwater landslides. The numerical solution of this model for realistic domains imposes great demands on computing power, and modern Graphics Processing Units (GPUs) have proven to be powerful accelerators for this kind of computationally intensive simulation. This work describes an accelerated implementation of a first order well-balanced finite volume scheme for 2D two-layer shallow water systems using GPUs supporting the CUDA (Compute Unified Device Architecture) programming model and double precision arithmetic. This implementation uses the CUDA framework to efficiently exploit the potential fine-grain data parallelism of the numerical algorithm. Two versions of the GPU solver are implemented and studied: one using both single and double precision, and another using only double precision. Numerical experiments show the efficiency of this CUDA solver on several GPUs, and a comparison with an efficient multicore CPU implementation of the solver is also reported.

Journal of Scientific Computing | 2011

Two-Dimensional Compact Third-Order Polynomial Reconstructions. Solving Nonconservative Hyperbolic Systems Using GPUs

José M. Gallardo; Sergio Ortega; Marc de la Asunción; José M. Mantas

We present a new kind of high-order reconstruction operator of polynomial type, which is used in combination with the scheme presented in Castro et al. (J. Sci. Comput. 39:67–114, 2009) for solving nonconservative hyperbolic systems. The scheme is implemented on Graphics Processing Units (GPUs), achieving a substantial speedup with respect to conventional CPUs. As an application, the two-dimensional shallow water equations with a geometrical source term due to the bottom slope are considered.

Advances in Engineering Software | 2016

Numerical simulation of tsunamis generated by landslides on multiple GPUs

M. de la Asunción; Manuel J. Castro; José M. Mantas; Sergio Ortega

We propose a 2D numerical scheme to simulate tsunamis generated by landslides. A description of an MPI-CUDA implementation which uses overlapping techniques. Validation of the numerical scheme by simulating a real tsunami. Load balancing algorithms to balance the computational load in wet and dry areas. Good weak and strong scaling using up to 24 GPUs in real and artificial problems.

In this work we propose a two-layer Savage-Hutter type model that is the natural extension of the 1D system proposed by E. D. Fernandez-Nieto et al. in 2008 to simulate tsunamis generated by landslides. We describe a single-GPU and a multi-GPU implementation of this model using MPI and the CUDA framework over structured meshes. The distributed implementation is tested for several artificial and realistic problems using up to 24 GPUs. We also propose a static and a dynamic load balancing algorithm in order to deal with the unbalanced computational load due to the different amounts of wet and dry areas among the subdomains. The validity of the model is tested by simulating the tsunami that occurred in Lituya Bay, Alaska, in 1958. Numerical experiments show the efficiency of the multi-GPU solver, the usefulness of the load balancing algorithms and the validity of the model to simulate real tsunamis generated by landslides.

The Journal of Supercomputing | 2015

SP-ChainMail: a GPU-based sparse parallel ChainMail algorithm for deforming medical volumes

Alejandro Rodríguez; Alejandro León; Germán Arroyo; José M. Mantas

The ChainMail algorithm is a physically based deformation algorithm that has been successfully used in virtual surgery simulators, where time is a critical factor. In this paper, we present a parallel algorithm, based on ChainMail, and its efficient implementation, which uses modern GPU capabilities to reduce the time required to compute deformations over large medical 3D datasets. We also present a 3D blocking scheme that reduces the number of unnecessary processing threads. For this purpose, this paper describes a new parallel boolean reduction scheme, used to efficiently decide which blocks are computed. Finally, through an extensive analysis, we show the performance improvement achieved by our implementation of the proposed algorithm and by the use of the proposed blocking scheme, due to the high spatial and temporal locality of our approach.
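
As a rough illustration of what a boolean reduction over blocks can look like, the sketch below ORs per-element "needs processing" flags pairwise, level by level, and reports which blocks contain at least one set flag. It is a serial Python stand-in written under stated assumptions: the names `or_reduce` and `active_blocks`, the 1D layout, and the pairwise tree shape are all hypothetical and are not taken from the paper, whose scheme targets 3D blocks on a GPU.

```python
def or_reduce(flags):
    """Pairwise (tree-shaped) OR over a sequence of booleans.

    Each pass halves the number of values, mimicking the levels of a
    parallel reduction; here the levels are simply run one after another.
    """
    level = list(flags)
    if not level:
        return False
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(False)  # pad so every value has a partner
        level = [level[i] or level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def active_blocks(flags, block_size):
    """Indices of the blocks that contain at least one set flag."""
    return [
        start // block_size
        for start in range(0, len(flags), block_size)
        if or_reduce(flags[start:start + block_size])
    ]


# Hypothetical data: 16 elements in blocks of 4, one flagged element.
flags = [False] * 16
flags[9] = True
print(active_blocks(flags, 4))
```

Only the blocks returned by `active_blocks` would need threads launched for them; the rest can be skipped.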

Archive | 2006

Parallelization of WENO-Boltzmann Schemes for Kinetic Descriptions of 2D Semiconductor Devices

José M. Mantas; José A. Carrillo; Armando Majorana

The parallelization of a direct WENO (Weighted Essentially Non-Oscillatory) solver for the 2D-spatial Boltzmann-Poisson system describing electron transport in Si-based semiconductor devices has been addressed. A non-parabolic Kane energy-band and elastic acoustic and inelastic non-polar optical phonon operators have been used [CGMS03A] in the physical description of the electron transport in the device. This choice is by no means restrictive and more complicated band structures, including several valleys, and different scattering mechanisms, both intervalley and intravalley ones, can be included in a flexible way both in the numerical method and its parallelization [CCM04]. The numerical scheme which has been parallelized [CGMS03A, CGMS03B] uses a formulation of the Boltzmann-Poisson system in spherical coordinates for the wave vector space. After adimensionalization one is reduced to simulate the evolution in time t of the distribution function Φ in the five-dimensional space (x, y, ω, μ, φ), where x and y are the spatial coordinates, ω ≥ 0 is a dimensionless energy, μ ∈ [−1, 1] is the cosine of the angle with respect to the x-axis and φ ∈ [0, π] the azimuthal angle. The resulting Boltzmann equation reads

European Conference on Parallel Processing | 2005

Parallelization of implicit-explicit Runge-Kutta methods for cluster of PCs

José M. Mantas; Pedro Enrique Barrilao González; José A. Carrillo

Several physical phenomena of great importance in science and engineering are described by large, partly stiff differential systems where the stiff terms can be easily separated from the remaining terms. Implicit-Explicit Runge-Kutta (IMEXRK) methods have proven to be useful for solving these systems efficiently. However, the application of these methods still requires a large computational effort, and their parallel implementation constitutes a suitable way to achieve acceptable response times. In this paper, a technique to parallelize and efficiently implement IMEXRK methods on PC clusters is proposed. This technique has been used to parallelize a particular IMEXRK method, and an efficient parallel implementation of the resultant scheme has been derived in a structured manner by following a component-based approach. Several numerical experiments performed on a cluster of dual PCs reveal the good speedup and the satisfactory scalability of the parallel solver obtained.
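
The splitting idea behind IMEX methods can be illustrated with the simplest member of the family, a first-order IMEX Euler step, which treats the non-stiff term explicitly and the stiff term implicitly. The sketch below is a serial Python example under stated assumptions: the scalar test problem y' = 1 - 1000 y and the names `imex_euler_step` and `integrate` are hypothetical, and this is not the particular IMEXRK method or the parallel implementation described in the paper.

```python
def imex_euler_step(y, h, explicit_term, stiff_rate):
    """One IMEX Euler step for y' = explicit_term(y) - stiff_rate * y.

    The non-stiff part is evaluated at the old value (explicit), while the
    stiff linear part is evaluated at the new value (implicit):
        y_new = y + h * explicit_term(y) - h * stiff_rate * y_new
    Solving this linear equation for y_new gives the formula below.
    """
    return (y + h * explicit_term(y)) / (1.0 + h * stiff_rate)


def integrate(y0, h, steps, explicit_term, stiff_rate):
    """Apply `steps` IMEX Euler steps of size h starting from y0."""
    y = y0
    for _ in range(steps):
        y = imex_euler_step(y, h, explicit_term, stiff_rate)
    return y


# Hypothetical test problem: y' = 1 - 1000 * y, steady state y = 0.001.
# A fully explicit Euler step with h = 0.1 would be unstable here.
y_end = integrate(y0=1.0, h=0.1, steps=50,
                  explicit_term=lambda y: 1.0, stiff_rate=1000.0)
print(y_end)
```

Because only the stiff part is solved implicitly, the step size is not limited by the stiff rate, while the remaining term is still handled with a cheap explicit evaluation.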
