Francisco-Jose Martínez-Zaldívar

Publication

Featured research published by Francisco-Jose Martínez-Zaldívar.

The Journal of Supercomputing | 2011

Real-time massive convolution for audio applications on GPU

Jose A. Belloch; Alberto Gonzalez; Francisco-Jose Martínez-Zaldívar; Antonio M. Vidal

Massive convolution is the basic operation in multichannel acoustic signal processing. This field has experienced major development in recent years. One reason for this has been the increase in the number of sound sources used in playback applications available to users. Another reason is the growing need to incorporate new effects and to improve the hearing experience. Massive convolution requires high computing capacity. GPUs offer the possibility of parallelizing these operations. This allows us to obtain the processing result in a much shorter time and to free up CPU resources. One important aspect lies in the possibility of overlapping the transfer of data from CPU to GPU and vice versa with the computation, in order to carry out real-time applications. Thus, a synthesis of 3D sound scenes could be achieved with only a peer-to-peer music streaming environment using a simple GPU in your computer, while the CPU in the computer is being used for other tasks. Nowadays, these effects are obtained in theaters or funfairs at a very high cost, requiring a large quantity of resources. Thus, our work focuses on two main points: to describe an efficient massive convolution implementation and to incorporate this task into real-time multichannel-sound applications.

The Journal of Supercomputing | 2011

Neville elimination on multi- and many-core systems: OpenMP, MPI and CUDA

Pedro Alonso; Raquel Cortina; Francisco-Jose Martínez-Zaldívar; José Ranilla

This paper describes several parallel algorithmic variations of Neville elimination. This elimination solves a system of linear equations by making zeros in a matrix column, adding to each row an adequate multiple of the preceding one. The parallel algorithms are run and compared on different multi- and many-core platforms using parallel programming techniques such as MPI, OpenMP and CUDA.
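The elimination step described in this abstract can be illustrated with a short sequential sketch. It is not taken from the paper and shows none of its parallel variants; the function name `neville_solve` is hypothetical, and the sketch assumes that no row exchanges are needed (as is the case, for example, for nonsingular totally positive matrices).

```python
def neville_solve(a, b):
    """Solve a x = b by Neville elimination followed by back substitution.

    Sequential sketch only: assumes no row exchanges are required.
    """
    n = len(a)
    # Augmented working copy [a | b] so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for k in range(n - 1):
        # Sweep from the bottom row upwards so that row i-1 still holds its
        # pre-step value in column k when it is used to cancel row i.
        for i in range(n - 1, k, -1):
            if m[i][k] == 0.0:
                continue  # entry is already zero
            if m[i - 1][k] == 0.0:
                raise ValueError("row exchange needed; not handled in this sketch")
            factor = m[i][k] / m[i - 1][k]
            # Subtract a multiple of the PRECEDING row (not of a fixed pivot row).
            for j in range(k, n + 1):
                m[i][j] -= factor * m[i - 1][j]

    # The matrix is now upper triangular: back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

The only difference from Gaussian elimination is the row used to cancel each entry: the immediately preceding row instead of a single pivot row per column.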

The Journal of Supercomputing | 2011

Tridimensional block multiword LDPC decoding on GPUs

Francisco-Jose Martínez-Zaldívar; Antonio-Manuel Vidal-Maciá; Alberto Gonzalez; Vicenc Almenar

In this paper, we describe a parallel algorithm for LDPC (Low Density Parity Check codes) decoding on a GPU (Graphics Processing Unit) using CUDA (Compute Unified Device Architecture). The strategy of the kernel grid and block design is shown and the multiword decoding operation is described using tridimensional blocks. The performance (speedup) of the proposed parallel algorithm is slightly better than the performance found in the literature when this is relatively good, and shows a great improvement in those cases with previously reported moderate or bad performance.

Computers in Biology and Medicine | 2014

Adaptive step ODE algorithms for the 3D simulation of electric heart activity with graphics processing units

Víctor M. García-Molla; Alejandro Liberos; Antonio M. Vidal; Maria S. Guillem; José Millet; Alberto Gonzalez; Francisco-Jose Martínez-Zaldívar; Andreu M. Climent

In this paper we studied the implementation and performance of adaptive step methods for large systems of ordinary differential equations on graphics processing units, focusing on the simulation of three-dimensional electric cardiac activity. The Rush-Larsen method was applied in all the implemented solvers to improve efficiency. We compared the adaptive methods with the fixed step methods, and we found that the fixed step methods can be faster while the adaptive step methods are better in terms of accuracy and robustness.
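For readers unfamiliar with the Rush-Larsen method mentioned above, the following is a minimal sketch of its standard textbook form for a single gating variable obeying dy/dt = (y_inf - y) / tau. It is not the paper's solver: the function names are hypothetical, and y_inf and tau are treated as constants here, whereas in a cardiac model they depend on the membrane potential and are frozen only for the duration of one step.

```python
import math


def rush_larsen_step(y, y_inf, tau, dt):
    """Exponential update: exact when y_inf and tau are constant over the step."""
    return y_inf + (y - y_inf) * math.exp(-dt / tau)


def forward_euler_step(y, y_inf, tau, dt):
    """Plain explicit Euler update, shown for comparison."""
    return y + dt * (y_inf - y) / tau


def integrate(step, y0, y_inf, tau, dt, n_steps):
    """Apply the given one-step rule n_steps times."""
    y = y0
    for _ in range(n_steps):
        y = step(y, y_inf, tau, dt)
    return y
```

With a step much larger than the time constant (for instance dt = 0.5 and tau = 0.1), the Euler iterate oscillates with growing amplitude, while the exponential update stays between the initial value and y_inf. This stability with larger steps is the usual motivation for the method.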

Computer-Aided Engineering | 2013

Multichannel massive audio processing for a generalized crosstalk cancellation and equalization application using GPUs

Jose A. Belloch; Alberto Gonzalez; Francisco-Jose Martínez-Zaldívar; Antonio M. Vidal

Multichannel acoustic signal processing has undergone major development in recent years due to the increased complexity of current audio processing applications, which involve the processing of multiple sources, channels, or filters. A general scenario that appears in this context is the immersive reproduction of binaural audio without the use of headphones, which requires the use of a crosstalk canceler. However, generalized crosstalk cancellation and equalization (GCCE) requires high computing capacity, which is a considerable limitation for real-time applications. This paper discusses the design and implementation of all the processing blocks of a multichannel convolution on a GPU for real-time applications. To this end, a very efficient filtering method using specific data structures is proposed, which takes advantage of overlap-save filtering and filter fragmentation. It has been shown that, for a real-time application with 22 inputs and 64 outputs, the system is capable of managing 1408 filters of 2048 coefficients with a latency of less than 6 ms. The proposed GPU implementation can be easily adapted to any acoustic environment, demonstrating the validity of these co-processors for managing intensive multichannel audio applications.
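Overlap-save filtering, which the abstract above names as one ingredient of the proposed method, can be sketched for a single channel as follows. This is a plain CPU illustration under stated assumptions, not the paper's GPU implementation: it omits filter fragmentation and the multichannel data structures, and the helper names (`fft`, `overlap_save`, `direct_convolution`) and the default block size are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two); inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_save(x, h, n_fft=8):
    """Filter x with FIR coefficients h, one n_fft-sample block at a time."""
    m = len(h)
    step = n_fft - (m - 1)  # new output samples produced per block
    if step <= 0 or n_fft & (n_fft - 1):
        raise ValueError("n_fft must be a power of two and at least len(h)")
    h_freq = fft([complex(v) for v in h] + [0j] * (n_fft - m))
    padded = [0.0] * (m - 1) + list(x)  # initial history of m-1 zeros
    y = []
    pos = 0
    while len(y) < len(x):
        block = padded[pos:pos + n_fft]
        block += [0.0] * (n_fft - len(block))  # zero-pad the final block
        spec = fft([complex(v) for v in block])
        prod = [a * b for a, b in zip(spec, h_freq)]
        time = fft(prod, inverse=True)
        # The first m-1 samples are corrupted by circular wrap-around: discard them.
        y.extend(v.real / n_fft for v in time[m - 1:])
        pos += step  # consecutive blocks overlap by m-1 input samples
    return y[:len(x)]


def direct_convolution(x, h):
    """Reference time-domain convolution, truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]
```

Comparing `overlap_save(x, h)` with `direct_convolution(x, h)` on a short signal is a quick way to check that the discarded prefix and the block overlap are consistent.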

International Conference on Conceptual Structures | 2012

Headphone-based spatial sound with a GPU accelerator

Jose A. Belloch; Miguel Ferrer; Alberto Gonzalez; Francisco-Jose Martínez-Zaldívar; Antonio M. Vidal

Multichannel acoustic signal processing has undergone major development in recent years. The incorporation of spatial information into an immersive audiovisual virtual environment or into video games provides a better sense of “presence” to applications. Spatial sound consists of reproducing audio signals with spatial cues (spatial information embedded in the sound) through headphones. This spatial information allows the listener to identify the virtual positions of the sources corresponding to different sounds. Headphone-based spatial sound is obtained by filtering different sound sources through special filters called Head Related Transfer Functions (HRTFs) prior to rendering them through headphones. Efficient computation plays an important role when the number of sources to be managed is high. This situation increases the number of filtering operations, requiring high computing capacity, especially when the virtual sources are moving. Graphics Processing Units (GPUs) are highly parallel programmable co-processors that provide massive computation when the needed operations are properly parallelized. This paper discusses the design, the implementation and the performance of a headphone-based spatial audio application whose processing is carried out entirely on a GPU. This application is able to interact with the listener, who can select and change the location of the sound sources in real time. This work also analyzes specific computational aspects inside the CUDA environment in order to successfully exploit GPU resources. Results show that the proposed application is able to move up to 2500 sources simultaneously, while leaving CPU resources free for other tasks. This work emphasizes the importance of analyzing all CUDA aspects, since they can drastically influence performance.

International Conference on Conceptual Structures | 2013

New parallel sphere detector algorithm providing high-throughput for optimal MIMO detection

Csaba Mate Józsa; Géza Kolumbán; Antonio M. Vidal; Francisco-Jose Martínez-Zaldívar; Alberto Gonzalez

Multiple-input multiple-output (MIMO) detection techniques can vary significantly in complexity and detection performance. Finding the optimal Maximum Likelihood (ML) solution with high throughput was limited by the computational performance. In order to achieve high throughput, non-ML algorithms were introduced, having degraded detection performance and lower complexity. In this paper we present a new parallel algorithm, inspired by the Sphere Detector (SD) algorithm, which can efficiently solve the ML detection of MIMO systems with high throughput on parallel architectures. We also give an overview of how the Parallel Sphere Detector (PSD) can be mapped onto GP-GPUs; however, different parallel architectures are also suitable for adapting the presented algorithm.

The Journal of Supercomputing | 2013

Preface to high performance computing applied to computational problems in science and engineering

Raquel Cortina; Francisco-Jose Martínez-Zaldívar; Antonio M. Vidal; Jesús Vigo-Aguiar

As a part of the 11th International Conference on “Computational and Mathematical Methods in Science and Engineering” (CMMSE 2011), the Minisymposium “High Performance Computing (HPC) applied to Computational Problems in Science and Engineering” took place in Benidorm on 26 and 27 June. It brought together over 100 researchers associated with the field of HPC, who participated in the various oral and poster sessions. In recent years, the number of scientific contributions and research projects related to HPC based on the use of parallel computers (multicore processors, Graphics Processing Units, i.e. GPUs, or clusters of computers) has significantly increased. This phenomenon has occurred in almost all engineering fields that require intensive computing. Heterogeneous parallel computers based on manycore (GPU) and multicore microprocessors are fascinating tools and represent a quantitative leap in the development of high performance hardware. Probably it would be impossible to go back,

Concurrency and Computation: Practice and Experience | 2015

Parallel sphere detector algorithm providing optimal MIMO detection on massively parallel architectures

Csaba Mate Józsa; Géza Kolumbán; Antonio M. Vidal; Francisco-Jose Martínez-Zaldívar; Alberto Gonzalez

Multiple-input multiple-output (MIMO) systems have attracted considerable attention in wireless communications because they offer a significant increase in data throughput and link coverage without additional bandwidth requirement or increased transmit power. The price that has to be paid is the increased complexity of hardware components and algorithms. The sphere detector (SD) algorithm solves the problem of maximum likelihood (ML) detection for MIMO channels by significantly reducing the search space of possible solutions. The main drawback of the SD algorithm is its sequential nature; consequently, running it on massively parallel architectures (MPAs) is very inefficient. In order to overcome the drawbacks of the SD algorithm, a new parallel sphere detector (PSD) algorithm is proposed. It implements a novel hybrid tree search method, where the algorithm parallelism is assured by the efficient combination of depth-first search and breadth-first search algorithms. A path metric-based parallel sorting is employed at each intermediate stage. The PSD algorithm is able to adjust its memory requirements and extent of parallelism to fit a wide range of parallel architectures. Mapping details for MPAs are proposed by giving the details of thread-dependent, highly parallel building blocks of the algorithm. Based on the building blocks proposed, a mapping to a general-purpose graphics processing unit is provided, and its performance is evaluated. In order to achieve high throughput, several levels of parallelism are introduced, and different scheduling strategies are considered.
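As background for the abstract above, the sequential sphere detector it starts from can be sketched as a depth-first tree search with radius pruning. This is an illustration of the classical SD idea only, not the proposed PSD algorithm or its GPU mapping. It assumes a real-valued model in which the channel has already been reduced to an upper-triangular matrix `r` (for instance by a QR decomposition) and `y` is the correspondingly rotated received vector; the function names are hypothetical.

```python
from itertools import product


def sphere_detect(r, y, alphabet):
    """Return (symbols, metric) minimizing ||y - r s||^2 over s in alphabet^n.

    r must be upper triangular. The search radius shrinks every time a
    complete candidate with a smaller metric is found.
    """
    n = len(y)
    best = {"metric": float("inf"), "symbols": None}
    s = [0] * n

    def search(level, partial):
        if level < 0:  # a full candidate better than the current best
            best["metric"] = partial
            best["symbols"] = list(s)
            return
        for sym in alphabet:
            s[level] = sym
            # Row `level` involves only the symbols already fixed (level..n-1).
            residual = y[level] - sum(r[level][j] * s[j] for j in range(level, n))
            metric = partial + residual * residual
            if metric < best["metric"]:  # prune branches outside the sphere
                search(level - 1, metric)

    search(n - 1, 0.0)
    return best["symbols"], best["metric"]


def brute_force_detect(r, y, alphabet):
    """Exhaustive ML search over all candidates, used as a reference."""
    n = len(y)

    def cost(c):
        return sum((y[i] - sum(r[i][j] * c[j] for j in range(i, n))) ** 2
                   for i in range(n))

    best = min(product(alphabet, repeat=n), key=cost)
    return list(best), cost(best)
```

Each recursion level depends on the partial metric of the level above it and on the current radius, which is the sequential dependency the abstract identifies as the obstacle to running SD efficiently on massively parallel hardware.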

International Conference on Multimedia and Expo | 2011

A real-time crosstalk canceller on a notebook GPU

Jose A. Belloch; Alberto Gonzalez; Francisco-Jose Martínez-Zaldívar; Antonio M. Vidal

Crosstalk cancellation is one of the main applications in multichannel acoustic signal processing. This field has experienced major development in recent years because of the increase in the number of sound sources used in playback applications available to users. Developing these applications requires high computing capabilities because of their high number of operations. Graphics Processing Units (GPUs), highly parallel commodity programmable co-processors, offer the possibility of parallelizing these operations. This makes it possible to obtain the results in a much shorter time and also to free up CPU resources, which can be used for other tasks. One important aspect lies in the possibility of overlapping the data transfer from CPU to GPU and vice versa with the computation, in order to carry out real-time applications. Thus, this work focuses on two main points: to describe an efficient implementation of crosstalk cancellation on a GPU and to incorporate it into a real-time application.
