Publication


Featured research published by Xavier Teruel.


IEEE Transactions on Parallel and Distributed Systems | 2009

The Design of OpenMP Tasks

Eduard Ayguadé; Nawal Copty; Alejandro Duran; Jay Hoeflinger; Yuan Lin; Federico Massaioli; Xavier Teruel; Priya Unnikrishnan; Guansong Zhang

OpenMP has been very successful in exploiting structured parallelism in applications. With increasing application complexity, there is a growing need for addressing irregular parallelism in the presence of complicated control structures. This is evident in various efforts by the industry and research communities to provide a solution to this challenging problem. One of the primary goals of OpenMP 3.0 is to define a standard dialect to express and efficiently exploit unstructured parallelism. This paper presents the design of the OpenMP tasking model by members of the OpenMP 3.0 tasking sub-committee which was formed for this purpose. The paper summarizes the efforts of the sub-committee (spanning over two years) in designing, evaluating and seamlessly integrating the tasking model into the OpenMP specification. In this paper, we present the design goals and key features of the tasking model, including a rich set of examples and an in-depth discussion of the rationale behind various design choices. We compare a prototype implementation of the tasking model with existing models, and evaluate it on a wide range of applications. The comparison shows that the OpenMP tasking model provides expressiveness, flexibility, and huge potential for performance and scalability.
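As a rough illustration of the unstructured, recursive parallelism this abstract refers to, the following minimal sketch uses the `task` and `taskwait` constructs introduced in OpenMP 3.0. The function names `fib` and `fib_parallel` are hypothetical and are not taken from the paper; the naive Fibonacci recursion is chosen only because it is short, not because it is efficient. Compiled without OpenMP support, the pragmas are ignored and the code runs sequentially with the same result.

```c
#include <assert.h>

/* Hypothetical example: naive recursive Fibonacci.
 * Each recursive call is spawned as an OpenMP task. */
static long fib(int n)
{
    long x, y;

    if (n < 2)
        return n;

    /* x and y live on this call's stack, so they must be shared
     * for the child tasks to write their results back. */
    #pragma omp task shared(x) firstprivate(n)
    x = fib(n - 1);

    #pragma omp task shared(y) firstprivate(n)
    y = fib(n - 2);

    /* Wait for both child tasks before reading x and y. */
    #pragma omp taskwait
    return x + y;
}

/* Hypothetical driver: one thread creates the root of the task tree,
 * and the other threads of the team execute tasks as they appear. */
static long fib_parallel(int n)
{
    long result = 0;

    assert(n >= 0); /* the recursion is only defined for n >= 0 */

    #pragma omp parallel
    #pragma omp single
    result = fib(n);

    return result;
}
```

A caller would simply invoke `fib_parallel(20)`. A practical version would usually stop creating tasks below some problem size, since very fine-grained tasks tend to cost more than the work they contain.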


International Conference on Parallel Processing | 2009

Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP

Alejandro Duran; Xavier Teruel; Roger Ferrer; Xavier Martorell; Eduard Ayguadé

Traditional parallel applications have exploited regular parallelism, based on parallel loops. Only a few applications exploit sections parallelism. With the release of the new OpenMP specification (3.0), this programming model supports tasking. Parallel tasks allow the exploitation of irregular parallelism, but there is a lack of benchmarks exploiting tasks in OpenMP. With the current (and projected) multicore architectures that offer many more alternatives to execute parallel applications than traditional SMP machines, this kind of parallelism is increasingly important, and so is the need for a set of benchmarks to evaluate it. In this paper, we motivate the need for such a benchmark suite for irregular and/or recursive task parallelism. We present our proposal, the Barcelona OpenMP Tasks Suite (BOTS), with a set of applications exploiting regular and irregular parallelism, based on tasks. We present an overall evaluation of the BOTS benchmarks on an Altix system and we discuss some of the different experiments that can be done with the different compilation and runtime alternatives of the benchmarks.


Conference of the Centre for Advanced Studies on Collaborative Research | 2007

Support for OpenMP tasks in Nanos v4

Xavier Teruel; Xavier Martorell; Alejandro Duran; Roger Ferrer; Eduard Ayguadé

In this paper we describe an implementation overview of Nanos v4: an OpenMP Run Time Library (RTL) based on the nano-threads programming model. Our main goal is to discuss different aspects of the library development focusing on the implementation of a new feature introduced in the last OpenMP release: task support. We compare the performance of our prototype implementation and the workqueuing model available on the Intel compiler with a set of kernel applications.


Languages and Compilers for Parallel Computing | 2007

An Experimental Evaluation of the New OpenMP Tasking Model

Eduard Ayguadé; Alejandro Duran; Jay Hoeflinger; Federico Massaioli; Xavier Teruel

The OpenMP standard was conceived to parallelize dense array-based applications, and it has achieved much success with that. Recently, a novel tasking proposal to handle unstructured parallelism in OpenMP has been submitted to the OpenMP 3.0 Language Committee. We tested its expressiveness and flexibility, using it to parallelize a number of examples from a variety of different application areas. Furthermore, we checked whether the model can be implemented efficiently, evaluating the performance of an experimental implementation of the tasking proposal on an SGI Altix 4700, and comparing it to the performance achieved with Intel's Workqueueing model and other worksharing alternatives currently available in OpenMP 2.5. We conclude that the new OpenMP tasks allow the expression of parallelism for a broad range of applications and that they will not hamper application performance.


Conference of the Centre for Advanced Studies on Collaborative Research | 2008

OpenMP tasks in IBM XL compilers

Xavier Teruel; Priya Unnikrishnan; Xavier Martorell; Eduard Ayguadé; Raul Esteban Silvera; Guansong Zhang; Ettore Tiotto

Tasking is the most significant feature included in the new OpenMP 3.0 standard. It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP. This paper presents the design and implementation of the task model in the IBM XL parallelizing compilers. The task construct is significantly different from other OpenMP constructs. This paper discusses some of the unique challenges in implementing the task construct and its associated synchronization constructs in the compiler. We also present a performance evaluation of our implementation on a set of benchmarks and applications. We identify limitations in the current implementation and propose solutions for further improvement.


International Conference on Parallel Processing | 2012

On the Instrumentation of OpenMP and OmpSs Tasking Constructs

Harald Servat; Xavier Teruel; Germán Llort; Alejandro Duran; Judit Gimenez; Xavier Martorell; Eduard Ayguadé; Jesús Labarta

Parallelism has become more and more commonplace with the advent of the multicore processors. Although different parallel programming models have arisen to exploit the computing capabilities of such processors, developing applications that take benefit of these processors may not be easy. And what is worse, the performance achieved by the parallel version of the application may not be what the developer expected, as a result of a dubious utilization of the resources offered by the processor. We present in this paper a fruitful synergy of a shared memory parallel compiler and runtime, and a performance extraction library. The objective of this work is not only to reduce the performance analysis life-cycle when doing the parallelization of an application, but also to extend the analysis experience of the parallel application by incorporating data that is only known in the compiler and runtime side. Additionally we present performance results obtained with the execution of instrumented application and evaluate the overhead of the instrumentation.


International Workshop on OpenMP | 2014

Task-Parallel Reductions in OpenMP and OmpSs

Jan Ciesko; Sergi Mateo; Xavier Teruel; Vicenç Beltran; Xavier Martorell; Rosa M. Badia; Eduard Ayguadé; Jesús Labarta

The wide adoption of parallel processing hardware in mainstream computing, as well as the rising interest in efficient parallel programming in the developer community, increases the demand for parallel programming model support for common algorithmic patterns. In this work we present an extension to the OpenMP task construct to add support for reductions in while-loops and general-recursive algorithms. Further, we discuss implications for the OpenMP standard and present a prototype implementation in OmpSs. Benchmark results confirm the applicability of this approach and its scalability on current SMP systems.
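To make the idea of a reduction inside a while-loop concrete, here is a minimal sketch that sums a linked list with one task per node. It uses the `task_reduction` and `in_reduction` clauses standardized in OpenMP 5.0, which are not necessarily the syntax proposed in the paper or implemented in its OmpSs prototype. The `struct node` type and the functions `list_sum` and `list_sum_serial` are hypothetical names introduced only for this illustration. Without OpenMP support the pragmas are ignored and the loop runs sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node type, used only for this illustration. */
struct node {
    long value;
    const struct node *next;
};

/* Sequential reference traversal, used as a debug-build sanity check. */
static long list_sum_serial(const struct node *head)
{
    long sum = 0;
    while (head != NULL) {
        sum += head->value;
        head = head->next;
    }
    return sum;
}

/* Sum the list with one task per node. The taskgroup owns the
 * reduction; each task takes part in it through in_reduction. */
static long list_sum(const struct node *head)
{
    long sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup task_reduction(+: sum)
        {
            const struct node *p = head;
            while (p != NULL) {
                /* p is captured by value when the task is created. */
                #pragma omp task in_reduction(+: sum) firstprivate(p)
                sum += p->value;
                p = p->next;
            }
        }
        /* All tasks have finished and their partial sums are combined
         * once the taskgroup region ends. */
    }

    assert(sum == list_sum_serial(head));
    return sum;
}
```

The trip count of the loop is not known in advance, which is why a worksharing loop with an ordinary `reduction` clause does not fit this pattern directly.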


Conference of the Centre for Advanced Studies on Collaborative Research | 2009

OpenMP tasking analysis for programmers

Xavier Teruel; Christopher Barton; Alejandro Duran; Xavier Martorell; Eduard Ayguadé; Priya Unnikrishnan; Guansong Zhang; Raul Esteban Silvera

As of 2008, the OpenMP 3.0 standard includes task support allowing programmers to exploit irregular parallelism. Although several compilers are providing support for this new feature there has not been extensive investigation into the real possibilities of this extension. Several papers have discussed the programming model itself while other papers have discussed design and implementation on different platforms. There are also papers demonstrating performance results using well known kernel applications. This paper presents an analysis of the OpenMP tasking model possibilities, using the IBM XL compiler implementation. Using different parameters such as the number of tasks, task granularity and parallelism pattern, this paper explores how such parameters can affect the average performance and identifies the limits of the OpenMP tasking model.


International Workshop on OpenMP | 2016

The secrets of the accelerators unveiled: tracing heterogeneous executions through OMPT

Germán Llort; Antonio Filgueras; Daniel Jiménez-González; Harald Servat; Xavier Teruel; Estanislao Mercadal; Carlos Álvarez; Judit Gimenez; Xavier Martorell; Eduard Ayguadé; Jesús Labarta

Heterogeneous systems are an important trend in the future of supercomputers, yet they can be hard to program and developers still lack powerful tools to gain understanding about how well their accelerated codes perform and how to improve them.


International Workshop on OpenMP | 2016

Approaches for Task Affinity in OpenMP

Christian Terboven; Jonas Hahnfeld; Xavier Teruel; Sergi Mateo; Alejandro Duran; Michael Klemm; Stephen L. Olivier; Bronis R. de Supinski

OpenMP tasking supports parallelization of irregular algorithms. Recent OpenMP specifications extended tasking to increase functionality and to support optimizations, for instance with the taskloop construct. However, task scheduling remains opaque, which leads to inconsistent performance on NUMA architectures. We assess design issues for task affinity and explore several approaches to enable it. We evaluate these proposals with implementations in the Nanos++ and LLVM OpenMP runtimes that improve performance up to 40 % and significantly reduce execution time variation.

Collaboration


Dive into Xavier Teruel's collaborations.

Top Co-Authors

Eduard Ayguadé

Barcelona Supercomputing Center

Xavier Martorell

Polytechnic University of Catalonia

Jesús Labarta

Barcelona Supercomputing Center

Jan Ciesko

Barcelona Supercomputing Center

Sergi Mateo

Barcelona Supercomputing Center

Alejandro Duran

Polytechnic University of Catalonia

Vicenç Beltran

Barcelona Supercomputing Center

Germán Llort

Polytechnic University of Catalonia
