Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Brice Videau is active.

Publication


Featured research published by Brice Videau.


International Parallel and Distributed Processing Symposium | 2006

A tool for environment deployment in clusters and light grids

Yiannis Georgiou; Julien Leduc; Brice Videau; Johann Peyrard; Olivier Richard

Centered on the operation and administration of large-scale high-performance parallel systems, this article describes work carried out on environment deployment for high-performance computing clusters and grids. We first present the problems involved in installing an environment (OS, middleware, libraries, applications...) on a cluster or grid, and show how an effective deployment tool, Kadeploy2, can become a new way of operating this type of infrastructure. We present the tool's design choices and its architecture, and we describe the various stages of the deployment method introduced by Kadeploy2. Moreover, we propose methods for reducing the deployment time of a new environment on the one hand, and for supporting various operating systems on the other. Finally, to validate our approach, we present tests and evaluations carried out on several clusters of the experimental grid Grid'5000.


International Conference on Parallel Processing | 2013

Optimizing 3D convolutions for wavelet transforms on CPUs with SSE units and GPUs

Brice Videau; Vania Marangozova-Martin; Luigi Genovese; Thierry Deutsch

Optimizing convolution operators is an important issue, as they are used in numerous domains including electromagnetic computations, image processing and nanosimulations. In this paper we present our optimizations of the 3D convolutions in the BigDFT nanosimulation software. We focus on processors with vector units and on GPU acceleration, and experiment with several architectures. Exploiting the relation between algorithmic specifics and hardware architecture, we obtain performance gains of around 2x on CPUs and up to 20x on GPUs.


International Workshop on OpenMP | 2012

Overlapping computations with communications and I/O explicitly using OpenMP-based heterogeneous threading models

Sadaf R. Alam; Gilles Fourestey; Brice Videau; Luigi Genovese; Stefan Goedecker; Nazim Dugan

Holistic tuning and optimization of hybrid MPI and OpenMP applications is becoming a focus for parallel code developers as the number of cores and hardware threads in the processing nodes of high-end systems continues to increase. For example, a Cray XE6 node with Interlagos processors supports 32 hardware threads, while the IBM Blue Gene/Q system supports up to 64 threads per node. Note that, by default, OpenMP threads and MPI tasks are pinned to processor cores on these high-end systems, and throughout the paper we assume fixed bindings of threads to physical cores. A number of OpenMP runtimes also support user-specified bindings of threads to physical cores. Parallel and node efficiencies on these systems for hybrid MPI and OpenMP applications largely depend on balancing and overlapping computation and communication workloads. This issue is further intensified when the nodes have a non-uniform memory access (NUMA) model and I/O accelerator devices. In these environments, where access to I/O devices such as GPUs for code acceleration and the network interface for MPI communication and parallel file I/O is managed and scheduled by a host CPU, application developers can introduce innovative solutions to overlap CPU and I/O operations and thereby improve node and parallel efficiencies. For example, in a production-level application called BigDFT, the developers have introduced a master-slave model to explicitly overlap blocking collective communication operations with local multi-threaded computation. Similarly, some applications parallelized with MPI, OpenMP and GPU acceleration could assign a management thread for GPU data and control orchestration and an MPI control thread for communication management while the CPU threads perform overlapping calculations, and a background thread could potentially be set aside for file-I/O-based fault tolerance.

Considering these emerging application design needs, we would like to motivate the OpenMP standards committee, through examples and empirical results, to introduce thread and task heterogeneity in the language specification. This would allow code developers, especially those programming for large-scale distributed-memory HPC systems and accelerator devices, to design and develop portable solutions with overlapping control and data flow for their applications without resorting to custom solutions.


Complex, Intelligent and Software Intensive Systems | 2009

PaSTeL: Parallel Runtime and Algorithms for Small Datasets

Brice Videau; Erik Saule; Jean-François Méhaut

In this paper, we put forward PaSTeL, an engine dedicated to parallel algorithms. PaSTeL offers both a programming model for building parallel algorithms and an execution model based on work-stealing. Special care has been taken to use optimized thread activation and synchronization mechanisms. To illustrate the use of PaSTeL, a subset of the STL's algorithms was implemented and then used in performance experiments. PaSTeL's performance is evaluated on a dual-core laptop as well as on a 16-core platform. PaSTeL shows better performance than other implementations of the STL, especially on small datasets.


International Journal of High Performance Computing Applications | 2018

BOAST: A metaprogramming framework to produce portable and efficient computing kernels for HPC applications

Brice Videau; Kevin Pouget; Luigi Genovese; Thierry Deutsch; Dimitri Komatitsch; Frédéric Desprez; Jean-François Méhaut

The portability of real high-performance computing (HPC) applications to new platforms is an open and very delicate problem. In particular, the performance portability of the underlying computing kern...


Comptes Rendus Mécanique | 2011

Daubechies wavelets for high performance electronic structure calculations: The BigDFT project

Luigi Genovese; Brice Videau; Matthieu Ospici; Thierry Deutsch; Stefan Goedecker; Jean-François Méhaut


International Conference on Networks | 2007

Toward an experiment engine for lightweight grids

Brice Videau; Corinne Touati; Olivier Richard


International Parallel and Distributed Processing Symposium | 2017

Characterizing the Performance of Modern Architectures Through Opaque Benchmarks: Pitfalls Learned the Hard Way

Luka Stanisic; Lucas Mello Schnorr; Augustin Degomme; Franz Heinrich; Arnaud Legrand; Brice Videau


Archive | 2016

Wavelet-Based Density Functional Theory on Massively Parallel Hybrid Architectures

Luigi Genovese; Brice Videau; Damien Caliste; Jean-François Méhaut; Stefan Goedecker; Thierry Deutsch


ESAIM: Proceedings | 2018

Building and Auto-Tuning Computing Kernels: Experimenting with Boast and Starpu in the Gysela Code

Julien Bigot; Virginie Grandgirard; Guillaume Latu; Jean-François Méhaut; Luís Felipe Millani; Chantal Passeron; Steven Quinito Masnada; Jérôme Richard; Brice Videau

Collaboration


Dive into Brice Videau's collaborations.

Top Co-Authors

Luigi Genovese
European Synchrotron Radiation Facility

Erik Saule
University of North Carolina at Charlotte

Olivier Richard
French Institute for Research in Computer Science and Automation

Frédéric Desprez
École normale supérieure de Lyon