Marc Baboulin
University of Paris-Sud
Publication
Featured research published by Marc Baboulin.
Parallel Computing | 2010
Stanimire Tomov; Jack J. Dongarra; Marc Baboulin
If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced with accelerators! This is happening today as accelerators, in particular Graphics Processing Units (GPUs), are steadily making their way into the high performance computing (HPC) world. We highlight the trends leading to the idea of hybrid manycore/GPU systems, and we present a set of techniques that can be used to efficiently program them. The presentation is in the context of Dense Linear Algebra (DLA), a major building block for many scientific computing applications. We motivate the need for new algorithms that would split the computation in a way that would fully exploit the power that each of the hybrid components offers. As the area of hybrid multicore/GPU computing is still in its infancy, we also argue for its importance in view of what future architectures may look like. We therefore envision the need for a DLA library similar to LAPACK but for hybrid manycore/GPU systems. We illustrate the main ideas with an LU-factorization algorithm where particular techniques are used to reduce the amount of pivoting, resulting in an algorithm achieving up to 388 GFlop/s for single and up to 99.4 GFlop/s for double precision factorization on a hybrid Intel Xeon (2x4 cores @ 2.33 GHz) - NVIDIA GeForce GTX 280 (240 cores @ 1.30 GHz) system.
Computer Physics Communications | 2009
Marc Baboulin; Alfredo Buttari; Jack J. Dongarra; Jakub Kurzak; Julie Langou; Julien Langou; Piotr Luszczek; Stanimire Tomov
On modern architectures, 32-bit operations are often at least twice as fast as 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
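The combination of 32-bit and 64-bit arithmetic described in this abstract is commonly realized as iterative refinement: the expensive solves run in single precision, while residuals and solution updates are accumulated in double precision. The following is a minimal NumPy sketch of that general idea, not the authors' implementation; the function name `mixed_precision_solve` and the parameters `max_iter` and `tol` are hypothetical, and the sketch assumes a well-conditioned square system.

```python
import numpy as np

def mixed_precision_solve(A, b, max_iter=10, tol=1e-12):
    """Solve Ax = b with float32 solves and float64 residual correction.

    Illustrative sketch only. A production code would factor A once in
    single precision and reuse the factors for every correction step;
    here np.linalg.solve refactors on each call to keep the code short.
    """
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    A32 = A64.astype(np.float32)

    # Initial solution computed entirely in single precision.
    x = np.linalg.solve(A32, b64.astype(np.float32)).astype(np.float64)

    for _ in range(max_iter):
        # Residual is formed in double precision.
        r = b64 - A64 @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b64):
            break
        # Correction is computed in single precision, applied in double.
        d = np.linalg.solve(A32, r.astype(np.float32))
        x = x + d.astype(np.float64)
    return x
```

Each refinement step costs one matrix-vector product in double precision plus one cheap single-precision solve, which is why the approach can retain most of the 32-bit speed advantage when the matrix is not too ill-conditioned.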
ACM Transactions on Mathematical Software | 2013
Marc Baboulin; Jack J. Dongarra; Julien Herrmann; Stanimire Tomov
We illustrate how linear algebra calculations can be enhanced by statistical techniques in the case of a square linear system Ax = b. We study a random transformation of A that enables us to avoid pivoting and then to reduce the amount of communication. Numerical experiments show that this randomization can be performed at a very affordable computational price while providing us with a satisfying accuracy when compared to partial pivoting. This random transformation called Partial Random Butterfly Transformation (PRBT) is optimized in terms of data storage and flops count. We propose a solver where PRBT and the LU factorization with no pivoting take advantage of the current hybrid multicore/GPU machines and we compare its Gflop/s performance with a solver implemented in a current parallel library.
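As a rough illustration of the idea in this abstract, the sketch below transforms A into U^T A V with two random recursive butterfly matrices, factors the result by LU without pivoting, and recovers x = V y. Everything here is an assumption made for illustration: the helper names are hypothetical, the butterflies are built as dense matrices (the abstract says the real PRBT is optimized for storage and flop count), and the random diagonal entries exp(r/10) with r uniform in [-1/2, 1/2] are one common choice rather than something stated in the text above.

```python
import numpy as np

def butterfly(n, rng):
    """Dense n x n butterfly matrix (n even) with random diagonal blocks."""
    h = n // 2
    r0 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10.0))
    r1 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10.0))
    return np.block([[r0, r1], [r0, -r1]]) / np.sqrt(2.0)

def recursive_butterfly(n, depth, rng):
    """Product of `depth` block-diagonal butterfly levels.

    Requires n to be divisible by 2**depth.
    """
    W = np.eye(n)
    for k in range(depth):
        blocks = 2 ** k
        size = n // blocks
        level = np.zeros((n, n))
        for j in range(blocks):
            s = slice(j * size, (j + 1) * size)
            level[s, s] = butterfly(size, rng)
        W = level @ W
    return W

def lu_nopivot(A):
    """In-place style LU without pivoting; returns L (unit) and U packed."""
    LU = np.array(A, dtype=np.float64)
    n = LU.shape[0]
    for k in range(n - 1):
        LU[k + 1:, k] /= LU[k, k]
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU

def solve_packed(LU, b):
    """Forward and back substitution with the packed factors."""
    n = LU.shape[0]
    y = np.array(b, dtype=np.float64)
    for i in range(n):
        y[i] -= LU[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):
        y[i] = (y[i] - LU[i, i + 1:] @ y[i + 1:]) / LU[i, i]
    return y

def prbt_solve(A, b, depth=2, seed=0):
    """Solve Ax = b via U^T A V, LU without pivoting, then x = V y."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    U = recursive_butterfly(n, depth, rng)
    V = recursive_butterfly(n, depth, rng)
    y = solve_packed(lu_nopivot(U.T @ A @ V), U.T @ b)
    return V @ y
```

A matrix with a zero in its top-left entry breaks plain LU without pivoting at the first step, whereas the randomized system is expected (with high probability, not with certainty) to factor cleanly, which is the motivation for trading pivoting for a cheap random preprocessing step.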
SIAM Journal on Matrix Analysis and Applications | 2007
Mario Arioli; Marc Baboulin; Serge Gratton
We consider here the linear least squares problem $\min_{y \in \mathbb{R}^n}\|Ay-b\|_2$, where $b \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m\times n}$ is a matrix of full column rank
Numerical Linear Algebra With Applications | 2009
Marc Baboulin; Jack J. Dongarra; Serge Gratton; Julien Langou
International Parallel and Distributed Processing Symposium | 2012
Marc Baboulin; Dulceneia Becker; Jack J. Dongarra
International Conference on Conceptual Structures | 2012
Marc Baboulin; Simplice Donfack; Jack J. Dongarra; Laura Grigori; Adrien Rémy; Stanimire Tomov
Parallel Processing and Applied Mathematics | 2011
Dulceneia Becker; Marc Baboulin; Jack J. Dongarra
International Conference on Conceptual Structures | 2013
Yushan Wang; Marc Baboulin; Jack J. Dongarra; Joel Falcou; Yann Fraigneau; Olivier P. Le Maître
International Conference on Conceptual Structures | 2016
Ahmad Abdelfattah; Marc Baboulin; Veselin Dobrev; Jack J. Dongarra; Christopher Earl; Joel Falcou; Azzam Haidar; Ian Karlin; Tzanio V. Kolev; Ian Masliah; Stanimire Tomov