
Publication


Featured research published by Luigi Brochard.


Computer Science - Research and Development | 2010

Optimizing performance and energy of HPC applications on POWER7

Luigi Brochard; Raj Panda; Sid Vemuganti

Power consumption is a critical consideration in high performance computing systems and is becoming the limiting factor in building and operating Petascale and Exascale systems. When studying the power consumption of existing systems running HPC workloads, we find that power, energy and performance are closely related, which makes it possible to optimize energy without sacrificing performance (much or at all). This paper presents the power management features of POWER7 and shows how innovative software can use them to optimize the power and energy consumption of large clusters running HPC workloads. The paper starts by presenting the new features introduced in POWER7 to manage power consumption, along with the tools available to control and record it. We then analyze the power consumption and performance of different HPC workloads at various levels of the POWER7 server (processor, memory, I/O) for different frequencies. We propose a model that predicts both the power and energy consumption of real workloads from their performance characteristics as measured by hardware performance counters (HPM), and show that the power estimation model achieves less than 5% error against actual measurements. In conclusion, we present how an innovative scheduler can help optimize both the power and energy consumption of large HPC clusters.
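
The power and energy model described in this abstract is built from hardware performance counter (HPM) measurements. A minimal sketch of such a counter-based linear power model is shown below; the counter rates, coefficients and calibration constants are hypothetical illustrations and are not the model published in the paper.

```c
#include <stdio.h>

/* Hypothetical linear power model: power is estimated as a weighted sum of
 * per-node activity rates derived from hardware performance counters (HPM).
 * The counter rates and coefficients below are illustrative only; the actual
 * model and its calibration are described in the paper. */
typedef struct {
    double ipc;          /* instructions per cycle           */
    double mem_bw_gbs;   /* memory bandwidth in GB/s         */
    double io_bw_gbs;    /* I/O bandwidth in GB/s            */
} hpm_rates;

static double estimate_power_watts(const hpm_rates *r, double freq_ghz)
{
    /* Illustrative calibration constants (idle power plus activity terms). */
    const double p_idle = 60.0;   /* W, static/idle power          */
    const double k_core = 25.0;   /* W per unit IPC at 1 GHz       */
    const double k_mem  = 1.2;    /* W per GB/s of memory traffic  */
    const double k_io   = 0.4;    /* W per GB/s of I/O traffic     */

    return p_idle
         + k_core * r->ipc * freq_ghz
         + k_mem  * r->mem_bw_gbs
         + k_io   * r->io_bw_gbs;
}

int main(void)
{
    hpm_rates r = { .ipc = 1.8, .mem_bw_gbs = 12.0, .io_bw_gbs = 0.5 };
    printf("estimated power: %.1f W\n", estimate_power_watts(&r, 3.5));
    return 0;
}
```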


international conference on supercomputing | 1990

Designing algorithms on hierarchical memory multiprocessors

Luigi Brochard; Alex Freau

We study the behavior of two numerical algorithms (matrix multiplication and finite difference methods) on RP3, a multiprocessor with a three-level memory hierarchy. Using versions of these algorithms that differ in data placement (global, local, global and cacheable, local and cacheable) and in data access (blocked or non-blocked), we study the impact of these parameters on program performance. The performance analysis uses a very accurate monitoring system (VPMC) that records instructions, memory requests, cache requests and misses. We also perform a theoretical performance analysis of these programs using a model of computation and communication, and find good agreement between theoretical and experimental results. In conclusion, we discuss the use of local memory on such a machine and show that it is not worthwhile given the RP3 ratio of communication speeds between local and global memories. We also discuss the optimal use of cache, show that the optima can only be met under certain cache properties (a private store-in cache with user control of write-back), and show that blocked algorithms must be used to reach them.
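
The blocked data-access versions referred to above rely on tiling so that each tile stays resident in cache (or local memory) while it is reused. A minimal sketch of blocked matrix multiplication, assuming row-major storage and an arbitrary tile size, is given below; it illustrates the general technique only and is not the RP3 code from the paper.

```c
#include <stddef.h>

/* Blocked (tiled) matrix multiplication: C += A * B for n x n matrices in
 * row-major order.  Working on TILE x TILE sub-blocks keeps each block
 * resident in cache while it is reused, which is the blocking idea the
 * paper discusses.  TILE is a tuning parameter, not the value used on RP3. */
#define TILE 32

void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE)
                for (size_t i = ii; i < ii + TILE && i < n; i++)
                    for (size_t k = kk; k < kk + TILE && k < n; k++) {
                        double a = A[i * n + k];   /* reused across j */
                        for (size_t j = jj; j < jj + TILE && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```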


Concurrency and Computation: Practice and Experience | 1992

Designing algorithms on RP3

Luigi Brochard; Alex Freau

We study the behavior of two numerical algorithms (matrix multiplication and a finite difference method) on RP3, a multiprocessor with a three-level memory hierarchy. Using versions of these algorithms that differ in data placement (global, local, global and cacheable, local and cacheable) and in data access (blocked or non-blocked), we study the impact of these parameters on program performance. The performance analysis uses a very accurate monitoring system (VPMC) that records instructions, memory requests, cache requests and misses. We also perform a theoretical performance analysis of these programs using a model of computation and communication, and find good agreement between theoretical and experimental results. In conclusion, we discuss the use of local memory on such a machine and show that it is ineffective at the RP3 ratios of cache, local memory and global memory communication speeds. We also discuss the optimal use of cache, show that the optima can only be realized under certain cache properties (a private store-in cache with user control of write-back), and show that blocked algorithms must be used to reach them. Comparing the programming of shared and distributed memory multiprocessors, we remark that optimized algorithms for shared memory systems rely on the same blocking techniques used to program distributed memory systems, leading to a common programming paradigm.


Concurrency and Computation: Practice and Experience | 1992

Computation and data movement on RP3

Luigi Brochard; Alex Freau

We present in this paper a study of computation and communication costs on RP3 and of some issues in algorithm design on a multiprocessor with a three-level memory hierarchy. Using very simple algorithms (vector-add, vector-sum, saxpy, … ), we compare implementations that differ in data localization (global or local) and data cacheability (cacheable or non-cacheable). The comparison uses a performance monitoring system (VPMC) that records instructions, data movement, cache requests and misses. The output of the VPMC is then used as input to an analytical performance model from which we compute the elemental computation and communication times of each basic algorithm. Regarding cacheability (marking data cacheable instead of non-cacheable), we find it worthwhile as long as the data are blocked adequately: for our simple 1-D data structures, a block size equal to a multiple of the cache line size gives the best results, although, considering possible load imbalance, a block size equal to the cache line seems optimal. Regarding localization (copying data from global to local memory, working on the local copy and copying the results back), we find it ineffective, at least at the RP3 local and global communication speed ratios (1:10:15).
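
As an illustration of the 1-D blocking discussed above, the sketch below processes a saxpy kernel in blocks whose size is a multiple of the cache line; the line size and block factor are assumptions chosen for illustration, not the RP3 parameters.

```c
#include <stddef.h>

/* Blocked saxpy (y = a*x + y) over single-precision vectors, processed in
 * blocks whose size is a multiple of the cache line -- the case the paper
 * found to give the best results for simple 1-D data.  The line size used
 * here (64 bytes, i.e. 16 floats) and the block factor are assumptions. */
#define FLOATS_PER_LINE 16
#define BLOCK (4 * FLOATS_PER_LINE)

void saxpy_blocked(size_t n, float a, const float *x, float *y)
{
    for (size_t start = 0; start < n; start += BLOCK) {
        size_t end = (start + BLOCK < n) ? start + BLOCK : n;
        for (size_t i = start; i < end; i++)
            y[i] = a * x[i] + y[i];
    }
}
```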


Computers & Chemical Engineering | 1998

Hardware and software perspectives in engineering computing

Luigi Brochard

What are the perspectives for engineering computing? With advances in silicon technology, Teraflops systems built from off-the-shelf microprocessors will become mainstream. The bigger challenge will be managing the Petabytes of data those systems will produce; new technologies to store, retrieve and analyse them are needed.


Informatik Spektrum | 2006

High performance computing technology, applications and business

Luigi Brochard

High Performance Computing (HPC) was born in the mid 1970s with the emergence of vector supercomputers. It has since evolved with technology and business, progressively enlarging its scope of application. In this paper, we describe the fundamental concepts at the core of HPC, their evolution, the way they are used today in real applications, how these applications are evolving, and how application and technology are transforming business.


Archive | 2012

Energy-aware job scheduling for cluster environments

Robert H. Bell; Luigi Brochard; Donald R. DeSota; Rajendra D. Panda; Francois Thomas


international conference on supercomputing | 2014

A Case Study of Energy Aware Scheduling on SuperMUC

Axel Auweter; Arndt Bode; Matthias Brehm; Luigi Brochard; Nicolay Hammer; Herbert Huber; Raj Panda; Francois Thomas; Torsten Wilde


Archive | 2008

Workload performance projection via surrogate program analysis for future information handling systems

Robert H. Bell; Luigi Brochard; Donald R. DeSota; Venkat R. Indukuru; Rajendra D. Panda; Sameh S. Sharkawi


Archive | 2013

Ensuring performance of a computing system

Luigi Brochard; Rajendra D. Panda; Francois Thomas
