Bernard Goossens
University of Perpignan
Publication
Featured research published by Bernard Goossens.
international conference on conceptual structures | 2013
Bernard Goossens; David Parello
We analyse the capacity of different running models to benefit from Instruction-Level Parallelism (ILP). First, we show what blocks the capture of distant ILP: the keys to capturing it are i) fetching in parallel, ii) renaming memory references and iii) removing parasitic true dependencies due to stack management. Second, we measure the potential of a new running model, named speculative forking, in which a run is dynamically multi-threaded by forking at every function and loop entry frontier, and threads communicate to link renamed consumers to their producers. We show that a run can be automatically parallelized by speculative forking and extended renaming. Most of the distant ILP, which increases with the data size, can be captured for properly compiled programs based on parallel algorithms.
parallel computing | 2010
Bernard Goossens; Philippe Langlois; David Parello; Eric Petit
We introduce and describe PerPI, a software tool that analyzes the instruction-level parallelism (ILP) of a program. ILP measures the best potential of a program to run in parallel on an ideal machine, that is, a machine with infinite resources. PerPI is a programmer-oriented tool whose purpose is to improve the understanding of how the algorithm and the (micro-)architecture will interact. PerPI fills the gap between the manual analysis of an abstract algorithm and implementation-dependent profiling tools. The current version provides reproducible measures of the average number of instructions per cycle executed on an ideal machine, histograms of these instructions and associated data-flow graphs for any x86 binary file. We illustrate how these measures explain the actual performance of core numerical subroutines when measured run times cannot be correlated with the classical flop count analysis.
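The idea of instructions per cycle on an ideal machine can be illustrated with a small sketch. This is not PerPI's implementation: the trace format (a destination register plus source registers), the unit latency per instruction, and the name `ideal_ilp` are all assumptions made for illustration. Each instruction is scheduled one cycle after the latest producer of its sources, and ILP is the instruction count divided by the length of the longest dependency chain.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical trace format: (destination register, source registers).
Instr = Tuple[str, Sequence[str]]


def ideal_ilp(trace: List[Instr]) -> float:
    """Average instructions per cycle on an ideal machine.

    Assumptions (not taken from PerPI): unlimited resources, one cycle
    per instruction, and only true (read-after-write) dependencies
    delay an instruction.
    """
    ready: Dict[str, int] = {}  # register -> cycle producing its latest value
    depth = 0                   # length of the longest dependency chain so far
    for dest, srcs in trace:
        # An instruction runs one cycle after its latest producer.
        cycle = 1 + max((ready.get(s, 0) for s in srcs), default=0)
        ready[dest] = cycle
        depth = max(depth, cycle)
    return len(trace) / depth if depth else 0.0


# Summing four values: sequential accumulation versus pairwise (tree) summation.
trace_seq = [("s", ["a0", "a1"]), ("s", ["s", "a2"]), ("s", ["s", "a3"])]
trace_tree = [("t0", ["a0", "a1"]), ("t1", ["a2", "a3"]), ("s", ["t0", "t1"])]

print(ideal_ilp(trace_seq))   # 1.0: every addition waits for the previous one
print(ideal_ilp(trace_tree))  # 1.5: the first two additions share a cycle
```

Both traces perform three additions, so a flop count cannot tell them apart, while the ideal-machine measure can.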
Technique Et Science Informatiques | 2006
Bernard Goossens; David Defour
This article presents an algorithm to perform a distributed computation of the instructions, suited to high-degree superscalar microarchitectures. The method relies on a partitioning of both the register file and the reservation stations in order to decrease the number of register file access ports and the number of station comparators. Matching the results with the dependent sources is no longer global but point to point, thanks to an identification of the instructions and their components. By limiting the access resources of each renaming register to four ports, the method keeps the access time below the cycle time despite an increase in the number of registers.
Future Generation Computer Systems | 2005
Bernard Goossens; David Defour
In this paper, we address the issue of feeding future superscalar processor cores with enough instructions. Hardware techniques targeting an increase in the instruction fetch bandwidth have been proposed, such as the trace cache microarchitecture. We present a microarchitecture solution based on a register file holding basic blocks of instructions. This solution places the instruction memory hierarchy out of the cycle-determining path. We call our approach the instruction register file (IRF). We evaluate our approach with a SimpleScalar-based simulator run on the Mediabench benchmark suite and compare it to the trace cache performance on the same benchmarks. We show that on this benchmark suite, an IRF-based processor fetching up to three basic blocks per cycle outperforms a trace-cache-based processor fetching 16-instruction-long traces by 25% on average.
ComPAS: Conférence en Parallélisme, Architecture et Système | 2014
Katarzyna Porada; David Parello; Bernard Goossens
computational science and engineering | 2013
Philippe Langlois; Bernard Goossens; David Parello
Numerical Software: Design, Analysis and Verification | 2012
Bernard Goossens; Philippe Langlois; David Parello; Kathy Porada
Archive | 2012
Philippe Langlois; David Parello; Bernard Goossens; Kathy Porada
Archive | 2004
David Defour; Bernard Goossens
2017 First International Conference on Embedded & Distributed Systems (EDiS) | 2017
Bernard Goossens