Stéphane Mancini
Grenoble Institute of Technology
Publication
Featured research published by Stéphane Mancini.
EURASIP Journal on Embedded Systems | 2008
Nicolas Gac; Stéphane Mancini; Michel Desvignes; Dominique Houzet
Back-projection (BP) is a computationally costly step in tomographic image reconstruction such as positron emission tomography (PET). To reduce computation time, this paper presents a pipelined, prefetch, and parallelized architecture for PET BP (3PA-PET). The key feature of this architecture is its original memory access strategy, which masks the high latency of the external memory. Indeed, the pattern of memory references to the acquired data hinders the processing unit. The memory access bottleneck is overcome by exploiting the intrinsic temporal and spatial locality of the BP algorithm. Loop reordering allows efficient use of general-purpose processor caches in software implementations, and of the 3D predictive and adaptive cache (3D-AP cache) in hardware implementations. Parallel hardware pipelines are also efficient thanks to a hierarchical 3D-AP cache: each pipeline performs a memory reference in about one clock cycle, reaching a computational throughput close to 100%. The 3PA-PET architecture is prototyped on a system on programmable chip (SoPC) to validate the system and to measure its expected performance. Its timing performance is compared with that of a desktop PC, a workstation, and a graphics processing unit (GPU).
ACM Symposium on Applied Computing | 2006
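To make the memory access pattern mentioned in this abstract concrete, here is a minimal sketch of plain pixel-driven 2D back-projection. It is not the 3PA-PET architecture or its cache; the function name `backproject`, the nearest-neighbour lookup, and the toy sinogram are assumptions chosen for illustration. For each angle, every pixel reads one detector bin whose index depends on the pixel position, so neighbouring pixels tend to read neighbouring bins, which is the kind of locality a cache can exploit.

```python
import numpy as np

def backproject(sinogram, angles, size):
    """Pixel-driven 2D back-projection with nearest-neighbour lookup.

    sinogram: array of shape (n_angles, n_bins)
    angles:   projection angles in radians, shape (n_angles,)
    size:     side length of the square output image
    """
    n_angles, n_bins = sinogram.shape
    image = np.zeros((size, size))
    # Pixel coordinates centred on the image.
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    centre = (n_bins - 1) / 2.0
    for a in range(n_angles):
        # Detector bin hit by each pixel for this angle.
        s = x * np.cos(angles[a]) + y * np.sin(angles[a]) + centre
        # Simplification: positions outside the detector are clamped to the edge bins.
        idx = np.clip(np.rint(s).astype(int), 0, n_bins - 1)
        # Data-dependent gather from the sinogram row, then accumulate.
        image += sinogram[a, idx]
    return image

# Toy input (assumption): a constant sinogram with 4 angles and 12 bins.
angles = np.linspace(0.0, np.pi, 4, endpoint=False)
image = backproject(np.ones((4, 12)), angles, size=8)
```

With a constant sinogram, every pixel accumulates one unit per angle, which gives a quick sanity check of the indexing.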
Nicolas Gac; Stéphane Mancini; Michel Desvignes
Reducing image reconstruction time is necessary to spread the use of PET in research and routine clinical practice. To this end, this article presents a hardware/software architecture for accelerating 3D backprojection, based on an efficient 2D backprojection. The architecture is designed to provide a high level of parallelism through efficient management of memory accesses, which would otherwise be strongly slowed by the external memory. The reconstruction system is embedded in a System on Programmable Chip (SoPC) platform, the new generation of reconfigurable circuits. The originality of this architecture lies in the design of a 2D Adaptive and Predictive Cache (2D-AP Cache), which has proved to be an efficient way to overcome the memory access bottleneck. Thanks to a hierarchical use of this cache, several backprojection operators can run in parallel, which notably accelerates the reconstruction process. This 2D reconstruction system will next be used to speed up 3D image reconstruction.
Rapid System Prototyping | 2009
Zahir Larabi; Yves Mathieu; Stéphane Mancini
In this paper, we propose a low-cost n-dimensional cache (nD-Cache) architecture for FPGA-based image and signal processing systems on chip (SoCs). The architecture allows efficient access to structured data such as 2D or 3D images. We developed a theoretical model of the architecture, which provides a methodology for defining the cache's practical implementation from the application and system parameters. Complexity and performance are measured for selected image processing algorithms, such as jumping snake and 2D back-projection, and compared with classical solutions such as associative caches. The architecture is shown to be efficient for tracking algorithms by exploiting spatial and temporal locality. Numerical results indicate that a 50% improvement in run-time performance can be achieved.
Federated Conference on Computer Science and Information Systems | 2016
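As a rough illustration of why a cache organised along image dimensions can help, the toy simulation below compares two ways of grouping 16 pixels into a cache block: a 16x1 line and a 4x4 tile. This is not the nD-Cache architecture or its theoretical model; the LRU policy, the block shapes, the capacity, and the column-walk access pattern are all assumptions made for the sketch.

```python
from collections import OrderedDict

def hit_rate(accesses, block_of, capacity):
    """Hit rate of a fully associative LRU cache holding `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    for x, y in accesses:
        block = block_of(x, y)
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used block
    return hits / len(accesses)

def line_block(x, y):
    """Hypothetical 1D organisation: blocks of 16x1 pixels along x."""
    return (x // 16, y)

def tile_block(x, y):
    """Hypothetical 2D organisation: blocks of 4x4 pixels."""
    return (x // 4, y // 4)

# Walk down each column of a 64x64 image: x fixed, y increasing.
walk = [(x, y) for x in range(64) for y in range(64)]

line_rate = hit_rate(walk, line_block, capacity=8)
tile_rate = hit_rate(walk, tile_block, capacity=8)
```

For this particular walk and capacity, the line organisation never hits (each line is evicted before the next column reuses it), while the tile organisation hits on three of every four accesses. Real access patterns such as tracking or back-projection are less regular, so the numbers here only indicate the direction of the effect.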
Khadija Hadj Salem; Yann Kieffer; Stéphane Mancini
The design of embedded vision systems poses a difficult challenge regarding the access times of memories holding image data for certain kinds of image processing. This paper studies the optimization challenge posed by the efficient operation of the ad hoc memory systems that electronic designers have proposed to alleviate this problem. New algorithms are proposed for producing solutions to this 3-objective problem, and numerical experiments are conducted on real-world data to validate their efficiency.
International Symposium on Biomedical Imaging | 2008
Yannick Grondin; Michel Desvignes; Laurent Desbat; Stéphane Mancini; Marie-Laure Gallin-Martel; Laurent Gallin-Martel; Olivier Rossetto
Monte Carlo simulations of a novel-concept PET detector for small animal imaging are presented. The scintillation medium of the detector is liquid xenon, whose detection characteristics rival those of common scintillator crystals. Moreover, the axial geometry of the detector enables depth-of-interaction measurement. A detector module has been built and an experimental test bench has been developed. Simulations of the test bench made it possible to determine the methods to use for analysing the experimental data. They also indicate the spatial resolution in the axial direction and the energy resolution that can be expected from the detector. The results show an axial resolution of 2.87 ± 0.12 mm and an energy resolution of 7.59 ± 0.34%.
IEEE Nuclear Science Symposium | 2006
Yannick Grondin; Laurent Desbat; Michel Defrise; Thomas Rodet; Nicolas Gac; Michel Desvignes; Stéphane Mancini
The aim of this work is to use sampling theory to guide mashing schemes in multislice mode (2D) PET for multi-ring scanners. We adapt previous work on fan beam sampling to PET geometry. This results in a change of the shape of the essential support of the Fourier transform of the 3D fan beam X-ray transform. The use of square detectors in 2D multi-ring PET induces an oversampling in the transverse plane. A mashing scheme is therefore suggested that takes advantage of the oversampling to produce an efficient hexagonal sampling scheme. We add on average 24.5 LORs to form one single barycentric LOR according to a specific mashing scheme in each transverse plane. Thus, the number of LORs is significantly reduced while satisfying the hexagonal sampling conditions. Although this process causes a loss of resolution, the proposed mashing scheme optimises the trade-off between data compression and resolution.
International Symposium on Industrial Embedded Systems | 2016
Khadija Hadj Salem; Yann Kieffer; Stéphane Mancini
In the field of embedded vision systems, meeting the constraints on design criteria such as performance, area, and power consumption can be a real challenge. In fact, to alleviate the well-known “Memory Wall”, it is mandatory to provide efficient memory hierarchies so that the system being designed reaches usable performance when it has to handle non-linear image processing. To address this problem, Mancini and Rousseau (Proc. DATE, 2012) have designed a software generator of memory hierarchies for each non-linear image operation. It dramatically improves the performance of the system while moderately increasing its area and energy consumption. The trade-offs between these three parameters are then taken to the level of the design of the operation of this memory hierarchy, a problem that can be formalized as a 3-objective optimization problem. In this study, we formalize this problem and give new approaches both for the problem and for particular sub-problems. The results on the same real-world data set as used by Mancini and Rousseau (Proc. DATE, 2012) show a very significant improvement: the amount of transferred data is reduced by up to 30% and the computing time by up to 15%.
Conference on Design and Architectures for Signal and Image Processing | 2016
Khadija Hadj Salem; Yann Kieffer; Stéphane Mancini
Embedded vision systems design faces a memory-wall kind of challenge: images are big, and therefore the memories containing them have high latency; and still, high performance is desired. For the case of non-linear processing, Mancini and Rousseau (Proc. DATE 2012) have designed a software generator of ad hoc memory hierarchies, called Memory Management Optimization (MMOpt). While the performance of the generated circuits is very good, design-time decisions have to be made regarding their operation in order to finely handle the compromise between the usual metrics of design area, energy consumption, and performance. This study tackles the optimization challenge set by the design of the operational behavior of the memory hierarchy generated by MMOpt. After a precise formulation as a 3-objective optimization problem is given, two algorithms are proposed, and their performance is analyzed on real-world processing tasks against the previously proposed algorithms. The results show a reduction of the amount of transferred data by 17% on average, and of the computing times by 11.7%, for the same design area.
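A 3-objective problem like the one described here generally has no single best solution, only a set of non-dominated trade-offs. The sketch below shows a generic Pareto filter over (area, energy, time) triples, all to be minimised. It is not one of the algorithms proposed in the paper, and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (simple O(n^2) filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (area, energy, time) triples, all to be minimised.
candidates = [(4, 9, 7), (5, 5, 5), (6, 6, 6), (3, 10, 8), (5, 5, 9)]
front = pareto_front(candidates)
# (6, 6, 6) and (5, 5, 9) are dropped: (5, 5, 5) is at least as good on every objective.
```

Fixing one objective, as in a comparison "for the same design area", amounts to looking at a slice of this front.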
Proceedings of SPIE | 2008
Nicolas Gac; Stéphane Mancini; Michel Desvignes; Florian Deboissieu; Anthonin Reilhac
Forward and backward projections are two computationally costly steps in tomographic image reconstruction such as Positron Emission Tomography (PET). To reduce reconstruction time, a hardware projection/backprojection pair has been built following algorithm architecture adequacy principles. Thanks to an original memory access strategy based on a 3D adaptive and predictive memory cache, the external memory wall has been overcome. Thus, for both projector architectures, several units run efficiently. Each unit reaches a computational throughput close to 1 operation per cycle. In this paper, we present how, from our hardware projection/backprojection pair, an analytic (3D-RP) and an iterative (3D-EM) reconstruction algorithm can be implemented on a System on Programmable Chip (SoPC). First, a hardware/software partitioning is done based on the different steps of each algorithm. Then the reconstruction system is composed of two hardware configurations of the programmable logic resources (FPGA), corresponding mainly to the projection step and the backprojection step. Our projector/backprojector has been validated with a software 3D-RP and 3D-EM reconstruction on simulated PET-SORTEO data. The reconstruction time of these systems is evaluated from the measured performance of our projector IPs and the estimated performance of the additional simple hardware IPs. The expected reconstruction time is compared with that of the software tomography distribution STIR. A speed-up of 7 can be expected for the 3D-RP algorithm and a speed-up of 3.5 for the 3D-EM algorithm. For both algorithms, the expected cycle efficiency of the architecture is far greater than that of the software implementation: 120 times for 3D-RP and 60 times for 3D-EM.
Archive | 2007
Denis Beautemps; Laurent Girin; Noureddine Aboutabit; Gérard Bailly; Laurent Besacier; Gaspard Breton; Alice Caplier; Marie-Agnès Cathiard; Denis Chêne; Jeanne Clarke; Frédéric Elisei; Oxana Govokhina; Christian Jutten; Viet-Bac Le; Martine Marthouret; Stéphane Mancini; Yves Mathieu; Pascal Perret; Bertrand Rivet; Pablo Sacher; Christophe Savariaux; Sebastien Schmerber; Jean-François Serignat; Mélody Tribout; Sylvie Vidal