Pierre Manneback
University of Mons
Publications
Featured research published by Pierre Manneback.
IEEE Transactions on Computers | 2014
Paulo Ricardo Possa; Sidi Ahmed Mahmoudi; Naim Harb; Carlos Valderrama; Pierre Manneback
This work presents a new flexible, parameterizable architecture for image and video processing with reduced latency and memory requirements, supporting variable input resolutions. The proposed architecture is optimized for feature detection, specifically the Canny edge detector and the Harris corner detector. The architecture contains neighborhood extractors and threshold operators that can be parameterized at runtime. Algorithm simplifications are also employed to reduce mathematical complexity, memory requirements, and latency without sacrificing reliability. Furthermore, we present an implementation of the proposed architecture on an FPGA-based platform, together with an analogous optimized GPU implementation for comparison. A performance analysis of the FPGA and GPU implementations, along with a CPU reference implementation, shows that the proposed architecture achieves competitive throughput even at a much lower clock frequency than those of the GPU and the CPU. The results also show a clear advantage of the proposed architecture in terms of power consumption, latency, and memory requirements, while maintaining reliable performance on noisy images.
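The Harris corner detector mentioned above can be sketched in a few lines of NumPy. This is a minimal, unoptimized reference for the response computation only (central-difference gradients, a simple 3x3 box window for the structure tensor, and the conventional sensitivity k = 0.04), not the paper's FPGA/GPU architecture:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is the
    per-pixel structure tensor built from image gradients (3x3 box window)."""
    img = img.astype(np.float64)
    # Central-difference gradients (borders left at zero).
    Ix = np.zeros_like(img)
    Iy = np.zeros_like(img)
    Ix[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    Iy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0

    def box3(a):
        # Sum over each pixel's 3x3 neighbourhood via zero padding.
        p = np.pad(a, 1)
        n, m = a.shape
        return sum(p[i:i + n, j:j + m] for i in range(3) for j in range(3))

    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace * trace
```

On a synthetic image containing a bright square, the response is strongly positive at the square's corners, near zero in flat regions, and negative along edges, which is the behaviour a thresholding stage relies on.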
international conference on cluster computing | 2010
Sidi Ahmed Mahmoudi; Fabian Lecron; Pierre Manneback; Mohammed Benjelloun; Saïd Mahmoudi
The segmentation of cervical vertebrae in X-ray radiographs can give valuable information for the study of vertebral mobility. One particular characteristic of X-ray images is that they present very low grey-level variation, which makes segmentation difficult to perform. In this paper, we propose a segmentation procedure based on the Active Shape Model to deal with this issue. However, this application is seriously hampered by its considerable computation time. We show how vertebra extraction can be performed efficiently by exploiting the vast processing power of Graphics Processing Units (GPUs). We propose a CUDA-based GPU implementation of the most computationally intensive processing steps to boost performance. Experiments conducted on a set of high-resolution X-ray medical images show a global speedup ranging from 15 to 21 compared with the CPU implementation.
International Journal of Biomedical Imaging | 2011
Fabian Lecron; Sidi Ahmed Mahmoudi; Mohammed Benjelloun; Saïd Mahmoudi; Pierre Manneback
This work concerns vertebra segmentation. The method we propose is based on the Active Shape Model (ASM). An original approach taking advantage of edge polygonal approximation was developed to locate vertebra positions in an X-ray image. Although the segmentation results show good accuracy, computation time is a key variable that always has to be optimized in a medical context. Therefore, we show how vertebra extraction can be performed efficiently by exploiting the full computing power of parallel (GPU) and heterogeneous (multi-CPU/multi-GPU) architectures. We propose a parallel hybrid implementation of the most computationally intensive steps to boost performance. Experiments conducted on a set of high-resolution X-ray medical images show a global speedup ranging from 3 to 22 compared with the CPU implementation. Data transfer times between CPU and GPU memories are included in the reported execution times.
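A heterogeneous multi-CPU/multi-GPU run of the kind described above needs some rule for dividing the work across devices of unequal speed. A common static strategy, shown here as an illustrative sketch (the paper does not specify its balancing rule), is to apportion work units proportionally to each device's measured processing rate:

```python
def proportional_split(n_items, rates):
    """Assign n_items work units to devices in proportion to their measured
    processing rates, using largest-remainder apportionment so the counts
    sum exactly to n_items."""
    total = sum(rates)
    exact = [n_items * r / total for r in rates]   # ideal fractional shares
    counts = [int(e) for e in exact]               # floor of each share
    leftover = n_items - sum(counts)
    # Hand the remaining units to the devices with the largest fractional parts.
    by_frac = sorted(range(len(rates)), key=lambda i: exact[i] - counts[i],
                     reverse=True)
    for i in by_frac[:leftover]:
        counts[i] += 1
    return counts
```

With one CPU rated 1 and two GPUs rated 2 each, 10 images split as 2/4/4, so all three devices finish at roughly the same time.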
conference on decision and control | 1992
L. Wang; Gaetan Libert; Pierre Manneback
An algorithm for the discrete-time linear filtering problem is developed. The crucial component of this algorithm is the computation of the singular value decomposition (SVD) of an unsymmetric matrix without explicitly forming its left factor, which has high dimension. The algorithm has good numerical stability and can handle correlated measurement noise without any additional transformation. Since the algorithm is formulated in terms of vector-matrix and matrix-matrix operations, it is also well suited to parallel computers. A numerical example is given.
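The key trick of avoiding the high-dimensional left factor can be illustrated with a standard QR-then-SVD step (a generic sketch of the idea, not the paper's filter): for a tall matrix A, factor A = QR and take the SVD of the small triangular factor R, which shares A's singular values; the large left singular-vector matrix is never built.

```python
import numpy as np

def thin_svd_via_qr(A):
    """Singular values and right singular vectors of a tall matrix A (m >> n)
    without forming the m x m left factor: A = QR, then SVD of the n x n
    factor R. Since Q has orthonormal columns, A^T A = R^T R, so R and A
    share singular values and right singular vectors. The left singular
    vectors, if ever needed, would be Q @ U_r, but are not built here."""
    Q, R = np.linalg.qr(A, mode="reduced")
    U_r, s, Vt = np.linalg.svd(R)
    return s, Vt
```

This keeps all dense work on n x n matrices, which is what makes the approach attractive when the state or measurement history dimension m is large.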
Siam Journal on Scientific and Statistical Computing | 1986
G. H. Golub; Pierre Manneback; Ph. L. Toint
The purpose of this paper is to describe and compare some numerical methods for solving large-dimensional linear least squares problems that arise in geodesy and, more specifically, from Doppler positioning. The methods considered are direct orthogonal decomposition and the combination of conjugate-gradient-type algorithms with projections, as well as the exploitation of “Property A”. Numerical results are given, and the respective advantages of the methods are discussed with respect to parameters such as CPU time, input/output, and storage requirements. Extensions of the results to more general problems are also discussed.
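The conjugate-gradient-type approach referred to above is typified by CGLS, which applies conjugate gradients to the normal equations A^T A x = A^T b without ever forming A^T A. This is a textbook sketch of that family of methods, not the paper's specific projected variant:

```python
import numpy as np

def cgls(A, b, iters=100, tol=1e-10):
    """Conjugate gradients on the normal equations A^T A x = A^T b.
    Only products with A and A^T are needed, so A^T A is never formed --
    the property that makes the method attractive for large sparse
    least squares problems."""
    x = np.zeros(A.shape[1])
    r = b - A @ x          # residual in data space
    s = A.T @ r            # residual of the normal equations
    p = s.copy()
    gamma = s @ s
    for _ in range(iters):
        q = A @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        if gamma_new < tol:
            break
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x
```

In exact arithmetic CGLS converges in at most n iterations for an n-column system, which is why iteration counts, rather than factorization cost, dominate the CPU-time comparison for such methods.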
international conference on image processing | 2012
Sidi Ahmed Mahmoudi; Pierre Manneback
Image processing algorithms are a necessary tool in various domains related to computer vision, such as video surveillance, medical imaging, and pattern recognition. However, these algorithms are hampered by their high consumption of both computing power and memory, which increases significantly when processing large sets of images. In this work, we propose a development scheme enabling efficient exploitation of parallel (GPU) and heterogeneous (multi-CPU/multi-GPU) platforms to improve the performance of single- and multiple-image processing algorithms. The proposed scheme allows full exploitation of hybrid platforms through efficient scheduling strategies. It also enables overlapping data transfers with kernel executions using the CUDA streaming technique across multiple GPUs. We also present parallel and heterogeneous implementations of several feature extraction algorithms, such as edge and corner detection. Experiments conducted on a set of high-resolution images show a global speedup ranging from 5 to 30 compared with CPU implementations.
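The benefit of overlapping transfers with kernel execution can be captured by a simple two-stage pipeline model (an illustrative back-of-the-envelope sketch, not a measurement from the paper): with separate streams, once the first chunk is on the device, copies and kernels run concurrently, so the slower of the two stages sets the steady-state rate.

```python
def pipeline_time(n_chunks, t_transfer, t_kernel):
    """Estimated makespan for processing n_chunks, serially vs. with
    copy/compute overlap via streams. The overlapped estimate is the
    classic two-stage pipeline formula: fill + steady state + drain."""
    serial = n_chunks * (t_transfer + t_kernel)
    overlapped = (t_transfer                                  # fill
                  + (n_chunks - 1) * max(t_transfer, t_kernel)  # steady state
                  + t_kernel)                                 # drain
    return serial, overlapped
```

For 4 chunks with a 1 ms copy and a 3 ms kernel, the serial estimate is 16 ms while the overlapped estimate is 13 ms; the gain grows with the number of chunks and peaks when copy and kernel times are balanced.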
Engineering Analysis With Boundary Elements | 1997
Jacques Lobry; Pierre Manneback
The multiple reciprocity method is a recent generalisation of the well-known boundary element method. It allows the numerical analysis of Poisson's problem in an efficient and elegant manner, since it converts the classical domain integral coming from the excitation into a summation of boundary integrals. However, the method requires the computation of higher-order kernels, so the cost of assembling the linear systems increases significantly. In this paper, it is shown that parallel computing allows a substantial reduction in CPU time. Different data distribution strategies have been implemented and compared using standard ScaLAPACK computational kernels. Tests have been run successfully on a 24-processor Intel Paragon.
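The data distribution strategies compared with ScaLAPACK revolve around its block-cyclic layout. As a minimal illustration (one-dimensional here; ScaLAPACK itself uses a two-dimensional process grid), the owner process and local position of any global index follow directly from the block size and process count:

```python
def owner(global_index, block_size, n_procs):
    """Process that owns a global row/column index under a 1-D block-cyclic
    distribution: blocks are dealt out to processes round-robin."""
    return (global_index // block_size) % n_procs

def local_index(global_index, block_size, n_procs):
    """Position of that entry in the owning process's local array."""
    block = global_index // block_size
    return (block // n_procs) * block_size + global_index % block_size
```

Cyclic dealing of blocks is what keeps the load balanced when work per row is uneven, which is precisely the concern when higher-order kernel rows are more expensive to assemble.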
international conference on algorithms and architectures for parallel processing | 1996
Junming Qin; Kai Yun Chan; Pierre Manneback
Performance analysis plays a very important part in the design and implementation of parallel algorithms, chiefly because highly complex parallel computer architectures and the difficult task partitioning of most applications make it hard to extract maximal performance. In this paper, we focus on the parallel implementation of a sparse, well-structured lower triangular system solver and its performance analysis. Two task partitioning methods are discussed, covering both task assignment and task scheduling. Their estimated parallel times are derived using a performance model and a performance evaluation methodology for parallel algorithms. The optimal task granularities are deduced theoretically from the performance analysis. Experiments on a transputer-based multicomputer are reported.
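The sequential kernel being parallelized is forward substitution on a sparse lower triangular system. A minimal reference version over a CSR layout is sketched below; it assumes, for illustration, that each row stores its diagonal entry last, which is one common convention rather than anything specified in the paper:

```python
def sparse_lower_solve(indptr, indices, data, b):
    """Forward substitution L x = b for a sparse lower-triangular L in CSR
    form (indptr/indices/data), assuming the diagonal entry is stored last
    in each row. Each x[i] depends only on earlier x[j], j < i -- the
    dependency structure that task partitioning must respect."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = b[i]
        for k in range(indptr[i], indptr[i + 1] - 1):   # off-diagonal entries
            acc -= data[k] * x[indices[k]]
        x[i] = acc / data[indptr[i + 1] - 1]            # divide by diagonal
    return x
```

Rows whose off-diagonal entries reference only already-computed unknowns can be solved concurrently, so the sparsity pattern determines how much parallelism a partitioning can expose.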
european conference on parallel processing | 2006
R. Gruber; Vincent Keller; Michela Thiémard; Oliver Wäldrich; Philipp Wieder; Wolfgang Ziegler; Pierre Manneback
The Broker and the cost-function model of the ISS/VIOLA Meta-Scheduling System implementation are described in detail. The Broker includes all the algorithmic steps needed to determine a well-suited machine for an application component. This choice is based on a deterministic cost-function model with a set of parameters that can be adapted to policies set up by computing centres or application owners. All the quantities needed for the cost function can be found in the DataWarehouse or are available through the schedulers of the different machines forming the Grid. An ISS-Simulator has been designed to simulate the real-life scheduling of existing clusters and to virtually include new parallel machines. It will be used to validate the cost model and to tune the different free parameters.
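The shape of such a broker can be sketched in a few lines: score each candidate machine with a deterministic weighted cost and take the minimum. The fields (`t_exec`, `t_wait`, `price`) and weights below are illustrative stand-ins, not the actual ISS/VIOLA cost model or its parameter set:

```python
def pick_machine(machines, weights=(1.0, 1.0, 1.0)):
    """Toy broker: rank machines by a deterministic cost function, here a
    weighted sum of estimated execution time, expected queue wait, and a
    monetary rate, and return the cheapest candidate. Adjusting the weights
    plays the role of the policy parameters mentioned in the abstract."""
    w_exec, w_wait, w_price = weights

    def cost(m):
        return w_exec * m["t_exec"] + w_wait * m["t_wait"] + w_price * m["price"]

    return min(machines, key=cost)
```

Because the cost is deterministic, the same inputs always yield the same placement, which is what makes the choice reproducible and auditable against centre policies.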
european conference on parallel processing | 2006
Sébastien Noël; Olivier Delannoy; Nahid Emad; Pierre Manneback; Serge G. Petiton
This paper presents the integration of a multi-level scheduler in the YML architecture. It demonstrates the advantages of this architecture, based on a component model, and why it is well suited to developing parallel applications for Grids. The multi-level scheduler under development for this framework is then presented.