Marwa Chouchene | Researchain

Publication

Featured research published by Marwa Chouchene.

Microprocessors and Microsystems | 2015

Optimized parallel implementation of face detection based on GPU component

Marwa Chouchene; Fatma Ezahra Sayadi; Haythem Bahri; Julien Dubois; Johel Miteran; Mohamed Atri

Highlights: An algorithm for face detection has been implemented on CPU. The algorithm is accelerated by migrating it to GPU. The performance of the GPU implementation shows the effectiveness of this approach. A further optimization method is applied on the GPU.

Face detection is an important aspect for various domains such as biometrics, video surveillance and human-computer interaction. Generally, a generic face processing system includes a face detection or recognition step, as well as tracking and rendering phases. In this paper, we develop a real-time and robust face detection implementation based on a GPU component. Face detection is performed by adapting the Viola and Jones algorithm. We have designed and optimized several parallel implementations of this algorithm on graphics processors (GPUs) using CUDA (Compute Unified Device Architecture). First, we implemented the Viola and Jones algorithm in a basic CPU version. The basic application is then extended to a GPU version using CUDA technology, freeing the CPU to perform other tasks. Then, the face detection algorithm is optimized for the GPU using a grid topology and shared memory. These programs are compared and the results are presented. Finally, to improve the quality of face detection, a second proposal was implemented using the WaldBoost algorithm.

International Journal of Advanced Media and Communication | 2014

Efficient implementation of Sobel edge detection algorithm on CPU, GPU and FPGA

Marwa Chouchene; Fatma Ezahra Sayadi; Yahia Said; Mohamed Atri; Rached Tourki

Many applications in image processing have high degrees of inherent parallelism and are thus good candidates for parallel implementation. Programming tools have been developed for field programmable gate arrays (FPGAs), for SIMD instructions on CPUs and for the large number of cores on graphics processing units (GPUs), but it is still difficult to achieve high performance on these platforms. This paper analyses the distinct features of the compute unified device architecture (CUDA) GPU and summarises the general programming model of CUDA. Furthermore, we present three different implementations of Sobel edge detection on CPU, FPGA and GPU. Test image data are run on these hardware platforms to compare the computational efficiency of CPU, GPU and FPGA.
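As a point of reference for the Sobel operator named in this abstract, here is a minimal sequential sketch in Python. It is not the authors' CPU, GPU or FPGA implementation: the function name `sobel_magnitude`, the list-of-lists image representation, the zeroed border and the `|gx| + |gy|` magnitude approximation are all assumptions made for illustration.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel_magnitude(img):
    """Return an edge-strength map for a 2D list of grey levels.

    Border pixels are left at 0 (an assumption; other border policies exist).
    The magnitude is approximated as |gx| + |gy| instead of the Euclidean norm.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = 0
            gy = 0
            # Each output pixel depends only on its own 3x3 neighbourhood,
            # which is what makes the operator easy to parallelise.
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * p
                    gy += GY[dy][dx] * p
            out[y][x] = abs(gx) + abs(gy)
    return out
```

Because every output pixel is computed independently, the two outer loops are the natural candidates for mapping onto parallel threads or hardware pipelines.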

IET Computers and Digital Techniques | 2017

Image feature extraction algorithm based on CUDA architecture: case study GFD and GCFD

Haythem Bahri; Fatma Ezahra Sayadi; Randa Khemiri; Marwa Chouchene; Mohamed Atri

Optimising the computing time of applications is an increasingly important task in many areas, including scientific and industrial applications. The graphics processing unit (GPU) is considered one of the most powerful engines for computationally demanding applications since it offers a highly parallel architecture. In this context, the authors introduce an algorithm to optimise the computing time of feature extraction methods for colour images. They choose the generalised Fourier descriptor (GFD) and generalised colour Fourier descriptor (GCFD) models as methods to extract image features for applications such as real-time colour object recognition or image retrieval. They compare the experimental computing times on the central processing unit and the GPU. They also present a case study of these descriptors on two platforms: an NVIDIA GeForce GT525M and an NVIDIA GeForce GTX480. Their experimental results demonstrate that the execution time can be reduced considerably, by up to 34× for GFD and 56× for GCFD.

IET Computers and Digital Techniques | 2018

CUDA memory optimisation strategies for motion estimation

Fatma Elzahra Sayadi; Marwa Chouchene; Haithem Bahri; Randa Khemiri; Mohamed Atri

As the complexity and image resolution of video processing technologies continue to rise faster than central processing unit (CPU) performance, data-parallel computing methods will become even more important. In fact, the high-performance, data-parallel architecture of modern graphics processing units (GPUs) can reduce execution times by orders of magnitude. However, creating an optimal GPU implementation requires not only converting sequential implementations of algorithms into parallel ones but, more importantly, careful balancing of the GPU resources. It also requires an understanding of the bottlenecks and inefficiencies caused by memory latency and computation. The challenge is even greater when an implementation exceeds the GPU resources. In this study, the authors discuss the parallelisation and memory optimisation strategies of a computer vision application for motion estimation using the NVIDIA compute unified device architecture (CUDA). They address optimisation techniques for algorithms that exceed the GPU resources in either computation or memory on the CUDA architecture. The proposed implementation reveals a substantial improvement in both speed-up (SU) and peak signal-to-noise ratio (PSNR). Indeed, the implementation is up to 50 times faster than its CPU counterpart. It also provides an increase in PSNR of the coded test sequence of up to 8 dB.
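The PSNR figure quoted in this abstract is a standard quality metric. A minimal Python sketch of how it is commonly computed follows; it assumes 8-bit samples (peak value 255) and frames stored as lists of rows, and the name `psnr` and its parameters are illustrative and not taken from the paper.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    `peak` is the maximum possible sample value (255 for 8-bit data, an
    assumption here). Identical frames give infinity.
    """
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one sample")
    mse = squared_error / count  # mean squared error
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

Since PSNR is logarithmic in the mean squared error, a gain of 8 dB corresponds to the error dropping by a factor of about 6.3 (10^0.8).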

Journal of Algorithms & Computational Technology | 2017

Optimization and performance evaluation of graphic processing units for voice processing

Fatma Ezahra Sayadi; Haythem Bahri; Marwa Chouchene; Mohamed Atri

With advances in device technology and parallel architectures, field-programmable gate arrays (FPGAs) can perform speech processing operations well. FPGAs achieve very impressive results, despite their low operating frequency, by fully exploiting parallelism. Nevertheless, recent central processing units and graphics processing units (GPUs) are also inherently suited to high performance. In fact, recent GPUs enable dramatic increases in computing performance by harnessing a large number of cores. In this context, we seek to analyze the performance of the linear prediction coding algorithm implementation on two different platforms: one based on the NVIDIA GeForce GTX 480 GPU and another on the Spartan-6 FPGA. Subsequently, we apply several optimization strategies on those platforms. The experimental results highlight the relative strengths and weaknesses of both platforms. The tests show that, for several samples, the GPU achieves speedups of up to 4× compared to the FPGA and around 48× compared to a sequential execution.

International Multi-Conference on Systems, Signals and Devices | 2013

Integral image computation on GPU

Marwa Chouchene; Fatma Ezahra Sayadi; Mohamed Atri; Rached Tourki

In this paper we present an integral image algorithm that can run in real time on a Graphics Processing Unit (GPU). Our system exploits the parallelism in the computation via the NVIDIA CUDA programming model, which is a software platform for solving non-graphics problems in a massively parallel, high-performance fashion. We compare the performance of the parallel approach running on the GPU with the sequential CPU implementation across a range of image sizes.
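For readers unfamiliar with the integral image (summed-area table) that this abstract refers to, the following is a minimal sequential Python sketch of the kind of CPU baseline a GPU version would be compared against. It is not the authors' code: the names `integral_image` and `box_sum`, the zero-padded layout and the half-open box convention are assumptions chosen for illustration.

```python
def integral_image(img):
    """Return the summed-area table of a 2D list of numbers.

    The table has one extra row and column of zeros so that
    ii[y][x] holds the sum of all pixels above and to the left of (y, x).
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle size. A common way to parallelise the construction, stated here as general background and not as the paper's method, is to run a prefix sum over every row and then over every column.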

International Conference on Communications | 2011

Software, hardware and co-simulation for detecting and tracking moving objects

Marwa Chouchene; Fatma Ezahra Sayadi; Mohamed Atri; Rached Tourki

Intelligent video opens up new possibilities in machine vision solutions. It allows camera video to be analyzed in real time to detect events of interest. It also makes it possible to isolate and simultaneously track multiple objects in fields such as robotics, medicine, and in particular video surveillance and object recognition. For a video surveillance system, motion analysis consists of detecting moving objects and tracking them. The aim of this work is a hardware/software implementation for detecting and tracking moving objects in a video sequence, based on a Kalman filter and background subtraction and expressed as a block diagram, in order to minimize execution time.

International Journal of Imaging and Robotics | 2017