Publication


Featured research published by Randa Khemiri.


2016 International Image Processing, Applications and Systems (IPAS) | 2016

Fast motion estimation for HEVC video coding

Randa Khemiri; Nejmeddine Bahri; Fatma Belghith; Fatma Ezahra Sayadi; Mohamed Atri; Nouri Masmoudi

In this paper, a fast configuration for Motion Estimation (ME) is described in order to reduce the computational time of the new High Efficiency Video Coding (HEVC) standard. This configuration uses the Coded Block Flag (CBF) Fast Method (CFM), Early Coding Unit (CU) termination (ECU) and Early Skip Detection (ESD) modes. The Diamond Pattern is used as the search algorithm for ME in the encoding process. Compared to the original HEVC reference software test model (HM) 16.2, experimental results showed that the complexity is reduced, on average, by 56.75% with a small bit-rate and PSNR degradation.
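As an illustration of the diamond search pattern named in the abstract, here is a minimal Python sketch of the classic two-stage diamond search (a large diamond repeated until the centre is best, then one small diamond refinement). It is not the HM 16.2 code or the authors' configuration; the function names, the SAD cost, and the `search_range` parameter are assumptions made for this example.

```python
# Candidate offsets (dx, dy); the centre comes first so ties keep the centre.
LARGE_DIAMOND = [(0, 0), (0, -2), (0, 2), (-2, 0), (2, 0),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
SMALL_DIAMOND = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]


def sad(cur, ref, bx, by, dx, dy, size):
    """SAD between the block at (bx, by) in cur and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def diamond_search(cur, ref, bx, by, size, search_range=7):
    """Return the motion vector (dx, dy) found by a two-stage diamond search.

    Hypothetical helper: frames are lists of rows, and the block at (bx, by)
    is assumed to lie fully inside both frames.
    """
    height, width = len(ref), len(ref[0])

    def valid(dx, dy):
        # Stay inside the search window and inside the reference frame.
        return (abs(dx) <= search_range and abs(dy) <= search_range
                and 0 <= bx + dx and bx + dx + size <= width
                and 0 <= by + dy and by + dy + size <= height)

    def best_of(pattern, cx, cy):
        candidates = [(cx + px, cy + py) for px, py in pattern]
        candidates = [mv for mv in candidates if valid(mv[0], mv[1])]
        return min(candidates,
                   key=lambda mv: sad(cur, ref, bx, by, mv[0], mv[1], size))

    cx, cy = 0, 0
    while True:
        # Large diamond: move the centre until no neighbour is strictly better.
        nx, ny = best_of(LARGE_DIAMOND, cx, cy)
        if (nx, ny) == (cx, cy):
            break
        cx, cy = nx, ny
    # Small diamond: one final refinement around the best centre.
    return best_of(SMALL_DIAMOND, cx, cy)
```

Because each move of the centre requires a strictly lower cost, the loop always terminates. Like any fast search, it can stop at a local minimum that a full search would avoid.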


IET Image Processing | 2018

Optimisation of HEVC motion estimation exploiting SAD and SSD GPU-based implementation

Randa Khemiri; Hassan Kibeya; Fatma Ezahra Sayadi; Nejmeddine Bahri; Mohamed Atri; Nouri Masmoudi

The new High-Efficiency Video Coding (HEVC) standard doubles the video compression ratio compared to the previous H.264/AVC at the same video quality and without any degradation. However, this performance is achieved at the cost of increased encoder computational complexity, which is why HEVC complexity is a crucial subject. The most time-consuming and computationally intensive part of HEVC is motion estimation, which is based principally on the sum of absolute differences (SAD) or the sum of square differences (SSD) algorithms. For these reasons, the authors propose an implementation of these algorithms on a low-cost NVIDIA GPU (graphics processing unit) using the Fermi architecture, developed with the Compute Unified Device Architecture language. The proposed algorithm is based on a parallel-difference and a parallel-reduction process. The experimental results show a significant speed-up in terms of execution time for most 64 × 64 pixel blocks. In fact, the proposed parallel algorithm reduces the execution time by up to 56.17% and 30.4%, compared to the CPU, for the SAD and SSD algorithms, respectively. This improvement shows that parallelising the algorithm with the newly proposed reduction process for the Fermi GPU generation leads to better results. These findings are based on a static study that determines the PU percentage utilisation for each dimension in HEVC. This study shows that the larger PUs are the most utilised in temporal levels 3 and 4, where they attain 84.56% for class E. This improvement is accompanied by an average peak signal-to-noise ratio loss of 0.095 dB and a decrease of 0.64% in terms of bit rate.
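The two-step structure described in the abstract (a per-pixel difference followed by a reduction) can be sketched sequentially in Python. This is only an illustration of the idea, not the authors' CUDA kernel: the names `tree_reduce` and `block_cost` are hypothetical, and the loops stand in for work that GPU threads would do concurrently.

```python
def tree_reduce(values):
    """Sum a sequence by pairwise folding, mimicking a GPU parallel reduction."""
    vals = list(values)
    if not vals:
        return 0
    while len(vals) > 1:
        half = (len(vals) + 1) // 2
        # On a GPU, each iteration of this loop would be one thread:
        # thread i adds element i + half into element i.
        for i in range(len(vals) - half):
            vals[i] += vals[i + half]
        vals = vals[:half]
    return vals[0]


def block_cost(current, reference, metric="sad"):
    """Return the SAD or SSD between two equally sized blocks (lists of rows)."""
    if metric not in ("sad", "ssd"):
        raise ValueError("metric must be 'sad' or 'ssd'")
    # Difference step: one independent subtraction per pixel.
    diffs = [c - r
             for cur_row, ref_row in zip(current, reference)
             for c, r in zip(cur_row, ref_row)]
    if metric == "sad":
        terms = [abs(d) for d in diffs]
    else:
        terms = [d * d for d in diffs]
    # Reduction step: log2(n) rounds of pairwise additions.
    return tree_reduce(terms)
```

For a 64 × 64 block the reduction needs 12 rounds instead of 4095 sequential additions, which is where a parallel implementation would be expected to gain time.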


IET Computers and Digital Techniques | 2017

Image feature extraction algorithm based on CUDA architecture: case study GFD and GCFD

Haythem Bahri; Fatma Ezahra Sayadi; Randa Khemiri; Marwa Chouchene; Mohamed Atri

Optimising the computing time of applications is an increasingly important task in many different areas, such as scientific and industrial applications. The graphics processing unit (GPU) is considered one of the most powerful engines for computationally demanding applications since it offers a highly parallel architecture. In this context, the authors introduce an algorithm to optimise the computing time of feature extraction methods for colour images. They choose the generalised Fourier descriptor (GFD) and generalised colour Fourier descriptor (GCFD) models as methods to extract image features for various applications such as real-time colour object recognition or image retrieval. They compare the experimental computing times on the central processing unit and the GPU. They also present a case study of these descriptors on two platforms: an NVIDIA GeForce GT525M and an NVIDIA GeForce GTX480. Their experimental results demonstrate that the execution time can be reduced considerably, by up to 34× for GFD and 56× for GCFD.


IET Computers and Digital Techniques | 2018

CUDA memory optimisation strategies for motion estimation

Fatma Elzahra Sayadi; Marwa Chouchene; Haithem Bahri; Randa Khemiri; Mohamed Atri

As video processing technologies continue to grow in complexity and image resolution faster than central processing unit (CPU) performance, data-parallel computing methods will become even more important. In fact, the high-performance, data-parallel architecture of modern graphics processing units (GPUs) can reduce execution times by orders of magnitude. However, creating an optimal GPU implementation requires not only converting sequential implementations of algorithms into parallel ones but, more importantly, carefully balancing the GPU resources. It also requires an understanding of the bottlenecks and defects caused by memory latency and code computing. The challenge is even greater when an implementation exceeds the GPU resources. In this study, the authors discuss the parallelisation and memory optimisation strategies of a computer vision application for motion estimation using the NVIDIA compute unified device architecture (CUDA). The study addresses optimisation techniques for algorithms that exceed the GPU resources in either computation or memory on the CUDA architecture. The proposed implementation reveals a substantial improvement in both speed-up (SU) and peak signal-to-noise ratio (PSNR). Indeed, the implementation is up to 50 times faster than its CPU counterpart. It also provides an increase in the PSNR of the coded test sequence of up to 8 dB.


Computer and Information Technology | 2014

MatLab acceleration for DWT “Daubechies 9/7” for JPEG2000 standard on GPU

Randa Khemiri; Fatma Ezahra Sayadi; Mohamed Atri; Rached Tourki

The discrete wavelet transform (DWT) has diverse applications in the signal and image processing fields. In this paper, we have implemented the lifting “Cohen-Daubechies-Feauveau 9/7” algorithm on a low-cost NVIDIA GPU (Graphics Processing Unit) with MatLab to achieve a speedup in computation. The efficiency of our GPU-based implementation is measured and compared with CPU-based algorithms. Our experimental results with the GPU show a performance improvement by a factor of more than 1.82 compared with the CPU for an image of size 4096×4096 pixels.
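For readers unfamiliar with the lifting scheme mentioned in the abstract, the following Python sketch shows one level of the 1D Cohen-Daubechies-Feauveau 9/7 transform and its inverse. It is not the authors' MatLab/GPU code. The lifting coefficients are the commonly published values; the scaling convention, the symmetric boundary handling, and all function names are assumptions made for this example. A 2D transform would apply the same steps to every row and then to every column.

```python
# Commonly published CDF 9/7 lifting coefficients.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
SCALE = 1.149604398  # one common normalisation; other conventions exist


def _predict(s, d, coef):
    """d[i] += coef * (s[i] + s[i+1]), mirroring s at the right edge."""
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += coef * (s[i] + right)


def _update(s, d, coef):
    """s[i] += coef * (d[i-1] + d[i]), mirroring d at the left edge."""
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += coef * (left + d[i])


def forward_97(signal):
    """One level of the forward transform; returns (approximation, detail)."""
    if len(signal) < 2 or len(signal) % 2:
        raise ValueError("signal length must be even and at least 2")
    s = [float(v) for v in signal[0::2]]  # even samples
    d = [float(v) for v in signal[1::2]]  # odd samples
    _predict(s, d, ALPHA)
    _update(s, d, BETA)
    _predict(s, d, GAMMA)
    _update(s, d, DELTA)
    return [v * SCALE for v in s], [v / SCALE for v in d]


def inverse_97(approx, detail):
    """Undo forward_97 by running the lifting steps backwards with opposite signs."""
    s = [v / SCALE for v in approx]
    d = [v * SCALE for v in detail]
    _update(s, d, -DELTA)
    _predict(s, d, -GAMMA)
    _update(s, d, -BETA)
    _predict(s, d, -ALPHA)
    out = [0.0] * (2 * len(s))
    out[0::2] = s
    out[1::2] = d
    return out
```

Each lifting step is inverted by repeating it with the opposite sign, so reconstruction is exact up to floating-point rounding regardless of the coefficient values. Each step also reads only neighbouring samples, which is what makes the scheme a natural fit for GPU parallelisation.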


Indian Journal of Science and Technology | 2016

MatLab-GPU-based 2D-DWT Acceleration for JPEG2000 with Single and Double-Precision

Randa Khemiri; Fatma Ezahra Sayadi; Mohamed Atri


Applied Thermal Engineering | 2016

Electrothermal effect on the immunoassay in a microchannel of a biosensor with asymmetrical interdigitated electrodes

Marwa Selmi; Randa Khemiri; Fraj Echouchene; Hafedh Belmabrouk


Journal of Manufacturing Science and Engineering-Transactions of the ASME | 2016

Enhancement of the Analyte Mass Transport in a Microfluidic Biosensor by Deformation of Fluid Flow and Electrothermal Force

Marwa Selmi; Randa Khemiri; Fraj Echouchene; Hafedh Belmabrouk


International Journal of Imaging and Robotics | 2017

Fast SAD Algorithm of HEVC Video Encoder on Two Successive GPU Generations

Randa Khemiri; Marwa Chouchene; Haythem Bahri; Fatma Ezahra Sayadi; H. Kibeya; Mohamed Atri; Nouri Masmoudi


2017 International Conference on Engineering & MIS (ICEMIS) | 2017

Execution-time optimization based on thread and block repartitions on a graphic processing unit

Randa Khemiri; Fatma Ezahra Sayadi; Haythem Bahri; Marwa Chouchene; Mohamed Atri

Collaboration


Dive into Randa Khemiri's collaborations.

Top Co-Authors


Marwa Selmi

University of Monastir
