Publication


Featured research published by Nejmeddine Bahri.


International Journal of Computer Applications | 2012

Fast Intra Mode Decision Algorithm for H264/AVC HD Baseline Profile Encoder

Imen Werda; Nejmeddine Bahri; Amine Samet; Mohamed Ali Ben Ayed; Nouri Masmoudi

The high performance of the H.264/AVC video encoder comes at the cost of high computational complexity, especially for high definition (HD) video sequences. One of the major H.264/AVC features to be optimized is the mode decision for both inter and intra prediction. Based on the high correlation observed between the selected inter prediction mode and the intra mode decision, a fast intra mode decision algorithm driven by the best inter prediction mode is proposed for the H.264/AVC HD baseline profile encoder. The proposed approach is evaluated in terms of rate distortion and PSNR variation, execution time, and the percentage of skipped intra4x4 and intra16x16 evaluations. The scheme is tested on 720p (1280x720) and 1080p (1920x1088) HD video sequences. Experimental results show that the proposed algorithm can save up to 60% of intra prediction computation time, skipping 16% of intra16x16 and up to 83% of intra4x4 evaluations, without inducing PSNR degradation or bit-rate increase.

General Terms: Video compression techniques and signal processing


2016 International Image Processing, Applications and Systems (IPAS) | 2016

Fast motion estimation for HEVC video coding

Randa Khemiri; Nejmeddine Bahri; Fatma Belghith; Fatma Ezahra Sayadi; Mohamed Atri; Nouri Masmoudi

In this paper, a fast configuration for Motion Estimation (ME) is described in order to reduce the computational time of the new High Efficiency Video Coding (HEVC) standard. This configuration uses the Coded Block Flag (CBF) Fast Method (CFM), Early Coding Unit (CU) termination (ECU), and Early Skip Detection (ESD) modes. The Diamond Pattern is used as the search algorithm for ME in the encoding process. Compared to the original HEVC reference software test model (HM) 16.2, experimental results show that the complexity is reduced, on average, by 56.75% with a small degradation in bit-rate and PSNR.


IET Computers and Digital Techniques | 2015

H.264/AVC high definition intra coding implementation on multiprocessor system on chip technology architecture

Nidhameddine Belhadj; Nejmeddine Bahri; Zied Marrakchi; Mohamed Ali Ben Ayed; Nouri Masmoudi; Habib Mehrez

Exploiting multiprocessor system on chip (MPSoC) technology is a promising way to improve the frame rate of the latest video encoders. In this article, an MPSoC architecture for the intra prediction encoding chain of H.264/AVC high definition is proposed using SoCLib, an open platform for virtual prototyping of MPSoC architectures. Experimental results show a speedup of about 85% in processing time compared with execution on a single central processing unit, with an acceptable final circuit area. The proposed parallelism does not affect the quality of the reconstructed video or the bit rate. It takes into account the data loading latency constraint and the required memory size. The proposed architecture is validated on FPGA technology, using a technique that allows switching from a virtual platform to a hardware one.


Mediterranean Electrotechnical Conference | 2012

Fast intra mode decision algorithm based on inter prediction mode for H264/AVC

Nejmeddine Bahri; Nouri Masmoudi; Imen Werda; Amine Samet; Med Ali Ben Ayed

This paper proposes a fast intra mode decision approach based on the best inter prediction mode for P frames. This investigation is motivated by the high correlation observed between the selected inter prediction mode and the intra mode decision. The aim of this work is to reduce the computational complexity of the intra prediction module of the H.264/AVC baseline encoder without inducing visual quality degradation or bit-rate increase. The proposed scheme is evaluated using rate distortion criteria and execution time. Experimental results show that the proposed algorithm can save up to 60% of intra prediction computation time while maintaining similar PSNR quality and without inducing a bit-rate increase.


IET Image Processing | 2018

Optimisation of HEVC motion estimation exploiting SAD and SSD GPU-based implementation

Randa Khemiri; Hassan Kibeya; Fatma Ezahra Sayadi; Nejmeddine Bahri; Mohamed Atri; Nouri Masmoudi

The new High-Efficiency Video Coding (HEVC) standard doubles the video compression ratio compared to the previous H.264/AVC standard at the same video quality. However, this gain is achieved at the cost of increased encoder computational complexity, which makes HEVC complexity a crucial subject. The most time-consuming and compute-intensive part of HEVC is motion estimation, which relies principally on the sum of absolute differences (SAD) or the sum of square differences (SSD) algorithms. For these reasons, the authors propose an implementation of these algorithms on a low-cost NVIDIA graphics processing unit (GPU) based on the Fermi architecture, developed with the Compute Unified Device Architecture (CUDA) language. The proposed algorithm is based on parallel-difference and parallel-reduction processes. The experimental results show a significant speed-up in execution time for most 64 × 64 pixel blocks: the proposed parallel algorithm reduces the execution time by up to 56.17% and 30.4%, compared to the CPU, for the SAD and SSD algorithms, respectively. This improvement indicates that parallelising the algorithm with the newly proposed reduction process for the Fermi GPU generation leads to better results. These findings are based on a static study that determines the percentage utilisation of each prediction unit (PU) dimension in HEVC. This study shows that the larger PUs are the most utilised in temporal levels 3 and 4, reaching 84.56% for class E. This improvement is accompanied by an average peak signal-to-noise ratio loss of 0.095 dB and a decrease of 0.64% in bit rate.
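
The abstract above describes SAD and SSD as a per-pixel difference step followed by a reduction step. The sketch below illustrates that two-phase structure in plain Python, purely as an assumption-laden illustration: it runs sequentially on the CPU, the function names `tree_reduce`, `sad`, and `ssd` are hypothetical, and the pairwise halving only mimics the shape of a GPU parallel reduction. It is not the authors' CUDA kernel.

```python
def tree_reduce(values):
    """Sum values by repeated pairwise halving.

    Each pass corresponds to one step of a parallel reduction, where
    "thread" i adds the element half a stride away (if it exists).
    """
    vals = list(values)
    if not vals:
        return 0
    while len(vals) > 1:
        half = (len(vals) + 1) // 2
        vals = [
            vals[i] + (vals[i + half] if i + half < len(vals) else 0)
            for i in range(half)
        ]
    return vals[0]


def sad(cur, ref):
    """Sum of absolute differences between two equally sized blocks."""
    # Difference phase: one independent operation per pixel.
    diffs = [abs(c - r)
             for row_c, row_r in zip(cur, ref)
             for c, r in zip(row_c, row_r)]
    # Reduction phase: combine the per-pixel results.
    return tree_reduce(diffs)


def ssd(cur, ref):
    """Sum of squared differences between two equally sized blocks."""
    diffs = [(c - r) ** 2
             for row_c, row_r in zip(cur, ref)
             for c, r in zip(row_c, row_r)]
    return tree_reduce(diffs)


if __name__ == "__main__":
    current_block = [[1, 2], [3, 4]]
    reference_block = [[2, 2], [1, 7]]
    print("SAD:", sad(current_block, reference_block))
    print("SSD:", ssd(current_block, reference_block))
```

On a GPU, every element of `diffs` would be computed by its own thread, and each pass of the `while` loop would be one synchronised reduction step, so a 64 × 64 block needs only 12 such steps instead of 4095 sequential additions.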


International Conference on Sciences and Techniques of Automatic Control and Computer Engineering | 2016

HEVC video encoder implementation on Texas Instruments platforms

Nejmeddine Bahri; Abdessamad El Ansari; Mohamed Maazouz; Ali Ahaitouf; Nouri Masmoudi

This paper presents a High Efficiency Video Coding (HEVC) encoder implementation on two different Texas Instruments (TI) platforms: the BeagleBoard-xM, based on an ARM processor, and the TMS320C6678 DSP. The features of these processors, such as multicore architecture, high clock frequency, and low power consumption, motivate researchers to develop an embedded HEVC video encoder that could be exploited in several multimedia applications such as High Definition (HD) TV, smart cameras, HD video conferencing, and Ultra HD video surveillance systems. Different operating systems (OS) with different compilers are tested in order to obtain an optimized HEVC encoder implementation. Experimental results show that the DSP-based HEVC encoder using the SYS/BIOS real-time OS saves up to 62% of encoding time compared to the Linux-c6x OS on the same DSP, and about 71% of encoding time compared to the BeagleBoard-based solution, without inducing any performance degradation in terms of video quality or bitrate.


International Conference on Modelling, Identification and Control | 2016

Parallel implementation of Kvazaar HEVC on multicore ARM processor

Mohamed Maazouz; Nejmeddine Bahri; Noureddine Batel; Abdelmoughni Toubal; Nouri Masmoudi

The emergence of the new HEVC (High Efficiency Video Coding) standard is accompanied by serious problems related to resource consumption and encoding time. New tools and optimizations are needed to ensure the integration of this new encoder in various platforms and multimedia applications. The Kvazaar HEVC encoder was introduced to overcome the problems related to the HEVC test model (HM) reference software. This academic open-source encoder is tailored to fit programmers' needs by enabling high-level parallel processing. In this context, this paper presents different parallel implementations of the Kvazaar HEVC encoder on a powerful octa-core CubieBoard4 platform that combines two quad-core processors, ARM A7 and ARM A15, for power efficiency and high performance in a single chip. A performance comparison of different parallelization strategies is performed. For the single-threaded implementation, experimental results show that the high speed preset (RD1) can save up to 48% and 91% of encoding time for the Random Access (RA) and All-Intra (AI) configurations, respectively. When moving to a multi-threaded implementation, the time saving is about 65% to 75% for the AI configuration. Moreover, experiments show that Wavefront Parallel Processing (WPP) outperforms tile-level parallelization in terms of encoding speed without inducing video quality degradation or bitrate increase.
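
To illustrate why Wavefront Parallel Processing exposes parallelism, the sketch below computes an idealised WPP schedule. It relies on the standard WPP dependency in HEVC, where a coding tree unit (CTU) row may start once the first two CTUs of the row above are finished. Everything else is an assumption made for illustration: every CTU is taken to cost one time step, there is one worker per row, and the function names are hypothetical. It does not reflect Kvazaar's actual threading code.

```python
def wpp_start_step(row, col, lag=2):
    """Earliest step at which CTU (row, col) can start in an ideal wavefront.

    Each row trails the row above it by `lag` CTUs (2 in HEVC WPP).
    """
    return col + lag * row


def wpp_total_steps(rows, cols, lag=2):
    """Number of unit time steps needed to finish a rows x cols CTU grid."""
    if rows <= 0 or cols <= 0:
        return 0
    return wpp_start_step(rows - 1, cols - 1, lag) + 1


def wpp_schedule(rows, cols, lag=2):
    """List, for each time step, the CTUs that are processed concurrently."""
    steps = {}
    for r in range(rows):
        for c in range(cols):
            steps.setdefault(wpp_start_step(r, c, lag), []).append((r, c))
    return [steps[s] for s in sorted(steps)]


if __name__ == "__main__":
    rows, cols = 3, 4
    for step, ctus in enumerate(wpp_schedule(rows, cols)):
        print(f"step {step}: {ctus}")
    print("wavefront steps:", wpp_total_steps(rows, cols))
    print("sequential steps:", rows * cols)
```

Under these assumptions, a frame with many CTU rows keeps several rows in flight at once, while the two-CTU lag preserves the prediction and entropy-coding context dependencies between rows.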


2016 International Image Processing, Applications and Systems (IPAS) | 2016

SAD and SSE implementation for HEVC encoder on DSP TMS320C6678

Hassan Kibeya; Nejmeddine Bahri; Mohamed Ali Ben Ayed; Nouri Masmoudi

High Efficiency Video Coding (HEVC) is the latest video standard, aiming to replace the H.264/AVC standard by significantly improving coding efficiency and compression performance, which makes HEVC well suited to high-definition video in multimedia applications. However, the encoding process has a high computational complexity that needs to be alleviated. Hence, the paper proposes a software implementation of the HEVC encoder and an optimized architecture on a single core of the TMS320C6678 DSP to perform the rate distortion optimization (RDO) for the mode decision procedure. The goal is to use single instruction multiple data (SIMD) operations and data-level parallelism in order to optimize the Sum of Absolute Differences (SAD) and Sum of Square Error (SSE) engines. The proposed implementation shows more than 88% improvement in cycle cost for the distortion function computation, and the encoding speed of the optimized HEVC encoder is accelerated by approximately 24% compared to the HEVC reference model (HM12.0) software, with a slight loss of coding efficiency.


International Conference on Computer Vision | 2015

Real-time H264/AVC high definition video encoder on a multicore DSP TMS320C6678

Nejmeddine Bahri; Nidhameddine Belhadj; Med Ali Ben Ayed; Nouri Masmoudi; Thierry Grandpierre; Mohamed Akil

In this paper, the newest Texas Instruments multicore DSP, the TMS320C6678, is used to build a real-time H.264/AVC high definition (HD) embedded video encoder. We exploit the high computing performance offered by this eight-core DSP in order to meet the real-time encoding constraint. To enhance the encoding speed, a Frame Level Parallelism (FLP) approach is applied. A master core is reserved to handle data transfers to and from the DSP. A multithreading algorithm combined with a ping-pong buffer technique is exploited in order to optimize the standard FLP approach and hide communication overhead. Experimental results show that our enhanced FLP implementation achieves real-time HD (1280×720) video encoding, reaching an encoding speed of up to 26 frames per second (f/s). Experiments also show that our parallel implementation, running on seven C6678 DSP cores at 1 GHz each, accelerates the encoding run-time by a factor of 6.38 without inducing any quality degradation or bit-rate increase.
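
The reported speedup factor can be related to parallel efficiency and to relative time saving with two standard formulas. The short sketch below applies them to the figures quoted in the abstract above (a factor of 6.38 on seven cores); the helper names are hypothetical and the derived values are not stated in the source.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling achieved: speedup / cores."""
    return speedup / cores


def time_saving(speedup):
    """Relative reduction in run-time implied by a speedup factor."""
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    speedup, cores = 6.38, 7  # figures quoted in the abstract
    print(f"parallel efficiency: {parallel_efficiency(speedup, cores):.1%}")
    print(f"run-time saving:     {time_saving(speedup):.1%}")
```

An efficiency close to 1 indicates that communication overhead is largely hidden, which is consistent with the role the abstract gives to the ping-pong buffer technique.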


International Conference on Sciences and Techniques of Automatic Control and Computer Engineering | 2014

H.264/AVC intra prediction encoding chain implementation on MPSoC based on slice level parallelism

Nidhameddine Belhadj; Nejmeddine Bahri; Zied Marrakchi; M. Ben Ayed; Nouri Masmoudi; Habib Mehrez

Multiprocessor System on Chip (MPSoC) technology is a promising way to reduce the processing time required by digital multimedia encoders such as the highly complex H.264/Advanced Video Coding (AVC). MPSoC contributes to this challenge by offering high-performance computing, a small system on chip (SoC) area, and low power consumption. In order to reduce the execution time of the H.264/AVC intra-only encoding chain, an efficient parallel processing scheme on an MPSoC architecture is proposed in this paper. The proposed parallel processing is based on a mixed partitioning that combines slice-level and macroblock-line-level parallelism. The proposed architecture is designed with the SoCLib platform. For performance evaluation, three MIPS32 processors are used to accelerate encoding. Experimental results for High Definition (HD) video sequences show that the proposed implementation saves 65.7% of processing time compared to a single CPU execution. Furthermore, the proposed solution is characterized by a relatively low memory size, which positively affects the final circuit area.

Collaboration


Dive into Nejmeddine Bahri's collaborations.

Top Co-Authors

Mohamed Akil

École Normale Supérieure
