Bouraoui Ouni
University of Monastir
Publication
Featured research published by Bouraoui Ouni.
Microelectronics International | 2012
Bouraoui Ouni; Abdellatif Mtibaa
Purpose – The purpose of this paper is to reduce the reconfiguration time of a field-programmable gate array (FPGA). Design/methodology/approach – The paper introduces a new temporal placement algorithm which uses a typical mathematical formalism to optimize the reconfiguration time. Findings – Results show that the algorithm considerably decreases the reconfiguration time compared with well-known temporal placement algorithms. Originality/value – The paper proposes a new temporal placement algorithm which optimizes the reconfiguration time of modules on the device. The evaluation cases studied show that the proposed algorithm significantly reduces the reconfiguration time of modules compared with other well-known algorithms used in the temporal placement field. The authors use the eigenvalue of the Laplacian matrix.
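The abstract does not say how the Laplacian eigenvalue enters the placement algorithm. As a purely illustrative sketch, the following Python code builds the Laplacian of a small module-connectivity graph and approximates its second-smallest eigenvalue and eigenvector, which spectral placement methods in general use to order connected modules close together. The graph, the function names `laplacian` and `fiedler`, and the shifted power iteration are assumptions made for this example, not the authors' method.

```python
def laplacian(n, edges):
    """Build the graph Laplacian L = D - A for an undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def fiedler(L, iterations=300):
    """Approximate the second-smallest eigenpair of L by shifted power iteration."""
    n = len(L)
    # 2 * (max degree) is an upper bound on the largest Laplacian eigenvalue.
    shift = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector (eigenvalue 0)
        w = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    lv = [sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * lv[i] for i in range(n))  # Rayleigh quotient (v has unit norm)
    return lam, v


# Hypothetical connectivity: two tightly coupled groups of modules joined by one link.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
lam2, vec = fiedler(laplacian(6, EDGES))
order = sorted(range(6), key=lambda i: vec[i])  # 1-D ordering of the modules
```

Sorting the modules by their eigenvector entries keeps each tightly coupled group contiguous, which is the usual motivation for spectral orderings.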
Future Generation Computer Systems | 2018
Olfa Haggui; Claude Tadonki; Lionel Lacassagne; Fatma Ezahra Sayadi; Bouraoui Ouni
Corner detection is a key kernel for many image processing procedures, including pattern recognition and motion detection. The latter, for instance, mainly relies on the corner points for which spatial analyses are performed, typically on (possibly live) videos or temporal flows of images. Thus, highly efficient corner detection is essential to meet the real-time requirement of associated applications. In this paper, we consider the corner detection algorithm proposed by Harris, whose main workflow is a composition of basic operators represented by their approximations using 3 × 3 matrices. The corresponding data access patterns follow a stencil model, which is known to require careful memory organization and management. Cache misses and other hindering factors on NUMA architectures need to be skillfully addressed in order to reach an efficient, scalable implementation. In addition, with increasingly wide vector registers, an efficient SIMD version should be designed and explicitly implemented. In this paper, we study a direct and explicit implementation of common and novel optimization strategies, and provide a NUMA-aware parallelization. Experimental results on a dual-socket Intel Broadwell-E/EP show noticeably good scalability.
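To make the stencil structure of the Harris workflow concrete, here is a minimal, unoptimized pure-Python sketch. It assumes Sobel 3 × 3 gradient operators, a 3 × 3 box window for smoothing, and the commonly used constant k = 0.04; the paper's actual operators, cache and NUMA optimizations, and SIMD code are not reproduced, and all names are illustrative.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
BOX = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]


def stencil3x3(img, kernel):
    """Apply a 3x3 stencil to the interior of img; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * img[y + dy - 1][x + dx - 1]
            out[y][x] = acc
    return out


def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 at every pixel."""
    h, w = len(img), len(img[0])
    ix = stencil3x3(img, SOBEL_X)
    iy = stencil3x3(img, SOBEL_Y)
    ixx = [[ix[y][x] * ix[y][x] for x in range(w)] for y in range(h)]
    iyy = [[iy[y][x] * iy[y][x] for x in range(w)] for y in range(h)]
    ixy = [[ix[y][x] * iy[y][x] for x in range(w)] for y in range(h)]
    # Sum the gradient products over a 3x3 neighbourhood (structure tensor M).
    sxx, syy, sxy = (stencil3x3(m, BOX) for m in (ixx, iyy, ixy))
    resp = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            det = sxx[y][x] * syy[y][x] - sxy[y][x] ** 2
            trace = sxx[y][x] + syy[y][x]
            resp[y][x] = det - k * trace * trace
    return resp


# Synthetic 8x8 test image: a bright square whose corner sits at (4, 4).
img = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(8)] for y in range(8)]
r = harris_response(img)
```

The response is positive at the corner of the square, negative along a straight edge, and zero in flat regions. Every stage reads a 3 × 3 neighbourhood of the previous stage's output, which is the access pattern that makes memory layout matter.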
International Conference on Advanced Technologies for Signal and Image Processing | 2016
Ibtissem Belakhdar; Walid Kaaniche; Ridha Djmel; Bouraoui Ouni
In recent years, the detection of drowsiness based on the Electroencephalogram (EEG) signal has received great attention. The most popular algorithms used for Brain Computer Interface (BCI) applications are the Support Vector Machine (SVM) and the Artificial Neuronal Network (ANN). The challenge is to develop a drowsiness detection system that is both suited to an embedded implementation and easy for the driver to use. In this respect, we propose to evaluate the performance of these two classifiers for EEG classification in order to select the one that provides the higher classification accuracy. The validation process is conducted on EEG signals from the polysomnography database, in which the EEG signals of 10 persons were recorded from the C3-O1 region. The signal read from this dataset is segmented into 30-second windows, and features are then extracted from these segments using the Fast Fourier Transform (FFT). These features are fed to the ANN and the SVM to select the most appropriate classifier. To evaluate classifier performance, we use two metrics: the classification accuracy and the Receiver Operating Characteristic (ROC) curve. Based on this study, we conclude that the ANN classifier performs better than the SVM on EEG drowsiness signals when using one EEG channel.
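The segmentation and spectral feature step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not list the features, so the band limits in `BANDS` (conventional EEG bands), the function names, and the naive DFT standing in for an FFT are all hypothetical choices.

```python
import cmath
import math

# Assumed EEG frequency bands in Hz (not taken from the paper).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def segment(signal, fs, window_s=30):
    """Split a signal into non-overlapping windows of window_s seconds."""
    n = int(fs * window_s)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def power_spectrum(window):
    """Naive O(n^2) DFT power for bins 0..n//2 (an FFT yields the same values faster)."""
    n = len(window)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(s) ** 2 / n)
    return spec


def band_powers(window, fs):
    """Sum spectral power inside each assumed band; one feature per band."""
    n = len(window)
    spec = power_spectrum(window)
    return {
        name: sum(p for k, p in enumerate(spec) if lo <= k * fs / n < hi)
        for name, (lo, hi) in BANDS.items()
    }


# Synthetic check: a 10 Hz sine sampled at 64 Hz, cut into short 2 s windows
# (instead of 30 s) so the naive DFT stays fast.
fs = 64
sig = [math.sin(2 * math.pi * 10 * t / fs) for t in range(4 * fs)]
windows = segment(sig, fs, window_s=2)
features = [band_powers(w, fs) for w in windows]
```

Each window yields one feature vector, which would then be passed to the classifier under evaluation.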
Microprocessors and Microsystems | 2015
Mehdi Jemai; Bouraoui Ouni
A System On Programmable Chip (SOPC) is a circuit that integrates all components of an electronic system into a single chip. It may consist of memories, one or more microprocessors, interface devices, configurable logic blocks and other components necessary to achieve an intended function. In this paper, we propose a new hardware-software partitioning algorithm for control data flow graphs targeting SOPC. The main aim of our algorithm is to find the best compromise between hardware and software implementation of operations in order to satisfy the design constraints of the target application in terms of latency and hardware resources. Our algorithm has been evaluated on a real hardware device: experiments were carried out on a Virtex-5 FPGA. Results show that our algorithm provides a better-performing system at the lowest possible cost compared to existing approaches.
International Multi-Conference on Systems, Signals and Devices | 2016
Ibtissem Belakhdar; Walid Kaaniche; Ridha Djmel; Bouraoui Ouni
In recent years, driver drowsiness has been considered one of the major causes of road accidents, which can lead to severe physical injuries, deaths and significant economic losses. As a consequence, a reliable driver drowsiness detection system is necessary to alert the driver before an accident happens. For this reason, the Electroencephalogram (EEG) has recently drawn attention in the fields of brain-computer interfaces and cognitive neuroscience as a means to monitor and predict the human drowsiness state. Our objective in this work is to propose an automatic approach to detect the onset of driver drowsiness based on the Artificial Neuronal Network (ANN) and using only one EEG channel. In this study, an experiment was conducted on ten human subjects using nine features computed from one EEG channel with the Fast Fourier Transform (FFT). After feeding these features to an ANN classifier, we obtained classification accuracy rates of 86.1% and 84.3% for drowsiness and alertness detection, respectively. All features used in this work are easy to calculate and can be determined in real time, which makes this approach suitable for embedded implementation.
Evolving Systems | 2014
Ramzi Ayadi; Bouraoui Ouni; Abdellatif Mtibaa
In this paper, we present a novel temporal partitioning methodology that temporally partitions a data flow graph on a reconfigurable system. Our approach optimizes the whole latency of the design. This aim can be reached by minimizing the latency of the graph and the reconfiguration time at the same time. Consequently, our algorithm starts from an existing temporal partitioning, which is the result of a whole latency optimization algorithm. Next, our approach builds the best architecture, on a partially reconfigurable FPGA, that gives the lowest reconfiguration time. The proposed methodology was tested on several examples on the Xilinx Virtex-II Pro. The results show a significant reduction in design latency compared with other well-known approaches used in this field.
International Journal of Reconfigurable Computing | 2018
Lilia Kechiche; Lamjed Touil; Bouraoui Ouni
Driven by the importance of energy consumption as an evaluation factor in system-on-chip design, this paper presents a system-level design methodology to optimize power consumption on an ARM-based architecture for real-time video processing. The proposed design flow is based on the interaction between tool optimizations and user optimizations. The tool optimizations are the options and best practices available in the integrated design environment for the Xilinx technology and the target Zynq-7000 architecture. The user optimizations are methods proposed by the user to optimize power consumption; for these, we used the principles of voltage scaling and frequency scaling. These two techniques allow energy to be consumed in proportion to the work to be done. The suggested flow is applied to a real-time video processing system. The results show power savings of up to 60% while respecting performance and real-time constraints.
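The effect of voltage and frequency scaling can be illustrated with the textbook CMOS dynamic power model, P = a · C · V² · f. The sketch below is an assumption-laden example: the capacitance, voltages, frequencies, and cycle count are hypothetical numbers chosen for readability, not Zynq-7000 values or results from the paper, and static (leakage) power is ignored.

```python
def dynamic_power(c_eff, v_dd, f_clk, activity=1.0):
    """Textbook CMOS dynamic power in watts: P = a * C * V^2 * f."""
    return activity * c_eff * v_dd ** 2 * f_clk


def task_energy(c_eff, v_dd, f_clk, cycles, activity=1.0):
    """Dynamic energy in joules for a task of `cycles` clock cycles."""
    run_time = cycles / f_clk
    return dynamic_power(c_eff, v_dd, f_clk, activity) * run_time


# Hypothetical operating points (illustrative only, not measured values).
C_EFF = 1e-9      # effective switched capacitance in farads
CYCLES = 10_000_000

nominal = dynamic_power(C_EFF, 1.0, 666e6)
freq_only = dynamic_power(C_EFF, 1.0, 333e6)   # halve the clock
volt_freq = dynamic_power(C_EFF, 0.8, 333e6)   # halve the clock and lower the voltage

e_nominal = task_energy(C_EFF, 1.0, 666e6, CYCLES)
e_freq_only = task_energy(C_EFF, 1.0, 333e6, CYCLES)
e_volt_freq = task_energy(C_EFF, 0.8, 333e6, CYCLES)
```

In this model, halving the frequency alone halves the power but leaves the dynamic energy per task unchanged because the task runs twice as long. Lowering the voltage as well reduces both, since power depends on the square of the supply voltage.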
IET Image Processing | 2018
Lamjed Touil; Ismail Gassoumi; Radhouane Laajimi; Bouraoui Ouni
Here, the authors present a hardware design of a fast multiplierless forward binary discrete cosine transform (BinDCT) based on quantum-dot cellular automata (QCA) technology. This new technology offers several features, such as small size, ultra-low power consumption, and operation at up to 1 THz. The simulation results in the QCADesigner software confirm that the proposed circuit works well and can be used as a high-performance design in QCA technology. The analysis of the QCA BinDCT implementation indicates that the proposed architecture is superior to existing designs based on classic complementary metal-oxide semiconductor (CMOS) technology. The authors introduce a highly scaled BinDCT module with ultra-low power consumption. The proposed circuit consumes 50% less power than previous designs and can attain a throughput of 800 megapixels per second (Mpps). To design and verify the proposed architecture, the QCADesigner and QCAPro tools are employed for synthesis and power consumption estimation, respectively. Since work in the field of QCA logic image processing has only started to bloom, the proposed contribution will engender a new thread of research in the field of real-time image and video processing.
IET Image Processing | 2017
Lamjed Touil; Bouraoui Ouni
Reversible logic computation is one of the most promising technologies for designing low-power digital circuits, optical information processing, quantum-dot cellular automata, fault-tolerant systems and nanotechnology. Conventional digital circuits dissipate a significant amount of energy because several bits of information are erased during processing. In reversible computation, the information bits are not lost. This study presents a novel design of a reversible RGB to HMMD converter circuit. The proposed hardware converter is based on several reversible sub-modules, such as an adder, subtractor, multiplier, register, multiplexer, comparator and others. In order to demonstrate the efficiency of the proposed logic design, each sub-module is characterized in terms of quantum cost, number of gates required, delay and garbage outputs produced. Since work in the field of reversible logic video processing has only started to bloom, the proposed contribution will engender a new thread of research in the field of reversible real-time video processing.
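The idea that reversible gates lose no information, and the notion of a garbage output, can be shown with a few standard gates. The sketch below models the Feynman, Toffoli, and Peres gates as Python functions and builds a half adder from a Peres gate; it is a generic illustration of reversible logic, not one of the sub-modules from the paper, and the helper names are made up for this example.

```python
from itertools import product


def feynman(a, b):
    """Feynman (CNOT) gate: (a, b) -> (a, a XOR b)."""
    return a, a ^ b


def toffoli(a, b, c):
    """Toffoli gate: (a, b, c) -> (a, b, (a AND b) XOR c)."""
    return a, b, (a & b) ^ c


def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, n_bits):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_bits))
    outputs = {gate(*bits) for bits in inputs}
    return len(outputs) == len(inputs)


def half_adder(a, b):
    """Half adder from one Peres gate with the third input fixed to 0.

    Returns (sum, carry, garbage); the garbage output is a copy of `a`
    that exists only to keep the mapping reversible.
    """
    garbage, total, carry = peres(a, b, 0)
    return total, carry, garbage


def irreversible_and(a, b):
    """Conventional AND gate, shown for contrast: it erases information."""
    return (a & b,)
```

A conventional AND gate maps three different input pairs to the same output 0, so its inputs cannot be recovered. The reversible gates are bijections on their bit vectors, at the price of extra outputs such as the garbage bit of the half adder.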
International Conference on Advanced Technologies for Signal and Image Processing | 2016
Lilia Kechiche; Lamjed Touil; Bouraoui Ouni
Nowadays, image and video processing applications are widely used in many domains, including industry, medical imaging, manufacturing, and security systems. Real-time image and video processing is a very demanding task, as it requires intensive computation on the large amount of data that an image represents, together with the complex operations that may need to be performed on it. For these reasons, there is a move toward hardware solutions, which are known to be faster for such processing. Embedded hardware systems are experiencing explosive growth in manufacturing and usage. Designing such systems is becoming a critical task, as implementation technology is oriented toward complex functionalities under increasing time-to-market pressure. To resolve these issues, combining an appropriate methodology and architecture to ensure reuse and control over time and cost is a challenging idea. In this context, we explore the principle of Platform Based Design (PBD) as a design methodology that favors system reuse and allows the exploration of trade-offs between different design requirements. We have used PBD to design a real-time video acquisition and display module. The proposed module is implemented on a Virtex-5 FPGA.