Publication


Featured research published by Khaled Benkrid.


Archive | 2013

High-Performance Computing Using FPGAs

Wim Vanderbauwhede; Khaled Benkrid

High-Performance Computing Using FPGAs covers the area of high-performance reconfigurable computing (HPRC), providing an overview of architectures, tools and applications. FPGAs offer very high I/O bandwidth and fine-grained, custom and flexible parallelism. Their appeal is reinforced by ever-increasing computational needs coupled with the frequency/power wall, by the increasing maturity and capabilities of FPGAs, and by the advent of multicore processors, which has led to the acceptance of parallel computational models. The part on Architectures introduces different FPGA-based HPC platforms: attached co-processor HPRC architectures such as CHREC's Novo-G and EPCC's Maxwell systems; tightly coupled HPRC architectures, e.g. the Convey hybrid-core computer; reconfigurably networked HPRC architectures, e.g. the QPACE system; and standalone HPRC architectures such as EPFL's CONFETTI system. The part on Tools focuses on high-level programming approaches for HPRC, with chapters on C-to-gate tools (such as Impulse-C, AutoESL, Handel-C and MORA-C++); graphical tools (MATLAB-Simulink, NI LabVIEW); domain-specific languages; and languages for heterogeneous computing (for example OpenCL and Microsoft's Kiwi and Alchemy projects). The part on Applications presents case studies from several application domains where HPRC has been used successfully, such as bioinformatics and computational biology; financial computing; stencil computations; information retrieval; lattice QCD; astrophysics simulations; and weather and climate modeling.


International Journal of Reconfigurable Computing | 2012

High-Performance Reconfigurable Computing

Khaled Benkrid; Esam El-Araby; Miaoqing Huang; Kentaro Sano; Thomas Steinke

1 School of Engineering, The University of Edinburgh, Edinburgh EH9 3JL, UK
2 Electrical Engineering and Computer Science, The Catholic University of America, Washington, DC 20064, USA
3 Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR 72701, USA
4 Graduate School of Information Sciences, Tohoku University, 6-6-01 Aramaki Aza Aoba, Sendai 980-8579, Japan
5 Zuse-Institut Berlin (ZIB), Takustraße 7, 14195 Berlin-Dahlem, Germany


adaptive hardware and systems | 2012

A novel high-performance fault-tolerant ICAP controller

Ali Ebrahim; Khaled Benkrid; Xabier Iturbe; Chuan Hong

Dynamic Partial Reconfiguration (DPR) is an important feature of modern FPGAs, as it allows for better exploitation of FPGA resources over time and space. The Internal Configuration Access Port (ICAP) enables DPR from within an FPGA chip, leading to the possibility of fully autonomous FPGA-based systems. This paper presents a novel high-performance, fault-tolerant ICAP controller which can operate at high speed and recover from emerging faults. Test results show that our ICAP controller is 25 times faster than the Xilinx XPS_HWICAP IP core. We demonstrate the use of Triple Modular Redundancy (TMR) in a subset of the ICAP controller components, which are able to reconfigure the rest of the ICAP controller when faults are detected. This method is shown to have a 49% smaller area footprint than traditional full TMR.
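
As a rough illustration of the TMR idea mentioned in this abstract, the sketch below shows a generic bitwise majority voter in Python. It is not the paper's controller design; the function names `tmr_vote` and `faulty_module` are hypothetical, and the fault locator assumes at most one of the three replicas is wrong.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant module outputs.

    Each output bit takes the value that at least two replicas agree on.
    """
    return (a & b) | (a & c) | (b & c)


def faulty_module(a: int, b: int, c: int):
    """Return the index (0, 1 or 2) of the first replica that disagrees
    with the voted result, or None if all three agree.

    Assumption: at most one replica is faulty at a time.
    """
    voted = tmr_vote(a, b, c)
    for index, output in enumerate((a, b, c)):
        if output != voted:
            return index
    return None


if __name__ == "__main__":
    # Replica 2 has one flipped bit; the voter masks it.
    print(bin(tmr_vote(0b1010, 0b1010, 0b0010)))
    print(faulty_module(0b1010, 0b1010, 0b0010))
```

In a system like the one described, a detected disagreement would be the trigger for repairing the affected logic; that repair step is outside the scope of this sketch.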


IEEE Embedded Systems Letters | 2011

Efficient On-Chip Task Scheduler and Allocator for Reconfigurable Operating Systems

Chuan Hong; Khaled Benkrid; Xabier Iturbe; Ali Ebrahim; Tughrul Arslan

This letter presents an efficient and modular task scheduler and allocator for dynamically and partially reconfigurable electronic systems. This enables hardware tasks to be preempted and arbitrarily placed at an optimal position on the chip on the fly. In particular, we present a novel fault-tolerant allocation algorithm called “best-fit empty area compact (BF-EAC),” and its on-chip implementation on a Xilinx Virtex-4 field-programmable gate array (FPGA), which circumvents emerging faults while maintaining more compact empty areas for incoming tasks. We also present an implementation of the earliest deadline first (EDF) scheduling heuristic, used to optimize the chronological order of execution of hardware tasks to meet real-time constraints. Put together, the placement and scheduling architecture efficiently exploits chip resources with a μs-grade computing speed and a lightweight footprint (less than 500 slices).
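
The EDF heuristic named in this abstract can be sketched in software as follows. This is a simplified, non-preemptive, single-resource version written for illustration only; the letter describes an on-chip implementation with preemption, which this sketch does not model. The task tuple layout and the function name `edf_schedule` are assumptions.

```python
import heapq


def edf_schedule(tasks):
    """Non-preemptive earliest-deadline-first ordering on one resource.

    tasks: list of (name, release_time, duration, deadline) tuples.
    Returns a list of (name, start, finish, met_deadline) tuples.
    """
    pending = sorted(tasks, key=lambda t: t[1])  # order by release time
    ready = []       # min-heap keyed on deadline
    schedule = []
    time = 0
    i = 0
    while i < len(pending) or ready:
        # Move every task that has been released into the ready queue.
        while i < len(pending) and pending[i][1] <= time:
            name, _release, duration, deadline = pending[i]
            heapq.heappush(ready, (deadline, name, duration))
            i += 1
        if not ready:
            # Nothing to run yet: jump to the next release time.
            time = pending[i][1]
            continue
        # Run the ready task with the earliest deadline to completion.
        deadline, name, duration = heapq.heappop(ready)
        start = time
        time += duration
        schedule.append((name, start, time, time <= deadline))
    return schedule


if __name__ == "__main__":
    demo = [("A", 0, 3, 10), ("B", 0, 2, 4), ("C", 1, 1, 3)]
    for entry in edf_schedule(demo):
        print(entry)
```

With the demo tasks, B runs first because its deadline is the earliest among the tasks released at time 0, then C, then A.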


field programmable logic and applications | 2012

An adaptive FPGA implementation of multi-core K-nearest neighbour ensemble classifier using dynamic partial reconfiguration

Hanaa M. Hussain; Khaled Benkrid; Chuan Hong; Huseyin Seker

Classification of high-dimensional microarray data using K-nearest neighbour (K-NN) is a time-consuming task when implemented on general purpose processors (GPPs), and as such it can benefit greatly from a parallel hardware implementation. In this work, an FPGA implementation of the K-NN classifier is presented and compared with an equivalent implementation running on a GPP. Then, a novel FPGA-based multi-core implementation of the K-NN ensemble classifier, which exploits dynamic partial reconfiguration (DPR), is presented. The FPGA implementation of the single-core K-NN classifier was found to be 92× faster than a GPP implementation, and the ensemble implementation was found to offer ~5× speed-up of the FPGA reconfiguration time. In addition, the paper investigates the effect of data dimensionality on classification time on both FPGAs and GPPs, showing that FPGAs scale better than GPPs with higher data dimensionality.
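
For readers unfamiliar with K-NN, a minimal software sketch of a single classifier follows. It uses squared Euclidean distance and a majority vote; the distance metric, the data layout and the name `knn_classify` are assumptions made for illustration and are not taken from the paper. The per-sample distance computations are independent of one another, which is the kind of work a parallel hardware implementation can spread across many units.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs.
    query: feature vector of the same length as the training vectors.
    """
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # Each distance depends only on one training sample and the query.
    neighbours = sorted(train, key=lambda s: squared_distance(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    samples = [((0, 0), "a"), ((0, 1), "a"),
               ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
    print(knn_classify(samples, (0.2, 0.1), k=3))
    print(knn_classify(samples, (5, 5), k=3))
```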


reconfigurable computing and fpgas | 2011

Snake: An Efficient Strategy for the Reuse of Circuitry and Partial Computation Results in High-Performance Reconfigurable Computing

Xabier Iturbe; Khaled Benkrid; Ali Ebrahim; Chuan Hong; Tughrul Arslan; Imanol Martinez

In this paper we present Snake, a novel technique for allocating and executing hardware tasks on partially reconfigurable Xilinx FPGAs. Snake alleviates the bottleneck introduced by the Internal Configuration Access Port (ICAP) in Xilinx FPGAs by reusing both intermediate partial results and previously allocated pieces of circuitry. Moreover, when making allocation decisions, Snake considers aspects often neglected in previous approaches, such as the technological constraints introduced by reconfigurable technology and inter-task communication issues. Because it is a realistic solution, it has been successfully implemented on real FPGA hardware. We have verified its ability to reduce not only the overall execution time of a wide range of synthetic reconfigurable applications, but also the time overhead of making allocation decisions in the first place.


field programmable logic and applications | 2012

Design and implementation of fault-tolerant soft processors on FPGAs

Chuan Hong; Khaled Benkrid; Xabier Iturbe; Ali Ebrahim

This paper presents a novel hardware mechanism to facilitate the design and implementation of soft processors on FPGAs using error-correcting code (ECC)-protected memory and Triple Modular Redundancy (TMR). Such techniques significantly improve the fault tolerance of soft processors, especially of their memories, which are the most radiation-susceptible resources on FPGAs. This is demonstrated in the implementation of a fault-tolerant PicoBlaze processor on Xilinx FPGAs, in which we used an additional LookAhead technique to synchronize the processor with ECC-protected Block RAM (ECC BRAM). The resulting fault-tolerant PicoBlaze processor has the benefit of a self-recoverable program memory in the presence of Single Event Upsets (SEUs), without halting the processor. Our techniques can be applied to other soft processors, e.g. Xilinx MicroBlaze or Altera Nios.


adaptive hardware and systems | 2011

An FPGA task allocator with preliminary First-Fit 2D packing algorithms

Chuan Hong; Khaled Benkrid; Xabier Iturbe; Ahmet T. Erdogan; Tughrul Arslan

This paper presents a novel lightweight, fast allocator for dynamically placing hardware tasks onto partially damaged and resource-limited FPGA chips. The aim of the allocator's placement algorithm is to maximize the overall task acceptance rate in the presence of spontaneously occurring faults in the chip's silicon. Towards this objective, a novel placement algorithm, Empty Area Compaction (EAC), is proposed together with its preliminary version, First-Fit. Additionally, a set of observations is presented, aimed at optimizing the algorithm and reducing its execution time with respect to two parameters: chip granularity and the algorithm's pipeline structure. Based on these, a First-Fit allocator has been implemented on a low-cost Xilinx PicoBlaze soft processor, allowing placement decisions to be made within 10 µs.
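
A generic First-Fit 2D placement over a grid with unusable cells can be sketched as below. This is a textbook-style illustration, not the paper's allocator: the grid encoding (0 for free, 1 for occupied or faulty), the row-major scan order and the names `first_fit` and `place` are all assumptions.

```python
def first_fit(grid, width, height):
    """Return the (row, col) of the first free width x height rectangle,
    scanning row by row from the top-left corner, or None if none fits.

    grid: list of rows; 0 marks a free cell, 1 an occupied or faulty cell.
    """
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(grid[r + dr][c + dc] == 0
                   for dr in range(height) for dc in range(width)):
                return (r, c)
    return None


def place(grid, width, height):
    """Place a task at the first fitting position and mark its cells as used.

    Returns the chosen (row, col), or None if the task is rejected.
    """
    position = first_fit(grid, width, height)
    if position is not None:
        r, c = position
        for dr in range(height):
            for dc in range(width):
                grid[r + dr][c + dc] = 1
    return position


if __name__ == "__main__":
    chip = [[0] * 4 for _ in range(4)]
    chip[0][1] = 1                  # one faulty cell
    print(place(chip, 2, 2))        # skips positions overlapping the fault
    print(place(chip, 2, 2))
    print(place(chip, 4, 2))        # no free 4x2 area remains
```

Treating faulty cells the same way as occupied cells keeps the search loop simple: a fault just shrinks the set of positions where a task can be accepted.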


reconfigurable computing and fpgas | 2013

A platform for secure IP integration in Xilinx Virtex FPGAs

Ali Ebrahim; Khaled Benkrid; Jalal Khalifat; Chuan Hong

Advancements in silicon, software and IP support have made Field Programmable Gate Arrays (FPGAs) a highly flexible solution for many applications. With the growing number of companies providing IP support for FPGAs, IP license violations through over-deployment of IP into more devices than originally licensed remain a major concern for IP owners. In this paper we present a solution for secure IP exchange and configuration based on the Dynamic Partial Reconfiguration (DPR) feature in Xilinx FPGAs. Our system deploys DPR to integrate encrypted hard-macro IP cores into identifiable FPGA devices. These IP cores are configured using a proposed partial bitstream relocation technique to allow for a flexible design flow. We present a proof-of-concept implementation of a secure internal reconfiguration engine on a Xilinx Virtex-6 FPGA.


international conference on microelectronics | 2012

Virtual shared memory architecture for inter-task communication in partial reconfigurable systems

Chuan Hong; Khaled Benkrid; Ali Ebrahim; Xabier Iturbe

This paper presents a virtual shared memory architecture for inter-task communication in partially reconfigurable systems. The hardware tasks communicate with each other using the same content shared by physically separated Block RAMs (BRAMs). The coherence of the content is ensured by the Internal Configuration Access Port (ICAP), rather than by conventional on-chip logic. The benefit of this approach lies in the flexibility of partial task reconfiguration that results from the ICAP-based synchronization mechanism, allowing hardware tasks to behave like software tasks, as they can be swapped in and out of the chip arbitrarily without any area boundary constraints. Moreover, a fast synchronization method which uses compressed bitstreams is presented. The results show significant improvements in synchronization speed at a low area overhead.

Collaboration


Top Co-Authors

Chuan Hong

University of Edinburgh

Ali Ebrahim

University of Edinburgh

Esam El-Araby

George Washington University
