Mohammad Gh. Alfailakawi

Publication

Featured research published by Mohammad Gh. Alfailakawi.

Expert Systems With Applications | 2016

Harmony-search algorithm for 2D nearest neighbor quantum circuits realization

Mohammad Gh. Alfailakawi; Imtiaz Ahmad; Suha A. Hamdan

The emerging field of quantum computing is addressed in this work. A Harmony Search (HS) based algorithm is proposed to efficiently realize quantum circuits on two-dimensional grids. The objective is to minimize the number of SWAP gates for a two-dimensional nearest neighbor architecture. A local optimization heuristic is embedded in the HS algorithm to further improve the solution quality. Experimental results on real benchmarks demonstrate the scalability and effectiveness of the proposed technique.

Motivated by its promising applications, quantum computing is an emerging area of research. This paper addresses the NP-complete problem of finding a Nearest Neighbor (NN) realization of quantum circuits on a 2-dimensional grid. In certain quantum technologies, only physically adjacent qubits are allowed to interact with each other, hence the NN requirement. Circuits with distant qubits are made NN-compliant by introducing swap gates, which increases cost. In this work, we present a Harmony Search (HS) based intelligent metaheuristic algorithm to efficiently realize low-cost NN circuits utilizing input line reordering. The distinct feature of the proposed technique is that the initial qubit placement is found using an HS-based metaheuristic, followed by an efficient, problem-specific local heuristic to perform swap gate insertion. The effectiveness of the proposed algorithm is demonstrated by comparing its performance to a number of recently published approaches. Solutions found by the proposed technique show a reduction in the number of swaps needed in the range of 4% - 36% on average when compared to state-of-the-art techniques. Compared to other approaches, the implemented algorithm is scalable and was able to find optimized circuits within 4 seconds in the worst case.
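To make the quantity being minimized concrete, here is a minimal sketch of a swap estimate for a fixed qubit placement. It is not the authors' algorithm. It assumes that a two-qubit gate whose qubits sit at Manhattan distance d on the grid needs d - 1 swaps, and it ignores the fact that inserted swaps change the placement seen by later gates. The names `Placement`, `nn_swap_estimate`, `row_major` and `reordered` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: qubit name -> (row, col) cell on the 2D grid.
Placement = Dict[str, Tuple[int, int]]


def nn_swap_estimate(placement: Placement, gates: List[Tuple[str, str]]) -> int:
    """Estimate swaps for a fixed placement (assumption: d - 1 swaps per gate)."""
    total = 0
    for a, b in gates:
        (r1, c1), (r2, c2) = placement[a], placement[b]
        distance = abs(r1 - r2) + abs(c1 - c2)  # Manhattan distance on the grid
        total += max(distance - 1, 0)           # adjacent qubits need no swap
    return total


# A toy circuit of three two-qubit gates on a 2x2 grid.
gates = [("q0", "q3"), ("q0", "q1"), ("q1", "q2")]
row_major = {"q0": (0, 0), "q1": (0, 1), "q2": (1, 0), "q3": (1, 1)}
reordered = {"q0": (0, 0), "q1": (0, 1), "q2": (1, 1), "q3": (1, 0)}

print(nn_swap_estimate(row_major, gates), nn_swap_estimate(reordered, gates))
```

In this toy case, exchanging the cells of `q2` and `q3` makes every gate act on adjacent qubits, which illustrates why the choice of initial placement matters before any swap insertion takes place.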

Microprocessors and Microsystems | 2016

Implementation of harmony search on embedded platform

Mohammed El-Shafei; Imtiaz Ahmad; Mohammad Gh. Alfailakawi

Harmony Search (HS) is a relatively new population-based meta-heuristic optimization algorithm that imitates the music improvisation process of musicians to search for a perfect state of harmony. HS has attracted a lot of attention by showing excellent results for a wide range of optimization problems in diverse fields. HS is typically implemented on a software platform, which restricts its applicability to real-time applications. In order to accelerate the algorithm, one can parallelize it and/or map it directly onto hardware to achieve faster execution time. This paper presents an efficient architecture for a parallel HS algorithm on an FPGA platform in order to improve HS performance in terms of execution time, resource utilization and power consumption while searching several solution candidates for a problem. The implementation is tested using a suite of well-known benchmark functions. Analysis of the experimental results shows that the proposed concurrent implementation outperforms a software implementation by up to 175x and by no less than 16x.
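For readers unfamiliar with HS, the following is a minimal software sketch of the commonly described form of the algorithm (harmony memory, memory consideration, pitch adjustment, random improvisation). It is not the FPGA architecture from the paper. All parameter values, the `sphere` test function, and the function names are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple


def harmony_search(
    cost: Callable[[List[float]], float],
    bounds: List[Tuple[float, float]],
    hms: int = 10,       # harmony memory size (assumed value)
    hmcr: float = 0.9,   # harmony memory considering rate (assumed value)
    par: float = 0.3,    # pitch adjusting rate (assumed value)
    bw: float = 0.05,    # pitch bandwidth as a fraction of each range (assumed)
    iterations: int = 2000,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `cost` over the box `bounds`; returns (best harmony, best cost)."""
    rng = random.Random(seed)
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [cost(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                value = memory[rng.randrange(hms)][d]  # reuse a remembered pitch
                if rng.random() < par:                 # small pitch adjustment
                    value += rng.uniform(-1.0, 1.0) * bw * (hi - lo)
            else:
                value = rng.uniform(lo, hi)            # improvise a random pitch
            new.append(min(max(value, lo), hi))        # clamp to the bounds
        new_score = cost(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:                  # replace the worst harmony
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(x: List[float]) -> float:
    """Simple benchmark-style test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, score = harmony_search(sphere, [(-5.0, 5.0)] * 3)
print(best, score)
```

Each iteration evaluates only one new candidate, and the per-dimension choices inside the improvisation loop are independent of one another, which is the kind of structure a parallel hardware mapping can exploit.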

International Journal of Parallel, Emergent and Distributed Systems | 2018

Hardware accelerator for solving 0–1 knapsack problems using binary harmony search

Mohammed El-Shafei; Imtiaz Ahmad; Mohammad Gh. Alfailakawi

The 0–1 knapsack problem (KP) is a well-known intractable optimization problem with a wide range of applications. Harmony Search (HS) is one of the most popular metaheuristic algorithms to successfully solve 0–1 KPs. Nevertheless, metaheuristic algorithms are generally compute intensive and slow when implemented in software. In this paper, we present an FPGA-based pipelined hardware accelerator to reduce computation time for solving large-dimension 0–1 KPs using the Binary Harmony Search algorithm. The proposed architecture exploits the intrinsic parallelism of population-based metaheuristic algorithms and the flexibility and parallel processing capabilities of FPGAs to perform the computation concurrently, thus enhancing performance. To validate the efficiency of the proposed hardware accelerator, experiments were conducted using a large number of 0–1 KPs. Comparative analysis of the experimental results reveals that the proposed approach offers promising speedups of 51× – 111× as compared with a software implementation and 2× – 5× as compared with a hardware implementation of the Binary Particle Swarm Optimization algorithm.

IET Computers and Digital Techniques | 2018

FPGA-based implementation of cuckoo search

Mohammad Gh. Alfailakawi; Mohammed El-Shafei; Imtiaz Ahmad; Ayed A. Salman

Cuckoo search (CS) is a recent swarm intelligence-based meta-heuristic optimisation algorithm that has shown excellent results for a broad class of optimisation problems in diverse fields. However, CS is generally compute intensive and slow when implemented in software, requiring a large number of fitness function evaluations to obtain acceptable solutions. In this study, the authors present a problem-specific parallel pipelined field programmable gate array-based accelerator to reduce execution time when solving complex optimisation problems. Experiments conducted on a large number of well-known benchmark functions revealed that the hardware approach offers promising average speedups of 75× and 53× over software and GPU implementations, respectively.

Microprocessors and Microsystems | 2017

Odd/Even Invert coding for phase change memory with thermal crosstalk

Imtiaz Ahmad; Areej Helmi Hamouda; Mohammad Gh. Alfailakawi

Cloud-based services demand a colossal amount of memory in order to satisfy their objectives. Phase-change memory (PCM) has emerged as one of the most promising memory technologies to feature in next generation memory systems. One of the key challenges of PCM is the limited number of writes that can be performed on memory cells, also known as write endurance. In this paper we present a cost model which captures the asymmetry as well as the disturb characteristics associated with write operations in PCMs. Moreover, we present an encoding architecture based on the proposed cost metric to allow the write operation to be performed at minimum cost. The proposed approach, called Odd/Even Invert, re-codes data based on selective inversion of even and/or odd bits to find the minimum cost write operation, which should enhance cell lifetime. The proposed approach incurs a cost of only two extra bits regardless of the size of the data word used, and hence provides a cost-effective approach to the problem. Experimental results and comparison with existing techniques on random data, real data, and memory traces from the PARSEC benchmark suite show the effectiveness and scalability of the proposed scheme.
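The selective-inversion idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's encoder. The cost model here only charges asymmetric costs for 0 -> 1 and 1 -> 0 writes with made-up weights (`SET_COST`, `RESET_COST`) and omits the thermal-crosstalk disturb term. "Even" is assumed to mean bit positions 0, 2, 4, ..., and the cost of writing the two flag bits themselves is ignored. All function names are hypothetical.

```python
from typing import Tuple

SET_COST = 1.0    # hypothetical cost of writing 0 -> 1
RESET_COST = 2.0  # hypothetical cost of writing 1 -> 0


def write_cost(old: int, new: int, width: int) -> float:
    """Cost of overwriting `old` with `new`; unchanged bits cost nothing."""
    cost = 0.0
    for i in range(width):
        o, n = (old >> i) & 1, (new >> i) & 1
        if o == 0 and n == 1:
            cost += SET_COST
        elif o == 1 and n == 0:
            cost += RESET_COST
    return cost


def inversion_mask(flags: int, width: int) -> int:
    """flags bit 0 inverts even positions, flags bit 1 inverts odd positions."""
    even = sum(1 << i for i in range(0, width, 2))
    odd = sum(1 << i for i in range(1, width, 2))
    return (even if flags & 1 else 0) | (odd if flags & 2 else 0)


def encode(old_stored: int, data: int, width: int) -> Tuple[int, int]:
    """Try all four inversion choices and keep the cheapest (stored word, flags)."""
    candidates = [(data ^ inversion_mask(f, width), f) for f in range(4)]
    return min(candidates, key=lambda c: write_cost(old_stored, c[0], width))


def decode(stored: int, flags: int, width: int) -> int:
    """Undo the inversion recorded in the two flag bits."""
    return stored ^ inversion_mask(flags, width)


stored, flags = encode(0b10101010, 0b01010101, 8)
print(bin(stored), flags, bin(decode(stored, flags, 8)))
```

Because there are only four inversion choices, two flag bits suffice for any word width, which matches the fixed two-bit overhead stated in the abstract.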

IET Computers and Digital Techniques | 2017

Extending multi-level STT-MRAM cell lifetime by minimising two-step and hard state transitions in hot bits

Imtiaz Ahmad; Mahmoud Imdoukh; Mohammad Gh. Alfailakawi

Shifting market trends towards mobile, Internet of things, and data-centric applications create opportunities for emerging low-power non-volatile memories. The attractive features of spin-torque-transfer magnetic-RAM (STT-MRAM) make it a promising candidate for future on-chip cache memory. Two-bit multiple-level cell (MLC) STT-MRAMs suffer from higher write energy, performance overhead, and lower cell endurance when compared with their single-level counterparts. These unwanted effects are mainly due to write operations known as two-step transitions (TT) and hard transitions (HT). Here, the authors offer a solution to tackle the write energy problem in MLC STT-MRAM by minimising the number of TT and HT. By analysing real applications, it was observed that specific locations within a cache block (hot locations) undergo many more TT and HT than other locations (cold locations). These hot locations are more detrimental to the lifetime and reliability of the MRAM device. In this work, the authors propose a simple and intuitive dynamic encoding scheme that eliminates all TT and HT at hot locations, hence reducing energy consumption and improving MLC STT-MRAM lifetime. Results on PARSEC benchmarks demonstrate the effectiveness and scalability of the proposed approach to potentially prolong MLC STT-MRAM lifetime.

Intelligent Decision Technologies | 2016

Non-volatile look-up table based FPGA implementations

Lei Xie; Hoang Anh Du Nguyen; Mottaqiallah Taouil; Said Hamdioui; Koen Bertels; Mohammad Gh. Alfailakawi

Many emerging technologies are under investigation to realize alternatives for future scalable electronics. The memristor is one of the most promising candidates due to its non-volatility, high integration density, near-zero standby power consumption, etc. Memristors have recently been utilized in non-volatile memories, neuromorphic systems, resistive computing architectures, and FPGAs, to name but a few. An FPGA typically consists of configurable logic blocks (CLBs), programmable interconnects, configuration, and block memories. Most recent work has focused on using memristors to build FPGA interconnects and memories. This paper proposes two novel FPGA implementations that utilize memristor-based CLBs and their corresponding automatic design flow. To illustrate the potential of the proposed implementations, they are benchmarked using Toronto 20 and compared with the state-of-the-art in terms of area and delay. The experimental results show that both the area (up to 4.24×) and delay (up to 1.46×) of the novel FPGAs are very promising.

Quantum Information Processing | 2013