Mohamed Shalan
Ain Shams University
Publication
Featured research published by Mohamed Shalan.
compilers, architecture, and synthesis for embedded systems | 2000
Mohamed Shalan; Vincent John Mooney
Dealing with global on-chip memory allocation/de-allocation in a dynamic yet deterministic way is an important issue for upcoming billion-transistor multiprocessor System-on-a-Chip (SoC) designs. To achieve this, we propose a new memory management hierarchy called Two-Level Memory Management. To implement this memory management scheme - which presents a paradigm shift in the way designers look at on-chip dynamic memory allocation - we present a System-on-a-Chip Dynamic Memory Management Unit (SoCDMMU) for allocation of the global on-chip memory, which we refer to as level two memory management (level one is the operating system management of memory allocated to a particular on-chip processor). In this way, heterogeneous processors in an SoC can request and be granted portions of the global memory in twenty clock cycles in the worst case for a four-processor SoC, which is at least an order of magnitude faster than software-based memory management. We present a sample implementation of the SoCDMMU and compare hardware and software implementations.
Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627) | 2002
Mohamed Shalan; Vincent John Mooney
The aggressive evolution of the semiconductor industry - smaller process geometries, higher densities, and greater chip complexity - has provided design engineers the means to create complex, high-performance Systems-on-a-Chip (SoC) designs. Such SoC designs typically have more than one processor and huge memory, all on the same chip. Dealing with global on-chip memory allocation/de-allocation in a dynamic yet deterministic way is an important issue for the upcoming billion-transistor multiprocessor SoC designs. To achieve this, we propose a memory management hierarchy we call Two-Level Memory Management. To implement this memory management scheme - which presents a paradigm shift in the way designers look at on-chip dynamic memory allocation - we present a System-on-a-Chip Dynamic Memory Management Unit (SoCDMMU) for allocation of the global on-chip memory, which we refer to as Level Two memory management (Level One is the operating system management of memory allocated to a particular on-chip Processing Element). In this way, processing elements (heterogeneous or non-heterogeneous hardware or software) in an SoC can request and be granted portions of the global memory in a fast and deterministic time (for example, in a four-processing-element SoC, dynamic allocation of the global on-chip memory takes sixteen cycles per allocation/deallocation in the worst case). In this paper, we show how to modify an existing Real-Time Operating System (RTOS) to support the proposed SoCDMMU. Our example shows that a multiprocessor SoC that utilizes the SoCDMMU has a 440% overall speedup of the application transition time over a fully shared memory design that does not utilize the SoCDMMU.
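To make the idea of deterministic global allocation concrete, here is a minimal software sketch. It is not the SoCDMMU design: the fixed-size block granularity, the class name `GlobalBlockAllocator`, and its methods are assumptions made for illustration only. The point it shows is that when the global memory is divided into a fixed number of blocks, each request needs at most one bounded scan of the ownership table, so the worst-case cost is known in advance.

```python
class GlobalBlockAllocator:
    """Toy model of a global memory divided into fixed-size blocks.

    Hypothetical illustration; block granularity is an assumption.
    """

    def __init__(self, num_blocks):
        # owner[i] is None when block i is free, else the id of the owning PE.
        self.owner = [None] * num_blocks

    def allocate(self, pe_id, count):
        """Grant `count` blocks to `pe_id`; return their indices, or None."""
        # One bounded pass over the table: cost depends only on num_blocks.
        free = [i for i, o in enumerate(self.owner) if o is None]
        if len(free) < count:
            return None  # not enough free blocks; nothing is changed
        granted = free[:count]
        for i in granted:
            self.owner[i] = pe_id
        return granted

    def deallocate(self, pe_id, blocks):
        """Release `blocks`, refusing any block not owned by `pe_id`."""
        if any(self.owner[i] != pe_id for i in blocks):
            raise ValueError("block not owned by requesting PE")
        for i in blocks:
            self.owner[i] = None
```

In this sketch the blocks handed to a processing element need not be contiguous; how a real unit maps them into each processor's address space is outside what the abstract states.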
ACM Transactions on Embedded Computing Systems | 2014
Abdullah Elewi; Mohamed Shalan; Medhat H. Awadalla; E. M. Saad
Asymmetric multiprocessor systems are considered power-efficient multiprocessor architectures. Furthermore, efficient task allocation (partitioning) can achieve greater energy efficiency on these asymmetric multiprocessor platforms. This article addresses the problem of energy-aware static partitioning of periodic real-time tasks on asymmetric multiprocessor (multicore) embedded systems. The article formulates the problem according to the Dynamic Voltage and Frequency Scaling (DVFS) model supported by the platform and shows that it is an NP-hard problem. Then, the article outlines optimal reference partitioning techniques for each case of DVFS model with suitable assumptions. Finally, the article proposes modifications to the traditional bin-packing techniques and designs novel techniques taking into account the DVFS model supported by the platform. All algorithms and techniques are simulated and compared. The simulation shows promising results, where the proposed techniques reduced the energy consumption by 75% compared to traditional methods when DVFS is not supported and by 50% when per-core DVFS is supported by the platform.
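For readers unfamiliar with the traditional bin-packing baseline that the article modifies, the following is a minimal sketch of first-fit decreasing partitioning. It is not one of the article's proposed techniques and is not energy-aware. The assumptions are mine: each task is described by its utilization on a reference core of speed 1.0, a core of speed `s` sees that task as utilization `u / s`, and a core is considered schedulable while its total utilization stays at or below 1.0 (the EDF bound for implicit-deadline periodic tasks).

```python
def first_fit_decreasing(task_utils, core_speeds):
    """Partition tasks onto cores with first-fit decreasing.

    task_utils:  utilization of each task on a reference core (speed 1.0).
    core_speeds: relative speed of each core (assumed model).
    Returns a list of task-index lists, one per core, or None if some
    task does not fit on any core.
    """
    loads = [0.0] * len(core_speeds)
    assignment = [[] for _ in core_speeds]
    # Consider the heaviest tasks first.
    order = sorted(range(len(task_utils)),
                   key=lambda i: task_utils[i], reverse=True)
    for i in order:
        for c, speed in enumerate(core_speeds):
            demand = task_utils[i] / speed  # utilization as seen by core c
            if loads[c] + demand <= 1.0 + 1e-12:  # small float tolerance
                loads[c] += demand
                assignment[c].append(i)
                break
        else:
            return None  # task i fits on no core
    return assignment
```

For example, with `task_utils = [0.5, 0.4, 0.3, 0.2]` and `core_speeds = [1.0, 0.5]`, the two heaviest tasks fill the fast core and the remaining two exactly fill the slow one.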
parallel, distributed and network-based processing | 2015
Anup Patel; Mai Daftedar; Mohamed Shalan; M. Watheq El-Kharashi
Virtualization technology has shown immense popularity within embedded systems due to its direct relationship with cost reduction, better resource utilization, and higher performance measures. Efficient hypervisors are required to achieve such high performance measures in virtualized environments, while taking into consideration the low memory footprints as well as the stringent timing constraints of embedded systems. Although there are a number of open-source hypervisors available, such as Xen, Linux KVM, and OKL4 Microvisor, this is the first paper to present the open-source embedded hypervisor eXtensible Versatile hypervisor (Xvisor) and compare it against two commonly used hypervisors, KVM and Xen, in terms of factors that affect overall system performance. Experimental results on the ARM architecture demonstrate Xvisor's lower CPU overhead, higher memory bandwidth, lower lock synchronization latency, and lower virtual timer interrupt overhead, and thus overall enhanced virtualized embedded system performance.
design, automation, and test in europe | 2011
Mona Safar; M. Watheq El-Kharashi; Mohamed Shalan; Ashraf Salem
Several approaches have been proposed to accelerate the NP-complete Boolean Satisfiability problem (SAT) using reconfigurable computing. In this paper, we present a five-stage pipelined SAT solver. SAT solving is broken into five stages: variable decision, variable effect fetch, clause evaluation, conflict detection, and conflict analysis. The solver performs a novel search algorithm combining advanced techniques from state-of-the-art SAT solvers: non-chronological backjumping, dynamic backtracking, and learning without explicit traversal of the implication graph. SAT instance information is stored in FPGA block RAMs, avoiding synthesis overhead for each instance. The proposed solver achieves up to 70× speedup over other hardware SAT solvers with 200× less resource utilization.
design, automation, and test in europe | 2007
Mona Safar; Mohamed Shalan; M.W. El-Kharashi; Ashraf Salem
Several approaches have been proposed to accelerate the NP-complete Boolean satisfiability problem (SAT) using reconfigurable computing. We present an FPGA-based clause evaluator, where each clause is modeled as a shift register that is right-shifted, left-shifted, or left unchanged according to whether the currently assigned variable value satisfies, unsatisfies, or does not affect the clause, respectively. For a given problem instance, the effect of the value of each of its variables on its SAT formula is loaded into the FPGA on-chip memory. This results in less configuration effort and fewer hardware resources than other available SAT solvers. Also, we present a new approach for implementing conflict analysis based on a conflicting-variables accumulator and a priority encoder to determine the backtrack level. Using these two new ideas, we implement an FPGA-based SAT solver performing depth-first search with non-chronological conflict-directed backtracking. We compare our SAT solver with other solvers on instances from the DIMACS benchmark suite.
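As background for what a clause evaluator has to decide, here is a small software sketch of clause evaluation under a partial assignment. It does not model the shift-register hardware described above; it only shows the outcomes such an evaluator distinguishes. Literals use the DIMACS convention (a positive integer is a variable, a negative integer is its negation); the function name and the four status strings are assumptions for illustration.

```python
def evaluate_clause(clause, assignment):
    """Classify a clause under a partial assignment.

    clause:     list of non-zero ints in DIMACS style, e.g. [1, -2, 3].
    assignment: dict mapping variable number -> bool; unassigned
                variables are simply absent.
    Returns "satisfied", "conflict", "unit", or "unresolved".
    """
    unassigned = 0
    for lit in clause:
        value = assignment.get(abs(lit))
        if value is None:
            unassigned += 1
        elif value == (lit > 0):
            return "satisfied"  # one true literal satisfies the clause
    if unassigned == 0:
        return "conflict"       # every literal is false
    if unassigned == 1:
        return "unit"           # exactly one literal left to decide
    return "unresolved"
```

A "conflict" result is what would trigger conflict analysis and backtracking in a search-based solver.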
Intelligent Decision Technologies | 2009
H. Shokry; M. Shedeed; Sherif Hammad; Mohamed Shalan; A. Wahdan
Controller Area Network (CAN) is widely used in real-time automobile control and is gaining wider acceptance as a standard for automotive networking. The applicability of Earliest Deadline First (EDF) techniques to the scheduling of CAN messages has been shown in previous research. EDF can guarantee higher network utilization than fixed-priority schemes such as Deadline Monotonic (DM) or Rate Monotonic (RM), but its continuous update of deadlines (priorities) at each scheduling round results in high CPU overhead. The paper describes a way to decrease this CPU overhead by implementing the EDF scheduler in dedicated hardware and embedding it within the open-core CAN controller IP. Consequently, a large reduction in CPU overhead is achieved. This paper also validates the design and implementation of the hardware EDF algorithm on a commonly used CAN controller connected in an SoC design. Hence, this paper can be considered as introducing a new generation of more efficient CAN controllers to be used in several industry domains.
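The EDF selection rule itself is simple: among all pending messages, send the one whose deadline is nearest. The sketch below shows that rule in software using a priority queue keyed on absolute deadline. It is only an illustration of the policy, not the paper's hardware scheduler; the class name `EdfMessageQueue`, its methods, and the use of abstract time units are assumptions.

```python
import heapq


class EdfMessageQueue:
    """Pending-message queue that always yields the earliest deadline.

    Hypothetical illustration of the EDF policy only.
    """

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: equal deadlines leave in FIFO order

    def submit(self, msg_id, absolute_deadline):
        """Queue a message with its absolute deadline (any time unit)."""
        heapq.heappush(self._heap, (absolute_deadline, self._seq, msg_id))
        self._seq += 1

    def next_message(self):
        """Remove and return the id of the most urgent message, or None."""
        if not self._heap:
            return None
        _deadline, _seq, msg_id = heapq.heappop(self._heap)
        return msg_id
```

On a CAN bus, arbitration is decided by the message identifier, so an EDF scheme has to keep the transmitted priority consistent with the remaining time to deadline; that repeated update is the overhead the abstract refers to.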
Intelligent Decision Technologies | 2009
Mohamed Shalan; Dina El-Sissy
Power management is essential in microprocessor-based system design to avoid heat dissipation and preserve battery lifetime. In mobile devices running real-time applications, the power consumed by the CPU can be more than the amount needed for best performance. At first glance, one can think of reducing the power budget of the system. Although this may seem an appealingly easy solution, it can cause a huge degradation in performance if not controlled properly. Also, a real-time application's requirements vary with time, which implies that the power budget should also be variable. There must be a trade-off between power management and performance such that the power consumed is always proportional to the required performance level. DVFS (Dynamic Voltage and Frequency Scaling) can be used to change the operating point (voltage-frequency pair) of the CPU according to the application's power requirement. In this paper, we present an implementation of a negative feedback control algorithm that uses DVFS for power saving in soft real-time systems that run on the Mentor Graphics® Nucleus® RTOS. A monitor periodically calculates the CPU utilization at runtime and reports it to the controller, which adjusts the CPU operating point online such that the best performance is achieved with the least power consumption. Our experimental results show that power savings of up to 24% can be achieved just by using the proposed DVFS on a platform that supports only frequency scaling.
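The monitor-and-controller loop described above can be sketched as a single control step. This is a generic utilization-driven feedback rule, not the algorithm from the paper: the function name `next_frequency`, the 0.8 target utilization, and the list of discrete frequencies are all assumptions. Given the utilization measured at the current frequency, it estimates the work demanded and picks the lowest available frequency that would bring utilization back to the target.

```python
def next_frequency(current_mhz, utilization, available_mhz, target_util=0.8):
    """One step of a simple utilization-feedback DVFS controller.

    current_mhz:   frequency during the last monitoring period.
    utilization:   measured CPU utilization in that period, 0.0 to 1.0.
    available_mhz: discrete frequencies the platform supports (assumed).
    target_util:   utilization set point (assumed value).
    """
    # Busy cycles per microsecond actually consumed in the last period.
    demand_mhz = current_mhz * utilization
    # Frequency at which that demand would yield the target utilization.
    needed_mhz = demand_mhz / target_util
    for f in sorted(available_mhz):
        if f >= needed_mhz:
            return f
    # Demand exceeds every operating point: run as fast as possible.
    return max(available_mhz)
```

The feedback is negative in the usual sense: utilization above the set point pushes the frequency up, which lowers utilization in the next period, and utilization below it pushes the frequency down.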
International Journal of Computer Applications | 2013
E. M. Saad; Abdullah Elewi; Mohamed Shalan; Medhat H. Awadalla
Task mapping plays a crucial role in saving energy in asymmetric multiprocessor platforms. This paper considers the problem of energy-aware static mapping of periodic real-time dependent tasks sharing resources on asymmetric multi/many-core embedded systems. The paper extends an existing synchronization-aware bin-packing (BP) variant for the case where full-chip dynamic voltage and frequency scaling (DVFS) is supported by the asymmetric multicore platform. Then, the paper proposes another BP variant for the case where DVFS is not supported. The simulation results showed that the proposed BP variant can reduce energy consumption significantly in the presence of shared resources.
international conference on pervasive and embedded computing and communication systems | 2015
Osama Mabrouk Khaled; Hoda Mohammed Hosny; Mohamed Shalan
An efficient development strategy for pervasive computing requires that smart object manufacturers design their devices with profound facilities that are accessible to developers. In our in-progress research, we present a high-level design for essential smart object handlers. This design establishes rules and regulations for the development of pervasive computing in general and promotes quality in pervasive systems in particular.