Publication


Featured research published by Amr Suleiman.


signal processing systems | 2014

Energy-efficient HOG-based object detection at 1080HD 60 fps with multi-scale support

Amr Suleiman; Vivienne Sze

In this paper, we present a real-time and energy-efficient multi-scale object detector using Histogram of Oriented Gradients (HOG) features and Support Vector Machine (SVM) classification. Parallel detectors with balanced workloads are used to process multiple scales and to increase the throughput so that voltage scaling can be applied to reduce energy consumption. Image pre-processing is also introduced to further reduce the power and area cost of generating the image scales. This design can operate on high-definition 1080HD video at 60 fps in real time with a clock rate of 270 MHz, and consumes 45.3 mW (0.36 nJ/pixel) based on post-layout simulations. The ASIC has an area of 490 kgates and 0.538 Mbit of on-chip memory in a 45 nm SOI CMOS process.
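The per-pixel energy quoted in the abstract can be cross-checked from the reported power and frame rate. The following is a minimal sketch, assuming that "1080HD" means 1920×1080 pixels per frame; the function name is illustrative and does not come from the paper.

```python
def energy_per_pixel_nj(power_mw, width, height, fps):
    """Energy per processed pixel in nanojoules, from average power and video format."""
    pixels_per_second = width * height * fps
    return (power_mw * 1e-3) / pixels_per_second * 1e9


# Power, frame rate, and clock rate are taken from the abstract;
# the 1920x1080 frame size is an assumption for "1080HD".
hog_nj = energy_per_pixel_nj(45.3, 1920, 1080, 60)
cycles_per_pixel = 270e6 / (1920 * 1080 * 60)
print(f"{hog_nj:.2f} nJ/pixel, {cycles_per_pixel:.2f} clock cycles per pixel")
```

Under that assumption the result rounds to the stated 0.36 nJ/pixel, and the 270 MHz clock leaves a little over two cycles per input pixel.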


signal processing systems | 2016

An Energy-Efficient Hardware Implementation of HOG-Based Object Detection at 1080HD 60 fps with Multi-Scale Support

Amr Suleiman; Vivienne Sze

A real-time and energy-efficient hardware implementation of a multi-scale object detector is presented in this paper. Detection is done using Histogram of Oriented Gradients (HOG) features and Support Vector Machine (SVM) classification. Multi-scale detection is essential for robust, practical applications that must detect objects of different sizes. Parallel detectors with balanced workloads are used to increase the throughput, enabling voltage scaling and a reduction in energy consumption. Image pre-processing is also introduced to further reduce the power and area costs of generating the image scales. This design can operate on high-definition 1080HD video at 60 fps in real time with a clock rate of 270 MHz, and consumes 45.3 mW (0.36 nJ/pixel) based on post-layout simulations. The ASIC has an area of 490 kgates and 0.538 Mbit of on-chip memory in a 45 nm SOI CMOS process.
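For readers unfamiliar with HOG features, the core step is a magnitude-weighted histogram of gradient orientations over a small cell of pixels. The sketch below is a simplified software illustration only: the 9-bin unsigned-orientation layout, the central-difference gradient, and the function name `cell_histogram` are assumptions made for the example, not details of the hardware described in the abstract.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    `cell` is a list of rows of pixel intensities. Only interior pixels vote,
    so that central differences never read outside the cell.
    """
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (cell[y][x + 1] - cell[y][x - 1]) / 2.0
            gy = (cell[y + 1][x] - cell[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            # Fold the angle into [0, 180): opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), n_bins - 1)
            hist[index] += magnitude
    return hist


# A horizontal intensity ramp has a purely horizontal gradient.
ramp = [[float(x) for x in range(8)] for _ in range(8)]
print(cell_histogram(ramp))
```

In a full detector, histograms from neighbouring cells would be normalized, concatenated into a feature vector, and scored by a linear SVM; those stages are omitted here.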


IEEE Transactions on Circuits and Systems | 2014

Model Predictive Control Equalization for High-Speed I/O Links

Amr Suleiman; Ranko Sredojevic; Vladimir Stojanovic

In this work, we formulate a new, nonlinear, and time-variant transmitter equalization method based on the Model Predictive Control (MPC) algorithm. MPC is a class of control algorithms in which the current control action is obtained by solving, perhaps approximately, an online open-loop optimal control problem. One important advantage of MPC in a peak-power-constrained link environment is its ability to cope with hard constraints on controls and states. Knowing the state of the channel enables very fine nonlinear equalization. We use this flexibility to create various MPC formulations that control the entire eye mask, the received-signal dynamic range, and the required quantization. Our MPC equalization significantly outperforms traditional transmitter techniques such as linear feed-forward and Tomlinson-Harashima equalizers, and comes very close to optimized decision-feedback equalization at lower transmitter resolutions. We also describe possible complexity-reduction techniques that enable efficient hardware implementation of our MPC algorithm.
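The receding-horizon idea behind MPC can be illustrated with a toy transmitter. The sketch below is not the formulation from the paper: it assumes a known FIR channel, a small set of allowed transmit levels (which enforces the peak constraint and the transmitter resolution), a squared-error cost, and an exhaustive search over a short horizon. All names and parameters are hypothetical.

```python
from itertools import product


def mpc_equalize(symbols, channel, levels, horizon=2):
    """Choose transmit values by receding-horizon search over allowed levels.

    At each step, every level sequence over the horizon is tried, the one that
    minimizes the squared error at the receiver is kept, and only its first
    value is actually transmitted before the search is repeated.
    """
    memory = len(channel) - 1
    past = [0.0] * memory  # previously transmitted values, most recent first
    transmitted = []
    for k in range(len(symbols)):
        targets = symbols[k:k + horizon]
        best_cost, best_first = None, None
        for candidate in product(levels, repeat=len(targets)):
            state = list(past)
            cost = 0.0
            for u, target in zip(candidate, targets):
                # Received sample: main tap plus inter-symbol interference.
                isi = sum(c * s for c, s in zip(channel[1:], state))
                cost += (channel[0] * u + isi - target) ** 2
                state = ([u] + state)[:memory]
            if best_cost is None or cost < best_cost:
                best_cost, best_first = cost, candidate[0]
        transmitted.append(best_first)
        past = ([best_first] + past)[:memory]
    return transmitted


levels = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hard peak constraint |u| <= 1
symbols = [1, -1, -1, 1, 1, -1]
print(mpc_equalize(symbols, [1.0, 0.5], levels))
```

Restricting the search to `levels` is what makes the constraint hard: no candidate outside the allowed range is ever considered. A practical design would replace the exhaustive search with a far cheaper solver.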


international conference on microelectronics | 2010

ASIC Implementation of Cairo University SPARC “CUSPARC” embedded processor

Amr Suleiman; Alhassan F. Khedr; S. E.-D. Habib

The Cairo University SPARC “CUSPARC” processor is an embedded processor IP core conforming to the SPARC V8 ISA. CUSPARC was developed entirely at Cairo University and is the first Egyptian processor. This paper describes the ASIC implementation and verification of the CUSPARC processor at the 130 nm technology node. CUSPARC achieves a typical clock frequency of 260 MHz, a power dissipation of 0.11 mW/MHz, and a power efficiency of 8.78 DMIPS/mW, which makes it well suited to embedded and real-time systems.
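The three figures in the abstract can be combined to estimate total power and benchmark throughput at the typical clock. This is a back-of-the-envelope sketch that assumes the power and efficiency figures apply at 260 MHz; the derived values are not stated in the source.

```python
clock_mhz = 260.0        # typical clock frequency from the abstract
power_mw_per_mhz = 0.11  # power dissipation from the abstract
dmips_per_mw = 8.78      # power efficiency from the abstract

# Derived quantities (assumption: all three figures hold at the same operating point).
power_mw = power_mw_per_mhz * clock_mhz
dmips_per_mhz = dmips_per_mw * power_mw_per_mhz
dmips = dmips_per_mw * power_mw

print(f"{power_mw:.1f} mW, {dmips_per_mhz:.2f} DMIPS/MHz, {dmips:.0f} DMIPS")
```

Under that assumption the core would draw about 28.6 mW and deliver roughly 0.97 DMIPS/MHz, or about 251 DMIPS in total.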


international symposium on circuits and systems | 2017

Towards closing the energy gap between HOG and CNN features for embedded vision (Invited paper)

Amr Suleiman; Yu-Hsin Chen; Joel S. Emer; Vivienne Sze

Computer vision enables a wide range of applications in robotics/drones, self-driving cars, smart Internet of Things, and portable/wearable electronics. For many of these applications, local embedded processing is preferred due to privacy and/or latency concerns. Accordingly, energy-efficient embedded vision hardware delivering real-time and robust performance is crucial. While deep learning is gaining popularity in several computer vision algorithms, a significant energy consumption difference exists compared to traditional hand-crafted approaches. In this paper, we provide an in-depth analysis of the computation, energy and accuracy trade-offs between learned features such as deep Convolutional Neural Networks (CNN) and hand-crafted features such as Histogram of Oriented Gradients (HOG). This analysis is supported by measurements from two chips that implement these algorithms. Our goal is to understand the source of the energy discrepancy between the two approaches and to provide insight about the potential areas where CNNs can be improved and eventually approach the energy-efficiency of HOG while maintaining its outstanding performance accuracy.


symposium on vlsi circuits | 2016

A 58.6mW real-time programmable object detector with multi-scale multi-object support using deformable parts model on 1920×1080 video at 30fps

Amr Suleiman; Zhengdong Zhang; Vivienne Sze

This paper presents a programmable, energy-efficient, real-time object detection accelerator using deformable parts models (DPM), with 2× higher accuracy than traditional rigid body models. With detection of 8 deformable parts, three methods are used to address the high computational complexity: classification pruning for 33× fewer part classifications, vector quantization for a 15× reduction in memory size, and feature basis projection for a 2× reduction in the cost of each classification. The chip is implemented in 65 nm CMOS technology and can process HD (1920×1080) images at 30 fps without any off-chip storage while consuming only 58.6 mW (0.94 nJ/pixel, 1168 GOPS/W). The chip has two classification engines to detect two different classes of objects simultaneously. With a tested high throughput of 60 fps, the classification engines can be time-multiplexed to detect more than two object classes. The chip is energy scalable: energy can be traded off by changing the pruning factor or disabling the parts classification.
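The efficiency figures in the abstract can be related to one another with simple arithmetic. The sketch below reproduces the quoted energy per pixel and then derives an implied operation rate; the derived GOPS and operations-per-pixel values are not given in the source and depend on how the authors count operations.

```python
power_w = 58.6e-3                    # chip power from the abstract
pixels_per_second = 1920 * 1080 * 30  # HD frames at 30 fps
gops_per_watt = 1168.0               # efficiency from the abstract

energy_nj_per_pixel = power_w / pixels_per_second * 1e9

# Derived values (assumption: the GOPS/W figure applies at this operating point).
implied_gops = gops_per_watt * power_w
ops_per_pixel = implied_gops * 1e9 / pixels_per_second

print(f"{energy_nj_per_pixel:.2f} nJ/pixel, "
      f"{implied_gops:.1f} GOPS, {ops_per_pixel:.0f} operations per pixel")
```

The first value rounds to the stated 0.94 nJ/pixel; the other two suggest on the order of a thousand operations per input pixel under the stated assumption.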


Sze | 2014

Energy-Efficient HOG-based Object Detection at 1080HD 60 fps with Multi-Scale Support

Amr Suleiman; Vivienne Sze


arXiv: Robotics | 2018

Navion: A 2mW Fully Integrated Real-Time Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones

Amr Suleiman; Zhengdong Zhang; Luca Carlone; Sertac Karaman; Vivienne Sze


Prof. Sze | 2018

Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones

Amr Suleiman; Zhengdong Zhang; Luca Carlone; Sertac Karaman; Vivienne Sze


custom integrated circuits conference | 2017

Hardware for machine learning: Challenges and opportunities

Vivienne Sze; Yu-Hsin Chen; Joel S. Emer; Amr Suleiman; Zhengdong Zhang

Collaboration


Dive into Amr Suleiman's collaborations.

Top Co-Authors

Vivienne Sze (Massachusetts Institute of Technology)
Zhengdong Zhang (Massachusetts Institute of Technology)
Yu-Hsin Chen (Massachusetts Institute of Technology)
Joel S. Emer (Massachusetts Institute of Technology)
Luca Carlone (Massachusetts Institute of Technology)
Sertac Karaman (Massachusetts Institute of Technology)
Ranko Sredojevic (Massachusetts Institute of Technology)
Vladimir Stojanovic (Massachusetts Institute of Technology)