Publication


Featured research published by Ahmed M. Shams.


IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 2000

A novel high-performance CMOS 1-bit full-adder cell

Ahmed M. Shams; Magdy A. Bayoumi

A novel 16-transistor CMOS 1-bit full-adder cell is proposed. It uses the low-power designs of the XOR and XNOR gates, pass transistors, and transmission gates. The cell offers higher speed and lower power consumption than standard implementations of the 1-bit full-adder cell. Eliminating an inverter from the critical path accounts for its high speed, while reducing the number and magnitude of the cell capacitances, together with eliminating the short-circuit power component, accounts for its low power consumption. Simulation results comparing the proposed cell to the standard implementations show its superiority. Different circuit structures and input patterns are used for simulation. Energy savings of up to 30% are achieved.
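
As a rough illustration of how a full adder can be built around XOR/XNOR signals and pass-style selection, the following logic-level sketch may help. It is not the paper's transistor netlist: the function name and the particular multiplexer formulation are assumptions chosen for clarity.

```python
def full_adder(a, b, cin):
    """Logic-level 1-bit full adder built from XOR/XNOR and 2:1 selection.

    Hypothetical sketch: models the Boolean behaviour only, not the
    16-transistor circuit described in the abstract.
    """
    h = a ^ b        # XOR of the operands
    h_bar = 1 - h    # XNOR of the operands
    # Sum = H XOR Cin, written as a selection controlled by H and its complement.
    s = (h & (1 - cin)) | (h_bar & cin)
    # Carry: if the operands differ, Cin propagates; otherwise Cout equals a (= b).
    cout = (h & cin) | (h_bar & a)
    return s, cout
```

Expressing both outputs as selections steered by the XOR/XNOR pair is what makes pass transistors and transmission gates a natural fit at the circuit level.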


IEEE Transactions on Signal Processing | 2006

NEDA: a low-power high-performance DCT architecture

Ahmed M. Shams; Archana Chidanandan; Wendi Pan; Magdy A. Bayoumi

Conventional distributed arithmetic (DA) is popular in application-specific integrated circuit (ASIC) design, and it features on-chip ROM to achieve high speed and regularity. In this paper, a new DA architecture called NEDA is proposed, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in digital signal processing (DSP) applications. Mathematical analysis proves that DA can implement the inner product of vectors in the form of two's complement numbers using only additions, followed by a small number of shifts at the final stage. Comparative studies show that NEDA outperforms widely used approaches such as multiply/accumulate (MAC) and DA in many aspects. Being a high-speed architecture free of ROM, multiplication, and subtraction, NEDA can also expose the redundancy existing in the adder array consisting of entries of 0 and 1. A hardware compression scheme is introduced to generate a butterfly structure with a minimum number of additions. NEDA-based architectures for an 8 × 8 discrete cosine transform (DCT) core are presented as an example. Savings exceeding 88% are achieved when the compression scheme is applied along with NEDA. Finite word-length simulations demonstrate the viability and excellent performance of NEDA.
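
The idea of computing an inner product from a 0/1 array, additions, and a few final shifts can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the NEDA hardware: it assumes the fixed coefficients are the values decomposed into two's complement bits, the names `bit_planes` and `da_inner_product` are hypothetical, and the sign-bit row is handled with a plain subtraction here, which the paper's architecture avoids.

```python
def bit_planes(coeffs, width):
    """Return a 0/1 matrix: row i holds bit i of each two's complement coefficient."""
    mask = (1 << width) - 1
    return [[((c & mask) >> i) & 1 for c in coeffs] for i in range(width)]


def da_inner_product(coeffs, xs, width):
    """Inner product sum(c * x) using additions of inputs plus final shifts.

    Assumes every coefficient fits in `width`-bit two's complement.
    """
    total = 0
    for i, row in enumerate(bit_planes(coeffs, width)):
        partial = 0
        for bit, x in zip(row, xs):
            if bit:
                partial += x          # the 0/1 entry only selects an input to add
        weighted = partial << i       # shift applies the bit weight 2**i
        if i == width - 1:
            total -= weighted         # sign bit carries negative weight (sketch only)
        else:
            total += weighted
    return total
```

For example, `da_inner_product([3, -2, 5, -7], [10, -4, 7, 2], width=4)` agrees with the directly computed inner product of the two vectors.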


Asilomar Conference on Signals, Systems and Computers | 1997

A structured approach for designing low power adders

Ahmed M. Shams; Magdy A. Bayoumi

A performance analysis of a general 1-bit full adder cell is presented. The adder cell is anatomized into smaller modules using the proposed structured approach. The modules are studied extensively, and several designs of each are shown. By connecting combinations of these module designs, we construct 24 different 1-bit full adder cells, some of which are novel circuits. Each of these cells exhibits different power consumption, speed, area, and driving capability figures. Some of the new cells outperform existing standard designs of the full adder cell.


International Symposium on Circuits and Systems | 1999

Performance evaluation of 1-bit CMOS adder cells

Ahmed M. Shams; Magdy A. Bayoumi

Evaluating the performance measures of a full adder cell, as with other circuits, is input-pattern dependent. The issue becomes more complicated when several parameters, such as time delay, area, power dissipation, and correct functionality, are evaluated at the same time. The proposed input test pattern is based on full coverage of all possible transitions from one input pattern to another. It is composed of two parts: the first is a 56-transition input pattern for speed measurement, followed by 9 different input patterns concatenated together for power consumption measurement. The proposed input test pattern verifies correct functionality and produces correct time delay and power dissipation figures. Using this input test pattern guarantees a correct and fair comparison among different full adder cells.
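
The figure of 56 transitions is consistent with counting every ordered pair of distinct input combinations for a three-input cell (8 × 7 = 56). That reading is an assumption, since the abstract does not spell out the derivation, and the ordering used in the paper is not given. A short sketch that enumerates such transitions, with the hypothetical name `all_transitions`:

```python
from itertools import permutations, product


def all_transitions():
    """Enumerate every transition between two distinct (A, B, Cin) input patterns."""
    patterns = list(product((0, 1), repeat=3))   # the 8 possible input combinations
    return list(permutations(patterns, 2))       # ordered pairs of distinct patterns


transitions = all_transitions()
```

Each element of `transitions` is a `(before, after)` pair that could be applied to a simulated cell to measure the delay of that particular input change.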


IEEE Computer Society Annual Symposium on VLSI | 2002

A low power high performance distributed DCT architecture

Ahmed M. Shams; Wendi Pan; Archana Chidanandan; Magdy A. Bayoumi

A new distributed arithmetic architecture, NEDA, is presented in this paper. NEDA is a low-power optimized architecture based on the distributed arithmetic paradigm. In addition to low power consumption, NEDA offers high speed and reduced area. In NEDA, the inner product computational module has been proved mathematically to require only additions. Moreover, a minimum number of additions is used by exploiting the redundancy in the adder array. These properties make a NEDA unit a basic computational module for high-performance DSP architectures. A case study of an 8×8 DCT NEDA-based architecture is analyzed. Savings exceeding 88% are achieved for the DCT implementation.
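
To give a feel for what exploiting redundancy in a 0/1 adder array can mean, the sketch below reuses the result of any row that repeats. This covers only the simplest kind of redundancy and is not the compression scheme of the paper; the function name and the way additions are counted are assumptions made for illustration.

```python
def shared_row_sums(bit_rows, xs):
    """Sum the inputs selected by each 0/1 row, computing repeated rows only once.

    Returns (sums, additions_used). Hypothetical sketch, not the paper's scheme.
    """
    cache = {}
    additions = 0
    sums = []
    for row in bit_rows:
        key = tuple(row)
        if key not in cache:
            selected = [x for bit, x in zip(row, xs) if bit]
            additions += max(len(selected) - 1, 0)   # k operands need k - 1 additions
            cache[key] = sum(selected)
        sums.append(cache[key])
    return sums, additions
```

With rows `[[1, 1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 1]]`, the third row repeats the first, so its sum is reused rather than recomputed.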


Great Lakes Symposium on VLSI | 1998

A new full adder cell for low-power applications

Ahmed M. Shams; Magdy A. Bayoumi

A new low-power CMOS 1-bit full adder cell is presented. It is based on recent designs of XOR and XNOR gates and pass transistors, and it has 17 transistors. This cell has been compared to two widely used efficient adder cells: the transmission function full adder cell (16 transistors) and the low-power adder cell (14 transistors). The new cell has no short-circuit power and lower dynamic power than the other adder cells, because of the smaller number and magnitude of its circuit capacitances. It consumes 10% to 15% less power than the other two cells. A comparative analysis (using Magic and Hspice) for 8-bit ripple-carry and carry-select adders shows that the adders based on the new cell can save up to 25% of power consumption.
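
For readers unfamiliar with how a 1-bit cell is used as a building block, a behavioural sketch of an 8-bit ripple-carry adder follows. It models only the logic, not power or timing, and the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """Behavioural 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=8, cin=0):
    """Chain `width` full adder cells; the carry ripples from bit 0 upward."""
    carry = cin
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```

Because every cell waits for the carry of the cell below it, the delay and power of the single cell directly shape the behaviour of the whole adder.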


Signal Processing Systems | 2000

A comparative analysis for low power motion estimation VLSI architectures

Mohamed A. Elgamel; Ahmed M. Shams; Magdy A. Bayoumi

Power consumption is critical for portable video applications. The largest portion of power is consumed in the motion estimation module, as it requires a huge amount of computation. This paper compares different full-search motion estimation architectures targeted for low power consumption. Each of the architectures is analyzed and then compared to the others. An architectural enhancement to further reduce the power consumption is proposed. Our approach is based on further elimination of useless computations without sacrificing throughput or optimality. Different benchmarks are used to test and compare the discussed architectures. Analytical and simulation results show the effectiveness of the enhancement.
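
To show why full-search motion estimation is so computation-heavy, here is a minimal software sketch of full-search block matching using the sum of absolute differences (SAD). The SAD criterion, the function names, and the parameters (`n` for block size, `p` for search range) are assumptions for illustration; the abstract does not specify them.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced candidate."""
    return sum(
        abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, bx, by, n, p):
    """Try every displacement in [-p, p] x [-p, p]; return (best_sad, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue                      # candidate falls outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Every block is compared against up to (2p + 1)² candidates of n² pixels each, which is where the bulk of the work, and hence the power, goes.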


International Symposium on Circuits and Systems | 1998

A novel low-power building block CMOS cell for adders

Ahmed M. Shams; Magdy A. Bayoumi

A new low-power, high-speed CMOS 1-bit full adder cell is presented. It is based on recent designs of XOR and XNOR gates and pass transistors, and it has 16 transistors. This cell has been compared to two widely used efficient adder cells: the transmission function full adder cell (16 transistors) and the low-power adder cell (14 transistors). The new cell has no short-circuit power and lower dynamic power, because of the smaller number and magnitude of its circuit capacitances. It consumes up to 21% less power than the other two cells and is 12% to 20% faster. A comparative analysis (using Magic and Hspice) for 8-bit ripple-carry and carry-select adders shows that the adders based on the new cell can save up to 29% of power consumption.
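
The carry-select structure mentioned above can be sketched behaviourally as follows. Each block computes its result for both possible carry-in values, and the real carry then selects one. The block size of 4 and the function names are assumptions; the abstract does not describe how the 8-bit adders were partitioned.

```python
def add_block(x, y, width, cin):
    """Behavioural model of one adder block: returns (sum_bits, carry_out)."""
    total = x + y + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(x, y, width=8, block=4):
    """Carry-select addition: precompute each block for cin = 0 and cin = 1."""
    result, carry = 0, 0
    for lo in range(0, width, block):
        w = min(block, width - lo)
        mask = (1 << w) - 1
        xb, yb = (x >> lo) & mask, (y >> lo) & mask
        sum0, c0 = add_block(xb, yb, w, 0)    # speculative result for carry-in 0
        sum1, c1 = add_block(xb, yb, w, 1)    # speculative result for carry-in 1
        s, carry = (sum1, c1) if carry else (sum0, c0)   # the real carry selects
        result |= s << lo
    return result, carry
```

The duplicated blocks trade extra adder cells for a shorter carry path, which is one reason the power of the individual cell matters in this structure.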


International Symposium on Circuits and Systems | 2001

Enhanced low power motion estimation VLSI architectures for video compression

Mohamed A. Elgamel; Ahmed M. Shams; Xi Xueling; Magdy A. Bayoumi

Power consumption is critical for portable video applications. During compression, the largest portion of power is consumed in the motion estimation module, which requires a huge amount of computation. This paper presents an architectural enhancement to reduce the power consumption during full-search block-matching (FSBM) motion estimation without sacrificing throughput or optimality. The proposed approach achieves these power savings by disabling portions of the architecture that perform unnecessary computations. A comparison between our enhancement and others is presented based on simulation and analysis. Different benchmarks are used to test and compare the discussed architectures. Analytical and simulation results show the effectiveness of the enhancements.
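
One common software analogue of skipping unnecessary computation without losing optimality is to stop accumulating a candidate's distortion once it can no longer beat the best match found so far. The sketch below illustrates that idea only; it assumes a sum-of-absolute-differences cost, the function name is hypothetical, and the abstract does not say that the hardware enhancement works this way.

```python
def sad_with_cutoff(cur_block, ref_block, best_so_far):
    """Accumulate SAD row by row, stopping once the candidate cannot win.

    Returns (accumulated_sad, rows_processed). Since every term is
    non-negative, a partial sum >= best_so_far already rules the candidate out.
    """
    total = 0
    rows_done = 0
    for cur_row, ref_row in zip(cur_block, ref_block):
        total += sum(abs(c - r) for c, r in zip(cur_row, ref_row))
        rows_done += 1
        if total >= best_so_far:
            break          # remaining rows would be wasted work
    return total, rows_done
```

The search result is unchanged because a rejected candidate could never have become the minimum, while the rows that are skipped represent computation that need not be performed.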


Application-Specific Systems, Architectures, and Processors | 2000

A 108 Gbps, 1.5 GHz 1D-DCT architecture

Ahmed M. Shams; Magdy A. Bayoumi

A high-performance 1D-DCT architecture is proposed. It is based on the New Distributed Arithmetic Architecture algorithm (NEDA). Enhancements to NEDA are proposed to reduce the number of computations. Only addition operations are used, with 42 additions to compute the outputs for an 8×1 DCT. No subtractions, multiplications, or ROM are needed. High throughput is achieved by pipelining the architecture. In every clock cycle, it receives eight pixels (9 bits each) as inputs and produces eight DCT coefficients (14 bits each). The delay of one pipeline stage is the delay of a 3-level 4:2 compressor tree. The architecture is implemented in 0.35 μm technology; it runs at 1.5 GHz and processes 108 Gbps of image/video sequence data.
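
The 108 Gbps figure follows from the numbers in the abstract: eight 9-bit pixels per clock cycle at 1.5 GHz. A small check of that arithmetic, with a hypothetical helper name:

```python
def throughput_gbps(words_per_cycle, bits_per_word, clock_hz):
    """Data rate in Gbit/s for a pipeline that moves a fixed number of words per cycle."""
    return words_per_cycle * bits_per_word * clock_hz / 1e9


# Input side: eight 9-bit pixels per cycle at 1.5 GHz.
input_rate = throughput_gbps(8, 9, 1.5e9)
# Output side: eight 14-bit DCT coefficients per cycle at the same clock.
output_rate = throughput_gbps(8, 14, 1.5e9)
```

Here `input_rate` matches the quoted 108 Gbps, which suggests the figure refers to the input pixel stream; `output_rate` is the corresponding rate for the wider coefficient outputs.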

Collaboration


Researchers who have co-authored publications with Ahmed M. Shams.

Top Co-Authors

Magdy A. Bayoumi

University of Louisiana at Lafayette

Mohamed A. Elgamel

University of Louisiana at Lafayette

Wendi Pan

University of Louisiana at Lafayette

Archana Chidanandan

University of Louisiana at Lafayette

Archana Chandanandan

University of Louisiana at Lafayette

Cherif Aissi

University of Louisiana at Lafayette

Xi Xueling

University of Louisiana at Lafayette
