Publication


Featured research published by Maged Ghoneima.


International Conference on Computer-Aided Design | 2005

Serial-link bus: a low-power on-chip bus architecture

Maged Ghoneima; Yehea I. Ismail; Muhammad M. Khellah; James W. Tschanz; Vivek De

As technology scales, the shrinking wire width increases the interconnect resistivity, while the decreasing interconnect spacing significantly increases the coupling capacitance. This paper proposes reducing the number of bus lines of the conventional parallel-line bus (CB) architecture by multiplexing every m bits onto a single line. This bus architecture, the serial-link bus (SLB), transforms an n-bit conventional parallel-line bus into an n/m-line serial-link bus. The advantage of serial-link buses is that they have fewer lines, so if the total bus width is kept the same, each line can have a larger width and spacing. Increasing the line width reduces the line resistance in two ways: the wire cross-section grows, and the resistivity of sub-100 nm wires drops significantly as the line width increases. Increasing the line width and spacing also reduces the coupling capacitance between adjacent lines, but increases the line-to-ground capacitance. Thus, an optimum degree of multiplexing m exists that minimizes the bus energy dissipation and maximizes the bus throughput per unit area. The paper determines the optimum degree of multiplexing for maximum throughput per unit area and for minimum energy dissipation in the 25-130 nm technologies. HSPICE simulations show that, for the same throughput per unit area as conventional parallel-line buses, the serial-link bus architecture reduces the energy dissipation by up to 31.42% for a 64-bit bus implemented in an intermediate metal layer of a 50 nm technology, and a reduction of 52.7% is projected for the 25 nm technology.
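
The geometric trade-off described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name, the parameters `bus_width_um` and `parallel_clock_ghz`, and the assumption that each serial line must signal m times faster to match the parallel throughput are all hypothetical simplifications for illustration.

```python
def serial_link_geometry(n_bits, m, bus_width_um, parallel_clock_ghz):
    """Return (lines, pitch_um, line_rate_gbps) for m-to-1 multiplexing.

    Assumptions (not from the paper): the total bus width is fixed, the
    pitch (line width plus spacing) is shared equally among the lines,
    and each serial line runs m times faster to keep the same throughput.
    """
    if m < 1 or n_bits % m != 0:
        raise ValueError("m must be a positive divisor of n_bits")
    lines = n_bits // m                       # n-bit bus becomes n/m lines
    pitch_um = bus_width_um / lines           # fewer lines, wider pitch each
    line_rate_gbps = m * parallel_clock_ghz   # per-line signaling rate
    return lines, pitch_um, line_rate_gbps


# Example with made-up dimensions: a 64-bit bus in a 32 um wide track.
for m in (1, 2, 4, 8):
    print(m, serial_link_geometry(64, m, 32.0, 1.0))
```

The sketch only shows why a larger m frees up width and spacing per line; the actual optimum m depends on the resistance and capacitance models the paper derives.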


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006

Formal derivation of optimal active shielding for low-power on-chip buses

Maged Ghoneima; Yehea I. Ismail; Muhammad M. Khellah; James W. Tschanz; Vivek De

Passive shielding has been used to reduce the capacitive coupling effects of adjacent bus lines by inserting passive ground or power lines (shields) between them. Active shielding is another shielding technique in which the shield is allowed to switch depending on the switching pattern of its adjacent bus lines. This paper formally derives the optimal active shielding logic function for minimum power dissipation. It is also shown that this optimal active shielding architecture depends on the ratio of coupling to ground capacitance (γ = C_c/C_g). Optimal active shielding is shown to provide up to 25% reduction in bus power dissipation compared to conventional passive shielding. A suboptimal active shielding architecture with simpler hardware is also proposed. Theoretically, using the suboptimal shielding architecture leads to less than 6% bus power penalty compared to the optimal active shielding logic circuit. However, due to the simpler shield encoding circuitry, simulation results show that the suboptimal active shielding architecture leads to higher overall energy savings compared to the optimal active shielding architectures.


IEEE Transactions on Circuits and Systems | 2009

Serial-Link Bus: A Low-Power On-Chip Bus Architecture

Maged Ghoneima; Yehea I. Ismail; Muhammad M. Khellah; James W. Tschanz; Vivek De

As technology scales, the shrinking wire width increases the interconnect resistivity, while the decreasing interconnect spacing significantly increases the coupling capacitance. This paper proposes reducing the number of bus lines of the conventional parallel-line bus (CB) architecture by multiplexing every m bits onto a single line. This bus architecture, the serial-link bus (SLB), transforms an n-bit conventional parallel-line bus into an n/m-line serial-link bus. The advantage of serial-link buses is that they have fewer lines, so if the total bus width is kept the same, each line can have a larger width and spacing. Increasing the line width reduces the line resistance in two ways: the wire cross-section grows, and the resistivity of sub-100 nm wires drops significantly as the line width increases. Increasing the line width and spacing also reduces the coupling capacitance between adjacent lines, but increases the line-to-ground capacitance. Thus, an optimum degree of multiplexing m exists that minimizes the bus energy dissipation and maximizes the bus throughput per unit area. The paper determines the optimum degree of multiplexing for maximum throughput per unit area and for minimum energy dissipation in the 25-130 nm technologies. HSPICE simulations show that, for the same throughput per unit area as conventional parallel-line buses, the serial-link bus architecture reduces the energy dissipation by up to 31.42% for a 64-bit bus implemented in an intermediate metal layer of a 50 nm technology, and a reduction of 52.7% is projected for the 25 nm technology.


IEEE Transactions on Circuits and Systems | 2006

Reducing the Effective Coupling Capacitance in Buses Using Threshold Voltage Adjustment Techniques

Maged Ghoneima; Yehea I. Ismail; Muhammad M. Khellah; James W. Tschanz; Vivek De

This paper proposes a bus architecture that improves the performance and/or power dissipation of on-chip buses. The proposed architecture reduces the delay on alternate lines by lowering the threshold voltage of their devices. Furthermore, the resulting shift in signal switching times on adjacent lines reduces the worst-case coupling capacitance. Two implementations of this bus architecture are proposed, the alternate-Vt and the alternate forward-body-biased schemes, and are compared to a conventional bus scheme. For a flop distance of 1800 μm, the proposed schemes use the gained delay slack to reduce the total device width, thereby reducing the energy dissipation by 31.2%. For a 500-ps cycle time, the proposed bus schemes increase the maximum distance between flip-flops by 33%.


International Conference on Computer Design | 2005

A skewed repeater bus architecture for on-chip energy reduction in microprocessors

Muhammad M. Khellah; Maged Ghoneima; James W. Tschanz; Yibin Ye; Nasser A. Kurd; Javed Barkatullah; Srikanth Nimmagadda; Yehea I. Ismail; Vivek De

This paper proposes a bus architecture called skewed repeater bus (SRB) for reducing on-chip interconnect energy in microprocessors. By introducing relative delay between neighboring bus lines, SRB reduces both the average and the worst-case coupling capacitance between those lines. SRB is compared to previously published techniques such as delayed data bus (DDB) and delayed clock bus (DCB). Simulation results in a 65-nm process show that a bus energy reduction of 18% is achieved when SRB is applied to a real microprocessor example, versus only 11% and 7% for DDB and DCB, respectively.


International Symposium on Circuits and Systems | 2004

Low power coupling-based encoding for on-chip buses

Maged Ghoneima; Yehea I. Ismail

This paper proposes a low-power encoding scheme for on-chip buses. The encoding scheme is based on the relative switching between bus lines rather than the self-switching of individual lines, because the coupling capacitance dominates the total capacitance in current deep-submicron (DSM) technologies. The proposed low-power coupling-based encoding technique, CBBI, provides a reduction in power dissipation of more than 27%. A common misconception in calculating the relative switching activity coefficients is also pointed out and corrected.


International Symposium on Circuits and Systems | 2011

A 12Gbps all digital low power SerDes transceiver for on-chip networking

Sally Safwat; Ezz El-Din O. Hussein; Maged Ghoneima; Yehea I. Ismail

In this paper, a new self-timed signaling technique for reliable low-power on-chip SerDes (serialization and deserialization) links is presented. The transmitter serializes 8 parallel bits at 1.5 GHz and multiplexes the 12 Gbps serial data stream with a 24 GHz clock on a single line using three-level signaling. This new signaling technique enables the receiver to recover the clock from the data with simple phase-detector circuitry. Moreover, the technique is insensitive to jitter accumulated during signal propagation or at the receiver input, because the clock signal is extracted from the multiplexed data stream. Hence, timing errors in the received signal are reflected in both the data and the extracted clock, and the data will still be sampled correctly. The SerDes transceiver was implemented for a 3 mm long lossy on-chip differential transmission line in 65 nm TSMC CMOS technology. A primary advantage of building an all-digital SerDes transceiver is the ease of scaling with technology, along with the reduction in power and area. The total power consumed by the Tx/Rx pair with the transmission line is 15.5 mW, which is very small compared to similar published signaling architectures.
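
The rate figures quoted in this abstract can be cross-checked with a short calculation. This is only a sketch: the function name is hypothetical, and the energy-per-bit value is derived here from the quoted power and data rate; it is not a figure stated in the abstract.

```python
def serdes_figures(parallel_bits, parallel_clock_hz, total_power_w):
    """Return (line_rate_bps, energy_per_bit_j) for a serializing link."""
    # Serializing k bits per parallel clock cycle gives k times the clock rate.
    line_rate_bps = parallel_bits * parallel_clock_hz
    # Average energy per transferred bit, assuming the link is always busy.
    energy_per_bit_j = total_power_w / line_rate_bps
    return line_rate_bps, energy_per_bit_j


rate, e_bit = serdes_figures(8, 1.5e9, 15.5e-3)
print(f"{rate / 1e9:.1f} Gbps, {e_bit * 1e12:.2f} pJ/bit")
```

With the quoted numbers, 8 bits at 1.5 GHz gives the 12 Gbps line rate, and 15.5 mW at that rate works out to roughly 1.3 pJ per bit under the always-busy assumption.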


International Symposium on Quality Electronic Design | 2006

Reducing the Data Switching Activity on Serial Link Buses

Maged Ghoneima; Yehea I. Ismail; Muhammad M. Khellah; Vivek De

On-chip serial-link buses have previously been proposed as a strong solution for reducing the complexity and/or the energy dissipation of on-chip interconnect fabrics. However, serializing m bits onto a single interconnect (serial link) increases the overall data switching activity. This paper presents a quantitative analysis of the switching activity of serial links and provides closed-form expressions for the average activity factors. Two transition encoding schemes that reduce the activity factor of serial links are discussed and analyzed. The impact of the encoding schemes on the MCF between neighboring interconnects is also discussed. The analysis shows that both schemes significantly reduce the average activity factor and the energy dissipation, but each in a different range of input activity factors. The two encoded bus schemes were modeled in a 70 nm CMOS technology and compared to an unencoded serial-link bus and a parallel-line bus. Simulation results show that the transition-encoded bus schemes reduce the overall energy dissipation of the unencoded serial-link bus by up to 96%.
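
The effect of serialization on switching activity can be seen with a small transition counter. This is an illustrative sketch only: the function names and the LSB-first serialization order are assumptions, and it does not implement the paper's transition encoding schemes or its closed-form expressions.

```python
def parallel_transitions(words, n_bits):
    """Total toggles summed over all n_bits lines of a parallel bus."""
    mask = (1 << n_bits) - 1
    return sum(bin((prev ^ cur) & mask).count("1")
               for prev, cur in zip(words, words[1:]))


def serialize(words, n_bits):
    """Flatten words into one bit stream (LSB first, an assumed order)."""
    return [(w >> i) & 1 for w in words for i in range(n_bits)]


def serial_transitions(words, n_bits):
    """Toggles on a single serial line carrying the same words."""
    stream = serialize(words, n_bits)
    return sum(a != b for a, b in zip(stream, stream[1:]))


# A constant word never toggles the parallel lines, but its internal
# bit pattern toggles the serial line on every bit period.
constant = [0b1010] * 3
print(parallel_transitions(constant, 4), serial_transitions(constant, 4))
```

The constant-word case is the extreme one: temporal correlation that keeps parallel lines quiet is lost once bits of different significance share a wire, which is the kind of extra activity that encoding on the serial link aims to remove.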


International Symposium on Circuits and Systems | 2004

Effect of relative delay on the dissipated energy in coupled interconnects

Maged Ghoneima; Yehea I. Ismail

This paper presents a comprehensive qualitative and analytical study of the effect of relative delay on the energy dissipated due to relative switching of on-chip bus lines. A closed-form expression modeling the effect of relative delay on the dissipated energy is also presented. Introducing a relative delay between signals on adjacent bus lines is found to affect only the switching cases in which the neighboring lines switch in the same or in opposite directions. The delay is shown to provide a reduction of up to 50% in the energy dissipated due to relative switching in the worst switching case. This observation can be exploited in low-power on-chip bus schemes.
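
The 50% figure can be reproduced with an idealized model, sketched below. The model is an assumption for illustration and is not the paper's closed-form expression: it considers only the coupling capacitor, treats the drivers as linear resistive paths (so a step of ΔV across a capacitor C dissipates ½·C·ΔV²), and assumes the relative delay is long enough for the first transition to settle before the second begins.

```python
def coupling_loss(c_c, vdd, same_direction, staggered):
    """Energy dissipated through the coupling capacitor c_c when two
    adjacent lines both switch (idealized model, see assumptions above)."""
    def step_loss(delta_v):
        # Loss when the voltage across the capacitor changes by delta_v.
        return 0.5 * c_c * delta_v ** 2

    if staggered:
        # Lines switch one after the other: the capacitor voltage changes
        # by vdd twice, whichever directions the two lines take.
        return 2 * step_loss(vdd)
    if same_direction:
        return step_loss(0.0)      # voltage across c_c never changes
    return step_loss(2 * vdd)      # opposite edges: a single 2*vdd step


worst = coupling_loss(1.0, 1.0, same_direction=False, staggered=False)
delayed = coupling_loss(1.0, 1.0, same_direction=False, staggered=True)
print(delayed / worst)
```

In this model the delay halves the loss for opposite-direction switching (the worst case) and adds loss for same-direction switching, which is consistent with the abstract's statement that only those two cases are affected.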


International Symposium on Low Power Electronics and Design | 2006

Time-borrowing multi-cycle on-chip interconnects for delay variation tolerance

Keith A. Bowman; James W. Tschanz; Muhammad M. Khellah; Maged Ghoneima; Yehea I. Ismail; Vivek De

Insertion of time-borrowing (TB) flip-flops in multi-cycle repeater-based on-chip interconnects enables significant improvements in mean performance and energy by averaging systematic and random within-die (WID) delay variations across multiple interconnect segments. A statistically based analytical model is derived to design TB N-cycle interconnects with optimal delay variation tolerance. The model elucidates how the transparency window required to achieve data delay averaging depends on the delay variation mismatch between interconnect segments. Statistical circuit simulations and analyses in a 65 nm process technology demonstrate that TB multi-cycle interconnects enable a 4-6% mean maximum clock frequency (FMAX) improvement and a corresponding 10% average energy savings over optimally designed multi-cycle interconnects with conventional master-slave flip-flops. The maximum mean FMAX benefit ranges from 4.0% to 7.5%, corresponding to approximately a bin-split shift in the FMAX distribution. For 1.41X larger WID delay variations, the maximum mean FMAX gain rises to 5-10%.
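
The averaging effect behind time borrowing can be illustrated with a small Monte Carlo sketch. It is a simplification under stated assumptions, not the paper's statistical model: segment delays are independent Gaussian samples, the transparency window is treated as unlimited, flip-flop overheads are ignored, and the function and parameter names are hypothetical.

```python
import random


def mean_cycle_times(n_segments, nominal, sigma, trials, seed=0):
    """Return (conventional, time_borrowing) mean minimum cycle times.

    Conventional flip-flops: every segment must fit in one cycle, so the
    slowest segment sets the cycle time. Idealized time borrowing: slack
    is shared, so the average segment delay sets the cycle time.
    """
    rng = random.Random(seed)
    worst_total = 0.0
    average_total = 0.0
    for _ in range(trials):
        delays = [rng.gauss(nominal, sigma) for _ in range(n_segments)]
        worst_total += max(delays)
        average_total += sum(delays) / n_segments
    return worst_total / trials, average_total / trials


conventional, borrowed = mean_cycle_times(4, 1.0, 0.05, 2000)
print(f"FMAX gain: {conventional / borrowed - 1:.1%}")
```

Because the average of a set of delays can never exceed its maximum, the idealized time-borrowing cycle time is never longer than the conventional one; the realistic gain is smaller once the finite transparency window and systematic variation are modeled.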

Collaboration

Top Co-Authors

Yehea I. Ismail
American University in Cairo

Sally Safwat
Northwestern University