Publication


Featured research published by Raid Ayoub.


Design Automation Conference | 2009

PDRAM: a hybrid PRAM and DRAM main memory system

Gaurav Dhiman; Raid Ayoub; Tajana Simunic Rosing

In this paper, we propose PDRAM, a novel energy efficient main memory architecture based on phase change random access memory (PRAM) and DRAM. The paper explores the challenges involved in incorporating PRAM into the main memory hierarchy of computing systems, and proposes a low overhead hybrid hardware-software solution for managing it. Our experimental results indicate that our solution is able to achieve average energy savings of 30% at negligible overhead over conventional memory architectures.


Design Automation Conference | 2013

Dynamic voltage and frequency scaling for shared resources in multicore processor designs

Xi Chen; Zheng Xu; Hyungjun Kim; Paul V. Gratz; Jiang Hu; Michael Kishinevsky; Umit Y. Ogras; Raid Ayoub

As the core count in processor chips grows, so do the on-die shared resources, such as the on-chip communication fabric and shared cache, which are of paramount importance for chip performance and power. This paper presents a method for dynamic voltage/frequency scaling of networks-on-chip and last level caches in multicore processor designs, where the shared resources form a single voltage/frequency domain. Several new techniques for monitoring and control are developed and validated through full system simulations on the PARSEC benchmarks. These techniques reduce energy-delay product by 56% compared to state-of-the-art prior work.


International Symposium on Low Power Electronics and Design | 2009

Predict and act: dynamic thermal management for multi-core processors

Raid Ayoub; Tajana Simunic Rosing

In this paper, we propose a proactive dynamic thermal management scheme for chip multiprocessors that run multi-threaded workloads. We introduce a new predictor that utilizes the band-limited property of the temperature frequency spectrum. A big advantage of our predictor is that, unlike ARMA [7], it does not require a costly training phase. Our thermal management scheme incorporates temperature prediction information and runtime workload characterization to perform efficient thermally aware scheduling. Our results show that applying our algorithm considerably improves the average system temperature, hottest core temperature, product MTTF, and performance, by 6 °C, 8 °C, 41%, and 72%, respectively.


International Symposium on Low Power Electronics and Design | 2011

OS-level power minimization under tight performance constraints in general purpose systems

Raid Ayoub; Umit Y. Ogras; Eugene Gorbatov; Yanqin Jin; Timothy Kam; Paul S. Diefenbaugh; Tajana Simunic Rosing

We propose a new DVFS algorithm for enterprise systems that elevates performance to a first-order control parameter and manages frequency and voltage as a function of performance requirements. We implement our algorithm on a real Intel Westmere platform in Linux and demonstrate its ability to reduce the standard deviation from target performance by more than 90% over state-of-the-art policies while reducing average power by 17%.


Design, Automation, and Test in Europe | 2010

GentleCool: cooling aware proactive workload scheduling in multi-machine systems

Raid Ayoub; Shervin Sharifi; Tajana Simunic Rosing

In state-of-the-art systems, workload scheduling and server fan speed operate independently, leading to cooling inefficiencies. We propose GentleCool, a proactive multi-tier approach for significantly lowering fan cooling costs without compromising performance. Our technique manages the fan speed by intelligently allocating the workload across different machines. The experimental results show that our approach delivers average cooling energy savings of 72% and improves the mean time between failures (MTBF) of the fans by 2.3X compared to the state of the art.


Asia and South Pacific Design Automation Conference | 2005

A unified transformational approach for reductions in fault vulnerability, power, and crosstalk noise and delay on processor buses

Raid Ayoub; Alex Orailoglu

In this paper we propose a coding scheme for general-purpose applications that can reduce power dissipation, crosstalk noise and crosstalk delay on the bus lines while simultaneously detecting errors at run time. The reduction in power dissipation can be achieved through reducing the bus switching activity. Not only is the switching activity in individual lines reduced but so is the coupling activity across the adjacent lines, the major contributor to the overall power dissipation in deep submicron technology. Detailed analysis of crosstalk noise and delay shows that eliminating certain patterns of transitions and reducing the infeasible ones in terms of crosstalk noise and power dissipation is a feasible strategy for alleviating these problems. We propose an encoding technique consisting of the use of predefined patterns of transitions, one for each possible combination of input data, to generate the codewords. The restriction to the predefined patterns of transitions enables fast encoding and low hardware overhead. This work presents an extensive analysis of the consequent reduction in crosstalk and power. SPICE derived experimental results show a reduction in worst case crosstalk delay and noise, ranging up to 24% and 10% respectively. Extensive experimental results for various applications show significant reduction in power dissipation ranging up to 44% for switching activity on the bus lines and up to 25% for coupling activity. The results also show a drastic reduction ranging up to 98% in the number of patterns that are most likely to produce crosstalk errors.
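
The two quantities this abstract distinguishes, switching activity on individual lines and coupling activity across adjacent lines, can be made concrete with a small counting sketch. This is not the paper's encoding scheme; the function names and the coupling weight (0, 1, or 2 per adjacent pair, depending on whether the two lines move together, only one moves, or they move in opposite directions) are assumptions chosen purely for illustration.

```python
def bits(word, width):
    """Return the bus word as a list of bits, least significant line first."""
    return [(word >> i) & 1 for i in range(width)]


def switching_activity(words, width):
    """Count individual line toggles between consecutive bus words."""
    mask = (1 << width) - 1
    return sum(bin((a ^ b) & mask).count("1") for a, b in zip(words, words[1:]))


def coupling_activity(words, width):
    """Count coupling transitions between adjacent lines.

    Assumed model: each line's transition is -1, 0, or +1, and an adjacent
    pair contributes the absolute difference of its two transitions.
    """
    total = 0
    for a, b in zip(words, words[1:]):
        delta = [new - old for old, new in zip(bits(a, width), bits(b, width))]
        total += sum(abs(delta[i] - delta[i + 1]) for i in range(width - 1))
    return total


if __name__ == "__main__":
    trace = [0b0000, 0b1111, 0b0101]  # hypothetical 4-line bus trace
    print(switching_activity(trace, 4))  # 6
    print(coupling_activity(trace, 4))   # 3
```

In this toy trace, the first transition toggles all four lines in the same direction, so it costs four line toggles but no coupling transitions under the assumed model, which is why the two metrics have to be tracked separately.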


High Performance Computer Architecture | 2012

JETC: Joint energy thermal and cooling management for memory and CPU subsystems in servers

Raid Ayoub; Rajib Nath; Tajana Simunic Rosing

In this work we propose a joint energy, thermal and cooling management technique (JETC) that significantly reduces per-server cooling and memory energy costs. Our analysis shows that decoupling the optimization of cooling energy of CPU & memory from the optimization of memory energy leads to suboptimal solutions due to thermal dependencies between CPU and memory and non-linearity in cooling energy. This motivates us to develop a holistic solution that integrates energy, thermal and cooling management to maximize energy savings with a negligible performance hit. JETC considers the thermal and power states of CPU & memory, the thermal coupling between them, and fan speed to arrive at energy efficient decisions. It has CPU and memory actuators to implement its decisions. The memory actuator reduces the energy of memory by performing cooling aware clustering of memory pages to a subset of memory modules. The CPU actuator saves cooling energy by reducing the hot spots between and within the CPU sockets and minimizing the effects of thermal coupling. Our experimental results show that employing JETC results in a 50.7% average energy reduction in the cooling and memory subsystems with less than 0.3% performance overhead.


Design, Automation, and Test in Europe | 2012

TempoMP: integrated prediction and management of temperature in heterogeneous MPSoCs

Shervin Sharifi; Raid Ayoub; Tajana Simunic Rosing

Heterogeneous Multi-Processor Systems on a Chip (MPSoCs) are more complex from a thermal perspective than homogeneous MPSoCs because of their inherent imbalance in power density. In this work we develop TempoMP, a new technique for thermal management of heterogeneous MPSoCs which leverages multi-parametric optimization along with our novel thermal predictor, Tempo. TempoMP is able to deliver locally optimal dynamic thermal management decisions to meet thermal constraints while minimizing power and maximizing performance. It leverages our Tempo predictor which, unlike previous techniques, can estimate the impact of future power state changes at negligible overhead. Our experiments show that, compared to the state of the art, Tempo can reduce the maximum prediction error by up to an order of magnitude. Our experiments with heterogeneous MPSoCs also show that TempoMP meets thermal constraints while reducing the average task lateness by 2.5X and the energy-lateness product by 5X compared to state-of-the-art techniques.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Temperature Aware Dynamic Workload Scheduling in Multisocket CPU Servers

Raid Ayoub; Krishnam Raju Indukuri; Tajana Simunic Rosing

In this paper, we propose a multitier approach for significantly lowering the cooling costs associated with fan subsystems without compromising system performance. Our technique manages the fan speed by intelligently allocating the workload at the core level as well as at the CPU socket level. At the core level we propose a proactive dynamic thermal management scheme. We introduce a new predictor that utilizes the band-limited property of the temperature frequency spectrum. A big advantage of our predictor is that it does not require a costly training phase and still maintains high accuracy. At the socket level, we use a control-theoretic approach to develop a stable scheduler that reduces the cooling costs further by providing a better thermal distribution. Our thermal management scheme incorporates runtime workload characterization to perform efficient thermally aware scheduling. The experimental results show that our approach delivers average cooling energy savings of 80% compared to state-of-the-art techniques. The reported results also show that our formal technique maintains stability while heuristic solutions fail in this respect.


International Symposium on Low Power Electronics and Design | 2010

Energy efficient proactive thermal management in memory subsystem

Raid Ayoub; Krishnam Raju Indukuri; Tajana Simunic Rosing

Energy management of the memory subsystem is challenging due to performance and thermal constraints. Big energy gains can be obtained by clustering memory accesses; however, this also leads to a higher need for cooling due to higher temperatures in active areas of memory. Our solution to the memory thermal management problem is based on proactive thermal management that intelligently allocates workload pages to a few memory units and powers down the rest of the memory. Our experimental results show that this approach improves energy savings by 43% and reduces performance overhead by 85% with respect to state-of-the-art policies.

Collaboration


Top Co-Authors

Umit Y. Ogras

Arizona State University

Alex Orailoglu

University of California

Ujjwal Gupta

Arizona State University
