Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Sudhanva Gurumurthi is active.

Publication


Featured research published by Sudhanva Gurumurthi.


international symposium on computer architecture | 2003

DRPM: dynamic speed control for power management in server class disks

Sudhanva Gurumurthi; Anand Sivasubramaniam; Mahmut T. Kandemir; Hubertus Franke

A large portion of the power budget in server environments goes into the I/O subsystem - the disk array in particular. Traditional approaches to disk power management involve completely stopping the disk rotation, which can take a considerable amount of time, making them less useful in cases where idle times between disk requests may not be long enough to outweigh the overheads. This paper presents a new approach called DRPM to modulate disk speed (RPM) dynamically, and gives a practical implementation to exploit this mechanism. Extensive simulations with different workload and hardware parameters show that DRPM can provide significant energy savings without compromising much on performance. This paper also discusses practical issues when implementing DRPM on server disks.
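The core idea above, varying RPM with load instead of stopping the disk entirely, can be sketched as a simple speed-selection policy. This is an illustrative model only: the RPM levels, the cubic power model, and the queue-based heuristic are assumptions for the sketch, not details from the paper.

```python
# Illustrative DRPM-style policy: choose a disk speed from a set of
# supported RPM levels based on the pending request queue, so an idle
# or lightly loaded disk spins slowly and saves rotational power.

FULL_RPM = 12000
RPM_LEVELS = [3600, 5400, 7200, 9600, 12000]  # assumed supported speeds

def power_watts(rpm):
    # Rotational power grows roughly with the cube of RPM (simplified model).
    return 12.0 * (rpm / FULL_RPM) ** 3

def pick_rpm(queue_length, max_queue=4):
    """Ramp speed up as the request queue grows; a full queue forces full speed."""
    if queue_length >= max_queue:
        return RPM_LEVELS[-1]
    idx = int(queue_length / max_queue * (len(RPM_LEVELS) - 1))
    return RPM_LEVELS[idx]
```

Under this toy model, an idle disk at 3600 RPM draws under 3% of the full-speed rotational power, which is why modulating speed can pay off even when idle periods are too short for a full spin-down.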


high-performance computer architecture | 2011

Relaxing non-volatility for fast and energy-efficient STT-RAM caches

Clinton Wills Smullen; Vidyabhushan Mohan; Anurag Nigam; Sudhanva Gurumurthi; Mircea R. Stan

Spin-Transfer Torque RAM (STT-RAM) is an emerging non-volatile memory technology and a potential universal memory that could replace SRAM in processor caches. This paper presents a novel approach for redesigning STT-RAM memory cells to reduce their high dynamic write energy and long write latencies. We lower the retention time by reducing the planar area of the cell, thereby reducing the write current, which we then use with CACTI to design caches and memories. We simulate quad-core processor designs using a combination of SRAM- and STT-RAM-based caches. Since ultra-low-retention STT-RAM may lose data, we also provide a preliminary evaluation of a simple, DRAM-style refresh policy. We found that a pure STT-RAM cache hierarchy provides the best energy efficiency, though a hybrid design of SRAM-based L1 caches with reduced-retention STT-RAM L2 and L3 caches eliminates performance loss while still reducing the energy-delay product by more than 70%.
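The retention/write-energy trade-off exploited here can be quantified with the standard single-barrier thermal model: mean retention time is roughly τ0·exp(Δ), where Δ is the thermal stability factor and τ0 ≈ 1 ns is the conventional attempt period. The sketch below is a back-of-the-envelope calculation under that model, not the paper's methodology.

```python
import math

TAU0 = 1e-9  # attempt period, ~1 ns (conventional assumption)

def retention_seconds(delta):
    """Mean retention time from the thermal stability factor Δ (simplified model)."""
    return TAU0 * math.exp(delta)

def delta_for_retention(t_seconds):
    """Invert the model: Δ needed to retain data for t_seconds on average."""
    return math.log(t_seconds / TAU0)
```

For example, decade-scale retention needs Δ ≈ 40, while millisecond-scale retention (viable for a refreshed cache) needs only Δ ≈ 14. Since the critical write current scales roughly with Δ, relaxing retention this far can cut write current by nearly 3x, which is the lever the paper's cell redesign pulls.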


high-performance computer architecture | 2002

Using complete machine simulation for software power estimation: the SoftWatt approach

Sudhanva Gurumurthi; Anand Sivasubramaniam; Mary Jane Irwin; Narayanan Vijaykrishnan; Mahmut T. Kandemir

Power dissipation has become one of the most critical factors for the continued development of both high-end and low-end computer systems. We present a complete system power simulator, called SoftWatt, that models the CPU, memory hierarchy, and a low-power disk subsystem and quantifies the power behavior of both the application and the operating system. This tool, built on top of the SimOS infrastructure, uses validated analytical energy models to identify the power hotspots in the system components, capture the relative contributions of user and kernel code to the system power profile, identify power-hungry operating system services, and characterize the variance in the kernel power profile with respect to workload. Our results using the SPECjvm98 benchmark suite emphasize the importance of complete system simulation for understanding the power impact of the architecture and operating system on application execution.


international symposium on computer architecture | 2007

Dynamic prediction of architectural vulnerability from microarchitectural state

Kristen R. Walcott; Greg Humphreys; Sudhanva Gurumurthi

Transient faults due to particle strikes are a key challenge in microprocessor design. Driven by exponentially increasing transistor counts, per-chip faults are a growing burden. To protect against soft errors, redundancy techniques such as redundant multithreading (RMT) are often used. However, these techniques assume that the probability that a structural fault will result in a soft error (i.e., the Architectural Vulnerability Factor (AVF)) is 100 percent, unnecessarily draining processor resources. Due to the high cost of redundancy, there have been efforts to throttle RMT at runtime. To date, these methods have not incorporated an AVF model and therefore tend to be ad hoc. Unfortunately, computing the AVF of complex microprocessor structures (e.g., the ISQ) can be quite involved. To provide probabilistic guarantees about fault tolerance, we have created a rigorous characterization of AVF behavior that can be easily implemented in hardware. We experimentally demonstrate AVF variability within and across the SPEC2000 benchmarks and identify strong correlations between structural AVF values and a small set of processor metrics. Using these simple indicators as predictors, we create a proof-of-concept RMT implementation that demonstrates that AVF prediction can be used to maintain a low fault tolerance level without significant performance impact.
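The abstract's key observation is that AVF correlates strongly with a small set of simple processor metrics, so a cheap predictor can gate expensive redundancy. Below is a hypothetical linear predictor in that spirit; the metric names, weights, and threshold are invented for illustration and are not the paper's fitted model.

```python
# Hypothetical linear AVF predictor gating redundant multithreading (RMT).
# Weights and metrics are illustrative placeholders, not the paper's values.

WEIGHTS = {"occupancy": 0.6, "ipc": -0.2}  # assumed: fuller structures -> higher AVF
BIAS = 0.1

def predict_avf(metrics):
    """Estimate the Architectural Vulnerability Factor in [0, 1]."""
    avf = BIAS + sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)
    return min(max(avf, 0.0), 1.0)

def rmt_enabled(metrics, threshold=0.3):
    """Turn RMT on only when the predicted AVF exceeds the tolerance threshold."""
    return predict_avf(metrics) > threshold
```

The design point is that the predictor's inputs are counters the hardware already maintains, so the runtime cost of deciding when to protect is negligible compared to always-on RMT.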


dependable systems and networks | 2003

ICR: in-cache replication for enhancing data cache reliability

Wei Zhang; Sudhanva Gurumurthi; Mahmut T. Kandemir; Anand Sivasubramaniam

Processor caches already play a critical role in the performance of today's computer systems. At the same time, the integrity of the data coming out of the caches can have serious consequences on the ability of a program to execute correctly, or even to proceed. Integrity checks need to be performed in a time-sensitive manner so as not to slow down execution in the common, error-free case, and should not excessively increase the power budget of the caches, which is already high. The ECC- and parity-based protection techniques in use today fall at either extreme, compromising one criterion for the other, i.e., reliability for performance or vice versa. This paper proposes a novel solution to this problem: in-cache replication, wherein reliability can be enhanced without excessively slowing down cache accesses or requiring a significant increase in area cost. The mechanism is also fairly power-efficient in comparison to the alternatives. In particular, the solution replicates data that is in active use within the cache itself, while evicting data that may not be needed in the near future. Our experiments show that with this optimization, a large fraction of the data read from the cache has replicas available.
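The mechanism can be illustrated with a minimal functional model: actively used lines carry a replica (held, in the real design, in lines that would otherwise be dead), and a read compares the two copies to detect corruption. The class below is a toy sketch of that behavior, not the paper's hardware organization.

```python
# Toy model of in-cache replication (ICR): each actively written line keeps
# a replica, and reads cross-check the copies to detect a corrupted word.

class ICRCache:
    def __init__(self):
        self.primary = {}   # addr -> word
        self.replica = {}   # addr -> word (in hardware: stored in "dead" lines)

    def write(self, addr, word):
        self.primary[addr] = word
        self.replica[addr] = word  # replicate data in active use

    def read(self, addr):
        word = self.primary[addr]
        if addr in self.replica and self.replica[addr] != word:
            raise RuntimeError(f"mismatch at {addr:#x}: corrupted word detected")
        return word
```

A simple comparison on read is cheaper and faster than ECC decode, which is why replication can protect hot data without stretching the cache's critical path.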


international symposium on computer architecture | 2005

Disk Drive Roadmap from the Thermal Perspective: A Case for Dynamic Thermal Management

Sudhanva Gurumurthi; Anand Sivasubramaniam; Vivek Natarajan

The importance of pushing the performance envelope of disk drives continues to grow, not just in the server market but also in numerous consumer electronics products. One of the most fundamental factors impacting disk drive design is heat dissipation and its effect on drive reliability, since high temperatures can cause off-track errors, or even head crashes. Until now, drive manufacturers have continued to meet the 40% annual growth target of internal data rates (IDR) by increasing RPMs and shrinking platter sizes, both of which have counteracting effects on the heat dissipation within a drive. As this paper shows, we are getting to a point where it is becoming very difficult to stay on this roadmap. This paper presents an integrated disk drive model that captures the close relationships between capacity, performance, and thermal characteristics over time. Using this model, we quantify the drop-off in IDR growth rates over the next decade if we are to adhere to the thermal envelope of drive design. We present two mechanisms for buying back some of this IDR loss with dynamic thermal management (DTM). The first DTM technique exploits any available thermal slack, between what the drive was designed to support and the currently lower operating temperature, to ramp up the RPM. The second DTM technique assumes that the drive is only designed for average-case behavior, thus allowing higher RPMs than the thermal envelope permits, and employs dynamic throttling of disk drive activity to remain within this envelope.
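Both DTM policies reduce to a feedback loop on drive temperature: ramp RPM when there is thermal slack, throttle when the envelope is exceeded. The control loop below is a sketch of that logic; the envelope, hysteresis band, RPM bounds, and step size are all made-up numbers for illustration.

```python
# Sketch of a DTM control step (all constants illustrative): exploit thermal
# slack by ramping RPM, and throttle (slow down) past the thermal envelope.

ENVELOPE_C = 45.0   # assumed thermal envelope of the drive
SLACK_BAND = 2.0    # hysteresis: only ramp up when comfortably below envelope

def dtm_step(temp_c, rpm, full_rpm=15000, step=600):
    """Return (new_rpm, throttled) for one control interval."""
    if temp_c > ENVELOPE_C:
        return max(rpm - step, 3600), True        # over envelope: throttle
    if temp_c < ENVELOPE_C - SLACK_BAND:
        return min(rpm + step, full_rpm), False   # thermal slack: ramp up
    return rpm, False                             # in band: hold speed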


IEEE Computer | 2003

Reducing disk power consumption in servers with DRPM

Sudhanva Gurumurthi; Anand Sivasubramaniam; Mahmut T. Kandemir; Hubertus Franke

Although effective techniques exist for tackling disk power for laptops and workstations, applying them in a server environment presents a considerable challenge, especially under stringent performance requirements. Using a dynamic rotations per minute approach to speed control in server disk arrays can provide significant savings in I/O system power consumption without lessening performance.


ieee international conference on high performance computing data and analytics | 2013

Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults

Vilas Sridharan; Jon Stearley; Nathan DeBardeleben; Sean Blanchard; Sudhanva Gurumurthi

Several recent publications confirm that faults are common in high-performance computing systems. Therefore, further attention to the faults experienced by such computing systems is warranted. In this paper, we present a study of DRAM and SRAM faults in large high-performance computing systems. Our goal is to understand the factors that influence faults in production settings. We examine the impact of aging on DRAM, finding a marked shift from permanent to transient faults in the first two years of DRAM lifetime. We examine the impact of DRAM vendor, finding that fault rates vary by more than 4x among vendors. We examine the physical location of faults in a DRAM device and in a data center; contrary to prior studies, we find no correlations with either. Finally, we study the impact of altitude and rack placement on SRAM faults, finding that, as expected, altitude has a substantial impact on SRAM faults, and that top of rack placement correlates with 20% higher fault rate.


architectural support for programming languages and operating systems | 2015

Memory Errors in Modern Systems: The Good, The Bad, and The Ugly

Vilas Sridharan; Nathan DeBardeleben; Sean Blanchard; Kurt Brian Ferreira; Jon Stearley; John Shalf; Sudhanva Gurumurthi

Several recent publications have shown that hardware faults in the memory subsystem are commonplace. These faults are predicted to become more frequent in future systems that contain orders of magnitude more DRAM and SRAM than found in current memory subsystems. These memory subsystems will need to provide resilience techniques to tolerate these faults when deployed in high-performance computing systems and data centers containing tens of thousands of nodes. Therefore, it is critical to understand the efficacy of current hardware resilience techniques to determine whether they will be suitable for future systems. In this paper, we present a study of DRAM and SRAM faults and errors from the field. We use data from two leadership-class high-performance computer systems to analyze the reliability impact of hardware resilience schemes that are deployed in current systems. Our study has several key findings about the efficacy of many currently deployed reliability techniques, such as DRAM ECC, DDR address/command parity, and SRAM ECC and parity. We also perform a methodological study, and find that counting errors instead of faults, a common practice among researchers and data center operators, can lead to incorrect conclusions about system reliability. Finally, we use our data to project the needs of future large-scale systems. We find that SRAM faults are unlikely to pose a significantly larger reliability threat in the future, while DRAM faults will be a major concern and stronger DRAM resilience schemes will be needed to maintain acceptable failure rates similar to those found on today's systems.
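The methodological point about counting errors versus faults is easy to see concretely: one persistent fault (say, a stuck DRAM cell) can be logged as many correctable errors, one per scrub or access, inflating the apparent unreliability of that device. A minimal sketch, assuming a log of error records keyed by node and DRAM address (field names are illustrative):

```python
# Collapse an error log into distinct faults: records that repeat the same
# (node, address) pair are treated as one underlying fault, not many.

def count_faults(error_log):
    return len({(e["node"], e["addr"]) for e in error_log})

log = [
    {"node": "n1", "addr": 0x1000},  # one stuck cell, reported on every scrub
    {"node": "n1", "addr": 0x1000},
    {"node": "n1", "addr": 0x1000},
    {"node": "n2", "addr": 0x2000},  # a second, unrelated fault
]
assert len(log) == 4           # four logged errors...
assert count_faults(log) == 2  # ...but only two faults
```

A raw error count would rank node n1 as three times less reliable than n2, when in fact each has exactly one fault, which is the kind of distortion the paper warns against.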


international symposium on low power electronics and design | 2011

Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM)

Anurag Nigam; Clinton Wills Smullen; Vidyabhushan Mohan; Eugene Chen; Sudhanva Gurumurthi; Mircea R. Stan

Spin-Transfer Torque RAM (STT-RAM) has emerged as a potential candidate for universal memory. However, there are two challenges to using STT-RAM in memory system design: (1) the intrinsic variation in the storage element, the Magnetic Tunnel Junction (MTJ), and (2) the high write energy. In this paper, we present a physically based thermal noise model for simulating the statistical variations of MTJs. We have implemented it in HSPICE and validated it against analytical results. We demonstrate its use in setting the write pulse width for a given write error rate. We then propose two write-energy reduction techniques. At the device level, we propose the use of a low-Ms ferromagnetic material that can reduce the write energy without sacrificing retention time. At the architecture level, we show that Invert Coding provides a 7% average reduction in the total write energy for the SPEC CPU2006 benchmark suite without any performance overhead.
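Invert coding in the bus-invert style works as follows: if writing the new word would flip more than half the bits relative to what is already stored, store the complement plus a one-bit invert flag instead, bounding the flips at half the word width. The sketch below illustrates that idea on 8-bit words; it omits details of the paper's actual scheme (such as tracking the stored word's own flag) for brevity.

```python
# Sketch of bus-invert-style invert coding for write-energy reduction:
# bound bit flips per write at WIDTH/2 by optionally storing the complement.

WIDTH = 8
MASK = (1 << WIDTH) - 1

def encode(stored, new_word):
    """Return (bits_to_store, invert_flag) minimizing flips vs. `stored`."""
    flips = bin((stored ^ new_word) & MASK).count("1")
    if flips > WIDTH // 2:
        return (~new_word) & MASK, 1   # complement flips fewer bits
    return new_word, 0

def decode(stored, flag):
    """Recover the logical word from the stored bits and invert flag."""
    return (~stored) & MASK if flag else stored
```

Since each bit flip in STT-RAM costs write current, capping flips per word translates directly into dynamic write-energy savings, with only one extra flag bit per word of overhead.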

Collaboration


Dive into Sudhanva Gurumurthi's collaborations.

Top Co-Authors

Anand Sivasubramaniam (Pennsylvania State University)

Mahmut T. Kandemir (Pennsylvania State University)

Mary Jane Irwin (Pennsylvania State University)