Mahmut T. Kandemir
Pennsylvania State University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Mahmut T. Kandemir.
IEEE Computer | 2003
Nam Sung Kim; Todd M. Austin; D. Baauw; Trevor N. Mudge; Krisztian Flautner; Jie S. Hu; Mary Jane Irwin; Mahmut T. Kandemir; Vijay Narayanan
Off-state leakage is static power, current that leaks through transistors even when they are turned off. The other source of power dissipation in todays microprocessors, dynamic power, arises from the repeated capacitance charge and discharge on the output of the hundreds of millions of gates in todays chips. Until recently, only dynamic power has been a significant source of power consumption, and Moores law helped control it. However, power consumption has now become a primary microprocessor design constraint; one that researchers in both industry and academia will struggle to overcome in the next few years. Microprocessor design has traditionally focused on dynamic power consumption as a limiting factor in system integration. As feature sizes shrink below 0.1 micron, static power is posing new low-power design challenges.
design automation conference | 2000
Wu Ye; Narayanan Vijaykrishnan; Mahmut T. Kandemir; Mary Jane Irwin
In this paper, we presen t the design and use of a comprehensiv e framework, SimplePower, for ev aluating the effect of high-level algorithmic, architectural, and compilation trade-offs on energy. An execution-driven, cycle-accurate RT lev el energy estimation tool that uses transition sensitive energy models forms the cornerstone of this framework. SimplePower also pro vides the energy consumed in the memory system and on-chip buses using analytical energy models. We presen t the use of SimplePower to evaluate the impact of a new selective gated pipeline register optimization, a high-level data transformation and a pow er-conscious post compilation optimization (register relabeling) on the datapath, memory and on-chip bus energy, respectively. We find that these three optimizations reduce the energy by 18-36% in the datapath, 62% in the memory system and 12% in the instruction cache data bus, respectively.
international symposium on computer architecture | 2003
Sudhanva Gurumurthi; Anand Sivasubramaniam; Mahmut T. Kandemir; Hubertus Franke
A large portion of the power budget in server environments goes into the I/O subsystem - the disk array in particular. Traditional approaches to disk power management involve completely stopping the disk rotation, which can take a considerable amount of time, making them less useful in cases where idle times between disk requests may not be long enough to outweigh the overheads. This paper presents a new approach called DRPM to modulate disk speed (RPM) dynamically, and gives a practical implementation to exploit this mechanism. Extensive simulations with different workload and hardware parameters show that DRPM can provide significant energy savings without compromising much on performance. This paper also discusses practical issues when implementing DRPM on server disks.
international symposium on computer architecture | 2006
Feihui Li; Chrysostomos Nicopoulos; Thomas Richardson; Yuan Xie; Vijaykrishnan Narayanan; Mahmut T. Kandemir
Long interconnects are becoming an increasingly important problem from both power and performance perspectives. This motivates designers to adopt on-chip network-based communication infrastructures and three-dimensional (3D) designs where multiple device layers are stacked together. Considering the current trends towards increasing use of chip multiprocessing, it is timely to consider 3D chip multiprocessor design and memory networking issues, especially in the context of data management in large L2 caches. The overall goal of this paper is to study the challenges for L2 design and management in 3D chip multiprocessors. Our first contribution is to propose a router architecture and a topology design that makes use of a network architecture embedded into the L2 cache memory. Our second contribution is to demonstrate, through extensive experiments, that a 3D L2 memory architecture generates much better results than the conventional two-dimensional (2D) designs under different number of layers and vertical (inter-wafer) connections. In particular, our experiments show that a 3D architecture with no dynamic data migration generates better performance than a 2D architecture that employs data migration. This also helps reduce power consumption in L2 due to a reduced number of data movements.
international symposium on computer architecture | 2000
Narayanan Vijaykrishnan; Mahmut T. Kandemir; Mary Jane Irwin; Hyun Suk Kim; Wu Ye
With the emergence of a plethora of embedded and portable applications, energy dissipation has joined throughput, area, and accuracy/precision as a major design constraint. Thus, designers must be concerned with both optimizing and estimating the energy consumption of circuits, architectures, and software. Most of the research in energy optimization and/or estimation has focused on single components of the system and has not looked across the interacting spectrum of the hardware and software. The novelty of our new energy estimation framework, SimplePower, is that it evaluates the energy considering the system as a whole rather than just as a sum of parts, and that it concurrently supports both compiler and architectural experimentation. We present the design and use of the SimplePower framework that includes a transition-sensitive, cycle-accurate datapath energy model that interfaces with analytical and transition sensitive energy models for the memory and bus subsystems, respectively. We analyzed the energy consumption of ten codes from the multidimensional array domain, a domain that is important for embedded video and signal processing systems, after applying different compiler and architectural optimizations. Our experiments demonstrate that early estimates from the SimplePower energy estimation framework can help identify the system energy hotspots and enable architects and compiler designers to focus their efforts on these areas.
design automation conference | 2001
Mahmut T. Kandemir; J. Ramanujam; J. Irwin; Narayanan Vijaykrishnan; Ismail Kadayif; A. Parikh
Optimizations aimed at improving the efficiency of on-chip memories are extremely important. We propose a compiler-controlled dynamic on-chip scratch-pad memory (SPM) management framework that uses both loop and data transformations. Experimental results obtained using a generic cost model indicate significant reductions in data transfer activity between SPM and off-chip memory.
field programmable gate arrays | 2004
Aman Gayasen; Yuh-Fang Tsai; Narayanan Vijaykrishnan; Mahmut T. Kandemir; Mary Jane Irwin; Tim Tuan
FPGAs are being increasingly used in a wide variety of applications. While power optimization has been only of secondary importance in many FPGA applications, growing importance of leakage in FPGAs designed in 90nm and below makes it imperative to treat power optimization as a first class citizen. In this paper, we propose a leakage-saving technique for FPGAs that involves dividing the FPGA fabric into small regions and switching on/off the power supply to each region using a sleep transistor in order to conserve leakage energy. Specifically, the regions not used by the placed design are supply gated. Next, we present a new placement strategy to increase the number of regions that can be supply gated. Finally, the supply gating technique is extended to exploit idleness in different parts of the same design during different time periods. Our experiments with different region sizes using various commercial and academic designs indicate that the proposed optimization outperforms conventional placement, and reduces leakage power consumption significantly.
international symposium on performance analysis of systems and software | 2013
Emre Kultursay; Mahmut T. Kandemir; Anand Sivasubramaniam; Onur Mutlu
In this paper, we explore the possibility of using STT-RAM technology to completely replace DRAM in main memory. Our goal is to make STT-RAM performance comparable to DRAM while providing substantial power savings. Towards this goal, we first analyze the performance and energy of STT-RAM, and then identify key optimizations that can be employed to improve its characteristics. Specifically, using partial write and row buffer write bypass, we show that STT-RAM main memory performance and energy can be significantly improved. Our experiments indicate that an optimized, equal capacity STT-RAM main memory can provide performance comparable to DRAM main memory, with an average 60% reduction in main memory energy.
high performance computer architecture | 2001
Victor Delaluz; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Anand Sivasubramaniam; Mary Jane Irwin
While there have been several studies and proposals for energy conservation for CPUs and peripherals, energy optimization techniques for selective operating mode control of DRAMs have not been fully explored. It has been shown that as much as 90% of overall system energy (excluding I/O) is consumed by the DRAM modules, serving as a good candidate for energy optimizations. Further; DRAM technology has also matured to provide several low energy operating modes (power modes), making it an opportunistic moment to conduct studies exploring the potential benefits of mode control techniques. This paper conducts an in-depth investigation of software and hardware techniques to avail of the DRAM mode control capabilities at a module granularity for energy savings.
high-performance computer architecture | 2002
Sudhanva Gurumurthi; Anand Sivasubramaniam; Mary Jane Irwin; Narayanan Vijaykrishnan; Mahmut T. Kandemir
Power dissipation has become one of the most critical factors for the continued development of both high-end and low-end computer systems. We present a complete system power simulator, called SoftWatt, that models the CPU, memory hierarchy, and a low-power disk subsystem and quantifies the power behavior of both the application and operating system. This tool, built on top of the SimOS infrastructure, uses validated analytical energy models to identify the power hotspots in the system components, capture relative contributions of the user and kernel code to the system power profile, identify the power-hungry operating system services and characterize the variance in kernel power profile with respect to workload. Our results using Spec JVM98 benchmark suite emphasize the importance of complete system simulation to understand the power impact of architecture and operating system on application execution.