Publication


Featured research published by Vinay Hanumaiah.


IEEE Transactions on Computers | 2014

Energy-Efficient Operation of Multicore Processors by DVFS, Task Migration, and Active Cooling

Vinay Hanumaiah; Sarma B. K. Vrudhula

Energy efficiency has taken center stage in all aspects of computing, regardless of whether it is performed on a portable battery-powered device, a desktop PC, servers in a data center, or a supercomputer. It is expressed as performance-per-watt (PPW), which is equal to the number of instructions executed per joule of energy. The shift to multicore processors, with tens or hundreds of cores on a single die, requires that the operation of the cores be dynamically controlled to maximize the processor's overall energy efficiency. This paper presents a unified formulation and an efficient solution for this problem. The solution considers dynamic voltage and frequency scaling (DVFS), thread migration, and active cooling as the means to control the cores. The solution method is efficient enough for a real-time implementation. The formulation includes accurate power and thermal models and temperature constraints, and it accounts for the dependence of leakage power and circuit delay on temperature. The PPW metric is extended to P^αPW (performance^α-per-watt), which allows examining the tradeoffs between optimizing for performance and optimizing for energy by varying α. Simulation experiments assuming a four-core processor demonstrate that the derived control strategy can achieve 3.2× greater energy efficiency (i.e., it executes more than three times the number of instructions per joule) than the performance-optimal solution. The formulation and the efficiency of the solution method also allow for fast design space exploration. Specifically, it is shown how simply increasing the number of cores in a processor can significantly diminish its energy efficiency, and that there is an optimal number of cores that maximizes the PPW. This number depends on the ratio of how much the power of an individual core is reduced by scaling, i.e., as the number of cores is increased. Finally, the proposed method is implemented on a quad-core Intel Sandy Bridge processor and verified by running benchmarks. The experiments suggest that the proposed method results in an improvement of 37 percent over the current state-of-the-art energy-efficient schemes.
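
As a rough illustration of the P^αPW metric described in this abstract, the sketch below computes performance^α divided by power for a few operating points and picks the best one for a given α. The operating points, their names, and the helper functions are hypothetical values chosen for illustration; they are not taken from the paper, and the paper's optimization (with DVFS, migration, and cooling) is not reproduced here.

```python
from typing import NamedTuple, Sequence


class OperatingPoint(NamedTuple):
    """A hypothetical core setting: throughput and power draw."""
    name: str
    instr_per_sec: float  # performance, instructions per second
    watts: float          # power consumption, watts


def p_alpha_pw(point: OperatingPoint, alpha: float = 1.0) -> float:
    """Return performance^alpha per watt; alpha = 1 gives plain PPW."""
    return point.instr_per_sec ** alpha / point.watts


def best_point(points: Sequence[OperatingPoint], alpha: float = 1.0) -> OperatingPoint:
    """Pick the operating point that maximizes the P^alpha PW metric."""
    return max(points, key=lambda p: p_alpha_pw(p, alpha))


# Illustrative numbers only (assumed, not measured).
POINTS = [
    OperatingPoint("low", 1.0e9, 5.0),
    OperatingPoint("mid", 2.0e9, 12.0),
    OperatingPoint("high", 3.0e9, 30.0),
]

if __name__ == "__main__":
    for alpha in (1.0, 2.0, 3.0):
        print(alpha, best_point(POINTS, alpha).name)
```

With these assumed numbers, α = 1 favors the lowest-power setting, and raising α shifts the choice toward the faster settings, which is the performance-versus-energy tradeoff the abstract refers to.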


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Performance Optimal Online DVFS and Task Migration Techniques for Thermally Constrained Multi-Core Processors

Vinay Hanumaiah; Sarma B. K. Vrudhula; Karam S. Chatha

Extracting high performance from multi-core processors requires increased use of thermal management techniques. In contrast to offline thermal management techniques, online techniques are capable of sensing changes in the workload distribution and setting the processor controls accordingly. Hence, online solutions are more accurate and are able to extract higher performance than offline techniques. This paper presents performance-optimal online thermal management techniques for multicore processors. The techniques include dynamic voltage and frequency scaling (DVFS) and task-to-core allocation, or task migration. The problem formulation includes accurate power and thermal models, as well as the dependence of leakage on temperature. This paper provides a theoretical basis for deriving the optimal policies and computationally efficient implementations. The effectiveness of our DVFS and task-to-core allocation techniques is demonstrated by numerical simulations. The proposed task-to-core allocation method showed a 20.2% improvement in performance over a power-based thread migration approach. The techniques have been incorporated in a thermal-aware architectural-level simulator called MAGMA that allows for design space exploration and for offline and online dynamic thermal management. The simulator is capable of handling simulations of hundreds of cores within reasonable time.


International Conference on Computer-Aided Design | 2009

Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control

Vinay Hanumaiah; Sarma B. K. Vrudhula; Karam S. Chatha

This paper presents a precise formulation of the problem of minimizing the maximum completion time of tasks on a multi-core processor, subject to thermal constraints. The power model used in this work accounts for the dependence of leakage on temperature, while the thermal model is based on the HotSpot model. The general problem is shown to be a non-linear optimization problem that includes cyclic constraints between temperature and power. The derived policy of dynamic frequency and voltage control results in a performance improvement of 19.6% over an optimal policy that performs speed-only control.
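
The cyclic dependence mentioned in this abstract (leakage power depends on temperature, and temperature depends on total power) can be illustrated with a small fixed-point iteration. The sketch below assumes a single lumped thermal resistance and a leakage model that is linear in temperature; both are simplifying assumptions made here for illustration, not the HotSpot model or the power model used in the paper, and all parameter values are hypothetical.

```python
def steady_state_temperature(
    p_dyn: float,
    t_amb: float = 45.0,      # ambient temperature, deg C (assumed)
    r_th: float = 0.8,        # lumped thermal resistance, K/W (assumed)
    p_leak_ref: float = 5.0,  # leakage power at t_ref, W (assumed)
    k_leak: float = 0.05,     # leakage sensitivity, W/K (assumed)
    t_ref: float = 45.0,      # reference temperature for leakage, deg C
    tol: float = 1e-9,
    max_iter: int = 1000,
):
    """Iterate T -> T_amb + R * (P_dyn + P_leak(T)) until it stops changing.

    Returns (temperature, total_power). The iteration converges when
    r_th * k_leak < 1, i.e. when there is no thermal runaway.
    """
    temp = t_amb
    for _ in range(max_iter):
        p_leak = p_leak_ref + k_leak * (temp - t_ref)
        new_temp = t_amb + r_th * (p_dyn + p_leak)
        if abs(new_temp - temp) < tol:
            return new_temp, p_dyn + p_leak
        temp = new_temp
    raise RuntimeError("temperature iteration did not converge")


if __name__ == "__main__":
    temp, power = steady_state_temperature(p_dyn=20.0)
    print(f"steady-state temperature: {temp:.2f} C, total power: {power:.2f} W")
```

For this linear model the fixed point also has a closed form, T = (T_amb + R (P_dyn + P_leak_ref - k T_ref)) / (1 - R k), which is a convenient check on the iteration.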


IEEE Transactions on Computers | 2012

Temperature-Aware DVFS for Hard Real-Time Applications on Multicore Processors

Vinay Hanumaiah; Sarma B. K. Vrudhula

This paper addresses the problem of determining the feasible speeds and voltages of multicore processors with hard real-time and temperature constraints. This is an important problem, which has applications in time-critical execution of programs such as audio and video encoding on application-specific embedded processors. Two problems are solved. The first is the computation of the optimal time-varying voltages and speeds of each core in a heterogeneous multicore processor that minimize the makespan (the latest completion time of all tasks) while satisfying timing and temperature constraints. The solution to the makespan minimization problem is then extended to the problem of determining the feasible speeds and voltages that satisfy task deadlines. The methods presented in this paper also provide a theoretical basis and analytical relations between speed, voltage, power, and temperature, which provide greater insight into the early-phase design of processors and are also useful for online dynamic thermal management.


Design Automation Conference | 2009

Throughput optimal task allocation under thermal constraints for multi-core processors

Vinay Hanumaiah; Ravishankar Rao; Sarma B. K. Vrudhula; Karam S. Chatha

It is known that temperature gradients and thermal hotspots affect the reliability of microprocessors. Temperature is also an important constraint when maximizing the performance of processors. Although DVFS and DFS can be used to extract higher performance from temperature- and power-constrained single-core processors, the full potential of multi-core performance cannot be exploited without the use of thread migration or task-to-core allocation schemes. In this paper, we formulate the problem of throughput-optimal task allocation on thermally constrained multi-core processors, and present a novel solution that includes optimal speed throttling. We show that the algorithms are implementable in real time and can be incorporated into an operating system's dynamic scheduling policy. The method presented here can result in a significant improvement in throughput over existing methods (5X over a naive scheme).


International Conference on Computer Communications | 2012

Optimal range assignment in solar powered active wireless sensor networks

Benjamin Gaudette; Vinay Hanumaiah; Sarma B. K. Vrudhula; Marwan Krunz

Energy harvesting in a sensor network is essential in situations where it is either difficult or not cost effective to access the network's nodes to replace the batteries. In this paper, we investigate the problems involved in controlling an active wireless sensor network that is powered both by rechargeable batteries and solar energy. The objective of this control is to maximize the network's quality of coverage (QoC), defined as the minimum number of targets that must be covered over a 24-hour period. Assuming a time-varying solar profile, the problem is to optimally control the sensing range of each sensor so as to maximize the QoC. Implicit in the solution is the dynamic allocation of solar energy during the day to sensing tasks and to recharging the battery so that minimum coverage is guaranteed even during the night, when only the batteries can supply energy to the sensors. The problem turns out to be a nonlinear optimal control problem of high complexity. Exploiting the specific structure of the problem, we present a method to solve it as a series of quasiconvex (unimodal) optimization problems. The runtime of the proposed solution is 60X less than a naive method that is based on dynamic programming, while its worst-case error is less than 8%. Unlike the dynamic programming method, the proposed method is scalable to large networks consisting of hundreds of sensors and targets. This paper also offers several insights into the design of energy-harvesting networks, which result in minimum network setup cost through the determination of the optimal configuration of the number of sensors and the sampling time.


Design, Automation, and Test in Europe | 2011

Reliability-aware thermal management for hard real-time applications on multi-core processors

Vinay Hanumaiah; Sarma B. K. Vrudhula

Advances in chip-multiprocessor processing capabilities have led to increased power consumption and temperature hotspots. Reducing the on-die peak temperature is important for both power reduction and reliability. However, the presence of task deadlines constrains the reduction of peak temperature and thus complicates the determination of optimal speeds for minimizing the peak temperature. We formulate the determination of optimal speeds for minimizing the peak temperature of execution with task deadlines as a quasiconvex optimization problem. This formulation includes accurate power and thermal models with the dependence of leakage power on temperature. Experiments demonstrate that our approach is very flexible in adapting to various scenarios of workload and deadline specifications. We obtained an 8°C reduction in peak temperature for a sample execution of benchmarks.


ACM Transactions on Embedded Computing Systems | 2014

STEAM: A Smart Temperature and Energy Aware Multicore Controller

Vinay Hanumaiah; Digant Desai; Benjamin Gaudette; Carole Jean Wu; Sarma B. K. Vrudhula

Recent empirical studies have shown that multicore scaling is fast becoming power limited, and consequently, an increasing fraction of a multicore processor has to be underclocked or powered off. Therefore, in addition to fundamental innovations in architecture, compilers, and parallelization of application programs, there is a need to develop practical and effective dynamic energy management (DEM) techniques for multicore processors. Existing DEM techniques mainly target reducing processor power consumption and temperature, and only a few of them have addressed improving energy efficiency for multicore systems. With energy efficiency taking center stage in all aspects of computing, the focus of DEM needs to be on finding practical methods to maximize processor efficiency. To this end, this article presents STEAM, an optimal closed-loop DEM controller designed for multicore processors. The objective is to maximize energy efficiency by dynamic voltage and frequency scaling (DVFS). Energy efficiency is defined as the ratio of performance to power consumption, or performance-per-watt (PPW). This is the same as the number of instructions executed per joule. The PPW metric is replaced by P^αPW (performance^α-per-watt), which allows for controlling the importance of performance versus power consumption by varying α. The proposed controller was implemented on a Linux system and tested with the Intel Sandy Bridge processor. Three power management schemes, called governors, are available with Intel platforms. They are referred to as (1) Powersave (lowest power consumption), (2) Performance (achieves highest performance), and (3) Ondemand. Our simple and lightweight controller, when executing SPEC CPU2006, PARSEC, and MiBench benchmarks, has achieved an average of 18% improvement in energy efficiency (MIPS/Watt) over these ACPI policies. Moreover, STEAM also demonstrated an excellent prediction of core temperatures and power consumption, and the ability to control the core temperatures within 3°C of the specified maximum. Finally, the overhead of the STEAM implementation (in terms of CPU resources) is less than 0.25%. The entire implementation is self-contained and can be installed on any processor with very little prior knowledge of the processor.


Great Lakes Symposium on VLSI | 2011

A new balanced 4-moduli set {2^k, 2^n - 1, 2^n + 1, 2^(n+1) - 1} and its reverse converter design for efficient FIR filter implementation

Gayathri Chalivendra; Vinay Hanumaiah; Sarma B. K. Vrudhula

This paper presents a new four-moduli residue number system (RNS) of the form {2^k, 2^n - 1, 2^n + 1, 2^(n+1) - 1}, n ≤ k ≤ 2n, which is an enhancement of the popular four-moduli set {2^n, 2^n - 1, 2^n + 1, 2^(n+1) - 1} (for even n). Our k-mod4 moduli set achieves a higher dynamic range and a better balancing of the binary channels. Using the proposed k-mod4 moduli set helps in reducing the hardware complexity of arithmetic circuits compared with other four-moduli sets for the same performance. Additionally, we provide a reverse converter design whose hardware complexity and performance are shown to be better than the existing reverse converters for the same dynamic range. Experimental results comparing RNS multiply-and-accumulate units implemented using the proposed four-moduli set with the state-of-the-art balanced four-moduli sets show large reductions in area (46%) and power (43%) for various dynamic ranges. This makes our k-mod4 moduli set ideal for digital filter implementation.
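
To make the residue number system idea concrete, the sketch below builds the moduli set {2^k, 2^n - 1, 2^n + 1, 2^(n+1) - 1} named in the title, converts integers to residues, performs a multiply-and-accumulate independently in each channel, and converts back. The reverse conversion uses the generic Chinese Remainder Theorem, not the reverse converter design proposed in the paper, and all function names are hypothetical. It needs Python 3.8 or later for `math.prod` and for modular inverses via `pow(x, -1, m)`.

```python
from itertools import combinations
from math import gcd, prod


def moduli_set(n: int, k: int) -> list:
    """Build {2^k, 2^n - 1, 2^n + 1, 2^(n+1) - 1} for n <= k <= 2n."""
    if not n <= k <= 2 * n:
        raise ValueError("k must satisfy n <= k <= 2n")
    moduli = [2**k, 2**n - 1, 2**n + 1, 2 ** (n + 1) - 1]
    # RNS requires pairwise coprime moduli; this holds for even n.
    if any(gcd(a, b) != 1 for a, b in combinations(moduli, 2)):
        raise ValueError("moduli are not pairwise coprime for this n")
    return moduli


def to_rns(x: int, moduli: list) -> list:
    """Forward conversion: one residue per channel."""
    return [x % m for m in moduli]


def rns_mac(acc: list, a: list, b: list, moduli: list) -> list:
    """Channel-wise multiply-and-accumulate: acc + a * b in each modulus."""
    return [(c + x * y) % m for c, x, y, m in zip(acc, a, b, moduli)]


def from_rns(residues: list, moduli: list) -> int:
    """Reverse conversion by the generic Chinese Remainder Theorem."""
    big_m = prod(moduli)  # dynamic range of the moduli set
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse of m_i mod m
    return total % big_m


if __name__ == "__main__":
    mods = moduli_set(n=4, k=6)            # [64, 15, 17, 31]
    coeffs, samples = [3, 5, 7], [100, 200, 300]
    acc = to_rns(0, mods)
    for c, s in zip(coeffs, samples):      # one FIR output tap by tap
        acc = rns_mac(acc, to_rns(c, mods), to_rns(s, mods), mods)
    print(mods, prod(mods), from_rns(acc, mods))
```

The result is exact as long as the true sum stays below the dynamic range (the product of the moduli), which is why raising k from n toward 2n extends the range without adding channels.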


Design, Automation, and Test in Europe | 2009

Performance optimal speed control of multi-core processors under thermal constraints

Vinay Hanumaiah; Sarma B. K. Vrudhula; Karam S. Chatha

Advances in chip-multiprocessor processing capabilities have led to increased power consumption and temperature hotspots. Maintaining the on-chip temperature is important for both power reduction and reliability. Achieving the highest performance while satisfying the temperature constraint is a challenge. We develop analytical solutions for the optimal control of frequencies for each core in a chip-multiprocessor. The objective is to reduce the makespan, or the latest completion time of all tasks. We show that the optimal frequency policy is bang-bang when the temperature constraint is not active and is exponential when the temperature constraint is active. We show that there is a significant improvement in overall throughput with our proposed solution and yet all cores operate under the thermal maximum.

Collaboration


Top Co-Authors

Carole Jean Wu

Arizona State University

Digant Desai

Arizona State University
