Publication


Featured research published by Ravishankar Rao.


international symposium on computer architecture | 2004

Synchroscalar: A Multiple Clock Domain, Power-Aware, Tile-Based Embedded Processor

John Y. Oliver; Ravishankar Rao; Paul Sultana; Jedidiah R. Crandall; Erik Czernikowski; Leslie W. Jones; Diana Franklin; Venkatesh Akella; Frederic T. Chong

We present Synchroscalar, a tile-based architecture for embedded processing that is designed to provide the flexibility of DSPs while approaching the power efficiency of ASICs. We achieve this goal by providing high parallelism and voltage scaling while minimizing control and communication costs. Specifically, Synchroscalar uses columns of processor tiles organized into statically-assigned frequency-voltage domains to minimize power consumption. Furthermore, while columns use SIMD control to minimize overhead, data-dependent computations can be supported by extremely flexible statically-scheduled communication between columns. We provide a detailed evaluation of Synchroscalar including SPICE simulation, wire and device models, synthesis of key components, cycle-level simulation, and compiler- and hand-optimized signal processing applications. We find that the goal of meeting, not exceeding, performance targets with data-parallel applications leads to designs that depart significantly from our intuitions derived from general-purpose microprocessor design. In particular, synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect supports parallelization and reduces processor idle time, which are critical to energy-efficient implementations of high-bandwidth signal processing. Overall, Synchroscalar provides programmability while achieving power efficiencies within 8-30× of known ASIC implementations, which is 10-60× better than conventional DSPs. In addition, frequency-voltage scaling in Synchroscalar provides between 3% and 32% power savings in our application suite.
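The link between parallelism and voltage scaling mentioned in this abstract can be illustrated with the textbook CMOS dynamic power relation P = a·C·V²·f. The sketch below is not taken from the paper: the function names, the capacitance, the voltages, and the assumption that supply voltage scales linearly with frequency down to a floor `v_min` are all hypothetical, and communication and leakage costs are ignored.

```python
def dynamic_power(cap, voltage, freq, activity=1.0):
    """Textbook CMOS dynamic power: P = activity * C * V^2 * f."""
    return activity * cap * voltage ** 2 * freq


def parallel_power(n_tiles, cap, v_nominal, f_target, v_min):
    """Total power when n_tiles share the work, each clocked at f_target / n_tiles.

    Hypothetical assumption: supply voltage scales linearly with
    frequency but cannot drop below v_min.
    """
    f_tile = f_target / n_tiles
    v_tile = max(v_min, v_nominal / n_tiles)
    return n_tiles * dynamic_power(cap, v_tile, f_tile)


# Illustrative numbers only (not from the paper).
single = parallel_power(1, cap=1e-9, v_nominal=1.2, f_target=400e6, v_min=0.6)
quad = parallel_power(4, cap=1e-9, v_nominal=1.2, f_target=400e6, v_min=0.6)
print(f"1 tile: {single:.3f} W, 4 tiles: {quad:.3f} W")
```

Under these assumptions the four-tile configuration delivers the same aggregate throughput at a quarter of the dynamic power, because the voltage floor halves V while the total switched capacitance per second stays constant. A real design would also have to pay for the interconnect that makes the parallelization possible.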


ieee international conference on high performance computing data and analytics | 2002

HLSpower: Hybrid Statistical Modeling of the Superscalar Power-Performance Design Space

Ravishankar Rao; Mark Oskin; Frederic T. Chong

As power densities increase and mobile applications become pervasive, power-aware microprocessor design has become a critical issue. We present HLSpower, a unique tool for power-aware design space exploration of superscalar processors. HLSpower is based upon HLS [OCF00], a tool which used a novel blend of statistical modeling and symbolic execution to accelerate performance modeling more than 100-1000X over conventional cycle-based simulators. In this paper, we extend the HLS methodology to model energy efficiency of superscalars. We validate our results against the Wattch [BTM00] cycle-based power simulator. While minor second order power effects continue to require detailed cycle-by-cycle simulation, HLSpower is useful for large-scale exploration of the significant power-performance design space. For example, we can show that the instruction cache hit rate and pipeline depth interact with power efficiency in a non-trivial way as they are varied over significant ranges. In particular, we note that, while the IPC of a superscalar increases monotonically with both optimizations, the energy efficiency does not. We highlight the design capabilities by focusing on these non-monotonic contour graphs to demonstrate how HLSpower can help build intuition in power-aware design.


computing frontiers | 2006

Tile size selection for low-power tile-based architectures

John Y. Oliver; Ravishankar Rao; Michael Brown; Jennifer Mankin; Diana Franklin; Frederic T. Chong; Venkatesh Akella

In this paper, we investigate the power implications of tile size selection for tile-based processors. We refer to this investigation as a tile granularity study. This is accomplished by distilling the architectural cost of tiles with different computational widths into a system metric we call the Granularity Indicator (GI). The GI is then compared against the communications exposed when algorithms are partitioned across multiple tiles. Through this comparison, the tile granularity that best fits a given set of algorithms can be determined, reducing the system power for that set of algorithms. When the GI analysis is applied to the Synchroscalar tile architecture [1], we find that Synchroscalar's already low power consumption can be further reduced by 14% when customized for execution of the 802.11a receiver. In addition, the GI can also be used to evaluate tile size when considering multiple applications simultaneously, providing a convenient platform for hardware-software co-design.


ieee international conference on high performance computing data and analytics | 2006

Segmented bitline cache: exploiting non-uniform memory access patterns

Ravishankar Rao; Justin Wenck; Diana Franklin; Rajeevan Amirtharajah; Venkatesh Akella

On-chip caches in modern processors account for a sizable fraction of the dynamic and leakage power. Much of this power is wasted, required only because the memory cells farthest from the sense amplifiers in the cache must discharge a large capacitance on the bitlines. We reduce this capacitance by segmenting the memory cells along the bitlines, and turning off the segmenters to reduce the overall bitline capacitance. The success of this cache relies on accessing segments near the sense amps much more often than remote segments. We show that the access pattern to the first-level data and instruction caches is extremely skewed. Only a small set of cache lines are accessed frequently. We exploit this non-uniform cache access pattern by mapping the frequently accessed cache lines closer to the sense amps. These lines are isolated by segmenting circuits on the bitlines and hence dissipate less power when accessed. Modifications to the address decoder enable a dynamic re-mapping of cache lines to segments. In this paper, we explore the design space of segmenting the level-one data and instruction caches. Instruction and data caches show potential power savings of 10% and 6% respectively on the subset of benchmarks simulated.
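The reason a skewed access pattern helps a segmented bitline can be seen with a toy model. The sketch below is an illustration, not the paper's evaluation: it assumes equal-sized segments, assumes an access to segment k (counting from the sense amp) switches the capacitance of segments 0 through k, and uses made-up access fractions. The function name `relative_bitline_energy` is hypothetical.

```python
def relative_bitline_energy(access_fractions):
    """Expected bitline capacitance switched per access, relative to an
    unsegmented bitline that always switches all n segments.

    access_fractions[k] is the share of accesses that hit segment k,
    with segment 0 nearest the sense amp.
    Toy assumption: an access to segment k switches k + 1 equal segments.
    """
    n = len(access_fractions)
    total = sum(access_fractions)
    expected = sum(frac * (k + 1) for k, frac in enumerate(access_fractions))
    return expected / (n * total)


# Made-up access distributions over four segments (not measured data).
uniform = relative_bitline_energy([0.25, 0.25, 0.25, 0.25])
skewed = relative_bitline_energy([0.85, 0.05, 0.05, 0.05])
print(f"uniform: {uniform:.3f}, skewed: {skewed:.3f}")
```

In this model, mapping the hot lines to the nearest segment cuts the switched bitline capacitance well below the uniform case. The figures concern bitline switching only, so they are not comparable to the whole-cache savings reported in the abstract.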


PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems | 2003

Synchroscalar: initial lessons in power-aware design of a tile-based embedded architecture

John Y. Oliver; Ravishankar Rao; Paul Sultana; Jedidiah R. Crandall; Erik Czernikowski; Leslie W. Jones; Dean Copsey; Diana Keen; Venkatesh Akella; Frederic T. Chong

Embedded devices have hard performance targets and severe power and area constraints that depart significantly from our design intuitions derived from general-purpose microprocessor design. This paper describes our initial experiences in designing Synchroscalar, a tile-based embedded architecture targeted for multi-rate signal processing applications. We present a preliminary design of the Synchroscalar architecture and some design space exploration in the context of important signal processing kernels. In particular, we find that synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect enables parallelization and reduces processor idle time, which are critical to energy efficient implementations of high bandwidth signal processing. Furthermore, statically-scheduled communication and SIMD computation keep control overheads low and energy efficiency high.


Journal of Embedded Computing | 2006

Synchroscalar: Evaluation of an embedded, multi-core architecture for media applications

John Y. Oliver; Ravishankar Rao; Diana Franklin; Frederic T. Chong; Venkatesh Akella


Lecture Notes in Computer Science | 2006

Segmented bitline cache : Exploiting non-uniform memory access patterns

Ravishankar Rao; Justin Wenck; Diana Franklin; Rajeevan Amirtharajah; Venkatesh Akella


Archive | 2006

Modeling and microarchitecture for low power

Fred Chong; Ravishankar Rao


Lecture Notes in Computer Science | 2005

Synchroscalar: Initial Lessons in Power-Aware Design of a Tile-Based Embedded Architecture

John Y. Oliver; Ravishankar Rao; Paul Sultana; Jedidiah R. Crandall; Erik Czernikowski; Leslie W. Jones; Dean Copsey; Diana Keen; Venkatesh Akella; Frederic T. Chong


Archive | 2004

Synchroscalar: A Multiple Clock Domain, Power-Aware, Tile-Based Embedded Processor (in the 2004 International Symposium on Computer Architecture, Munich, Germany)

John Y. Oliver; Ravishankar Rao; Paul Sultana; Jedidiah R. Crandall; Erik Czernikowski; Leslie W. Jones; Diana Franklin; Venkatesh Akella; Frederic T. Chong

Collaboration


Top co-authors of Ravishankar Rao.

John Y. Oliver

University of California

Leslie W. Jones

California Polytechnic State University

Paul Sultana

University of California

Dean Copsey

University of California

Diana Keen

University of California