Aaron Grenat
Advanced Micro Devices
Publication
Featured research published by Aaron Grenat.
International Solid-State Circuits Conference | 2014
Aaron Grenat; Sanjay Pant; Ravinder Rachala; Samuel Naffziger
In high-performance microprocessor cores, the on-die supply voltage seen by the transistors is non-ideal and exhibits significant fluctuations. These supply fluctuations are caused by sudden changes in the current consumed by the microprocessor in response to variations in workloads. This non-ideal supply can cause performance degradation or functional failures. Therefore, a significant amount of margin (10-15%) needs to be added to the ideal voltage (if there were no AC voltage variations) to ensure that the processor always executes correctly at the committed voltage-frequency points. This excess voltage wastes power proportional to the square of the voltage increase.
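As a rough illustration of the cost of that margin, the sketch below assumes dynamic power scales with the square of the supply voltage at fixed frequency and switching activity. That scaling is a textbook simplification, not a model taken from the paper, and the function name `margin_power_overhead` is hypothetical.

```python
def margin_power_overhead(margin: float) -> float:
    """Fractional increase in dynamic power from raising the supply by `margin`.

    Assumes dynamic power is proportional to V**2 at fixed frequency and
    activity (an illustrative simplification, not the paper's model).
    """
    return (1.0 + margin) ** 2 - 1.0


# Evaluate the 10-15% margin range quoted in the abstract.
for margin in (0.10, 0.15):
    overhead = margin_power_overhead(margin)
    print(f"{margin:.0%} voltage margin -> {overhead:.1%} extra dynamic power")
```

Under this assumption, a 10-15% voltage margin corresponds to roughly 21-32% additional dynamic power.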
IEEE Journal of Solid-State Circuits | 2015
Kathryn Wilcox; Robert Cole; Harry R. Fair; Kevin Gillespie; Aaron Grenat; Carson Henrion; Ravi Jotwani; Stephen Kosonocky; Benjamin Munger; Samuel Naffziger; Robert S. Orefice; Sanjay Pant; Donald A. Priore; Ravinder Rachala; Jonathan White
This work describes the physical design implementation of the AMD “Steamroller” module and adaptive clocking system that are both integral pieces of the AMD Kaveri APU SoC, which was implemented using a 28 nm high-K metal gate bulk CMOS process. The Steamroller module occupies 29.47 mm² and contains 236 million transistors. Various aspects of the core design are covered, including the power and timing methodologies as well as design challenges moving from 32 nm SOI to 28 nm bulk CMOS. Adaptive clocking, one of the key features used for core power efficiency, is described in detail.
International Solid-State Circuits Conference | 2016
Aaron Grenat; Sriram Sundaram; Stephen Kosonocky; Ravinder Rachala; Sriram Sambamurthy; Steven Liepe; Miguel Rodriguez; Tom Burd; Adam Clark; Michael Austin; Samuel Naffziger
Power-management techniques can be effective at squeezing more performance and energy efficiency out of mature SoCs. Vmax reliability limits, infrastructure limits, guard-bands, aging, and thermal limits all put restrictions on performance. This paper describes five power-management techniques that provide a net performance increase of up to 15%, depending on the application and TDP of the SoC, on “Bristol Ridge”, a 28nm CMOS dual-core x86 APU.
International Conference on VLSI Design | 2016
Sriram Sundaram; Sriram Sambamurthy; Michael Austin; Aaron Grenat; Michael Golden; Stephen Kosonocky; Samuel Naffziger
A high bandwidth critical path accumulator (1 sample/4GHz) capable of providing accurate timing margin information is reported. We present an adaptive voltage mechanism using these critical path accumulators that improves upon existing approaches by: (1) enabling replica paths to function as a statistical sample of the full set of Fmax limiting paths, resulting in improved tracking, and (2) explicit disambiguation of the voltage impact on delay from the intrinsic circuit speed by coupling path margin assessments with a voltage reading from the integrated power supply monitors. This scheme is implemented in a 28nm leading generation CPU core and is shown to track Fmax accurately with a standard deviation <2% (~1 FO2 delay) across a large range of process, voltage and temperature. Core power is reduced by 7-20% (at the same performance) across various processor states.
IEEE Journal of Solid-State Circuits | 2017
Sriram Sundaram; Aaron Grenat; Samuel Naffziger; Tom Burd; Stephen Kosonocky; Steven Liepe; Ravinder Rachala; Miguel Rodriguez; Michael Austin; Sriram Sambamurthy
Power management techniques can be effective at extracting more performance and energy efficiency out of mature systems on chip (SoCs). For instance, the peak performance of microprocessors is often limited by worst case technology (Vmax), infrastructure (thermal/electrical), and microprocessor usage assumptions. Performance/watt of microprocessors also typically suffers from guard bands associated with the test and binning processes as well as worst case aging/lifetime degradation. Similarly, on multicore processors, shared voltage rails tend to limit the peak performance achievable in low thread count workloads. In this paper, we describe five power management techniques that maximize the per-part performance under the aforementioned constraints. Using these techniques, we demonstrate a net performance increase of up to 15% depending on the application and TDP of the SoC, implemented on “Bristol Ridge,” a 28-nm CMOS, dual-core x86 accelerated processing unit.
International Symposium on Low Power Electronics and Design | 2016
Sriram Sundaram; Warren He; Sriram Sambamurthy; Aaron Grenat; Steven Liepe; Samuel Naffziger
This paper describes a unified power-frequency model (UPFM) which combines analytical and empirical approaches to ensure a high degree of modeling flexibility and accuracy to measured silicon (Si) results. On one end, System-on-a-Chip (SoC) design teams focus the bulk of their efforts on using detailed low-level models to verify power consumption. Such models are available late in the design cycle, and are often limited in the number of workloads that can be evaluated. On the other end, FPGA-based modeling and spreadsheet approaches that operate at a higher level of abstraction have been proposed. However, these are often limited by poor correlation to measured Si results. In addition, extant models typically focus on power projection or prediction of Si speed but not both. A unified approach is much needed since SoCs today have to meet stringent power and performance constraints simultaneously. The proposed UPFM overcomes these limitations. First, actual measured Si results serve as the empirical baseline for projections so that simulated vs. measured differences can be calibrated. Second, each IP is analytically modeled using a large number of relevant parameters. This high level of abstraction allows the model to be useful from the early design cycle all the way to the mature phase (parameters get refined over time). Also, wide-ranging parameters have been carefully chosen (and improved over multiple product generations) so that accuracy is not sacrificed. We demonstrate UPFM as a comprehensive framework where technology, architecture and infrastructure (test/thermal) choices can be modeled with high accuracy and drive optimal perf-per-watt SoC designs.
Archive | 2010
Jeremy Schreiber; Aaron Grenat
Archive | 2013
Michael J. Osborn; Michael J. Tresidder; Aaron Grenat; Joseph Kidd; Priyank Parakh; Steven J. Kommrusch
Archive | 2010
Michael J. Osborn; Michael J. Tresidder; Aaron Grenat; Joseph Kidd; Priyank Parakh; Steven J. Kommrusch
Archive | 2015
Aaron Grenat; Robert A. Hershberger; Sriram Sambamurthy; Samuel Naffziger; Christopher Eric Tressler; Sho-Chien Kang; Joseph Shannon; Krishna Sai Bernucho; Ashwin Chincholi; Michael Austin; Steven Liepe; Umair B. Cheema