Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Gregory Scott Still.
international solid-state circuits conference | 2014
Zeynep Toprak-Deniz; Michael A. Sperling; John F. Bulzacchelli; Gregory Scott Still; Ryan Kruse; Seongwon Kim; David William Boerstler; Tilman Gloekler; Raphael Robertazzi; Kevin Stawiasz; Timothy Diemoz; George English; David T. Hui; Paul Muench; Joshua Friedrich
Integrated voltage regulator modules (iVRMs) [1] provide a cost-effective path to realizing per-core dynamic voltage and frequency scaling (DVFS), which can be used to optimize the performance of a power-constrained multi-core processor. This paper presents an iVRM system developed for the POWER8™ microprocessor, which functions as a very fast, accurate low-dropout regulator (LDO), with 90.5% peak power efficiency (only 3.1% worse than an ideal LDO). At low output voltages, efficiency is reduced but still sufficient to realize beneficial energy savings with DVFS. Each iVRM features a bypass mode so that some of the cores can be operated at maximum performance with no regulator loss. With the iVRM area including the input decoupling capacitance (DCAP) (but not the output DCAP inherent to the cores), the iVRMs achieve a power density of 34.5W/mm2, which exceeds that of inductor-based or SC converters by at least 3.4× [2].
IEEE Journal of Solid-state Circuits | 2015
Eric Fluhr; Steve Baumgartner; David William Boerstler; John F. Bulzacchelli; Timothy Diemoz; Daniel M. Dreps; George English; Joshua Friedrich; Anne E. Gattiker; Tilman Gloekler; Christopher J. Gonzalez; Jason D. Hibbeler; Keith A. Jenkins; Yong Kim; Paul Muench; Ryan Nett; Jose Angel Paredes; Juergen Pille; Donald W. Plass; Phillip J. Restle; Raphael Robertazzi; David Shan; David W. Siljenberg; Michael A. Sperling; Kevin Stawiasz; Gregory Scott Still; Zeynep Toprak-Deniz; James D. Warnock; Glen A. Wiedemeier; Victor Zyuban
POWER8™ is a 12-core processor fabricated in IBMs 22 nm SOI technology with core and cache improvements driven by big data applications, providing 2.5× socket performance over POWER7+™. Core throughput is supported by 7.6 Tb/s of off-chip I/O bandwidth which is provided by three primary interfaces, including two new variants of Elastic Interface as well as embedded PCI Gen-3. Power efficiency is improved with several techniques. An on-chip controller based on an embedded PowerPC™ 405 processor applies per-core DVFS by adjusting DPLLs and fully integrated voltage regulators. Each voltage regulator is a highly distributed system of digitally controlled microregulators, which achieves a peak power efficiency of 90.5%. A wide frequency range resonant clock design is used in 13 clock meshes and demonstrates a minimum power savings of 4%. Power and delay efficiency is achieved through the use of pulsed-clock latches, which require statistical validation to ensure robust yield.
international conference on ic design and technology | 2014
Joshua Friedrich; Hung Q. Le; William J. Starke; Jeff Stuechli; Balaram Sinharoy; Eric Fluhr; Daniel M. Dreps; Victor Zyuban; Gregory Scott Still; Christopher J. Gonzalez; David Hogenmiller; Frank Malgioglio; Ryan Nett; Ruchir Puri; Phillip J. Restle; David Shan; Zeynep Toprak Deniz; Dieter Wendel; Matthew M. Ziegler; Dave Victor
POWER8™ delivers a data-optimized design suited for analytics, cognitive workloads, and todays exploding data sizes. The design point results in a 2.5x performance gain over its predecessor, POWER7+™, for many workloads. In addition, POWER8 delivers the efficiency demanded by cloud computing models and also represents a first step toward creating an open ecosystem for server innovation.
Ibm Journal of Research and Development | 2015
Victor Zyuban; Joshua Friedrich; Daniel M. Dreps; Jürgen Pille; Donald W. Plass; Phillip J. Restle; Z. T. Deniz; M. M. Ziegler; S. Chu; Saiful Islam; James D. Warnock; R. Philhower; R. M. Rao; Gregory Scott Still; D. W. Shan; Eric Fluhr; Jose Angel Paredes; Dieter Wendel; Christopher J. Gonzalez; D. Hogenmiller; Ruchir Puri; S. A. Taylor; S. D. Posluszny
The IBM POWER8™ processor is a 649-mm
international solid-state circuits conference | 2014
Eric Fluhr; Joshua Friedrich; Daniel M. Dreps; Victor Zyuban; Gregory Scott Still; Christopher J. Gonzalez; Allen Hall; David Hogenmiller; Frank Malgioglio; Ryan Nett; Jose Angel Paredes; Juergen Pille; Donald W. Plass; Ruchir Puri; Phillip J. Restle; David Shan; Kevin Stawiasz; Zeynep Toprak Deniz; Dieter Wendel; Matt Ziegler
^{2}
international solid-state circuits conference | 2014
Phillip J. Restle; David Shan; David Hogenmiller; Yong Kim; Alan J. Drake; Jason D. Hibbeler; Thomas J. Bucelot; Gregory Scott Still; Keith A. Jenkins; Joshua Friedrich
, 4.2-billion transistor, high-frequency microprocessor fabricated in the IBM 22-nm silicon on insulator (SOI) technology with embedded dynamic random access memory (eDRAM) and 15 layers of metal. With its twelve architecturally enhanced, eight-way multithreaded cores, 96-MB high-bandwidth shared third-level cache, and increased on and off-chip bandwidth, the POWER8 processor delivers industry-leading performance. This paper describes the circuit techniques and design methodologies that were employed for implementing this chip and that allowed it to maintain the power dissipation at the level of its predecessor while delivering a threefold increase in per-socket performance. Among the innovative technologies employed by the processor are resonant clocking, on-chip per-core voltage regulation, and enhanced eDRAM arrays.
Ibm Journal of Research and Development | 1996
James Wilson Bishop; Michael J. Campion; Thomas Leo Jeremiah; Stephen J. Mercier; Edmond J. Mohring; Kerry P. Pfarr; Bruce George Rudolph; Gregory Scott Still; Tennis S. White
The 12-core 649mm2 POWER8™ leverages IBMs 22nm eDRAM SOI technology [1], and microarchitectural enhancements to deliver up to 2.5× the socket performance [2] of its 32nm predecessor, POWER7+™ [3]. POWER8 contains 4.2B transistors and 31.5μF of deep-trench decoupling capacitance. Three thin-oxide transistor Vts are used for power/performance tuning, and thick-oxide transistors enable high-voltage I/O and analog designs. The 15-layer BEOL contains 5-80nm, 2-144nm, 3-288nm, and 3-640nm pitch layers for low-latency communication as well as 2-2400nm ultra-thick-metal (UTM) pitch layers for low-resistance distribution of power and clocks.
Archive | 2000
Thomas James Osten; Gregory Scott Still
A resonant-clock design for the IBM POWER8 processor core was implemented with 2 resonant modes (and a non-resonant mode), saving clock power over a wide frequency range from 2.5GHz to more than 5GHz. The POWER8 microprocessor is composed of 12 chiplets, each containing a single resonant clock grid for one core and its L2 cache, and a half-frequency, non-resonant clock grid for the L3 cache. The clock grids drive the local clock buffers (LCBs) that in turn drive the latches. The LCBs are gated off to measure the global clock power from the PLL to the LCBs. The resonant core communicates synchronously with the L3, requiring low skew between the domains. The chip was designed in a 22nm SOI process, including two ultra-thick-metal (UTM) layers (3 microns thick) for power distribution, I/O, all long global clock wires, and the resonant clock inductors. The UTM technology reduces wire resistance and simplifies inductor design, but requires accurate transmission line modeling and special routing.
Archive | 1988
Kevin Chuang-Chi Huang; Karl Heinz Kutz; Timothy John Mcnamara; Victoria Elaine Seals; Gregory Scott Still
The PowerPC AS™ A10 64-bit RISC microprocessor is a 4.7-million-transistor integrated circuit design, using IBM CMOS 5L 0.5-µm, 3-V, four-level-metal ASIC technology. Support for the PowerPC AS architecture is implemented in a 213-mm 2 die using a semicustom design methodology. Chip density and speed are enhanced through the use of custom macros and multiport arrays. An on-chip phase-locked-loop circuit is used to reduce chip-to-chip clock skew. Full utilization of the four-level-metal interconnect technology was achieved through architectural floorplanning, performance clustering, and timing-driven placement and wiring, with a total wire length of over 102 meters placed on the 14.6 × 14.6-mm die. The microprocessor is a pipelined, superscalar design with five separate functional units, a 4KB instruction cache, and an 8KB data cache. The design includes parity, error-correction, and error-logging functions, as well as self-test for logic and arrays during power-on. The design is robust and implements a wide range of performance configurations at the system level, allowing direct attachment of DRAM to the processor, or high-performance L2 cache options using high-speed SRAM. An on-chip system I/O bus and bus controller are provided for attachment of peripherals.
Archive | 1988
Kevin Chuang-Chi Huang; John Gerard Santoni; Gregory Scott Still