
Publication


Featured research published by Mark D. Mayo.


International Solid-State Circuits Conference | 2011

A 5.2GHz microprocessor chip for the IBM zEnterprise™ system

James D. Warnock; Yuen Chan; William V. Huott; Sean M. Carey; Michael Fee; Huajun Wen; M. J. Saccamango; Frank Malgioglio; Patrick J. Meaney; Donald W. Plass; Yuen H. Chan; Mark D. Mayo; Guenter Mayer; Leon J. Sigal; David L. Rude; Robert M. Averill; Michael H. Wood; Thomas Strach; Howard H. Smith; Brian W. Curran; Eric M. Schwarz; Lee Evan Eisen; Doug Malone; Steve Weitzel; Pak-Kin Mak; Thomas J. McPherson; Charles F. Webb

The microprocessor chip for the IBM zEnterprise 196 (z196) system is a high-frequency, high-performance design that adds support for out-of-order instruction execution and increases operating frequency by almost 20% compared to the previous 65nm design, while still fitting within the same power envelope. Despite the many difficult engineering hurdles to be overcome, the design team was able to achieve a product frequency of 5.2GHz, providing a significant performance boost for the new system.


IBM Journal of Research and Development | 1997

Circuit design techniques for the high-performance CMOS IBM S/390 parallel enterprise server G4 microprocessor

Leon J. Sigal; James D. Warnock; Brian W. Curran; Yuen H. Chan; Peter J. Camporese; Mark D. Mayo; William V. Huott; Daniel R. Knebel; C.T. Chuang; James P. Eckhardt; Philip T. Wu

This paper describes the circuit design techniques used for the IBM S/390® Parallel Enterprise Server G4 microprocessor to achieve operation up to 400 MHz. A judicious choice of process technology and concurrent top-down and bottom-up design approaches reduced risk and shortened the design time. The use of timing-driven synthesis/placement methodologies improved design turnaround time and chip timing. The combined use of static, dynamic, and self-resetting CMOS (SRCMOS) circuits facilitated the balancing of design time and performance return. The use of robust PLL design, floorplanning, and clock distribution minimized clock skew. Innovative latch designs permitted performance optimization without adding risk. Microarchitecture optimization and circuit innovations improved the performance of timing-critical macros. Full custom array design with extensive use of SRCMOS circuit techniques resulted in an on-chip L1 cache having 2.0-ns cycle time.


International Solid-State Circuits Conference | 2000

760 MHz G6 S/390 microprocessor exploiting multiple Vt and copper interconnects

Thomas J. McPherson; Robert M. Averill; D. Balazich; K. Barkley; Sean M. Carey; Yuen H. Chan; R. Crea; A. Dansky; R. Dwyer; A. Haen; D. Hoffman; A. Jatkowski; Mark D. Mayo; D. Merrill; T. McNamara; Gregory A. Northrop; J. Rawlins; Leon J. Sigal; T. Slegel; D. Webber; P. Williams; F. Yee

The G6 system is a sixth-generation CMOS server for the S/390 line of products featuring a 12+2 SMP size and significant frequency improvements obtained through the use of low-Vt devices and copper interconnects. The microprocessor operates at 760 MHz at the fast end of the process distribution. The system ships at 637 MHz in a 12+2 chilled SMP configuration. Measured system performance on the 12-way is 1600 S/390 MIPS, providing over 50% more performance than the G5. This microprocessor uses CMOS7S technology, which has a 0.2 µm process. The chip uses 6 levels of copper metal plus an additional layer of local interconnect on a 14.6 × 14.7 mm² die with 25M transistors (7M logic/18M array). The power supply is 1.9 V and the chip power is 33 W at 637 MHz.


IEEE Journal of Solid-State Circuits | 2014

Circuit and Physical Design of the zEnterprise™ EC12 Microprocessor Chips and Multi-Chip Module

James D. Warnock; Yuen H. Chan; Hubert Harrer; Sean M. Carey; Gerard M. Salem; Doug Malone; Ruchir Puri; Jeffrey A. Zitz; Adam R. Jatkowski; Gerald Strevig; Ayan Datta; Anne E. Gattiker; Aditya Bansal; Guenter Mayer; Yiu-Hing Chan; Mark D. Mayo; David L. Rude; Leon J. Sigal; Thomas Strach; Howard H. Smith; Huajun Wen; Pak-Kin Mak; Chung-Lung Kevin Shum; Donald W. Plass; Charles F. Webb

This work describes the circuit and physical design implementation of the processor chip (CP), level-4 cache chip (SC), and the multi-chip module at the heart of the EC12 system. The chips were implemented in IBM's high-performance 32nm high-k/metal-gate SOI technology. The CP chip contains 6 super-scalar, out-of-order processor cores, running at 5.5 GHz, while the SC chip contains 192 MB of eDRAM cache. Six CP chips and two SC chips are mounted on a high-performance glass-ceramic substrate, which provides high-bandwidth, low-latency interconnections. Various aspects of the design are explored in detail, with most of the focus on the CP chip, including the circuit design implementation, clocking, thermal modeling, reliability, frequency tuning, and comparison to the previous design in 45nm technology.


International Solid-State Circuits Conference | 1999

600 MHz G5 S/390 microprocessor

Gregory A. Northrop; Robert M. Averill; K. Barkley; Sean M. Carey; Yuen H. Chan; Yuen Chan; M. Check; D. Hoffman; William V. Huott; B. Krumm; C. Krygowski; J. Liptay; Mark D. Mayo; T. McNamara; Thomas J. McPherson; Eric M. Schwarz; L.S.T. Siegel; Charles F. Webb; D. Webber; P. Williams

The IBM G5 system is a fifth-generation CMOS server for the S/390 line of products with functionality improvements such as an instruction branch target buffer (BTB) and an IEEE-compliant binary floating-point unit. The microprocessor operates at 600 MHz at the fast end of the process distribution, although the system is shipped at 500 MHz in a 10+2 SMP configuration. Measured system performance on the 10-way is 1069 S/390 MIPS. This microprocessor uses a 0.25 µm CMOS process. The chip uses 6 levels of metal plus an additional layer of local interconnect and is 14.6 × 14.7 mm² with 25 M transistors (7 M logic/18 M array). Power supply is 1.9 V. Chip power is 25 W at 500 MHz.


International Solid-State Circuits Conference | 2015

4.1 22nm Next-generation IBM System z microprocessor

James D. Warnock; Brian W. Curran; John Badar; Gregory J. Fredeman; Donald W. Plass; Yuen H. Chan; Sean M. Carey; Gerard M. Salem; Friedrich Schroeder; Frank Malgioglio; Guenter Mayer; Christopher J. Berry; Michael H. Wood; Yiu-Hing Chan; Mark D. Mayo; John Mack Isakson; Charudhattan Nagarajan; Tobias Werner; Leon J. Sigal; Ricardo H. Nigaglioni; Mark Cichanowski; Jeffrey A. Zitz; Matthew M. Ziegler; Tim Bronson; Gerald Strevig; Daniel M. Dreps; Ruchir Puri; Douglas J. Malone; Dieter Wendel; Pak-Kin Mak

The next-generation System z design introduces a new microprocessor chip (CP) and a system controller chip (SC) aimed at providing a substantial boost to maximum system capacity and performance compared to the previous zEC12 design in 32nm [1,2]. As shown in the die photo, the CP chip includes 8 high-frequency processor cores, 64MB of eDRAM L3 cache, interface IOs (“XBUS”) to connect to two other processor chips and the L4 cache chip, along with memory interfaces, 2 PCIe Gen3 interfaces, and an I/O bus controller (GX). The design is implemented on a 678 mm² die with 4.0 billion transistors and 17 levels of metal interconnect in IBM's high-performance 22nm high-κ CMOS SOI technology [3]. The SC chip is also a 678 mm² die, with 7.1 billion transistors, running at half the clock frequency of the CP chip, in the same 22nm technology, but with 15 levels of metal. It provides 480 MB of eDRAM L4 cache, an increase of more than 2× from zEC12 [1,2], and contains an 18 MB eDRAM L4 directory, along with multi-processor cache control/coherency logic to manage inter-processor and system-level communications. Both the CP and SC chips incorporate significant logical, physical, and electrical design innovations.


IBM Journal of Research and Development | 1992

Improved performance of IBM Enterprise System/9000 bipolar logic chips

A. E. Brown; James P. Eckhardt; Mark D. Mayo; Walter Alan Svarczkopf; Santosh P. Gaur

The performance required for logic gate arrays by the IBM Enterprise System/9000™ (ES/9000™) family of water-cooled processors was obtained by redesigning chips that previously consisted of emitter-coupled logic (ECL) circuits. Multiple bipolar logic circuit families were implemented for the first time on a single IBM chip by using a modular cell approach. In 60% of the ECL circuits, ac coupling in ECL gates reduced the maximum operating power per ECL circuit on ES/9000 chips by 50% and decreased the signal delay per loaded gate by 30%, to 150 ps. About 10-20% of the remaining ECL circuits were replaced by differential current switches (DCS), which dissipated less power and improved the overall chip performance. Circuits to communicate between ECL and DCS circuit families and to improve DCS circuit reliability were included on the ES/9000 chips without affecting logic function density.

Introduction

Recent advances in bipolar logic semiconductor processing have increased circuit densities on a chip by an order of magnitude [1, 2]. However, packaging improvements have only doubled the quantity of heat that can be removed from a chip [3]. Consequently, a 50% reduction in the average operating power per logic circuit is required. Bipolar chips in IBM 3090™ processors are composed of ECL circuits which operate at high (15 mW/single phase) power and dissipate significant quantities of heat. In ES/9000 chips, a decrease in power per circuit increases the delays due to the load associated with the capacitance of interconnecting wires. The interconnections introduce a measure of circuit loading which doubles the sensitivity of a circuit's performance to fan-out and wire lengths. Vertical geometry device design improvements reduce delays with intrinsic, but not extrinsic, loads. Advanced metallurgies do not compensate for the high wiring capacitance that arises from the longer wire lengths due to a doubling of the ES/9000 chip areas.
The performance of gate arrays required by the ES/9000 family of mainframe


International Solid-State Circuits Conference | 2013

5.5GHz system z microprocessor and multi-chip module

James D. Warnock; Yuen H. Chan; Hubert Harrer; David L. Rude; Ruchir Puri; Sean M. Carey; Gerard M. Salem; Guenter Mayer; Yiu-Hing Chan; Mark D. Mayo; Adam R. Jatkowski; Gerald Strevig; Leon J. Sigal; Ayan Datta; Anne E. Gattiker; Aditya Bansal; Doug Malone; Thomas Strach; Huajun Wen; Pak-Kin Mak; Chung-Lung Kevin Shum; Donald W. Plass; Charles F. Webb

The new System z microprocessor chip (“CP chip”) features a high-frequency processor core running at 5.5GHz in a 32nm high-κ CMOS technology [1], using 15 levels of metal. This chip is a successor to the 45nm product [2], with significant improvements made to the core and nest (i.e. the logic external to the cores) in order to increase the performance and throughput of the design. Also, special considerations were necessary to ensure robust circuit operation in the high-κ technology used for implementation. As seen in the die photo, the chip contains 6 processor cores (compared to 4 cores in the 45nm version), and a large shared 48MB DRAM L3 cache. Each core includes a pair of data and instruction L2 SRAM caches of 1MB each. In addition, the chip contains a memory control unit (MCU), an I/O bus controller (GX), and two sets of interfaces to the L4 cache chips (also in 32nm technology). The CP chip occupies 598 mm², contains about 2.75B transistors, and has 1071 signal IOs.


International Solid-State Circuits Conference | 2001

A 1.1 GHz first 64 b generation Z900 microprocessor

Brian W. Curran; Peter J. Camporese; Sean M. Carey; Yuen Chan; Yiu-Hing Chan; R. Clemen; R. Crea; Dale E. Hoffman; T. Koprowski; Mark D. Mayo; T. McPherson; Gregory A. Northrop; Leon J. Sigal; Howard H. Smith; F. Tanzi; P. Williams

The first 64 b S/390 microprocessor, implemented in a 0.18 µm, 7-level copper interconnect bulk CMOS process, runs operating system and applications at 1.1 GHz. The frequency is achieved with interconnect width and repeater optimization, selective use of low-Vt devices, tapered library gates, and improved synthesis and circuit tuning algorithms.


IBM Journal of Research and Development | 2015

IBM z13 circuit design and methodology

James D. Warnock; C. Berry; M. H. Wood; Leon J. Sigal; Yuen H. Chan; G. Mayer; Mark D. Mayo; Yiu-Hing Chan; F. Malgioglio; G. Strevig; C. Nagarajan; Sean M. Carey; Gerard M. Salem; F. Schroeder; Howard H. Smith; D. Phan; Ricardo H. Nigaglioni; Thomas Strach; M. M. Ziegler; N. Fricke; K. Lind; J. L. Neves; S. H. Rangarajan; J. P. Surprise; J. M. Isakson; J. Badar; D. Malone; Donald W. Plass; A. Aipperspach; Dieter Wendel

The two chips at the heart of the IBM z13™ system include a processor chip (referred to as the CP or Central Processor chip) and an L4 (Level 4) cache chip (referred to as the SC or System Controller chip), each 678 mm² in area. The CP and SC chips were implemented with approximately 4 billion (4 × 10⁹) and 7.1 billion transistors, respectively, in IBM's 22-nm SOI (silicon-on-insulator) technology, supporting eDRAM (embedded dynamic random access memory), and with up to 17 levels of metal available. In this paper, we discuss aspects of the circuit and physical design of these chips, including both digital logic and custom array implementation. In addition, we describe the design analysis methodology, along with some of the checks needed to ensure a robust, reliable, and high-frequency product.
