Christopher A. Krygowski

Publication

Featured research published by Christopher A. Krygowski.

International Symposium on Microarchitecture | 1999

IBM's S/390 G5 microprocessor design

Timothy J. Slegel; Robert M. Averill; Mark A. Check; Bruce C. Giamei; Barry Watson Krumm; Christopher A. Krygowski; Wen H. Li; John Stephen Liptay; John Macdougall; Thomas J. McPherson; Jennifer A. Navarro; Eric M. Schwarz; Kevin Shum; Charles F. Webb

The IBM S/390 G5 microprocessor in IBM's newest CMOS mainframe system provides more than twice the performance of the previous generation, the G4. The G5 system offers improved reliability and availability, along with new architectural features such as support for IEEE floating-point arithmetic and a redesigned L2 cache and processor interconnect. The G5 system implements the ESA/390 instruction-set architecture, which is based on and compatible with the original S/360 architecture. Therefore, it has no RISC (reduced-instruction-set computing) concepts and is one of the most complex of all CISC (complex-instruction-set computing) architectures. Designers had to meet a unique set of challenges to achieve the G5's level of performance, for example, achieving a very high frequency given the complexity of the architecture.

Asilomar Conference on Signals, Systems and Computers | 2001

The IBM z900 decimal arithmetic unit

Fadi Y. Busaba; Christopher A. Krygowski; Wen He Li; Eric M. Schwarz; Steven R. Carlough

As the cost of adding functions to a processor continues to decline, processor designs are including many additional features. An example of this trend is the appearance of graphics engines and compression engines on midrange and even low-end microprocessors. One area that has the potential to capture chip real estate is the decimal arithmetic engine because of its importance in financial and business applications. Studies show that 55% of the numeric data stored on commercial databases are in decimal format. Although decimal arithmetic is supported in many software languages, it is not yet available on many microprocessors. This paper details the decimal arithmetic engine in the recently announced z900 microprocessor.

IBM Journal of Research and Development | 2004

The IBM eServer z990 floating-point unit

Guenter Gerwig; Holger Wetter; Eric M. Schwarz; Juergen Haess; Christopher A. Krygowski; Bruce M. Fleischer; Michael Kroener

The floating-point unit (FPU) of the IBM z990 eServer™ is the first one in an IBM mainframe with a fused multiply-add dataflow. It also represents the first time that an SRT divide algorithm (named after Sweeney, Robertson, and Tocher, who independently proposed the algorithm) was used in an IBM mainframe. The FPU supports dual architectures: the zSeries® hexadecimal floating-point architecture and the IEEE 754 binary floating-point architecture. Six floating-point formats, including short, long, and extended operands, are supported in hardware. The throughput of this FPU is one multiply-add operation per cycle. The instructions are executed in five pipeline steps, and there are multiple provisions to avoid stalls in case of data dependencies. It is able to handle denormalized input operands and denormalized results without a stall (except for architectural program exceptions). It has a new extended-precision divide and square-root dataflow. This dataflow uses a radix-4 SRT algorithm (radix-2 for square root) and is able to handle divides and square-root operations in multiple floating-point and fixed-point formats. For fixed-point divisions, a new mechanism improves the performance by using an algorithm with which the number of divide iterations depends on the effective number of quotient bits.

IBM Journal of Research and Development | 2002

The microarchitecture of the IBM eServer z900 processor

Eric M. Schwarz; Mark A. Check; Chung-Lung Kevin Shum; Thomas Koehler; Scott Barnett Swaney; John Macdougall; Christopher A. Krygowski

The recent IBM ESA/390 CMOS line of processors, from 1997 to 1999, consisted of the G4, G5, and G6 processors. The architecture they implemented lacked 64-bit addressability and had only a limited set of 64-bit arithmetic instructions. The processors also lacked data and instruction bandwidth, since they utilized a unified cache. The branch performance was good, but there were delays due to conflicts in searching and writing the branch target buffer. Also, the hardware data compression and decimal arithmetic performance, though good, was in demand by database and COBOL programmers. Most of the performance concerns regarding prior processors were due to area constraints. Recent technology advances have increased the circuit density by 50 percent over that of the G6 processor. This has allowed the design of several performance-critical areas to be revisited. The end result of these efforts is the IBM eServer z900 processor, which is the first high-end processor based on the new 64-bit z/Architecture™.

Symposium on Computer Arithmetic | 1999

The S/390 G5 floating point unit supporting hex and binary architectures

Eric M. Schwarz; Ronald M. Smith; Christopher A. Krygowski

The first high performance floating point unit to support both IBM 360 hexadecimal based floating point architecture and the IEEE 754 Standard binary floating point architecture is described. The S/390 G5 floating point unit supports the new S/390 architecture which includes hexadecimal based short, long, and extended precision formats and IEEE 754 standard single, double, and quad formats. This floating point unit is part of the microprocessor chip on the S/390 G5 mainframe computer introduced in 1998 and generally available at 500 MHz speeds. The S/390 G5 represents the current state of the art in CISC processor design. The paper describes the S/390 architecture enhancements, the internal format of the FPU, and the modifications to the FPU dataflow.

IBM Journal of Research and Development | 1999

The S/390 G5 floating-point unit

Eric M. Schwarz; Christopher A. Krygowski

The floating-point unit of the IBM S/390® G5 Parallel Enterprise Server represents a milestone in S/390 floating-point computation. The S/390 G5 contains the first floating-point unit (FPU) to support both the S/390 hexadecimal floating-point architecture and IEEE Standard 754 for binary floating-point arithmetic. The S/390 G5 FPU supports the new S/390 floating-point architecture, which contains six operand formats, including the IEEE 754 standard singleword, doubleword, and quadword formats, which are all supported in hardware. An internal hexadecimal-based dataflow is implemented to support both hexadecimal- and binary-based architectures. The S/390 G5 server is generally available at 500 MHz. The microprocessor chip is fabricated in IBM CMOS 6X technology, with a device size of 0.25 µm as drawn and 0.15 µm effective length. The design of the G5 FPU is based upon that of its predecessor, the G4. All of the custom dataflow macros from the G4 hexadecimal FPU were utilized with only minor modifications, and only a few additional macros for format conversion were required. This paper discusses the changes that were required to support the new S/390 binary floating-point architecture.

IBM Journal of Research and Development | 2009

Functional verification of the IBM System z10 processor chipset

Christopher A. Krygowski; Dean G. Bair; Rebecca M. Gott; Mark H. Decker; Akash V. Giri; Christian Habermann; Matthias D. Heizmann; Stefan Letz; William J. Lewis; Steven M. Licker; H. Mallar; Edward C. McCain; Wolfgang Roesner; Naseer S. Siddique; Adrian E. Seigler; Brian W. Thompto; Kai Weber; Ralf Winkelmann

This paper describes the comprehensive verification effort of the IBM System z10™ processor chipset, which consists of the z10™ quad-core central processor chip and the companion z10 symmetric multiprocessor (SMP) chip. The z10 processor chipset represented a significant redesign of its predecessor and thus presented a new challenge to ensure complete functional correctness of the product before the construction of actual system hardware. The z10 microprocessor pipeline was completely redesigned to support a doubling of the operating frequency. It also includes new hardware performance features, such as enhanced branch prediction, a reoptimized cache hierarchy, hardware-based prefetching, and a hardware implementation of decimal floating-point arithmetic in IEEE formats. In addition, there were significant hardware changes in the SMP storage hierarchy for optimized data latency performance. These changes include a new system topology, interprocessor book protocol, larger SMP size, and various aggressive cache ownership schemes. Key verification innovations are described, and a direct relationship to improved z10 system quality is provided for most cases.

IBM Journal of Research and Development | 2012

Key advances in the presilicon functional verification of the IBM zEnterprise microprocessor and storage hierarchy

Christopher A. Krygowski; Eli Almog; Dean G. Bair; Raimund Breil; Gero Dittmann; Rebecca M. Gott; William J. Lewis; Alia D. Shah; Brian W. Thompto

This paper highlights key advances in the presilicon verification effort of the IBM zEnterprise® 196 (z196) microprocessor and storage hierarchy. It focuses on the unique set of verification challenges as well as the process innovations that address them. At the time of product launch, the z196 system represented the industry's fastest and most scalable enterprise system, with up to 80 customer-configurable out-of-order core processors operating at 5.2 GHz. In addition to offering industry-leading performance, the z196 system builds upon its leadership in reliability by introducing a new redundant array of independent memory (RAIM) technology into its memory subsystem. The new product features in this system drove innovations in all aspects of processor functional verification, including stimulus generation, functional checking, debugging, and coverage. A new hybrid RAIM verification methodology, which includes both formal and random methods, is described. Many process and methodology improvements were made to improve developmental collaboration across a global team. These enhancements include a simulation development environment that uses common shared components across functional partitions, as well as a shared cache loader that was used across multiple environments. We also present a self-configuring test-case generation process that focused on the coverage of functional stimulus.

Design Automation Conference | 2011

Facing the challenge of new design features: an effective verification approach

Wisam Kadry; Ronny Morad; Alex Goryachev; Eli Almog; Christopher A. Krygowski

Verifying new hardware systems is a daunting task. To reduce the amount of effort involved, verification teams attempt to reuse as much verification IP as possible. We introduce a novel approach for test generation that enables the reuse of verification IP to verify new functionality. This method applies to a significant category of features, which are variations on the functionality of an existing design. Our method is being successfully used in the verification of high-end IBM servers: System p and System z. We compared our technique to alternative approaches and show that it achieves the best quality while reducing manual effort.

Archive | 2007