Publication


Featured research published by Thomas A. Gregg.


IBM Journal of Research and Development | 1999

IBM S/390 Parallel Enterprise Server G5 fault tolerance: a historical perspective

Lisa Spainhower; Thomas A. Gregg

Fault tolerance in IBM S/390® systems during the 1980s and 1990s had three distinct phases, each characterized by a different uptime improvement rate. Early TCM-technology mainframes delivered excellent data integrity, instantaneous error detection, and positive fault isolation, but had limited on-line repair. Later TCM mainframes introduced capabilities for providing a high degree of transparent recovery, failure masking, and on-line repair. New challenges accompanied the introduction of CMOS technology. A significant reduction in parts count greatly improved intrinsic failure rates, but dense packaging disallowed on-line CPU repair. In addition, characteristics of the microprocessor technology posed difficulties for traditional in-line error checking. As a result, system fault-tolerant design, particularly in CPUs and memory, underwent another evolution from G1 to G5. G5 implements an innovative design for a high-performance, fault-tolerant single-chip microprocessor. Dynamic CPU sparing delivers a transparent concurrent repair mechanism. A new internal channel provides a high-performance, highly available Parallel Sysplex® in a single mainframe. G5 is both the culmination of decades of innovation and careful implementation, and the highest achievement of S/390 fault-tolerant design.


IEEE International Symposium on Fault-Tolerant Computing | 1998

G4: a fault-tolerant CMOS mainframe

Lisa Spainhower; Thomas A. Gregg

G4 is IBM's fourth-generation CMOS microprocessor-based S/390 mainframe, but the first to achieve fault-tolerant equivalence, or superiority, with its predecessor ECL mainframes. CMOS technology provides much greater density and integration, assuring superior fault avoidance characteristics. The reduced power of CMOS makes bulk power redundancy and battery backup practical. However, the high density and circuit properties of CMOS pose new challenges for detection, recovery, and online repair. G4 implements an innovative design for a high-performance, fault-tolerant, single-chip microprocessor. Microprocessor sparing is used as a concurrent repair mechanism. Increased memory density requires new (76,64) S4EC/DED Error Correction Codes so that all single-chip failures are correctable. As many as four I/O interfaces are packaged on an individual card, requiring both configuration management and automated maintenance procedures to assure all devices maintain connectivity during online repair.


IBM Journal of Research and Development | 1997

S/390 CMOS server I/O: the continuing evolution

Thomas A. Gregg

IBM has developed a strategy to achieve the high I/O demands of large servers. In a new environment of industry-standard peripheral component interconnect (PCI) attached adapters conforming to open I/O interfaces, S/390® has developed an efficient method of quickly integrating disk storage, communications, and future adapters. Preserving the S/390 I/O programming model and the high level of data integrity expected in S/390 products and reducing development cycle time and resources have further constrained design options. At the same time, S/390 developers have redesigned the traditional I/O components into the latest chip technologies. The developers have also designed a new internal link (STI) to meet the increased I/O bandwidth and connectivity required by the high processor performance of the third and fourth generations of S/390 CMOS servers. This paper describes this strategy and how it has led to systems that retain the differentiating features of S/390 products.


IBM Journal of Research and Development | 2012

Overview of IBM zEnterprise 196 I/O subsystem with focus on new PCI express infrastructure

Thomas A. Gregg; David Craddock; Daniel J. Stigliani; Frank E. Bosco; Ethan E. Cruz; Michael F. Scanlon; Philip A. Sciuto; Gerd K. Bayer; Michael Jung; Christoph Raisch

IBM zEnterprise® 196 introduces a new input/output (I/O) subsystem, including a new I/O drawer that is largely based on a greatly expanded exploitation of industry-standard high-volume PCI Express® (PCIe®) links and switches. The System z® qualities of reliability, availability, and serviceability (RAS) are preserved and enhanced by combining the PCIe RAS capabilities with new System z capabilities. PCIe ports connecting the processor book to the I/O drawer are provided by a new IBM-designed PCIe fan-out card. This fan-out card and its firmware (Licensed Internal Code) support both traditional System z I/O and new I/O paradigms. In the new PCIe I/O drawer, PCIe switches provide fan-out and the well-established System z I/O failover function referred to as redundant I/O interconnect. This is the third generation of the I/O drawer/cage to be used in System z platforms. The PCIe I/O drawer design is extremely compact and provides enhanced I/O port granularity and density. It has been designed to provide performance extendibility for future I/O advancements. Traditional I/O such as FICON®, Fibre Channel Protocol, and Ethernet are provided with enhanced functionality and are packaged in this new PCIe I/O drawer. The advent of this new infrastructure opens up the possibility of attaching native PCIe adapters while allowing them to be controlled by system firmware or by the operating systems directly.


IBM Journal of Research and Development | 2009

IBM System z10 I/O subsystem

Edward W. Chencinski; Mark A. Check; Casimer M. DeCusatis; H. Deng; M. Grassi; Thomas A. Gregg; Markus M. Helms; A. D. Koenig; L. Mohr; Kulwant M. Pandey; Thomas Schlipf; Torsten Schober; H. Ulrich; Craig R. Walters

The performance, reliability, and functionality of a large server are greatly influenced by the design characteristics of its I/O subsystem. The critical components of the IBM System z10™ I/O subsystem have, therefore, been significantly improved in terms of performance, capability, and cost. The first-order network has been redesigned from the long-evolved enhanced self-timed interface (eSTI) links to utilize InfiniBand™ links. A redesign of the host logic of I/O chips and the fiberoptic interfaces within the links made it possible to introduce InfiniBand-based IBM Parallel Sysplex® links. A broad range of legacy I/O channels have been carried forward to connect through InfiniBand, and a foundation has been laid for new channel types of improved functionality and performance. The first such hardware channel to be introduced is the next generation of Ethernet-virtualization data routers. A new and methodical recovery structure has been designed to ensure consistent, extensive support of reliability, availability, and serviceability. A building-block-oriented design process has been developed to enable the innovations that made these advances possible. Finally, a new performance verification methodology has been introduced to ensure that the system and subsystem designs are balanced to make effective use of the increased capacity.


IBM Journal of Research and Development | 1999

The integrated cluster bus for the IBM S/390 Parallel Sysplex

Thomas A. Gregg; Kulwant M. Pandey; Richard K. Errickson

IBM has developed a new S/390® Parallel Sysplex® coupling interface for the G5 server called the Integrated Cluster Bus (ICB). This interface improves the coupling efficiency by greatly reducing message-passing latency. Using the transport layer of the S/390 self-timed interface (STI) introduced in the G3 server, ICB adds channel function to the hub chip to allow a more direct interconnection between S/390 servers. This new channel has the same function as the present intersystem channel (ISC), but because it is integrated into the hub chip and therefore requires no additional components, its reliability is much better than that of the ISC. Since the ISC transmits data at a peak rate of 106 MB/s over distances exceeding ten kilometers and the ICB transmits data at a peak rate of 333 MB/s at distances of ten meters, the ISC is still required for the more geographically dispersed Parallel Sysplexes, whereas the ICB is well suited to the machine room, where multiple servers can be interconnected by ten-meter cables. This paper describes the design approach for the ICB. It describes the fundamental message-passing requirements of the Parallel Sysplex and how they are implemented in very complex yet compact hardware in the server's hub chip.
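
The two peak rates quoted in this abstract can be compared with a small calculation. The sketch below is only illustrative: the function name, the 4096-byte payload, and the reading of 1 MB as 10^6 bytes are assumptions, and the result is an ideal wire time that ignores link latency, cable length, and protocol overhead.

```python
# Peak data rates taken from the abstract (MB/s).
ISC_PEAK_MB_S = 106   # intersystem channel
ICB_PEAK_MB_S = 333   # Integrated Cluster Bus


def transfer_time_us(payload_bytes: int, peak_mb_per_s: float) -> float:
    """Ideal transfer time in microseconds at the given peak rate.

    Assumes 1 MB = 10**6 bytes, so bytes / (MB/s) is already in microseconds.
    Latency and protocol overhead are deliberately ignored.
    """
    return payload_bytes / peak_mb_per_s


# Hypothetical message size, chosen only for illustration.
payload = 4096

isc_time = transfer_time_us(payload, ISC_PEAK_MB_S)
icb_time = transfer_time_us(payload, ICB_PEAK_MB_S)
speedup = ICB_PEAK_MB_S / ISC_PEAK_MB_S

print(f"ISC: {isc_time:.2f} us, ICB: {icb_time:.2f} us, peak-rate ratio: {speedup:.2f}x")
```

At peak rate the ICB moves data roughly three times faster than the ISC. This accounts only for bandwidth; the latency reduction the abstract emphasizes is a separate effect that this sketch does not model.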


IBM Journal of Research and Development | 2002

Coupling I/O channels for the IBM eServer z900: reengineering required

Thomas A. Gregg; Richard K. Errickson

The IBM eServer z900 introduces new Parallel Sysplex® coupling channels that satisfy evolving requirements in a way that minimizes product and development costs. Their design also provides backward compatibility with earlier S/390® models, spans all three coupling channel design points, and anticipates future end-of-life technology issues. The original intersystem channel (ISC) design was improved, and new features added, but the core chips were retained. This paper describes the efforts that led to the improved design.


International Symposium on Microarchitecture | 1994

IBM's ES/9000 Model 982's fault-tolerant design for consolidation

Lisa Spainhower; Thomas A. Gregg; Ram Chillarege

Consolidated workloads running around the clock mean that today's large, general-purpose computers must meet high availability demands. To meet these demands, it is argued that the Model 982 provides fault tolerance by combining enhanced circuit-level error detection and failure isolation techniques with system-level techniques exploiting inherent redundancy.


IBM Journal of Research and Development | 1997

IBM S/390 Parallel Enterprise Servers G3 and G4

Gururaj S. Rao; Thomas A. Gregg; Cyril A. Price; Chitta L. Rao; Steven J. Repka

This overview paper describes the key steps taken by IBM to transform the S/390® mainframe platform and to enhance customer satisfaction with improvements in cost, scalability, and application enablement. The effectiveness of the transformation is discussed in the context of performance and reliability, and the significance of cluster architecture is defined. Finally, mainframe resurgence is discussed, and factors important to enabling the growth of servers and microprocessors are presented.


IBM Journal of Research and Development | 1992

The IBM Enterprise Systems Connection (ESCON) channel: a versatile building block

John R. Flanagan; Thomas A. Gregg; Daniel F. Casper

The IBM Enterprise Systems Connection (ESCON™) environment required the design of a single channel that could be attached to the entire line of Enterprise System/9000™ processors and deliver the performance required by the top of that line. In addition to the channel, other functions were needed, such as the ESCON channel-to-channel adapter. All of these functions were required to be implemented using the same channel hardware. This paper describes the key elements of the IBM ESCON channel design.
