Craig B. Peterson
Intel
Publication
Featured research published by Craig B. Peterson.
conference on high performance computing (supercomputing) | 1988
Shekhar Borkar; Robert Cohn; George W. Cox; Sha Gleason; Thomas Gross; H. T. Kung; Monica S. Lam; Brian E. Moore; Craig B. Peterson; John Samuel Pieper; Linda J. Rankin; P. S. Tseng; Jim Sutton; John Urbanski; Jon A. Webb
A description is given of the iWarp architecture and how it supports various communication models and system configurations. The heart of an iWarp system is the iWarp component: a single-chip processor that requires only the addition of memory chips to form a complete system building block, called the iWarp cell. Each iWarp component contains both a powerful computation engine that runs at 20 MFLOPS (million floating-point operations per second) and a high-throughput (320 Mbytes/s), low-latency (100-150 ns) communication engine for interfacing with other iWarp cells. Because of their strong computation and communication capabilities, the iWarp components provide a versatile building block for high-performance parallel systems ranging from special-purpose systolic arrays to general-purpose distributed-memory computers. They can support both fine-grain parallel and coarse-grain distributed computation models simultaneously in the same system. The initial iWarp demonstration system consists of an 8×8 torus of iWarp cells, delivering more than 1.2 GFLOPS (billions of floating-point operations per second). It can be expanded to include up to 1024 cells.
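The aggregate figures quoted in this abstract follow from the per-cell rate. A minimal sketch of that arithmetic, assuming peak performance scales linearly with the number of cells and ignoring communication and memory overhead (the function name `peak_gflops` is hypothetical, chosen only for illustration):

```python
# Peak-throughput arithmetic for an iWarp array.
# Assumption: ideal linear scaling with cell count (not a measured result).
MFLOPS_PER_CELL = 20  # per-cell peak rate quoted in the abstract


def peak_gflops(cells: int) -> float:
    """Aggregate peak in GFLOPS for a given number of cells."""
    return cells * MFLOPS_PER_CELL / 1000


demo_cells = 8 * 8  # initial 8x8 torus demonstration system
max_cells = 1024    # largest configuration mentioned in the abstract

print(peak_gflops(demo_cells))  # 1.28
print(peak_gflops(max_cells))   # 20.48
```

Under that assumption the 64-cell torus peaks at 1.28 GFLOPS, consistent with "more than 1.2 GFLOPS", and a 1024-cell system peaks at about 20 GFLOPS.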
international symposium on computer architecture | 1990
Shekhar Borkar; Robert Cohn; George W. Cox; Thomas R. Gross; H. T. Kung; Monica S. Lam; Margie Levine; Brian E. Moore; Wire Moore; Craig B. Peterson; Jim Susman; Jim Sutton; John Urbanski; Jon A. Webb
iWarp is a parallel architecture developed jointly by Carnegie Mellon University and Intel Corporation. The iWarp communication system supports two widely used interprocessor communication styles: memory communication and systolic communication. This paper describes the rationale, architecture, and implementation for the iWarp communication system. The sending or receiving processor of a message can perform either memory or systolic communication. In memory communication, the entire message is buffered in the local memory of the processor before it is transmitted or after it is received. Therefore, communication begins or terminates at the local memory. For conventional message-passing methods, both sending and receiving processors use memory communication. In systolic communication, individual data items are transferred as they are produced, or are used as they are received, by the program running at the processor. Memory communication is flexible and well suited for general computing, whereas systolic communication is efficient and well suited for speed-critical applications. A major achievement of the iWarp effort is the derivation of a common design to satisfy the requirements of both systolic and memory communication styles. This is made possible by two important innovations in communication: (1) program access to communication and (2) logical channels. The former allows programs to access data as they are transmitted and to redirect portions of messages to different destinations efficiently. The latter increases the connectivity between the processors and guarantees communication bandwidth for classes of messages. These innovations have provided a focus for the iWarp architecture. The result is a communication system that provides a total bandwidth of 320 MBytes/sec and that is integrated on a single VLSI component with a 20 MFLOPS plus 20 MIPS long instruction word computation engine.
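The difference between the two communication styles described in this abstract can be illustrated with a small software analogy. This is only a sketch: Python generators stand in for the hardware link, all names (`sender`, `memory_receive`, `systolic_receive`) are hypothetical, and nothing here models logical channels or the actual iWarp instruction set.

```python
from typing import Callable, Iterable, Iterator, List


def sender(n: int) -> Iterator[int]:
    """Hypothetical sender that produces words one at a time."""
    for i in range(n):
        yield i


def memory_receive(link: Iterable[int]) -> List[int]:
    """Memory style: buffer the entire message in local memory before use."""
    return list(link)


def systolic_receive(link: Iterable[int], step: Callable[[int], int]) -> Iterator[int]:
    """Systolic style: process each word as it arrives, without a message buffer."""
    for word in link:
        yield step(word)


# Memory communication: the program sees the data only after the whole message has arrived.
message = memory_receive(sender(4))
memory_result = [w * 2 for w in message]

# Systolic communication: each word is consumed as soon as it is produced.
systolic_result = list(systolic_receive(sender(4), lambda w: w * 2))

print(memory_result, systolic_result)
```

Both paths compute the same values; the memory path holds the full message before computing, while the systolic path never stores more than one word at a time.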
international symposium on microarchitecture | 1991
Craig B. Peterson; James A. Sutton; Paul Wiley
An architecture that efficiently supports both message-passing and systolic communications in one system is presented. This architecture incorporates a variety of innovative features unifying both computational power and communications flexibility in one VLSI component, the iWarp microprocessor. The message-based communication model is discussed, and an overview of the architecture is given. Two principal iWarp components, called the communication agent and the computation agent, and the register file they share are described. The efficiencies of word-level communication are examined. The software development environment is also described.
international conference on application specific array processors | 1990
Brent S. Baxter; George W. Cox; Thomas R. Gross; H. T. Kung; David R. O'Hallaron; Craig B. Peterson; Jon A. Webb; Paul Wiley
The iWarp processor, which integrates both communication and computation functions on a single VLSI component, is described. The iWarp component and subsystems including it are powerful building blocks for constructing a new generation of application-specific computing systems. These special-purpose systems can achieve very high performance, while maintaining a high degree of flexibility to address different needs of an application. In particular, iWarp systems deliver high computation bandwidth (up to 20 GFLOPS for a 1024-cell system), as well as high communication bandwidth (320 Mbytes/s per cell). Programming these systems is assisted by modern tools such as optimizing compilers and parallel program generators.
national computer conference | 1983
Dave Johnson; Dave Budde; Dave Carson; Craig B. Peterson
Early in 1983, two new VLSI components were added to the iAPX 432 family of components. The 43204 Bus Interface Unit (BIU) and the 43205 Memory Control Unit (MCU) extend the logical flexibility and robustness of the 432 processors into the physical implementation of 432 systems. The BIU and MCU provide a range of fault-tolerant system options. The components provide comprehensive detection facilities for processor operations as well as for the operation of buses and memories. Recovery is possible from permanent as well as transient errors. Detection and recovery are done entirely in the VLSI components; there is no need for additional TTL logic or diagnostic software. This range of fault-tolerant capabilities is achieved by replication of VLSI components. VLSI replication provides software-transparent migration over the full range of fault-tolerant options without any penalties for unused fault-tolerant facilities in low-end systems.
Archive | 1982
David L. Budde; David G. Carson; Anthony L. Cornish; David B. Johnson; Craig B. Peterson
Archive | 1981
David L. Budde; David G. Carson; Anthony L. Cornish; Brad W. Hosler; David B. Johnson; Craig B. Peterson
Archive | 1981
John A. Bayliss; Craig B. Peterson; Doran K. Wilde
Archive | 1994
Keith-Michael W. Self; Craig B. Peterson; James A. Sutton II; John Urbanski; George W. Cox; Linda J. Rankin; David W. Archer; Shekhar Borkar
Archive | 1982
David L. Budde; David G. Carson; Anthony L. Cornish; David B. Johnson; Craig B. Peterson