Publication


Featured research published by Paul W. Coteus.


IBM Journal of Research and Development | 2005

Overview of the Blue Gene/L system architecture

Alan Gara; Matthias A. Blumrich; Dong Chen; George Liang-Tai Chiu; Paul W. Coteus; Mark E. Giampapa; Ruud A. Haring; Philip Heidelberger; Dirk Hoenicke; Gerard V. Kopcsay; Thomas A. Liebsch; Martin Ohmacht; Burkhard Steinmacher-Burow; Todd E. Takken; Pavlos M. Vranas

The Blue Gene®/L computer is a massively parallel supercomputer based on IBM system-on-a-chip technology. It is designed to scale to 65,536 dual-processor nodes, with a peak performance of 360 teraflops. This paper describes the project objectives and provides an overview of the system architecture that resulted. We discuss our application-based approach and rationale for a low-power, highly integrated design. The key architectural features of Blue Gene/L are introduced in this paper: the link chip component and five Blue Gene/L networks, the PowerPC® 440 core and floating-point enhancements, the on-chip and off-chip distributed memory system, the node- and system-level design for high reliability, and the comprehensive approach to fault isolation.


IBM Journal of Research and Development | 2005

Blue Gene/L torus interconnection network

Narasimha R. Adiga; Matthias A. Blumrich; Dong Chen; Paul W. Coteus; Alan Gara; Mark E. Giampapa; Philip Heidelberger; Sarabjeet Singh; Burkhard Steinmacher-Burow; Todd E. Takken; Mickey Tsao; Pavlos M. Vranas

The main interconnect of the massively parallel Blue Gene®/L is a three-dimensional torus network with dynamic virtual cut-through routing. This paper describes both the architecture and the microarchitecture of the torus and a network performance simulator. Both simulation results and hardware measurements are presented.
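As a rough illustration of the three-dimensional torus topology named in the abstract above, the following sketch computes a node's six nearest neighbors and the minimal hop count between two nodes, with wraparound in each dimension. The function names and the 4 x 4 x 4 dimensions are hypothetical, chosen only for illustration. The sketch covers topology only and does not model the dynamic virtual cut-through routing described in the paper.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbors of a node in a 3D torus.

    coord: (x, y, z) position of the node.
    dims:  (X, Y, Z) size of the torus in each dimension (illustrative values).
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            # The modulo supplies the wraparound link that turns a mesh into a torus.
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b, taking the shorter way around each ring."""
    return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical size, not taken from the paper
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

Because every dimension closes into a ring, a corner node such as (0, 0, 0) still has six neighbors, and the distance along any axis never exceeds half that axis's length.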


International Symposium on Microarchitecture | 2012

The IBM Blue Gene/Q Compute Chip

Ruud A. Haring; Martin Ohmacht; Thomas W. Fox; Michael Karl Gschwind; David L. Satterfield; Krishnan Sugavanam; Paul W. Coteus; Philip Heidelberger; Matthias A. Blumrich; Robert W. Wisniewski; Alan Gara; George Liang-Tai Chiu; Peter A. Boyle; Norman H. Christ; Changhoan Kim

Blue Gene/Q aims to build a massively parallel high-performance computing system out of power-efficient processor chips, resulting in power-efficient, cost-efficient, and floor-space-efficient systems. Focusing on reliability during design helps with scaling to large systems and lowers the total cost of ownership. This article examines the architecture and design of the Compute chip, which combines processors, memory, and communication functions on a single chip.


Proceedings of the IEEE | 2001

On-chip wiring design challenges for gigahertz operation

Alina Deutsch; Paul W. Coteus; Gerard V. Kopcsay; Howard H. Smith; Byron Krauter; Daniel C. Edelstein; Phillip J. Restle

This paper reviews the status of present day on-chip wiring design methodologies and understanding. A brief explanation is given of the fundamental transmission-line properties that should be considered for accurate prediction of crosstalk, common-mode noise and clock skew. The deficiencies of RC-circuit representation are highlighted and design guidelines are given for using modeling and simulation techniques that have been previously used for package interconnections. Such techniques are believed to teach designers how to make better use of available technologies and help them architect systems that operate with many-GHz clock rates.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2014

Addressing failures in exascale computing

Marc Snir; Robert W. Wisniewski; Jacob A. Abraham; Sarita V. Adve; Saurabh Bagchi; Pavan Balaji; Jim Belak; Pradip Bose; Franck Cappello; Bill Carlson; Andrew A. Chien; Paul W. Coteus; Nathan DeBardeleben; Pedro C. Diniz; Christian Engelmann; Mattan Erez; Saverio Fazzari; Al Geist; Rinku Gupta; Fred Johnson; Sriram Krishnamoorthy; Sven Leyffer; Dean A. Liberty; Subhasish Mitra; Todd S. Munson; Rob Schreiber; Jon Stearley; Eric Van Hensbergen

We present here a report produced by a workshop on ‘Addressing failures in exascale computing’ held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to establish a common taxonomy about resilience across all the levels in a computing system, discuss existing knowledge on resilience across the various hardware and software layers of an exascale system, and build on those results, examining potential solutions from both a hardware and software perspective and focusing on a combined approach. The workshop brought together participants with expertise in applications, system software, and hardware; they came from industry, government, and academia, and their interests ranged from theory to implementation. The combination allowed broad and comprehensive discussions and led to this document, which summarizes and builds on those discussions.


Proceedings of the IEEE | 2010

Practical Strategies for Power-Efficient Computing Technologies

Leland Chang; David J. Frank; Robert K. Montoye; Steven J. Koester; Brian L. Ji; Paul W. Coteus; Robert H. Dennard; Wilfried Haensch

After decades of continuous scaling, further advancement of silicon microelectronics across the entire spectrum of computing applications is today limited by power dissipation. While the trade-off between power and performance is well-recognized, most recent studies focus on the extreme ends of this balance. By concentrating instead on an intermediate range, an ~8× improvement in power efficiency can be attained without system performance loss in parallelizable applications, those in which such efficiency is most critical. It is argued that power-efficient hardware is fundamentally limited by voltage scaling, which can be achieved only by blurring the boundaries between devices, circuits, and systems and cannot be realized by addressing any one area alone. By simultaneously considering all three perspectives, the major issues involved in improving power efficiency in light of performance and area constraints are identified. Solutions for the critical elements of a practical computing system are discussed, including the underlying logic device, associated cache memory, off-chip interconnect, and power delivery system. The IBM Blue Gene system is then presented as a case study to exemplify several proposed directions. Going forward, further power reduction may demand radical changes in device technologies and computer architecture; hence, a few such promising methods are briefly considered.


IEEE Transactions on Electromagnetic Compatibility | 2001

Frequency-dependent losses on high-performance interconnections

Alina Deutsch; Gerard V. Kopcsay; Paul W. Coteus; Paul Eric Dahlen; David L. Heckmann; Dah-Weih Duan

This paper compares the major classes of chip-to-chip and on-chip interconnections used in high-performance computers and communication systems and reviews their electrical characteristics. Measurement results of dielectric loss are shown and the attenuation is compared for printed-circuit-board, glass-ceramic, thin-film, and on-chip wiring. Simulation results are shown with representative driver and receiver circuits, guidelines are given for when losses are significant, and predictions are made for the sustainable bandwidths on useful wiring lengths.


IBM Journal of Research and Development | 2005

Packaging the Blue Gene/L supercomputer

Paul W. Coteus; H. R. Bickford; T. M. Cipolla; Paul G. Crumley; Alan Gara; Shawn A. Hall; Gerard V. Kopcsay; Alphonso P. Lanzetta; L. S. Mok; Rick A. Rand; R. Swetz; Todd E. Takken; P. La Rocca; C. Marroquin; P. R. Germann; M. J. Jeanson

As 1999 ended, IBM announced its intention to construct a one-petaflop supercomputer. The construction of this system was based on a cellular architecture: the use of relatively small but powerful building blocks used together in sufficient quantities to construct large systems. The first step on the road to a petaflop machine (one quadrillion floating-point operations in a second) is the Blue Gene®/L supercomputer. Blue Gene/L combines a low-power processor with a highly parallel architecture to achieve unparalleled computing performance per unit volume. Implementing the Blue Gene/L packaging involved trading off considerations of cost, power, cooling, signaling, electromagnetic radiation, mechanics, component selection, cabling, reliability, service strategy, risk, and schedule. This paper describes how 1,024 dual-processor compute application-specific integrated circuits (ASICs) are packaged in a scalable rack, and how racks are combined and augmented with host computers and remote storage. The Blue Gene/L interconnect, power, cooling, and control systems are described individually and as part of the synergistic whole.


IBM Journal of Research and Development | 2011

Technologies for exascale systems

Paul W. Coteus; John U. Knickerbocker; Chung H. Lam; Yurii A. Vlasov

To satisfy the economic drive for ever more powerful computers to handle scientific and business applications, new technologies are needed to overcome the limitations of current approaches. New memory technologies will address the need for greater amounts of data in close proximity to the processors. Three-dimensional silicon integration will allow more cache and function to be integrated with the processor while allowing more than 1,000 times higher bandwidth communications at low power per channel using local interconnects between Si die layers and between die stacks. Integrated silicon nanophotonics will provide low-power and high-bandwidth optical interconnections between different parts of the system on a chip, board, and rack levels. Highly efficient power delivery and advanced liquid cooling will reduce the electrical demand and facility costs. A combination of these technologies will likely be required to build exascale systems that meet the combined challenges of a practical power constraint on the order of 20 MW with sufficient reliability and at a reasonable cost.


Electrical Performance of Electronic Packaging | 1997

The importance of inductance and inductive coupling for on-chip wiring

Alina Deutsch; Howard H. Smith; George A. Katopis; Wiren D. Becker; Paul W. Coteus; Gerard V. Kopcsay; Barry J. Rubin; R.P. Dunne; T. Gallo; Daniel R. Knebel; B.L. Krauter; L.M. Terman; G.A. Sai-Halasz; P.J. Restle

The importance of inductance and inductive coupling for accurate delay and crosstalk prediction in on-chip interconnections is investigated experimentally for the top three layers in a five-layer wiring structure and guidelines are formulated. In-plane and between-plane crosstalk and delay dependence on driver and receiver circuit device sizes and line lengths and width are analyzed with representative CMOS circuits. Simplified constant-parameter, distributed coupled-line RLC-circuit representation that approximates the waveforms predicted with frequency-dependent line parameters is shown to be feasible.
