H. Peter Hofstee
IBM
Publication
Featured research published by H. Peter Hofstee.
vlsi test symposium | 1998
David F. Heidel; Sang Hoo Dhong; H. Peter Hofstee; Michael Immediato; Kevin J. Nowka; Joel Abraham Silberman; Kevin Stawiasz
As microprocessor speeds approach 1 GHz and beyond, the difficulties of at-speed testing continue to increase. In particular, automated test equipment that operates at these frequencies is very limited. This paper discusses a design-for-test method which serializes parallel circuit inputs and de-serializes circuit outputs to achieve 1 GHz operation on test equipment operating at frequencies below 100 MHz. This method has been used to successfully characterize the operation of a 1 GHz microprocessor chip.
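The serialize/de-serialize idea in this abstract can be illustrated with a small behavioral sketch. This is not the circuit from the paper: the function names, the MSB-first bit order, and the 10:1 ratio (taken from 1 GHz divided by a 100 MHz tester limit) are assumptions chosen for illustration only.

```python
# Behavioral sketch (assumed, not the paper's design): a slow tester supplies
# one parallel word per slow cycle, and the chip side shifts it out one bit
# per fast cycle. Outputs go through the reverse path.

FAST_HZ = 1_000_000_000   # on-chip clock from the abstract
SLOW_HZ = 100_000_000     # assumed upper bound on the tester clock
RATIO = FAST_HZ // SLOW_HZ  # bits that must be packed into each slow cycle


def serialize(words, width=RATIO):
    """Expand each width-bit parallel word into width fast-clock bits, MSB first."""
    stream = []
    for word in words:
        for i in reversed(range(width)):
            stream.append((word >> i) & 1)
    return stream


def deserialize(stream, width=RATIO):
    """Pack each group of width fast-clock bits back into one parallel word."""
    words = []
    for start in range(0, len(stream), width):
        word = 0
        for bit in stream[start:start + width]:
            word = (word << 1) | bit
        words.append(word)
    return words
```

With these assumptions, each slow tester cycle carries ten fast-clock values per pin, and passing a stream through `serialize` and then `deserialize` returns the original words.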
international conference on computer design | 1998
Stephen D. Posluszny; Nobumasa Aoki; David William Boerstler; Jeffrey L. Burns; Sang Hoo Dhong; Uttam Shyamalindu Ghoshal; H. Peter Hofstee; David P. LaPotin; Kyung Tek Lee; David Meltzer; Hung C. Ngo; Kevin J. Nowka; Joel Abraham Silberman; Osamu Takahashi; Ivan Vo
This paper describes the design methodology used to build an experimental 1.0 GHz PowerPC integer microprocessor at IBM's Austin Research Laboratory. The high frequency requirements dictated the chip composition to be almost entirely custom macros using dynamic circuit techniques. The methodology presented will cover design and verification tools as well as circuit constraints and microarchitecture philosophy. The microarchitecture, circuits and tools were defined by the high frequency requirements of the processor as well as the aggressive design schedule and size of the design team.
asia and south pacific design automation conference | 2006
D. Pham; Hans-Werner Anderson; Erwin Behnen; Mark Bolliger; Sanjay Gupta; H. Peter Hofstee; Paul Harvey; Charles Ray Johns; James Allan Kahle; Atsushi Kameyama; John M. Keaty; Bob Le; Sang Lee; Tuyen V. Nguyen; John George Petrovick; Mydung Pham; Juergen Pille; Stephen D. Posluszny; Mack W. Riley; Joseph Roland Verock; James D. Warnock; Steve Weitzel; Dieter Wendel
This paper reviews the design challenges that current and future processors must face, with stringent power limits and high frequency targets, and the design methods required to overcome these challenges and address the continuing giga-scale system integration trend. This paper then describes the details behind the design methodology that was used to successfully implement a first-generation CELL processor, a multi-core SoC. Key features of this methodology are broad optimization with fast rule-based analysis engines using macro-level abstraction for constraint propagation up and down the design hierarchy, coupled with accurate transistor-level simulation for detailed analysis. The methodology fostered the modular design concept that is inherent to the CELL architecture, enabling a high-frequency design by maximizing custom circuit content through re-use, and balanced power, frequency, and die size targets through global convergence capabilities. The design has roughly 241 million transistors implemented in 90 nm SOI technology with 8 levels of copper interconnects and one local interconnect layer. The chip has been tested at various temperatures, voltages, and frequencies. Correct operation has been observed in the lab on first-pass silicon at frequencies well over 4 GHz.
IEEE Micro | 2014
Raphael Polig; Kubilay Atasu; Laura Chiticariu; Christoph Hagleitner; H. Peter Hofstee; Frederick R. Reiss; Huaiyu Zhu; Eva Sitaridi
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text-analytics system that offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing so-called big data efficiently, despite the high memory bandwidth that is available. The authors show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information extraction queries can be improved by an order of magnitude. They also show how such a system can be deployed by extending SystemT's existing compilation flow and by using a multithreaded communication interface that can efficiently use the accelerator's bandwidth.
IEEE Computer | 2015
Fadi H. Gebara; H. Peter Hofstee; Kevin J. Nowka
More varied data channels, increasingly diverse analytic methods, and new deployment models, along with some fundamental technology shifts, will significantly impact the next generation of big data systems.
international conference on computer design | 1998
Osamu Takahashi; Joel Abraham Silberman; Sang Hoo Dhong; H. Peter Hofstee; Nobumasa Aoki
This paper describes a 690 ps read-access latency, 32 entry by 64 bit, 3 read-port, 2 write-port register file with internal bypass. The register file has been fabricated as part of a 1.0 GHz single-issue 64-bit PowerPC integer processor. Fabrication technology was IBM CMOS6X: 0.25 µm drawn channel length, six-metal-layer (Al), 1.8 V nominal VDD. Self-resetting custom dynamic circuits are used exclusively. Read operation is accomplished by sensing the differential voltage of dual-rail bit-lines. Read operation is followed by write operation in the same cycle. Whenever a read address is identical to a write address, the write data is forwarded by an output multiplexer. The register file has been tested and cycle-by-cycle operation in the processor environment verified at frequencies up to 1.0 GHz (1.8 V, 25 °C).
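The read-then-write ordering and the bypass rule described in this abstract can be sketched as a cycle-level behavioral model. This is only an illustration under stated assumptions: the class and method names are hypothetical, timing and circuit details are ignored, and the behavior when both write ports target the same address (the later one wins here) is an assumption the abstract does not address.

```python
class RegisterFile:
    """Behavioral sketch: 32 entries x 64 bits, 3 read ports, 2 write ports."""

    ENTRIES = 32
    MASK = (1 << 64) - 1  # keep stored values to 64 bits

    def __init__(self):
        self.regs = [0] * self.ENTRIES

    def cycle(self, read_addrs, writes):
        """Run one cycle.

        read_addrs: up to 3 register addresses.
        writes: up to 2 (address, data) pairs.
        Returns the read data, one value per read address.
        """
        assert len(read_addrs) <= 3 and len(writes) <= 2
        # Read phase: sample the array before this cycle's writes land.
        stored = [self.regs[addr] for addr in read_addrs]
        # Write phase: follows the read within the same cycle.
        pending = {}
        for addr, data in writes:
            pending[addr] = data & self.MASK
            self.regs[addr] = data & self.MASK
        # Output multiplexer: forward write data when addresses match.
        return [pending.get(addr, value) for addr, value in zip(read_addrs, stored)]
```

In this model, a read of an address that is also written in the same cycle returns the new write data rather than the stale array contents, which is the effect the bypass is meant to provide.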
Archive | 2009
H. Peter Hofstee
The Cell Broadband Engine™ Architecture defines a heterogeneous chip multi-processor (HCMP). Heterogeneous processors can achieve higher degrees of efficiency and performance than homogeneous chip multi-processors (CMPs), but also place a larger burden on software. In this chapter, we describe the Cell Broadband Engine Architecture and implementations. We discuss how memory flow control and the synergistic processor unit architecture extend the Power Architecture™ to allow the creation of heterogeneous implementations that attack the greatest sources of inefficiency in modern microprocessors. We discuss aspects of the micro-architecture and implementation of the Cell Broadband Engine and PowerXCell 8i processors. Next we survey portable approaches to programming the Cell Broadband Engine and we discuss aspects of its performance.
asian solid state circuits conference | 2005
Osamu Takahashi; Scott R. Cottier; Sang Hoo Dhong; Brian Flachs; Koji Hirairi; H. Peter Hofstee; Brad W. Michael; Hiromi Noro; Dieter Wendel; Michael Wayne White
A 4-way SIMD streaming processor of a Cell processor is developed in a 90 nm SOI technology. CMOS static gates implement the majority of the logic. Dynamic circuits are used in critical areas, occupying 19% of the non-SRAM area. ISA, microarchitecture, and physical implementation are co-optimized to achieve a compact and power-efficient design.
Information Processing Letters | 2001
H. Peter Hofstee; Jun Sawada
We formally construct a rotator circuit with more homogeneous interconnect than the standard logarithmic rotator. The circuit has improved data-in to data-out delay and similar control to data-out delay. The circuit is easily modified to support (SIMD) subfield rotate instructions.
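For context, the standard logarithmic rotator that this abstract uses as its baseline can be sketched behaviorally as follows. This is the conventional staged structure, not the circuit constructed in the paper; the function name and the list-of-bits representation are assumptions for illustration.

```python
def log_rotate_left(bits, amount):
    """Rotate a power-of-two-length list left using log2(n) mux stages.

    Stage k either passes its input through or rotates it by 2**k,
    selected by bit k of the rotate amount.
    """
    n = len(bits)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    stage = 0
    while (1 << stage) < n:
        shift = 1 << stage
        if (amount >> stage) & 1:
            # This stage's wiring spans 2**stage positions, so later
            # stages need progressively longer interconnect.
            bits = bits[shift:] + bits[:shift]
        stage += 1
    return bits
```

Because only the low log2(n) bits of `amount` are examined, the rotate amount is effectively taken modulo the word length. The growing wire span from stage to stage is the non-uniform interconnect the abstract contrasts against.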
Archive | 2009
Adam Patrick Burns; Michael Norman Day; Brian Flachs; H. Peter Hofstee; Charles Ray Johns; John Samuel Liberty