H. Peter Hofstee | Researchain

Publication

Featured research published by H. Peter Hofstee.

vlsi test symposium | 1998

High speed serializing/de-serializing design-for-test method for evaluating a 1 GHz microprocessor

David F. Heidel; Sang Hoo Dhong; H. Peter Hofstee; Michael Immediato; Kevin J. Nowka; Joel Abraham Silberman; Kevin Stawiasz

As microprocessor speeds approach 1 GHz and beyond, the difficulties of at-speed testing continue to increase. In particular, automated test equipment that operates at these frequencies is very limited. This paper discusses a design-for-test method that serializes parallel circuit inputs and de-serializes circuit outputs to achieve 1 GHz operation on test equipment operating at frequencies below 100 MHz. This method has been used to successfully characterize the operation of a 1 GHz microprocessor chip.

international conference on computer design | 1998

Design methodology for a 1.0 GHz microprocessor

Stephen D. Posluszny; Nobumasa Aoki; David William Boerstler; Jeffrey L. Burns; Sang Hoo Dhong; Uttam Shyamalindu Ghoshal; H. Peter Hofstee; David P. LaPotin; Kyung Tek Lee; David Meltzer; Hung C. Ngo; Kevin J. Nowka; Joel Abraham Silberman; Osamu Takahashi; Ivan Vo

This paper describes the design methodology used to build an experimental 1.0 GHz PowerPC integer microprocessor at IBM's Austin Research Laboratory. The high frequency requirements dictated the chip composition to be almost entirely custom macros using dynamic circuit techniques. The methodology presented covers design and verification tools as well as circuit constraints and microarchitecture philosophy. The microarchitecture, circuits, and tools were defined by the high frequency requirements of the processor as well as the aggressive design schedule and the size of the design team.

asia and south pacific design automation conference | 2006

Key features of the design methodology enabling a multi-core SoC implementation of a first-generation CELL processor

D. Pham; Hans-Werner Anderson; Erwin Behnen; Mark Bolliger; Sanjay Gupta; H. Peter Hofstee; Paul Harvey; Charles Ray Johns; James Allan Kahle; Atsushi Kameyama; John M. Keaty; Bob Le; Sang Lee; Tuyen V. Nguyen; John George Petrovick; Mydung Pham; Juergen Pille; Stephen D. Posluszny; Mack W. Riley; Joseph Roland Verock; James D. Warnock; Steve Weitzel; Dieter Wendel

This paper reviews the design challenges that current and future processors must face, with stringent power limits and high frequency targets, and the design methods required to overcome these challenges and address the continuing giga-scale system integration trend. This paper then describes the details behind the design methodology that was used to successfully implement a first-generation CELL processor, a multi-core SoC. Key features of this methodology are broad optimization with fast rule-based analysis engines using macro-level abstraction for constraint propagation up and down the design hierarchy, coupled with accurate transistor-level simulation for detailed analysis. The methodology fostered the modular design concept that is inherent to the CELL architecture, enabling a high frequency design by maximizing custom circuit content through re-use, and balanced power, frequency, and die size targets through global convergence capabilities. The design has roughly 241 million transistors implemented in 90 nm SOI technology with 8 levels of copper interconnects and one local interconnect layer. The chip has been tested at various temperatures, voltages, and frequencies. Correct operation has been observed in the lab on first-pass silicon at frequencies well over 4 GHz.

IEEE Micro | 2014

Giving Text Analytics a Boost

Raphael Polig; Kubilay Atasu; Laura Chiticariu; Christoph Hagleitner; H. Peter Hofstee; Frederick R. Reiss; Huaiyu Zhu; Eva Sitaridi

The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text-analytics system that offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing so-called big data efficiently, despite the high memory bandwidth that is available. The authors show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information extraction queries can be improved by an order of magnitude. They also show how such a system can be deployed by extending SystemT's existing compilation flow and by using a multithreaded communication interface that can efficiently use the accelerator's bandwidth.

IEEE Computer | 2015

Second-Generation Big Data Systems

Fadi H. Gebara; H. Peter Hofstee; Kevin J. Nowka

More varied data channels, increasingly diverse analytic methods, and new deployment models, along with some fundamental technology shifts, will significantly impact the next generation of big data systems.

international conference on computer design | 1998

A 690 ps read-access latency register file for a GHz integer microprocessor

Osamu Takahashi; Joel Abraham Silberman; Sang Hoo Dhong; H. Peter Hofstee; Nobumasa Aoki

This paper describes a 690 ps read-access latency, 32-entry by 64-bit, 3 read-port, 2 write-port register file with internal bypass. The register file has been fabricated as part of a 1.0 GHz single-issue 64-bit PowerPC integer processor. Fabrication technology was IBM CMOS6X: 0.25-μm drawn channel length, six-metal-layer (Al), 1.8 V nominal VDD. Self-resetting custom dynamic circuits are used exclusively. Read operation is accomplished by sensing the differential voltage of dual-rail bit-lines. Read operation is followed by write operation in the same cycle. Whenever a read address is identical to a write address, the write data is forwarded by an output multiplexer. The register file has been tested, and cycle-by-cycle operation in the processor environment has been verified at frequencies up to 1.0 GHz (1.8 V, 25 °C).
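
The forwarding rule in the abstract above (a read whose address matches a same-cycle write returns the write data through an output multiplexer) can be illustrated with a small behavioral sketch. This is only a toy software model, not the paper's circuit: the class name `RegisterFile`, the method `cycle`, and the choice that the last of two writes to the same address wins are all assumptions made for illustration.

```python
class RegisterFile:
    """Toy behavioral model of a register file with read/write bypass.

    Hypothetical illustration only; port counts are not enforced.
    """

    def __init__(self, entries=32, width=64):
        self.mask = (1 << width) - 1
        self.regs = [0] * entries

    def cycle(self, read_addrs, writes):
        """Model one cycle in which reads and writes share the same cycle.

        read_addrs: register indices for the read ports
        writes: (index, value) pairs for the write ports
        """
        # Assumption: if two writes target the same index, the last one wins.
        pending = {addr: value & self.mask for addr, value in writes}
        # Output multiplexer: forward the write data when the read address
        # matches a write address, otherwise return the stored value.
        outputs = [pending.get(addr, self.regs[addr]) for addr in read_addrs]
        # Commit the writes after the reads have been resolved.
        for addr, value in pending.items():
            self.regs[addr] = value
        return outputs
```

For example, after writing 42 to register 5 in one cycle, a later cycle that reads registers 5 and 6 while writing 7 to register 6 returns the stored 42 for register 5 and the forwarded 7 for register 6.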

Archive | 2009

Heterogeneous Multi-core Processors: The Cell Broadband Engine

H. Peter Hofstee

The Cell Broadband Engine™ Architecture defines a heterogeneous chip multi-processor (HCMP). Heterogeneous processors can achieve higher degrees of efficiency and performance than homogeneous chip multi-processors (CMPs), but also place a larger burden on software. In this chapter, we describe the Cell Broadband Engine Architecture and implementations. We discuss how memory flow control and the synergistic processor unit architecture extend the Power Architecture™ to allow the creation of heterogeneous implementations that attack the greatest sources of inefficiency in modern microprocessors. We discuss aspects of the micro-architecture and implementation of the Cell Broadband Engine and PowerXCell8i processors. Next we survey portable approaches to programming the Cell Broadband Engine and we discuss aspects of its performance.

asian solid state circuits conference | 2005

The Power Conscious Synergistic Processor Element of a Cell Processor

Osamu Takahashi; Scott R. Cottier; Sang Hoo Dhong; Brian Flachs; Koji Hirairi; H. Peter Hofstee; Brad W. Michael; Hiromi Noro; Dieter Wendel; Michael Wayne White

A 4-way SIMD streaming processor of a Cell processor is developed in a 90 nm SOI technology. CMOS static gates implement the majority of the logic. Dynamic circuits are used in critical areas, occupying 19% of the non-SRAM area. ISA, microarchitecture, and physical implementation are co-optimized to achieve a compact and power-efficient design.

Information Processing Letters | 2001

Derivation of a rotator circuit with homogeneous interconnect

H. Peter Hofstee; Jun Sawada

We formally construct a rotator circuit with more homogeneous interconnect than the standard logarithmic rotator. The circuit has improved data-in to data-out delay and similar control to data-out delay. The circuit is easily modified to support (SIMD) subfield rotate instructions.
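
For context, the standard logarithmic rotator that the abstract uses as its baseline can be sketched in software as follows. This is not the circuit derived in the paper; it is a minimal model of the conventional design, assuming a power-of-two width, and the function name `rotate_left_log` is hypothetical.

```python
def rotate_left_log(bits, amount):
    """Rotate a sequence left by `amount` using log2(n) multiplexer stages.

    Behavioral sketch of a conventional logarithmic rotator; the width
    must be a power of two.
    """
    n = len(bits)
    assert n > 0 and n & (n - 1) == 0, "width must be a power of two"
    amount %= n
    bits = list(bits)
    stage = 0
    while (1 << stage) < n:
        shift = 1 << stage
        if (amount >> stage) & 1:
            # Stage k rotates by 2**k, so each output selects an input that
            # is 2**k positions away; the connection span grows per stage.
            bits = [bits[(i + shift) % n] for i in range(n)]
        stage += 1
    return bits
```

Each bit of the rotate amount controls one stage, so an 8-wide rotate by 3 is realized as a rotate by 1 followed by a rotate by 2. Because the connection span doubles from stage to stage, the interconnect in this baseline is uneven across stages, which is the property the abstract's circuit is said to improve on.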

Archive | 2009