Rolf Hilgendorf | Researchain

Publication

Featured research published by Rolf Hilgendorf.

International Solid-State Circuits Conference | 2004

PowerTune: advanced frequency and power scaling on 64b PowerPC microprocessor

Cedric Lichtenau; Mathew I. Ringler; Thomas Pflüger; Steve Geissler; Rolf Hilgendorf; Jay G. Heaslip; Ulrich Weiss; Peter A. Sandon; Norman J. Rohrer; Erwin B. Cohen; Miles G. Canada

PowerTune is a power-management technique for a multi-gigahertz superscalar 64b PowerPC® processor in a 90nm technology. This paper discusses the challenges and implementation of a dynamically controlled clock frequency with noise suppression as well as a synchronization circuit for a multi-processor system.

IBM Journal of Research and Development | 1999

Evaluation of branch-prediction methods on traces from commercial applications

Rolf Hilgendorf; Gerald J. Heim; Wolfgang Rosenstiel

For modern superscalar processors, branch prediction is a must, and there has been significant progress in this field during recent years. For the IBM System ESA/390™ environment, a set of traces exists which represent different kinds of commercial workloads, and they include operating-system interactions. We have used four of these traces to evaluate a large variety of branch-prediction algorithms in order to identify possible design tradeoffs. One property of ESA/390 architecture is that for most branches, target address calculation involves the use of values stored in general-purpose registers. Therefore, not only branch directions but target addresses must be predicted. When performing prefetch-time prediction, a branch target buffer (BTB) is used to provide/predict the target address. In this paper, all evaluated prediction methods are combined with such a BTB. The resulting size for the BTB is significantly larger than for designs evaluated with SPECmark™ traces. Algorithms for determining branch direction are examined and compared. These algorithms include local branch history methods as well as global history and path history procedures. Finally, combinations of some of these methods, known as hybrid predictors, are evaluated. The path history algorithm we use is an adaptation of a known algorithm, but including it in the hybrid predictor is new. For all of these methods, design parameters are varied to find the tradeoff between the hardware needed and the prediction quality achieved. Results, except for those for the path predictor, are comparable to SPECmark results, except that for most cases less history must be used. Another property of ESA/390 architecture, the absence of specific subroutine call and return instructions, led to the investigation of hardware for self-detecting call/return pairs. A new approach has been developed, and its prediction quality is demonstrated. All of the methods described above use a BTB. 
A BTB performs well if branches have fixed targets. However, about 5% of the branches we consider have changing target addresses. Very recently an algorithm was proposed for treating such branches using a modification to the BTB approach. We have implemented an enhancement to this method, and the prediction correctness achievable using the enhanced method is shown in the results presented in this paper. Finally, combining several of the investigated schemes increases branch-prediction correctness in commercial environments. However, it remains to be shown whether the tremendous increase in hardware required for their implementation can be justified.
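
As a rough illustration of the ideas named in the abstract above, the following minimal sketch combines a global-history direction predictor with a branch target buffer. It is not the predictor evaluated in the paper: the class name `BranchPredictor`, the XOR indexing scheme, the 2-bit saturating counters, the table size, and the unbounded dictionary standing in for the BTB are all assumptions chosen for brevity.

```python
class BranchPredictor:
    """Toy predictor: 2-bit counters indexed by PC XOR global history, plus a BTB.

    All sizes and the indexing scheme are illustrative assumptions.
    """

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # 1 = weakly not-taken
        self.history = 0                         # global taken/not-taken history
        self.btb = {}                            # branch address -> last taken target

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        """Return (predicted_taken, predicted_target or None)."""
        taken = self.counters[self._index(pc)] >= 2
        return taken, (self.btb.get(pc) if taken else None)

    def update(self, pc, taken, target=None):
        """Train with the resolved outcome of the branch at pc."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
            self.btb[pc] = target                # remember the most recent target
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def train_loop_branch(pc=0x40, target=0x10, iterations=20):
    """Train a fresh predictor on a branch that is always taken to one target."""
    predictor = BranchPredictor()
    for _ in range(iterations):
        predictor.update(pc, True, target)
    return predictor
```

In this sketch the BTB simply stores the last target seen, so a branch whose target changes is mispredicted on the first execution after each change. That is the case the abstract singles out for the roughly 5% of branches with changing target addresses.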

International Solid-State Circuits Conference | 2005

A 64-bit microprocessor in 130-nm and 90-nm technologies with power management features

Norman J. Rohrer; Cedric Lichtenau; Peter A. Sandon; Paul David Kartschoke; Erwin B. Cohen; Miles G. Canada; Thomas Pflüger; Mathew I. Ringler; Rolf Hilgendorf; Stephen F. Geissler; Jeffrey S. Zimmerman

The first two members in a family of 64-bit superscalar microprocessors are presented. The 130-nm processor, which was introduced first, offers 5-way instruction dispatch, support for 4-way integer and floating-point single-instruction multiple-data (SIMD) operations, a 512-kB second level (L2) cache, and a high-speed external bus. The 90-nm processor is a technology remap of the 130-nm design. It retains the features of the 130-nm processor and adds others, including a new power management facility. The architecture, device characteristics, power management, and thermal details of these two processors are described. In addition, the dataflow layout, aspects of the circuit design, clocking, and timing are discussed.

International Solid-State Circuits Conference | 2006

A 64B CPU Pair: Dual- and Single-Processor Chips

Erwin B. Cohen; Norman J. Rohrer; Peter A. Sandon; Miles G. Canada; Cedric Lichtenau; Mathew I. Ringler; Paul David Kartschoke; R. Floyd; Jay G. Heaslip; M. Ross; T. Pflueger; Rolf Hilgendorf; P. McCormick; Gerard M. Salem; J. Connor; Stephen F. Geissler; Dana J. Thygesen

Two Power™-architecture 64b microprocessor chips are fabricated in 90nm dual strained-silicon SOI technology. The dual-processor chip has split clock domains and power planes, 1 MB L2 cache per core and a shared processor interconnect bus. The single-processor chip shares the dual's basic core and cache design.

ACM SIGARCH Computer Architecture News | 2001

Instruction translation for an experimental S/390 processor

Rolf Hilgendorf; Wolfram Sauer

The IBM™ S/390™ architecture is a complex architecture, which has grown over a long period of time. Typical implementations use microcode to cope with the more complex instructions and facilities of S/390. Current IBM S/390 processors even contain two levels of microcode. We report on an experimental S/390 processor based on a RISC processor kernel employing superscalar, out-of-order execution of instructions. S/390 instructions have to be translated into internal sequences of RISC instructions. In fact, two closely coupled internal sequences are generated: one for register-based execution and one for storage-based execution. The translation is a straightforward mapping in most cases, with some flexibility for special instructions. The paper introduces the hardware mechanisms used for mapping S/390 instructions to internal sequences. The facilities that provide a greater degree of flexibility are discussed. The interactions of the low-level mapping scheme with the microcode levels are examined. Finally, we discuss our experiences with this type of implementation of a CISC architecture on a RISC processor kernel.
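
The table-driven translation described in the abstract above can be pictured with a small sketch. It is an assumption-laden illustration, not the paper's hardware mechanism: the internal operation names (`agen`, `load`, `add`), the `Translation` record, the `TABLE` contents, and the rule that unmapped instructions return `None` (standing in for a hand-off to microcode) are all invented here. Only the S/390 mnemonics `AR` (add register) and `A` (add from storage) are taken from the real architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    storage_ops: tuple   # hypothetical ops for address generation and memory access
    register_ops: tuple  # hypothetical ops that work on register values


# Hypothetical mapping table; internal op names are invented for illustration.
TABLE = {
    "AR": Translation((), ("add {r1}, {r1}, {r2}",)),
    "A": Translation(
        ("agen tmp_addr, {d2}({x2},{b2})", "load tmp, [tmp_addr]"),
        ("add {r1}, {r1}, tmp",),
    ),
}


def translate(mnemonic, **operands):
    """Return (storage_sequence, register_sequence), or None if unmapped."""
    entry = TABLE.get(mnemonic)
    if entry is None:
        return None  # assumption: such instructions are left to microcode
    storage = [op.format(**operands) for op in entry.storage_ops]
    register = [op.format(**operands) for op in entry.register_ops]
    return storage, register
```

For example, `translate("A", r1="r3", d2=8, x2="r0", b2="r5")` yields a two-step storage sequence (address generation, then load) paired with a one-step register sequence, mirroring the two closely coupled sequences the abstract mentions.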

Archive | 1997