Publication


Featured research published by John Arends.


International Symposium on Low Power Electronics and Design | 1999

Instruction fetch energy reduction using loop caches for embedded applications with small tight loops

Lea Hwang Lee; Bill Moyer; John Arends

A fair amount of work has been done in recent years on reducing power consumption in caches by using a small instruction buffer placed between the execution pipe and a larger main cache. These techniques, however, often degrade the overall system performance. In this paper, we propose using a small instruction buffer, also called a loop cache, to save power. A loop cache has no address tag store. It consists of a direct-mapped data array and a loop cache controller. The loop cache controller knows precisely whether the next instruction request will hit in the loop cache, well ahead of time. As a result, there is no performance degradation.
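The idea of a tagless loop cache can be illustrated with a toy fetch-count model. This is only a sketch under stated assumptions, not the design from the paper: the three-state controller (IDLE, FILL, ACTIVE), the function name `fetch_counts`, the default `cache_size`, and the rule that the controller advances on each taken backward branch are all hypothetical choices made for illustration. The abstract only states that the array is direct-mapped, has no tags, and that the controller knows in advance whether a fetch will hit.

```python
def fetch_counts(loop_len, iterations, cache_size=16):
    """Count fetches served by the main cache and by a toy loop cache.

    Assumed model (not from the abstract): the loop body occupies
    addresses 0..loop_len-1 and ends with a backward branch to address 0.
    """
    state = "IDLE"                      # hypothetical controller state
    loop_cache = [None] * cache_size    # direct-mapped data array, no tags
    main_fetches = 0
    loop_fetches = 0
    fits = loop_len <= cache_size       # the loop must fit to be cached

    for it in range(iterations):
        for pc in range(loop_len):
            if state == "ACTIVE":
                # The controller already knows this fetch hits, so no tag
                # compare is needed; the entry holds the stand-in word `pc`.
                _ = loop_cache[pc % cache_size]
                loop_fetches += 1
            else:
                main_fetches += 1
                if state == "FILL":
                    # Copy the fetched word into the array while filling.
                    loop_cache[pc % cache_size] = pc
        # The backward branch is taken on every iteration except the last.
        if fits and it < iterations - 1:
            state = {"IDLE": "FILL", "FILL": "ACTIVE", "ACTIVE": "ACTIVE"}[state]

    return main_fetches, loop_fetches


if __name__ == "__main__":
    print(fetch_counts(loop_len=8, iterations=10))
    print(fetch_counts(loop_len=32, iterations=10))
```

In this model the first two iterations of a loop that fits go to the main cache (one to detect the loop, one to fill the array) and every later iteration is served from the loop cache. A loop larger than the array never leaves the IDLE state.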


International Symposium on Microarchitecture | 1999

Low-cost branch folding for embedded applications with small tight loops

Lea Hwang Lee; Jeff Scott; Bill Moyer; John Arends

Many portable and embedded applications are characterized by spending a large fraction of execution time on small program loops. To improve performance, many embedded systems use special instructions to handle program loop executions. These special instructions, however, consume opcode space, which is valuable in embedded computing environments. In this paper, we propose a hardware technique for folding out branches when executing these small loops. This technique does not require any special branch instructions. It is based on the detection and utilization of certain short backward branch instructions (sbb). An sbb is any PC-relative branch instruction with a limited backward branch distance. Once an sbb is detected, its displacement field is used by the hardware to identify the actual program loop size. It does so by loading this negative displacement field into a counter and incrementing the counter for each instruction sequentially executed. As the count approaches zero, the hardware folds out the sbb by predicting that it is always taken. The hardware overhead for this technique is minimal. Using a 5-bit increment counter, the performance improvement over a set of embedded applications is about 7.5%.
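The counter mechanism described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the hardware from the paper: the function name `count_issue_slots` is hypothetical, the displacement is measured in instructions, a counter of `counter_bits` bits is assumed to cover displacements smaller in magnitude than 2**counter_bits, a folded sbb is assumed to use no issue slot, and the cost of leaving the loop on the final iteration is ignored.

```python
def count_issue_slots(loop_len, iterations, counter_bits=5, fold=True):
    """Count issue slots for a loop whose last instruction is an sbb."""
    displacement = -(loop_len - 1)               # sbb jumps back to the loop start
    fits = -displacement < (1 << counter_bits)   # assumed counter range
    slots = 0
    counter = None                               # None until an sbb is detected

    for _ in range(iterations):
        for pc in range(loop_len):
            is_sbb = pc == loop_len - 1
            if is_sbb and fold and fits and counter == 0:
                # The count reached zero, so the next instruction is the
                # sbb: fold it out (predict taken) without issuing it.
                pass
            else:
                slots += 1
                if counter is not None and not is_sbb:
                    counter += 1                 # one step per sequential instruction
            if is_sbb and fold and fits:
                counter = displacement           # (re)load the negative displacement

    return slots


if __name__ == "__main__":
    print(count_issue_slots(4, 10))              # with folding
    print(count_issue_slots(4, 10, fold=False))  # without folding
```

In this model the first iteration issues the sbb normally, because the loop has not been detected yet. Each later iteration saves one slot, so the relative gain is largest for the shortest loops.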


Archive | 2006

Method and apparatus for interfacing a processor to a coprocessor

William C. Moyer; John Arends; Jeffrey W. Scott


Archive | 1996

Method and apparatus for selecting a register file in a data processing system

William C. Moyer; John Arends


Archive | 1992

Apparatus and method for optimizing performance of a cache memory in a data processing system

William C. Moyer; John Arends; Christopher E. White; Keith E. Diefendorff


Archive | 1996

Data processing system having a cache and method therefor

William C. Moyer; John Arends; Lea Hwang Lee


Archive | 1999

Low-Cost Embedded Program Loop Caching - Revisited

Lea Hwang Lee; Bill Moyer; John Arends


Archive | 1999

Data processor system having branch control and method thereof

Lea Hwang Lee; William C. Moyer; Jeffrey W. Scott; John Arends


Archive | 1997

Method and apparatus for interfacing a processor to a coprocessor for communicating register write information

William C. Moyer; John Arends; Jeffrey W. Scott


Archive | 2000

Debug controller in a data processor and method therefor

John Arends; Jeffrey W. Scott; William C. Moyer
