Publication


Featured research published by Amir Roth.


Architectural Support for Programming Languages and Operating Systems | 1998

Dependence based prefetching for linked data structures

Amir Roth; Andreas Moshovos; Gurindar S. Sohi

We introduce a dynamic scheme that captures the access patterns of linked data structures and can be used to predict future accesses with high accuracy. Our technique exploits the dependence relationships that exist between loads that produce addresses and loads that consume these addresses. By identifying producer-consumer pairs, we construct a compact internal representation for the associated structure and its traversal. To achieve a prefetching effect, a small prefetch engine speculatively traverses this representation ahead of the executing program. Dependence-based prefetching achieves speedups of up to 25% on a suite of pointer-intensive programs.
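The producer-consumer idea in this abstract can be illustrated with a small software model. The sketch below is not the hardware mechanism from the paper; it assumes a hypothetical load trace in which each record carries the load's instruction address (`pc`), the base address it dereferenced (`base`), and the value it returned (`value`). A load is treated as a consumer when its base address matches a value produced by an earlier load.

```python
from typing import NamedTuple

class Load(NamedTuple):
    pc: int      # address of the load instruction (hypothetical trace field)
    base: int    # base address the load dereferenced
    value: int   # value the load returned

def find_producer_consumer_pairs(trace):
    """Return the set of (producer_pc, consumer_pc) pairs seen in a load trace."""
    last_producer = {}   # loaded value -> pc of the load that produced it
    pairs = set()
    for load in trace:
        producer_pc = last_producer.get(load.base)
        if producer_pc is not None:
            # This load used an address that an earlier load produced.
            pairs.add((producer_pc, load.pc))
        last_producer[load.value] = load.pc
    return pairs

# Traversal of a list whose nodes live at 0x100 -> 0x200 -> 0x300.
# pc 0x10 loads node->next; pc 0x14 loads node->data (offsets ignored for brevity).
trace = [
    Load(pc=0x10, base=0x100, value=0x200),
    Load(pc=0x14, base=0x200, value=7),
    Load(pc=0x10, base=0x200, value=0x300),
    Load(pc=0x14, base=0x300, value=9),
    Load(pc=0x10, base=0x300, value=0),
]
pairs = find_producer_consumer_pairs(trace)
```

In this toy trace the `next` load feeds both itself and the `data` load, which is the kind of recurrence a prefetch engine could follow ahead of the program.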


International Symposium on Computer Architecture | 1999

Effective jump-pointer prefetching for linked data structures

Amir Roth; Gurindar S. Sohi

Current techniques for prefetching linked data structures (LDS) exploit the work available in one loop iteration or recursive call to overlap pointer chasing latency. Jump pointers, which provide direct access to non-adjacent nodes, can be used for prefetching when loop and recursive procedure bodies are small and do not have sufficient work to overlap a long latency. This paper describes a framework for jump-pointer prefetching (JPP) that supports four prefetching idioms (queue, full, chain, and root jumping) and three implementations: software-only, hardware-only, and a cooperative software/hardware technique. On a suite of pointer-intensive programs, jump-pointer prefetching reduces memory stall time by 72% for software, 83% for cooperative, and 55% for hardware, producing speedups of 15%, 20%, and 22%, respectively.
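As a rough illustration of how jump pointers might be installed, the sketch below uses a queue of recently visited nodes so that each node ends up pointing a fixed number of hops ahead. The `Node` class, the `jump` field, and the `interval` parameter are assumptions made for this example, not names from the paper, and Python cannot issue a hardware prefetch, so only the pointer installation is shown.

```python
from collections import deque

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.jump = None   # jump pointer: a node several hops ahead (assumed field)

def install_jump_pointers(head, interval):
    """Point each node at the node `interval` hops ahead, using a queue."""
    recent = deque()       # the last `interval` nodes visited
    node = head
    while node is not None:
        if len(recent) == interval:
            # The oldest queued node is exactly `interval` hops behind `node`.
            recent.popleft().jump = node
        recent.append(node)
        node = node.next

def build_list(values):
    head = tail = None
    for v in values:
        n = Node(v)
        if head is None:
            head = tail = n
        else:
            tail.next = n
            tail = n
    return head

head = build_list(range(6))
install_jump_pointers(head, interval=2)

# Collect each node's jump target for inspection.
jump_targets = []
node = head
while node is not None:
    jump_targets.append(node.jump.value if node.jump else None)
    node = node.next
```

A later traversal would prefetch `node.jump` while working on `node`; the interval would be chosen so that the fetch latency is covered by the intervening work.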


International Symposium on Microarchitecture | 1997

Exploiting dead value information

Milo M. K. Martin; Amir Roth; Charles N. Fischer

We describe dead value information (DVI) and introduce three new optimizations which exploit it. DVI provides assertions that certain register values are dead, meaning they will not be read before being overwritten. The processor can use DVI to track dead registers and dynamically eliminate unnecessary save and restore instructions from the execution stream at procedure calls and context switches. Our results indicate that dynamic save and restore instances can be reduced by 46% for procedure calls and by 51% for context switches. In addition, save/restore elimination for procedure calls can improve overall performance by up to 5%. DVI also allows the processor to manage physical registers efficiently, reducing the size requirements of the physical register file. When the system clock rate is proportional to the register file cycle time, this optimization can improve performance. All of these optimizations can be supported with only a few new instructions and minimal additional hardware structures.
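The save/restore elimination described here can be pictured with a simplified software model. The sketch below assumes a hypothetical instruction stream of `(op, register)` tuples, where a `kill` entry stands in for a DVI assertion; the op names and the filtering function are illustrative and do not reflect the paper's actual instruction encodings or hardware.

```python
def filter_saves_and_restores(stream):
    """Drop saves of registers asserted dead, plus their matching restores."""
    dead = set()       # registers currently asserted dead
    skipped = set()    # registers whose save was eliminated
    executed = []
    for op, reg in stream:
        if op == "kill":
            dead.add(reg)                 # DVI assertion: value will not be read
        elif op == "write":
            dead.discard(reg)             # a new value makes the register live
            executed.append((op, reg))
        elif op == "save" and reg in dead:
            skipped.add(reg)              # nothing worth saving
        elif op == "restore" and reg in skipped:
            skipped.discard(reg)          # no saved value to restore
        else:
            executed.append((op, reg))
    return executed

# A call where r16 is dead on entry but r17 is still live.
stream = [
    ("kill", "r16"),
    ("save", "r16"), ("save", "r17"),
    ("write", "r16"), ("write", "r17"),
    ("restore", "r17"), ("restore", "r16"),
]
executed = filter_saves_and_restores(stream)
```

Only the save and restore of `r17` survive, since `r16` was asserted dead before the call.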


IEEE Computer | 2001

Speculative multithreaded processors

Gurindar S. Sohi; Amir Roth

Speculation will overcome the limitations in dividing a single program into multiple threads that can execute on the multiple logical processing elements needed to enhance performance through parallelization.


International Conference on Supercomputing | 1999

Improving virtual function call target prediction via dependence-based pre-computation

Amir Roth; Andreas Moshovos; Gurindar S. Sohi

We introduce dependence-based pre-computation as a complement to history-based target prediction schemes. We present pre-computation in the context of virtual function calls (v-calls), a class of control transfers that is becoming increasingly important and has resisted conventional prediction. Our proposed technique dynamically identifies the sequence of operations that computes a v-call’s target. When the first instruction in such a sequence is encountered, a small execution engine speculatively and aggressively pre-executes the rest. The pre-computed target is stored and subsequently used when a prediction needs to be made. We show that a common v-call instruction sequence can be exploited to implement pre-computation using a previously proposed prefetching mechanism and minimal additional hardware. In a suite of C++ programs, dependence-based pre-computation eliminates 46% of the mispredictions incurred by a simple BTB and 24% of those associated with a path-based two-level predictor.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2000

Speculative Multithreaded Processors

Gurindar S. Sohi; Amir Roth

Architects of future generation processors will have hundreds of millions of transistors with which to build computing chips. At the same time, it is becoming clear that naive scaling of conventional (superscalar) designs will increase complexity and cost while not meeting performance goals. Consequently, many computer architects are advocating a shift in focus from high-performance to high-throughput with a corresponding shift to multithreaded architectures. Multithreaded architectures provide new opportunities for extracting parallelism from a single program via thread-level speculation. We expect to see two major forms of thread-level speculation: control-driven and data-driven. We believe that future processors will not only be multithreaded, but will also support thread-level speculation, giving them the flexibility to operate in either multiple-program/high-throughput or single-program/high-performance capacities. Deployment of such processors will require innovations in means to convey multithreading information from software to hardware, algorithms for thread selection and management, as well as hardware structures to support the simultaneous execution of collections of speculative and non-speculative threads.


Innovative Architecture for Future Generation High-Performance Processors and Systems | 1998

New methods for exploiting program structure and behavior in computer architecture

Amir Roth; Gurindar S. Sohi

Micro-architectural techniques of the next decade will have to be more efficient and scalable in order to handle growing workloads and longer communication and memory latencies. We believe that information about program structure, the data and control relationships between instructions, can be used as a powerful framework for new techniques. We argue that program structure information has several inherent advantages over frameworks that associate information either with instructions in isolation or with data. We present summaries of four novel methods that apply program structure information to memory system problems from disambiguation and data cache bandwidth to prefetching and coherence optimization.


High Performance Computer Architecture | 2001

Speculative data-driven multithreading

Amir Roth; Gurindar S. Sohi


Medea | 2000

Microarchitectural Miss/Execute Decoupling

Amir Roth; Craig B. Zilles; Gurindar S. Sohi


International Symposium on Microarchitecture | 2000

Register integration: a simple and efficient implementation of squash reuse

Amir Roth; Gurindar S. Sohi

Collaboration


Dive into Amir Roth's collaborations.

Top Co-Authors

Gurindar S. Sohi

University of Wisconsin-Madison

Charles N. Fischer

University of Wisconsin-Madison

Milo M. K. Martin

University of Pennsylvania
