Publication


Featured research published by James S. Burns.


IEEE Transactions on Parallel and Distributed Systems | 2002

SMT layout overhead and scalability

James S. Burns; Jean-Luc Gaudiot

Simultaneous Multi-Threading (SMT) is a hardware technique that increases processor throughput by issuing instructions simultaneously from multiple threads. However, while SMT can be added to an existing microarchitecture with relatively low overhead, this additional chip area could be used for other resources such as more functional units, larger caches, or better branch predictors. How large is the SMT overhead and at what point does SMT no longer pay off for maximum throughput compared to adding other architecture features? This paper evaluates the silicon overhead of SMT by performing a transistor/interconnect-level analysis of the layout. We discuss microarchitecture issues that impact SMT implementations and show how the Instruction Set Architecture (ISA) and microarchitecture can have a large effect on the SMT overhead and performance. Results show that SMT yields large performance gains with small to moderate area overhead.


International Conference on Parallel Architectures and Compilation Techniques | 2001

Area and System Clock Effects on SMT/CMP Processors

James S. Burns; Jean-Luc Gaudiot

Two approaches to high-throughput processors are chip multiprocessing (CMP) and simultaneous multi-threading (SMT). CMP increases layout efficiency, which allows more functional units and a faster clock rate. However, CMP suffers from hardware partitioning of functional resources. SMT increases functional unit utilization by issuing instructions simultaneously from multiple threads. However, a wide-issue SMT suffers from layout and technology implementation problems. We use silicon resources as our basis for comparison and find that area and system clock have a large effect on the optimal SMT/CMP design trade. We show the area overhead of SMT on each processor and how it scales with the width of the processor pipeline and the number of SMT threads. The wide-issue SMT delivers the highest single-thread performance with improved multi-thread throughput. However, multiple smaller cores deliver the highest throughput.


High Performance Computer Architecture | 2000

Quantifying the SMT layout overhead - does SMT pull its weight?

James S. Burns; Jean-Luc Gaudiot

Simultaneous Multi-Threading (SMT) is a hardware technique that increases processor throughput by issuing instructions simultaneously from multiple threads. However, while SMT can be added to an existing microarchitecture with relatively low overhead, this additional chip area could be used for other resources such as more functional units, larger caches, or better branch predictors. How large is the SMT overhead, and at what point does SMT no longer pay off compared to adding other architecture features? This paper evaluates the silicon overhead of SMT by performing a transistor/interconnect-level analysis of the layout. We discuss microarchitecture issues that impact SMT implementations and show how the Instruction Set Architecture (ISA) and microarchitecture can have a large effect on the SMT overhead and performance. Results show that SMT yields large performance gains with small to moderate area overhead.


IEEE Transactions on Computers | 2005

Area and system clock effects on SMT/CMP throughput

James S. Burns; Jean-Luc Gaudiot

Two approaches to high-throughput processors are chip multiprocessing (CMP) and simultaneous multithreading (SMT). CMP increases layout efficiency, which allows more functional units and a faster clock rate. However, CMP suffers from hardware partitioning of functional resources. SMT increases functional unit utilization by issuing instructions simultaneously from multiple threads. However, a wide-issue SMT suffers from layout and technology implementation problems. We use silicon resources as our basis for comparison and find that area and system clock have a large effect on the optimal SMT/CMP design trade. We show the area overhead of SMT on each processor and how it scales with the width of the processor pipeline and the number of SMT threads. The wide-issue SMT delivers the highest single-thread performance with improved multithread throughput. However, multiple smaller cores deliver the highest throughput. Also, alternate processor configurations are explored that trade off SMT threads for other microarchitecture features. The result is a small increase in single-thread performance, but a fairly large reduction in throughput.


Archive | 2002

Method and apparatus for adjusting the voltage and frequency to minimize power dissipation in a multiprocessor system

Stefan Rusu; David Ayers; James S. Burns


Archive | 2001

Multiple mode power throttle mechanism

James S. Burns; Stefan Rusu; David Ayers; Edward T. Grochowski; Marsha Eng; Vivek Tiwari


Archive | 2001

Digital throttle for multiple operating points

James S. Burns; Stefan Rusu; David Ayers; Edward T. Grochowski; Marsha Eng; Vivek Tiwari


Archive | 2001

High instruction fetch bandwidth in multithread processor using temporary instruction cache to deliver portion of cache line in subsequent clock cycle

Sailesh Kottapalli; James S. Burns; Kenneth Shoemaker


Archive | 2004

Separate thermal and electrical throttling limits in processors

James S. Burns; Sailesh Kottapalli


Archive | 2006

Method and apparatus for adjusting the voltage and frequency to minimize power dissipation in a multiprocessor system in response to compute load

Stefan Rusu; David Ayers; James S. Burns
