Steve Scott | Researchain

Publication

Featured research published by Steve Scott.

conference on high performance computing (supercomputing) | 1997

Performance of the CRAY T3E Multiprocessor

Ed Anderson; Jeff Brooks; Charles M. Grassl; Steve Scott

The CRAY T3E is a scalable shared-memory multiprocessor based on the DEC Alpha 21164 microprocessor. The system includes a number of architectural features designed to tolerate latency and enhance scalability. Included among these are stream buffers, which detect and prefetch down small-stride reference streams, E-registers, which allow memory reference pipelining and provide non-unit-stride access capabilities, and a scalable, high-bandwidth interconnection network. We report our experiences with T3E performance. We describe several hardware features, discuss programming implications, and provide related benchmark results. Included are NAS Parallel Benchmark results up to 1024 processors.

conference on high performance computing (supercomputing) | 2007

The Cray BlackWidow: a highly scalable vector multiprocessor

Dennis Abts; Abdulla Bataineh; Steve Scott; Greg Faanes; Jim Schwarzmeier; Eric P. Lundberg; Timothy J. Johnson; Mike Bye; Gerald A. Schwoerer

This paper describes the system architecture of the Cray BlackWidow scalable vector multiprocessor. The BlackWidow system is a distributed shared memory (DSM) architecture that is scalable to 32K processors, each with a 4-way dispatch scalar execution unit and an 8-pipe vector unit capable of 20.8 Gflops for 64-bit operations and 41.6 Gflops for 32-bit operations at the prototype operating frequency of 1.3 GHz. Global memory is directly accessible with processor loads and stores and is globally coherent. The system supports thousands of outstanding references to hide remote memory latencies, and provides a rich suite of built-in synchronization primitives. Each BlackWidow node is implemented as a 4-way SMP with up to 128 Gbytes of DDR2 main memory capacity. The system supports common programming models such as MPI and OpenMP, as well as global address space languages such as UPC and CAF. We describe the system architecture and microarchitecture of the processor, memory controller, and router chips. We give preliminary performance results and discuss design tradeoffs.

international symposium on microarchitecture | 2009

Cost-Efficient Dragonfly Topology for Large-Scale Systems

John Kim; William J. Dally; Steve Scott; Dennis Abts

Increasing pin bandwidth is used more efficiently by building high-radix routers with a large number of narrow ports than low-radix routers with fewer wide ports. Building networks from high-radix routers lowers cost and improves performance, but also presents many challenges. The dragonfly topology minimizes network cost by reducing the number of global channels required.

international symposium on microarchitecture | 1996

The GigaRing channel

Steve Scott

Cray's GigaRing channel provides flexible intersystem and system-to-peripheral communication for distributed supercomputer environments, sustaining data payload bandwidths on the order of a Gbyte per second.

high performance interconnects | 2009

The Impact of Optics on HPC System Interconnects

Michael A. Parker; Steve Scott

Optical signaling has long been used for telecommunications, where its low-loss signaling capability is needed and the relatively high termination costs can be amortized over long distances. Until recently, Cray has not found it advantageous to use optics in its multiprocessor interconnects. With recent reductions in optical costs and increases in signaling rates, however, the situation has changed, and Cray is currently developing a hybrid electrical/optical interconnect for our “Cascade” system, which will be shipping in 2012. In this position paper, Cray was asked to answer the question “Will cost-effective optics fundamentally change the landscape of networking?” The short answer is yes. By breaking the tight relationship between cable length, cost, and signaling speed, optical signaling technology opens the door to network topologies with much longer links than are feasible with electrical signaling. Cost-effective optics will thus enable a new class of interconnects that use high-radix network topologies to significantly improve performance while reducing cost.

conference on high performance computing (supercomputing) | 2006

Multi-Core for HPC: breakthrough or breakdown?

Thomas L. Sterling; Peter M. Kogge; William J. Dally; Steve Scott; William Gropp; David E. Keyes; Peter H. Beckman

A dramatic trend in computing is the adoption of multi-core technology by the vendors from which our current and future HPC systems are being derived. Multi-core is offered as a path to continued reliance on, and benefits from, Moore's Law while reining in the previously unfettered growth of power consumption and design complexity. Are we saved? Or is it but a fool's mission, trapping us in a technical cul-de-sac with no long-term direction and no way to reinvent an alternative future? The panel will consider the following questions:
* Can multi-core span the next decade of Moore's Law progression?
* Are the pins and caches a stranglehold on the future effectiveness of multi-core?
* Can innovative algorithmic techniques exploit the opportunities and address the challenges of multi-core?
* How will programming models and supporting system software change to accommodate the unique properties and peculiarities of multi-core structures?

mobile adhoc and sensor systems | 1995

A supercomputer system interconnect and scalable IOS

Steve Johnson; Steve Scott

The evolution of system architectures and system configurations has created the need for a new supercomputer system interconnect. Attributes required of the new interconnect include commonality among system and subsystem types, scalability, low latency, high bandwidth, a high level of resiliency, and flexibility. Cray Research Inc. is developing a new system channel to meet these interconnect requirements in future systems. The channel has a ring-based architecture, but can also function as a point-to-point link. It integrates control and data on a single, physical path while providing low latency and variance for control messages. Extensive features for client isolation, diagnostic capabilities, and fault tolerance have been incorporated into the design. The attributes and features of this channel are discussed along with implementation and protocol specifics.

optical fiber communication conference | 2011

Optical interconnects in future HPC systems

Steve Scott

This talk will describe the primary interconnection network challenges as we attempt to build exascale computers over the coming decade. We discuss the role that optics will play in these systems, and key attributes for signaling technology from a system builder's perspective.

international conference on parallel architectures and compilation techniques | 2006

Challenges and opportunities in the post single-thread-processor era

Steve Scott

The age of the single-thread juggernaut has ended, due to a variety of factors. Multi-core processors are coming on strong, and scaling is being stressed more than ever. This presents a number of architectural, hardware and software challenges. This talk will reflect on these challenges from Cray's perspective in the high performance computing industry.

conference on high performance computing (supercomputing) | 2006

Cray: creating a path to adaptive supercomputing

Steve Scott

This presentation will outline Cray's Adaptive Supercomputing vision. Cray's phased approach will take the concept of heterogeneous computing to a new level by integrating a range of processing technologies in a single platform. These Linux-based systems will combine scalar processing, vector processing, multithreading and hardware accelerators to solve scientific and engineering problems more quickly and make programmers and end users more productive. Powerful compilers and other software will automatically match an application to the processing technology that is best suited for it, allowing the supercomputer to flexibly adapt to the application rather than obligating users to adapt their applications to the system.
