Steven Huss-Lederman
University of Wisconsin-Madison
Publications
Featured research published by Steven Huss-Lederman.
IEEE Concurrency | 2000
Shubhendu S. Mukherjee; Steven K. Reinhardt; Babak Falsafi; Mike Litzkow; Mark D. Hill; David A. Wood; Steven Huss-Lederman; James R. Larus
To analyze new parallel computers, developers must rapidly simulate designs running realistic workloads. Historically, direct execution and a parallel host have accelerated simulations, although these techniques have typically lacked portability. Through four key operations, the Wisconsin Wind Tunnel II can easily run simulations on SPARC platforms ranging from a workstation cluster to a symmetric multiprocessor.
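The abstract does not spell out the four operations, but the essence of direct execution is that most target instructions run natively on the host, and only the events of interest trap into the simulator to be charged target cycles. The sketch below illustrates that idea in C; it is not the Wisconsin Wind Tunnel II implementation, and shared_load, simulated_miss, and the cycle costs are hypothetical stand-ins for the instrumentation and memory-system model a real simulator would supply.

    #include <stdint.h>
    #include <stdio.h>

    /* Target execution time accumulated while the host runs the workload natively. */
    static uint64_t target_cycles = 0;

    /* Hypothetical cost model for the simulated memory system. */
    enum { HIT_COST = 1, MISS_COST = 40 };

    /* Stub for the simulated cache/directory lookup; here every 16th
       reference is treated as a miss purely for illustration. */
    static int simulated_miss(uintptr_t addr)
    {
        static unsigned refs = 0;
        (void)addr;
        return (++refs % 16) == 0;
    }

    /* Instrumentation inserted before each shared-memory load in the target
       program: ordinary instructions execute at native host speed, and only
       the references of interest enter the simulator to be charged cycles. */
    static double shared_load(const double *addr)
    {
        target_cycles += simulated_miss((uintptr_t)addr) ? MISS_COST : HIT_COST;
        return *addr;   /* the access itself still executes directly on the host */
    }

    int main(void)
    {
        double data[1024] = {0};
        double sum = 0.0;
        for (int i = 0; i < 1024; i++)
            sum += shared_load(&data[i]);
        printf("sum=%g, simulated target cycles for loads: %llu\n",
               sum, (unsigned long long)target_cycles);
        return 0;
    }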
european conference on parallel processing | 1996
Al Geist; William Gropp; Steven Huss-Lederman; Andrew Lumsdaine; Ewing L. Lusk; William Saphir; Tony Skjellum; Marc Snir
This paper describes current activities of the MPI-2 Forum. The MPI-2 Forum is a group of parallel computer vendors, library writers, and application specialists working together to define a set of extensions to MPI (Message Passing Interface). MPI was defined by the same process and now has many implementations, both vendor-proprietary and publicly available, for a wide variety of parallel computing environments. In this paper we present the salient aspects of the evolving MPI-2 document as it now stands. We discuss proposed extensions and enhancements to MPI in the areas of dynamic process management, one-sided operations, collective operations, new language bindings, real-time computing, external interfaces, and miscellaneous topics.
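As a concrete illustration of one of those extensions, the short C program below uses the MPI-2 one-sided operations (MPI_Win_create, MPI_Put, MPI_Win_fence): each rank deposits its rank number directly into a memory window exposed by rank 0, with no matching receive on the target side. This is a minimal sketch of the interface as it was eventually standardized, not code taken from the MPI-2 document itself.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Rank 0 exposes an array of `size` ints; the other ranks expose nothing. */
        int *table = NULL;
        MPI_Aint bytes = 0;
        if (rank == 0) {
            table = malloc(size * sizeof(int));
            for (int i = 0; i < size; i++) table[i] = -1;
            bytes = (MPI_Aint)size * sizeof(int);
        }

        MPI_Win win;
        MPI_Win_create(table, bytes, sizeof(int), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        /* Fence-delimited epoch: each rank writes its rank number into
           slot `rank` of rank 0's window, one-sided, without a receive. */
        MPI_Win_fence(0, win);
        MPI_Put(&rank, 1, MPI_INT, 0, rank, 1, MPI_INT, win);
        MPI_Win_fence(0, win);

        if (rank == 0) {
            for (int i = 0; i < size; i++)
                printf("slot %d = %d\n", i, table[i]);
            free(table);
        }

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Built and run with any MPI-2-conformant implementation (for example, mpicc followed by mpirun -np 4), rank 0 ends up holding the values 0 through size-1 in its table.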
conference on high performance computing (supercomputing) | 1996
Steven Huss-Lederman; Elaine M. Jacobson; Anna Tsao; Thomas Turnbull; Jeremy R. Johnson
In this paper we report on the development of an efficient and portable implementation of Strassen's matrix multiplication algorithm. Our implementation is designed to be used in place of DGEMM, the Level 3 BLAS matrix multiplication routine. Efficient performance is obtained for all matrix sizes and shapes, and the additional memory needed for temporary variables has been minimized. Replacing DGEMM with our routine should provide a significant performance gain for large matrices while providing the same performance for small matrices. We measure the performance of our code on the IBM RS/6000, the CRAY Y-MP C90, and a single processor of the CRAY T3D, and offer comparisons to other codes. Our performance data reconfirm that Strassen's algorithm is practical for matrices of realistic size. The usefulness of our implementation is demonstrated by replacing DGEMM with our routine in a large application code.
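The paper targets DGEMM-compatible matrices of arbitrary size and shape with carefully minimized workspace; the sketch below shows only the textbook structure of Strassen's recursion in C, restricted to square power-of-two matrices, with a naive triple loop standing in for DGEMM below an arbitrary cutoff of 64. The workspace handling and cutoff here are illustrative assumptions, not the memory-efficient scheme of the paper.

    #include <stdlib.h>

    /* Naive triple-loop multiply standing in for DGEMM below the cutoff.
       All matrices are n x n, row-major, with the given leading dimensions. */
    static void matmul_basic(int n, const double *A, int lda,
                             const double *B, int ldb, double *C, int ldc)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double s = 0.0;
                for (int k = 0; k < n; k++)
                    s += A[i*lda + k] * B[k*ldb + j];
                C[i*ldc + j] = s;
            }
    }

    /* Z = X + sign * Y for n x n blocks. */
    static void add(int n, const double *X, int ldx, const double *Y, int ldy,
                    double *Z, int ldz, double sign)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Z[i*ldz + j] = X[i*ldx + j] + sign * Y[i*ldy + j];
    }

    /* Strassen recursion for n a power of two; n <= 64 falls back to the base multiply. */
    void strassen(int n, const double *A, int lda,
                  const double *B, int ldb, double *C, int ldc)
    {
        if (n <= 64) {
            matmul_basic(n, A, lda, B, ldb, C, ldc);
            return;
        }

        int h = n / 2;
        const double *A11 = A,         *A12 = A + h,
                     *A21 = A + h*lda, *A22 = A + h*lda + h;
        const double *B11 = B,         *B12 = B + h,
                     *B21 = B + h*ldb, *B22 = B + h*ldb + h;
        double *C11 = C,         *C12 = C + h,
               *C21 = C + h*ldc, *C22 = C + h*ldc + h;

        /* Temporary storage: two operand buffers and the seven h x h products. */
        double *T1 = malloc(h*h*sizeof *T1), *T2 = malloc(h*h*sizeof *T2);
        double *M[7];
        for (int i = 0; i < 7; i++) M[i] = malloc(h*h*sizeof *M[i]);

        add(h, A11, lda, A22, lda, T1, h, +1); add(h, B11, ldb, B22, ldb, T2, h, +1);
        strassen(h, T1, h, T2, h, M[0], h);            /* M1 = (A11+A22)(B11+B22) */
        add(h, A21, lda, A22, lda, T1, h, +1);
        strassen(h, T1, h, B11, ldb, M[1], h);         /* M2 = (A21+A22)B11 */
        add(h, B12, ldb, B22, ldb, T2, h, -1);
        strassen(h, A11, lda, T2, h, M[2], h);         /* M3 = A11(B12-B22) */
        add(h, B21, ldb, B11, ldb, T2, h, -1);
        strassen(h, A22, lda, T2, h, M[3], h);         /* M4 = A22(B21-B11) */
        add(h, A11, lda, A12, lda, T1, h, +1);
        strassen(h, T1, h, B22, ldb, M[4], h);         /* M5 = (A11+A12)B22 */
        add(h, A21, lda, A11, lda, T1, h, -1); add(h, B11, ldb, B12, ldb, T2, h, +1);
        strassen(h, T1, h, T2, h, M[5], h);            /* M6 = (A21-A11)(B11+B12) */
        add(h, A12, lda, A22, lda, T1, h, -1); add(h, B21, ldb, B22, ldb, T2, h, +1);
        strassen(h, T1, h, T2, h, M[6], h);            /* M7 = (A12-A22)(B21+B22) */

        /* Combine the seven products into the four quadrants of C. */
        for (int i = 0; i < h; i++)
            for (int j = 0; j < h; j++) {
                C11[i*ldc + j] = M[0][i*h+j] + M[3][i*h+j] - M[4][i*h+j] + M[6][i*h+j];
                C12[i*ldc + j] = M[2][i*h+j] + M[4][i*h+j];
                C21[i*ldc + j] = M[1][i*h+j] + M[3][i*h+j];
                C22[i*ldc + j] = M[0][i*h+j] - M[1][i*h+j] + M[2][i*h+j] + M[5][i*h+j];
            }

        free(T1); free(T2);
        for (int i = 0; i < 7; i++) free(M[i]);
    }

Because each recursion level replaces eight half-size multiplications with seven, the arithmetic count drops from O(n^3) to roughly O(n^2.807), which is where the gain for large matrices comes from; the cutoff exists because the extra additions dominate at small sizes.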
Archive | 1998
Marc Snir; Steve W. Otto; David Walker; Jack J. Dongarra; Steven Huss-Lederman
Archive | 1998
Marc Snir; Steve W. Otto; Steven Huss-Lederman; David W. Walker; Jack Dongarra
Archive | 1998
Andrew Lumsdaine; Steven Huss-Lederman; Bill Gropp
Archive | 1998
Marc Snir; Steve W. Otto; Steven Huss-Lederman; David Walker; Jack J. Dongarra
Archive | 1998
Marc Snir; Steve W. Otto; Steven Huss-Lederman; David Walker; Jack J. Dongarra
Archive | 1996
Marc Snir; Steve W. Otto; Steven Huss-Lederman; David Walker; Jack J. Dongarra
Archive | 1998
William Gropp; Steven Huss-Lederman; Andrew Lumsdaine; Ewing L. Lusk; Bill Nitzberg; William Saphir; Marc Snir