Harvey J. Wasserman | Researchain


Publication

Featured research published by Harvey J. Wasserman.

conference on high performance computing (supercomputing) | 2001

Predictive Performance and Scalability Modeling of a Large-Scale Application

Darren J. Kerbyson; Henry J. Alme; Adolfy Hoisie; Fabrizio Petrini; Harvey J. Wasserman; Michael L. Gittings

In this work we present a predictive analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC’s Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against measurements on several systems, including ASCI Blue Mountain, ASCI White, and a Compaq Alphaserver ES45 system, and shows high accuracy. The model is parametric: basic machine performance numbers (latency, MFLOPS rate, bandwidth) and application characteristics (problem size, decomposition method, etc.) serve as input. The model is applied to add insight into the performance of current systems, to reveal bottlenecks, and to illustrate where tuning efforts can be effective. We also use the model to predict performance on future systems.

ieee international conference on high performance computing data and analytics | 2000

Performance and Scalability Analysis of Teraflop-Scale Parallel Architectures Using Multidimensional Wavefront Applications

Adolfy Hoisie; Olaf M. Lubeck; Harvey J. Wasserman

The authors develop a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message-passing environment. The model, based on a LogGP machine parameterization, combines the separate contributions of computation and communication wavefronts. The authors validate the model on three important supercomputer systems, on up to 500 processors. They use data from a deterministic particle transport application taken from the ASCI workload, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. They also use the validated model to make estimates of performance and scalability of wavefront algorithms on 100 TFLOPS computer systems expected to be in existence within the next decade as part of the ASCI program and elsewhere. In this context, the authors analyze two problem sizes. Their model shows that on the largest such problem (1 billion cells), interprocessor communication performance is not the bottleneck. Single-node efficiency is the dominant factor.
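As a rough illustration of how computation and communication contributions combine in a pipelined wavefront, the sketch below counts pipeline steps on a 2-D processor grid. It is a simplified textbook-style estimate, not the authors' LogGP-based model: the function names, the single per-step communication cost `t_comm`, and the assumption that every step costs the same are all assumptions made here for illustration.

```python
def wavefront_steps(px: int, py: int, n_blocks: int) -> int:
    """Steps for one sweep across a px-by-py processor grid.

    Assumption: the wavefront starts in one corner, so the farthest
    processor waits (px - 1) + (py - 1) steps before it receives its
    first block, then works through n_blocks blocks.
    """
    fill_steps = (px - 1) + (py - 1)
    return fill_steps + n_blocks


def wavefront_time(px: int, py: int, n_blocks: int,
                   t_block: float, t_comm: float) -> float:
    """Estimated sweep time, assuming each step costs t_block of
    computation plus t_comm of boundary communication (hypothetical,
    uniform costs)."""
    return wavefront_steps(px, py, n_blocks) * (t_block + t_comm)


def pipeline_efficiency(px: int, py: int, n_blocks: int) -> float:
    """Fraction of steps in which the last processor does useful work."""
    return n_blocks / wavefront_steps(px, py, n_blocks)


if __name__ == "__main__":
    # Example: 4x4 grid, 10 blocks per processor, unit compute cost.
    print(wavefront_steps(4, 4, 10))
    print(wavefront_time(4, 4, 10, t_block=1.0, t_comm=0.5))
    print(pipeline_efficiency(4, 4, 10))
```

In this simplified picture, more blocks per processor amortize the pipeline fill, while a smaller `t_block` relative to `t_comm` makes communication weigh more heavily in each step.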

international conference on parallel processing | 2000

A general predictive performance model for wavefront algorithms on clusters of SMPs

Adolfy Hoisie; Olaf M. Lubeck; Harvey J. Wasserman; Fabrizio Petrini; Hank Alme

We propose and validate a closed-end, analytical, general, predictive performance model for applications based on wavefront algorithms on clusters of SMPs. Wavefront algorithms are ubiquitous in parallel computing, since they represent a means of enabling parallelism in computations that contain recurrences. Our particular interest in wavefront algorithms derives from their use in discrete ordinates neutral particle transport computations representative of ASCI, but other important uses are well known. The proposed model captures the tradeoff between processor utilization and communication requirements characteristic of wavefront algorithms. The general model can predict the performance of this class of applications on distributed architectures with a network of lower dimensionality compared to that of an MPP, of which clusters of SMPs are one example. We validate the model using a compact application from the ASCI workload on a large-scale cluster of SGI Origin 2000s at the Los Alamos National Laboratory. The proposed model validates well on all cluster configurations used.

conference on high performance computing (supercomputing) | 1991

A performance comparison of three supercomputers: Fujitsu VP-2600, NEC SX-3, and CRAY Y-MP

Margaret L. Simmons; Harvey J. Wasserman; Olaf M. Lubeck; Christopher Eoyang; Raul Mendez; Hirro Harada; Misako Ishiguro

No abstract available

international parallel and distributed processing symposium | 2003

A comparison between the Earth Simulator and AlphaServer systems using predictive application performance models

Darren J. Kerbyson; Adolfy Hoisie; Harvey J. Wasserman

This paper gives a detailed analysis of the relative performance between the Earth Simulator and systems built using Alpha processors. The achieved performance results from an interplay of system characteristics, application requirements and scalability behavior. Detailed performance models are used here to predict the performance of two codes representative of ASCI computations, namely SAGE and Sweep3D. The performance models do not require access to a full sized system but rather rely on characteristics of the system as well as knowledge of the achieved single-processor performance. One result of this analysis is in the determination of an equivalent-sized Alpha-based machine that would be required to obtain the same performance as the Earth Simulator.

symposium on frontiers of massively parallel computation | 1999

Scalability analysis of multidimensional wavefront algorithms on large-scale SMP clusters

Adolfy Hoisie; Olaf M. Lubeck; Harvey J. Wasserman

We develop a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. We validate the model on three supercomputer systems, with up to 500 processors, using data from an ASCI deterministic particle transport application, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. We also use the model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be in existence within the next decade. Our model shows that on a 1-billion-cell problem, single-node computation speed (not inter-processor communication performance, as is widely believed) is the bottleneck. Finally, we present preliminary considerations that reveal the additional complexity associated with modeling wavefront algorithms on reduced-connectivity network topologies, such as clusters of SMPs.

workshop on software and performance | 1998

Development and validation of a hierarchical memory model incorporating CPU- and memory-operation overlap model

Yong Luo; Olaf M. Lubeck; Harvey J. Wasserman; Federico Bassetti; Kirk W. Cameron

In this paper, we characterize application performance with a “memory-centric” view. Using a simple strategy and performance data measured on actual machines, we model the performance of a simple memory hierarchy and infer the contribution of each level in the memory system to an application’s overall cycles per instruction (cpi). Included are results affirming the usefulness of the memory model over several platforms, namely the SGI Origin 2000, SGI PowerChallenge, and the Intel ASCI Red TFLOPS supercomputers. We account for the overlap of processor execution with memory accesses, a key parameter that is not directly measurable on most systems. Given the system similarities between the Origin 2000 and the PowerChallenge, we infer the separate contributions of three major architecture features in the memory subsystem of the Origin 2000: cache size, outstanding loads-under-miss, and memory latency.
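One way to picture a memory-centric cpi decomposition with an overlap parameter is sketched below. This is an illustrative form under stated assumptions, not the formula from the paper: the names `cpi0`, `levels`, and `overlap`, and the choice to scale all memory stall cycles by a single factor `(1 - overlap)`, are hypothetical.

```python
from typing import List, Tuple


def level_contributions(levels: List[Tuple[float, float]],
                        overlap: float) -> List[float]:
    """Per-level cpi contribution.

    levels  -- (accesses per instruction, latency in cycles) for each
               memory level beyond the first-level cache (assumed inputs)
    overlap -- fraction of memory time hidden behind processor
               execution, between 0 and 1 (assumed single global factor)
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie in [0, 1]")
    return [(1.0 - overlap) * refs * latency for refs, latency in levels]


def memory_centric_cpi(cpi0: float,
                       levels: List[Tuple[float, float]],
                       overlap: float) -> float:
    """Overall cpi: ideal cpi0 plus the exposed stall cycles of each level."""
    return cpi0 + sum(level_contributions(levels, overlap))


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    hierarchy = [(0.05, 10.0), (0.01, 100.0)]
    print(level_contributions(hierarchy, overlap=0.5))
    print(memory_centric_cpi(1.0, hierarchy, overlap=0.5))
```

With `overlap = 0` every memory cycle is exposed, and with `overlap = 1` the memory hierarchy adds nothing to `cpi0`, which shows why the overlap factor matters even though it cannot be measured directly on most systems.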

ieee international conference on high performance computing data and analytics | 1998

Performance Analysis of Wavefront Algorithms on Very-Large Scale Distributed Systems

Adolfy Hoisie; Olaf M. Lubeck; Harvey J. Wasserman

We present a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. We validate the model on three important supercomputer systems, on up to 500 processors. We use data from a deterministic particle transport application taken from the ASCI workload, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. We also use the validated model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be in existence within the next decade as part of the ASCI program and elsewhere. On such machines our analysis shows that, contrary to conventional wisdom, inter-processor communication performance is not the bottleneck. Single-node efficiency is the dominant factor.

IEE Proceedings - Software | 2003

Modelling the performance of large-scale systems

Darren J. Kerbyson; Adolfy Hoisie; Harvey J. Wasserman

Performance modelling can be used throughout the development, deployment and maintenance of system hardware and application software. In this work the authors illustrate three uses of performance modelling on large-scale systems: the verification of performance during system installation, the comparison of two large-scale systems, and the prediction of performance on possible future architectures. They detail how a performance model gave an expectation of the performance of ASCI Q, a 20 Tflop system recently installed at Los Alamos. A comparison between ASCI Q and the Earth Simulator is also detailed, resulting in the sizing of an AlphaServer system that has the same performance as the Earth Simulator. The modelling approach is application centric: a detailed model is developed for each application of interest, based on a static analysis of the code but parametrised in terms of its dynamic behaviour.

International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems | 2002

Exploring advanced architectures using performance prediction

Darren J. Kerbyson; Harvey J. Wasserman; Adolfy Hoisie

In this work we show how analytical performance models can be formed by examining the key characteristics of an application. These models are parameterized in terms of the computational and communication performance of an individual system and can be used to explore the achievable performance of an application prior to system availability. Two applications are considered: an adaptive mesh refinement code on structured meshes, and an Sn transport code on unstructured meshes. These are representative of part of the ASCI workload. One of the models is used to validate the performance of a Compaq Alpha-server ES45 supercomputing system being built at Los Alamos and expected to grow to 30 TFLOPS peak performance in the next year. In addition, the models are used to explore the achievable performance on hypothesized future systems with increased peak computation and communication performance.
