Duncan A. Grove | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Duncan A. Grove is active.

Explore More

Publication

Featured researches published by Duncan A. Grove.

The Journal of Supercomputing | 2005

Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers

Duncan A. Grove; Paul D. Coddington

This paper gives an overview of two related tools that we have developed to provide more accurate measurement and modelling of the performance of message-passing communication and application programs on distributed memory parallel computers. MPIBench uses a very precise, globally synchronised clock to measure the performance of MPI communication routines. It can generate probability distributions of communication times, not just the average values produced by other MPI benchmarks. This allows useful insights to be made into the MPI communication performance of parallel computers, and in particular how performance is affected by network contention. The Performance Evaluating Virtual Parallel Machine (PEVPM) provides a simple, fast and accurate technique for modelling and predicting the performance of message-passing parallel programs. It uses a virtual parallel machine to simulate the execution of the parallel program. The effects of network contention can be accurately modelled by sampling from the probability distributions generated by MPIBench. These tools are particularly useful on clusters with commodity Ethernet networks, where relatively high latencies, network congestion and TCP problems can significantly affect communication performance, which is difficult to model accurately using other tools. Experiments with example parallel programs demonstrate that PEVPM gives accurate performance predictions on commodity clusters. We also show that modelling communication performance using average times rather than sampling from probability distributions can give misleading results, particularly for programs running on a large number of processors.

Performance Evaluation | 2005

Modeling message-passing programs with a Performance Evaluating Virtual Parallel Machine

Duncan A. Grove; Paul D. Coddington

We present a new performance modeling system for message-passing parallel programs that is based around a Performance Evaluating Virtual Parallel Machine (PEVPM). We explain how to develop PEVPM models for message-passing programs using a performance directive language that describes a programs serial segments of computation and message-passing events. This is a novel bottom-up approach to performance modeling, which aims to accurately model when processing and message-passing occur during program execution. The times at which these events occur are dynamic, because they are affected by network contention and data dependencies, so we use a virtual machine to simulate program execution. This simulation is done by executing models of the PEVPM performance directives rather than executing the code itself, so it is very fast. The simulation is still very accurate because enough information is stored by the PEVPM to dynamically create detailed models of processing and communication events. Another novel feature of our approach is that the communication times are sampled from probability distributions that describe the performance variability exhibited by communication subject to contention. These performance distributions can be empirically measured using a highly accurate message-passing benchmark that we have developed. This approach provides a Monte Carlo analysis that can give very accurate results for the average and the variance (or even the probability distribution) of program execution time. In this paper, we introduce the ideas underpinning the PEVPM technique, describe the syntax of the performance modeling language and the virtual machine that supports it, and present some results, for example, parallel programs to show the power and accuracy of the methodology.

international parallel and distributed processing symposium | 2004

Communication benchmarking and performance modelling of MPI programs on cluster computers

Duncan A. Grove; Paul D. Coddington

Summary form only given. We give an overview of two related tools that we have developed to provide more accurate measurement and modelling of the performance of message passing programs and communications on distributed memory parallel computers. MPIBench uses a very precise, globally synchronised clock to measure the performance of MPI communication routines, and can generate probability distributions of communication times, not just the average values produced by other MPI benchmarks. This allows useful insights into MPI communications performance of parallel computers, particularly the effects of network contention. PEVPM provides a simple, fast and accurate technique for performance modelling and prediction of message-passing parallel programs. It uses a virtual parallel machine to simulate the execution of the parallel program. The effects of network contention can be accurately modelled by sampling from the probability distributions generated by MPIBench. These tools are particularly useful on Beowulf clusters with commodity Ethernet networks, where relatively high latencies, network congestion and TCP problems can significantly affect communication performance, and can be difficult to model accurately using other tools. Experiments with example parallel programs demonstrate that PEVPM gives accurate performance predictions on Beowulf clusters. We also show that modelling communication performance using average times rather than sampling from probability distributions can give misleading results, particularly for a large number of processors.

international conference on algorithms and architectures for parallel processing | 2005

Analytical models of probability distributions for MPI point-to-point communication times on distributed memory parallel computers

Duncan A. Grove; Paul D. Coddington

Measurement and modelling of distributions of data communication times is commonly done for telecommunication networks, but this has not previously been done for message passing communications on parallel computers. We have used the MPIBench program to measure distributions of point-to-point MPI communication times for two different parallel computers, with a low-end Ethernet network and a high-end Quadrics network respectively. Here we present and discuss the results of efforts to fit the measured distributions with standard probability distribution functions such as exponential, lognormal, Erlang, gamma, Pearson 5 and Weibull distributions.

ieee international conference on high performance computing data and analytics | 2000

A Beowulf Cluster for Computational Chemistry

Kenneth A. Hawick; Duncan A. Grove; Paul D. Coddington; Heath A. James; M. Buntine

We have constructed a Beowulf cluster of networked PCs that is dedicated to solving chemistry problems using standard software packages such as Gaussian and GAMESS. We describe the economic and performance trade-offs in the design of the cluster, and present some some selected benchmark results for a parallel version of GAMESS. We believe that the Beowulf we have constructed offers the best price/performance ratio for our chemistry applications, and that commodity clusters can now provide dedicated supercomputer performance within the budget of most university departments.

Archive | 2001