J. Mark Bull
University of Edinburgh
Publications
Featured research published by J. Mark Bull.
ACM SIGARCH Computer Architecture News | 2001
J. Mark Bull; Darragh O'Neill
In this paper we present a set of extensions to an existing microbenchmark suite for OpenMP. The new benchmarks are targeted at directives introduced in the OpenMP 2.0 standard, as well as at the handling of thread-private data structures. Results are presented for a Sun HPC 6500 system, with an early-access release of an OpenMP 2.0-compliant compiler, and for an SGI Origin 3000 system.
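The measurement approach can be illustrated with a minimal sketch (not the suite's actual code): the overhead of a construct is estimated as the difference between the time to execute a fixed amount of work inside the construct and a sequential reference time for the same work. The delay() routine and the repetition counts below are illustrative assumptions.

    #include <stdio.h>
    #include <omp.h>

    #define INNER 1000
    #define OUTER 1000

    void delay(int n) {                  /* busy-work of known cost */
        volatile double a = 0.0;
        for (int i = 0; i < n; i++) a += i * 0.5;
    }

    int main(void) {
        double t0, ref, par;

        /* reference: the delay executed sequentially OUTER times */
        t0 = omp_get_wtime();
        for (int k = 0; k < OUTER; k++) delay(INNER);
        ref = omp_get_wtime() - t0;

        /* same per-thread work, but entering a parallel region each time */
        t0 = omp_get_wtime();
        for (int k = 0; k < OUTER; k++) {
            #pragma omp parallel
            delay(INNER);
        }
        par = omp_get_wtime() - t0;

        /* every thread does the full delay, so with zero overhead the two
           timings would match; the difference is the construct's cost */
        printf("parallel region overhead ~ %g us per entry\n",
               1e6 * (par - ref) / OUTER);
        return 0;
    }
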
International Workshop on OpenMP | 2012
J. Mark Bull; Fiona Reid; Nicola McDonnell
We present a set of extensions to an existing microbenchmark suite for OpenMP. The new benchmarks measure the overhead of the task construct introduced in the OpenMP 3.0 standard, and associated task synchronisation constructs. We present the results from a variety of compilers and hardware platforms, which demonstrate some significant differences in performance between different OpenMP implementations.
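In the same spirit, a hedged sketch of how a task-overhead measurement can be structured (names and counts are illustrative, not the suite's code): the same work is executed once directly and once packaged as explicit tasks, and anything above the ideal parallel time is attributed to task management.

    #include <stdio.h>
    #include <omp.h>

    #define NTASKS 100000
    #define DELAY  100

    void delay(int n) {                     /* busy-work of known cost */
        volatile double a = 0.0;
        for (int i = 0; i < n; i++) a += i * 0.5;
    }

    int main(void) {
        double t0, ref, par;

        /* reference: the same work executed directly, no tasking */
        t0 = omp_get_wtime();
        for (int i = 0; i < NTASKS; i++) delay(DELAY);
        ref = omp_get_wtime() - t0;

        /* the same work packaged as NTASKS explicit tasks */
        t0 = omp_get_wtime();
        #pragma omp parallel
        #pragma omp single                  /* one thread creates every task */
        {
            for (int i = 0; i < NTASKS; i++) {
                #pragma omp task
                delay(DELAY);
            }
            #pragma omp taskwait            /* include task completion in the timing */
        }
        par = omp_get_wtime() - t0;

        /* with perfect scaling the tasked version would take ref/nthreads;
           the excess over that is task creation/dispatch overhead */
        double ideal = ref / omp_get_max_threads();
        printf("per-task overhead ~ %g us\n", 1e6 * (par - ideal) / NTASKS);
        return 0;
    }
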
IEEE International Conference on High Performance Computing, Data, and Analytics | 2014
Iain Bethune; J. Mark Bull; Nicholas J. Dingle; Nicholas J. Higham
Ever-increasing core counts create the need to develop parallel algorithms that avoid closely coupled execution across all cores. We present performance analysis of several parallel asynchronous implementations of Jacobi’s method for solving systems of linear equations, using MPI, SHMEM and OpenMP. In particular we have solved systems of over 4 billion unknowns using up to 32,768 processes on a Cray XE6 supercomputer. We show that the precise implementation details of asynchronous algorithms can strongly affect the resulting performance and convergence behaviour of our solvers in unexpected ways, discuss how our specific implementations could be generalised to other classes of problem, and suggest how existing parallel programming models might be extended to allow asynchronous algorithms to be expressed more easily.
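The asynchronous idea can be sketched in a few lines of shared-memory code. This is an illustrative OpenMP-only toy on a small dense, diagonally dominant system, not the paper's MPI/SHMEM solvers: removing the barrier between sweeps lets threads run ahead and read a mix of old and new values.

    #include <stdio.h>
    #include <math.h>
    #include <omp.h>

    #define N      1000
    #define SWEEPS 200

    static double A[N][N], b[N], x[N];

    int main(void) {
        /* diagonally dominant test system whose exact solution is x[i] = 1 */
        for (int i = 0; i < N; i++) {
            double rowsum = 0.0;
            for (int j = 0; j < N; j++) {
                A[i][j] = (i == j) ? 2.0 * N : 1.0;
                rowsum += A[i][j];
            }
            b[i] = rowsum;
            x[i] = 0.0;
        }

        /* asynchronous Jacobi: 'nowait' removes the end-of-sweep barrier, so
           threads proceed independently and read whatever mix of old and new
           values of x is currently visible; updates are in place and
           deliberately racy (a production version would use atomic reads and
           writes of x to keep the behaviour well defined) */
        #pragma omp parallel
        for (int sweep = 0; sweep < SWEEPS; sweep++) {
            #pragma omp for nowait
            for (int i = 0; i < N; i++) {
                double s = 0.0;
                for (int j = 0; j < N; j++)
                    if (j != i) s += A[i][j] * x[j];
                x[i] = (b[i] - s) / A[i][i];
            }
        }

        double err = 0.0;
        for (int i = 0; i < N; i++) err = fmax(err, fabs(x[i] - 1.0));
        printf("max error after %d sweeps: %g\n", SWEEPS, err);
        return 0;
    }
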
International Journal of Parallel Programming | 2010
J. Mark Bull; James P. Enright; Xu Guo; C.M. Maynard; Fiona Reid
With the current prevalence of multi-core processors in HPC architectures, mixed-mode programming, using both MPI and OpenMP in the same application, is seen as an important technique for achieving high levels of scalability. As there are few standard benchmarks written in this paradigm, it is difficult to assess the likely performance of such programs. To help address this, we examine the performance of mixed-mode OpenMP/MPI on a number of popular HPC architectures, using a synthetic benchmark suite and two large-scale applications. We find performance characteristics which differ significantly between implementations, and which highlight possible areas for improvement, especially when multiple OpenMP threads communicate simultaneously via MPI.
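A minimal sketch of the mixed-mode pattern in question, assuming an MPI library that supports MPI_THREAD_MULTIPLE (the thread-pairing scheme via message tags is an illustrative choice, not the paper's benchmark code):

    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv) {
        int provided, rank, size;

        /* request full thread support so that several OpenMP threads may
           call MPI concurrently -- the usage found most sensitive to
           implementation quality */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            fprintf(stderr, "MPI_THREAD_MULTIPLE not supported\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* each thread on rank 0 sends to the matching thread on rank 1;
           the thread id is used as the message tag to pair threads up
           (assumes two ranks running the same number of threads) */
        #pragma omp parallel
        {
            int tid = omp_get_thread_num();
            int buf = tid;
            if (rank == 0 && size > 1)
                MPI_Send(&buf, 1, MPI_INT, 1, tid, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(&buf, 1, MPI_INT, 0, tid, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }
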
International Workshop on OpenMP | 2009
J. Mark Bull; James P. Enright; Nadia Ameer
With the current prevalence of multi-core processors in HPC architectures, mixed-mode programming, using both MPI and OpenMP in the same application, is becoming increasingly important. However, no low-level synthetic benchmarks exist to test the performance of this programming model. We have designed and implemented a set of microbenchmarks for mixed-mode programming, including both point-to-point and collective communication patterns. These microbenchmarks have been run on a number of current HPC architectures: the results show some interesting performance differences between the architectures and highlight some possible inefficiencies in the implementation of MPI on multi-core systems.
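One of the point-to-point patterns can be sketched as a multi-threaded ping-pong; this is an illustrative reconstruction, not the benchmark suite itself, and it assumes exactly two ranks running equal numbers of threads:

    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    #define NBYTES 1024
    #define REPS   1000

    int main(int argc, char **argv) {
        int provided, rank;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            MPI_Abort(MPI_COMM_WORLD, 1);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int peer = 1 - rank;               /* assumes exactly two ranks */

        /* every OpenMP thread runs its own ping-pong with the matching
           thread on the other rank, using its thread id as the tag */
        #pragma omp parallel
        {
            int tid = omp_get_thread_num();
            char buf[NBYTES] = {0};
            double t0 = MPI_Wtime();
            for (int r = 0; r < REPS; r++) {
                if (rank == 0) {
                    MPI_Send(buf, NBYTES, MPI_CHAR, peer, tid, MPI_COMM_WORLD);
                    MPI_Recv(buf, NBYTES, MPI_CHAR, peer, tid, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                } else {
                    MPI_Recv(buf, NBYTES, MPI_CHAR, peer, tid, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Send(buf, NBYTES, MPI_CHAR, peer, tid, MPI_COMM_WORLD);
                }
            }
            printf("rank %d thread %d: %.2f us per round trip\n",
                   rank, tid, 1e6 * (MPI_Wtime() - t0) / REPS);
        }

        MPI_Finalize();
        return 0;
    }
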
Parallel Computing | 2000
T. L. Freeman; David Hancock; J. Mark Bull; Rupert W. Ford
In earlier papers ([2], [3], [6]), feedback-guided loop scheduling algorithms have been shown to be very effective for certain loop scheduling problems which involve a sequential outer loop and a parallel inner loop, and for which the workload of the parallel loop changes only slowly from one execution to the next. In this paper the extension of these ideas to the case of nested parallel loops is investigated. We describe four feedback-guided algorithms for scheduling nested loops and evaluate the performance of the algorithms on a set of synthetic benchmarks.
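A one-dimensional sketch of the feedback-guided idea (a simplifying assumption: the cost of each old chunk is treated as uniform over its iterations, and the paper's four nested-loop algorithms are not reproduced here). After each outer iteration, measured per-chunk times are used to cut new chunk boundaries with equal predicted cost:

    #include <stdio.h>
    #include <omp.h>

    #define N      100000
    #define NOUTER 50
    #define MAXTHR 64

    double work(int i) {                 /* per-iteration cost varies with i */
        volatile double a = 0.0;
        for (int k = 0; k < 10 + (i % 100); k++) a += k;
        return a;
    }

    int main(void) {
        int nth = omp_get_max_threads();   /* assumed <= MAXTHR */
        int lo[MAXTHR], hi[MAXTHR];
        double tsec[MAXTHR];

        /* start from an equal partition of [0, N) */
        for (int t = 0; t < nth; t++) {
            lo[t] = (int)((long long)N * t / nth);
            hi[t] = (int)((long long)N * (t + 1) / nth);
        }

        for (int outer = 0; outer < NOUTER; outer++) {
            /* run the parallel loop with the current partition, timing
               each thread's chunk */
            #pragma omp parallel
            {
                int t = omp_get_thread_num();
                double t0 = omp_get_wtime();
                for (int i = lo[t]; i < hi[t]; i++) work(i);
                tsec[t] = omp_get_wtime() - t0;
            }

            /* feedback step: cut new boundaries so that every chunk has
               the same predicted time on the next outer iteration */
            double total = 0.0;
            for (int t = 0; t < nth; t++) total += tsec[t];
            double target = total / nth;

            int newlo[MAXTHR], newhi[MAXTHR];
            int oldt = 0, pos = 0;
            double left = tsec[0];          /* predicted time left in chunk oldt */
            for (int t = 0; t < nth; t++) {
                newlo[t] = pos;
                double need = target;
                while (oldt < nth - 1 && left <= need) {
                    need -= left;           /* consume the rest of chunk oldt */
                    pos = hi[oldt++];
                    left = tsec[oldt];
                }
                /* take a fraction of the current old chunk at its measured rate */
                double len  = (double)(hi[oldt] - lo[oldt]);
                double rate = (len > 0.0) ? tsec[oldt] / len : 0.0;
                pos += (rate > 0.0) ? (int)(need / rate) : 0;
                left -= need;
                if (pos > N) pos = N;
                newhi[t] = (t == nth - 1) ? N : pos;  /* last chunk absorbs rounding */
            }
            for (int t = 0; t < nth; t++) { lo[t] = newlo[t]; hi[t] = newhi[t]; }
        }

        printf("final chunk sizes:");
        for (int t = 0; t < nth; t++) printf(" %d", hi[t] - lo[t]);
        printf("\n");
        return 0;
    }
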
IEEE International Conference on High Performance Computing, Data, and Analytics | 2012
Mark Tucker; J. Mark Bull
Archive | 2011
Iain Bethune; J. Mark Bull; Nicholas J. Dingle; Nicholas J. Higham
High Performance Computational Finance | 2015
Mark Tucker; J. Mark Bull
arXiv: Computational Finance | 2014
Mark Tucker; J. Mark Bull