Iain Bethune | Researchain

Publication

Featured research published by Iain Bethune.

IEEE International Conference on High Performance Computing Data and Analytics | 2014

Performance analysis of asynchronous Jacobi's method implemented in MPI, SHMEM and OpenMP

Iain Bethune; J. Mark Bull; Nicholas J. Dingle; Nicholas J. Higham

Ever-increasing core counts create the need to develop parallel algorithms that avoid closely coupled execution across all cores. We present performance analysis of several parallel asynchronous implementations of Jacobi’s method for solving systems of linear equations, using MPI, SHMEM and OpenMP. In particular we have solved systems of over 4 billion unknowns using up to 32,768 processes on a Cray XE6 supercomputer. We show that the precise implementation details of asynchronous algorithms can strongly affect the resulting performance and convergence behaviour of our solvers in unexpected ways, discuss how our specific implementations could be generalised to other classes of problem, and suggest how existing parallel programming models might be extended to allow asynchronous algorithms to be expressed more easily.
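
For readers unfamiliar with the method named in this abstract, the following is a minimal serial sketch of the classical (synchronous) Jacobi iteration. It is an illustration only and not the authors' MPI, SHMEM or OpenMP code; the function name `jacobi` and the small 2x2 test system are assumptions chosen for the example. In the synchronous form every component of the new iterate is computed from the previous iterate; asynchronous variants relax this, letting each process use whatever neighbouring values are currently available.

```python
def jacobi(A, b, iterations=50):
    """Synchronous Jacobi iteration for A x = b, with A given as a list of rows."""
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(iterations):
        # Each new component depends only on the previous iterate x.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        x = x_new
    return x


# Hypothetical, strictly diagonally dominant test system (so Jacobi converges):
#   4x +  y =  9
#   2x + 5y = 19
A = [[4.0, 1.0], [2.0, 5.0]]
b = [9.0, 19.0]
solution = jacobi(A, b)
```

The exact solution of this test system is x = 13/9, y = 29/9, which the iteration approaches because the matrix is strictly diagonally dominant.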

International Conference on e-Science | 2016

ExTASY: Scalable and flexible coupling of MD simulations and advanced sampling techniques

Vivekanandan Balasubramanian; Iain Bethune; Ardita Shkurti; Elena Breitmoser; Eugen Hruska; Cecilia Clementi; Charles A. Laughton; Shantenu Jha

For many macromolecular systems the accurate sampling of the relevant regions on the potential energy surface cannot be obtained by a single, long Molecular Dynamics (MD) trajectory. New approaches are required to promote more efficient sampling. We present the design and implementation of the Extensible Toolkit for Advanced Sampling and analYsis (ExTASY) for building and executing advanced sampling workflows on HPC systems. ExTASY provides Python based “templated scripts” that interface to an interoperable and high-performance pilot-based run time system, which abstracts the complexity of managing multiple simulations. ExTASY supports the use of existing highly-optimised parallel MD codes and their coupling to analysis tools based upon collective coordinates which do not require a priori knowledge of the system to bias. We describe two workflows which both couple large “ensembles” of relatively short MD simulations with analysis tools to automatically analyse the generated trajectories and identify molecular conformational structures that will be used on-the-fly as new starting points for further “simulation-analysis” iterations. One of the workflows leverages the Locally Scaled Diffusion Maps technique; the other makes use of Complementary Coordinates techniques to enhance sampling and generate start-points for the next generation of MD simulations. We show that the ExTASY tools have been deployed on a range of HPC systems including ARCHER (Cray XC30), Blue Waters (Cray XE6/XK7), and Stampede (Linux cluster), and that good strong scaling can be obtained up to 1000s of MD simulations, independent of the size of each simulation. We discuss how ExTASY can be easily extended or modified by end-users to build their own workflows, and ongoing work to improve the usability and robustness of ExTASY.

International Conference on Parallel Processing | 2013

Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs

Iain Bethune; Michael Goetz

Great strides have been made in recent years in the search for ever larger prime Generalized Fermat Numbers (GFN). We briefly review the history of the GFN prime search, and describe new implementations of the ‘Genefer’ software (now available as open source) using CUDA and optimised CPU assembler which have underpinned this unprecedented progress. The results of the ongoing search are used to extend Gallot and Dubner’s published tables comparing the theoretical predictions with actual distributions of primes, and we report on recent discoveries of GFN primes with over one million digits.
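
As background to this abstract, a Generalized Fermat Number has the form b^(2^n) + 1. The short sketch below is not part of the Genefer software and performs no primality testing; it only illustrates how quickly these numbers grow and how a digit count can be estimated from logarithms. The function names `gfn` and `estimate_digits` and the chosen parameters are assumptions made for the example.

```python
import math


def gfn(b, n):
    """Return the Generalized Fermat Number b^(2^n) + 1 as an exact integer."""
    return b ** (2 ** n) + 1


def estimate_digits(b, n):
    """Estimate the decimal digit count of b^(2^n) + 1 without building the number."""
    return math.floor((2 ** n) * math.log10(b)) + 1


# Small cases can be checked exactly: b = 2, n = 4 gives the Fermat number 65537.
small = gfn(2, 4)
small_digits = len(str(small))

# For large exponents only the estimate is practical. The parameters b = 2, n = 22
# are illustrative and say nothing about whether the number is prime.
large_digits = estimate_digits(2, 22)
```

With these illustrative parameters the estimated size already exceeds one million decimal digits, which gives a sense of the scale of the numbers the search works with.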

Computer Physics Communications | 2012

Mapping application performance to HPC architecture

Alan Gray; Iain Bethune; R.D. Kenway; Lorna Smith; Martyn F. Guest; Christine Kitchen; P. Calleja; A. Korzynski; S. Rankin; Mike Ashworth; Andrew Porter; Ilian T. Todorov; Martin Plummer; Eugene E. Jones; L. Steenman-Clark; B. Ralston; Charles A. Laughton

A suite of application benchmarks, designed to be broadly representative of UK HPC usage, has been developed to stress a broad range of architectural features of large scale parallel HPC resources. A generic methodology to investigate application performance and scaling characteristics has been defined, resulting in a detailed understanding of the performance of these applications. This methodology is transferable to other applications and systems: it is of practical value to developers and users who are aiming for optimal utilisation of HPC resources. An understanding of the performance characteristics of a range of large-scale HPC resources has been obtained using low-level synthetic benchmarks. A relatively simple, qualitative mechanism to assess and predict application performance on current and future architectures using synthetic benchmark results together with application performance analysis results is explored.

European Journal of Physics | 2016

High-performance computational fluid dynamics: a custom-code approach

James Fannon; Jean-Christophe Loiseau; Prashant Valluri; Iain Bethune; Lennon Ó Náraigh

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.

International Conference on Software Testing Verification and Validation | 2014

Automated Multi-platform Testing and Code Coverage Analysis of the CP2K Application

Marko Mišić; Iain Bethune; Milo Tomašević

CP2K is a widely used application for atomistic simulation that can execute on a range of architectures. Consisting of more than one million lines of Fortran 95 code, the application is tested for correctness with a set of about 2,500 inputs using a dedicated regression testing environment. CP2K can be built with many compilers and executed on different serial and parallel platforms, thus making comprehensive testing even more challenging. This paper presents an effort to improve the existing testing process of CP2K in order to better support its continuing development. Enhancements have been made to the regression testing environment to support multi-platform testing and a new automated multi-platform testing system has been developed to check the code on a regular basis. Also, tools have been used to gain code coverage information for different test configurations. All the information is aggregated and displayed on the dedicated web page.

Parallel Computing | 2016

Parallel Computing: On the Road to Exascale

Iain Bethune; Toni Collis; Lennon Ó Náraigh; David Scott; Prashant Valluri

We introduce TPLS (Two-Phase Level Set), an MPI-parallel Direct Numerical Simulation code for two-phase flows in channel geometries. Recent developments to the code are discussed which improve the performance of the solvers and I/O by using the PETSc and NetCDF libraries respectively. Usability and functionality improvements enabled by code refactoring and merging of a separate OpenMP-parallelized version are also outlined. The overall scaling behaviour of the code is measured, and good strong scaling up to 1152 cores is observed for a 5.6 million element grid. A comparison is made between the legacy serial text-formatted I/O and new NetCDF implementations, showing speedups of up to 17x. Finally, we explore the effects of output file striping on the Lustre parallel file system on ARCHER, a Cray XC30 supercomputer, finding performance gains of up to 12% over the default striping settings.

International Conference on Parallel Processing | 2014

10th International Conference, PPAM 2013, Warsaw, Poland, September 8-11, 2013, Revised Selected Papers, Part I

Iain Bethune; Michael Goetz

International Conference on Parallel Processing | 2014

Extending the generalized Fermat prime search beyond one million digits

Iain Bethune; Michael Goetz

Parallel Computing | 2012

PRACE DECI (Distributed European Computing Initiative) Minisymposium

Christopher R. Johnson; Adam Carter; Iain Bethune; Kevin Stratford; Mikko J. Alava; Vitor Cardoso; Muhammad Asif; Bernhard S. A. Schuberth; Tobias Weinzierl

This article gives an overview of the DECI (Distributed European Computing Initiative) Minisymposium held within the PARA 2012 conference, taking the form of a short set of articles for each of the talks presented. The work presented here was carried out under either the DEISA (receiving funding through the EU FP7 project RI-22291) or PRACE-2IP (receiving funding from the EU FP7 Programme (FP7/2007-2013) under grant agreement no. RI-283493) projects.