Graham Markall | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Graham Markall is active.

Explore More

Publication

Featured researches published by Graham Markall.

ACM Transactions on Mathematical Software | 2017

Firedrake: Automating the Finite Element Method by Composing Abstractions

Florian Rathgeber; David A. Ham; Lawrence Mitchell; Fabio Luporini; Andrew T. T. McRae; Gheorghe-Teodor Bercea; Graham Markall; Paul H. J. Kelly

Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrake adopts the domain-specific language for the finite element method of the FEniCS project, but with a pure Python runtime-only implementation centred on the composition of several existing and new abstractions for particular aspects of scientific computing. The result is a more complete separation of concerns which eases the incorporation of separate contributions from computer scientists, numerical analysts and application specialists. These contributions may add functionality, or improve performance. Firedrake benefits from automatically applying new optimisations. This includes factorising mixed function spaces, transforming and vectorising inner loops, and intrinsically supporting block matrix operations. Importantly, Firedrake presents a simple public API for escaping the UFL abstraction. This allows users to implement common operations that fall outside pure variational formulations, such as flux-limiters.

The Computer Journal | 2012

Performance Analysis and Optimization of the OP2 Framework on Many-Core Architectures

Michael B. Giles; Gihan R. Mudalige; Z. Sharif; Graham Markall; Paul H. J. Kelly

This paper presents a benchmarking, performance analysis and optimization study of the OP2 ‘active’ library, which provides an abstraction framework for the parallel execution of unstructured mesh applications. OP2 aims to decouple the scientific specification of the application from its parallel implementation, and thereby achieve code longevity and near-optimal performance through re-targeting the application to execute on different multi-core/many-core hardware. Runtime performance results are presented for a representative unstructured mesh application on a variety of many-core processor systems, including traditional X86 architectures from Intel (Xeon based on the older Penryn and current Nehalem micro-architectures) and GPU offerings from NVIDIA (GTX260, Tesla C2050). Our analysis demonstrates the contrasting performance between the use of CPU (OpenMP) and GPU (CUDA) parallel implementations for the solution of an industrial-sized unstructured mesh consisting of about 1.5Â million edges. Results show the significance of choosing the correct partition and thread-block configuration, the factors limiting the GPU performance and insights into optimizations for improved performance.

ieee international conference on high performance computing data and analytics | 2012

PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes

Florian Rathgeber; Graham Markall; Lawrence Mitchell; Nicolas Loriant; David A. Ham; Carlo Bertolli; Paul H. J. Kelly

Emerging many-core platforms are very difficult to program in a performance portable manner whilst achieving high efficiency on a diverse range of architectures. We present work in progress on PyOP2, a high-level embedded domain-specific language for mesh-based simulation codes that executes numerical kernels in parallel over unstructured meshes. Just-in-time kernel compilation and parallel scheduling are delayed until runtime, when problem-specific parameters are available. Using generative metaprogramming, performance portability is achieved, while details of the parallel implementation are abstracted from the programmer. PyOP2 kernels for finite element computations can be generated automatically from equations given in the domain-specific Unified Form Language. Interfacing to the multi-phase CFD code Fluidity through a very thin layer on top of PyOP2 yields a general purpose finite element solver with an input notation very close to mathematical formulae. Preliminary performance figures show speedups of up to 3.4× compared to Fluiditys built-in solvers when running in parallel.

international conference on conceptual structures | 2010

Towards generating optimised finite element solvers for GPUs from high-level specifications

Graham Markall; David A. Ham; Paul H. J. Kelly

Abstract We argue that producing maintainable high-performance implementations of finite element methods for multiple targets requires that they are written using a high-level domain-specific language. We make the case for using one such language, the Unified Form Language (UFL), by discussing how it allows the generation of high-performance code from maintainable sources. We support this case by showing that optimal implementations of a finite element solver written for a Graphics Processing Unit and a multicore CPU require the use of different algorithms and data formats that are embodied by the UFL representation. Finally we describe a prototype compiler that generates low-level code from high-level specifications, and outline how the high-level UFL representation can be lowered to facilitate optimisation using existing techniques prior to code generation.

international supercomputing conference | 2013

Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS

Graham Markall; Florian Rathgeber; Lawrence Mitchell; Nicolas Loriant; Carlo Bertolli; David A. Ham; Paul H. J. Kelly

We describe a toolchain that provides a fully automated compilation pathway from a finite element domain-specific language to low-level code for multicore and GPGPU platforms. We demonstrate that the generated code exceeds the performance of the best available alternatives, without requiring manual tuning or modification of the generated code. The toolchain can easily be integrated with existing finite element solvers, providing a means to add performance portable methods without having to rebuild an entire complex implementation from scratch.

ICNAAM 2010: International Conference of Numerical Analysis and Applied Mathematics 2010 | 2010

Generating Optimised Finite Element Solvers for GPU Architectures

Graham Markall; David A. Ham; Paul H. J. Kelly

We show that optimal implementations of a finite element solver written for a Graphics Processing Unit and a multicore CPU require the use of different algorithms and data formats. This motivates the use of code generation in order to produce efficient, maintainable implementations of the finite element method for GPU architectures.

International Journal for Numerical Methods in Fluids | 2013

Finite element assembly strategies on multi‐core and many‐core architectures

Graham Markall; A. Slemmer; David A. Ham; Paul H. J. Kelly; Chris D. Cantwell; Spencer J. Sherwin

measurement and modeling of computer systems | 2011

Performance analysis of the OP2 framework on many-core architectures

Michael B. Giles; Gihan R. Mudalige; Z. Sharif; Graham Markall; Paul H. J. Kelly

Archive | 2016

PyOP2: Framework for performance-portable parallel computations on unstructured meshes

Florian Rathgeber; Hector Dearman; gbts; Gheorghe-Teodor Bercea; Kaho Sato; Simon W. Funke; Lawrence Mitchell; Miklós Homolya; Francis Russell; Christian T. Jacobs; David A. Ham; Andrew T. T. McRae; Graham Markall; Fabio Luporini

Archive | 2016