Michael Schliephake
Royal Institute of Technology
Publication
Featured research published by Michael Schliephake.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2015
Stefano Markidis; Jing Gong; Michael Schliephake; Erwin Laure; Alistair Hart; David Henty; Katherine Heisey; Paul Fischer
We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems, and indicated possible kernel optimizations. Porting NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The naïve implementation using OpenACC leads to little performance improvement: on a single node, from 16 Gflops obtained with the version without OpenACC, we reached 20 Gflops with the naïve OpenACC implementation. An optimized NekBone version reaches 43 Gflops on a single node. In addition, we ported NekBone to parallel GPU systems and optimized it there, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory.
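The single-node figures quoted in this abstract can be turned into speedup factors with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the last step assumes that the 79.9% parallel efficiency is measured relative to ideal linear scaling, which the abstract does not spell out.

```python
# Figures taken from the abstract (single node, Gflops).
BASELINE_GFLOPS = 16.0   # version without OpenACC
NAIVE_GFLOPS = 20.0      # naive OpenACC version
OPTIMIZED_GFLOPS = 43.0  # optimized OpenACC version


def speedup(new_rate: float, old_rate: float) -> float:
    """Ratio of two throughput figures (hypothetical helper)."""
    return new_rate / old_rate


def ideal_equivalent_units(efficiency: float, units: int) -> float:
    """Number of perfectly scaling units that would deliver the same throughput.

    Assumption: efficiency is defined relative to ideal linear scaling.
    """
    return efficiency * units


naive_speedup = speedup(NAIVE_GFLOPS, BASELINE_GFLOPS)
optimized_speedup = speedup(OPTIMIZED_GFLOPS, BASELINE_GFLOPS)
equivalent_gpus = ideal_equivalent_units(0.799, 1024)

print(f"naive OpenACC speedup:     {naive_speedup:.2f}x")
print(f"optimized OpenACC speedup: {optimized_speedup:.2f}x")
print(f"1024 GPUs at 79.9% efficiency ~ {equivalent_gpus:.0f} ideal GPUs")
```

Under these assumptions the naïve port gains a factor of 1.25 over the baseline and the optimized version a factor of roughly 2.7.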
International Conference on Conceptual Structures | 2011
Michael Schliephake; Xavier Aguilar; Erwin Laure
The execution of scientific codes will introduce a number of new challenges and intensify some old ones on new high-performance computing infrastructures. Petascale computers are large systems with complex designs using heterogeneous technologies that make the programming and porting of applications difficult, particularly if one wants to use the maximum peak performance of the system. In this paper we present the design and first prototype of a runtime system for parallel numerical simulations on large-scale systems. The proposed runtime system addresses the challenges of performance, scalability, and programmability of large-scale HPC systems. We also present initial results of our prototype implementation using a molecular dynamics application kernel.
Future Generation Computer Systems | 2013
Xavier Aguilar; Michael Schliephake; Olav Vahtras; Judit Gimenez; Erwin Laure
Dalton is a molecular electronic structure program featuring common methods of computational chemistry that are based on pure quantum mechanics (QM) as well as hybrid quantum mechanics/molecular mechanics (QM/MM). It is specialized in and holds a leading position in the calculation of molecular properties, with a large worldwide user community (over 2000 licenses issued). In this paper, we present a performance characterization and optimization of Dalton. We also propose a solution that prevents the master/worker design of Dalton from becoming a performance bottleneck for larger process counts. With these improvements we obtain speedups of 4x, increasing the parallel efficiency of the code and allowing it to run on a much larger number of cores.
IEEE International Conference on eScience | 2011
Xavier Aguilar; Michael Schliephake; Olav Vahtras; Judit Gimenez; Erwin Laure
Dalton is a molecular electronic structure program featuring common methods of computational chemistry that are based on pure quantum mechanics (QM) as well as hybrid quantum mechanics/molecular mechanics (QM/MM). It is specialized in and holds a leading position in the calculation of molecular properties, with a large worldwide user community (over 2000 licenses issued). In this paper, we present a characterization and performance optimization of Dalton that increases the scalability and parallel efficiency of the application. We also propose a solution that helps prevent the master/worker design of Dalton from becoming a performance bottleneck for larger process counts and that increases the parallel efficiency.
International Conference on Exascale Applications and Software, EASC 2014 | 2015
Jing Gong; Stefano Markidis; Michael Schliephake; Erwin Laure; Dan S. Henningson; Philipp Schlatter; Adam Peplinski; Alistair Hart; Jens Doleschal; David Henty; Paul Fischer
Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flows. We follow up on an earlier study which ported the simplified version of Nek5000 to a GPU-accelerated system by presenting the hybrid CPU/GPU implementation of the full Nek5000 code using OpenACC. The matrix-matrix multiplication, the Nek5000 gather-scatter operator and a preconditioned Conjugate Gradient solver have been implemented using OpenACC for multi-GPU systems. We report a speed-up of 1.3 on a single node of a Cray XK6 when using OpenACC directives in Nek5000. On 512 nodes of the Titan supercomputer, the speed-up approaches 1.4. A performance analysis of the Nek5000 code using the Score-P and Vampir performance monitoring tools shows that overlapping GPU kernels with host-accelerator memory transfers would considerably increase the performance of the OpenACC version of the Nek5000 code.
arXiv: Distributed, Parallel, and Cluster Computing | 2014
Michael Schliephake; Erwin Laure
In order to achieve exascale performance it is important to detect potential bottlenecks and identify strategies to overcome them. For this, both applications and system software must be analysed and potentially improved. The EU FP7 project Collaborative Research into Exascale Systemware, Tools & Applications (CRESTA) chose to co-design advanced simulation applications and system software as well as development tools. In this paper, we present the results of a co-design activity focused on the simulation code NEK5000 that aims at performance improvements of collective communication operations. We have analysed the algorithms that form the core of NEK5000’s communication module in order to assess its viability on recent computer architectures before starting to improve its performance. Our results show that the crystal router algorithm performs well in sparse, irregular collective operations for medium and large processor counts, but improvements will be needed for the even larger system sizes of the future. We sketch the needed improvements, which will make the communication algorithms also beneficial for other applications that need to implement latency-dominated communication schemes with short messages. The latency-optimised communication operations will also be used in a runtime system providing dynamic load balancing, which is under development within CRESTA.
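The crystal router mentioned in this abstract delivers sparse, irregular messages in a logarithmic number of pairwise exchange stages. The following is a minimal single-process simulation of the hypercube-style routing idea, not the NEK5000 implementation: it assumes a power-of-two number of ranks, and the names `crystal_route` and `Message` are hypothetical.

```python
from typing import List, Tuple

Message = Tuple[int, str]  # (destination rank, payload)


def crystal_route(outboxes: List[List[Message]]) -> List[List[Message]]:
    """Route every message to its destination in log2(P) exchange stages.

    Simplifying assumption: the number of ranks P is a power of two.
    """
    nprocs = len(outboxes)
    if nprocs == 0 or nprocs & (nprocs - 1):
        raise ValueError("this sketch assumes a power-of-two number of ranks")
    for msgs in outboxes:
        for dest, _ in msgs:
            if not 0 <= dest < nprocs:
                raise ValueError(f"destination {dest} is out of range")

    buffers = [list(msgs) for msgs in outboxes]
    bit = 1
    while bit < nprocs:
        next_buffers: List[List[Message]] = [[] for _ in range(nprocs)]
        for rank, msgs in enumerate(buffers):
            partner = rank ^ bit  # neighbour across this hypercube dimension
            for dest, payload in msgs:
                # Forward the message if its destination differs from the
                # current rank in this bit; otherwise keep it locally.
                target = partner if (dest ^ rank) & bit else rank
                next_buffers[target].append((dest, payload))
        buffers = next_buffers
        bit <<= 1
    return buffers


# Usage: four ranks with an irregular communication pattern.
delivered = crystal_route([
    [(3, "a"), (1, "b")],  # messages starting on rank 0
    [(0, "c")],            # rank 1
    [],                    # rank 2 sends nothing
    [(2, "d"), (0, "e")],  # rank 3
])
for rank, msgs in enumerate(delivered):
    print(rank, msgs)
```

Each rank talks to only one partner per stage, so the number of communication steps grows with log2(P) rather than with the number of distinct destinations, at the cost of forwarding some messages through intermediate ranks.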
International Conference on Exascale Applications and Software | 2014
Jing Gong; Stefano Markidis; Michael Schliephake; Erwin Laure; Dan S. Henningson; Philipp Schlatter; Adam Peplinski; Alistair Hart; Jens Doleschal; David Henty; Paul Fischer
Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flows. We follow up on an earlier study which ported the simplified version of Nek5000 to a GPU-accelerated system by presenting the hybrid CPU/GPU implementation of the full Nek5000 code using OpenACC. The matrix-matrix multiplication, the Nek5000 gather-scatter operator and a preconditioned Conjugate Gradient solver have been implemented using OpenACC for multi-GPU systems. We report a speed-up of 1.3 on a single node of a Cray XK6 when using OpenACC directives in Nek5000. On 512 nodes of the Titan supercomputer, the speed-up approaches 1.4. A performance analysis of the Nek5000 code using the Score-P and Vampir performance monitoring tools shows that overlapping GPU kernels with host-accelerator memory transfers would considerably increase the performance of the OpenACC version of the Nek5000 code.
The Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) | 2015
Jing Gong; Stefano Markidis; Michael Schliephake; Erwin Laure; Luis Cebamanos; Alistair Hart; Misun Min; Paul F. Fischer
Exascale Software and Applications Conference | 2013
Stefano Markidis; Michael Schliephake; Xavier Aguilar; David Henty; Harvey Richardson; Alistair Hart; Alan Gray; David Lecomber; Tobias Hilbrich; Jens Doleschal; Erwin Laure
Exascale Applications and Software Conference; Edinburgh, Scotland, UK, 9-11 April 2013 | 2013
Jing Gong; Alistair Hart; David Henty; Stefano Markidis; Michael Schliephake; Paul F. Fischer; Katherine Heisey