Dirk Ribbrock
Technical University of Dortmund
Publications
Featured research published by Dirk Ribbrock.
Journal of Computational Physics | 2013
Dominik Göddeke; Dimitri Komatitsch; Markus Geveler; Dirk Ribbrock; Nikola Rajovic; Nikola Puzovic; Alex Ramirez
Power consumption and energy efficiency are becoming critical aspects in the design and operation of large-scale HPC facilities, and it is unanimously recognised that future exascale supercomputers will be strongly constrained by their power requirements. At current electricity costs, operating an HPC system over its lifetime can already be on par with the initial deployment cost. These power consumption constraints, and the benefits a more energy-efficient HPC platform may have on other societal areas, have motivated the HPC research community to investigate the use of energy-efficient technologies originally developed for the embedded and especially mobile markets. However, lower power does not always mean lower energy consumption, since execution time often also increases. In order to achieve competitive performance, applications then need to efficiently exploit a larger number of processors. In this article, we discuss how applications can efficiently exploit this new class of low-power architectures to achieve competitive performance, and we evaluate whether they can benefit from the increased energy efficiency that the architecture is supposed to deliver. The applications that we consider cover three different classes of numerical solution methods for partial differential equations, namely a low-order finite element multigrid solver for huge sparse linear systems of equations, a Lattice-Boltzmann code for fluid simulation, and a high-order spectral element method for acoustic or seismic wave propagation modelling. We evaluate weak and strong scalability on a cluster of 96 ARM Cortex-A9 dual-core processors and demonstrate that the ARM-based cluster can be more efficient in terms of energy to solution when executing the three applications compared to an x86-based reference machine.
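The trade-off at the heart of this abstract is easy to make concrete: energy to solution is average power times time to solution, E = P * T, so a machine that draws far less power but runs longer can still win on energy. A minimal sketch; the power and runtime figures are made up for the example and are not measurements from the paper:

```cpp
// Illustrative only: energy to solution E = P_avg * T_solve.
// The figures below are invented, not measurements from the paper.
#include <iostream>

struct Run {
    const char* machine;
    double avg_power_watts;    // average power draw during the solve
    double time_to_solution_s; // wall-clock time of the solve
};

int main() {
    Run runs[] = {
        {"ARM cluster (hypothetical)", 400.0, 900.0},
        {"x86 reference (hypothetical)", 1500.0, 300.0},
    };
    for (const Run& r : runs) {
        double energy_kj = r.avg_power_watts * r.time_to_solution_s / 1000.0;
        std::cout << r.machine << ": " << energy_kj << " kJ to solution\n";
    }
    // 400 W * 900 s = 360 kJ < 1500 W * 300 s = 450 kJ: the slower,
    // lower-power machine still needs less energy to solution.
}
```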
Journal of Computational Science | 2011
Markus Geveler; Dirk Ribbrock; Sven Mallach; Dominik Göddeke
We present a software approach to hardware-oriented numerics which builds upon an augmented, previously published set of open-source libraries facilitating portable code development and optimisation on a wide range of modern computer architectures. In order to maximise efficiency, we exploit all levels of parallelism, including vectorisation within CPU cores, the Cell BE and GPUs, shared memory thread-level parallelism between cores, and parallelism between heterogeneous distributed memory resources in clusters. To evaluate and validate our approach, we implement a collection of modular building blocks for the easy and fast assembly and development of CFD applications based on the shallow water equations: we combine the Lattice-Boltzmann method with fluid-structure interaction techniques in order to achieve real-time simulations targeting interactive virtual environments. Our results demonstrate that recent multi-core CPUs outperform the Cell BE, while GPUs are significantly faster than conventional multi-threaded SSE code. In addition, we verify good scalability properties of our application on small clusters.
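For readers unfamiliar with the method class, the sketch below shows a single Lattice-Boltzmann collide-and-stream step (D2Q9 lattice, BGK collision, periodic boundaries). Grid sizes, names and the relaxation time are illustrative; the paper's kernels are additionally vectorised, multi-threaded and offloaded to accelerators:

```cpp
// Minimal D2Q9 BGK Lattice-Boltzmann sketch: one collide-and-stream step
// on a periodic lattice. Illustrative only; no boundaries or forcing.
#include <array>
#include <vector>

constexpr int NX = 64, NY = 64, Q = 9;
constexpr int ex[Q] = {0, 1, 0,-1, 0, 1,-1,-1, 1};
constexpr int ey[Q] = {0, 0, 1, 0,-1, 1, 1,-1,-1};
constexpr double w[Q] = {4.0/9, 1.0/9, 1.0/9, 1.0/9, 1.0/9,
                         1.0/36, 1.0/36, 1.0/36, 1.0/36};
constexpr double tau = 0.6; // BGK relaxation time (assumed value)

using Field = std::vector<std::array<double, Q>>; // one distribution set per cell

void collide_and_stream(const Field& f, Field& f_new) {
    for (int y = 0; y < NY; ++y)
    for (int x = 0; x < NX; ++x) {
        const auto& fc = f[y * NX + x];
        // Macroscopic density and velocity from the distributions.
        double rho = 0, ux = 0, uy = 0;
        for (int i = 0; i < Q; ++i) {
            rho += fc[i]; ux += ex[i] * fc[i]; uy += ey[i] * fc[i];
        }
        ux /= rho; uy /= rho;
        for (int i = 0; i < Q; ++i) {
            // BGK collision: relax towards the local equilibrium.
            double eu = ex[i] * ux + ey[i] * uy;
            double feq = w[i] * rho *
                (1.0 + 3.0 * eu + 4.5 * eu * eu - 1.5 * (ux * ux + uy * uy));
            double post = fc[i] - (fc[i] - feq) / tau;
            // Streaming: push to the neighbour in direction i (periodic wrap).
            int xn = (x + ex[i] + NX) % NX, yn = (y + ey[i] + NY) % NY;
            f_new[yn * NX + xn][i] = post;
        }
    }
}
```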
Computer Physics Communications | 2009
Danny van Dyk; Markus Geveler; Sven Mallach; Dirk Ribbrock; Dominik Göddeke; Carsten Gutwenger
We present HONEI, an open-source collection of libraries offering a hardware-oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a finite element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI's libraries, we achieve a two-fold speedup over straightforward C++ code using HONEI's SSE backend, and an additional 3-4 and 4-16 times faster execution on the Cell and a GPU, respectively. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for the development and evaluation of such kernels, significantly simplifying their development.
Program summary
Program title: HONEI
Catalogue identifier: AEDW_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDW_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPLv2
No. of lines in distributed program, including test data, etc.: 216 180
No. of bytes in distributed program, including test data, etc.: 1 270 140
Distribution format: tar.gz
Programming language: C++
Computer: x86, x86_64, NVIDIA CUDA GPUs, Cell blades and PlayStation 3
Operating system: Linux
RAM: at least 500 MB free
Classification: 4.8, 4.3, 6.1
External routines: SSE: none; [1] for GPU, [2] for Cell backend
Nature of problem: Computational science in general and numerical simulation in particular have reached a turning point. The revolution developers are facing is not primarily driven by a change in (problem-specific) methodology, but rather by the fundamental paradigm shift of the underlying hardware towards heterogeneity and parallelism. This is particularly relevant for data-intensive problems stemming from discretisations with local support, such as finite differences, volumes and elements.
Solution method: To address these issues, we present a hardware-aware collection of libraries combining the advantages of modern software techniques and hardware-oriented programming. Applications built on top of these libraries can be configured trivially to execute on CPUs, GPUs or the Cell processor. In order to evaluate the performance and accuracy of our approach, we provide two domain-specific applications: a multigrid solver for the Poisson problem and a fully explicit solver for the 2D shallow water equations.
Restrictions: HONEI is actively being developed, and its feature list is continuously expanded. Not all combinations of operations and architectures might be supported in earlier versions of the code. Obtaining snapshots from http://www.honei.org is recommended.
Unusual features: The considered applications as well as all library operations can be run on NVIDIA GPUs and the Cell BE.
Running time: Depends on the application and the input sizes. The Poisson solver executes in a few seconds, while the SWE solver requires up to 5 minutes for large spatial discretisations or small timesteps.
References: [1] http://www.nvidia.com/cuda [2] http://www.ibm.com/developerworks/power/cell
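The hardware abstraction described above can be pictured as compile-time tag dispatch: operations are templated on an architecture tag, and each backend specialises the operation with an optimised kernel, so application code only swaps the tag. The sketch below follows this general idea; the tag and class names are illustrative and not HONEI's actual API:

```cpp
// Sketch of tag-dispatch hardware abstraction: the backend is selected at
// compile time via an architecture tag. Names are illustrative only.
#include <cstddef>
#include <vector>

namespace tags { struct CPU {}; struct SSE {}; struct GPU {}; }

template <typename Tag> struct Sum;

// Generic reference backend.
template <> struct Sum<tags::CPU> {
    static void value(std::vector<float>& x, const std::vector<float>& y) {
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += y[i];
    }
};

// A specialised backend provides the same interface with an optimised
// kernel (SSE intrinsics, a CUDA launch, ...); we fall back for brevity.
template <> struct Sum<tags::SSE> {
    static void value(std::vector<float>& x, const std::vector<float>& y) {
        Sum<tags::CPU>::value(x, y); // placeholder for the vectorised kernel
    }
};

int main() {
    std::vector<float> a(1024, 1.0f), b(1024, 2.0f);
    Sum<tags::SSE>::value(a, b); // application code only changes the tag
}
```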
Facing the multicore-challenge | 2010
Markus Geveler; Dirk Ribbrock; Dominik Göddeke; Stefan Turek
We present an efficient method for the simulation of laminar fluid flows with free surfaces including their interaction with moving rigid bodies, based on the two-dimensional shallow water equations and the Lattice-Boltzmann method. Our implementation targets multiple fundamentally different architectures such as commodity multicore CPUs with SSE, GPUs, the Cell BE and clusters. We show that our code scales well on an MPI-based cluster, that modern GPUs achieve an eightfold speedup over multithreaded CPU code, and, finally, that fluid-structure interaction scenarios can be solved at high resolution at interactive rates.
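The underlying model referred to here is the 2D shallow water system in its standard conservative form (source terms such as bed friction are omitted in this sketch):

```latex
\begin{aligned}
\partial_t h + \partial_x(hu) + \partial_y(hv) &= 0,\\
\partial_t(hu) + \partial_x\!\left(hu^2 + \tfrac{1}{2}gh^2\right) + \partial_y(huv) &= -gh\,\partial_x b,\\
\partial_t(hv) + \partial_x(huv) + \partial_y\!\left(hv^2 + \tfrac{1}{2}gh^2\right) &= -gh\,\partial_y b,
\end{aligned}
```

where h is the water depth, (u, v) the depth-averaged velocity, g the gravitational acceleration and b the bed topography.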
european conference on parallel processing | 2014
Peter Bastian; Christian Engwer; Dominik Göddeke; Oleg Iliev; Olaf Ippisch; Mario Ohlberger; Stefan Turek; Jorrit Fahlke; Sven Kaulmann; Steffen Müthing; Dirk Ribbrock
In the EXA-DUNE project we strive to (i) develop and implement numerical algorithms for solving PDE problems efficiently on heterogeneous architectures, (ii) provide corresponding domain-specific abstractions that allow application scientists to effectively use these methods, and (iii) demonstrate performance on porous media flow problems. In this paper, we present first results on the hybrid parallelisation of sparse linear algebra, system and RHS assembly, the implementation of multiscale finite element methods and the SIMD performance of high-order discontinuous Galerkin methods within an application scenario.
ENUMATH | 2015
Steffen Müthing; Dirk Ribbrock; Dominik Göddeke
A major challenge in PDE software is the balance between user-level flexibility and performance on heterogeneous hardware. We discuss how this challenge can be tackled, using the DUNE framework, in particular its linear algebra and solver components, as an example. We demonstrate how the former MPI-only implementation is modified to support MPI+[CPU/GPU] threading and vectorisation. To this end, we devise a novel block extension of the recently proposed SELL-C-σ format. The efficiency of our approach is underlined by benchmark computations that exhibit reasonable speedups over the MPI-only CPU case.
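The SELL-C-σ format mentioned here (due to Kreutzer et al.) sorts rows by length within windows of σ rows, groups them into chunks of C rows, and stores each chunk column-major, padded to the length of its longest row, so that C rows are processed in lockstep, which maps well onto SIMD units and GPUs. Below is a minimal scalar sketch of the resulting sparse matrix-vector product; field names are illustrative, and the block extension devised in the paper is not shown:

```cpp
// Scalar sketch of SpMV in the SELL-C-sigma format. Padding entries carry
// a zero value and a valid (e.g. 0) column index; y must have
// n_chunks * C entries, be zero-initialised, and is produced in the
// row ordering induced by the sigma-sorting.
#include <vector>

struct SellCSigma {
    int C;                      // chunk height (e.g. the SIMD width)
    int n_chunks;
    std::vector<int> chunk_ptr; // start of each chunk in vals/cols
    std::vector<int> chunk_len; // padded row length per chunk
    std::vector<int> cols;      // column indices, chunk-wise column-major
    std::vector<double> vals;   // values in the same layout
};

void spmv(const SellCSigma& A, const std::vector<double>& x,
          std::vector<double>& y) {
    for (int c = 0; c < A.n_chunks; ++c) {
        int base = A.chunk_ptr[c];
        for (int j = 0; j < A.chunk_len[c]; ++j)  // walk along the rows
            for (int r = 0; r < A.C; ++r) {       // C rows in lockstep
                int idx = base + j * A.C + r;     // column-major in chunk
                y[c * A.C + r] += A.vals[idx] * x[A.cols[idx]];
            }
    }
}
```

The inner loop over r touches C consecutive entries of vals, cols and y, which is exactly the access pattern a compiler (or a GPU thread block) can vectorise without gather overhead within the chunk.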
Software for Exascale Computing | 2016
Peter Bastian; Christian Engwer; Jorrit Fahlke; Markus Geveler; Dominik Göddeke; Oleg Iliev; Olaf Ippisch; René Milk; Jan Mohring; Steffen Müthing; Mario Ohlberger; Dirk Ribbrock; Stefan Turek
We present advances concerning efficient finite element assembly and linear solvers on current and upcoming HPC architectures obtained within the Exa-Dune project, part of the DFG priority program 1648 Software for Exascale Computing (SPPEXA). In this project, we aim at the development of both flexible and efficient hardware-aware software components for the solution of PDEs based on the DUNE platform and the FEAST library. In this contribution, we focus on node-level performance and accelerator integration, which will complement the proven MPI-level scalability of the framework. The higher-level aspects of the Exa-Dune project, in particular multiscale methods and uncertainty quantification, are detailed in the companion paper (Bastian et al., Advances concerning multiscale methods and uncertainty quantification in Exa-Dune. In: Proceedings of the SPPEXA Symposium, 2016).
parallel computing | 2015
Dominik Göddeke; Mirco Altenbernd; Dirk Ribbrock
Highlights: fault-tolerant and robust multigrid methods; hierarchical finite element compression; asynchronous checkpointing with local restart.
We analyse novel fault tolerance schemes for data loss in multigrid solvers, which essentially combine ideas of checkpoint-restart with algorithm-based fault tolerance. To improve efficiency compared to conventional global checkpointing, we exploit the inherent data compression of the multigrid hierarchy, and relax the synchronicity requirement through a local-failure local-recovery approach. We experimentally identify the root cause of convergence degradation in the presence of data loss using smoothness considerations. Our resulting schemes form a family of techniques that can be tailored to the expected error probability of (future) large-scale machines. A performance model gives further insight into the benefits and applicability of our techniques.
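The compression idea can be sketched in a few lines: instead of checkpointing the full fine-grid iterate, store its restriction to a coarser level of the multigrid hierarchy; after data loss, prolongate the checkpoint back and continue iterating, letting the solver smooth away the recovery error. The 1D full-weighting and linear-interpolation stencils below are the textbook choices and purely illustrative; the paper works with finite element hierarchies:

```cpp
// Sketch of a hierarchically compressed checkpoint: store the restriction
// of the iterate to a coarser grid; recover by prolongation. Lossy, but
// a multigrid solver removes the recovery error in a few extra cycles.
#include <vector>

// Restrict a fine vector (size 2m+1) to a coarse one (size m+1), full weighting.
std::vector<double> restrict_fw(const std::vector<double>& fine) {
    std::size_t m = (fine.size() - 1) / 2;
    std::vector<double> coarse(m + 1);
    coarse.front() = fine.front(); coarse.back() = fine.back();
    for (std::size_t i = 1; i < m; ++i)
        coarse[i] = 0.25 * fine[2*i - 1] + 0.5 * fine[2*i] + 0.25 * fine[2*i + 1];
    return coarse;
}

// Prolongate a coarse vector back to the fine grid by linear interpolation.
std::vector<double> prolongate(const std::vector<double>& coarse) {
    std::size_t m = coarse.size() - 1;
    std::vector<double> fine(2 * m + 1);
    for (std::size_t i = 0; i <= m; ++i) fine[2*i] = coarse[i];
    for (std::size_t i = 0; i < m; ++i)
        fine[2*i + 1] = 0.5 * (coarse[i] + coarse[i + 1]);
    return fine;
}

int main() {
    std::vector<double> u(129, 1.0);       // current fine-grid iterate
    auto checkpoint = restrict_fw(u);      // roughly half the data to store
    // ... data loss on this rank ...
    u = prolongate(checkpoint);            // local recovery, then keep iterating
}
```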
international conference on conceptual structures | 2010
Dirk Ribbrock; Markus Geveler; Dominik Göddeke; Stefan Turek
We present different kernels based on Lattice-Boltzmann methods for the solution of the two-dimensional shallow water and Navier-Stokes equations on fully structured lattices. The functionality ranges from simple scenarios like open-channel flows with planar beds to simulations with complex scene geometries like solid obstacles and non-planar bed topography with dry states, and even interaction of the fluid with floating objects. The kernels are integrated into a hardware-oriented collection of libraries targeting multiple fundamentally different parallel hardware architectures like commodity multicore CPUs, the Cell BE, NVIDIA GPUs and clusters. We provide an algorithmic study which compares the different solvers in terms of performance and numerical accuracy in view of their capabilities and their specific implementation and optimisation on the different architectures. We show that an eightfold speedup over optimised multithreaded CPU code can be obtained with the GPU using basic methods, and that even very complex flow phenomena can be simulated with significant speedups without loss of accuracy.
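What distinguishes a shallow water LBM kernel from a standard Navier-Stokes one is chiefly the equilibrium distribution, which encodes the hydrostatic pressure term of the shallow water equations. The sketch below follows the D2Q9 formulation commonly attributed to Zhou; whether the paper's kernels use exactly this form is an assumption:

```cpp
// D2Q9 equilibrium for the shallow water LBM (following Zhou's commonly
// cited formulation; assumed, not taken from the paper). h is the water
// depth, (ux, uy) the depth-averaged velocity, g gravity and e = dx/dt
// the lattice speed. Depth and momentum are recovered as h = sum_i f_i
// and h*u = sum_i e_i f_i.
double swe_equilibrium(int i, double h, double ux, double uy,
                       double g, double e) {
    static const int ex[9] = {0, 1, 0,-1, 0, 1,-1,-1, 1};
    static const int ey[9] = {0, 0, 1, 0,-1, 1, 1,-1,-1};
    double e2 = e * e, u2 = ux * ux + uy * uy;
    double eu = (ex[i] * ux + ey[i] * uy) * e;  // e_i . u, |components| = e
    if (i == 0)                                  // rest population
        return h * (1.0 - 5.0 * g * h / (6.0 * e2) - 2.0 * u2 / (3.0 * e2));
    if (i <= 4)                                  // axis-aligned directions
        return h * (g * h / (6.0 * e2) + eu / (3.0 * e2)
                    + eu * eu / (2.0 * e2 * e2) - u2 / (6.0 * e2));
    return h * (g * h / (24.0 * e2) + eu / (12.0 * e2)  // diagonal directions
                + eu * eu / (8.0 * e2 * e2) - u2 / (24.0 * e2));
}
```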
Software for Exascale Computing | 2016
Peter Bastian; Christian Engwer; Jorrit Fahlke; Markus Geveler; Dominik Göddeke; Oleg Iliev; Olaf Ippisch; René Milk; Jan Mohring; Steffen Müthing; Mario Ohlberger; Dirk Ribbrock; Stefan Turek
In this contribution we present advances concerning efficient parallel multiscale methods and uncertainty quantification that have been obtained within the funded project Exa-Dune, part of the DFG priority program 1648 Software for Exascale Computing (SPPEXA). This project aims at the development of flexible but nevertheless hardware-specific software components and scalable high-level algorithms for the solution of partial differential equations based on the DUNE platform. While the development of hardware-based concepts and software components is detailed in the companion paper (Bastian et al., Hardware-based efficiency advances in the Exa-Dune project. In: Proceedings of the SPPEXA Symposium 2016, Munich, 25-27 Jan 2016), we focus here on the development of scalable multiscale methods in the context of uncertainty quantification. Such problems add additional layers of coarse-grained parallelism, as they require the solution of many local or global partial differential equations in parallel that are only weakly coupled.
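The coarse-grained parallelism described in the last sentence can be pictured as follows: in sampling-based uncertainty quantification, each sample of the uncertain input requires an independent PDE solve, and only a final reduction couples the samples. A minimal sketch with a placeholder solve_pde(), which stands in for a full solver and is not part of DUNE's or any real API:

```cpp
// Sketch of sampling-based UQ as an outer, weakly coupled parallel layer:
// every sample triggers an independent solve; only the reduction couples them.
#include <future>
#include <random>
#include <vector>

// Placeholder for an expensive deterministic PDE solve with a sampled
// coefficient; returns a scalar quantity of interest.
double solve_pde(double sampled_coefficient) {
    return sampled_coefficient * sampled_coefficient; // stand-in computation
}

int main() {
    std::mt19937 rng(42);
    std::lognormal_distribution<double> coeff(0.0, 1.0); // uncertain input
    const int n_samples = 64;

    // Launch the independent solves concurrently.
    std::vector<std::future<double>> runs;
    for (int s = 0; s < n_samples; ++s)
        runs.push_back(std::async(std::launch::async, solve_pde, coeff(rng)));

    // Only this reduction couples the samples.
    double sum = 0.0;
    for (auto& r : runs) sum += r.get();
    double mean_qoi = sum / n_samples;
    (void)mean_qoi; // the estimated mean quantity of interest
}
```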