Publication


Featured research published by Robert Speck.


Computer Physics Communications | 2012

A massively parallel, multi-disciplinary Barnes-Hut tree code for extreme-scale N-body simulations

Mathias Winkel; Robert Speck; Helge Hübner; Lukas Arnold; Rolf Krause; Paul Gibbon

The efficient parallelization of fast multipole-based algorithms for the N-body problem is one of the most challenging topics in high performance scientific computing. The emergence of non-local, irregular communication patterns generated by these algorithms can easily create an insurmountable bottleneck on supercomputers with hundreds of thousands of cores. To overcome this obstacle we have developed an innovative parallelization strategy for Barnes–Hut tree codes on present and upcoming HPC multicore architectures. This scheme, based on a combined MPI–Pthreads approach, permits an efficient overlap of computation and data exchange. We highlight the capabilities of this method on the full IBM Blue Gene/P system JUGENE at Jülich Supercomputing Centre and demonstrate scaling across 294,912 cores with up to 2,048,000,000 particles. Applying our implementation PEPC to laser–plasma interaction and vortex particle methods close to the continuum limit, we demonstrate its potential for ground-breaking advances in large-scale particle simulations.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2012

A massively space-time parallel N-body solver

Robert Speck; Daniel Ruprecht; Rolf Krause; Matthew Emmett; Michael L. Minion; Mathias Winkel; Paul Gibbon

We present a novel space-time parallel version of the Barnes-Hut tree code PEPC using PFASST, the Parallel Full Approximation Scheme in Space and Time. The naive use of ever more processors for a fixed-size N-body problem is prone to saturate as soon as the number of unknowns per core becomes too small. To overcome this intrinsic strong-scaling limit, we introduce temporal parallelism on top of PEPC's existing hybrid MPI/Pthreads spatial decomposition. Here, we use PFASST, which combines the iterations of the parallel-in-time algorithm parareal with the sweeps of spectral deferred correction (SDC) schemes. By combining these sweeps with multiple space-time discretization levels, PFASST relaxes the theoretical bound on parallel efficiency in parareal. We present results from runs on up to 262,144 cores on the IBM Blue Gene/P installation JUGENE, demonstrating that the space-time parallel code provides speedup beyond the saturation of the purely space-parallel approach.


BIT Numerical Mathematics | 2015

A multi-level spectral deferred correction method

Robert Speck; Daniel Ruprecht; Matthew Emmett; Michael L. Minion; Matthias Bolten; Rolf Krause

The spectral deferred correction (SDC) method is an iterative scheme for computing a higher-order collocation solution to an ODE by performing a series of correction sweeps using a low-order timestepping method. This paper examines a variation of SDC for the temporal integration of PDEs called multi-level spectral deferred corrections (MLSDC), where sweeps are performed on a hierarchy of levels and an FAS correction term, as in nonlinear multigrid methods, couples solutions on different levels. Three different strategies to reduce the computational cost of correction sweeps on the coarser levels are examined: reducing the degrees of freedom, reducing the order of the spatial discretization, and reducing the accuracy when solving linear systems arising in implicit temporal integration. Several numerical examples demonstrate the effect of multi-level coarsening on the convergence and cost of SDC integration. In particular, MLSDC can provide significant savings in compute time compared to SDC for a three-dimensional problem.
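
The correction sweeps described in this abstract can be illustrated with a minimal single-level sketch. It is not the multi-level MLSDC scheme of the paper and not the authors' code: it assumes a scalar ODE, three Gauss-Lobatto nodes on one time step, and explicit Euler as the low-order method. The names `NODES`, `Q`, and `sdc_step` are hypothetical.

```python
# Three Gauss-Lobatto collocation nodes on the unit interval (assumed setup).
NODES = [0.0, 0.5, 1.0]

# Q[m][j] = integral from 0 to NODES[m] of the j-th Lagrange basis polynomial.
Q = [
    [0.0, 0.0, 0.0],
    [5.0 / 24.0, 8.0 / 24.0, -1.0 / 24.0],
    [1.0 / 6.0, 4.0 / 6.0, 1.0 / 6.0],
]


def sdc_step(f, t0, u0, dt, sweeps):
    """Advance u' = f(t, u) over one step of size dt with explicit-Euler SDC sweeps."""
    m_nodes = len(NODES)
    times = [t0 + dt * tau for tau in NODES]
    u = [u0] * m_nodes  # initial guess: copy the initial value to every node
    for _ in range(sweeps):
        f_old = [f(times[m], u[m]) for m in range(m_nodes)]
        u_new = [u0]
        for m in range(m_nodes - 1):
            dtau = dt * (NODES[m + 1] - NODES[m])
            # Node-to-node integral of the polynomial interpolating the old right-hand side.
            integral = dt * sum(
                (Q[m + 1][j] - Q[m][j]) * f_old[j] for j in range(m_nodes)
            )
            # Low-order (explicit Euler) correction using the updated value.
            correction = dtau * (f(times[m], u_new[m]) - f_old[m])
            u_new.append(u_new[m] + correction + integral)
        u = u_new
    return u[-1]
```

With a constant initial guess, a single sweep reproduces explicit Euler on the node-to-node substeps, and repeated sweeps approach the solution of the three-node collocation problem, for example `sdc_step(lambda t, u: -u, 0.0, 1.0, 0.5, 20)`.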


IEEE Transactions on Plasma Science | 2010

Progress in Mesh-Free Plasma Simulation With Parallel Tree Codes

Paul Gibbon; Robert Speck; Anupam Karmakar; Lukas Arnold; Wolfgang Frings; Benjamin Berberich; Detlef Reiter; Martin Mašek

Recent developments in mesh-free plasma modeling using parallel tree codes are described, covering algorithmic and performance issues and how to apply this technique to multidimensional electrostatic plasma problems. Examples are given of simulations of ion acceleration by high-intensity lasers, heating, and the dynamics of nanostructured targets, as well as more recent applications of this technique to simulations of edge plasmas in tokamaks and to mesh-free magnetoinductive models.


SIAM Journal on Scientific Computing | 2015

Interweaving PFASST and Parallel Multigrid

Michael L. Minion; Robert Speck; Matthias Bolten; Matthew Emmett; Daniel Ruprecht

The parallel full approximation scheme in space and time (PFASST) introduced by Emmett and Minion in 2012 is an iterative strategy for the temporal parallelization of ODEs and discretized PDEs. As the name suggests, PFASST is similar in spirit to a space-time full approximation scheme multigrid method performed over multiple time steps in parallel. However, since the original focus of PFASST was on the performance of the method in terms of time parallelism, any spatial system arising from the use of implicit or semi-implicit temporal methods within PFASST has simply been assumed to be solved completely, to some desired accuracy, at each substep and each iteration by some unspecified procedure. It is hence natural to investigate how iterative solvers in the spatial dimensions can be interwoven with the PFASST iterations and whether this strategy leads to a more efficient overall approach. This paper presents an initial investigation on the relative performance of different strategies for cou...


arXiv: Numerical Analysis | 2016

Inexact Spectral Deferred Corrections

Robert Speck; Daniel Ruprecht; Michael L. Minion; Matthew Emmett; Rolf Krause

Implicit integration methods based on collocation are attractive for a number of reasons, e.g. their ideal (for Gauss-Legendre nodes) or near ideal (Gauss-Radau or Gauss-Lobatto nodes) order and stability properties. However, straightforward application of a collocation formula with M nodes to an initial value problem with dimension d requires the solution of one large Md × Md system of nonlinear equations.


Journal of Computational Science | 2011

Towards a petascale tree code: Scaling and efficiency of the PEPC library

Robert Speck; Lukas Arnold; Paul Gibbon

The highly scalable parallel tree code PEPC for rapid computation of long-range (1/r) Coulomb forces is presented. It can be used as a library for applications involving electrostatics or Newtonian gravity in 3D. The code is based on the hashed oct-tree algorithm, in which particle coordinates are projected onto a space-filling curve prior to sorting and construction of multipole moments. However, standard particle sorting techniques can ultimately limit the scalability of such algorithms for thousands of cores, a bottleneck which can be alleviated by a recursive sort scheme specially adapted to the Morton curve. More serious limitations of the original locally essential tree concept of Salmon and Warren, which ultimately lead to a failure in memory scaling, are identified and analyzed rigorously. Benchmarks for the code on the IBM Blue Gene/P JUGENE are presented which demonstrate scaling for multi-million particle systems on up to 8192 cores.
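
The projection of particle coordinates onto a Morton space-filling curve mentioned in this abstract can be sketched as bit interleaving. This is a serial illustration under assumed conventions (unit-cube coordinates, 10 bits per dimension), not PEPC's implementation or its recursive parallel sort. The names `morton_key_3d` and `sort_particles_by_morton` are hypothetical.

```python
def morton_key_3d(ix, iy, iz, bits=10):
    """Interleave the lowest `bits` bits of three non-negative integer cell indices."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def sort_particles_by_morton(positions, bits=10):
    """Sort (x, y, z) tuples from the unit cube [0, 1)^3 along the Morton curve."""
    cells = 1 << bits  # number of grid cells per dimension

    def key(position):
        # Map each coordinate to an integer cell index, clamped to the grid.
        ix, iy, iz = (min(int(c * cells), cells - 1) for c in position)
        return morton_key_3d(ix, iy, iz, bits)

    return sorted(positions, key=key)
```

Particles that are close in space tend to receive nearby keys, which is why a sort on these keys is a natural first step before building the tree and its multipole moments.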


arXiv: Numerical Analysis | 2015

Convergence of Parareal for the Navier-Stokes Equations Depending on the Reynolds Number

Johannes Steiner; Daniel Ruprecht; Robert Speck; Rolf Krause

The paper presents first a linear stability analysis for the time-parallel Parareal method, using an IMEX Euler as coarse and a Runge-Kutta-3 method as fine propagator, confirming that dominant imaginary eigenvalues negatively affect Parareal’s convergence. This suggests that when Parareal is applied to the nonlinear Navier-Stokes equations, problems for small viscosities could arise. Numerical results for a driven cavity benchmark are presented, confirming that Parareal’s convergence can indeed deteriorate as viscosity decreases and the flow becomes increasingly dominated by convection. The effect is found to strongly depend on the spatial resolution.
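
The Parareal iteration analyzed in this abstract combines a cheap coarse propagator with an accurate fine propagator. The sketch below is a serial illustration for a scalar ODE and simplifies the paper's setup: both propagators are explicit Euler with different step counts, rather than IMEX Euler and Runge-Kutta-3. The names `euler_steps` and `parareal` and all parameter defaults are assumptions.

```python
def euler_steps(f, y, t0, t1, n):
    """Integrate y' = f(t, y) from t0 to t1 with n explicit Euler steps."""
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, n_slices, n_iter, coarse_steps=1, fine_steps=100):
    """Return the Parareal approximations at the time-slice boundaries."""
    dt = (t1 - t0) / n_slices
    ts = [t0 + n * dt for n in range(n_slices + 1)]

    def coarse(y, n):
        return euler_steps(f, y, ts[n], ts[n + 1], coarse_steps)

    def fine(y, n):
        return euler_steps(f, y, ts[n], ts[n + 1], fine_steps)

    # Initial guess: one serial sweep of the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], n))

    for _ in range(n_iter):
        # These evaluations are independent per slice; a real code runs them in parallel.
        fine_old = [fine(u[n], n) for n in range(n_slices)]
        coarse_old = [coarse(u[n], n) for n in range(n_slices)]
        # Serial correction sweep with the coarse propagator.
        u_new = [y0]
        for n in range(n_slices):
            u_new.append(coarse(u_new[n], n) + fine_old[n] - coarse_old[n])
        u = u_new
    return u
```

After as many iterations as there are time slices, the result matches the serial fine solution up to rounding, so a speedup is only possible when far fewer iterations suffice. That is the convergence behavior the paper studies as viscosity decreases.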


Computing and Visualization in Science | 2015

Numerical simulation of skin transport using Parareal

Andreas Kreienbuehl; Arne Naegel; Daniel Ruprecht; Robert Speck; Gabriel Wittum; Rolf Krause

In silico investigation of skin permeation is an important but also computationally demanding problem. To resolve all scales involved in full detail will not only require exascale computing capacities but also suitable parallel algorithms. This article investigates the applicability of the time-parallel Parareal algorithm to a brick and mortar setup, a precursory problem to skin permeation. The C++ library Lib4PrM implementing Parareal is combined with the UG4 simulation framework, which provides the spatial discretization and parallelization. The combination’s performance is studied with respect to convergence and speedup. It is confirmed that anisotropies in the domain and jumps in diffusion coefficients only have a minor impact on Parareal’s convergence. The influence of load imbalances in time due to differences in number of iterations required by the spatial solver as well as spatio-temporal weak scaling is discussed.


Numerical Linear Algebra With Applications | 2017

A multigrid perspective on the parallel full approximation scheme in space and time

Matthias Bolten; Dieter Moser; Robert Speck

For the numerical solution of time-dependent partial differential equations, time-parallel methods have recently been shown to provide a promising way to extend prevailing strong-scaling limits of numerical codes. One of the most complex methods in this field is the “Parallel Full Approximation Scheme in Space and Time” (PFASST). PFASST already shows promising results for many use cases and benchmarks. However, a solid and reliable mathematical foundation is still missing. We show that, under certain assumptions, the PFASST algorithm can be conveniently and rigorously described as a multigrid-in-time method. Following this equivalence, first steps towards a comprehensive analysis of PFASST using blockwise local Fourier analysis are taken. The theoretical results are applied to examples of diffusive and advective type.

Collaboration


Robert Speck's collaborations.

Top Co-Authors

Matthew Emmett

Lawrence Berkeley National Laboratory

Paul Gibbon

Forschungszentrum Jülich

Michael L. Minion

University of North Carolina at Chapel Hill

Dieter Moser

Forschungszentrum Jülich

Lukas Arnold

Forschungszentrum Jülich

Anupam Karmakar

Forschungszentrum Jülich
