Publication


Featured research published by Werner Augustin.


European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface | 2002

On Benchmarking Collective MPI Operations

Thomas Worsch; Ralf H. Reussner; Werner Augustin

This article concentrates on recent work on benchmarking collective operations with SKaMPI. The goal of the SKaMPI project is the creation of a database containing performance measurements of parallel computers in terms of MPI operations. These data support software developers in creating portable and fast programs. Existing algorithms for measuring the timing of collective operations are discussed and a new algorithm is presented, taking into account the differences of local clocks. Results of measurements on a Cray T3E/900 and an IBM RS 6000 SP are presented.
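To illustrate why differing local clocks matter when timing collective operations, here is a minimal sketch of a common ping-pong technique for estimating the offset between two clocks. It is not the algorithm from the paper and does not use MPI. The `SimulatedLink` class, its delay values, and the function name `estimate_offset` are hypothetical and exist only for this example.

```python
class SimulatedLink:
    """Hypothetical stand-in for two nodes whose clocks differ by a fixed offset."""

    def __init__(self, offset):
        self.true_time = 0.0   # global time, invisible to real processes
        self.offset = offset   # remote clock = local clock + offset

    def ping(self, delay_there, delay_back):
        """Send a message, let the remote side timestamp it, and receive the reply."""
        t_send = self.true_time
        self.true_time += delay_there
        t_remote = self.true_time + self.offset  # timestamp taken on the remote clock
        self.true_time += delay_back
        t_recv = self.true_time
        return t_send, t_remote, t_recv


def estimate_offset(link, delays):
    """Estimate the remote clock offset from several ping-pong rounds.

    The round with the smallest round-trip time is trusted most, and the
    remote timestamp is assumed to lie midway between send and receive.
    """
    best_rtt = None
    best_offset = None
    for delay_there, delay_back in delays:
        t_send, t_remote, t_recv = link.ping(delay_there, delay_back)
        rtt = t_recv - t_send
        if best_rtt is None or rtt < best_rtt:
            best_rtt = rtt
            best_offset = t_remote - (t_send + t_recv) / 2.0
    return best_offset


link = SimulatedLink(offset=5.0)
estimate = estimate_offset(link, [(0.3, 0.1), (0.1, 0.1), (0.2, 0.4)])
```

With an offset estimate per process, a benchmark could in principle agree on a common start time and translate it into each local clock before timing a collective operation. The estimate is only exact when the two one-way delays are equal, which is why the round with the shortest round trip is preferred.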


European Conference on Parallel Processing | 2009

Optimized Stencil Computation Using In-Place Calculation on Modern Multicore Systems

Werner Augustin; Vincent Heuveline; Jan-Philipp Weiss

Numerical algorithms on parallel systems built upon modern multicore processors are facing two challenging obstacles that keep realistic applications from reaching the theoretically available compute performance. First, the parallelization on several system levels has to be exploited to the full extent. Second, provision of data to the compute cores needs to be adapted to the constraints of a hardware-controlled nested cache hierarchy with shared resources. In this paper we analyze dedicated optimization techniques on modern multicore systems for stencil kernels on regular three-dimensional grids. We combine various methods like a compressed grid algorithm with finite shifts in each time step and loop skewing into an optimized parallel in-place stencil implementation of the three-dimensional Laplacian operator. In that context, memory requirements are reduced by a factor of approximately two while considerable performance gains are observed on modern Intel and AMD based multicore systems.
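The idea of a compressed grid can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the paper's three-dimensional implementation: it applies a Jacobi-style averaging stencil, stores the grid in a single buffer with one spare cell, and writes each sweep shifted by one cell, alternating the shift direction. The function names are hypothetical.

```python
def jacobi_reference(values, steps):
    """Conventional two-array Jacobi sweep; first and last entries are fixed boundaries."""
    old = list(values)
    for _ in range(steps):
        new = list(old)
        for g in range(1, len(old) - 1):
            new[g] = 0.5 * (old[g - 1] + old[g + 1])
        old = new
    return old


def jacobi_compressed(values, steps):
    """Same sweep in one buffer of len(values) + 1 cells, shifted by one cell per step."""
    n = len(values) - 2          # number of interior points
    buf = [0.0] + list(values)   # grid point g lives at buf[g + off], off starts at 1
    off = 1
    for _ in range(steps):
        if off == 1:
            # Shift left: results land one cell below the inputs.
            buf[0] = buf[1]                      # move left boundary first
            for g in range(1, n + 1):            # ascending, so inputs are still intact
                buf[g] = 0.5 * (buf[g] + buf[g + 2])
            buf[n + 1] = buf[n + 2]              # move right boundary last
            off = 0
        else:
            # Shift right: results land one cell above the inputs.
            buf[n + 2] = buf[n + 1]              # move right boundary first
            for g in range(n, 0, -1):            # descending, so inputs are still intact
                buf[g + 1] = 0.5 * (buf[g - 1] + buf[g + 1])
            buf[1] = buf[0]                      # move left boundary last
            off = 1
    return buf[off:off + n + 2]


start = [1.0, 0.0, 0.0, 0.0, 0.0, 2.0]
result = jacobi_compressed(start, 5)
```

The traversal direction matches the shift direction, so each write only overwrites a value that no later update in the same sweep will read. The single buffer holds one cell more than the grid instead of a second full copy, which is where the roughly twofold memory saving comes from.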


IEEE International Conference on High Performance Computing, Data, and Analytics | 2009

High Performance Computing and Discrete Dislocation Dynamics: Plasticity of Micrometer Sized Specimens

D. Weygand; J. Senger; Christian Motz; Werner Augustin; Vincent Heuveline; Peter Gumbsch

A parallel discrete dislocation dynamics tool is employed to study the size dependent plasticity of small metallic structures. The tool has been parallelised using OpenMP. An excellent overall scaling is observed for different loading scenarios.


Parallel Tools Workshop | 2012

HiFlow3: A Hardware-Aware Parallel Finite Element Package

Hartwig Anzt; Werner Augustin; Martin Baumann; Thomas Gengenbach; Tobias Hahn; Andreas Helfrich-Schkarbanenko; Vincent Heuveline; Eva Ketelaer; Dimitar Lukarski; Andreas Nestler; Sebastian Ritterbusch; Staffan Ronnas; Michael Schick; Mareike Schmidtobreick; Chandramowli Subramanian; Jan-Philipp Weiss; Florian Wilhelm; Martin Wlotzka

The goal of this paper is to describe the hardware-aware parallel C++ finite element package HiFlow3. HiFlow3 aims at providing a powerful platform for simulating processes modelled by partial differential equations. Our vision is to solve boundary value problems in an appropriate way by coupling numerical simulations with modern software design and state-of-the-art hardware technologies. The main functionalities for mapping the mathematical model into parallel software are implemented in the three core modules Mesh, DoF/FEM and Linear Algebra (LA). Parallelism is realized on two levels. The modules provide efficient MPI-based distributed data structures to achieve performance on large HPC systems but also on stand-alone workstations. Additionally, the hardware-aware cross-platform approach in the LA module accelerates the solution process by exploiting the computing power from emerging technologies like multi-core CPUs and GPUs. In this context performance evaluation on different hardware-architectures will be demonstrated.


Lecture Notes in Computer Science | 2005

Benchmarking one-sided communication with SKaMPI 5

Werner Augustin; Marc-Oliver Straub; Thomas Worsch

SKaMPI is now an established benchmark for MPI implementations. Two important goals of the development of version 5 of SKaMPI were the extension of the benchmark to cover more functionality of MPI, and a redesign of the benchmark allowing it to be extended more easily. In the present paper we give an overview of the extension of SKaMPI 5 for the evaluation of one-sided communication and present a few selected results of benchmark runs, giving an impression of the breadth and depth of SKaMPI 5. A look at the source code, which is available under the GPL, reveals that it was easy to extend SKaMPI 5 with benchmarks for one-sided communication.


Lecture Notes in Computer Science | 2003

Usefulness and Usage of SKaMPI-Bench

Werner Augustin; Thomas Worsch

SKaMPI is a benchmark for measuring the performance of MPI implementations. Some examples of surprising behaviour of MPI libraries are presented. These result in certain new requirements for MPI benchmarks and will lead to major extensions in the new SKaMPI-Bench.


Archive | 2003

Benchmarking Collective Operations with SKaMPI

Thomas Worsch; Ralf H. Reussner; Werner Augustin

This article concentrates on recent work on benchmarking collective operations with SKaMPI. The goal of the SKaMPI project is the creation of a database containing performance measurements of parallel computers in terms of MPI operations. Its data support software developers in creating portable and fast programs. Existing algorithms for measuring the timing of collective operations are discussed and a new algorithm is presented, taking into account the differences of local clocks. Results of measurements on the HLRS Cray T3E/900 are presented and compared with other machines.


Archive | 2005

SKaMPI — Towards Version 5

Werner Augustin; Michael Haller; Marc-Oliver Straub; Thomas Worsch

SKaMPI is now an established benchmark for MPI implementations. The development of SKaMPI-5 strives for improvements in several directions: (i) extension of the benchmark to cover more functionality of MPI, (ii) construction of a collection of collective algorithm kernels which are not supported by core MPI collective operations, and (iii) a redesign of the SKaMPI benchmark allowing it to be extended more easily (thus matching requests from SKaMPI users).


Archive | 2008

OpenMP Parallelization of the METRAS Meteorology Model: Application to the America’s Cup

Werner Augustin; Vincent Heuveline; Günter Meschkat; K. Heinke Schlünzen; Guido Schroeder; Wolfgang E. Nagel; Dietmar Kröner; Michael M. Resch

We describe the parallelization of the meteorology model METRAS (MEsoscale TRAnsport and Stream) in the context of the America’s Cup 2007 for the South African sailing yacht Shosholoza. METRAS is a community model of the atmosphere whose development is coordinated at the Meteorological Institute, ZMAW, University of Hamburg. The parallelization, which is based on OpenMP, was done at the Steinbuch Centre for Computing (SCC) of the University of Karlsruhe and took advantage of the specific features of the Itanium-2 processors available on the local parallel computer HP XC6000. In this paper, we report on the parallelization of the meteorology model METRAS and describe how this parallelized version is being used in the highly challenging context of the America’s Cup.


Preprint Series of the Engineering Mathematics and Computing Lab | 2013

HiFlow3 -- A Flexible and Hardware-Aware Parallel Finite Element Package

Hartwig Anzt; Werner Augustin; Martin Baumann; Hendryk Bockelmann; Thomas Gengenbach; Tobias Hahn; Vincent Heuveline; Eva Ketelaer; Dimitar Lukarski; Andrea Otzen; Sebastian Ritterbusch; Björn Rocker; Staffan Ronnas; Michael Schick; Chandramowli Subramanian; Jan-Philipp Weiss; Florian Wilhelm

Collaboration


Dive into Werner Augustin's collaborations.

Top Co-Authors

Thomas Worsch, Karlsruhe Institute of Technology

Jan-Philipp Weiss, Karlsruhe Institute of Technology

Chandramowli Subramanian, Karlsruhe Institute of Technology

Eva Ketelaer, Karlsruhe Institute of Technology

Florian Wilhelm, Karlsruhe Institute of Technology

Marc-Oliver Straub, Karlsruhe Institute of Technology

Michael Schick, Heidelberg Institute for Theoretical Studies

Sebastian Ritterbusch, Karlsruhe Institute of Technology

Staffan Ronnas, Karlsruhe Institute of Technology