Christian Simmendinger

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Christian Simmendinger is active.

Explore More

Publication

Featured researches published by Christian Simmendinger.

2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing | 2010

License Management in Grid and Cloud Computing

Yona Raekow; Christian Simmendinger; Piotr Grabowski; Domenic Jenz

The lack of license management schemes in dis- tributed environments is becoming a major obstacle for the commercial adoption of Grid or Cloud infrastructures. In this paper, we present a complete license management architecture that enables a pay-per-use license management which can be deployed together with an on-demand computing scenario. Our architecture enables authenticated access to a remote license server. The license management architecture can be deployed in any distributed environment. It supports existing client/server based software license management tools (for example FlexNet Publisher). This allows an easy transition from current software license business models which support only a local license management towards business models which support license management in distributed environments.

Facing the Multicore-Challenge | 2013

GASPI – A Partitioned Global Address Space Programming Interface

Thomas Alrutz; Jan Backhaus; Thomas Brandes; Vanessa End; Thomas Gerhold; Alfred Geiger; Daniel Grünewald; Vincent Heuveline; Jens Jägersküpper; Andreas Knüpfer; Olaf Krzikalla; Edmund Kügeler; Carsten Lojewski; Guy Lonsdale; Ralph Müller-Pfefferkorn; Wolfgang E. Nagel; Lena Oden; Franz-Josef Pfreundt; Mirko Rahn; Michael Sattler; Mareike Schmidtobreick; Annika Schiller; Christian Simmendinger; Thomas Soddemann; Godehard Sutmann; Henning Weber; Jan-Philipp Weiss

At the threshold to exascale computing, limitations of the MPI programming model become more and more pronounced. HPC programmers have to design codes that can run and scale on systems with hundreds of thousands of cores. Setting up accordingly many communication buffers, point-to-point communication links, and using bulk-synchronous communication phases is contradicting scalability in these dimensions. Moreover, the reliability of upcoming systems will worsen.

international conference on parallel processing | 2011

A novel shared-memory thread-pool implementation for hybrid parallel CFD solvers

Jens Jägersküpper; Christian Simmendinger

The Computational Fluid Dynamics (CFD) solver TAU for unstructured grids is widely used in the European aerospace industry. TAU runs on High-Performance Computing (HPC) clusters with several thousands of cores using MPI-based domain decomposition. In order to make more efficient use of current multi-core CPUs and to prepare TAU for the many-core era, a shared-memory parallelization has been added to one of TAUs solver to obtain a hybrid parallelization: MPI-based domain decomposition plus multi-threaded processing of a domain. For the edge-based solver considered, a simple loop-based approach via OpenMP FOR directives would - due to the Amdahl trap - not deliver the required speed-up. A more sophisticated, thread-pool-based sharedmemory parallelization has been developed which allows for a relaxed thread synchronization with automatic and dynamic load balancing. In this paper we describe the concept behind this shared-memory parallelization, we explain how the multi-threaded computation of a domain works. Some details of its implementation in TAU as well as some first performance results are presented. We emphasize that the concept is not TAU-specific. Actually, this design pattern appears to be very generic and may well be applied to other grid/mesh/graph-based codes.

Praxis Der Informationsverarbeitung Und Kommunikation | 2005

Integrated Performance Analysis of Computer Systems (IPACS). Benchmarks for Distributed Computer Systems

Giovanni Falcone; Heinz Kredel; Michael Krietemeyer; Dirk Merten; Matthias Merz; Franz-Josef Pfreundt; Christian Simmendinger; Daniel Versick

ABSTRACT The IPACS-Project (Integrated Performance Analysis of Computer Systems), which was founded by the German Federal Department of Education, Science, Research and Technology (BMBF), wants to define a new Basis for measuring system performance of distributed systems. Its objective is to develop methods for measuring system performance on High Performance Computers (HPC) based on low level benchmarks, compute kernels, open source- and commercial application benchmarks. Additionally, it covers the development of methods for performance modelling and prediction of commercial codes. A further significant element is the integration into a benchmark environment consisting of a web based repository and a distributed benchmark-execution framework that ensures an easy usability and enables a just-in-time analysis of benchmark results.

Archive | 2015

The GASPI API: A Failure Tolerant PGAS API for Asynchronous Dataflow on Heterogeneous Architectures

Christian Simmendinger; Mirko Rahn; Daniel Gruenewald

The Global Address Space Programming Interface (GASPI) is a Partitioned Global Address Space (PGAS) API specification. The GASPI API specification is focused on three key objectives: scalability, flexibility and fault tolerance. It offers a small, yet powerful API composed of synchronization primitives, synchronous and asynchronous collectives, fine-grained control over one-sided read and write communication primitives, global atomics, passive receives, communication groups and communication queues. GASPI has been designed for one-sided RDMA-driven communication in a PGAS environment. As such, GASPI aims to initiate a paradigm shift from bulk-synchronous two-sided communication patterns towards an asynchronous communication and execution model. In order to achieve its much improved scaling behaviour GASPI leverages request based asynchronous dataflow with remote completion. In GASPI request based remote completion indicates that the operation has completed at the target window. The target hence can (on a per request basis) establish whether a one sided operation is complete at the target. A correspondingly implemented fine-grain asynchronous dataflow model can achieve a largely improved scaling behaviour relative to MPI.

International Journal of Grid and Utility Computing | 2013

On-demand software licence provisioning in grid and cloud computing

Yona Raekow; Christian Simmendinger; Domenic Jenz; Piotr Grabowski

The lack of licence management schemes in distributed environments is becoming a major obstacle for the commercial adoption of grid or cloud infrastructures. In this paper, we present a complete licence management architecture that enables a pay-per-use licence management, which can be deployed together with an on-demand computing scenario. Our architecture enables authenticated access to a remote licence server. The licence management architecture can be deployed in any distributed environment. It supports existing client/server-based software licence management tools e.g. FlexNet Publisher. This allows an easy transition from current software licence business models, which support only a local licence management towards business models that support licence management in distributed environments.

Archive | 2011

HICFD: Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures

Achim Basermann; Hans-Peter Kersken; Andreas Schreiber; Thomas Gerhold; Jens Jägersküpper; Norbert Kroll; Jan Backhaus; Edmund Kügeler; Thomas Alrutz; Christian Simmendinger; Kim Feldhoff; Olaf Krzikalla; Ralph Müller-Pfefferkorn; Mathias Puetz; Petra Aumann; Olaf Knobloch; Jörg Hunger; Carsten Zscherp

The objective of the German BMBF research project Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures (HICFD) is to develop new methods and tools for the analysis and optimization of the performance of parallel computational fluid dynamics (CFD) codes on high performance computer systems with many-core processors. In the work packages of the project it is investigated how the performance of parallel CFD codes written in C can be increased by the optimal use of all parallelism levels. On the highest level Message Passing Interface (MPI) is utilized. Furthermore, on the level of the many-core architecture, highly scaling, hybrid OpenMP/MPI methods are implemented. On the level of the processor cores the parallel Single Instruction Multiple Data (SIMD) units provided by modern CPUs are exploited.

international conference on parallel processing | 2017

Interoperability of GASPI and MPI in Large Scale Scientific Applications

Dana Akhmetova; Luis Cebamanos; Roman Iakymchuk; Tiberiu Rotaru; Mirko Rahn; Stefano Markidis; Erwin Laure; Valeria Bartsch; Christian Simmendinger

One of the main hurdles of a broad distribution of PGAS approaches is the prevalence of MPI, which as a de-facto standard appears in the code basis of many applications. To take advantage of the PGAS APIs like GASPI without a major change in the code basis, interoperability between MPI and PGAS approaches needs to be ensured. In this article, we address this challenge by providing our study and preliminary performance results regarding interoperating GASPI and MPI on the performance crucial parts of the Ludwig and iPIC3D applications. In addition, we draw a strategy for better coupling of both APIs.

Archive | 2011