Peter Tröger
Hasso Plattner Institute
Publications
Featured research published by Peter Tröger.
Cluster Computing and the Grid | 2007
Peter Tröger; Hrabri Rajic; Andreas Haas; Piotr Domagalski
Today's cluster and grid environments demand the use of product-specific APIs and tools for developing distributed applications. We give an overview of the Distributed Resource Management Application API (DRMAA) specification, which defines a common interface for job submission, control, and monitoring. The DRMAA specification was developed by the authors at the Open Grid Forum standardization body and has meanwhile seen significant adoption in academic and commercial cluster systems. Within this paper, we describe the basic concepts of the finalized API and explain issues and findings from the standardization of such a unified interface.
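The submit/monitor/wait workflow that DRMAA standardizes can be sketched with a toy, purely local session object. The class and method names below only mirror DRMAA concepts (job template, session, wait) and are not the real binding's API; the "job" is just a local subprocess:

```python
import subprocess
import sys

class JobTemplate:
    """Job description: executable plus arguments (mirrors DRMAA's job template)."""
    def __init__(self, remote_command, args=()):
        self.remote_command = remote_command
        self.args = list(args)

class Session:
    """Toy stand-in for a DRMAA session: submit, monitor, and wait for jobs."""
    def __init__(self):
        self._jobs = {}
        self._next_id = 0

    def run_job(self, template):
        job_id = str(self._next_id)
        self._next_id += 1
        self._jobs[job_id] = subprocess.Popen(
            [template.remote_command] + template.args)
        return job_id

    def wait(self, job_id):
        return self._jobs[job_id].wait()   # blocks, returns the exit code

    def job_status(self, job_id):
        return "done" if self._jobs[job_id].poll() is not None else "running"

session = Session()
jt = JobTemplate(sys.executable, ["-c", "pass"])   # trivial local "job"
job_id = session.run_job(jt)
exit_code = session.wait(job_id)
```

A real DRMAA binding hands the job template to a cluster scheduler instead of spawning a local process, but the interface shape is the point of the standard.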
Workshop on Object-Oriented Real-Time Dependable Systems | 2003
Peter Tröger; Andreas Polze
Most of today's distributed computing systems in the field do not support the migration of execution entities among computing nodes at runtime. The relatively static association between units of processing and computing nodes makes it difficult to implement fault-tolerant behavior or load-balancing schemes. The concept of code migration may provide a solution to these problems. It can be defined as the movement of process, object, or component instances from one computing node to another during system runtime in a distributed environment. Within our paper, we describe the integration of a migration facility into the .NET Framework with the help of aspect-oriented programming (AOP). AOP is interesting because it addresses non-functional system properties at the middleware level, without the need to manipulate lower system layers such as the operating system itself. We have implemented two proof-of-concept applications, namely a migrating Web server and a migrating file version checker. The paper contains an experimental evaluation of the performance impact of object migration in the context of these two applications.
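The serialize-transfer-restore core of object migration can be illustrated in a few lines of Python. The paper's actual mechanism is AOP-based interception inside the .NET Framework; the `Worker` class and the dicts standing in for "nodes" below are purely hypothetical:

```python
import pickle

class Worker:
    """Example execution entity whose state can migrate between nodes."""
    def __init__(self):
        self.processed = 0

    def step(self):
        self.processed += 1

# Dicts stand in for machines; migration = serialize the object's state on
# one node, transfer the bytes, and reconstruct the instance on another.
node_a, node_b = {}, {}

w = Worker()
w.step()
w.step()
node_a["worker"] = w

# Migrate: capture state, "transfer" the byte blob, restore on the target.
blob = pickle.dumps(node_a.pop("worker"))
node_b["worker"] = pickle.loads(blob)
```

In a real system the transferred state must also include execution context (or migration must be restricted to safe points), which is exactly what middleware-level interception helps to enforce.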
Grid Computing | 2014
Peter Tröger; Andre Merzky
The submission and management of computational jobs is a traditional part of utility computing environments. End users and developers of domain-specific software abstractions often have to deal with the heterogeneity of such batch processing systems. This has led to a number of application programming interface and job description standards, which are implemented and established for cluster and Grid systems. With the recent rise of cloud computing as a new utility computing paradigm, standardized access to batch processing facilities operated on cloud resources becomes an important issue. Furthermore, the design of such a standard has to consider a trade-off between feature completeness and the achievable level of interoperability. The article discusses this general challenge and presents some existing standards with a traditional cluster and Grid computing background that may be applicable to cloud environments. We present OCCI-DRMAA as one approach for standardized access to batch processing facilities hosted in a cloud.
2010 Third International Conference on Dependability | 2010
Peter Tröger; Felix Salfner; Steffen Tschirpke
Software-implemented fault injection is an established method for emulating hardware faults in computer systems. Existing approaches typically extend the operating system with special drivers or change the application under test. We propose a novel approach in which fault injection capabilities are added to the computer firmware. This approach works without any modification to the operating system and/or applications and can support a larger variety of fault locations. We discuss four different strategies for x86/x64 and Itanium systems. Our analysis shows that such an approach can increase portability, the non-intrusiveness of the injector implementation, and the number of supported fault locations. Firmware-level fault injection paves the way for new research directions, such as virtual machine monitor fault injection or the investigation of certified operating systems.
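The underlying fault model, a transient single-bit flip in memory, is easy to sketch. Note that the paper's contribution is injecting such faults from the firmware; this user-level snippet only shows what a single injection does to a target buffer:

```python
def inject_bit_flip(memory: bytearray, byte_index: int, bit: int) -> None:
    """Emulate a transient hardware fault by flipping one bit in 'memory'."""
    memory[byte_index] ^= (1 << bit)

data = bytearray(b"\x00\x00\x00\x00")
inject_bit_flip(data, 2, 7)   # flip the most significant bit of byte 2
```

A firmware-level injector applies the same XOR, but to physical memory, registers, or machine-check state that user-level code cannot reach.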
Pacific Rim International Symposium on Dependable Computing | 2013
Peter Tröger; Franz Becker; Felix Salfner
Dependability modeling is a widely established method for analyzing the reliability of complex systems. Nearly all approaches focus on representing, in success or failure space, one specific system configuration. This does not reflect the high configurability of today's systems. Furthermore, in order to perform a quantitative analysis of a model, the reliability engineer also needs to add exact event probabilities that can often only be estimated. Together, these issues lead to a situation in which the final model may persuade the reader of an exactness that does not correspond to reality. We present an extended version of fault trees called Fuzz Trees. They combine fuzzy numbers for event probabilities with new graphical notations for modeling system variability. The concept allows the engineer to make uncertainty an explicit part of the dependability model and to evaluate design alternatives early in development. We discuss an initial analysis algorithm that allows the comparison of failure probabilities and costs for all possible system configurations modeled by such a tree.
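The core arithmetic idea, propagating fuzzy rather than crisp probabilities through fault tree gates, can be sketched with triangular fuzzy numbers. The gate rules below are standard component-wise fuzzy arithmetic for independent basic events, not the paper's full analysis algorithm, and the event probabilities are made up:

```python
from dataclasses import dataclass

@dataclass
class TriFuzzy:
    """Triangular fuzzy probability: lower bound, most likely value, upper bound."""
    lo: float
    mode: float
    hi: float

def and_gate(a: TriFuzzy, b: TriFuzzy) -> TriFuzzy:
    # Independent events; the product is monotone in each argument,
    # so the three components can be combined point-wise.
    return TriFuzzy(a.lo * b.lo, a.mode * b.mode, a.hi * b.hi)

def or_gate(a: TriFuzzy, b: TriFuzzy) -> TriFuzzy:
    f = lambda x, y: 1 - (1 - x) * (1 - y)   # also monotone increasing
    return TriFuzzy(f(a.lo, b.lo), f(a.mode, b.mode), f(a.hi, b.hi))

pump = TriFuzzy(0.01, 0.02, 0.05)     # estimated, hence fuzzy
valve = TriFuzzy(0.001, 0.002, 0.004)
top = or_gate(pump, valve)            # either failure brings the system down
series = and_gate(pump, valve)        # both must fail (redundant design)
```

The width of the resulting interval makes the estimation uncertainty visible at the top event instead of hiding it behind a single crisp number.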
International Conference on Parallel Processing | 2013
Fahad Khalid; Zoran Nikoloski; Peter Tröger; Andreas Polze
Elementary Flux Modes (EFMs) can be used to characterize functional cellular networks and have gained importance in systems biology. Enumerating EFMs is a compute-intensive problem due to the combinatorial explosion in candidate generation. While parallel implementations exist for shared-memory SMP and distributed-memory architectures, tools supporting heterogeneous platforms have not yet been developed. Here we propose and evaluate a heterogeneous implementation of combinatorial candidate generation that employs GPUs as accelerators. It uses a three-stage pipeline-based method to manage arithmetic intensity. Our implementation achieves a 6x speedup over the serial implementation and a 1.8x speedup over a multithreaded implementation for CPU-only SMP architectures.
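The combinatorial explosion stems from the pairwise combination step of double-description-style enumeration: every mode with a positive entry in some coordinate is combined with every mode with a negative entry to cancel that coordinate, giving |P| x |N| candidates per step. A simplified sketch (real EFM tools additionally test each candidate for elementarity before keeping it):

```python
def combine(p, n, k):
    """Combine a positive and a negative ray so that coordinate k cancels."""
    a, b = -n[k], p[k]        # both positive, so the result is a valid ray
    return [a * pi + b * ni for pi, ni in zip(p, n)]

pos = [[1, 2], [2, 1]]             # rays with a positive entry at coordinate 0
neg = [[-1, 3], [-2, 4], [-1, 0]]  # rays with a negative entry at coordinate 0

# Every positive/negative pair yields one candidate: |P| * |N| of them.
candidates = [combine(p, n, 0) for p in pos for n in neg]
```

Because candidate generation is a large set of independent, arithmetic-heavy pair combinations, it maps naturally onto GPU threads, which is what the heterogeneous pipeline exploits.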
Lecture Notes in Computer Science | 2004
Peter Tröger; Martin von Löwis; Andreas Polze
We present a new implementation of the old Occam language, using Microsoft .NET as the target platform. We show how Occam can be used to develop cluster and grid applications, and how such applications can be deployed. In particular, we discuss automatic placement of Occam processes onto processing nodes.
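Occam's central abstraction is synchronous channel communication between concurrent processes. The behavior can be approximated in Python with threads and a bounded queue; note that Occam's `!`/`?` operations are a fully synchronous rendezvous, which a queue only approximates:

```python
import threading
import queue

# A bounded queue stands in for an Occam channel between two processes.
chan = queue.Queue(maxsize=1)
results = []

def producer():
    for i in range(3):
        chan.put(i)          # roughly:  chan ! i
    chan.put(None)           # sentinel to end the conversation

def consumer():
    while True:
        v = chan.get()       # roughly:  chan ? v
        if v is None:
            break
        results.append(v * 10)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
```

Because channels make all inter-process communication explicit, an Occam compiler is free to place the two processes on different cluster nodes, which is exactly the placement problem the paper discusses.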
IEEE International Conference on Software Quality, Reliability and Security Companion | 2017
Lena Feinbube; Lukas Pirl; Peter Tröger; Andreas Polze
A justifiably trustworthy provisioning of cloud services can only be ensured if reliability, availability, and other dependability attributes are assessed accordingly. We present a structured approach for deriving fault injection campaigns from a failure space model of the system. Fault injection experiments are selected based on criteria of coverage, efficiency, and maximality of the faultload. The resulting campaign is enacted automatically and shows the performance impact of the tested worst-case non-failure scenarios. We demonstrate the feasibility of our approach with a fault-tolerant deployment of an OpenStack cloud infrastructure.
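Selecting experiments by a coverage criterion can be illustrated with a greedy cover over fault locations. The experiment names and fault locations below are hypothetical, and the paper's selection criteria (coverage, efficiency, maximality of the faultload) are richer than this sketch:

```python
def select_campaign(experiments):
    """Greedy selection: keep picking the injection experiment that covers
    the most still-uncovered fault locations until everything is covered."""
    uncovered = set().union(*experiments.values())
    chosen = []
    while uncovered:
        best = max(experiments, key=lambda e: len(experiments[e] & uncovered))
        chosen.append(best)
        uncovered -= experiments[best]
    return chosen

# Hypothetical experiments and the fault locations each one exercises.
experiments = {
    "kill-controller": {"api", "scheduler"},
    "drop-network":    {"api", "storage"},
    "kill-db":         {"storage"},
}
campaign = select_campaign(experiments)
```

In the paper's setting the covered elements come from the failure space model rather than a hand-written dict, and the chosen campaign is then executed automatically against the deployment.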
Mediterranean Conference on Embedded Computing | 2015
Peter Tröger; Matthias Werner; Jan Richling
While the future importance of cyber-physical systems is widely acknowledged, there is surprisingly little discussion about the design of operating systems for these kinds of systems. We present an extended view on the low-level abstractions to be offered by the operating system to applications. The central idea is to treat all relevant cyber-physical entities as task execution resources, which directly impacts the representation of tasks, communication, and memory in the application programming interface. The resulting concept framework can serve as a starting point for future research in this field.
Parallel and Distributed Computing: Applications and Technologies | 2014
Max Plauth; Frank Feinbube; Peter Tröger; Andreas Polze
Blind Signal Separation is an algorithmic problem class that deals with restoring original signal data from a signal mixture. Implementations such as FastICA are optimized for parallelization on CPUs or first-generation GPU hardware. With the advent of modern, compute-centric GPU hardware offering powerful features such as dynamic parallelism, these solutions no longer exploit the available hardware performance in the best possible way. We present an optimized implementation of the FastICA algorithm that is specifically tailored for next-generation GPU architectures such as Nvidia Kepler. Our prototype implementation achieves a two-digit speedup factor compared to a multithreaded CPU implementation. Our custom matrix multiplication kernels, tailored specifically to the use case, contribute to the speedup by delivering better performance than the state-of-the-art CUBLAS library.
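At the heart of the custom kernels is dense matrix multiplication: each output element is an independent dot product, which is exactly the per-thread (or per-tile) work a GPU kernel parallelizes. A minimal reference version in plain Python, for orientation only; the paper's kernels are CUDA code tuned for the Kepler architecture:

```python
def matmul(A, B):
    """Naive dense matrix multiply. Each C[i][j] is an independent dot
    product of row i of A and column j of B, so all output elements can
    be computed in parallel, which is what GPU kernels exploit."""
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

C = matmul([[1, 2], [3, 4]],
           [[5, 6], [7, 8]])
```

A tuned kernel restructures exactly this loop nest into tiles held in shared memory; the FastICA use case lets such a kernel assume fixed, skewed matrix shapes that a general-purpose BLAS cannot.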