Hans-Ulrich Heiss
Technical University of Berlin
Publication
Featured research published by Hans-Ulrich Heiss.
symposium on computer architecture and high performance computing | 2003
Lars-Olof Burchard; Hans-Ulrich Heiss; C.A.F. De Rose
In general, two types of resource reservations in computer networks can be distinguished: immediate reservations, which are made in a just-in-time manner, and advance reservations, which allow resources to be reserved long before they are actually used. Advance reservations are especially useful for grid computing but also for a variety of other applications that require network quality-of-service, such as content distribution networks or even mobile clients, which need advance reservation to support handovers for streaming video. With the emergence of the MPLS standard, explicit routing can also be implemented in IP networks, thus overcoming the unpredictable routing behavior which so far prevented the implementation of advance reservation services. The impact of such advance reservation mechanisms on the performance of the network with respect to the number of admitted requests and the allocated bandwidth has so far not been examined in detail. We show that advance reservations can lead to a reduced performance of the network with respect to both metrics. The analysis of the reasons shows a fragmentation of the network resources. In advance reservation environments, additional new services such as malleable reservations can be defined, and these can lead to an increased performance of the network. Four strategies for scheduling malleable reservations are presented and compared. The results of the comparisons show that some strategies increase the resource fragmentation and are therefore unsuitable in the considered environment while others lead to a significantly better performance of the network. Besides discussing the performance issue, the software architecture of a management system for advance reservations is presented.
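The admission decision and the fragmentation effect described in this abstract can be illustrated with a minimal sketch. It is not the paper's system: the single-link model, the discrete time slots, and the names `LinkSchedule`, `can_admit`, and `reserve` are all assumptions made for illustration.

```python
from typing import List


class LinkSchedule:
    """Bandwidth bookings for one link over discrete time slots.

    Hypothetical model: the abstract does not specify data structures.
    """

    def __init__(self, capacity: float, horizon: int) -> None:
        self.capacity = capacity
        # booked[t] is the bandwidth already reserved in slot t
        self.booked: List[float] = [0.0] * horizon

    def can_admit(self, start: int, stop: int, bandwidth: float) -> bool:
        """Check whether [start, stop) fits in every slot it covers."""
        if not (0 <= start < stop <= len(self.booked)):
            return False
        return all(
            self.booked[t] + bandwidth <= self.capacity
            for t in range(start, stop)
        )

    def reserve(self, start: int, stop: int, bandwidth: float) -> bool:
        """Admit the request if it fits, otherwise leave the schedule unchanged."""
        if not self.can_admit(start, stop, bandwidth):
            return False
        for t in range(start, stop):
            self.booked[t] += bandwidth
        return True


link = LinkSchedule(capacity=100.0, horizon=10)
# An advance reservation placed in the middle of the horizon...
early_booking = link.reserve(4, 6, 80.0)
# ...blocks a long request that would otherwise fit (fragmentation).
long_request = link.reserve(0, 10, 30.0)
```

In this toy setting the long request is rejected only because two slots in the middle are nearly full, while the capacity before and after the booking stays idle. A malleable reservation, as the abstract suggests, could be reshaped to use such fragments.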
parallel computing | 2007
César A. F. De Rose; Hans-Ulrich Heiss; Barry Linnert
Current processor allocation techniques for highly parallel systems use centralized front-end based algorithms which restrict applied strategies to static allocation, low parallelism, and weak fault tolerance. To lift these restrictions, we are investigating a distributed approach to processor allocation in multicomputers where currently no centralized data structure with information about the state of all processors exists. This approach will allow the implementation of more complex allocation schemes and possibly the consideration of dynamic allocation, where parallel applications would be able to adapt the allocated processor partition to their actual demand at run time, resulting in a more efficient utilization of system resources. Noncontiguous versions of a distributed dynamic processor allocation scheme are proposed and studied in this paper as an alternative for parallel programming models to allow dynamic task creation and deletion. Simulations compare the performance of the proposed dynamic strategies with static counterparts and also with well-known centralized algorithms in an environment with growing and shrinking processor demands. To demonstrate that dynamic allocation is feasible with current technologies, results of the experiments are presented for a 96-node SCI hpcLine Primergy Server cluster.
international conference on parallel processing | 2011
Tiago C. Ferreto; César A. F. De Rose; Hans-Ulrich Heiss
Server consolidation is a vital mechanism in modern data centers for minimizing infrastructure expenses. In most cases, server consolidation may require migrating virtual machines between different physical servers. Although the downtime of live migration is negligible, the amount of time to migrate all virtual machines can be substantial, delaying the completion of the consolidation process. This paper proposes a new server consolidation algorithm, which guarantees that migrations are completed in a given maximum time. The migration time is estimated using the max-min fairness model, in order to consider the competition of migration flows for the network infrastructure. The algorithm was simulated using a real workload and shows a good consolidation ratio in comparison to other algorithms, while also guaranteeing a maximum migration time.
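The max-min fairness model mentioned in this abstract can be sketched with the standard progressive-filling procedure. This is a generic textbook version, not the paper's algorithm; the function name `max_min_fair_rates`, the link and flow names, and the numbers below are assumptions for illustration.

```python
from typing import Dict, List


def max_min_fair_rates(
    capacity: Dict[str, float], routes: Dict[str, List[str]]
) -> Dict[str, float]:
    """Allocate a rate to each flow by progressive filling.

    capacity maps link name -> capacity; routes maps flow name -> links used.
    Assumes every flow crosses at least one link.
    """
    remaining = dict(capacity)
    active = set(routes)
    rates: Dict[str, float] = {}
    while active:
        # Equal share each link could still give to its active flows
        share = {}
        for link, cap in remaining.items():
            users = sum(1 for f in active if link in routes[f])
            if users:
                share[link] = cap / users
        if not share:
            raise ValueError("an active flow uses no known link")
        # The most constrained link fixes the rate of all flows crossing it
        bottleneck = min(share, key=share.get)
        rate = share[bottleneck]
        for flow in [f for f in active if bottleneck in routes[f]]:
            rates[flow] = rate
            for link in routes[flow]:
                remaining[link] -= rate
            active.remove(flow)
    return rates


# Hypothetical example: three migration flows sharing two links (MB/s)
links = {"A": 10.0, "B": 4.0}
flows = {"vm1": ["A"], "vm2": ["A", "B"], "vm3": ["B"]}
rates = max_min_fair_rates(links, flows)

# Crude migration-time estimate (seconds), assuming rates stay fixed
memory_mb = {"vm1": 1600.0, "vm2": 400.0, "vm3": 800.0}
times = {vm: memory_mb[vm] / rates[vm] for vm in rates}
```

The time estimate here deliberately ignores that rates rise once competing migrations finish, so it is an upper-bound style approximation under that stated assumption.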
ieee international conference on high performance computing data and analytics | 2008
Lars-Olof Burchard; Hans-Ulrich Heiss; Barry Linnert; Joerg Schneider; César A. F. De Rose
For resource management in grid environments, advance reservations turned out to be very useful and hence are supported by a variety of grid toolkits. However, failure recovery for such systems has not yet received the attention it deserves. In this paper, we address the problem of remapping reservations to other resources, when the originally selected resource fails. Instead of dealing with jobs already running, which usually means checkpointing and migration, our focus is on jobs that are scheduled on the failed resource for a specific future period of time but not started yet. The most critical factor when solving this problem is the estimation of the downtime. We avoid the drawbacks of under- or overestimating the downtime by a dynamic load-based approach that is evaluated by extensive simulations in a grid environment and shows superior performance compared to estimation-based approaches.
high performance computing and communications | 2009
Rodrigo da Rosa Righi; Laércio Lima Pilla; Alexandre Carissimi; Philippe Olivier Alexandre Navaux; Hans-Ulrich Heiss
We have developed a model called MigBSP that controls process rescheduling in BSP (Bulk Synchronous Parallel) applications. A BSP application is composed of one or more supersteps, each one containing both computation and communication phases followed by a synchronization barrier. Since the barrier waits for the slowest process, MigBSP’s final idea is to adjust the processes’ location in order to reduce the supersteps’ times. Considering the scope of the BSP model, the novel ideas of MigBSP are: (i) combination of three metrics - Memory, Computation and Communication - to measure the potential of migration of each BSP process; (ii) use of both Computation and Communication Patterns to control processes’ regularity; (iii) adaptation regarding the periodicity to launch the process rescheduling. This paper describes MigBSP and presents some experimental results and related work.
symposium on computer architecture and high performance computing | 2005
Lars-Olof Burchard; C.A.F. De Rose; Hans-Ulrich Heiss; Barry Linnert; Joerg Schneider
For resource management in grid environments, advance reservations turned out to be very useful and hence are supported by a variety of grid toolkits. However, failure recovery for such systems has not yet received the attention it deserves. In this paper, we address the problem of remapping reservations to other resources, when the originally selected resource fails. Instead of dealing with jobs already running, which usually means checkpointing and migration, our focus is on jobs that are scheduled on the failed resource for a specific future period of time but not started yet. The most critical factor when solving this problem is the estimation of the downtime. We avoid the drawbacks of under- or overestimating the downtime by a dynamic load-based approach that is evaluated by extensive simulations in a grid environment and shows superior performance compared to estimation-based approaches.
symposium on computer architecture and high performance computing | 2010
Jan Hendrik Schönherr; Jan Richling; Hans-Ulrich Heiss
While OpenMP conceptually allows the degree of parallelism to vary from one parallel region to the next in order to adapt to the system load, this might still be too coarse-grained in certain scenarios. In particular, applications designed for parallelism may stay within one parallel region for a long time. This may lead either to an oversubscribed system where individual applications are not restricted in their degree of parallelism, or to an underutilized system, because individual applications are restricted to too small a degree of parallelism. In this paper, we tackle both problems by dynamically restricting the number of active threads within a parallel region without violating the OpenMP specification.
Future Generation Grids | 2006
Lars-Olof Burchard; Hans-Ulrich Heiss; Barry Linnert; Jörg Schneider; Felix Heine; Matthias Hovestadt; Odej Kao; Axel Keller
In this paper, we describe the architecture of the virtual resource manager VRM, a management system designed to reside on top of local resource management systems for cluster computers and other kinds of resources. The most important feature of the VRM is its capability to handle quality-of-service (QoS) guarantees and service-level agreements (SLAs). The particular emphasis of the paper is on the various opportunities to deal with local autonomy for resource management systems not supporting SLAs. As local administrators may not want to hand over complete control to the Grid management, it is necessary to define strategies that deal with this issue. Local autonomy should be retained as much as possible while providing reliability and QoS guarantees for Grid applications, e.g., specified as SLAs.
european dependable computing conference | 2015
Peter Munk; Mohammad Shadi Alhakeem; Raphael Lisicki; Helge Parzyjegla; Jan Richling; Hans-Ulrich Heiss
Commercial-off-the-shelf (COTS) many-core processors offer the performance needed for computationally intensive safety-critical real-time applications such as autonomous driving. However, these consumer-grade many-core processors are increasingly susceptible to faults because of their highly integrated design. In this paper, we present a fault-tolerance framework that eases the usage of COTS many-core processors for safety-critical applications. Our framework employs an adaptable software-based fault-tolerance mechanism that combines N Modular Redundancy (NMR) with a repair process and a rejuvenating round robin voting scheme. A Stochastic Activity Network (SAN) model of the fault-tolerance mechanism allows the framework to adapt the parameters of the mechanism such that a specified target availability is achieved with minimum overhead. Experiments on a cycle-accurate simulator empirically prove the correctness of the SAN model and evaluate the overhead of the framework.
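The voting step at the core of N Modular Redundancy can be sketched as a plain majority vote over replica outputs. This covers only generic NMR voting; the repair process and the rejuvenating round robin scheme of the framework are not modeled, and the function name `nmr_vote` is an assumption.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def nmr_vote(outputs: Sequence[Hashable]) -> Tuple[Hashable, List[int]]:
    """Return the strict-majority output and the indices of dissenting replicas.

    Raises ValueError when no value is produced by more than half the replicas.
    """
    if not outputs:
        raise ValueError("need at least one replica output")
    value, count = Counter(outputs).most_common(1)[0]
    if 2 * count <= len(outputs):
        raise ValueError("no strict majority among replica outputs")
    # Replicas that disagree with the majority are candidates for repair
    suspects = [i for i, out in enumerate(outputs) if out != value]
    return value, suspects


# Hypothetical example: three replicas, the third one produced a faulty result
result, to_repair = nmr_vote([42, 42, 7])
```

With N replicas, this vote masks up to floor((N - 1) / 2) faulty outputs per round, which is why a repair step for the suspected replicas matters for long-running availability.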
International Journal of Parallel Programming | 2013
Claudio Schepke; Nicolas Maillard; Joerg Schneider; Hans-Ulrich Heiss
The forecast precision of climatological models is limited by the computing power and time available for the executions. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth’s atmosphere can be increased, and consequently the numerical forecast is more accurate. However, a finer mesh resolution, able to include local phenomena in a global atmosphere integration, is still not possible due to the large number of data elements to compute in this case. To overcome this situation, different mesh refinement levels can be used at the same time for different areas of the domain. Thus, our paper evaluates how mesh refinement at run time (online) can improve performance for climatological models. The online mesh refinement (OMR) dynamically increases mesh resolution in parts of a domain when special atmosphere conditions are registered during the execution. Experimental results show that the execution of a model improved by OMR provides better resolution for the meshes, without any significant increase of execution time. The parallel performance of the simulations is also increased through the creation of threads in order to explore different levels of parallelism.
Collaboration
Philippe Olivier Alexandre Navaux
Universidade Federal do Rio Grande do Sul