Achim Streit

Karlsruhe Institute of Technology

Publication

Featured research published by Achim Streit.

grid computing | 2000

Evaluation of Job-Scheduling Strategies for Grid Computing

Volker Hamscher; Uwe Schwiegelshohn; Achim Streit; Ramin Yahyapour

In this paper, we discuss typical scheduling structures that occur in computational grids. Scheduling algorithms and selection strategies applicable to these structures are introduced and classified. Simulations were used to evaluate these aspects for combinations of different job and machine models. Some of the results are presented in this paper and discussed both qualitatively and quantitatively. For hierarchical scheduling, a common scheduling structure, the simulation results confirmed the benefit of Backfill. Unexpectedly, FCFS performed better than Backfill when a central job pool was used.
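The difference between FCFS and Backfill can be illustrated with a minimal sketch. It assumes a single machine, jobs that are all submitted at time 0, and exactly known runtimes; the `simulate` function and the job list are hypothetical and are not taken from the paper. The backfilling variant shown is the conservative-for-the-head (EASY-style) rule: a later job may start early only if it does not delay the job at the head of the queue. This toy setup does not reproduce the paper's grid structures or its central job-pool result.

```python
import heapq

def simulate(jobs, total_nodes, backfill=False):
    """Return {job name: start time} for jobs given as (name, nodes, runtime).

    Assumption: all jobs are submitted at t=0 and queued in list order.
    """
    assert all(nodes <= total_nodes for _, nodes, _ in jobs)
    queue = list(jobs)
    running = []            # min-heap of (end_time, nodes)
    free = total_nodes
    now = 0
    starts = {}
    while queue:
        # FCFS part: start jobs from the head of the queue while they fit.
        while queue and queue[0][1] <= free:
            name, nodes, runtime = queue.pop(0)
            starts[name] = now
            heapq.heappush(running, (now + runtime, nodes))
            free -= nodes
        if not queue:
            break
        if backfill:
            # Shadow time: earliest time the head job can start.
            # Extra nodes: nodes left over at that time after the head starts.
            head_nodes = queue[0][1]
            avail, shadow, extra = free, now, 0
            for end, nodes in sorted(running):
                avail += nodes
                if avail >= head_nodes:
                    shadow, extra = end, avail - head_nodes
                    break
            i = 1
            while i < len(queue):
                name, nodes, runtime = queue[i]
                ends_in_time = now + runtime <= shadow
                if nodes <= free and (ends_in_time or nodes <= extra):
                    starts[name] = now
                    heapq.heappush(running, (now + runtime, nodes))
                    free -= nodes
                    if not ends_in_time:
                        extra -= nodes
                    queue.pop(i)
                else:
                    i += 1
        # Advance time to the next job completion.
        end, nodes = heapq.heappop(running)
        now = end
        free += nodes
    return starts

jobs = [("A", 2, 10), ("B", 4, 5), ("C", 1, 8), ("D", 1, 20)]
for mode in (False, True):
    starts = simulate(jobs, total_nodes=4, backfill=mode)
    print("backfill" if mode else "fcfs", sum(starts.values()) / len(starts))
# prints: fcfs 10.0, then backfill 6.25 (average wait time)
```

In this example job C is backfilled into the gap in front of the wide job B, which lowers the average wait without delaying B.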

Future Generation Computer Systems | 2013

G-Hadoop: MapReduce across distributed data centers for data-intensive computing

Lizhe Wang; Jie Tao; Rajiv Ranjan; Holger Marten; Achim Streit; Jingying Chen; Dan Chen

Recently, the computational requirements for large-scale data-intensive analysis of scientific data have grown significantly. In High Energy Physics (HEP) for example, the Large Hadron Collider (LHC) produced 13 petabytes of data in 2010. This huge amount of data is processed on more than 140 computing centers distributed across 34 countries. The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications. However, current MapReduce implementations are developed to operate on single cluster environments and cannot be leveraged for large-scale distributed data processing across multiple clusters. On the other hand, workflow systems are used for distributed data processing across data centers. It has been reported that the workflow paradigm has some limitations for distributed data processing, such as reliability and efficiency. In this paper, we present the design and implementation of G-Hadoop, a MapReduce framework that aims to enable large-scale distributed computing across multiple clusters.

cluster computing and the grid | 2002

On Advantages of Grid Computing for Parallel Job Scheduling

Carsten Ernemann; Volker Hamscher; Uwe Schwiegelshohn; Ramin Yahyapour; Achim Streit

This paper addresses the potential benefit of sharing jobs between independent sites in a grid computing environment. Parallel multi-site execution of a single job across different sites is also discussed. To this end, various scheduling algorithms have been simulated for several machine configurations with different workloads derived from real traces. The results show that a significantly smaller average response time is achievable. Multi-site applications can improve the results further, as long as the increase in execution time due to communication overhead is limited to about 25%.
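The trade-off behind the overhead limit can be written down as simple arithmetic. The sketch below assumes that communication overhead stretches the runtime by a constant factor and that the waiting times for both options are known; the function names and the numbers are hypothetical and are not the paper's simulation model.

```python
def multi_site_pays_off(runtime, wait_single, wait_multi, overhead):
    """True if multi-site execution yields the smaller response time.

    Assumption: overhead is a fraction, so 0.25 means a 25% longer runtime.
    """
    single = wait_single + runtime
    multi = wait_multi + runtime * (1.0 + overhead)
    return multi < single

def break_even_overhead(runtime, wait_single, wait_multi):
    """Largest overhead fraction at which both options respond equally fast."""
    return (wait_single - wait_multi) / runtime

# A job with runtime 100 that waits 40 for one site, or starts now on two sites.
print(multi_site_pays_off(100, 40, 0, 0.25))  # True:  125 < 140
print(multi_site_pays_off(100, 40, 0, 0.50))  # False: 150 > 140
print(break_even_overhead(100, 40, 0))        # 0.4
```

The break-even point depends on how long the job would otherwise wait, which is why a single global overhead limit such as the reported 25% can only be established empirically for a given workload.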

Archive | 2006

Euro-Par 2006 Parallel Processing

Wolfgang Lehner; Norbert Meyer; Achim Streit; Craig A. Stewart

A network monitoring system is a vital component of a Grid; however, its scalability is a challenge. We propose a network monitoring approach that combines passive monitoring, a domain-oriented overlay network, and an attitude for demand-driven monitoring sessions. In order to take into account the demand for extreme scalability, we introduce a solution for two problems that are inherent to the proposed approach: security and group membership maintenance.

job scheduling strategies for parallel processing | 2003

Scheduling in HPC resource management systems: Queuing vs. planning

Matthias Hovestadt; Odej Kao; Axel Keller; Achim Streit

Nearly all existing HPC systems are operated by resource management systems based on the queuing approach. With the increasing acceptance of grid middleware like Globus, new requirements for the underlying local resource management systems arise. Features like advance reservation or quality of service are needed to implement high-level functions like co-allocation. However, it is difficult to realize these features with a resource management system based on the queuing concept, since it considers only the present resource usage.
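A planning-based system, in contrast, keeps a schedule of future resource usage, which makes advance reservations straightforward. The following is a minimal sketch under simplifying assumptions (one homogeneous machine, exact durations, no cancellation); the `Plan` class and its methods are hypothetical and do not describe any particular resource management system.

```python
class Plan:
    """Toy planning-based scheduler: every request gets a fixed start time."""

    def __init__(self, total_nodes):
        self.total_nodes = total_nodes
        self.reservations = []  # list of (start, end, nodes)

    def _used_at(self, t):
        return sum(n for s, e, n in self.reservations if s <= t < e)

    def _fits(self, start, duration, nodes):
        end = start + duration
        # Usage can only rise at reservation starts, so check those points.
        points = [start] + [s for s, _, _ in self.reservations if start < s < end]
        return all(self._used_at(t) + nodes <= self.total_nodes for t in points)

    def earliest_start(self, nodes, duration, not_before=0):
        # Capacity only frees up at reservation ends, so these are the
        # only candidate start times besides not_before itself.
        ends = {e for _, e, _ in self.reservations if e > not_before}
        for t in sorted({not_before} | ends):
            if self._fits(t, duration, nodes):
                return t
        return None  # request is larger than the machine

    def reserve(self, nodes, duration, not_before=0):
        start = self.earliest_start(nodes, duration, not_before)
        if start is None:
            raise ValueError("request exceeds machine size")
        self.reservations.append((start, start + duration, nodes))
        return start

plan = Plan(total_nodes=4)
print(plan.reserve(nodes=4, duration=10, not_before=20))  # 20 (advance reservation)
print(plan.reserve(nodes=2, duration=30))                 # 30 (would collide at t=20)
print(plan.reserve(nodes=2, duration=15))                 # 0  (fits before t=20)
```

Because the plan covers the future, the full-machine reservation at t=20 is guaranteed, and shorter requests are placed around it. A pure queue that looks only at the current resource usage cannot give that guarantee.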

parallel computing | 2005

Unicore — From project results to production grids

Achim Streit; Dietmar W. Erwin; Thomas Lippert; Daniel Mallmann; Roger Menday; Michael Rambadt; Morris Riedel; Mathilde Romberg; Bernd Schuller; Philipp Wieder

The UNICORE Grid technology provides seamless, secure and intuitive access to distributed Grid resources. In this paper we present the recent evolution from project results to production Grids. At the beginning, UNICORE was developed as prototype software in two projects funded by the German research ministry (BMBF). Over the following years, in various European-funded projects, UNICORE evolved into a full-grown and well-tested Grid middleware system, which today is used in daily production at many supercomputing centers worldwide. Beyond this production usage, the UNICORE technology serves as a solid basis in many European and international research projects, which use existing UNICORE components to implement advanced features, high-level services, and support for applications from a growing range of domains. In order to foster these ongoing developments, UNICORE is available as open source under a BSD licence at SourceForge, where new releases are published on a regular basis. This paper reviews the UNICORE achievements so far and gives a glimpse of the UNICORE roadmap.

Journal of Computer and System Sciences | 2014

A security framework in G-Hadoop for big data computing across distributed Cloud data centres

Jiaqi Zhao; Lizhe Wang; Jie Tao; Jinjun Chen; Weiye Sun; Rajiv Ranjan; Joanna Kolodziej; Achim Streit; Dimitrios Georgakopoulos

MapReduce is regarded as an adequate programming model for large-scale data-intensive applications. The Hadoop framework is a well-known MapReduce implementation that runs MapReduce tasks on a cluster system. G-Hadoop is an extension of the Hadoop MapReduce framework that allows MapReduce tasks to run on multiple clusters. However, G-Hadoop simply reuses the user authentication and job submission mechanism of Hadoop, which is designed for a single cluster. This work proposes a new security model for G-Hadoop. The security model is based on several security solutions, such as public key cryptography and the SSL protocol, and is designed specifically for distributed environments. This security framework simplifies the user authentication and job submission process of the current G-Hadoop implementation with a single-sign-on approach. In addition, the security framework provides a number of different security mechanisms to protect the G-Hadoop system from traditional attacks.

grid computing | 2002

Enhanced Algorithms for Multi-site Scheduling

Carsten Ernemann; Volker Hamscher; Achim Streit; Ramin Yahyapour

This paper discusses two approaches to enhance multi-site scheduling for grid environments. First, the potential improvements of multi-site scheduling from applying constraints on job fragmentation are presented. Subsequently, an adaptive multi-site scheduling algorithm is presented and evaluated. It uses a simple decision rule to determine whether or not to use multi-site scheduling. To this end, several machine configurations have been simulated with different parallel job workloads extracted from real traces. The adaptive system significantly improves the scheduling results in terms of a shorter average response time.
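One possible fragmentation constraint is an upper bound on the number of sites a job may be split across. The sketch below is only an illustration of that idea: the `plan_fragments` function, the largest-site-first greedy order, and the `max_fragments` parameter are assumptions, and the abstract does not state which constraints or decision rule the paper actually uses.

```python
def plan_fragments(nodes_needed, free_nodes, max_fragments):
    """Split a job across sites, using at most max_fragments sites.

    free_nodes maps site name -> currently free nodes.
    Returns a list of (site, nodes) or None if the constraint cannot be met.
    Taking the largest sites first minimises the number of fragments.
    """
    plan = []
    remaining = nodes_needed
    by_size = sorted(free_nodes.items(), key=lambda kv: kv[1], reverse=True)
    for site, free in by_size:
        if remaining == 0 or len(plan) == max_fragments:
            break
        take = min(free, remaining)
        if take > 0:
            plan.append((site, take))
            remaining -= take
    return plan if remaining == 0 else None

free = {"s1": 6, "s2": 3, "s3": 5}
print(plan_fragments(10, free, max_fragments=2))  # [('s1', 6), ('s3', 4)]
print(plan_fragments(14, free, max_fragments=2))  # None: would need three sites
print(plan_fragments(4, free, max_fragments=2))   # [('s1', 4)]: single site suffices
```

A job that fits on one site yields a single fragment, so the same routine covers both the single-site and the multi-site case.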

international parallel and distributed processing symposium | 2012

MapReduce across Distributed Clusters for Data-intensive Applications

Lizhe Wang; Jie Tao; Holger Marten; Achim Streit; Samee Ullah Khan; Joanna Kolodziej; Dan Chen

Recently, the computational requirements for large-scale data-intensive analysis of scientific data have grown significantly. In High Energy Physics (HEP) for example, the Large Hadron Collider (LHC) produced 13 petabytes of data in 2010. This huge amount of data is processed on more than 140 computing centers distributed across 34 countries. The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications. However, current MapReduce implementations are developed to operate on single cluster environments and cannot be leveraged for large-scale distributed data processing across multiple clusters. On the other hand, workflow systems are used for distributed data processing across data centers. It has been reported that the workflow paradigm has some limitations for distributed data processing, such as reliability and efficiency. In this paper, we present the design and implementation of G-Hadoop, a MapReduce framework that aims to enable large-scale distributed computing across multiple clusters. G-Hadoop uses the Gfarm file system as an underlying file system and executes MapReduce tasks across distributed clusters. Experiments with the G-Hadoop framework on distributed clusters show encouraging results.
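The general idea of running the map step where the data resides and combining only the partial results can be sketched in a few lines. This is an in-process illustration and not G-Hadoop or Hadoop code: the `clusters` dictionary, the word-count task, and the function names are hypothetical, and nothing here uses the Gfarm file system.

```python
from collections import Counter
from functools import reduce

def map_phase(lines):
    """Runs per cluster on its local data and returns partial word counts."""
    return Counter(word for line in lines for word in line.lower().split())

def reduce_phase(partials):
    """Merges the partial counts that the clusters send back."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Hypothetical data that is already stored at two different clusters.
clusters = {
    "cluster_a": ["grid computing", "data grid"],
    "cluster_b": ["data centers", "grid"],
}

partials = [map_phase(lines) for lines in clusters.values()]
totals = reduce_phase(partials)
print(totals["grid"], totals["data"])  # 3 2
```

Only the small per-cluster counters cross cluster boundaries in this sketch, which is the property that makes the model attractive when the input data is large and geographically distributed.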

high performance distributed computing | 2000

Robust resource management for metacomputers

Jörn Gehring; Achim Streit

This paper presents a robust software infrastructure for metacomputing. The system is intended to be used by others as a building block for large and powerful computational grids. Much effort has been invested in developing a fault-tolerant architecture that does not exhibit a single point of failure. Furthermore, we have designed the system to be modular, lean and portable. It is available as open source code and has been successfully compiled on POSIX- and Microsoft Windows-compliant platforms. The system does not originate from a laboratory environment but has proven its robustness within two large metacomputing installations. It embodies a modular concept which allows easy integration of new or modified components. Hence, it is not necessary to buy into the system as a whole. Rather, we encourage others to use only those components that fit into their specific environments.
