Simon Ostermann
University of Innsbruck
Publication
Featured research published by Simon Ostermann.
IEEE Transactions on Parallel and Distributed Systems | 2011
Alexandru Iosup; Simon Ostermann; Nezih Yigitbasi; Radu Prodan; Thomas Fahringer; Dick H. J. Epema
Cloud computing is an emerging commercial infrastructure paradigm that promises to eliminate the need for maintaining expensive computing facilities by companies and institutes alike. Through the use of virtualization and resource time sharing, clouds serve a large user base with different needs from a single set of physical resources. Thus, clouds have the potential to provide their owners the benefits of an economy of scale and, at the same time, become for scientists an alternative to clusters, grids, and parallel production environments. However, the current commercial clouds have been built to support web and small database workloads, which are very different from typical scientific computing workloads. Moreover, the use of virtualization and resource time sharing may introduce significant performance penalties for the demanding scientific computing workloads. In this work, we analyze the performance of cloud computing services for scientific computing workloads. We quantify the presence in real scientific computing workloads of Many-Task Computing (MTC) users, that is, of users who employ loosely coupled applications comprising many tasks to achieve their scientific goals. Then, we perform an empirical evaluation of the performance of four commercial cloud computing services, including Amazon EC2, which is currently the largest commercial cloud. Last, we compare through trace-based simulation the performance characteristics and cost models of clouds and other scientific computing platforms, for general and MTC-based scientific computing workloads. Our results indicate that the current clouds need an order of magnitude in performance improvement to be useful to the scientific community, and they show which improvements should be considered first to address this discrepancy between offer and demand.
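The trace-based comparison described above can be illustrated with a minimal sketch: replay a task trace on a fixed pool of machines and derive makespan and cost. The greedy earliest-free assignment, the machine counts, and the prices below are illustrative assumptions, not the paper's simulator.

```python
import math

def simulate(trace, n_machines, speed, price_per_hour):
    """Replay a task trace (per-task runtimes on a reference machine)
    on n identical machines using greedy earliest-free assignment.
    Returns (makespan in seconds, cost with hourly billing rounded up)."""
    free_at = [0.0] * n_machines
    for runtime in trace:
        m = free_at.index(min(free_at))   # earliest-free machine
        free_at[m] += runtime / speed     # speed < 1 models a slower cloud VM
    makespan = max(free_at)
    cost = n_machines * math.ceil(makespan / 3600) * price_per_hour
    return makespan, cost

# Eight one-hour tasks on four cloud VMs running at half reference speed:
makespan, cost = simulate([3600] * 8, n_machines=4, speed=0.5,
                          price_per_hour=0.10)
```

Varying `speed` across an order of magnitude makes the abstract's performance-gap argument concrete: a tenfold slower VM pool inflates both makespan and the hourly bill proportionally.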
Grid Computing | 2009
Radu Prodan; Simon Ostermann
With an increasing number of providers claiming to offer Cloud infrastructures, the community lacks a common terminology, accompanied by a clear definition and classification of Cloud features. In this paper we survey a selection of Cloud providers and propose a taxonomy of important Cloud computing elements covering service type, resource deployment, hardware, runtime tuning, business model, middleware, and performance. We conclude that the provisioning of Service Level Agreements as utilities, of open and interoperable middleware solutions, and of sustained performance metrics for high-performance computing applications are the three elements most in need of further community research.
Cluster Computing and the Grid | 2009
Nezih Yigitbasi; Alexandru Iosup; Dick H. J. Epema; Simon Ostermann
Cloud computing has emerged as a new technology that provides large amounts of computing and data storage capacity to its users, with a promise of increased scalability, high availability, and reduced administration and maintenance costs. As the use of cloud computing environments increases, it becomes crucial to understand their performance. It is therefore important to assess the performance of computing clouds in terms of various metrics, such as the overhead of acquiring and releasing virtual computing resources and other virtualization and network communication overheads. To address these issues, we have designed and implemented C-Meter, a portable, extensible, and easy-to-use framework for generating and submitting test workloads to computing clouds. In this paper, we first state the requirements for frameworks that assess the performance of computing clouds. Then, we present the architecture of the C-Meter framework and discuss several cloud resource management alternatives. Finally, we present our early experiences with C-Meter in Amazon EC2. We show how C-Meter can be used for assessing the overhead of acquiring and releasing virtual computing resources, for comparing different configurations, and for evaluating different scheduling algorithms.
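The acquisition/release overhead metric at the heart of this abstract can be sketched generically: time the provider calls around a resource's lifecycle. The stubbed `fake_acquire`/`fake_release` provider below is an assumption standing in for a real cloud API; it is not C-Meter's actual interface.

```python
import time

def measure_overhead(acquire, release):
    """Time resource acquisition and release separately, in the spirit
    of C-Meter's overhead metrics. The provider functions are pluggable;
    a real deployment would call a cloud API here."""
    t0 = time.perf_counter()
    handle = acquire()
    t_acquire = time.perf_counter() - t0

    t0 = time.perf_counter()
    release(handle)
    t_release = time.perf_counter() - t0
    return t_acquire, t_release

# Stub provider simulating VM boot and teardown latency:
def fake_acquire():
    time.sleep(0.01)          # pretend boot latency
    return "vm-1"

def fake_release(handle):
    time.sleep(0.005)         # pretend teardown latency

acq, rel = measure_overhead(fake_acquire, fake_release)
```

Because the provider is injected, the same harness can compare different instance configurations by swapping the acquire/release pair, which mirrors the comparison use case the abstract mentions.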
European Conference on Parallel Processing | 2010
Simon Ostermann; Kassian Plankensteiner; Radu Prodan; Thomas Fahringer
We present GroudSim, a Grid and Cloud simulation toolkit for scientific applications based on a scalable, simulation-independent discrete-event core. GroudSim provides a comprehensive set of features for complex simulation scenarios, from simple job executions on leased computing resources to cost calculation and background load on resources. Simulations can be parameterised and are easily extendable by probability distribution packages for the failures that normally occur in complex environments. Experimental results demonstrate the improved scalability of GroudSim compared to a related process-based approach.
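The discrete-event core this abstract refers to can be sketched in a few lines: a time-ordered event queue popped in order, with simulated time jumping between events (which is what gives event-based simulators their scalability edge over process-based ones). The class and method names below are illustrative, not GroudSim's API.

```python
import heapq

class DiscreteEventSim:
    """Minimal discrete-event core: events ordered by timestamp in a heap."""
    def __init__(self):
        self._queue = []   # entries: (event time, sequence number, callback)
        self._seq = 0      # tie-breaker for events at the same time
        self.now = 0.0     # current simulated time

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            # Jump straight to the next event; no time passes in between.
            self.now, _, callback = heapq.heappop(self._queue)
            callback()

# Example: a job on a leased resource finishing after 5 simulated time units
sim = DiscreteEventSim()
log = []
def job_done():
    log.append(("job finished", sim.now))
sim.schedule(5.0, job_done)
sim.run()
```

Failure models of the kind the abstract mentions would plug in by scheduling failure events whose delays are drawn from a probability distribution.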
Grid Computing | 2009
Simon Ostermann; Radu Prodan; Thomas Fahringer
From its start on supercomputers, scientific computing has constantly evolved to the next levels, such as cluster computing, meta-computing, and computational Grids. Today, Cloud Computing is emerging as the paradigm for the next generation of large-scale scientific computing, eliminating the need for hosting expensive computing hardware. Scientists still have their Grid environments in place and can benefit from extending them with leased Cloud resources whenever needed. This paradigm shift opens new problems that need to be analyzed, such as the integration of this new resource class into existing environments, the applications on these resources, and security. The virtualization overheads for deploying and starting a virtual machine image are new factors that need to be considered when choosing scheduling mechanisms. In this paper, we investigate the usability of compute Clouds to extend a Grid workflow middleware and show on a real implementation that this can speed up executions of scientific workflows.
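The scheduling consideration raised above, VM startup overhead versus Grid queue wait, can be sketched as a simple completion-time comparison. This heuristic and its parameters are illustrative assumptions, not the middleware's actual scheduler.

```python
def choose_resource(grid_wait, vm_boot, task_runtime,
                    grid_speed=1.0, cloud_speed=1.0):
    """Pick Grid or Cloud for a task by comparing estimated finish times.
    The VM boot time enters the Cloud estimate as the deployment/startup
    overhead the abstract identifies as a new scheduling factor."""
    grid_finish = grid_wait + task_runtime / grid_speed
    cloud_finish = vm_boot + task_runtime / cloud_speed
    return "cloud" if cloud_finish < grid_finish else "grid"

# Long Grid queue: paying the 2-minute VM boot is worth it.
busy = choose_resource(grid_wait=600, vm_boot=120, task_runtime=300)
# Short Grid queue: the boot overhead dominates, stay on the Grid.
idle = choose_resource(grid_wait=30, vm_boot=120, task_runtime=300)
```

A workflow scheduler would apply such a comparison per task, which is why startup overhead matters most for short tasks.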
International Conference on Parallel Processing | 2012
Simon Ostermann; Radu Prodan
We analyze the problem of provisioning Cloud instances to large scientific workflows that cannot obtain sufficient Grid resources for their computational requirements. We propose an extension to the dynamic critical path scheduling algorithm to deal with the general resource leasing model encountered in today's commercial Clouds. We analyze the availability of the cheaper but unreliable Spot instances and study their potential to compensate for the unavailability of Grid resources in large workflow executions. Experimental results demonstrate that Spot instances represent a 60% cheaper but equally reliable alternative to Standard instances, provided that a correct user bid is made.
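The Spot-versus-Standard cost trade-off can be made concrete with a toy cost model: while the user's bid stays at or above the market price, hours are billed at the spot price; outbid hours fall back to a Standard instance. This fallback policy and the prices are illustrative assumptions, not the paper's algorithm or real EC2 billing rules.

```python
def spot_run_cost(spot_prices, bid, standard_price):
    """Cost of a run over an hourly spot-price history: pay the market
    price while the bid holds, rerun outbid hours on a Standard instance
    (illustrative recovery policy)."""
    cost = 0.0
    for price in spot_prices:
        if bid >= price:
            cost += price            # spot hour billed at market price
        else:
            cost += standard_price   # outbid: hour redone on Standard
    return cost

# A bid matching the price peak keeps every hour on Spot:
prices = [0.04, 0.05, 0.03, 0.06]
spot_cost = spot_run_cost(prices, bid=0.06, standard_price=0.10)
standard_cost = 0.10 * len(prices)
savings = 1 - spot_cost / standard_cost
```

The example shows why the abstract hedges with "provided that a correct user bid is made": a bid below the peak converts cheap spot hours into full-price reruns and erodes the savings.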
IEEE Software | 2012
Radu Prodan; Michael Sperk; Simon Ostermann
An experimental approach employs the Google App Engine (GAE) for high-performance parallel computing. A generic master-slave framework enables fast prototyping and integration of parallel algorithms that are transparently scheduled and executed on the Google cloud infrastructure. Compared to Amazon Elastic Compute Cloud (EC2), GAE offers lower resource-provisioning overhead and is cheaper for jobs shorter than one hour. Experiments demonstrated good scalability of a Monte Carlo simulation algorithm. Although this approach achieved significant speedups, two main obstacles limited its performance: middleware overhead and resource quotas.
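The master-slave pattern this abstract applies to Monte Carlo simulation can be sketched with a pi estimator: the master fans out independent sampling tasks and merges the counts. The sketch runs the slaves sequentially; on GAE they would execute as parallel requests. Function names and sample sizes are illustrative, not the paper's framework.

```python
import random

def slave(n_samples, seed):
    """One slave task: count random points falling inside the unit circle.
    A per-task seed keeps the slaves' sample streams independent."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

def master(n_slaves, samples_per_slave):
    """Master: dispatch slave tasks and combine counts into a pi estimate.
    (Sequential here; a cloud deployment would run the slaves in parallel.)"""
    hits = sum(slave(samples_per_slave, seed) for seed in range(n_slaves))
    return 4.0 * hits / (n_slaves * samples_per_slave)

pi_est = master(n_slaves=8, samples_per_slave=10_000)
```

Because the slave tasks share no state, the only serial work is the master's final reduction, which is what makes such algorithms scale well on request-based platforms.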
ServiceWave '11: Proceedings of the 4th European Conference on Towards a Service-Based Internet | 2011
Simon Ostermann; Kassian Plankensteiner; Daniel Bodner; Georg Kraler; Radu Prodan
The utilisation of Grid and Cloud-based computing environments for solving scientific problems has become increasingly common practice in the last decade. To ease the use of these globally distributed resources, sophisticated middleware systems have been developed, enabling the transparent execution of applications by hiding low-level technology details from the user. The ASKALON environment is such a system, supporting the development and execution of distributed applications such as scientific workflows or parameter studies in Grid and Cloud computing environments. On the other hand, simulation is a widely accepted approach to analyse and further optimise the behaviour of software systems. Besides the advantage of enabling repeatable, deterministic evaluations, simulations are able to circumvent the difficulties of setting up and operating multi-institutional Grid systems, thus providing a lightweight simulated distributed environment on a single machine. In this paper, we present the integration of the GroudSim Grid and Cloud event-based simulator into the ASKALON environment. This enables system developers, application developers, and users to perform simulations from their accustomed environment, thereby benefiting from the combination of an established real-world platform and the advantages of simulation.
Archive | 2010
Simon Ostermann; Radu Prodan; Thomas Fahringer
From its start on supercomputers, scientific computing has constantly evolved to the next levels, such as cluster computing, meta-computing, and computational Grids. Today, Cloud Computing is emerging as the paradigm for the next generation of large-scale scientific computing, eliminating the need for hosting expensive computing hardware. Scientists still have their Grid environments in place and can benefit from extending them using leased Cloud resources whenever needed. This paradigm shift opens new problems that need to be analyzed, such as the integration of this new resource class into existing environments, the applications on these resources, and security. The virtualization overheads for deploying and starting a virtual machine image are new factors, which will need to be considered when choosing scheduling mechanisms. In this chapter, we investigate the usability of compute Clouds to extend a Grid workflow middleware and show on a real implementation that this can speed up executions of scientific workflows.
Archive | 2008
Simon Ostermann; Radu Prodan; Thomas Fahringer; Alexandru Iosup; Dick H. J. Epema
Grid computing promises to enable a reliable and easy-to-use computational infrastructure for e-Science. To materialize this promise, grids need to provide full automation from the experiment design to the final result. Often, this automation relies on the execution of workflows, that is, of jobs comprising many inter-related computing and data transfer tasks. While several grid workflow execution tools already exist, not much is known about their workloads. This lack of knowledge hampers the development of new workflow scheduling algorithms and slows the tuning of existing ones. To address this situation, we present in this work an analysis of two workflow-based workload traces from the Austrian Grid. We introduce a method for analyzing such traces, focused on the intrinsic and the environment-related characteristics of the workflows. Then, we analyze the workflows executed in the Austrian Grid over the last two years. Finally, we identify six categories of workflows based on their intrinsic workflow characteristics. We show that the six categories exhibit distinctive environment-related characteristics, and we identify the categories that are difficult to execute for common workflow schedulers.