Per-Olov Östberg
Umeå University
Publication
Featured research published by Per-Olov Östberg.
Future Generation Computer Systems | 2013
Per-Olov Östberg; Daniel Espling; Erik Elmroth
This work addresses Grid fairshare allocation policy enforcement and presents Aequus, a decentralized system for Grid-wide fairshare job prioritization. The main idea of fairshare scheduling is to prioritize users with regard to predefined resource allocation quotas. The presented system builds on three contributions: a flexible tree-based policy model that allows delegation of policy definition, a job prioritization algorithm based on local enforcement of distributed fairshare policies, and a decentralized architecture for non-intrusive integration with existing scheduling systems. The system supports organization of users in virtual organizations and divides usage policies into local and global policy components that are defined by resource owners and virtual organizations. The architecture realization is presented in detail along with an evaluation of the system behavior in an emulated environment. In the evaluation, convergence noise types (mechanisms counteracting policy allocation convergence) are characterized and quantified, and the system is demonstrated to meet scheduling objectives and perform scalably under realistic operating conditions.
Highlights:
- We propose Aequus, a system for enforcement of Grid resource capacity allocation.
- Aequus is scalable, flexible, and applicable to most Grid environments.
- Aequus is based on fairshare enactment of job prioritization.
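The general idea of fairshare prioritization over a tree-based policy, as described in the abstract above, can be illustrated with a minimal sketch. This is not the Aequus algorithm or its API: the `PolicyNode` class, the `target_shares` and `fairshare_priorities` functions, and the priority formula (target share minus observed usage share) are all assumptions chosen for illustration, as are the example names and numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PolicyNode:
    """Hypothetical policy tree node: a group (e.g. a virtual organization) or a user."""
    name: str
    share: float  # weight relative to sibling nodes
    children: List["PolicyNode"] = field(default_factory=list)


def target_shares(node: PolicyNode, fraction: float = 1.0) -> Dict[str, float]:
    """Return each leaf's target fraction of total capacity.

    A leaf's target is the product of its normalized share and the
    normalized shares of all its ancestors.
    """
    if not node.children:
        return {node.name: fraction}
    total = sum(child.share for child in node.children)
    result: Dict[str, float] = {}
    for child in node.children:
        result.update(target_shares(child, fraction * child.share / total))
    return result


def fairshare_priorities(root: PolicyNode, usage: Dict[str, float]) -> Dict[str, float]:
    """Assumed priority rule: target share minus observed usage share.

    Positive values mean the user has consumed less than the policy grants.
    """
    targets = target_shares(root)
    total_usage = sum(usage.get(user, 0.0) for user in targets)
    priorities: Dict[str, float] = {}
    for user, target in targets.items():
        used = usage.get(user, 0.0) / total_usage if total_usage > 0 else 0.0
        priorities[user] = target - used
    return priorities


# Illustrative policy: two virtual organizations with a 60/40 split.
policy = PolicyNode("root", 1.0, [
    PolicyNode("vo_a", 60.0, [PolicyNode("alice", 1.0), PolicyNode("bob", 1.0)]),
    PolicyNode("vo_b", 40.0, [PolicyNode("carol", 1.0)]),
])
observed_usage = {"alice": 50.0, "bob": 10.0, "carol": 40.0}  # e.g. core-hours

priorities = fairshare_priorities(policy, observed_usage)
ranking = sorted(priorities, key=priorities.get, reverse=True)
print(ranking)  # users furthest below their target share come first
```

In this example, bob has used far less than his target share and is ranked first, while alice has exceeded hers and is ranked last. Because each subtree only normalizes shares among its own children, the definition of a subtree could in principle be delegated to a separate party, which is the property the tree-based model in the abstract is built around.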
High Performance Distributed Computing | 2015
Gonzalo Pedro Rodrigo Álvarez; Per-Olov Östberg; Erik Elmroth; Katie Antypas; Richard A. Gerber; Lavanya Ramakrishnan
High performance computing centers have traditionally served monolithic MPI applications. However, in recent years, many of the large scientific computations have included high throughput and data-intensive jobs. HPC systems have mostly used batch queue schedulers to schedule these workloads on appropriate resources. There is a need to understand future scheduling scenarios that can support the diverse scientific workloads in HPC centers. In this paper, we analyze the workloads on two systems (Hopper, Carver) at the National Energy Research Scientific Computing (NERSC) Center. Specifically, we present a trend analysis towards understanding the evolution of the workload over the lifetime of the two systems.
IEEE International Conference on Cloud Computing Technology and Science | 2014
Per-Olov Östberg; Henning Groenda; Stefan Wesner; James Byrne; Dimitrios S. Nikolopoulos; Craig Sheridan; Jakub Krzywda; Ahmed Ali-Eldin; Johan Tordsson; Erik Elmroth; Christian Stier; Klaus Krogmann; Jörg Domaschka; Christopher B. Hauser; Peter J. Byrne; Sergej Svorobej; Barry McCollum; Zafeirios Papazachos; Darren Whigham; Stephan Ruth; Dragana Paurevic
Recent advances in hardware development coupled with the rapid adoption and broad applicability of cloud computing have introduced widespread heterogeneity in data centers, significantly complicating the management of cloud applications and data center resources. This paper presents the CACTOS approach to cloud infrastructure automation and optimization, which addresses heterogeneity through a combination of in-depth analysis of application behavior with insights from commercial cloud providers. The aim of the approach is threefold: to model applications and data center resources, to simulate applications and resources for planning and operation, and to optimize application deployment and resource use in an autonomic manner. The approach is based on case studies from the areas of business analytics, enterprise applications, and scientific computing.
Future Generation Computer Systems | 2013
Per-Olov Östberg; Erik Elmroth
There exist a number of grid infrastructures in production use for a wide range of scientific applications. However, due to the complexities inherent to construction of distributed computing environments, many grid tools and applications remain tied to specific grids and grid middlewares. In this work we investigate best practices for grid software design and development, and propose a composable, loosely coupled Service-Oriented Architecture for grid job management. The architecture is designed for use in federated grid environments and defines a model for transparent grid access that aims to decouple grid applications from grid middlewares and facilitate concurrent use of multiple grid middlewares. The architecture model is discussed from the point of view of an ecosystem of grid infrastructure components, and is presented along with a proof-of-concept implementation of the architecture.
Highlights:
- We propose a layered service-oriented architecture for grid job management.
- The architecture is designed as a loosely coupled framework of web services.
- The framework provides concurrent transparent access to multiple grid middlewares.
- The architecture decouples grid applications from middlewares and infrastructures.
Cluster Computing and the Grid | 2012
Per-Olov Östberg; Andreas Hellander; Brian Drawert; Erik Elmroth; Sverker Holmgren; Linda R. Petzold
In this paper we address reduction of complexity in management of scientific computations in distributed computing environments. We explore an approach based on separation of computation design (application development) and distributed execution of computations, and investigate best practices for construction of virtual infrastructures for computational science - software systems that abstract and virtualize the processes of managing scientific computations on heterogeneous distributed resource systems. As a result we present StratUm, a toolkit for management of eScience computations. To illustrate use of the toolkit, we present it in the context of a case study where we extend the capabilities of an existing kinetic Monte Carlo software framework to utilize distributed computational resources. The case study illustrates a viable design pattern for construction of virtual infrastructures for distributed scientific computing. The resulting infrastructure is evaluated using a computational experiment from molecular systems biology.
Journal of Parallel and Distributed Computing | 2018
Gonzalo P. Rodrigo; Per-Olov Östberg; Erik Elmroth; Katie Antypas; Richard A. Gerber; Lavanya Ramakrishnan
The high performance computing (HPC) scheduling landscape currently faces new challenges due to changes in the workload. Previously, HPC centers were dominated by tightly coupled MPI jobs. HPC work ...
International Conference on Cloud Computing and Services Science | 2017
James Byrne; Sergej Svorobej; Konstantinos M. Giannoutakis; Dimitrios Tzovaras; Peter J. Byrne; Per-Olov Östberg; Anna Gourinovitch; Theo Lynn
Recent years have seen an increasing trend towards the development of Discrete Event Simulation (DES) platforms to support cloud computing related decision making and research. The complexity of cl ...
Grid Computing | 2011
Luis Tomás; Per-Olov Östberg; Blanca Caminero; Carmen Carrión; Erik Elmroth
Grids are highly variable heterogeneous systems where resources may span multiple administrative domains and utilize heterogeneous schedulers, which complicates enforcement of end-user resource utilization quotas. This work focuses on enhancement of resource utilization quality of service through combination of two systems: a predictive meta-scheduling framework and a distributed fairshare job prioritization system. The first, SA-Layer, is a system designed to provide scheduling of jobs in advance by ensuring resource availability for future job executions. The second, FSGrid, provides an efficient mechanism for fairshare-based job prioritization. The integrated architecture presented in this work combines the strengths of both systems and improves perceived end-user quality of service by providing reliable resource allocations adhering to usage allocation policies.
Grid Computing | 2011
Per-Olov Östberg; Erik Elmroth
Grid computing applications and infrastructures build heavily on Service-Oriented Computing development methodology and are often realized as Service-Oriented Architectures. The Grid Job Management Framework (GJMF) is a flexible Grid infrastructure and application support tool that offers a range of abstractive and platform independent interfaces for middleware-agnostic Grid job submission, monitoring, and control. In this paper we use the GJMF as a test bed for characterization of Grid Service-Oriented Architecture overhead, and evaluate the efficiency of a set of design patterns for overhead mediation mechanisms featured in the framework.
Job Scheduling Strategies for Parallel Processing | 2017
Gonzalo P. Rodrigo; Erik Elmroth; Per-Olov Östberg; Lavanya Ramakrishnan
High-throughput and data-intensive applications are increasingly present, often composed as workflows, in the workloads of current HPC systems. At the same time, trends for future HPC systems point towards more heterogeneous systems with deeper I/O and memory hierarchies. However, current HPC schedulers are designed to support classical large tightly coupled parallel jobs over homogeneous systems. Therefore, there is an urgent need to investigate new scheduling algorithms that can manage the future workloads on HPC systems. However, there is a lack of appropriate models and frameworks to enable development, testing, and validation of new scheduling ideas.