Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Per-Olov Östberg is active.

Publication


Featured research published by Per-Olov Östberg.


Future Generation Computer Systems | 2013

Decentralized scalable fairshare scheduling

Per-Olov Östberg; Daniel Espling; Erik Elmroth

This work addresses Grid fairshare allocation policy enforcement and presents Aequus, a decentralized system for Grid-wide fairshare job prioritization. The main idea of fairshare scheduling is to prioritize users with regard to predefined resource allocation quotas. The presented system builds on three contributions: a flexible tree-based policy model that allows delegation of policy definition, a job prioritization algorithm based on local enforcement of distributed fairshare policies, and a decentralized architecture for non-intrusive integration with existing scheduling systems. The system supports organization of users in virtual organizations and divides usage policies into local and global policy components that are defined by resource owners and virtual organizations. The architecture realization is presented in detail along with an evaluation of the system behavior in an emulated environment. In the evaluation, convergence noise types (mechanisms counteracting policy allocation convergence) are characterized and quantified, and the system is demonstrated to meet scheduling objectives and perform scalably under realistic operating conditions.

Highlights:
- We propose Aequus, a system for enforcement of Grid resource capacity allocation.
- Aequus is scalable, flexible, and applicable to most Grid environments.
- Aequus is based on fairshare enactment of job prioritization.
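
The core fairshare idea, comparing each user's accumulated usage against a policy-defined target share and ordering jobs of under-served users first, can be illustrated with a minimal sketch. This is only an illustration of the general principle; the tree-based policy model, usage decay, and the actual Aequus prioritization algorithm are not shown, and all names below are invented.

```python
# Minimal fairshare prioritization sketch (illustrative, not the Aequus algorithm).
# Each user has a target share (allocation quota) and an accumulated usage;
# jobs of users who have consumed less than their target share are ordered first.

from dataclasses import dataclass

@dataclass
class User:
    name: str
    target_share: float  # fraction of capacity granted by policy, sums to 1.0
    usage: float         # accumulated resource usage (e.g., core-hours)

@dataclass
class Job:
    job_id: int
    user: User

def fairshare_priority(user: User, total_usage: float) -> float:
    """Higher value means more under-served relative to the policy target."""
    if total_usage == 0:
        return user.target_share
    actual_share = user.usage / total_usage
    return user.target_share - actual_share  # positive if below quota

def prioritize(jobs: list[Job]) -> list[Job]:
    users = {j.user.name: j.user for j in jobs}
    total_usage = sum(u.usage for u in users.values())
    return sorted(jobs, key=lambda j: fairshare_priority(j.user, total_usage), reverse=True)

if __name__ == "__main__":
    alice = User("alice", target_share=0.6, usage=100.0)
    bob = User("bob", target_share=0.4, usage=300.0)
    queue = [Job(1, alice), Job(2, bob), Job(3, alice)]
    for job in prioritize(queue):
        print(job.job_id, job.user.name)  # alice's jobs come first: she is under quota
```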


High Performance Distributed Computing | 2015

HPC System Lifetime Story: Workload Characterization and Evolutionary Analyses on NERSC Systems

Gonzalo Pedro Rodrigo Álvarez; Per-Olov Östberg; Erik Elmroth; Katie Antypas; Richard A. Gerber; Lavanya Ramakrishnan

High performance computing centers have traditionally served monolithic MPI applications. However, in recent years, many of the large scientific computations have included high throughput and data-intensive jobs. HPC systems have mostly used batch queue schedulers to schedule these workloads on appropriate resources. There is a need to understand future scheduling scenarios that can support the diverse scientific workloads in HPC centers. In this paper, we analyze the workloads on two systems (Hopper, Carver) at the National Energy Research Scientific Computing (NERSC) Center. Specifically, we present a trend analysis towards understanding the evolution of the workload over the lifetime of the two systems.
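
A trend analysis of this kind typically starts from the scheduler's job trace and aggregates per-year statistics such as job count, job size, and runtime. The sketch below shows that idea under an assumed trace schema; the field names and the tiny in-memory trace are placeholders, not the NERSC data or tooling used in the paper.

```python
# Illustrative workload trend sketch: aggregate job-trace records per year.
# Field names (submit_time, cores, runtime_s) are assumptions, not the paper's schema.

import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import median

def yearly_trends(trace_file) -> dict[int, dict[str, float]]:
    """Aggregate per-year job counts and median job size/runtime from a CSV trace."""
    per_year = defaultdict(list)
    for row in csv.DictReader(trace_file):
        year = datetime.fromtimestamp(int(row["submit_time"])).year
        per_year[year].append((int(row["cores"]), float(row["runtime_s"])))
    return {
        year: {
            "jobs": len(records),
            "median_cores": median(c for c, _ in records),
            "median_runtime_h": median(r for _, r in records) / 3600.0,
        }
        for year, records in per_year.items()
    }

if __name__ == "__main__":
    # Tiny in-memory trace standing in for a real scheduler log.
    trace = io.StringIO(
        "submit_time,cores,runtime_s\n"
        "1275000000,24,3600\n"   # mid-2010
        "1280000000,48,7200\n"   # mid-2010
        "1340000000,96,1800\n"   # mid-2012
    )
    for year, stats in sorted(yearly_trends(trace).items()):
        print(year, stats)
```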


IEEE International Conference on Cloud Computing Technology and Science | 2014

The CACTOS Vision of Context-Aware Cloud Topology Optimization and Simulation

Per-Olov Östberg; Henning Groenda; Stefan Wesner; James Byrne; Dimitrios S. Nikolopoulos; Craig Sheridan; Jakub Krzywda; Ahmed Ali-Eldin; Johan Tordsson; Erik Elmroth; Christian Stier; Klaus Krogmann; Jörg Domaschka; Christopher B. Hauser; Peter J. Byrne; Sergej Svorobej; Barry McCollum; Zafeirios Papazachos; Darren Whigham; Stephan Ruth; Dragana Paurevic

Recent advances in hardware development coupled with the rapid adoption and broad applicability of cloud computing have introduced widespread heterogeneity in data centers, significantly complicating the management of cloud applications and data center resources. This paper presents the CACTOS approach to cloud infrastructure automation and optimization, which addresses heterogeneity through a combination of in-depth analysis of application behavior with insights from commercial cloud providers. The aim of the approach is threefold: to model applications and data center resources, to simulate applications and resources for planning and operation, and to optimize application deployment and resource use in an autonomic manner. The approach is based on case studies from the areas of business analytics, enterprise applications, and scientific computing.
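
The threefold aim (model, simulate, optimize) maps naturally onto an autonomic monitor-analyze-plan-execute control loop. The sketch below only illustrates such a loop in the abstract; it is not CACTOS code, and every node name, VM name, and threshold in it is invented.

```python
# Schematic autonomic optimization loop (illustration only; not the CACTOS toolkit).

import time

def monitor() -> dict:
    """Collect current topology and load data (stubbed)."""
    return {"nodes": {"n1": 0.9, "n2": 0.2}, "vms": {"vm1": "n1", "vm2": "n1"}}

def analyze(model: dict) -> list[str]:
    """Flag overloaded nodes, here: utilization above 80%."""
    return [node for node, util in model["nodes"].items() if util > 0.8]

def plan(model: dict, overloaded: list[str]) -> list[tuple[str, str]]:
    """Plan one VM migration per cycle from an overloaded node to the least-loaded node."""
    if not overloaded:
        return []
    target = min(model["nodes"], key=model["nodes"].get)
    return [(vm, target) for vm, host in model["vms"].items()
            if host in overloaded and host != target][:1]

def execute(migrations: list[tuple[str, str]]) -> None:
    for vm, target in migrations:
        print(f"migrating {vm} -> {target}")  # a real system would call the cloud API here

def control_loop(cycles: int = 3, period_s: float = 0.1) -> None:
    for _ in range(cycles):
        model = monitor()
        execute(plan(model, analyze(model)))
        time.sleep(period_s)

if __name__ == "__main__":
    control_loop()
```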


Future Generation Computer Systems | 2013

GJMF - a composable service-oriented grid job management framework

Per-Olov Östberg; Erik Elmroth

A number of grid infrastructures are in production use for a wide range of scientific applications. However, due to the complexities inherent to construction of distributed computing environments, many grid tools and applications remain tied to specific grids and grid middlewares. In this work we investigate best practices for grid software design and development, and propose a composable, loosely coupled Service-Oriented Architecture for grid job management. The architecture is designed for use in federated grid environments and defines a model for transparent grid access that aims to decouple grid applications from grid middlewares and facilitate concurrent use of multiple grid middlewares. The architecture model is discussed from the point of view of an ecosystem of grid infrastructure components, and is presented along with a proof-of-concept implementation of the architecture.

Highlights:
- We propose a layered service-oriented architecture for grid job management.
- The architecture is designed as a loosely coupled framework of web services.
- The framework provides concurrent transparent access to multiple grid middlewares.
- The architecture decouples grid applications from middlewares and infrastructures.
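
The decoupling idea, exposing one job-management interface to applications and hiding middleware specifics behind interchangeable adapters, can be sketched as follows. The adapter names and methods are illustrative only; GJMF itself is a framework of web services, not a Python library.

```python
# Illustrative middleware-agnostic job submission layer (not the GJMF API).

from abc import ABC, abstractmethod

class JobDescription:
    def __init__(self, executable: str, args: list[str]):
        self.executable = executable
        self.args = args

class MiddlewareAdapter(ABC):
    """One adapter per grid middleware; applications only see this interface."""

    @abstractmethod
    def submit(self, job: JobDescription) -> str: ...

    @abstractmethod
    def status(self, job_id: str) -> str: ...

class FakeMiddlewareA(MiddlewareAdapter):
    def submit(self, job: JobDescription) -> str:
        return f"mwA-{id(job)}"   # would translate the job and call middleware A here

    def status(self, job_id: str) -> str:
        return "RUNNING"

class FakeMiddlewareB(MiddlewareAdapter):
    def submit(self, job: JobDescription) -> str:
        return f"mwB-{id(job)}"   # would translate the job and call middleware B here

    def status(self, job_id: str) -> str:
        return "QUEUED"

class JobManager:
    """Tries middlewares in order, so applications stay middleware-agnostic."""

    def __init__(self, adapters: list[MiddlewareAdapter]):
        self.adapters = adapters

    def submit(self, job: JobDescription) -> tuple[MiddlewareAdapter, str]:
        for adapter in self.adapters:
            try:
                return adapter, adapter.submit(job)
            except RuntimeError:
                continue          # fall through to the next middleware
        raise RuntimeError("no middleware accepted the job")

if __name__ == "__main__":
    manager = JobManager([FakeMiddlewareA(), FakeMiddlewareB()])
    adapter, job_id = manager.submit(JobDescription("/bin/echo", ["hello"]))
    print(type(adapter).__name__, job_id, adapter.status(job_id))
```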


Cluster Computing and the Grid | 2012

Reducing Complexity in Management of eScience Computations

Per-Olov Östberg; Andreas Hellander; Brian Drawert; Erik Elmroth; Sverker Holmgren; Linda R. Petzold

In this paper we address reduction of complexity in management of scientific computations in distributed computing environments. We explore an approach based on separation of computation design (application development) and distributed execution of computations, and investigate best practices for construction of virtual infrastructures for computational science - software systems that abstract and virtualize the processes of managing scientific computations on heterogeneous distributed resource systems. As a result we present StratUm, a toolkit for management of eScience computations. To illustrate use of the toolkit, we present it in the context of a case study where we extend the capabilities of an existing kinetic Monte Carlo software framework to utilize distributed computational resources. The case study illustrates a viable design pattern for construction of virtual infrastructures for distributed scientific computing. The resulting infrastructure is evaluated using a computational experiment from molecular systems biology.
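
The separation the paper advocates, describing a computation declaratively and handing it to an interchangeable execution backend, can be hinted at with a small sketch. The parameter-sweep example and every name in it are hypothetical and do not come from StratUm.

```python
# Illustrative separation of computation design from execution (not StratUm code).

from dataclasses import dataclass
from typing import Callable
import random

@dataclass(frozen=True)
class Task:
    """Declarative description of one unit of computation."""
    name: str
    seed: int
    steps: int

def design_sweep(replicas: int, steps: int) -> list[Task]:
    """Application side: describe the computation, say nothing about where it runs."""
    return [Task(f"replica-{i}", seed=i, steps=steps) for i in range(replicas)]

def run_locally(task: Task) -> float:
    """Execution side: one interchangeable backend (local, grid, cloud, ...)."""
    rng = random.Random(task.seed)
    return sum(rng.random() for _ in range(task.steps)) / task.steps

def execute(tasks: list[Task], backend: Callable[[Task], float]) -> dict[str, float]:
    return {t.name: backend(t) for t in tasks}

if __name__ == "__main__":
    results = execute(design_sweep(replicas=4, steps=1000), backend=run_locally)
    for name, value in results.items():
        print(name, round(value, 3))
```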


Journal of Parallel and Distributed Computing | 2018

Towards understanding HPC users and systems: A NERSC case study

Gonzalo P. Rodrigo; Per-Olov Östberg; Erik Elmroth; Katie Antypas; Richard A. Gerber; Lavanya Ramakrishnan

The high performance computing (HPC) scheduling landscape currently faces new challenges due to changes in the workload. Previously, HPC centers were dominated by tightly coupled MPI jobs. HPC work ...


International Conference on Cloud Computing and Services Science | 2017

A review of cloud computing simulation platforms and related environments

James Byrne; Sergej Svorobej; Konstantinos M. Giannoutakis; Dimitrios Tzovaras; Peter J. Byrne; Per-Olov Östberg; Anna Gourinovitch; Theo Lynn

Recent years have seen an increasing trend towards the development of Discrete Event Simulation (DES) platforms to support cloud computing related decision making and research. The complexity of cl ...


Grid Computing | 2011

An Adaptable In-advance and Fairshare Meta-scheduling Architecture to Improve Grid QoS

Luis Tomás; Per-Olov Östberg; Blanca Caminero; Carmen Carrión; Erik Elmroth

Grids are highly variable heterogeneous systems where resources may span multiple administrative domains and utilize heterogeneous schedulers, which complicates enforcement of end-user resource utilization quotas. This work focuses on enhancement of resource utilization quality of service through the combination of two systems: a predictive meta-scheduling framework and a distributed fairshare job prioritization system. The first, SA-Layer, is a system designed to provide scheduling of jobs in advance by ensuring resource availability for future job executions. The second, FSGrid, provides an efficient mechanism for fairshare-based job prioritization. The integrated architecture presented in this work combines the strengths of both systems and improves perceived end-user quality of service by providing reliable resource allocations adhering to usage allocation policies.
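
One way the two systems can complement each other is to first filter jobs by predicted future resource availability and then order the feasible jobs by fairshare priority. The sketch below is a strongly simplified schematic of that combination; SA-Layer and FSGrid appear in the paper, but the helper functions and numbers here are invented stand-ins, not their implementations.

```python
# Schematic combination of in-advance availability checks and fairshare ordering
# (simplified illustration; not the SA-Layer/FSGrid implementation).

from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    user: str
    cores: int
    start_slot: int          # requested future time slot

def predicted_free_cores(slot: int) -> int:
    """Stand-in for an SA-Layer-style availability prediction."""
    return 64 if slot % 2 == 0 else 16

def fairshare_priority(user: str) -> float:
    """Stand-in for an FSGrid-style priority: target share minus consumed share."""
    quotas = {"alice": 0.6, "bob": 0.4}
    consumed = {"alice": 0.2, "bob": 0.5}
    return quotas.get(user, 0.0) - consumed.get(user, 0.0)

def schedule(jobs: list[Job]) -> list[Job]:
    feasible = [j for j in jobs if j.cores <= predicted_free_cores(j.start_slot)]
    return sorted(feasible, key=lambda j: fairshare_priority(j.user), reverse=True)

if __name__ == "__main__":
    queue = [Job(1, "bob", 32, 2), Job(2, "alice", 8, 1), Job(3, "alice", 48, 3)]
    print([j.job_id for j in schedule(queue)])  # job 3 is infeasible; alice's job goes first
```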


Grid Computing | 2011

Mediation of Service Overhead in Service-Oriented Grid Architectures

Per-Olov Östberg; Erik Elmroth

Grid computing applications and infrastructures build heavily on Service-Oriented Computing development methodology and are often realized as Service-Oriented Architectures. The Grid Job Management Framework (GJMF) is a flexible Grid infrastructure and application support tool that offers a range of abstractive and platform independent interfaces for middleware-agnostic Grid job submission, monitoring, and control. In this paper we use the GJMF as a test bed for characterization of Grid Service-Oriented Architecture overhead, and evaluate the efficiency of a set of design patterns for overhead mediation mechanisms featured in the framework.
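
Typical mediation mechanisms for service-call overhead include caching recent responses and batching many small requests into one call. The sketch below shows a simple status cache as a generic example of this pattern family; it is not one of the specific GJMF mechanisms evaluated in the paper.

```python
# Illustrative service-call overhead mediation via response caching
# (a generic pattern sketch, not the GJMF mechanisms evaluated in the paper).

import time

class CachedStatusClient:
    """Wraps a remote status service and reuses responses for a short TTL."""

    def __init__(self, remote_call, ttl_s: float = 5.0):
        self.remote_call = remote_call     # e.g., a SOAP/REST status operation
        self.ttl_s = ttl_s
        self._cache: dict[str, tuple[float, str]] = {}
        self.remote_calls = 0

    def status(self, job_id: str) -> str:
        now = time.monotonic()
        cached = self._cache.get(job_id)
        if cached and now - cached[0] < self.ttl_s:
            return cached[1]               # served from cache, no service call
        self.remote_calls += 1
        value = self.remote_call(job_id)
        self._cache[job_id] = (now, value)
        return value

if __name__ == "__main__":
    client = CachedStatusClient(remote_call=lambda job_id: "RUNNING")
    for _ in range(100):
        client.status("job-42")
    print("remote calls made:", client.remote_calls)   # 1 instead of 100
```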


Job Scheduling Strategies for Parallel Processing | 2017

ScSF: a scheduling simulation framework

Gonzalo P. Rodrigo; Erik Elmroth; Per-Olov Östberg; Lavanya Ramakrishnan

High-throughput and data-intensive applications are increasingly present, often composed as workflows, in the workloads of current HPC systems. At the same time, trends for future HPC systems point towards more heterogeneous systems with deeper I/O and memory hierarchies. However, current HPC schedulers are designed to support classical large tightly coupled parallel jobs over homogeneous systems. Therefore, there is an urgent need to investigate new scheduling algorithms that can manage the future workloads on HPC systems. However, there is a lack of appropriate models and frameworks to enable development, testing, and validation of new scheduling ideas.
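
At its core, a scheduling simulation framework has to combine a workload, a scheduler under test, and a simulated resource evolving over time. The minimal first-come-first-served sketch below only illustrates that combination; it is far simpler than ScSF, and the job fields and parameters are made up.

```python
# Minimal first-come-first-served scheduling simulation (illustrative only,
# far simpler than ScSF; job fields and parameters are made up).

import heapq
from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    submit: float      # submission time (s)
    cores: int
    runtime: float     # runtime (s)

def simulate_fcfs(jobs: list[Job], total_cores: int) -> dict[int, float]:
    """Return per-job wait times under FCFS on a single homogeneous partition."""
    jobs = sorted(jobs, key=lambda j: j.submit)
    running: list[tuple[float, int]] = []   # heap of (end_time, cores)
    free, clock, waits = total_cores, 0.0, {}
    for job in jobs:
        clock = max(clock, job.submit)
        while free < job.cores:             # advance time until enough cores are free
            end_time, cores = heapq.heappop(running)
            clock = max(clock, end_time)
            free += cores
        waits[job.job_id] = clock - job.submit
        heapq.heappush(running, (clock + job.runtime, job.cores))
        free -= job.cores
    return waits

if __name__ == "__main__":
    workload = [Job(1, 0, 64, 3600), Job(2, 10, 64, 600), Job(3, 20, 32, 300)]
    print(simulate_fcfs(workload, total_cores=96))
```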

Collaboration


Dive into Per-Olov Östberg's collaborations.

Top Co-Authors

- Henning Groenda (Forschungszentrum Informatik)
- James Byrne (Dublin City University)
- Christian Stier (Center for Information Technology)
- Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory)