Dariusz Król
AGH University of Science and Technology
Publications
Featured research published by Dariusz Król.
Computer-Based Medical Systems | 2008
Marian Bubak; Tomasz Gubała; Maciej Malawski; Bartosz Balis; Wlodzimierz Funika; Tomasz Bartyński; Eryk Ciepiela; Daniel Harezlak; Marek Kasztelnik; Joanna Kocot; Dariusz Król; Piotr Nowakowski; Michal Pelczar; Jakub Wach; Matthias Assel; Alfredo Tirado-Ramos
The ViroLab Virtual Laboratory is a collaborative platform for scientists representing multiple fields of expertise while working together on common scientific goals. This environment makes it possible to combine the efforts of computer scientists, virology and epidemiology experts and experienced physicians to support future advances in HIV-related research and treatment. The paper explains the challenges involved in building a modern, inter-organizational platform to support science and gives an overview of solutions to these challenges. Examples of real-world problems addressed in the presented environment are also described to demonstrate the feasibility of the solution.
Future Generation Computer Systems | 2016
Dariusz Król; Jacek Kitowski
Software maintenance is one of the major concerns in service-oriented ecosystems, and its importance is ever-increasing. In many cases, the cost of software maintenance is higher than the cost of software development. In particular, long-lasting services, which operate in a dynamically changing environment, require continuous management and administration. One of the important administration actions is scaling management: the problem lies in responding to workload changes of the hosted services as quickly as possible. This is especially important in regard to (but not limited to) cloud environments, where unnecessary resource usage leads to unnecessary costs. In this paper, we introduce self-scalable services and scaling rules, which are intended to support the development of self-scalable systems based on Service Oriented Architecture. We propose a design of a self-scalable service based on well-known software development practices, along with a definition of scaling rules, which express the scaling policy for the service. Both concepts were evaluated in the context of a massively scalable platform for data farming. The evaluation demonstrates the advantages of using the proposed concepts to manage the platform in comparison with traditional management strategies based on provisioning for peak load.
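To make the idea of scaling rules concrete, here is a minimal, hypothetical sketch of how such a rule could be expressed in code; the names, fields and policy below are illustrative assumptions, not the interface defined in the paper.

```python
# Hypothetical sketch of a scaling rule; not the actual API from the paper.
import operator
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ScalingRule:
    """Maps an observed metric to a scaling decision for one service."""
    metric: str                                  # e.g. "avg_response_time_ms" (assumed name)
    threshold: float                             # boundary that triggers the rule
    comparator: Callable[[float, float], bool]   # how the measurement relates to the threshold
    action: str                                  # "scale_out" or "scale_in"

    def evaluate(self, measurement: float) -> Optional[str]:
        return self.action if self.comparator(measurement, self.threshold) else None

# Example policy: add an instance above 500 ms latency, remove one below 100 ms.
rules = [
    ScalingRule("avg_response_time_ms", 500.0, operator.gt, "scale_out"),
    ScalingRule("avg_response_time_ms", 100.0, operator.lt, "scale_in"),
]

for rule in rules:
    decision = rule.evaluate(measurement=620.0)
    if decision:
        print(f"{rule.metric}: {decision}")      # -> avg_response_time_ms: scale_out
```

A monitoring loop inside the service would evaluate such rules periodically, which is what makes the service self-scalable rather than externally managed.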
International Conference on Cluster Computing | 2015
Gideon Juve; Benjamín Tovar; Rafael Ferreira da Silva; Dariusz Król; Douglas Thain; Ewa Deelman; William E. Allcock; Miron Livny
Robust high-throughput computing requires effective monitoring and enforcement of a variety of resources including CPU cores, memory, disk, and network traffic. Without effective monitoring and enforcement, it is easy to overload machines, causing failures and slowdowns, or to underutilize machines, which results in wasted opportunities. This paper explores how to describe, measure, and enforce the resources used by computational tasks. We focus on tasks running in distributed execution systems, in which a task requests the resources it needs and the execution system ensures the availability of such resources. This presents two non-trivial problems: how to measure the resources consumed by a task, and how to monitor and report resource exhaustion in a robust and timely manner. For both of these problems, operating systems have a variety of mechanisms with different degrees of availability, accuracy, overhead, and intrusiveness. We describe various forms of monitoring and the available mechanisms in contemporary operating systems. We then present two specific monitoring tools that choose different tradeoffs in overhead and accuracy, and evaluate them on a selection of benchmarks.
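To illustrate the overhead/accuracy trade-off of polling-based monitoring described above, here is a minimal, Linux-only sketch that measures and enforces a child task's memory use; it is a stand-in written for this summary, not one of the two tools presented in the paper.

```python
# Polling-based task monitoring sketch (Linux only); illustrative, not the paper's tool.
import os, time, resource, subprocess

def run_and_monitor(cmd, mem_limit_mb=1024, poll_s=0.5):
    """Run a task, poll its resident memory, kill it if the limit is exceeded."""
    proc = subprocess.Popen(cmd)
    page_mb = os.sysconf("SC_PAGE_SIZE") / 2**20
    peak_rss_mb = 0.0
    while proc.poll() is None:
        try:
            with open(f"/proc/{proc.pid}/statm") as f:
                rss_mb = int(f.read().split()[1]) * page_mb   # field 2: resident pages
        except FileNotFoundError:        # task exited between poll() and open()
            break
        peak_rss_mb = max(peak_rss_mb, rss_mb)
        if rss_mb > mem_limit_mb:        # enforcement: stop the task on exhaustion
            proc.kill()
            raise MemoryError(f"task exceeded {mem_limit_mb} MB")
        time.sleep(poll_s)               # longer interval: less overhead, less accuracy
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {"cpu_s": usage.ru_utime + usage.ru_stime, "peak_rss_mb": peak_rss_mb}

print(run_and_monitor(["sleep", "1"]))
```

The `poll_s` parameter captures the central tension: frequent polling improves the accuracy of peak measurements and the timeliness of enforcement, at the cost of monitoring overhead and intrusiveness.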
International Journal of High Performance Computing Applications | 2017
Ewa Deelman; Christopher D. Carothers; Anirban Mandal; Brian Tierney; Jeffrey S. Vetter; Ilya Baldin; Claris Castillo; Gideon Juve; Dariusz Król; V. E. Lynch; Benjamin Mayer; Jeremy S. Meredith; Thomas Proffen; Paul Ruth; Rafael Ferreira da Silva
Computational science is well established as the third pillar of scientific discovery and is on par with experimentation and theory. However, as we move closer toward the ability to execute exascale calculations and process the ensuing extreme-scale amounts of data produced by both experiments and computations alike, the complexity of managing the compute and data analysis tasks has grown beyond the capabilities of domain scientists. Thus, workflow management systems are absolutely necessary to ensure current and future scientific discoveries. A key research question for these workflow management systems concerns the performance optimization of complex calculation and data analysis tasks. The central contribution of this article is a description of the PANORAMA approach for modeling and diagnosing the run-time performance of complex scientific workflows. This approach integrates extreme-scale systems testbed experimentation, structured analytical modeling, and parallel systems simulation into a comprehensive workflow framework called Pegasus for understanding and improving the overall performance of complex scientific workflows.
International Conference on Cloud Computing | 2014
Dariusz Król; Renata Slota; Jacek Kitowski; Lukasz Dutka; Jakub Liput
Using multiple Clouds as a single environment to conduct simulation-based virtual experiments at large scale is a challenging problem. This paper describes how this can be achieved with the Scalarm platform in the context of data farming. In particular, a use case combining a private Cloud with public, commercial Clouds is studied. We discuss the current architecture and implementation of Scalarm in terms of supporting different infrastructures, and propose how it can be extended in order to unify the usage of different Clouds. We discuss different aspects of this unification, including scheduling virtual machines, authentication, and virtual machine state monitoring. An experimental evaluation of the presented solution is conducted with a genetic algorithm solving the well-known Travelling Salesman Problem. The evaluation uses three different resource configurations: only a public Cloud, only a private Cloud, and both public and private Clouds.
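For context, the evaluation workload follows the standard genetic-algorithm pattern sketched below (selection, order crossover, swap mutation); this toy version only illustrates the kind of computation farmed out to the Clouds and is not the implementation used in the paper.

```python
# Toy genetic algorithm for the Travelling Salesman Problem; illustrative only.
import math, random

def tour_length(tour, cities):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def order_crossover(a, b):
    """Copy a slice from parent a, fill the remaining cities in b's order."""
    i, j = sorted(random.sample(range(len(a)), 2))
    child = [None] * len(a)
    child[i:j] = a[i:j]
    rest = [c for c in b if c not in child]
    child[:i], child[j:] = rest[:i], rest[i:]
    return child

def evolve(cities, pop_size=100, generations=200, mutation_rate=0.2):
    pop = [random.sample(range(len(cities)), len(cities)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, cities))   # selection: keep best half
        survivors = pop[: pop_size // 2]
        children = [order_crossover(*random.sample(survivors, 2)) for _ in survivors]
        for c in children:                               # swap mutation
            if random.random() < mutation_rate:
                x, y = random.sample(range(len(c)), 2)
                c[x], c[y] = c[y], c[x]
        pop = survivors + children
    return min(pop, key=lambda t: tour_length(t, cities))

cities = [(random.random(), random.random()) for _ in range(30)]
print(tour_length(evolve(cities), cities))
```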
Future Generation Computer Systems | 2013
Kornel Skałkowski; Renata Slota; Dariusz Król; Jacek Kitowski
In recent years, distributed environments such as grids and clouds have evolved quickly and become widely used for both business and scientific purposes. Grid environments are used for solving increasingly complex problems in order to provide more accurate and up-to-date results. However, the evolution of modern grid middleware does not follow current trends in its utilization, which often leads to problems concerning the provisioning of resources in grid environments. Many users of grid systems run into performance issues during the execution of their applications. A special class of grid applications that depend on effective provisioning of storage resources are data-intensive grid applications, i.e. applications which operate on large datasets. This paper addresses the issue of effective provisioning of storage resources for data-intensive grid applications based on a best-effort strategy. In order to satisfy applications' demands on storage resources in heterogeneous and dynamic grid environments, we propose an approach relying on a combination of a cluster file system, a dedicated storage resource monitoring service and a management layer. The paper describes how this combination of technologies solves the issue of effective storage resource provisioning in grid environments by presenting the FiVO/QStorMan toolkit, which implements the proposed approach. In order to show that the proposed approach actually reduces the execution time of data-intensive applications, an extensive evaluation of the framework is presented for motivating scenarios which cover most kinds of data-intensive applications.
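The core of best-effort matching between storage resources and application demands can be sketched in a few lines; the node attributes and selection policy below are assumptions made for illustration and do not reflect FiVO/QStorMan's actual logic.

```python
# Illustrative best-effort storage selection driven by monitoring data;
# not the actual FiVO/QStorMan implementation.
from dataclasses import dataclass

@dataclass
class StorageNode:
    name: str
    free_gb: float       # reported by the monitoring service (assumed metric)
    write_mb_s: float    # currently observed write throughput (assumed metric)

def select_node(nodes, needed_gb, min_write_mb_s):
    """Prefer the fastest node meeting the requirements; otherwise fall back
    to the node with the most free space (best effort, no hard guarantee)."""
    candidates = [n for n in nodes
                  if n.free_gb >= needed_gb and n.write_mb_s >= min_write_mb_s]
    if candidates:
        return max(candidates, key=lambda n: n.write_mb_s)
    return max(nodes, key=lambda n: n.free_gb)

nodes = [StorageNode("ost1", 900, 120), StorageNode("ost2", 50, 400)]
print(select_node(nodes, needed_gb=100, min_write_mb_s=200).name)  # -> ost1
```

Here neither node satisfies both requirements, so the best-effort fallback picks the node with the most free space rather than failing the request.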
Computer Science | 2012
Renata Slota; Dariusz Król; Kornel Skałkowski; Michal Orzechowski; Darin Nikolow; Bartosz Kryza; Michał Wrzeszcz; Jacek Kitowski
This paper describes a programming toolkit developed in the PL-Grid project, named QStorMan, which supports storage QoS provisioning for data-intensive applications in distributed environments. QStorMan exploits knowledge-oriented methods for matching storage resources to non-functional requirements, which are defined for a data-intensive application. In order to support various usage scenarios, QStorMan provides two interfaces: programming libraries and a web portal. The interfaces allow users to define the requirements either directly in an application's source code or through an intuitive graphical interface. The first way provides finer granularity, e.g., each portion of data processed by an application can define a different set of requirements. The second is aimed at supporting legacy applications whose source code cannot be modified. The toolkit has been evaluated using synthetic benchmarks and the production infrastructure of PL-Grid, in particular its storage infrastructure, which utilizes the Lustre file system.
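Below is a hypothetical sketch of what the source-code-level interface could look like, illustrating per-data-portion requirements; the function names and requirement keys are invented and do not reflect the real QStorMan library API.

```python
# Invented illustration of scoped, per-data storage requirements;
# not the real QStorMan programming library.
from contextlib import contextmanager

REQUIREMENTS = {}   # a real toolkit would track this per thread or per session

@contextmanager
def storage_requirements(**req):
    """Attach non-functional storage requirements to the enclosed I/O calls."""
    saved = dict(REQUIREMENTS)
    REQUIREMENTS.update(req)
    try:
        yield
    finally:
        REQUIREMENTS.clear()
        REQUIREMENTS.update(saved)

def write_dataset(path, data):
    # A QStorMan-like layer would resolve `path` to a storage resource
    # matching the active REQUIREMENTS before performing the write.
    print(f"writing {path} under requirements {REQUIREMENTS}")

with storage_requirements(min_write_mb_s=200, replicas=2):
    write_dataset("/lustre/exp1/out.dat", b"...")   # strict requirements
write_dataset("/lustre/exp1/log.txt", b"...")       # default policy
```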
International Parallel and Distributed Processing Symposium | 2016
Anirban Mandal; Paul Ruth; Ilya Baldin; Dariusz Król; Gideon Juve; Rajiv Mayani; Rafael Ferreira da Silva; Ewa Deelman; Jeremy S. Meredith; Jeffrey S. Vetter; V. E. Lynch; Benjamin Mayer; James Wynne; Mark P. Blanco; Christopher D. Carothers; Justin M. LaPre; Brian Tierney
Modern science is often conducted on large-scale, distributed, heterogeneous and high-performance computing infrastructures. Increasingly, the scale and complexity of both the applications and the underlying execution platforms have been growing. Scientific workflows have emerged as a flexible representation to declaratively express complex applications with data and control dependences. However, it is extremely challenging for scientists to execute their science workflows in a reliable and scalable way, due to a lack of understanding of the expected and realistic behavior of complex scientific workflows on large-scale, distributed HPC systems. This is exacerbated by failures and anomalies in large-scale systems and applications, which make detecting, analyzing and acting on anomaly events challenging. In this work, we present a prototype of an end-to-end system for modeling and diagnosing the runtime performance of complex scientific workflows. We interfaced the Pegasus workflow management system, Aspen performance modeling, monitoring and anomaly detection into an integrated framework that not only improves the understanding of complex scientific applications on large-scale, complex infrastructure, but also detects anomalies and supports adaptivity. We present a black-box modeling tool, a comprehensive online monitoring system, and anomaly detection algorithms that employ the models and monitoring data to detect anomaly events. We present an evaluation of the system with a Spallation Neutron Source workflow as a driving use case.
IEEE/ACM International Conference on Utility and Cloud Computing | 2014
Dariusz Król; Michal Orzechowski; Jacek Kitowski; Christoph Niethammer; Anthony Sulisto; Amer Wafai
With the cloud paradigm and the concept of everything as a service (XaaS), our ability to leverage the potential of distributed computing resources seems greater than ever. Data farming, in turn, is a methodology based on the idea that by repeatedly running a simulation model on a vast parameter space, enough output data can be gathered to provide meaningful insight into the relations between the model's properties and its behaviour with respect to the simulation's input parameters. In this paper, we present an extension of a data farming computing platform, named Scalarm, and its evaluation in the context of molecular dynamics (MD) simulations on heterogeneous resources such as clusters and cloud systems. As a case study, this paper demonstrates how MD simulations can easily be run with Scalarm on different infrastructures without requiring any modifications to the source code of the original MD simulation program. Finally, results from nano-droplet simulation runs are presented which show the advantages of the Scalarm platform for running MD simulations on a heterogeneous infrastructure: not only for collecting pure numeric data, but also for automated post-processing and visualization of the results.
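The data-farming pattern itself is easy to sketch: enumerate a parameter space and run one simulation per point, as below. The command line and parameter names are invented for illustration; Scalarm's contribution is scheduling such runs across heterogeneous infrastructures and post-processing the results.

```python
# Data-farming sketch: sweep a parameter space, one simulation run per point.
# A process pool stands in for Scalarm's heterogeneous resources; the MD
# command line below is invented for illustration.
import itertools, subprocess
from concurrent.futures import ProcessPoolExecutor

def run_simulation(params):
    temperature, droplet_radius = params
    cmd = ["echo", f"md_sim --temp {temperature} --radius {droplet_radius}"]
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    return params, out.strip()

space = list(itertools.product([300, 350, 400], [5.0, 10.0, 20.0]))  # K, nm

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        for params, output in pool.map(run_simulation, space):
            print(params, output)
```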
International Conference on High Performance Computing and Simulation | 2016
Prathamesh Gaikwad; Anirban Mandal; Paul Ruth; Gideon Juve; Dariusz Król; Ewa Deelman
Recent advances in cloud technologies and on-demand network circuits have created an unprecedented opportunity to enable complex scientific workflow applications to run on dynamic, networked cloud infrastructure. However, it is extremely challenging to reliably execute these workflows on distributed clouds because performance anomalies and faults are frequent in these systems. Hence, accurate, automatic, proactive, online detection of anomalies is extremely important to pinpoint the time and source of the observed anomaly and to guide the adaptation of application and infrastructure. In this work, we present an anomaly detection algorithm that uses auto-regression (AR) based statistical methods on online monitoring time-series data to detect performance anomalies when scientific workflows and applications execute on networked cloud systems. We present a thorough evaluation of our auto-regression based anomaly detection approach by injecting artificial, competing loads into the system. Results show that our AR based detection algorithm can accurately detect performance anomalies for a variety of exemplar scientific workflows and applications.
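The general AR-residual idea can be sketched as follows: fit an auto-regressive predictor on a sliding window of monitoring data and flag observations whose prediction error exceeds a few standard deviations. The window size, AR order and threshold below are assumptions for illustration; the paper's algorithm is more elaborate.

```python
# Sliding-window AR(p) anomaly detection sketch; illustrative parameters.
import numpy as np

def ar_anomalies(series, p=3, window=60, k=3.0):
    series = np.asarray(series, dtype=float)
    flags = np.zeros(len(series), dtype=bool)
    for t in range(window, len(series)):
        hist = series[t - window:t]
        # Least-squares AR fit: predict hist[i+p] from hist[i..i+p-1] plus a bias.
        X = np.column_stack([hist[i:len(hist) - p + i] for i in range(p)]
                            + [np.ones(len(hist) - p)])
        y = hist[p:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        pred = np.concatenate([series[t - p:t], [1.0]]) @ coef
        sigma = (y - X @ coef).std() or 1e-9
        flags[t] = abs(series[t] - pred) > k * sigma   # large residual -> anomaly
    return flags

# Synthetic monitoring trace with an injected spike, echoing the paper's
# competing-load experiments.
rng = np.random.default_rng(0)
trace = 10 + np.sin(np.arange(300) / 10) + rng.normal(0, 0.1, 300)
trace[200] += 5
print(np.nonzero(ar_anomalies(trace))[0])   # expected to include index 200
```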