Publication


Featured research published by Šimon Tóth.


Computer Science | 2012

Practical Experiences with Torque Meta-Scheduling in the Czech National Grid

Šimon Tóth; Miroslav Ruda

The Czech National Grid Infrastructure went through a complex transition in the last year. The production environment has been switched from a commercial batch system PBSPro, which was replaced by an open source alternative Torque batch system. This paper concentrates on two aspects of this transition. First, we will present our practical experience with Torque being used as a production ready batch system. Our modified version of Torque, with all the necessary PBSPro exclusive features re-implemented and further extended with new features like cloud-like behaviour, was deployed across the entire production environment, covering the entire Czech Republic for almost a full year. In the second part, we will present our work on meta-scheduling. This involves our work on distributed architecture and cloud-grid convergence. The distributed architecture was designed to overcome the limitations of a central server setup, which was originally used and presented stability and performance issues. While this paper does not discuss the inclusion of cloud interfaces into grids, it does present the dynamic infrastructure, which is a requirement for sharing the grid infrastructure between a batch system and a cloud gateway. We are also inviting everyone to try out our fork of the Torque batch system, which is now publicly available.


Archive | 2011

Peer-to-peer Cooperative Scheduling Architecture for National Grid Infrastructure

Luděk Matyska; Miroslav Ruda; Šimon Tóth

For some ten years, the Czech National Grid Infrastructure MetaCentrum uses a single central PBSPro installation to schedule jobs across the country. This centralized approach keeps a full track about all the clusters, providing support for jobs spanning several sites, implementation for the fair-share policy and better overall control of the grid environment. Despite a steady progress in the increased stability and resilience to intermittent very short network failures, growing number of sites and processors makes this architecture, with a single point of failure and scalability limits, obsolete. As a result, a new scheduling architecture is proposed, which relies on higher autonomy of clusters. It is based on a peer to peer network of semi-independent schedulers for each site or even cluster. Each scheduler accepts jobs for the whole infrastructure, cooperating with other schedulers on implementation of global policies like central job accounting, fair-share, or submission of jobs across several sites. The scheduling system is integrated with the Magrathea system to support scheduling of virtual clusters, including the setup of their internal network, again eventually spanning several sites. On the other hand, each scheduler is local to one of several clusters and is able to directly control and submit jobs to them even if the connection of other scheduling peers is lost. In parallel to the change of the overall architecture, the scheduling system itself is being replaced. Instead of PBSPro, chosen originally for its declared support of large scale distributed environment, the new scheduling architecture is based on the open-source Torque system. The implementation and support for the most desired properties in PBSPro and Torque are discussed and the necessary modifications to Torque to support the MetaCentrum scheduling architecture are presented, too.


Journal of Physics: Conference Series | 2015

Distributed job scheduling in MetaCentrum

Šimon Tóth; Miroslav Ruda

MetaCentrum - The Czech National Grid provides access to various resources across the Czech Republic. The utilized resource management and scheduling system is based on a heavily modified version of the Torque Batch System. This open source resource manager is maintained in a local fork and was extended to facilitate the requirements of such a large installation. This paper provides an overview of unique features deployed in MetaCentrum. Notably, we describe our distributed setup that encompasses several standalone independent servers while still maintaining full cooperative scheduling across the grid. We also present the benefits of our virtualized infrastructure that enables our schedulers to dynamically request on-demand virtual machines, that are then used to facilitate the varied requirements of users in our system, as well as enabling support for user requested virtual clusters that can be further interconnected using a private VLAN.


Archive | 2011

Towards Peer-to-Peer Scheduling Architecture for the Czech National Grid

Šimon Tóth; Miroslav Ruda; Luděk Matyska


Archive | 2013

Tools and Methods for Detailed Analysis of Complex Job Schedules in the Czech National Grid

Šimon Tóth; Dalibor Klusáček


Simulation Tools and Techniques for Communications, Networks and Systems | 2016

Complex Job Scheduling Simulations with Alea 4

Dalibor Klusáček; Šimon Tóth; Gabriela Podolníková


Archive | 2015

A New Complex Workload Trace from MetaCentrum

Šimon Tóth; Dalibor Klusáček


Archive | 2015

On the Challenges in the Design of Efficient Job Scheduling Policies for Production HPC and Grid Environments

Dalibor Klusáček; Šimon Tóth


Archive | 2015

Agent-based User Modeling in Job Scheduling Simulations

Šimon Tóth; Dalibor Klusáček


Archive | 2015

Optimizing Job Scheduling in National Grid Computing System: Theory and Practice

Dalibor Klusáček; Šimon Tóth; Gabriela Podolníková
