Šimon Tóth
CESNET
Publication
Featured research published by Šimon Tóth.
Computer Science | 2012
Šimon Tóth; Miroslav Ruda
The Czech National Grid Infrastructure went through a complex transition in the last year. The production environment was switched from the commercial batch system PBSPro to an open source alternative, the Torque batch system. This paper concentrates on two aspects of this transition. First, we present our practical experience with Torque as a production-ready batch system. Our modified version of Torque, with all the necessary PBSPro-exclusive features re-implemented and further extended with new features such as cloud-like behaviour, has been deployed for almost a full year across the entire production environment, which covers the entire Czech Republic. In the second part, we present our work on meta-scheduling, which involves our work on distributed architecture and cloud-grid convergence. The distributed architecture was designed to overcome the limitations of the originally used central server setup, which suffered from stability and performance issues. While this paper does not discuss the inclusion of cloud interfaces into grids, it does present the dynamic infrastructure, which is a requirement for sharing the grid infrastructure between a batch system and a cloud gateway. We also invite everyone to try out our fork of the Torque batch system, which is now publicly available.
Archive | 2011
Luděk Matyska; Miroslav Ruda; Šimon Tóth
For some ten years, the Czech National Grid Infrastructure MetaCentrum has used a single central PBSPro installation to schedule jobs across the country. This centralized approach keeps full track of all the clusters, providing support for jobs spanning several sites, an implementation of the fair-share policy, and better overall control of the grid environment. Despite steady progress in stability and in resilience to intermittent, very short network failures, the growing number of sites and processors makes this architecture, with its single point of failure and scalability limits, obsolete. As a result, a new scheduling architecture is proposed, which relies on higher autonomy of clusters. It is based on a peer-to-peer network of semi-independent schedulers, one for each site or even each cluster. Each scheduler accepts jobs for the whole infrastructure, cooperating with other schedulers on the implementation of global policies such as central job accounting, fair-share, or submission of jobs across several sites. The scheduling system is integrated with the Magrathea system to support scheduling of virtual clusters, including the setup of their internal network, again possibly spanning several sites. At the same time, each scheduler is local to one or several clusters and is able to directly control them and submit jobs to them even if the connection to other scheduling peers is lost. In parallel with the change of the overall architecture, the scheduling system itself is being replaced. Instead of PBSPro, chosen originally for its declared support of large-scale distributed environments, the new scheduling architecture is based on the open-source Torque system. The implementation of and support for the most desired properties in PBSPro and Torque are discussed, and the modifications to Torque necessary to support the MetaCentrum scheduling architecture are also presented.
Journal of Physics: Conference Series | 2015
Šimon Tóth; Miroslav Ruda
MetaCentrum, the Czech National Grid, provides access to various resources across the Czech Republic. The resource management and scheduling system it uses is based on a heavily modified version of the Torque Batch System. This open source resource manager is maintained in a local fork and was extended to meet the requirements of such a large installation. This paper provides an overview of unique features deployed in MetaCentrum. Notably, we describe our distributed setup, which encompasses several standalone independent servers while still maintaining full cooperative scheduling across the grid. We also present the benefits of our virtualized infrastructure, which enables our schedulers to dynamically request on-demand virtual machines. These machines are then used to meet the varied requirements of users in our system and to support user-requested virtual clusters that can be further interconnected using a private VLAN.
Archive | 2011
Šimon Tóth; Miroslav Ruda; Luděk Matyska
Archive | 2013
Šimon Tóth; Dalibor Klusáček
Simulation Tools and Techniques for Communications, Networks and Systems | 2016
Dalibor Klusáček; Šimon Tóth; Gabriela Podolníková
Archive | 2015
Šimon Tóth; Dalibor Klusáček
Archive | 2015
Dalibor Klusáček; Šimon Tóth
Archive | 2015
Šimon Tóth; Dalibor Klusáček
Archive | 2015
Dalibor Klusáček; Šimon Tóth; Gabriela Podolníková