Siarhei Padolski
Brookhaven National Laboratory
Publications
Featured research published by Siarhei Padolski.
Journal of Physics: Conference Series | 2018
A A Alekseev; F G Barreiro Megino; Alexei Klimentov; T A Korchuganova; T Maeno; Siarhei Padolski
The paper describes the implementation of a high-performance system for processing and analysing log files for the PanDA infrastructure of the ATLAS experiment at the Large Hadron Collider (LHC). PanDA is responsible for the workload management of on the order of 2M daily jobs across the Worldwide LHC Computing Grid. The solution is based on the ELK technology stack, which includes several components: Filebeat, Logstash, Elasticsearch (ES), and Kibana. Filebeat collects data from logs. Logstash processes the data and exports it to Elasticsearch. ES is responsible for centralized data storage. Data accumulated in ES can be viewed with Kibana. These components were integrated with the PanDA infrastructure and replaced previous log processing systems for increased scalability and usability. The authors describe all the components and their configuration tuning for the current tasks and the scale of the actual system, and give several real-life examples of how this centralized log processing and storage service is used, showcasing its advantages for daily operations.
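To make the processing step in the abstract above concrete, here is a minimal Python sketch of what a Logstash-like stage does: it parses a raw log line into a structured document and serialises it in the newline-delimited JSON layout accepted by the Elasticsearch bulk API. The log line layout, the field names, the index name `panda-logs`, and the helpers `parse_line` and `to_bulk_lines` are all assumptions made for illustration; they are not taken from the paper or from the PanDA code base.

```python
import json
import re
from datetime import datetime

# Hypothetical log layout (an assumption, not the real PanDA format):
#   "<YYYY-MM-DD HH:MM:SS> <logger> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logger>\S+) (?P<level>[A-Z]+) (?P<message>.*)$"
)


def parse_line(line):
    """Turn one raw log line into a structured dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    doc = match.groupdict()
    # Replace the raw timestamp with an ISO 8601 string under "@timestamp".
    raw_ts = doc.pop("ts")
    doc["@timestamp"] = datetime.strptime(raw_ts, "%Y-%m-%d %H:%M:%S").isoformat()
    return doc


def to_bulk_lines(docs, index_name):
    """Serialise documents as action/source line pairs in bulk NDJSON form."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc, sort_keys=True))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "2018-03-01 12:00:00 panda.server ERROR job 42 failed"
    parsed = parse_line(sample)
    print(to_bulk_lines([parsed], "panda-logs"), end="")
```

In a real deployment this role is played by Filebeat and Logstash configuration rather than hand-written code; the sketch only shows the shape of the transformation from free-text lines to indexed documents.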
Journal of Physics: Conference Series | 2018
Fernando Harald Barreiro Megino; Mikhail Titov; Tatiana Korchuganova; Mikhail Borodin; Maksim Gubin; T. Maeno; Siarhei Padolski; Dmitry Golubkov; Maria Grigoryeva; Alexei Klimentov
Information such as an estimate of the processing time or the likelihood of a system outage (abnormal behaviour) helps to monitor system performance and to predict its next state. The current cyber-infrastructure of the ATLAS Production System presents computing conditions in which contention for resources among high-priority data analyses happens routinely, which may lead to significant workload and data handling interruptions. The inability to monitor and predict the behaviour of the analysis process (its duration) and the state of the system itself motivates a focus on the design of built-in situational awareness analytic tools.
Journal of Physics: Conference Series | 2018
A Alekseev; Alexei Klimentov; T. Korchuganova; Siarhei Padolski; Torre Wenaus
BigPanDA monitoring is a web application that processes and presents the states of Production and Distributed Analysis (PanDA) system objects. Analysing hundreds of millions of computation entities, such as events or jobs, BigPanDA monitoring builds reports at different scales and levels of abstraction in real time. The information provided allows users to drill down into the cause of a specific event failure or to observe the broad picture, such as tracking the performance of computation nuclei and satellites or the progress of a whole production campaign. The PanDA system was originally developed for the ATLAS experiment. Currently, it manages the execution of more than 2 million jobs per day, distributed over 170 computing centers worldwide. BigPanDA is its core component; it was commissioned in the middle of 2014 and is now the primary source of information for ATLAS users about the state of their computations and the source of decision support information for shifters, operators and managers. In this work, we describe the evolution of the architecture, the current status and plans for the development of BigPanDA monitoring.
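The two views mentioned in the abstract above, a broad summary and a drill-down into failures, can be illustrated with a small Python sketch. The record fields (`site`, `status`, `error`) and the helpers `summarize` and `failure_reasons` are hypothetical and chosen only for this example; they do not reflect the actual BigPanDA data model or code.

```python
from collections import Counter, defaultdict


def summarize(jobs):
    """Broad picture: count jobs per status, overall and per site."""
    overall = Counter()
    per_site = defaultdict(Counter)
    for job in jobs:
        overall[job["status"]] += 1
        per_site[job["site"]][job["status"]] += 1
    return overall, per_site


def failure_reasons(jobs, site):
    """Drill-down: error messages of failed jobs at one site, most frequent first."""
    errors = Counter(
        job["error"]
        for job in jobs
        if job["site"] == site and job["status"] == "failed"
    )
    return errors.most_common()


if __name__ == "__main__":
    # Hypothetical job records, invented for the example.
    jobs = [
        {"site": "SITE_A", "status": "finished", "error": None},
        {"site": "SITE_A", "status": "failed", "error": "stage-in timeout"},
        {"site": "SITE_A", "status": "failed", "error": "stage-in timeout"},
        {"site": "SITE_B", "status": "failed", "error": "payload crash"},
    ]
    overall, per_site = summarize(jobs)
    print(dict(overall))
    print(failure_reasons(jobs, "SITE_A"))
```

At the scale described in the abstract, aggregation of this kind would be done in the database or a dedicated analytics layer rather than in memory; the sketch only shows the summary-then-drill-down pattern.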
Journal of Physics: Conference Series | 2017
Fernando Harald Barreiro Megino; Siarhei Padolski; Danila Oleynik; S. Panitkin; K. De; Torre Wenaus; Alexei Klimentov; P. Nilsson
The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA-managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of petabytes of data, and the rate of data processing already exceeds an exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data-taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this paper we describe a project aimed at integrating the ATLAS Production System with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach uses a modified PanDA Pilot framework for job submission to Titan's batch queues and for local data management, with lightweight MPI wrappers to run single-node workloads in parallel on Titan's multi-core worker nodes. It allows standard ATLAS production jobs to run on otherwise unused resources (backfill) on Titan. The system has already allowed ATLAS to collect millions of core-hours per month on Titan and to execute hundreds of thousands of jobs, while simultaneously improving Titan's utilization efficiency. We discuss the details of the implementation, current experience with running the system, and future plans aimed at improvements in scalability and efficiency. Notice: This manuscript has been authored by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy.
Journal of Physics: Conference Series | 2017
Fernando Harald Barreiro Megino; Siarhei Padolski; Danila Oleynik; T. Maeno; S. Panitkin; K. De; Torre Wenaus; Alexei Klimentov; P. Nilsson
EPJ Web of Conferences | 2016
Fernando Harald Barreiro Megino; Jose Caballero Bejar; K. De; John Hover; Alexei Klimentov; T. Maeno; P. Nilsson; Danila Oleynik; Siarhei Padolski; S. Panitkin; Artem Petrosyan; Torre Wenaus
Scientific Visualization | 2018
Siarhei Padolski; T. Korchuganova; Torre Wenaus; M. Grigoryeva; A. Alexeev; M. Titov; Alexei Klimentov
Scientific Visualization | 2018
T. Galkin; M. Grigoryeva; Alexei Klimentov; T. Korchuganova; I. Milman; Siarhei Padolski; V. Pilyugin; D. Popov; M. Titov
Archive | 2018
Aleksandr Alekseev; Tatiana Korchuganova; Siarhei Padolski
Archive | 2018
Maria Grigoryeva; Mikhail Titov; Tatiana Korchuganova; Alexei Klimentov; Siarhei Padolski