Vladimir Sapunenko
CERN
Publications
Featured research published by Vladimir Sapunenko.
International Conference on e-Science | 2007
A. Carbone; Luca dell'Agnello; Alberto Forti; Antonia Ghiselli; E. Lanciotti; Luca Magnoni; Mirco Mazzucato; R. Santinelli; Vladimir Sapunenko; Vincenzo Vagnoni; Riccardo Zappi
High performance disk-storage solutions based on parallel file systems are becoming increasingly important to fulfill the large I/O throughput required by high-energy physics applications. Storage area networks (SAN) are commonly employed at the Large Hadron Collider data centres, and SAN-oriented parallel file systems such as GPFS and Lustre provide high scalability and availability by aggregating many data volumes served by multiple disk-servers into a single POSIX file system hierarchy. Since these file systems do not come with a storage resource manager (SRM) interface, necessary to access and manage the data volumes in a grid environment, a specific project called StoRM has been developed to provide them with the necessary SRM capabilities. In this paper we describe the deployment of a StoRM instance configured to manage a GPFS file system. A software suite has been developed to perform stress tests of functionality and throughput on StoRM. We present the results of these tests.
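The paper's stress-test suite is not reproduced here; as a rough illustration of the kind of throughput test described above, the following sketch writes files concurrently on a mounted GPFS path and reports the aggregate rate. The mount point, file sizes and degree of parallelism are assumptions, not the actual test parameters.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor

MOUNT = "/gpfs/storm_test"   # hypothetical GPFS mount point managed by StoRM
FILE_SIZE = 1 << 30          # 1 GiB per file (assumed)
BLOCK = 1 << 20              # 1 MiB write blocks

def write_one(i: int) -> float:
    """Write one file sequentially and return the achieved throughput in MiB/s."""
    path = os.path.join(MOUNT, f"stress_{i}.dat")
    buf = os.urandom(BLOCK)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // BLOCK):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())   # make sure data actually reaches the file system
    return (FILE_SIZE / (1 << 20)) / (time.time() - start)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as pool:
        rates = list(pool.map(write_one, range(8)))
    print(f"aggregate write throughput: {sum(rates):.1f} MiB/s")
```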
IEEE Transactions on Nuclear Science | 2010
Marco Bencivenni; Daniela Bortolotti; A. Carbone; Alessandro Cavalli; Andrea Chierici; Stefano Dal Pra; Donato De Girolamo; Luca dell'Agnello; Massimo Donatelli; Armando Fella; Domenico Galli; Antonia Ghiselli; Daniele Gregori; Alessandro Italiano; Rajeev Kumar; U. Marconi; B. Martelli; Mirco Mazzucato; Michele Onofri; Gianluca Peco; S. Perazzini; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; F Rosso; Davide Salomoni; Vladimir Sapunenko; Vincenzo Vagnoni; Riccardo Veraldi; Maria Cristina Vistoli
In the prospect of employing 10 Gigabit Ethernet as networking technology for online systems and offline data analysis centers of High Energy Physics experiments, we performed a series of measurements on the performance of 10 Gigabit Ethernet, using the network interface cards mounted on the PCI-Express bus of commodity PCs both as transmitters and receivers. In real operating conditions, the achievable maximum transfer rate through a network link is not only limited by the capacity of the link itself, but also by that of the memory and peripheral buses and by the ability of the CPUs and of the Operating System to handle packet processing and interrupts raised by the network interface cards in due time. Besides the TCP and UDP maximum data transfer throughputs, we also measured the CPU loads of the sender/receiver processes and of the interrupt and soft-interrupt handlers as a function of the packet size, either using standard or "jumbo" Ethernet frames. In addition, we also performed the same measurements by simultaneously reading data from Fibre Channel links and forwarding them through a 10 Gigabit Ethernet link, hence emulating the behavior of a disk server in a Storage Area Network exporting data to client machines via 10 Gigabit Ethernet.
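As an illustration of the type of measurement described (not the authors' actual benchmark code), a minimal TCP throughput probe with a configurable message size might look like the sketch below; the port number, message size and duration are placeholders. CPU load of handlers would additionally be sampled, e.g. from /proc/stat, during the run.

```python
import socket
import sys
import time

PORT = 5001          # placeholder port
MSG_SIZE = 8972      # payload size to vary, e.g. close to a jumbo-frame MTU (assumed)
DURATION = 10        # seconds per measurement

def receiver() -> None:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    total, start = 0, time.time()
    while True:
        data = conn.recv(1 << 16)
        if not data:
            break
        total += len(data)
    print(f"received {total * 8 / (time.time() - start) / 1e9:.2f} Gbit/s")

def sender(host: str) -> None:
    sock = socket.create_connection((host, PORT))
    buf = b"x" * MSG_SIZE
    end = time.time() + DURATION
    while time.time() < end:
        sock.sendall(buf)
    sock.close()

if __name__ == "__main__":
    # usage: "python probe.py recv" on the receiver, "python probe.py <receiver-host>" on the sender
    receiver() if sys.argv[1] == "recv" else sender(sys.argv[1])
```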
IEEE Transactions on Nuclear Science | 2008
Marco Bencivenni; F. Bonifazi; A. Carbone; Andrea Chierici; A. D'Apice; D. De Girolamo; Luca dell'Agnello; Massimo Donatelli; G. Donvito; Armando Fella; F. Furano; Domenico Galli; Antonia Ghiselli; Alessandro Italiano; G. Lo Re; U. Marconi; B. Martelli; Mirco Mazzucato; Michele Onofri; Pier Paolo Ricci; F Rosso; Davide Salomoni; Vladimir Sapunenko; V. Vagnoni; Riccardo Veraldi; Maria Cristina Vistoli; D. Vitlacil; S. Zani
Performance, reliability and scalability in data-access are key issues in the context of the computing Grid and High Energy Physics data processing and analysis applications, in particular considering the large data size and I/O load that a Large Hadron Collider data centre has to support. In this paper we present the technical details and the results of a large scale validation and performance measurement employing different data-access platforms, namely CASTOR, dCache, GPFS and Scalla/Xrootd. The tests have been performed at the CNAF Tier-1, the central computing facility of the Italian National Institute for Nuclear Research (INFN). Our storage back-end was based on Fibre Channel disk-servers organized in a Storage Area Network, with the disk-servers connected to the computing farm via Gigabit LAN. We used 24 disk-servers, 260 TB of raw-disk space and 280 worker nodes as computing clients, able to run concurrently up to about 1100 jobs. The aim of the test was to perform sequential and random read/write accesses to the data, as well as more realistic access patterns, in order to evaluate efficiency, availability, robustness and performance of the various data-access solutions.
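The full test suite is not reproduced here; a simplified sketch of the sequential versus random read pattern used in such comparisons could look as follows. The file path and block size are hypothetical and the real tests ran many such clients concurrently.

```python
import os
import random
import time

PATH = "/storage/testfile.dat"   # hypothetical large file on the storage platform under test
BLOCK = 1 << 20                  # 1 MiB reads (assumed)

def read_throughput(path: str, sequential: bool) -> float:
    """Read the whole file in 1 MiB blocks, in order or shuffled, and return MiB/s."""
    size = os.path.getsize(path)
    offsets = list(range(size // BLOCK))
    if not sequential:
        random.shuffle(offsets)
    fd = os.open(path, os.O_RDONLY)
    start = time.time()
    for off in offsets:
        os.pread(fd, BLOCK, off * BLOCK)
    os.close(fd)
    return (len(offsets) * BLOCK / (1 << 20)) / (time.time() - start)

if __name__ == "__main__":
    print(f"sequential: {read_throughput(PATH, True):.1f} MiB/s")
    print(f"random:     {read_throughput(PATH, False):.1f} MiB/s")
```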
International Parallel and Distributed Processing Symposium | 2009
Marco Bencivenni; M. Canaparo; F. Capannini; L. Carota; M. Carpene; Alessandro Cavalli; Andrea Ceccanti; M. Cecchi; Daniele Cesini; Andrea Chierici; V. Ciaschini; A. Cristofori; S Dal Pra; Luca dell'Agnello; D De Girolamo; Massimo Donatelli; D. N. Dongiovanni; Enrico Fattibene; T. Ferrari; A Ferraro; Alberto Forti; Antonia Ghiselli; Daniele Gregori; G. Guizzunti; Alessandro Italiano; L. Magnoni; B. Martelli; Mirco Mazzucato; Giuseppe Misurelli; Michele Onofri
The four High Energy Physics (HEP) detectors at the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) are among the most important experiments in which the National Institute of Nuclear Physics (INFN) is actively involved. A Grid infrastructure of the Worldwide LHC Computing Grid (WLCG) has been developed by the HEP community, leveraging broader initiatives (e.g. EGEE in Europe, OSG in North America), as a framework to exchange and maintain data storage and provide computing infrastructure for the entire LHC community. INFN-CNAF in Bologna hosts the Italian Tier-1 site, which is the largest Italian centre in the WLCG distributed computing infrastructure. In the first part of this paper we describe the building of the Italian Tier-1 to cope with the WLCG computing requirements, focusing on some peculiarities; in the second part we analyze the INFN-CNAF contribution to the development of the grid middleware, stressing in particular the characteristics of the Virtual Organization Membership Service (VOMS), the de facto standard for authorization on a grid, and StoRM, an implementation of the Storage Resource Manager (SRM) specifications for POSIX file systems. In particular, StoRM is used at INFN-CNAF in conjunction with the General Parallel File System (GPFS), and we are also testing an integration with the Tivoli Storage Manager (TSM) to realize a complete Hierarchical Storage Management (HSM) solution.
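To make the role of an SRM interface concrete, the toy sketch below mimics the asynchronous prepare-to-get/poll cycle defined by the SRM specification. The class, its method names and the TURL mapping are purely illustrative assumptions and do not reproduce the StoRM API.

```python
import time
import uuid

class MockSRM:
    """Toy model of an SRM-like service: requests are asynchronous and must be
    polled until the file is staged and a transfer URL (TURL) is available."""

    def __init__(self):
        self._requests = {}

    def prepare_to_get(self, surl: str) -> str:
        token = str(uuid.uuid4())
        # a real service would trigger a stage-in from disk/tape here
        self._requests[token] = {"surl": surl, "ready_at": time.time() + 2}
        return token

    def status_of_get_request(self, token: str):
        req = self._requests[token]
        if time.time() < req["ready_at"]:
            return "SRM_REQUEST_INPROGRESS", None
        turl = req["surl"].replace("srm://", "gsiftp://")  # illustrative SURL-to-TURL mapping
        return "SRM_SUCCESS", turl

if __name__ == "__main__":
    srm = MockSRM()
    token = srm.prepare_to_get("srm://example.infn.it/data/file.root")  # hypothetical SURL
    status, turl = srm.status_of_get_request(token)
    while status != "SRM_SUCCESS":
        time.sleep(1)
        status, turl = srm.status_of_get_request(token)
    print("transfer URL:", turl)
```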
Journal of Physics: Conference Series | 2012
Pier Paolo Ricci; D. Bonacorsi; Alessandro Cavalli; Luca dell'Agnello; Daniele Gregori; Andrea Prosperini; Lorenzo Rinaldi; Vladimir Sapunenko; Vincenzo Vagnoni
The storage system currently used in production at the INFN Tier1 at CNAF is the result of several years of case studies, software development and tests. This solution, called the Grid Enabled Mass Storage System (GEMSS), is based on a custom integration between a fast and reliable parallel filesystem (the IBM General Parallel File System, GPFS), a complete integrated tape backend based on the Tivoli Storage Manager (TSM), which provides Hierarchical Storage Management (HSM) capabilities, and the Grid Storage Resource Manager (StoRM), providing access to grid users through a standard SRM interface. Since the start of the Large Hadron Collider (LHC) operation, all LHC experiments have been using GEMSS at CNAF for both disk data access and long-term archival on tape media. Moreover, during the last year, GEMSS has become the standard solution for all other experiments hosted at CNAF, allowing the definitive consolidation of the data storage layer. Our choice has proved to be very successful during the last two years of production with continuous enhancements, accurate monitoring and effective customizations according to end-user requests. In this paper a description of the system is reported, addressing recent developments and giving an overview of the administration and monitoring tools. We also discuss the solutions adopted in order to grant the maximum availability of the service and the latest optimization features within the data access process. Finally, we summarize the main results obtained during these last years of activity from the perspective of some of the end-users, showing the reliability and the high performance that can be achieved using GEMSS.
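As a hedged illustration of how such a disk/tape hierarchy can be observed from the user side, the sketch below uses a common HSM heuristic: a file migrated to tape keeps its logical size but occupies no disk blocks in its stub. The path is hypothetical and this snippet is not part of the GEMSS code base.

```python
import os

def hsm_state(path: str) -> str:
    """Guess whether a file is disk-resident or migrated to tape, using the
    heuristic that a migrated stub has a non-zero size but zero allocated blocks."""
    st = os.stat(path)
    if st.st_size > 0 and st.st_blocks == 0:
        return "migrated (tape only)"
    return "resident on disk"

if __name__ == "__main__":
    print(hsm_state("/gpfs/gemss_area/example.root"))  # hypothetical path in an HSM-managed area
```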
Journal of Physics: Conference Series | 2015
Daniele Gregori; Stefano Dal Pra; Pier Paolo Ricci; Michele Pezzi; Andrea Prosperini; Vladimir Sapunenko
The storage and farming departments at the INFN-CNAF Tier1[1] manage thousands of computing nodes and several hundred servers that provide access to the disk and tape storage. In particular, the storage servers provide the following services: efficient access to about 15 petabytes of disk space organized in different GPFS file system clusters, data transfers between LHC Tier sites (Tier-0, Tier-1 and Tier-2) via GridFTP clusters and the Xrootd protocol, and read and write operations on the magnetic tape backend. One of the most important requirements for a reliable service is a control system that can warn when problems arise and perform automatic recovery operations in case of service interruptions or major failures. Moreover, during daily operations the configuration can change: for example, the roles of GPFS cluster nodes can be modified, so obsolete nodes must be removed from the control system and new servers added to those already present. Managing all these changes manually is difficult when many changes occur; it also takes a long time and is easily subject to human error or misconfiguration. For these reasons we have developed a control system that reconfigures itself whenever a change occurs. This system has been in production for about a year at the INFN-CNAF Tier1 with good results and hardly any major drawback. There are three key points in this system. The first is a configuration management service (e.g. Quattor or Puppet) for the server machines to be monitored with the control system; this service must ensure the presence of the appropriate sensors and custom scripts on the nodes to be checked, and must be able to install and update software packages on them. The second key element is a database containing information, in a suitable format, on all the machines in production, able to provide for each of them the principal information such as the type of hardware, the network switch to which the machine is connected, whether the machine is physical or virtual, the hypervisor to which it possibly belongs, and so on. The last key point is the control system software itself (in our implementation we chose Nagios), capable of assessing the status of the servers and services, attempting to restore the working state, restarting or inhibiting software services, and sending suitable alarm messages to the site administrators. The integration of these three elements is achieved with custom scripts that allow the system to configure itself according to a decisional logic; the combination of all the above-mentioned components is discussed in detail in this paper.
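To illustrate the self-configuration idea, a minimal sketch (not the production code) that regenerates Nagios host definitions from a machine inventory might look like this. The inventory format, field names and output path are assumptions; only the Nagios "define host" stanza structure is standard.

```python
import json

# Standard Nagios object-definition syntax for a host entry.
HOST_TEMPLATE = """define host {{
    use         generic-host
    host_name   {name}
    address     {address}
    hostgroups  {group}
}}
"""

def generate_config(inventory_path: str, output_path: str) -> None:
    """Rebuild the Nagios host configuration from the machine database export."""
    with open(inventory_path) as f:
        # assumed inventory format, e.g.
        # [{"name": "ds-01", "address": "10.0.0.1", "role": "gpfs-nsd"}, ...]
        machines = json.load(f)
    with open(output_path, "w") as out:
        for m in machines:
            out.write(HOST_TEMPLATE.format(name=m["name"],
                                           address=m["address"],
                                           group=m["role"]))

if __name__ == "__main__":
    # hypothetical paths; a real deployment would then reload Nagios to pick up the new config
    generate_config("inventory.json", "/etc/nagios/conf.d/hosts.cfg")
```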
Journal of Physics: Conference Series | 2015
Pier Paolo Ricci; Alessandro Cavalli; Luca dell'Agnello; Matteo Favaro; Daniele Gregori; Andrea Prosperini; Michele Pezzi; Vladimir Sapunenko; Giovanni Zizzi; Vincenzo Vagnoni
The consolidation of Mass Storage services at the INFN-CNAF Tier1 Storage department over the last 5 years has resulted in a reliable, high performance and moderately easy-to-manage facility that provides data access, archive, backup and database services to several different use cases. At present, the GEMSS Mass Storage System, developed and installed at CNAF and based upon an integration between the IBM GPFS parallel filesystem and the Tivoli Storage Manager (TSM) tape management software, is one of the largest hierarchical storage sites in Europe. It provides storage resources for about 12% of LHC data, as well as for data of other non-LHC experiments. Files are accessed using standard SRM Grid services provided by the Storage Resource Manager (StoRM), also developed at CNAF. Data access is also provided by XRootD and HTTP/WebDAV endpoints. Besides these services, an Oracle database facility is in production, characterized by an effective level of parallelism, redundancy and availability. This facility runs databases for storing and accessing relational data objects and for providing database services to the currently active use cases. It takes advantage of several Oracle technologies, such as Real Application Clusters (RAC), Automatic Storage Management (ASM) and the Enterprise Manager centralized management tools, together with other technologies for performance optimization, ease of management and downtime reduction. The aim of the present paper is to illustrate the state of the art of the INFN-CNAF Tier1 Storage department infrastructure and software services, and to give a brief outlook on forthcoming projects. A description of the administrative, monitoring and problem-tracking tools that play a primary role in managing the whole storage framework is also given.
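As an example of the kind of HTTP/WebDAV access mentioned above, a directory listing can be obtained with a standard WebDAV PROPFIND request. The endpoint URL and the X.509 credential paths below are placeholders, not the actual CNAF configuration.

```python
import requests  # third-party library: pip install requests

ENDPOINT = "https://webdav.example.infn.it/storage/area/"  # placeholder WebDAV endpoint
CERT = ("/path/to/usercert.pem", "/path/to/userkey.pem")   # placeholder client credentials

def list_directory(url: str) -> str:
    # "Depth: 1" asks the WebDAV server for the collection and its direct children
    resp = requests.request("PROPFIND", url, headers={"Depth": "1"}, cert=CERT)
    resp.raise_for_status()
    return resp.text  # multi-status XML listing, to be parsed as needed

if __name__ == "__main__":
    print(list_directory(ENDPOINT))
```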
Journal of Physics: Conference Series | 2015
Vladimir Sapunenko; Domenico D'Urso; Luca dell'Agnello; Vincenzo Vagnoni; Matteo Duranti
Data management constitutes one of the major challenges that a geographically distributed e-Infrastructure has to face, especially when remote data access is involved. We discuss an integrated solution which enables transparent and efficient access to on-line and near-line data through high latency networks. The solution is based on the joint use of the General Parallel File System (GPFS) and of the Tivoli Storage Manager (TSM). Both products, developed by IBM, are well known and extensively used in the HEP computing community. Owing to a new feature introduced in GPFS 3.5, the so-called Active File Management (AFM), it becomes possible to define a single, geographically distributed namespace with automated data flow management between different locations. As a practical example, we present the implementation of AFM-based remote data access between two data centres located in Bologna and Rome, demonstrating the validity of the solution for the use case of the AMS experiment, an astro-particle experiment supported by the INFN CNAF data centre with large disk space requirements (more than 1.5 PB).
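A simple way to observe the caching behaviour described above from the cache site is to time the first read of a file in the AFM fileset (fetched from the home site over the WAN) against a second read (served locally). The file path below is hypothetical and this is only an illustrative probe, not part of the paper's setup.

```python
import time

AFM_PATH = "/gpfs/ams_cache/run123/data.root"  # hypothetical file inside an AFM cache fileset

def timed_read(path: str) -> float:
    """Read the whole file in 1 MiB chunks and return the elapsed time in seconds."""
    start = time.time()
    with open(path, "rb") as f:
        while f.read(1 << 20):
            pass
    return time.time() - start

if __name__ == "__main__":
    cold = timed_read(AFM_PATH)   # first access: AFM fetches the data from the home site
    warm = timed_read(AFM_PATH)   # second access: served from the local cache
    print(f"cold read: {cold:.1f} s, cached read: {warm:.1f} s")
```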
Journal of Physics: Conference Series | 2012
G Bortolotti; Alessandro Cavalli; L Chiarelli; Andrea Chierici; S Dal Pra; Luca dell'Agnello; D De Girolamo; Massimo Donatelli; A Ferraro; Daniele Gregori; Alessandro Italiano; B. Martelli; A Mazza; Michele Onofri; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; F Rosso; Vladimir Sapunenko; Riccardo Veraldi; C Vistoli; S Zani
INFN-CNAF is the central computing facility of INFN: it is the Italian Tier-1 for the experiments at the LHC, but also one of the main Italian computing facilities for several other experiments such as BABAR, CDF, SuperB, Virgo, Argo, AMS, Pamela, MAGIC and Auger. Currently there is an installed CPU capacity of 100,000 HS06, a net disk capacity of 9 PB and an equivalent amount of tape storage (these figures are going to be increased in the first half of 2012 to 125,000 HS06, 12 PB and 18 PB respectively). More than 80,000 computing jobs are executed daily on the farm, managed by LSF, accessing the storage, managed by GPFS, with an aggregate bandwidth of up to several GB/s. The access to the storage system from the farm is direct through the file protocol. The interconnection of the computing resources and the data storage is based on 10 Gbps technology. The disk-servers and the storage systems are connected through a Storage Area Network, allowing complete flexibility and ease of management; dedicated disk-servers are connected, also via the SAN, to the tape library. The INFN Tier-1 is connected to the other centers via 3×10 Gbps links (to be upgraded at the end of 2012), including the LHCOPN and the LHCONE. In this paper we show the main results of our center after two full years of LHC running.
Archive | 2011
D. Andreotti; D. Bonacorsi; Alessandro Cavalli; S. Dal Pra; L. dell’Agnello; Alberto Forti; Claudio Grandi; Daniele Gregori; L. Li Gioi; B. Martelli; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; Vladimir Sapunenko; A. Sartirana; Vincenzo Vagnoni; Riccardo Zappi
A brand new Mass Storage System solution called the “Grid-Enabled Mass Storage System” (GEMSS), based on the Storage Resource Manager (StoRM) developed by INFN, on the General Parallel File System (GPFS) by IBM and on the Tivoli Storage Manager (TSM) by IBM, has been tested and deployed at the INFN-CNAF Tier-1 Computing Centre in Italy. After a successful stress test phase, the solution is now being used in production for the data custodiality of the CMS experiment at CNAF. All data previously recorded on the CASTOR system have been transferred to GEMSS. As a final validation of the GEMSS system, some of the computing tests done in the context of the WLCG “Scale Test for the Experiment Program” (STEP’09) challenge were repeated in September-October 2009 and compared with the results previously obtained with CASTOR in June 2009. In this paper, the GEMSS system basics, the stress test activity and the deployment phase, as well as the reliability and performance of the system, are overviewed. The experiences in the use of GEMSS at CNAF in preparing for the first months of data taking of the CMS experiment at the Large Hadron Collider are also presented.