Daniele Gregori
University of Bologna
Publication
Featured research published by Daniele Gregori.
IEEE Transactions on Nuclear Science | 2010
Marco Bencivenni; Daniela Bortolotti; A. Carbone; Alessandro Cavalli; Andrea Chierici; Stefano Dal Pra; Donato De Girolamo; Luca dell'Agnello; Massimo Donatelli; Armando Fella; Domenico Galli; Antonia Ghiselli; Daniele Gregori; Alessandro Italiano; Rajeev Kumar; U. Marconi; B. Martelli; Mirco Mazzucato; Michele Onofri; Gianluca Peco; S. Perazzini; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; F Rosso; Davide Salomoni; Vladimir Sapunenko; Vincenzo Vagnoni; Riccardo Veraldi; Maria Cristina Vistoli
With a view to employing 10 Gigabit Ethernet as networking technology for online systems and offline data analysis centers of High Energy Physics experiments, we performed a series of measurements on the performance of 10 Gigabit Ethernet, using the network interface cards mounted on the PCI-Express bus of commodity PCs both as transmitters and receivers. In real operating conditions, the achievable maximum transfer rate through a network link is not only limited by the capacity of the link itself, but also by that of the memory and peripheral buses and by the ability of the CPUs and of the Operating System to handle packet processing and interrupts raised by the network interface cards in due time. Besides the TCP and UDP maximum data transfer throughputs, we also measured the CPU loads of the sender/receiver processes and of the interrupt and soft-interrupt handlers as a function of the packet size, using either standard or "jumbo" Ethernet frames. In addition, we performed the same measurements while simultaneously reading data from Fibre Channel links and forwarding them through a 10 Gigabit Ethernet link, hence emulating the behavior of a disk server in a Storage Area Network exporting data to client machines via 10 Gigabit Ethernet.
IEEE Transactions on Nuclear Science | 2006
A. Barczyk; Daniela Bortolotti; A. Carbone; J.P. Dufey; Domenico Galli; B. Gaidioz; Daniele Gregori; B. Jost; U. Marconi; N. Neufeld; Gianluca Peco; Vincenzo Vagnoni
We report on measurements performed to test the reliability of high rate data transmission over copper Gigabit Ethernet for the LHCb online system. High reliability of such transmissions will be crucial for the functioning of the software trigger layers of the LHCb experiment at CERN's LHC accelerator. The technological challenge in the system implementation consists of handling the expected high data throughput of event fragments using, to a large extent, commodity equipment. We report on performance evaluations (throughput, error rates and frame drop) of the main components involved in data transmission: the Ethernet cable, the PCI bus and the operating system (the latest kernel versions of Linux). Three different platforms have been used.
IEEE Transactions on Nuclear Science | 2008
F. Bonifazi; A. Carbone; Domenico Galli; C. Gaspar; Daniele Gregori; U. Marconi; Gianluca Peco; Vincenzo Vagnoni; E. van Herwijnen
The LHCb experiment at CERN will have an on-line trigger farm composed of up to 2000 PCs. In order to monitor and control each PC and to supervise the overall status of the farm, a farm monitoring and control system (FMC) was developed. The FMC is based on the distributed information management (DIM) system as its network communication layer; it is accessible both through a command line interface and through the Prozeßvisualisierungs- und Steuerungssystem (PVSS) graphical interface, and it is interfaced to the finite state machine (FSM) of the LHCb experiment control system (ECS) in order to manage anomalous farm conditions. The FMC is an integral part of the ECS, which is in charge of monitoring and controlling all on-line components; it uses the same tools (DIM, PVSS, FSM, etc.) to guarantee its complete integration and a coherent look and feel throughout the whole control system.
international parallel and distributed processing symposium | 2009
Marco Bencivenni; M. Canaparo; F. Capannini; L. Carota; M. Carpene; Alessandro Cavalli; Andrea Ceccanti; M. Cecchi; Daniele Cesini; Andrea Chierici; V. Ciaschini; A. Cristofori; S Dal Pra; Luca dell'Agnello; D De Girolamo; Massimo Donatelli; D. N. Dongiovanni; Enrico Fattibene; T. Ferrari; A Ferraro; Alberto Forti; Antonia Ghiselli; Daniele Gregori; G. Guizzunti; Alessandro Italiano; L. Magnoni; B. Martelli; Mirco Mazzucato; Giuseppe Misurelli; Michele Onofri
The four High Energy Physics (HEP) detectors at the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) are among the most important experiments in which the National Institute of Nuclear Physics (INFN) is actively involved. A Grid infrastructure, the Worldwide LHC Computing Grid (WLCG), has been developed by the HEP community, leveraging broader initiatives (e.g. EGEE in Europe, OSG in northern America), as a framework to exchange and maintain data storage and provide computing infrastructure for the entire LHC community. INFN-CNAF in Bologna hosts the Italian Tier-1 site, which is the biggest Italian center in the WLCG distributed computing infrastructure. In the first part of this paper we describe the building of the Italian Tier-1 to cope with the WLCG computing requirements, focusing on some peculiarities; in the second part we analyze the INFN-CNAF contribution to the development of the grid middleware, stressing in particular the characteristics of the Virtual Organization Membership Service (VOMS), the de facto standard for authorization on a grid, and StoRM, an implementation of the Storage Resource Manager (SRM) specifications for POSIX file systems. In particular, StoRM is used at INFN-CNAF in conjunction with the General Parallel File System (GPFS), and we are also testing an integration with Tivoli Storage Manager (TSM) to realize a complete Hierarchical Storage Management (HSM) system.
Journal of Physics: Conference Series | 2012
Pier Paolo Ricci; D. Bonacorsi; Alessandro Cavalli; Luca dell'Agnello; Daniele Gregori; Andrea Prosperini; Lorenzo Rinaldi; Vladimir Sapunenko; Vincenzo Vagnoni
The storage system currently used in production at the INFN Tier1 at CNAF is the result of several years of case studies, software development and tests. This solution, called the Grid Enabled Mass Storage System (GEMSS), is based on a custom integration of a fast and reliable parallel filesystem (the IBM General Parallel File System, GPFS) with a complete integrated tape backend based on the Tivoli Storage Manager (TSM), which provides Hierarchical Storage Management (HSM) capabilities, and the Grid Storage Resource Manager (StoRM), which provides access to grid users through a standard SRM interface. Since the start of the Large Hadron Collider (LHC) operation, all LHC experiments have been using GEMSS at CNAF for both disk data access and long-term archival on tape media. Moreover, during the last year, GEMSS has become the standard solution for all other experiments hosted at CNAF, allowing the definitive consolidation of the data storage layer. Our choice has proved to be very successful during the last two years of production, with continuous enhancements, accurate monitoring and effective customizations according to end-user requests. In this paper we describe the system, addressing recent developments and giving an overview of the administration and monitoring tools. We also discuss the solutions adopted to guarantee maximum availability of the service and the latest optimization features within the data access process. Finally, we summarize the main results obtained during these last years of activity from the perspective of some of the end-users, showing the reliability and the high performance that can be achieved using GEMSS.
Journal of Physics: Conference Series | 2015
Daniele Gregori; Stefano Dal Pra; Pier Paolo Ricci; Michele Pezzi; Andrea Prosperini; Vladimir Sapunenko
The storage and farming departments at the INFN-CNAF Tier1[1] manage thousands of computing nodes and several hundred servers that provide access to the disk and tape storage. In particular, the storage servers should provide the following services: efficient access to about 15 petabytes of disk space through different GPFS file system clusters, data transfers between LHC Tier sites (Tier0, Tier1 and Tier2) via a GridFTP cluster and the Xrootd protocol, and finally read and write operations on the magnetic tape backend. One of the most important requirements for a reliable service is a control system that can warn if problems arise and that is able to perform automatic recovery operations in case of service interruptions or major failures. Moreover, configurations can change during daily operations: for example, the roles of GPFS cluster nodes can be modified, so obsolete nodes must be removed from the production control system and new servers must be added to those already present. Managing all these changes manually can be difficult when there are several of them; it can also take a long time and is prone to human error or misconfiguration. For these reasons we have developed a control system that is able to reconfigure itself whenever a change occurs. This system has been in production for about a year at the INFN-CNAF Tier1 with good results and hardly any major drawbacks. There are three key points in this system. The first is a software configuration service (e.g. Quattor or Puppet) for the server machines that we want to monitor with the control system; this service must ensure the presence of appropriate sensors and custom scripts on the nodes to be checked and should be able to install and update software packages on them. The second key element is a database containing information, in a suitable format, on all the machines in production and able to provide for each of them the principal information, such as the type of hardware, the network switch to which the machine is connected, whether the machine is physical or virtual, the hypervisor to which it belongs, if any, and so on. The last key point is the control system software (in our implementation we chose Nagios), which is capable of assessing the status of the servers and services, and which can attempt to restore the working state, restart or inhibit software services and send suitable alarm messages to the site administrators. These three elements were integrated through appropriate scripts and custom implementations that allow the system to configure itself according to a decision logic; the combination of all the above-mentioned components is discussed in detail in this paper.
Journal of Physics: Conference Series | 2015
Pier Paolo Ricci; Alessandro Cavalli; Luca dell'Agnello; Matteo Favaro; Daniele Gregori; Andrea Prosperini; Michele Pezzi; Vladimir Sapunenko; Giovanni Zizzi; Vincenzo Vagnoni
The consolidation of Mass Storage services at the INFN-CNAF Tier1 Storage department over the last 5 years has resulted in a reliable, high-performance and moderately easy-to-manage facility that provides data access, archive, backup and database services to several different use cases. At present, the GEMSS Mass Storage System, developed and installed at CNAF and based upon an integration between the IBM GPFS parallel filesystem and the Tivoli Storage Manager (TSM) tape management software, is one of the largest hierarchical storage sites in Europe. It provides storage resources for about 12% of LHC data, as well as for data of other non-LHC experiments. Files are accessed using standard SRM Grid services provided by the Storage Resource Manager (StoRM), also developed at CNAF. Data access is also provided by XRootD and HTTP/WebDAV endpoints. Besides these services, an Oracle database facility is in production, characterized by an effective level of parallelism, redundancy and availability. This facility runs databases for storing and accessing relational data objects and for providing database services to the currently active use cases. It takes advantage of several Oracle technologies, such as Real Application Clusters (RAC), Automatic Storage Management (ASM) and the Enterprise Manager centralized management tools, together with other technologies for performance optimization, ease of management and downtime reduction. The aim of the present paper is to illustrate the state of the art of the INFN-CNAF Tier1 Storage department infrastructures and software services, and to give a brief outlook on forthcoming projects. A description of the administrative, monitoring and problem-tracking tools that play a primary role in managing the whole storage framework is also given.
Journal of Physics: Conference Series | 2014
S Amerio; L Chiarelli; Luca dell'Agnello; D De Girolamo; Daniele Gregori; M Pezzi; Andrea Prosperini; Pier Paolo Ricci; F Rosso; S. Zani
Long-term preservation of experimental data (in both raw and derived formats) is one of the emerging requirements coming from scientific collaborations. Within the High Energy Physics community the Data Preservation in High Energy Physics (DPHEP) group coordinates this effort. CNAF is not only one of the Tier-1s for the LHC experiments, it is also a computing center providing computing and storage resources to many other HEP and non-HEP scientific collaborations, including the CDF experiment. After the end of data taking in 2011, CDF is now facing the challenge of both preserving the large amount of data produced during several years of data taking and retaining the ability to access and reuse it in the future. CNAF is heavily involved in the CDF Data Preservation activities, in collaboration with the Fermi National Accelerator Laboratory (FNAL) computing sector. At the moment about 4 PB of data (raw data and analysis-level ntuples) are starting to be copied from FNAL to the CNAF tape library, and the framework to subsequently access the data is being set up. In parallel to the data access system, a data analysis framework is being developed which will allow the complete CDF analysis chain to be run in the long-term future, from raw data reprocessing to analysis-level ntuple production. In this contribution we illustrate the technical solutions we put in place to address the issues encountered as we proceeded in this activity.
Journal of Physics: Conference Series | 2012
G Bortolotti; Alessandro Cavalli; L Chiarelli; Andrea Chierici; S Dal Pra; Luca dell'Agnello; D De Girolamo; Massimo Donatelli; A Ferraro; Daniele Gregori; Alessandro Italiano; B. Martelli; A Mazza; Michele Onofri; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; F Rosso; Vladimir Sapunenko; Riccardo Veraldi; C Vistoli; S Zani
INFN-CNAF is the central computing facility of INFN: it is the Italian Tier-1 for the experiments at LHC, but also one of the main Italian computing facilities for several other experiments such as BABAR, CDF, SuperB, Virgo, Argo, AMS, Pamela, MAGIC, Auger etc. Currently there is an installed CPU capacity of 100,000 HS06, a net disk capacity of 9 PB and an equivalent amount of tape storage (these figures are going to be increased in the first half of 2012 to 125,000 HS06, 12 PB and 18 PB respectively). More than 80,000 computing jobs are executed daily on the farm, managed by LSF, accessing the storage, managed by GPFS, with an aggregate bandwidth of up to several GB/s. The farm accesses the storage system directly through the file protocol. The interconnection of the computing resources and the data storage is based on 10 Gbps technology. The disk-servers and the storage systems are connected through a Storage Area Network, allowing complete flexibility and ease of management; dedicated disk-servers are connected, also via the SAN, to the tape library. The INFN Tier-1 is connected to the other centers via 3×10 Gbps links (to be upgraded at the end of 2012), including the LHCOPN and the LHCONE. In this paper we show the main results of our center after two full years of LHC running.
Archive | 2011
D. Andreotti; D. Bonacorsi; Alessandro Cavalli; S. Dal Pra; L. dell’Agnello; Alberto Forti; Claudio Grandi; Daniele Gregori; L. Li Gioi; B. Martelli; Andrea Prosperini; Pier Paolo Ricci; Elisabetta Ronchieri; Vladimir Sapunenko; A. Sartirana; Vincenzo Vagnoni; Riccardo Zappi
A brand new Mass Storage System solution called “Grid-Enabled Mass Storage System” (GEMSS), based on the Storage Resource Manager (StoRM) developed by INFN, on the General Parallel File System by IBM and on the Tivoli Storage Manager by IBM, has been tested and deployed at the INFN-CNAF Tier-1 Computing Centre in Italy. After a successful stress test phase, the solution is now being used in production for the data custodiality of the CMS experiment at CNAF. All data previously recorded on the CASTOR system have been transferred to GEMSS. As final validation of the GEMSS system, some of the computing tests done in the context of the WLCG “Scale Test for the Experiment Program” (STEP’09) challenge were repeated in September-October 2009 and compared with the results previously obtained with CASTOR in June 2009. In this paper, the GEMSS system basics, the stress test activity and the deployment phase, as well as the reliability and performance of the system, are overviewed. The experiences in the use of GEMSS at CNAF in preparing for the first months of data taking of the CMS experiment at the Large Hadron Collider are also presented.