Publication


Featured research published by A. Sciaba.


Journal of Physics: Conference Series | 2010

The commissioning of CMS sites: Improving the site reliability

S Belforte; I. Fisk; J. Flix; J M Hernández; J. Klem; J. Letts; N Magini; P. Saiz; A. Sciaba

The computing system of the CMS experiment operates using distributed resources from more than 60 computing centres worldwide. These centres, located in Europe, America and Asia, are interconnected by the Worldwide LHC Computing Grid. The operation of the system requires stable and reliable behaviour of the underlying infrastructure. CMS has established a procedure to extensively test all relevant aspects of a Grid site, such as the ability to efficiently use its network to transfer data, the functionality of all the site services relevant for CMS and the capability to sustain the various CMS computing workflows at the required scale. This contribution describes in detail the procedure used to rate CMS sites according to their performance, including its complete automation, the monitoring tools involved, and its impact on improving the overall reliability of the Grid from the point of view of the CMS computing system.
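The abstract describes a site-rating procedure without giving the concrete algorithm. The snippet below is only a minimal sketch of how such a rating could be computed from daily test results; the thresholds, metric names and the DailyTestResult structure are illustrative assumptions, not the actual CMS Site Readiness implementation.

```python
from dataclasses import dataclass

# Hypothetical daily test outcome for one site (illustrative only).
@dataclass
class DailyTestResult:
    site: str
    sam_tests_ok: bool       # functional tests of CMS-relevant services passed
    transfer_quality: float  # fraction of successful transfer links, 0.0-1.0
    job_success_rate: float  # fraction of successful test jobs, 0.0-1.0

def site_is_good(day: DailyTestResult,
                 transfer_threshold: float = 0.8,
                 job_threshold: float = 0.8) -> bool:
    """A day counts as 'good' only if all commissioning criteria are met."""
    return (day.sam_tests_ok
            and day.transfer_quality >= transfer_threshold
            and day.job_success_rate >= job_threshold)

def rate_site(history: list[DailyTestResult], min_good_fraction: float = 0.8) -> str:
    """Rate a site READY or NOT-READY from its recent daily test results."""
    if not history:
        return "NOT-READY"
    good_days = sum(site_is_good(day) for day in history)
    return "READY" if good_days / len(history) >= min_good_fraction else "NOT-READY"
```

The point of the automation described in the paper is that such an evaluation runs daily and unattended, so site ratings track infrastructure reliability without manual intervention.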


Journal of Physics: Conference Series | 2010

Use of the gLite-WMS in CMS for production and analysis

G. Codispoti; C. Grandi; A. Fanfani; D. Spiga; M Cinquilli; F. M. Farina; V. Miccio; Federica Fanzago; A. Sciaba; S. Lacaprara; S Belforte; D. Bonacorsi; A Sartirana; D Dongiovanni; D Cesini; S. Wakefield; Jose M Hernandez; S Lemaitre; M Litmaath; Y Calas; E Roche

The CMS experiment at the LHC started using the Resource Broker (from the EDG and LCG projects) to submit Monte Carlo production and analysis jobs to the distributed computing resources of the WLCG infrastructure over six years ago. Since 2006 the gLite Workload Management System (WMS) and Logging & Bookkeeping (LB) service have been used. The interaction with the gLite-WMS/LB happens through the CMS production and analysis frameworks, respectively ProdAgent and CRAB, via a common component, BOSSLite. The important improvements recently made in the gLite-WMS/LB, as well as in the CMS tools, and the intrinsic independence of different WMS/LB instances allow CMS to reach the stability and scalability needed for LHC operations. In particular, the use of a multi-threaded approach in BOSSLite significantly increased the scalability of the system. In this work we present the operational setup of CMS production and analysis based on the gLite-WMS, and the performance obtained in past data challenges, in daily Monte Carlo production and in user analysis in the experiment.
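The abstract singles out multi-threaded submission in BOSSLite as the key scalability gain. As a rough illustration of that pattern only, and not of the actual BOSSLite code, the sketch below submits job batches concurrently through a thread pool; submit_collection and the WMS endpoint names are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import random

# Placeholder WMS endpoints; real deployments point to independent gLite-WMS instances.
WMS_ENDPOINTS = ["wms1.example.org", "wms2.example.org", "wms3.example.org"]

def submit_collection(jobs: list[dict]) -> list[str]:
    """Pretend to submit one bulk collection to a WMS and return job identifiers.

    A real client would call the WMS submission API here; this stub only
    fabricates identifiers so the threading pattern can be demonstrated.
    """
    endpoint = random.choice(WMS_ENDPOINTS)  # spread load across independent instances
    return [f"https://{endpoint}/job/{i}" for i, _ in enumerate(jobs)]

def submit_all(task: list[dict], batch_size: int = 50, workers: int = 8) -> list[str]:
    """Split a task into batches and submit them concurrently."""
    batches = [task[i:i + batch_size] for i in range(0, len(task), batch_size)]
    job_ids: list[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for ids in pool.map(submit_collection, batches):
            job_ids.extend(ids)
    return job_ids

if __name__ == "__main__":
    fake_task = [{"executable": "cmsRun", "seed": n} for n in range(500)]
    print(len(submit_all(fake_task)), "jobs submitted")
```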


Journal of Grid Computing | 2010

Distributed analysis in CMS

A. Fanfani; Anzar Afaq; Jose Afonso Sanches; Julia Andreeva; Giuseppe Bagliesi; L. A. T. Bauerdick; Stefano Belforte; Patricia Bittencourt Sampaio; K. Bloom; Barry Blumenfeld; D. Bonacorsi; C. Brew; Marco Calloni; Daniele Cesini; Mattia Cinquilli; G. Codispoti; Jorgen D’Hondt; Liang Dong; Danilo N. Dongiovanni; Giacinto Donvito; David Dykstra; Erik Edelmann; R. Egeland; P. Elmer; Giulio Eulisse; D Evans; Federica Fanzago; F. M. Farina; Derek Feichtinger; I. Fisk

The CMS experiment expects to manage several petabytes of data each year during the LHC programme, distributing them over many computing sites around the world and enabling data access at those centres for analysis. CMS has identified the distributed sites as the primary location for physics analysis, to support a wide community with thousands of potential users. This represents an unprecedented experimental challenge in terms of the scale of distributed computing resources and the number of users. An overview of the computing architecture, the software tools and the distributed infrastructure is reported. Summaries of the experience in establishing efficient and scalable operations in preparation for CMS distributed analysis are presented, followed by the user experience in current analysis activities.


IEEE Nuclear Science Symposium | 2008

The commissioning of CMS computing centres in the worldwide LHC computing Grid

S. Belforte; A. Fanfani; I. Fisk; J. Flix; Jose M Hernandez; J. Klem; J. Letts; N Magini; Vincenzo Miccio; S. Padhi; P. Saiz; A. Sciaba; F. Würthwein

The computing system of the CMS experiment uses distributed resources from more than 60 computing centres worldwide. Located in Europe, America and Asia, these centres are interconnected by the Worldwide LHC Computing Grid. The operation of the system requires stable and reliable behaviour of the underlying infrastructure. CMS has established a procedure to extensively test all relevant aspects of a Grid site, such as the ability to efficiently use its network to transfer data, the functionality of the site services relevant for CMS and the capability to sustain the various CMS computing workflows (Monte Carlo simulation, event reprocessing and skimming, data analysis) at the required scale. This contribution describes in detail the procedure used to rate CMS sites according to their performance, including its complete automation, the monitoring tools involved, and its impact on improving the overall reliability of the Grid from the point of view of the CMS computing system.


Journal of Physics: Conference Series | 2014

CMS computing operations during run 1

J Adelman; S. Alderweireldt; J Artieda; G. Bagliesi; D Ballesteros; S. Bansal; L. A. T. Bauerdick; W Behrenhof; S. Belforte; K. Bloom; B. Blumenfeld; S. Blyweert; D. Bonacorsi; C. Brew; L Contreras; A Cristofori; S Cury; D da Silva Gomes; M Dolores Saiz Santos; J Dost; David Dykstra; E Fajardo Hernandez; F Fanzago; I. Fisk; J Flix; A Georges; M. Giffels; G. Gomez-Ceballos; S. J. Gowdy; Oliver Gutsche

During the first run, CMS collected and processed more than 10 billion data events and simulated more than 15 billion events. Up to 100k processor cores were used simultaneously and 100 PB of storage was managed. Each month petabytes of data were moved and hundreds of users accessed data samples. In this document we discuss the operational experience from this first run. We present the workflows and data flows that were executed, and we discuss the tools and services developed, and the operations and shift models used to sustain the system. Many techniques were carried over from the original computing planning, but some arose as reactions to difficulties and opportunities. We also address the lessons learned from an operational perspective, and how they are shaping our thoughts for 2015.


Journal of Physics: Conference Series | 2008

Experience with the gLite workload management system in ATLAS Monte Carlo production on LCG

S. Campana; David Rebatto; A. Sciaba

The ATLAS experiment has been running continuous simulated event production for more than two years. A considerable fraction of the jobs is submitted and handled daily via the gLite Workload Management System, which overcomes several limitations of the previous LCG Resource Broker. The gLite WMS has been tested very intensively for the LHC experiments' use cases for more than six months, both in terms of performance and reliability. The tests were carried out by the LCG Experiment Integration Support team (in close contact with the experiments) together with the EGEE integration and certification team and the gLite middleware developers. A pragmatic, iterative and interactive approach allowed a very quick rollout of fixes and their rapid deployment, together with new functionalities, for the ATLAS production activities. The same approach is being adopted for other middleware components such as the gLite and CREAM Computing Elements. In this contribution we summarize the lessons learned from the gLite WMS testing activity, pointing out the most important achievements and the open issues. In addition, we present the current situation of the ATLAS simulated event production activity on the EGEE infrastructure based on the gLite WMS, showing the main improvements and benefits from the new middleware. Finally, since the gLite WMS is being used by many other VOs, including the LHC experiments, some statistics are also shown on the CMS experience running user analysis via the WMS.


Journal of Physics: Conference Series | 2012

New solutions for large scale functional tests in the WLCG infrastructure with SAM/Nagios: the experiments' experience

Julia Andreeva; P Dhara; A. Di Girolamo; A Kakkar; M Litmaath; N Magini; Guidone Negri; S. Roiser; P. Saiz; M D Saiz Santos; B Sarkar; J. Schovancova; A. Sciaba; A Wakankar

For several years the LHC experiments have relied on the WLCG Service Availability Monitoring framework (SAM) to run functional tests on their distributed computing systems. The SAM tests have become an essential tool to measure the reliability of the Grid infrastructure and to ensure reliable computing operations, both for the sites and for the experiments. Recently the old SAM framework was replaced with a completely new system based on Nagios and ActiveMQ, to better support the transition to EGI and its more distributed infrastructure support model and to implement several scalability and functionality enhancements. This required all LHC experiments and the WLCG support teams to migrate their tests, to acquire expertise on the new system, to validate the new availability and reliability computations and to adopt new visualisation tools. In this contribution we describe in detail the current state of the art of functional testing in WLCG: how the experiments use the new SAM/Nagios framework, the advanced functionality made available by the new framework and the future developments that are foreseen, with a strong focus on the improvements in terms of stability and flexibility brought by the new system.
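Nagios-based SAM probes are essentially small check scripts that report service status through the standard Nagios exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is a hypothetical probe in that style, checking that a storage endpoint accepts connections; the host name and the check itself are assumptions, not one of the real WLCG probes.

```python
#!/usr/bin/env python3
"""Minimal Nagios-style probe: check that a storage endpoint accepts TCP connections."""
import socket
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_endpoint(host: str, port: int, timeout: float = 10.0) -> int:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            print(f"OK - {host}:{port} is reachable")
            return OK
    except socket.timeout:
        print(f"WARNING - {host}:{port} timed out after {timeout}s")
        return WARNING
    except OSError as exc:
        print(f"CRITICAL - cannot connect to {host}:{port}: {exc}")
        return CRITICAL

if __name__ == "__main__":
    # Hypothetical endpoint; a real probe would take these from its configuration
    # and the results would be published to the monitoring message bus.
    sys.exit(check_endpoint("se.example-site.org", 8443))
```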


Journal of Physics: Conference Series | 2010

Dashboard applications to monitor experiment activities at sites

Julia Andreeva; Stefano Belforte; Max Boehm; Adrian Casajus; Josep Flix; Benjamin Gaidioz; C. Grigoras; Lukasz Kokoszkiewicz; Elisa Lanciotti; Ricardo Rocha; P. Saiz; R. Santinelli; Irina Sidorova; A. Sciaba; A. Tsaregorodtsev

In a distributed computing environment such as WLCG, monitoring plays a key role in keeping under control the activities going on at sites located in different countries and involving people based in many different places. To cope with such a large-scale heterogeneous infrastructure, it is necessary to have monitoring tools that provide a complete and reliable view of the overall performance of the sites. Moreover, the structure of a monitoring system critically depends on the object to be monitored and on the users it is addressed to. In this article we describe two different monitoring systems, both aimed at monitoring activities and services provided in the WLCG framework, but designed to meet the requirements of different users: the Site Status Board gives an overall view of the services available at all the sites supporting an experiment, whereas Siteview provides a complete view of all the activities going on at a site, for all the experiments supported by the site.
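The two dashboards are effectively two orientations of the same site-by-experiment status matrix: the Site Status Board slices it per experiment across sites, Siteview slices it per site across experiments. As a purely illustrative sketch of that idea (the site names, metric names and record layout are assumptions, not the Dashboard schema), one could pivot a flat list of status records both ways:

```python
from collections import defaultdict

# Hypothetical flat status records, e.g. as collected from monitoring feeds.
records = [
    {"site": "T1_IT_CNAF", "experiment": "CMS",   "metric": "SAM",       "status": "OK"},
    {"site": "T1_IT_CNAF", "experiment": "ATLAS", "metric": "Transfers", "status": "DEGRADED"},
    {"site": "T1_UK_RAL",  "experiment": "CMS",   "metric": "Transfers", "status": "OK"},
]

def site_status_board(records, experiment):
    """Per-experiment view across all sites (Site Status Board style)."""
    board = defaultdict(dict)
    for r in records:
        if r["experiment"] == experiment:
            board[r["site"]][r["metric"]] = r["status"]
    return dict(board)

def siteview(records, site):
    """Per-site view across all experiments (Siteview style)."""
    view = defaultdict(dict)
    for r in records:
        if r["site"] == site:
            view[r["experiment"]][r["metric"]] = r["status"]
    return dict(view)

print(site_status_board(records, "CMS"))
print(siteview(records, "T1_IT_CNAF"))
```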


Grid Computing | 2004

HEP Applications and Their Experience with the Use of DataGrid Middleware

S. Burke; F. J. Harris; Ian Stokes-Rees; I. Augustin; F. Carminati; J. Closier; E. van Herwijnen; A. Sciaba; D Boutigny; J. J. Blaising; Vincent Garonne; A. Tsaregorodtsev; Paolo Capiluppi; A. Fanfani; C. Grandi; R. Barbera; E. Luppi; Guido Negri; L. Perini; S. Resconi; M. Reale; A. De Salvo; S. Bagnasco; P. Cerello; Kors Bos; D.L. Groep; W. van Leeuwen; Jeffrey Templon; Oxana Smirnova; O. J. E. Maroney

An overview is presented of the characteristics of HEP computing and its mapping to the Grid paradigm. This is followed by a synopsis of the main experiences and lessons learned by HEP experiments in their use of DataGrid middleware using both the EDG application testbed and the LCG production service. Particular reference is made to experiment ‘data challenges’, and a forward look is given to necessary developments in the framework of the EGEE project.


Journal of Physics: Conference Series | 2008

Testing and integrating the WLCG/EGEE middleware in the LHC computing

S Campana; A D Girolamo; E Lanciotti; P M Lorenzo; N Magini; V Miccio; R. Santinelli; A. Sciaba

The main goal of the Experiment Integration and Support (EIS) team in WLCG is to help the LHC experiments use the gLite middleware proficiently as part of their computing frameworks. This contribution gives an overview of the activities of the EIS team and focuses on a few that are particularly important for the experiments. One activity is the evaluation of the gLite workload management system (WMS) to assess its adequacy for the needs of LHC computing in terms of functionality, reliability and scalability. We describe how the experiment requirements were mapped to validation criteria and how the WMS performance was accurately measured under realistic load conditions over prolonged periods of time. Another activity is the integration of the Service Availability Monitoring system (SAM) with the experiment monitoring frameworks. The SAM system is widely used in the EGEE operations to identify malfunctions in Grid services, but it can be adapted to perform the same function on experiment-specific services. We describe how this has been done for the LHC experiments, which are now using SAM as part of their operations.

Collaboration


Top co-authors of A. Sciaba and their affiliations.

G. Bagliesi

Scuola Normale Superiore di Pisa

C. Brew

Rutherford Appleton Laboratory
