Publication


Featured research published by Federica Fanzago.


IEEE International Conference on High Performance Computing, Data and Analytics | 2007

The CMS remote analysis builder (CRAB)

D. Spiga; Stefano Lacaprara; W. Bacchi; Mattia Cinquilli; G. Codispoti; Marco Corvo; A. Dorigo; A. Fanfani; Federica Fanzago; F. M. Farina; M. Merlo; Oliver Gutsche; L. Servoli; C. Kavka

The CMS experiment will produce several petabytes of data every year, to be distributed over many computing centres located in different countries. Analysis of this data will also be performed in a distributed way, using Grid infrastructure. CRAB (CMS Remote Analysis Builder) is a tool, designed and developed by the CMS collaboration, that gives end physicists transparent access to distributed data. Very limited knowledge of the underlying technicalities is required of the user. CRAB interacts with the local user environment, the CMS Data Management services and the Grid middleware, and is able to use the WLCG, gLite and OSG middleware. CRAB has been in production and in routine use by end users since Spring 2004. It has been extensively used in studies to prepare the Physics Technical Design Report (PTDR) and in the analysis of reconstructed event samples generated during the Computing Software and Analysis Challenge (CSA06). This involved generating thousands of jobs per day at peak rates. In this paper we discuss the current implementation of CRAB, the experience of using it in production and the plans to improve it in the immediate future.
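
The workflow sketched in the abstract (discover where a dataset lives, split the user task into independent jobs, submit them through the Grid middleware) can be pictured with a short, purely illustrative Python sketch. The helper names (`locate_dataset`, `submit_to_grid`), the site names and the dataset path below are assumptions for illustration only; they are not part of the actual CRAB code base.

```python
# Minimal sketch of a CRAB-like task flow: data discovery, job splitting,
# submission to the Grid. All names here are illustrative placeholders.

from dataclasses import dataclass


@dataclass
class AnalysisJob:
    dataset: str
    site: str
    first_event: int
    n_events: int


def locate_dataset(dataset: str) -> list[str]:
    """Hypothetical data-discovery call: return the sites hosting the dataset."""
    return ["T2_IT_Legnaro", "T2_US_Nebraska"]  # placeholder answer


def split_task(dataset: str, total_events: int, events_per_job: int) -> list[AnalysisJob]:
    """Split a user task into independent jobs, one slice of events each."""
    sites = locate_dataset(dataset)
    jobs = []
    for i, first in enumerate(range(0, total_events, events_per_job)):
        n = min(events_per_job, total_events - first)
        jobs.append(AnalysisJob(dataset, sites[i % len(sites)], first, n))
    return jobs


def submit_to_grid(job: AnalysisJob) -> str:
    """Hypothetical submission call to the Grid middleware; returns a job id."""
    return f"job-{job.first_event}"


if __name__ == "__main__":
    task = split_task("/MyDataset/Run2007/RECO", total_events=100_000, events_per_job=10_000)
    job_ids = [submit_to_grid(j) for j in task]
    print(f"submitted {len(job_ids)} jobs")
```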


IEEE Nuclear Science Symposium | 2008

CRAB: A CMS application for distributed analysis

G. Codispoti; Mattia Cinquilli; A. Fanfani; Federica Fanzago; F. M. Farina; C. Kavka; Stefano Lacaprara; Vincenzo Miccio; D. Spiga; Eric Wayne Vaandering

Starting from 2008, the CMS experiment will produce several petabytes of data every year, to be distributed over many computing centres located in different countries. The CMS computing model defines how the data has to be distributed and accessed in order to enable physicists to run their analyses efficiently over the data. The analysis will thus be performed in a distributed way using Grid infrastructure. CRAB (CMS Remote Analysis Builder) is a tool, designed and developed by the CMS collaboration, that gives end physicists transparent access to distributed data. CRAB interacts with the local user environment, the CMS Data Management services and the Grid middleware: it takes care of data and resource discovery; it splits the user task into several analysis processes (jobs) and distributes and parallelizes them over different Grid environments; and it takes care of process tracking and output handling. Very limited knowledge of the underlying technical details is required of the end user. The tool can be used as a direct interface to the computing system or can delegate the task to a server, which takes care of handling the user jobs, providing services such as automatic resubmission in case of failures and notification to the user of the task status. Its current implementation is able to interact with the WLCG, gLite and OSG Grid middlewares. Furthermore, it allows, in the very same way, access to local data and batch systems such as LSF. CRAB has been in production and in routine use by end users since Spring 2004. It has been extensively used in studies to prepare the Physics Technical Design Report, in the analysis of reconstructed event samples generated during the Computing Software and Analysis Challenges and in the preliminary cosmic ray data taking. The CRAB architecture and its usage inside the CMS community will be described in detail, as well as the current status and future developments.
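
The server mode described above tracks the jobs of a task and resubmits failures automatically. The sketch below is a minimal, assumption-laden illustration of such a tracking loop; the status strings and the helpers `query_status` and `resubmit` are invented for the example and do not reproduce the real CRAB server interfaces.

```python
# Illustrative tracking loop for a CRAB-like analysis server: poll job status,
# resubmit failures up to a retry limit, notify the user when the task is done.
# Helper functions and status strings are hypothetical placeholders.

import random
import time

MAX_RETRIES = 3


def query_status(job_id: str) -> str:
    """Hypothetical middleware call returning 'done', 'failed' or 'running'."""
    return random.choice(["done", "failed", "running"])  # placeholder answer


def resubmit(job_id: str) -> str:
    """Hypothetical resubmission call; returns the new job identifier."""
    return job_id + "-r"


def track_task(job_ids: list[str]) -> None:
    """Poll all jobs of a task, resubmitting failures up to MAX_RETRIES times."""
    retries = {j: 0 for j in job_ids}
    active = set(job_ids)
    while active:
        for job in sorted(active):
            status = query_status(job)
            if status == "done":
                active.remove(job)
            elif status == "failed" and retries[job] < MAX_RETRIES:
                new_id = resubmit(job)
                retries[new_id] = retries.pop(job) + 1
                active.remove(job)
                active.add(new_id)
            elif status == "failed":
                active.remove(job)  # retries exhausted; report in the task summary
        time.sleep(1)  # short interval for the sketch; a real server polls less often
    print("task finished; notifying the user")


if __name__ == "__main__":
    track_task([f"job-{i}" for i in range(5)])
```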


Journal of Physics: Conference Series | 2010

Use of the gLite-WMS in CMS for production and analysis

G. Codispoti; C. Grandi; A. Fanfani; D. Spiga; M. Cinquilli; F. M. Farina; V. Miccio; Federica Fanzago; A. Sciaba; S. Lacaprara; S. Belforte; D. Bonacorsi; A. Sartirana; D. Dongiovanni; D. Cesini; S. Wakefield; Jose M. Hernandez; S. Lemaitre; M. Litmaath; Y. Calas; E. Roche

The CMS experiment at the LHC started using the Resource Broker (from the EDG and LCG projects) to submit Monte Carlo production and analysis jobs to the distributed computing resources of the WLCG infrastructure more than six years ago. Since 2006 the gLite Workload Management System (WMS) and Logging & Bookkeeping (LB) service have been used. The interaction with the gLite WMS/LB happens through the CMS production and analysis frameworks, respectively ProdAgent and CRAB, via a common component, BOSSLite. The important improvements recently made in the gLite WMS/LB, as well as in the CMS tools, and the intrinsic independence of different WMS/LB instances allow CMS to reach the stability and scalability needed for LHC operations. In particular, the use of a multi-threaded approach in BOSSLite significantly increased the scalability of the system. In this work we present the operational set-up of CMS production and analysis based on the gLite WMS, and the performance obtained in past data challenges and in the daily Monte Carlo production and user analysis activity of the experiment.
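
As an illustration of the multi-threaded approach mentioned above, the sketch below fans job submissions out over a pool of worker threads and over several independent WMS endpoints. It is only a schematic analogy under assumed names (the endpoint URLs and `submit_via_wms` are invented); it does not reproduce the real BOSSLite code.

```python
# Schematic fan-out of job submissions over a thread pool and several
# independent WMS endpoints, in the spirit of the multi-threaded approach
# described above. Endpoint names and the submit call are invented.

from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

WMS_ENDPOINTS = [
    "https://wms01.example.org:7443/submit",  # hypothetical endpoint
    "https://wms02.example.org:7443/submit",  # hypothetical endpoint
]


def submit_via_wms(endpoint: str, job_description: str) -> str:
    """Hypothetical submission call; returns a Grid job identifier."""
    return f"{endpoint}#{hash(job_description) & 0xFFFF}"


def submit_all(job_descriptions: list[str], workers: int = 8) -> list[str]:
    """Submit many jobs concurrently, alternating between WMS instances."""
    endpoints = cycle(WMS_ENDPOINTS)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(submit_via_wms, next(endpoints), jd)
            for jd in job_descriptions
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    ids = submit_all([f"job-{i}.jdl" for i in range(100)])
    print(f"submitted {len(ids)} jobs over {len(WMS_ENDPOINTS)} WMS instances")
```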


Journal of Grid Computing | 2010

Distributed analysis in CMS

A. Fanfani; Anzar Afaq; Jose Afonso Sanches; Julia Andreeva; Giuseppe Bagliesi; L. A. T. Bauerdick; Stefano Belforte; Patricia Bittencourt Sampaio; K. Bloom; Barry Blumenfeld; D. Bonacorsi; C. Brew; Marco Calloni; Daniele Cesini; Mattia Cinquilli; G. Codispoti; Jorgen D’Hondt; Liang Dong; Danilo N. Dongiovanni; Giacinto Donvito; David Dykstra; Erik Edelmann; R. Egeland; P. Elmer; Giulio Eulisse; D. Evans; Federica Fanzago; F. M. Farina; Derek Feichtinger; I. Fisk

The CMS experiment expects to manage several petabytes of data each year during the LHC programme, distributing them over many computing sites around the world and enabling data access at those centres for analysis. CMS has identified the distributed sites as the primary location for physics analysis, to support a wide community with thousands of potential users. This represents an unprecedented experimental challenge in terms of the scale of distributed computing resources and the number of users. An overview of the computing architecture, the software tools and the distributed infrastructure is reported. Summaries of the experience in establishing efficient and scalable operations in preparation for CMS distributed analysis are presented, followed by the users' experience in their current analysis activities.


IEEE Transactions on Nuclear Science | 2009

CRAB: A CMS Application for Distributed Analysis

G. Codispoti; Mattia Cinquilli; A. Fanfani; Federica Fanzago; F. M. Farina; C. Kavka; Stefano Lacaprara; Vincenzo Miccio; D. Spiga; Eric Wayne Vaandering

Beginning in 2009, the CMS experiment will produce several petabytes of data each year which will be distributed over many computing centres located in different countries. The CMS computing model defines how the data is to be distributed and accessed to enable physicists to efficiently run their analyses over the data. The analysis will be performed in a distributed way using Grid infrastructure. CRAB (CMS remote analysis builder) is a specific tool, designed and developed by the CMS collaboration, that allows the end user to transparently access distributed data. CRAB interacts with the local user environment, the CMS data management services and with the Grid middleware; it takes care of the data and resource discovery; it splits the user's task into several processes (jobs) and distributes and parallelizes them over different Grid environments; it performs process tracking and output handling. Very limited knowledge of the underlying technical details is required of the end user. The tool can be used as a direct interface to the computing system or can delegate the task to a server, which takes care of the job handling, providing services such as automatic resubmission in case of failures and notification to the user of the task status. Its current implementation is able to interact with gLite and OSG Grid middlewares. Furthermore, with the same interface, it enables access to local data and batch systems such as the Load Sharing Facility (LSF). CRAB has been in production and in routine use by end users since Spring 2004. It has been extensively used in studies to prepare the Physics Technical Design Report, in the analysis of reconstructed event samples generated during the Computing Software and Analysis Challenges and in the preliminary cosmic ray data taking. The CRAB architecture and the usage inside the CMS community will be described in detail, as well as the current status and future development.
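
The point that the same interface can drive either a Grid middleware or a local batch system such as LSF can be pictured with a small plugin-style abstraction. The class and method names below are hypothetical illustrations of the design idea, not the CRAB scheduler API.

```python
# Illustrative plugin-style abstraction: one submission interface, several
# interchangeable back-ends (a Grid middleware or a local LSF-like batch
# system). Class and method names are invented for this sketch.

from abc import ABC, abstractmethod


class Scheduler(ABC):
    @abstractmethod
    def submit(self, job_description: str) -> str:
        """Submit one job and return its identifier."""


class GridScheduler(Scheduler):
    def submit(self, job_description: str) -> str:
        # A real back-end would talk to the Grid middleware here.
        return f"grid://{abs(hash(job_description)) % 10_000}"


class LSFScheduler(Scheduler):
    def submit(self, job_description: str) -> str:
        # A real back-end would call the local batch system instead.
        return f"lsf:{abs(hash(job_description)) % 10_000}"


def run_task(scheduler: Scheduler, jobs: list[str]) -> list[str]:
    """The rest of the tool is unchanged whichever back-end is plugged in."""
    return [scheduler.submit(j) for j in jobs]


if __name__ == "__main__":
    jobs = [f"analysis-{i}.jdl" for i in range(3)]
    print(run_task(GridScheduler(), jobs))
    print(run_task(LSFScheduler(), jobs))
```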


Proceedings of International Symposium on Grids and Clouds (ISGC) 2017 — PoS(ISGC2017) | 2017

The "Cloud Area Padovana": Lessons Learned after Two Years of a Production OpenStack-based IaaS for the Local INFN User Community

Marco Verlato; Paolo Andreetto; Fabrizio Chiarello; Fulvia Costa; Alberto Crescente; Alvise Dorigo; Sergio Fantinel; Federica Fanzago; Ervin Konomi; Matteo Segatta; Massimo Sgaravatto; Sergio Traldi; Nicola Tritto; Lisa Zangrando

The Cloud Area Padovana is an OpenStack-based scientific cloud, spread across two different sites - the INFN Padova Unit and the INFN Legnaro National Labs - located 10 km apart but connected by a dedicated 10 Gbps optical link. In the last two years its hardware resources have been scaled horizontally by adding new ones: it currently provides about 1100 logical cores and 50 TB of storage. Special in-house developments were also integrated into the OpenStack dashboard, such as a tool for user and project registration with direct support for Single Sign-On via the INFN-AAI Identity Provider as a new option for user authentication. The collaboration with the EU-funded INDIGO-DataCloud project, started one year ago, made it possible to experiment with the integration of Docker-based containers and with fair-share scheduling: a new resource allocation mechanism, analogous to those available in batch system schedulers, for maximizing the usage of shared resources among concurrent users and projects. Both solutions are expected to be available in production soon. The entire computing facility now satisfies the computational and storage demands of more than 100 users belonging to about 30 research projects.

In this paper we present the architecture of the cloud infrastructure and the tools and procedures used to operate it while ensuring reliability and fault tolerance. We especially focus on the lessons learned in these two years, describing the challenges identified and the subsequent corrective actions applied. From the perspective of scientific applications, we show some concrete use cases of how this cloud infrastructure is being used. In particular, we focus on two big physics experiments which are intensively exploiting this computing facility: CMS and SPES. CMS deployed on the cloud a complex computational infrastructure, composed of several user interfaces for job submission to the Grid environment or to local batch queues, and for interactive processes; this is fully integrated with the local Tier-2 facility. To avoid a static allocation of the resources, an elastic cluster, initially based only on CernVM, has been configured: it automatically creates and deletes virtual machines according to the users' needs. SPES is using a client-server system called TraceWin to exploit INFN's virtual resources, performing a very large number of simulations on about a thousand elastically managed nodes.
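
The elastic CMS cluster mentioned above grows and shrinks with the load on the batch queue. The loop below is a minimal sketch of such a controller under stated assumptions: the queue inspection and the cloud calls are represented by hypothetical helper functions rather than the real CernVM elastic-cluster machinery or OpenStack API.

```python
# Minimal sketch of an elastic-cluster controller: watch the batch queue and
# create or delete worker virtual machines accordingly. The helpers below
# (pending_jobs, idle_workers, boot_worker, delete_worker) are hypothetical
# stand-ins for the real batch-system and OpenStack calls.

import time

MIN_WORKERS = 2
MAX_WORKERS = 50


def pending_jobs() -> int:
    """Hypothetical: number of jobs waiting in the batch queue."""
    return 0


def idle_workers() -> list[str]:
    """Hypothetical: ids of worker VMs that have been idle for a while."""
    return []


def running_workers() -> int:
    """Hypothetical: number of worker VMs currently alive."""
    return MIN_WORKERS


def boot_worker() -> None:
    """Hypothetical: ask the cloud for one more worker VM."""


def delete_worker(worker_id: str) -> None:
    """Hypothetical: delete an idle worker VM to free cloud resources."""


def control_loop(poll_seconds: int = 300) -> None:
    while True:
        if pending_jobs() > 0 and running_workers() < MAX_WORKERS:
            boot_worker()
        else:
            for worker in idle_workers():
                if running_workers() > MIN_WORKERS:
                    delete_worker(worker)
        time.sleep(poll_seconds)
```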


Proceedings of International Symposium on Grids and Clouds (ISGC) 2016 — PoS(ISGC 2016) | 2017

Synergy: a service for optimising the resource allocation in cloud-based environments

Lisa Zangrando; Marco Verlato; Federica Fanzago; Massimo Sgaravatto

In OpenStack, the current resource allocation model provides each user group with a fixed amount of resources. This model, based on fixed quotas, accurately reflects the pay-per-use economic model on which the cloud paradigm is built. However, it is not well suited to the computational model of scientific computing, whose resource demands cannot be predetermined and vary greatly over time. Usually the size of the quota is agreed with the cloud infrastructure manager when a new project is created, and it rarely changes over time. The main limitation of this static partitioning of resources appears in a scenario of full quota utilization: a project cannot exceed its own quota even if the cloud infrastructure contains many unused resources assigned to different groups. It follows that the overall efficiency of a data centre is often rather low.

The European project INDIGO-DataCloud is addressing this issue with "Synergy", a new service that provides OpenStack with an advanced provisioning model based on scheduling algorithms known as "fair-share". In addition to maximizing usage, fair-share ensures that resources are distributed equitably among users and groups.

In this paper we discuss the solution offered by INDIGO with Synergy, describing its features, its architecture and the limitations of the selected algorithm, as confirmed by the preliminary results of tests performed on the Padua testbed integrated with the EGI Federated Cloud.
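
Fair-share schedulers of the kind referred to above typically derive a priority from the gap between a group's agreed share and its recent usage. The formula and code below follow a classic batch-scheduler approach (a 2^(-usage/share) factor of the kind popularised by Slurm) and are given only as an assumed illustration; they are not the specific algorithm implemented by Synergy.

```python
# Illustrative fair-share priority, in the style of classic batch schedulers:
# groups that have consumed less than their agreed share get a higher factor.
# This is an assumed textbook formula, not the Synergy implementation.

def fairshare_factor(normalized_usage: float, normalized_share: float) -> float:
    """Return a value in (0, 1]; 1.0 means the group has used nothing yet."""
    if normalized_share <= 0.0:
        return 0.0
    return 2.0 ** (-normalized_usage / normalized_share)


# Example: three groups with equal 1/3 shares but different recent usage.
usage = {"cms": 0.50, "spes": 0.15, "theory": 0.05}   # fractions of total usage
share = {"cms": 1 / 3, "spes": 1 / 3, "theory": 1 / 3}

priorities = {g: fairshare_factor(usage[g], share[g]) for g in usage}
for group, p in sorted(priorities.items(), key=lambda kv: -kv[1]):
    print(f"{group:7s} fair-share factor = {p:.2f}")
# Groups below their share (theory, spes) are scheduled before the heavy user.
```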


2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC) | 2016

Any Data, Any Time, Anywhere: Global Data Access for Science

Kenneth Bloom; T. Boccali; Brian Bockelman; D. C. Bradley; Sridhara Dasu; J. M. Dost; Federica Fanzago; I. Sfiligoi; A. Tadel; M. Tadel; C. Vuosalo; F. Würthwein; Avi Yagil; M. Zvada

Data access is key to science driven by distributed high-throughput computing (DHTC), an essential technology for many major research projects such as High Energy Physics (HEP) experiments. However, achieving efficient data access becomes quite difficult when many independent storage sites are involved, because users are burdened with learning the intricacies of accessing each system and keeping careful track of data location. We present an alternative approach: the Any Data, Any Time, Anywhere (AAA) infrastructure. Combining several existing software products, AAA presents a global, unified view of storage systems - a data federation, a global filesystem for software delivery, and a workflow management system. We describe how one HEP experiment, the Compact Muon Solenoid (CMS), is utilizing the AAA infrastructure, and report some simple performance metrics.
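
The data-federation idea described above lets a client name a file by its logical path and leave location lookup to a redirector. The PyROOT sketch below illustrates that access pattern; the redirector host and file path are illustrative assumptions, and in practice reading protected experiment data also requires valid credentials (e.g. a grid proxy).

```python
# Minimal sketch of reading a file through an XRootD data federation: the
# client contacts a global redirector, which locates a site actually hosting
# the file. Requires ROOT built with XRootD support; the redirector host and
# the file path below are illustrative only.

import ROOT

url = "root://cms-xrd-global.cern.ch//store/user/someuser/example_file.root"

f = ROOT.TFile.Open(url)
if not f or f.IsZombie():
    raise RuntimeError(f"could not open {url}")

# From here on the remote file behaves like a local one.
for key in f.GetListOfKeys():
    print(key.GetName(), key.GetClassName())
f.Close()
```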


Journal of Physics: Conference Series | 2014

Experience in CMS with the common analysis framework project

Marco Mascheroni; D. Spiga; T. Boccali; D. Bonacorsi; Mattia Cinquilli; D. Giordano; Federica Fanzago; I. Fisk; M. Girone; Jose Hernandez; Preslav Konstantinov; Niccolò Magini; Valentina Mancinelli; Hassen Riahi; Lola Saiz Santos; Eric Wayne Vaandering

ATLAS, CERN-IT, and CMS embarked on a project to develop a common system for analysis workflow management, resource provisioning and job scheduling. This distributed computing infrastructure was based on elements of PanDA and prior CMS workflow tools. After an extensive feasibility study and development of a proof-of-concept prototype, the project now has a basic infrastructure that supports the analysis use cases of both experiments via common services. In this paper we will discuss the state of the current solution and give an overview of all the components of the system.


Nuclear Physics B - Proceedings Supplements | 2008

CRAB: the CMS distributed analysis tool development and design

D. Spiga; S. Lacaprara; W. Bacchi; Mattia Cinquilli; G. Codispoti; M. Corvo; A. Dorigo; A. Fanfani; Federica Fanzago; F. M. Farina; Oliver Gutsche; C. Kavka; M. Merlo; L. Servoli

Collaboration


Dive into Federica Fanzago's collaborations.

Top Co-Authors

D. Spiga

University of Perugia


C. Kavka

University of Perugia
