Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where B. Koblitz is active.

Publication


Featured research published by B. Koblitz.


Journal of Physics: Conference Series | 2008

Managing ATLAS data on a petabyte-scale with DQ2

Miguel Branco; D. Cameron; Benjamin Gaidioz; Vincent Garonne; B. Koblitz; M. Lassnig; Ricardo Rocha; Pedro Salgado; T Wenaus

The ATLAS detector at CERN's Large Hadron Collider presents data handling requirements on an unprecedented scale. From 2008 on, the ATLAS distributed data management system, Don Quijote2 (DQ2), must manage tens of petabytes of experiment data per year, distributed globally via the LCG, OSG and NDGF computing grids, now commonly known as the WLCG. Since its inception in 2005, DQ2 has continuously managed all experiment data for the ATLAS collaboration, which now comprises over 3000 scientists participating from more than 150 universities and laboratories in 34 countries. Fulfilling its primary requirement of providing a highly distributed, fault-tolerant and scalable architecture, DQ2 was successfully upgraded from managing data on a terabyte scale to managing data on a petabyte scale. We present improvements and enhancements to DQ2 based on the increasing demands for ATLAS data management. We describe performance issues, architectural changes and implementation decisions, the current state of deployment in test and production, as well as anticipated future improvements. Test results presented here show that DQ2 is capable of handling data up to and beyond the requirements of full-scale data-taking.
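
As a rough illustration of the dataset-level bookkeeping such a system performs, the sketch below tracks which files belong to a dataset and at which sites replicas live, then asks which sites hold a complete copy. The class, names and data are invented for this example and do not reflect DQ2's actual catalogues or API.

```python
# Hypothetical illustration of dataset-level bookkeeping in the spirit of DQ2;
# names and structures are invented for this sketch and are not the DQ2 API.
from collections import defaultdict

class ToyDatasetCatalogue:
    """Tracks which files belong to a dataset and where replicas live."""

    def __init__(self):
        self.datasets = defaultdict(set)   # dataset name -> set of file GUIDs
        self.replicas = defaultdict(set)   # file GUID -> set of site names

    def add_file(self, dataset, guid, sites):
        self.datasets[dataset].add(guid)
        self.replicas[guid].update(sites)

    def complete_sites(self, dataset):
        """Sites that hold every file of the dataset (candidate transfer sources)."""
        files = self.datasets[dataset]
        if not files:
            return set()
        return set.intersection(*(self.replicas[g] for g in files))

catalogue = ToyDatasetCatalogue()
catalogue.add_file("data08.physics", "guid-001", {"CERN", "BNL"})
catalogue.add_file("data08.physics", "guid-002", {"CERN"})
print(catalogue.complete_sites("data08.physics"))   # {'CERN'}
```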


Journal of Grid Computing | 2008

The AMGA Metadata Service

B. Koblitz; Nuno Santos; V. Pose

We present the AMGA metadata catalogue, which was developed as part of the gLite Grid middleware of the EGEE (Enabling Grids for E-sciencE) project. AMGA provides access to metadata for files stored on the Grid, as well as simplified general access to relational data stored in database systems. The design and implementation of AMGA were done in close collaboration with the very diverse EGEE user community to make sure all functionality, performance and security requirements were met. In particular, AMGA targets the need of the high-energy physics community to rapidly access very large amounts of metadata, as well as the security needs of the biomedical community. AMGA therefore tightly integrates fine-grained access control making use of a virtual organisation management system. In addition, it offers advanced federation features to increase dependability, performance and data security.
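
To make the access-control idea concrete, here is a minimal, purely hypothetical sketch of a metadata catalogue that filters query results by the querying user's group membership, in the spirit of AMGA's fine-grained, VO-based access control. The class and method names are invented and are not the AMGA client interface.

```python
# Hypothetical sketch of a metadata catalogue with per-entry access control,
# illustrating the concepts described above; this is not the AMGA client API.

class ToyMetadataCatalogue:
    def __init__(self):
        self.entries = {}   # logical file name -> {"attrs": {...}, "read_groups": set()}

    def add_entry(self, lfn, attrs, read_groups):
        self.entries[lfn] = {"attrs": dict(attrs), "read_groups": set(read_groups)}

    def select(self, predicate, user_groups):
        """Return attributes of entries the user may read and that match the predicate."""
        user_groups = set(user_groups)
        return {
            lfn: e["attrs"]
            for lfn, e in self.entries.items()
            if e["read_groups"] & user_groups and predicate(e["attrs"])
        }

cat = ToyMetadataCatalogue()
cat.add_entry("/grid/atlas/run1234.root", {"run": 1234, "events": 5_000_000},
              read_groups={"/atlas"})
cat.add_entry("/grid/biomed/scan42.dcm", {"modality": "MRI"},
              read_groups={"/biomed"})

# A user in the ATLAS virtual organisation sees only ATLAS entries.
print(cat.select(lambda a: a.get("run") == 1234, user_groups={"/atlas"}))
```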


Journal of Grid Computing | 2008

A Secure Grid Medical Data Manager Interfaced to the gLite Middleware

Johan Montagnat; Ákos Frohner; Daniel Jouvenot; Christophe Pera; Peter Z. Kunszt; B. Koblitz; Nuno Santos; Charles Loomis; Romain Texier; Diane Lingrand; Patrick Guio; Ricardo Rocha; Antonio Sobreira de Almeida; Zoltan Farkas

The medical community is producing and manipulating a tremendous volume of digital data for which computerized archiving, processing and analysis is needed. Grid infrastructures are promising for dealing with challenges arising in computerized medicine, but the manipulation of medical data on such infrastructures faces both the problem of interconnecting medical information systems to Grid middleware and that of preserving patients' privacy in a wide and distributed multi-user system. These constraints often limit the use of Grids for manipulating sensitive medical data. This paper describes our design of a medical data management system that takes advantage of the advanced gLite data management services, developed in the context of the EGEE project, to fulfill the stringent needs of the medical community. It ensures medical data protection through strict data access control, anonymization and encryption. The multi-level access control provides the flexibility needed for implementing complex medical use cases. Data anonymization prevents the exposure of most sensitive data to unauthorized users, and data encryption guarantees data protection even when it is stored at remote sites. Moreover, the developed prototype provides a Grid storage resource manager (SRM) interface to standard medical DICOM servers, thereby enabling transparent access to medical data without interfering with medical practice.
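
The following sketch illustrates the anonymization step described above: identifying fields are stripped from a record and replaced with a keyed, non-reversible pseudonym before the metadata leaves the hospital boundary. The field names and the HMAC-based pseudonym scheme are assumptions for illustration; a real deployment would follow DICOM de-identification rules and use proper encryption for the image data itself.

```python
# Hypothetical sketch of the anonymise-then-store idea described above.
# Field names and the keyed pseudonym scheme are illustrative only; real
# deployments would use DICOM de-identification profiles and vetted
# encryption libraries, not this toy code.
import hashlib
import hmac

PSEUDONYM_KEY = b"site-secret-key"   # assumed per-site secret, for illustration

def pseudonymise(record):
    """Replace identifying fields with a keyed, non-reversible pseudonym."""
    patient_id = record.pop("patient_name") + record.pop("patient_id")
    token = hmac.new(PSEUDONYM_KEY, patient_id.encode(), hashlib.sha256).hexdigest()[:16]
    record["pseudonym"] = token
    return record

record = {"patient_name": "Doe^Jane", "patient_id": "12345",
          "modality": "MRI", "study_date": "2007-05-01"}
safe = pseudonymise(dict(record))
print(safe)   # identifying fields removed; only the pseudonym and clinical metadata remain
```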


Parallel Computing | 2007

Exploring high performance distributed file storage using LDPC codes

Benjamin Gaidioz; B. Koblitz; Nuno Santos

We explore the feasibility of implementing a reliable, high-performance, distributed storage system on a commodity computing cluster. Files are distributed across storage nodes using erasure coding with small low-density parity-check (LDPC) codes, which provide high reliability with small storage and performance overhead. We present performance measurements done on a prototype system comprising 50 nodes, which are self-organised using a peer-to-peer overlay.
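
Since LDPC details are beyond a short example, the sketch below uses a single XOR parity block to show the underlying idea of erasure-coded storage: a lost block on one node can be rebuilt from the surviving blocks on the others. Real LDPC codes tolerate more failures at low overhead; everything here is a simplified stand-in.

```python
# A minimal illustration of erasure-coded storage with one XOR parity block.
# The paper uses small LDPC codes, which tolerate more losses; this sketch
# only survives the loss of a single block, but shows the idea of
# reconstructing data from the surviving pieces held on other nodes.
from functools import reduce

def split_with_parity(data: bytes, k: int):
    """Split data into k equal blocks and append one XOR parity block."""
    block_len = -(-len(data) // k)                       # ceiling division
    padded = data.ljust(block_len * k, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(k)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)
    return blocks + [parity]

def recover(blocks, lost_index):
    """Rebuild one missing block by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index and b is not None]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), survivors)

blocks = split_with_parity(b"distributed storage demo", k=4)
lost = 2
blocks[lost] = None                  # simulate a failed storage node
print(recover(blocks, lost))         # reconstructs the lost block: b'storag'
```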


Journal of Physics: Conference Series | 2008

Storage Resource Manager Version 2.2: design, implementation, and testing experience

Flavia Donno; Lana Abadie; Paolo Badino; Jean Philippe Baud; Ezio Corso; Shaun De Witt; Patrick Fuhrmann; Junmin Gu; B. Koblitz; Sophie Lemaitre; Maarten Litmaath; Dimitry Litvintsev; Giuseppe Lo Presti; L. Magnoni; Gavin McCance; Tigran Mkrtchan; Rémi Mollon; Vijaya Natarajan; Timur Perelmutov; D. Petravick; Arie Shoshani; Alex Sim; David Smith; Paolo Tedesco; Riccardo Zappi

Storage services are crucial components of the Worldwide LHC Computing Grid infrastructure, spanning more than 200 sites and serving computing and storage resources to the High Energy Physics LHC communities. Up to tens of petabytes of data are collected every year by the four LHC experiments at CERN. To process these large data volumes it is important to establish a protocol and a very efficient interface to the various storage solutions adopted by the WLCG sites. In this work we report on the experience acquired during the definition of the Storage Resource Manager v2.2 protocol. In particular, we focus on the study performed to enhance the interface and make it suitable for use by the WLCG communities. At the moment, five different storage solutions implement the SRM v2.2 interface: BeStMan (LBNL), CASTOR (CERN and RAL), dCache (DESY and FNAL), DPM (CERN), and StoRM (INFN and ICTP). After a detailed review of the protocol, various test suites have been written, identifying the most effective set of tests: the S2 test suite from CERN and the SRM-Tester test suite from LBNL. These test suites have helped verify the consistency and coherence of the proposed protocol and validate existing implementations. We conclude our work by describing the results achieved.
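
A conformance campaign of this kind boils down to running the same operations against every implementation and recording what passes. The sketch below is a purely hypothetical illustration of that pattern; the endpoints, operation names and the run_operation stub are invented and do not represent the S2 or SRM-Tester suites.

```python
# Hypothetical sketch of cross-implementation conformance testing: run the
# same named operations against every endpoint and record a pass/fail matrix.
# Endpoints, operations and the run_operation stub are invented for
# illustration only.

ENDPOINTS = ["bestman.example.org", "castor.example.org", "dcache.example.org"]
OPERATIONS = ["ls", "prepare_to_put", "prepare_to_get", "reserve_space"]

def run_operation(endpoint: str, operation: str) -> bool:
    """Stand-in for issuing a real request; here it just pretends to succeed."""
    return True

def conformance_matrix():
    return {
        endpoint: {op: run_operation(endpoint, op) for op in OPERATIONS}
        for endpoint in ENDPOINTS
    }

for endpoint, results in conformance_matrix().items():
    failed = [op for op, ok in results.items() if not ok]
    print(f"{endpoint}: {'all operations passed' if not failed else 'failed: ' + ', '.join(failed)}")
```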


IEEE Conference on Mass Storage Systems and Technologies | 2007

Storage Resource Managers: Recent International Experience on Requirements and Multiple Co-Operating Implementations

Lana Abadie; Paolo Badino; J.-P. Baud; Ezio Corso; M. Crawford; S. De Witt; Flavia Donno; A. Forti; Ákos Frohner; Patrick Fuhrmann; G. Grosdidier; Junmin Gu; Jens Jensen; B. Koblitz; Sophie Lemaitre; Maarten Litmaath; D. Litvinsev; G. Lo Presti; L. Magnoni; T. Mkrtchan; Alexander Moibenko; Rémi Mollon; Vijaya Natarajan; Gene Oleynik; Timur Perelmutov; D. Petravick; Arie Shoshani; Alex Sim; David Smith; M. Sponza

Storage management is one of the most important enabling technologies for large-scale scientific investigations. Having to deal with multiple heterogeneous storage and file systems is one of the major bottlenecks in managing, replicating, and accessing files in distributed environments. Storage resource managers (SRMs), named after their Web services control protocol, provide the technology needed to manage the rapidly growing distributed data volumes produced by faster and larger computational facilities. SRMs are grid storage services providing interfaces to storage resources, as well as advanced functionality such as dynamic space allocation and file management on shared storage systems. They call on transport services to bring files into their space transparently and provide effective sharing of files. SRMs are based on a common specification that emerged over time and evolved into an international collaboration. This approach of an open specification that can be used by various institutions to adapt to their own storage systems has proven to be a remarkable success: the challenge has been to provide a consistent homogeneous interface to the grid, while allowing sites to have diverse infrastructures. In particular, supporting optional features while preserving interoperability is one of the main challenges we describe in this paper. We also describe using SRM in a large international high energy physics collaboration, called WLCG, to prepare to handle the large volume of data expected when the Large Hadron Collider (LHC) goes online at CERN. This intense collaboration led to refinements and additional functionality in the SRM specification, and the development of multiple interoperating implementations of SRM for various complex multi-component storage systems.


IEEE Transactions on Nuclear Science | 2006

Measurement of the LCG2 and gLite File Catalogue's Performance

Craig Munro; B. Koblitz; Nuno Santos; Akram Khan

When the Large Hadron Collider (LHC) begins operation at CERN in 2007 it will produce data in volumes never before seen. Physicists around the world will manage, distribute and analyse petabytes of this data using the middleware provided by the LHC Computing Grid. One of the critical factors in the smooth running of this system is the performance of the file catalogues, which allow users to access their files with a logical filename without knowing their physical location. This paper presents a detailed study comparing the performance and respective merits and shortcomings of two of the main catalogues: the LCG File Catalogue and the gLite FiReMan catalogue.
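
The kind of measurement such a comparison relies on can be sketched as timing repeated logical-to-physical name lookups. In the toy benchmark below an in-memory dictionary stands in for the LFC or FiReMan service; it says nothing about their real performance and exists only to show the shape of the test.

```python
# Hypothetical sketch of a catalogue lookup benchmark: time repeated
# logical-to-physical name resolutions. The in-memory dictionary stands in
# for a real catalogue service and carries no performance claim.
import time

catalogue = {f"/grid/atlas/file{i:06d}": f"srm://se.example.org/atlas/file{i:06d}"
             for i in range(100_000)}

def benchmark_lookups(n: int) -> float:
    """Return the mean lookup latency in microseconds over n queries."""
    start = time.perf_counter()
    for i in range(n):
        _ = catalogue[f"/grid/atlas/file{i % len(catalogue):06d}"]
    return (time.perf_counter() - start) / n * 1e6

print(f"mean lookup latency: {benchmark_lookups(10_000):.2f} µs")
```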


IEEE Nuclear Science Symposium | 2005

Performance comparison of the LCG2 and gLite file catalogues

Craig Munro; B. Koblitz; Nuno Santos; Akram Khan

When the Large Hadron Collider (LHC) begins operation at CERN in 2007 it will produce data in volumes never before seen. Physicists around the world will manage, distribute and analyse petabytes of this data using the middleware provided by the LHC Computing Grid. One of the critical factors in the smooth running of this system is the performance of the file catalogues, which allow users to access their files with a logical filename without knowing their physical location. This paper presents a detailed study comparing the performance and respective merits and shortcomings of two of the main catalogues: the LCG File Catalogue and the gLite FiReMan catalogue.


Journal of Physics: Conference Series | 2010

Migration of ATLAS PanDA to CERN

Graeme Andrew Stewart; Alexei Klimentov; B. Koblitz; M. Lamanna; T. Maeno; P. Nevski; Marcin Nowak; Pedro Emanuel De Castro Faria Salgado; Torre Wenaus

The ATLAS Production and Distributed Analysis System (PanDA) is a key component of the ATLAS distributed computing infrastructure. All ATLAS production jobs, and a substantial amount of user and group analysis jobs, pass through the PanDA system, which manages their execution on the grid. PanDA also plays a key role in production task definition and the dataset replication request system. PanDA has recently been migrated from Brookhaven National Laboratory (BNL) to the European Organization for Nuclear Research (CERN), a process we describe here. We discuss how the new infrastructure for PanDA, which relies heavily on services provided by CERN IT, was introduced in order to make the service as reliable as possible and to allow it to be scaled to ATLAS's increasing need for distributed computing. The migration involved changing the backend database for PanDA from MySQL to Oracle, which impacted the database schemas. The process by which the client code was optimised for the new database backend is discussed. We describe the procedure by which the new database infrastructure was tested and commissioned for production use. Operations during the migration had to be planned carefully to minimise disruption to ongoing ATLAS offline computing. All parts of the migration were fully tested before commissioning the new infrastructure, and the gradual migration of computing resources to the new system allowed any problems of scaling to be addressed.
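
One concrete flavour of the client-side changes such a backend switch entails is parameter binding: Python DB-API drivers for MySQL and Oracle use different placeholder styles, and keeping the SQL text constant with bind variables lets Oracle reuse parsed statements. The table and column names below are invented for illustration and are not PanDA's schema.

```python
# Hypothetical illustration of one kind of client-side change a MySQL-to-Oracle
# migration entails: different DB-API placeholder styles, with bind variables
# keeping the SQL text constant so Oracle can reuse the parsed statement.
# Table and column names are invented for this sketch.

def fetch_job_mysql(cursor, job_id):
    # MySQLdb-style 'format' placeholders
    cursor.execute("SELECT status FROM jobs WHERE job_id = %s", (job_id,))
    return cursor.fetchone()

def fetch_job_oracle(cursor, job_id):
    # cx_Oracle-style named bind variables; the literal SQL stays constant
    # across calls, so the server-side statement cache can be reused.
    cursor.execute("SELECT status FROM jobs WHERE job_id = :job_id", {"job_id": job_id})
    return cursor.fetchone()
```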


Nuclear Instruments & Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment | 2006

Metadata services on the Grid

Nuno Santos; B. Koblitz

Collaboration


Dive into B. Koblitz's collaboration.

Top Co-Authors


Nuno Santos

École Polytechnique Fédérale de Lausanne


Akram Khan

Brunel University London


Craig Munro

Brunel University London
