Publications

Featured research published by Jim Pivarski.


arXiv: Distributed, Parallel, and Cluster Computing | 2017

Big Data in HEP: A comprehensive use case study

Oliver Gutsche; Jim Pivarski; Jim Kowalkowski; Nhan Tran; A. Svyatkovskiy; Matteo Cremonesi; P. Elmer; Bo Jayatilaka; Saba Sehrish; Cristina Mantilla Suarez

Experimental particle physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems, collectively called Big Data technologies, have emerged to support the analysis of petabyte- and exabyte-scale datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches, promise a fresh look at the analysis of very large datasets, and could potentially reduce the time-to-physics with increased interactivity. In this talk, we present an active LHC Run 2 analysis, searching for dark matter with the CMS detector, as a testbed for Big Data technologies. We directly compare the traditional NTuple-based analysis with an equivalent analysis using Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the analysis with the official experiment data formats and produce publication-quality physics plots. We discuss the advantages and disadvantages of each approach and give an outlook on further studies needed.
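The abstract contrasts a per-event NTuple loop with a Spark-style declarative filter-and-transform pipeline. As a minimal sketch of that contrast, using toy data and field names that are purely illustrative (not the CMS data format or the actual Spark API):

```python
# Toy events: each has missing transverse energy (MET) and a jet count.
# Field names and cut values are hypothetical, for illustration only.
events = [
    {"met": 250.0, "n_jets": 3},
    {"met": 80.0,  "n_jets": 1},
    {"met": 410.0, "n_jets": 5},
]

# Traditional NTuple style: an explicit loop over events.
selected_loop = []
for ev in events:
    if ev["met"] > 200.0 and ev["n_jets"] >= 2:
        selected_loop.append(ev["met"])

# Big Data / Spark style: a declarative filter and map over the collection.
selected_decl = list(
    map(lambda ev: ev["met"],
        filter(lambda ev: ev["met"] > 200.0 and ev["n_jets"] >= 2, events))
)

assert selected_loop == selected_decl == [250.0, 410.0]
```

The two forms compute the same selection; the declarative form is what a distributed engine like Spark can parallelize across a cluster without the analyst managing the event loop.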


arXiv: Computational Physics | 2018

HEP Software Foundation Community White Paper Working Group - Data Analysis and Interpretation

L. A. T. Bauerdick; Martin Ritter; Oliver Gutsche; M. D. Sokoloff; N. F. Castro; M. Girone; T. Sakuma; P. Elmer; Brian Bockelman; Elizabeth Sexton-Kennedy; G. Watts; J. Letts; F. Würthwein; C. Vuosalo; Jim Pivarski; Daniel S. Katz; Riccardo Maria Bianchi; K. Cranmer; Robert Gardner; Shawn Patrick McKee; B. Hegner; E. Rodrigues; David Lange; Christoph Paus; José M. Hernández; K. Pedro; Bodhitha Jayatilaka; Lukasz Kreczko

At the heart of experimental high energy physics (HEP) is the development of facilities and instrumentation that provide sensitivity to new phenomena. Our understanding of nature at its most fundamental level is advanced through the analysis and interpretation of data from sophisticated detectors in HEP experiments. The goal of data analysis systems is to realize the maximum possible scientific potential of the data within the constraints of computing and human resources in the least time. To achieve this goal, future analysis systems should empower physicists to access the data with a high level of interactivity, reproducibility and throughput capability. As part of the HEP Software Foundation Community White Paper process, a working group on Data Analysis and Interpretation was formed to assess the challenges and opportunities in HEP data analysis and develop a roadmap for activities in this area over the next decade. In this report, the key findings and recommendations of the Data Analysis and Interpretation Working Group are presented.


Journal of Physics: Conference Series | 2018

Optimizing ROOT IO For Analysis

Brian Bockelman; Zhe Zhang; Jim Pivarski

The ROOT I/O (RIO) subsystem is foundational to most HEP experiments: it provides a file format, a set of APIs/semantics, and a reference implementation in C++. It is often found at the base of an experiment's framework and is used to serialize the experiment's data; in the case of an LHC experiment, this may be hundreds of petabytes of files! Individual physicists will further use RIO to perform their end-stage analysis, reading from intermediate files they generate from experiment data. RIO is thus incredibly flexible: it must serve as a file format both for archival (optimized for space) and for working data (optimized for read speed). To date, most of the technical work has focused on improving the former use case. We present work designed to help improve RIO for analysis. We analyze the real-world impact of LZ4 in decreasing decompression times (and the corresponding cost in disk space). We introduce new APIs that read RIO data in bulk, removing the per-event overhead of a C++ function call. We compare the performance with the existing RIO APIs for simply structured data and show how this can be complementary with efforts to improve the parallelism of the RIO stack.
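The bulk-read idea in the abstract, replacing one function call per event with one call per batch, can be sketched with a toy reader. This is not the real RIO API, only an illustration of the access-pattern contrast:

```python
class ToyBranch:
    """Toy stand-in for a ROOT branch; not the actual RIO API."""

    def __init__(self, values):
        self._values = values

    def get_entry(self, i):
        # Per-event access: one call (and its call overhead) per entry.
        return self._values[i]

    def get_entries_bulk(self, start, stop):
        # Bulk access: one call returns a contiguous range of entries,
        # amortizing the per-call overhead over the whole batch.
        return self._values[start:stop]


branch = ToyBranch([0.5, 1.5, 2.5, 3.5])

per_event = [branch.get_entry(i) for i in range(4)]  # four calls
bulk = branch.get_entries_bulk(0, 4)                 # one call

assert per_event == bulk == [0.5, 1.5, 2.5, 3.5]
```

In Python the per-call savings are modest, but in C++ the eliminated virtual-call and type-dispatch overhead per event is exactly what the abstract's bulk APIs target.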


Archive | 2016

histogrammar-scala: 1.0.0

Jim Pivarski; A. Svyatkovskiy


arXiv: Computational Physics | 2018

HEP Software Foundation Community White Paper Working Group - Data and Software Preservation to Enable Reuse

Hildreth; M. Neubauer; S. Neubert; K. Cranmer; Carlos Maltzahn; Daniel S. Katz; Robert Gardner; G. Watts; A. Verbytskyi; D. South; M. Kane; Elizabeth Sexton-Kennedy; J. Shiers; T. Malik; Jim Pivarski; J. Wozniak; A. Boehnlein; L. Heinrich; S. Dallmeier-Tiessen; T. Hacker; S. Smith; T. Simko; I. Jimenez


Journal of Physics: Conference Series | 2018

CMS Analysis and Data Reduction with Apache Spark

Oliver Gutsche; Luca Canali; Illia Cremer; Matteo Cremonesi; P. Elmer; I. Fisk; M. Girone; Bo Jayatilaka; Jim Kowalkowski; Viktor Khristenko; Evangelos Motesnitsalis; Jim Pivarski; Saba Sehrish; Kacper Surdy; A. Svyatkovskiy


Journal of Physics: Conference Series | 2018

Toward real-time data query systems in HEP

Jim Pivarski; David Lange; Thanat Jatuphattharachat


International Conference on Big Data | 2017

Fast access to columnar, hierarchically nested data via code transformation

Jim Pivarski; P. Elmer; Brian Bockelman; Zhe Zhang
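The "columnar, hierarchically nested data" of this title can be sketched as a jagged array stored in two flat columns, offsets and content, which is the standard columnar encoding of nested lists (an illustrative toy, not the paper's actual code):

```python
# Nested lists (e.g. jet pTs per event), stored in columnar form:
# `content` holds every value flat; `offsets[i]:offsets[i+1]` delimits event i.
content = [30.0, 25.0, 50.0, 40.0, 35.0, 20.0]
offsets = [0, 2, 2, 5, 6]  # note: event 1 is empty (offsets 2..2)

def event(i):
    """Reconstruct the i-th nested list from the flat columns."""
    return content[offsets[i]:offsets[i + 1]]

assert event(0) == [30.0, 25.0]
assert event(1) == []
assert event(2) == [50.0, 40.0, 35.0]
assert event(3) == [20.0]
```

Because both columns are flat arrays, selections and transformations can run over `content` without materializing per-event objects; code transformation (the paper's subject) is what lets loops written against nested records execute against this flat layout.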


arXiv | 2017

CMS Analysis and Data Reduction with Apache Spark

Oliver Gutsche; Jim Pivarski; M. Girone; Jim Kowalkowski; Luca Canali; Viktor Khristenko; Matteo Cremonesi; A. Svyatkovskiy; P. Elmer; Illia Cremer; Bo Jayatilaka; Saba Sehrish; I. Fisk; Evangelos Motesnitsalis; Kacper Surdy


Archive | 2017

Fast Access to Columnar, Hierarchical Data via Code Transformation

Jim Pivarski; P. Elmer; Brian Bockelman; Zhe Zhang

Collaboration

Jim Pivarski's top co-authors:

P. Elmer (Princeton University)
Brian Bockelman (University of Nebraska–Lincoln)
Saba Sehrish (University of Central Florida)
Zhe Zhang (University of Nebraska–Lincoln)