
Publication


Featured research published by Jacek Becla.


Very Large Data Bases | 2009

A demonstration of SciDB: a science-oriented DBMS

Philippe Cudré-Mauroux; Hideaki Kimura; Kian-Tat Lim; Jennie Rogers; Roman Simakov; Emad Soroush; Pavel Velikhov; Daniel L. Wang; Magdalena Balazinska; Jacek Becla; David J. DeWitt; Bobbi Heath; David Maier; Samuel Madden; Jignesh M. Patel; Michael Stonebraker; Stanley B. Zdonik

In CIDR 2009, we presented a collection of requirements for SciDB, a DBMS that would meet the needs of scientific users. These included a nested-array data model, science-specific operations such as regrid, and support for uncertainty, lineage, and named versions. In this paper, we present an overview of SciDB's key features and outline a demonstration of the first version of SciDB on data and operations from one of our lighthouse users, the Large Synoptic Survey Telescope (LSST).


Proceedings of SPIE | 2006

Designing a Multi-Petabyte Database for LSST

Jacek Becla; Andrew Hanushevsky; Sergei Nikolaev; Ghaleb Abdulla; Alexander S. Szalay; Maria A. Nieto-Santisteban; Ani Thakar; Jim Gray

The 3.2 giga-pixel LSST camera will produce approximately half a petabyte of archive images every month. These data need to be reduced in under a minute to produce real-time transient alerts, and then added to the cumulative catalog for further analysis. The catalog is expected to grow by about three hundred terabytes per year. The data volume, the real-time transient-alerting requirements of the LSST, and its spatio-temporal aspects require innovative techniques to build an efficient data access system at reasonable cost. As currently envisioned, the system will rely on a database for catalogs and metadata. Several database systems are being evaluated to understand how they perform at these data rates, data volumes, and access patterns. This paper describes the LSST requirements, the challenges they impose, the data access philosophy, results to date from evaluating available database technologies against LSST requirements, and the proposed database architecture to meet the data challenges.


Data Science Journal | 2008

Report from the SciDB Workshop

Jacek Becla; Kian-Tat Lim

A mini-workshop with representatives from the data-driven science and database research communities was organized in response to suggestions at the first XLDB Workshop. The goal was to develop common requirements and primitives for a next-generation database management system that scientists would use, including those from high-energy physics, astronomy, biology, geoscience, and fusion, in order to stimulate research and advance technology. These requirements were thought by the database researchers to be novel and unlikely to be fully met by current commercial vendors. The two groups accordingly decided to explore building a new open-source DBMS. This paper is the final report of the discussions and activities at the workshop.


High Performance Distributed Computing | 2000

Creating large scale database servers

Jacek Becla; Andrew Hanushevsky

The BaBar experiment at the Stanford Linear Accelerator Center (SLAC) is designed to perform a high-precision investigation of the decays of B-mesons produced from electron-positron interactions. The experiment, started in May 1999, will generate approximately 300 TB/year of data for 10 years. All of the data will reside in Objectivity databases (object-oriented databases), accessible via the Advanced Multi-threaded Server (AMS). To date, over 70 TB of data have been placed in Objectivity/DB, making it one of the largest databases in the world. Providing access to such a large quantity of data through a database server is a daunting task. A full-scale testbed environment had to be developed to tune various software parameters, and a fundamental change had to occur in the AMS architecture to allow it to scale past several hundred terabytes of data. Additionally, several protocol extensions had to be implemented to provide practical access to large quantities of data. The paper describes the design of the database, the changes we needed to make in the AMS for scalability, and how the lessons we learned apply to virtually any kind of database server seeking to operate in the petabyte region.
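The volumes quoted above are easy to cross-check. As a quick sanity check (a sketch using only the figures stated in the abstract, not anything from the paper itself):

```python
# Cross-check of the data volumes quoted in the abstract (illustrative only).
yearly_rate_tb = 300          # ~300 TB/year of data, as quoted
years = 10                    # planned duration of the experiment
total_tb = yearly_rate_tb * years
print(total_tb)               # 3000 TB, i.e. about 3 PB over the experiment's lifetime

stored_tb = 70                # "over 70 TB ... placed in Objectivity/DB" to date
print(round(stored_tb / total_tb * 100, 1))  # 2.3 (% of the eventual total)
```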


Proceedings of SPIE | 2016

Agile software development in an earned value world: a survival guide

Jeffrey C. Kantor; Kevin N. Long; Jacek Becla; Frossie Economou; Margaret Gelman; Mario Jurić; Ron Lambert; K. Simon Krughoff; J. Swinbank; Xiuqin Wu

Agile methodologies are current best practice in software development. They are favored for, among other reasons, preventing premature optimization by taking a somewhat short-term focus, and allowing frequent replans and reprioritizations of upcoming development work based on recent results and the current backlog. At the same time, funding agencies prescribe earned value management accounting for large projects, which these days inevitably include substantial software components. Earned value approaches emphasize a more comprehensive and typically longer-range plan, and tend to characterize frequent replans and reprioritizations as indicative of problems. Here we describe the planning, execution, and reporting framework used by the LSST Data Management team that navigates these opposing tensions.


Proceedings of SPIE | 2012

Data management cyberinfrastructure for the Large Synoptic Survey Telescope

D. M. Freemon; Kian-Tat Lim; Jacek Becla; Gregory P. Dubois-Felsman; Jeffrey C. Kantor

The Large Synoptic Survey Telescope (LSST) is a proposed large-aperture, wide-field, ground-based telescope that will survey half the sky every few nights in six optical bands. LSST will produce a data set suitable for answering a wide range of pressing questions in astrophysics, cosmology, and fundamental physics. The 8.4-meter telescope will be located in the Andes mountains near La Serena, Chile. The 3.2 Gpixel camera will take 6.4 GB images every 15 seconds, resulting in 15 TB of new raw image data per night. An estimated 2 million transient alerts per night will be generated within 60 seconds of when the camera's shutter closes. Processing such a large volume of data, converting the raw images into a faithful representation of the universe, automating data quality assessment, automating discovery of moving or transient sources, and archiving the results in a form useful to a broad community of users is a major challenge. We present an overview of the planned computing infrastructure for LSST and describe the cyberinfrastructure required to support the movement, storage, processing, and serving of hundreds of petabytes of image and database data. We also review the sizing model developed to estimate the hardware requirements to support this environment, beginning during project construction and continuing throughout the 10 years of operations.
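The rates quoted in this abstract are mutually consistent; a back-of-the-envelope check (all numbers taken from the abstract, with the length of the observing night derived rather than quoted):

```python
# Back-of-the-envelope consistency check of the quoted LSST data rates.
image_gb = 6.4        # one 3.2 Gpixel image at 6.4 GB, as quoted
cadence_s = 15        # one image every 15 seconds
nightly_tb = 15       # 15 TB of new raw image data per night

images_per_night = nightly_tb * 1000 / image_gb        # ~2344 images
hours_imaging = images_per_night * cadence_s / 3600    # implied time spent imaging
print(round(images_per_night))     # 2344
print(round(hours_imaging, 1))     # 9.8 (hours of continuous imaging per night)
```

The implied roughly ten-hour night suggests the 15 TB figure assumes near-continuous imaging; the 6.4 GB image size itself corresponds to 2 bytes per pixel for 3.2 Gpixels.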


arXiv: Databases | 2003

The Redesigned BaBar Event Store - Believe the Hype

Adeyemi Adesanya; Jacek Becla; Daniel L. Wang

As the BaBar experiment progresses, it produces new and unforeseen requirements and increasing demands on capacity and feature base. The current system is being utilized well beyond its original design specifications and has scaled appropriately, maintaining data consistency and durability. The persistent event storage system has remained largely unchanged since the initial implementation, and thus includes many design features that have become performance bottlenecks. Programming interfaces were designed before sufficient usage information became available, and performance and efficiency were traded off for added flexibility to cope with future demands. With significant experience in managing actual production data under our belt, we are now in a position to recraft the system to better suit current needs. The Event Store redesign is intended to eliminate redundant features while adding new ones, increase overall performance, and contain the physical storage cost of the world's largest database.


arXiv: Databases | 2003

On the Verge of One Petabyte - the Story Behind the BaBar Database System

Adeyemi Adesanya; Tofigh Azemoon; Jacek Becla; Andrew Hanushevsky; Adil Hasan; Wilko Kroeger; Artem Trunov; Daniel L. Wang; Igor Gaponenko; Simon J. Patton; D. R. Quarrie


Archive | 2000

Using NetLogger for distributed systems performance analysis of the BABAR data analysis system

Brian Tierney; Jacek Becla; Dan Gunter; Bob Jacobsen; D. R. Quarrie


Archive | 2007

Designing for Peta-Scale in the LSST Database

Jeffrey C. Kantor; Timothy S. Axelrod; Jacek Becla; Kem Holland Cook; Sergei Nikolaev; Jerry Gray; Raymond Louis Plante; Maria A. Nieto-Santisteban; Alexander S. Szalay; Aniruddha R. Thakar

Collaboration


Dive into Jacek Becla's collaborations.

Top Co-Authors

Kian-Tat Lim

SLAC National Accelerator Laboratory

Daniel L. Wang

SLAC National Accelerator Laboratory

Sergei Nikolaev

Lawrence Livermore National Laboratory
