Gilberto Pastorello
Lawrence Berkeley National Laboratory
Publications
Featured research published by Gilberto Pastorello.
international conference on e-science | 2014
Gilberto Pastorello; Deborah A. Agarwal; Dario Papale; Taghrid Samak; Carlo Trotta; Alessio Ribeca; Cristina Poindexter; Boris Faybishenko; Dan Gunter; Rachel Hollowgrass; Eleonora Canfora
Observational data are fundamental for scientific research in almost any domain. Recent advances in sensor and data management technologies are enabling unprecedented amounts of observational data to be collected and analyzed. However, an essential part of using observational data is not currently as scalable as data collection and analysis methods: data quality assurance and control. While specialized tools for very narrow domains do exist, general methods are harder to create. This paper explores the identification of data issues that lead to the creation of data tests and tools to perform data quality control activities. Developing this identification step in a systematic manner allows for better and more general quality control tools. As our case study, we use carbon, water, and energy fluxes as well as micro-meteorological data collected at field sites that are part of FLUXNET, a network of over 400 ecosystem-level monitoring stations. In an effort toward the release of a new global data set of fluxes, we are doing data quality control for these data. The experience from this work led to the creation of a catalog of issues identified in the data. This paper presents this catalog and its generalization into a set of patterns of data quality issues that can be detected in observational data.
international conference on e-science | 2014
Lavanya Ramakrishnan; Sarah S. Poon; Valerie Hendrix; Daniel K. Gunter; Gilberto Pastorello; Deborah A. Agarwal
Scientific data volumes have been growing exponentially. This has resulted in the need for new tools that enable users to operate on and analyze data. Cyber infrastructure tools, including workflow tools, that have been developed in the last few years have often fallen short of user needs and suffered from a lack of wider adoption. The User-Centered Design (UCD) process has been used as an effective approach to develop usable software with high adoption rates. However, UCD has largely been applied to user interfaces, and there has been limited work in applying UCD to application programming interfaces and cyber infrastructure tools. We use an adapted version of UCD that we refer to as Scientist-Centered Design (SCD) to engage with users in the design and development of Tigres, a workflow application programming interface. Tigres provides a simple set of programming templates (e.g., sequence, parallel, split, merge) that can be used to compose and execute computational and data transformation pipelines. In this paper, we describe Tigres and discuss our experiences with the use of UCD for the initial development of Tigres. Our experience to date is that the UCD process not only resulted in better requirements gathering but also heavily influenced the architecture design and implementation details. User engagement during the development of tools such as Tigres is critical to ensure usability and increase adoption.
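The four template names mentioned in the abstract above (sequence, parallel, split, merge) can be illustrated with a minimal Python sketch. This is not the Tigres API: the function names `run_sequence`, `run_parallel`, `run_split`, and `run_merge`, their signatures, and the exact semantics given to each template are assumptions made here for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence

# Hypothetical task type: a function taking one input and returning one output.
Task = Callable[[Any], Any]


def run_sequence(tasks: Sequence[Task], value: Any) -> Any:
    """Run tasks one after another, feeding each output into the next task."""
    for task in tasks:
        value = task(value)
    return value


def run_parallel(tasks: Sequence[Task], value: Any) -> List[Any]:
    """Run independent tasks concurrently on the same input.

    Results are returned in the same order as the tasks.
    """
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task, value) for task in tasks]
        return [future.result() for future in futures]


def run_split(first: Task, branches: Sequence[Task], value: Any) -> List[Any]:
    """Run one task, then fan its output out to several parallel branches."""
    return run_parallel(branches, first(value))


def run_merge(branches: Sequence[Task],
              combine: Callable[[List[Any]], Any],
              value: Any) -> Any:
    """Run several parallel branches, then combine their results in one task."""
    return combine(run_parallel(branches, value))


# Example pipeline (all task functions below are illustrative assumptions).
def clean(xs):
    return [x for x in xs if x is not None]


def total(xs):
    return sum(xs)


def count(xs):
    return len(xs)


def mean_from(parts):
    return parts[0] / parts[1]


readings = [2.0, None, 4.0, 6.0]
cleaned = run_sequence([clean], readings)
average = run_merge([total, count], mean_from, cleaned)
fanned_out = run_split(clean, [total, count], readings)
```

In this sketch, larger pipelines are built by nesting the four helpers, which is one plausible reading of "compose and execute" in the abstract; the actual library may structure templates and task inputs differently.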
international conference on e-science | 2017
Gilberto Pastorello; Dan Gunter; Housen Chu; Danielle Christianson; Carlo Trotta; Eleonora Canfora; Boris Faybishenko; You-Wei Cheah; Norm Beekwilder; Stephen Chan; Sigrid Dengel; Trevor F. Keenan; Fianna O'Brien; Abdelrahman Elbashandy; Cristina Poindexter; Marty Humphrey; Dario Papale; Deborah A. Agarwal
Data quality control is one of the most time-consuming activities within Research Infrastructures (RIs), especially when involving observational data and multiple data providers. In this work we report on our ongoing development of data rogues, a scalable approach to manage data quality issues for observational data within RIs. The motivation for this work started with the creation of the FLUXNET2015 dataset, which includes carbon, water, and energy fluxes plus micrometeorological and ancillary data measured in over 200 sites around the world. To create a uniform dataset, including derived data products, extensive work on data quality control was needed. The unpredictable nature of observational data quality issues makes the automation of data quality control inherently difficult. Developed based on this experience, the data rogues methodology allows for increased automation of quality control activities by systematically identifying, cataloging, and documenting implementations of solutions to data issues. We believe this methodology can be extended and applied to other domains and types of data, making the automation of data quality control a more tractable problem.
Ecological Informatics | 2017
Danielle Christianson; Charuleka Varadharajan; Bradley Christoffersen; Matteo Detto; Boris Faybishenko; Bruno O. Gimenez; Val Hendrix; K. Jardine; Robinson I. Negrón-Juárez; Gilberto Pastorello; Thomas L. Powell; Megha Sandesh; Jeffrey M. Warren; Brett T. Wolfe; Jeffrey Q. Chambers; Lara M. Kueppers; Nate G. McDowell; Deborah A. Agarwal
Metadata describe the ancillary information needed for data preservation and independent interpretation, comparison across heterogeneous datasets, and quality assessment and quality control (QA/QC). Environmental observations are vastly diverse in type and structure, can be taken across a wide range of spatiotemporal scales in a variety of measurement settings and approaches, and saved in multiple formats. Thus, well-organized, consistent metadata are required to produce usable data products from diverse environmental observations collected across field sites. However, existing metadata reporting protocols do not support the complex data synthesis and model-data integration needs of interdisciplinary earth system research. We developed a metadata reporting framework (FRAMES) to enable management and synthesis of observational data that are essential in advancing a predictive understanding of earth systems. FRAMES utilizes best practices for data and metadata organization enabling consistent data reporting and compatibility with a variety of standardized data protocols. We used an iterative scientist-centered design process to develop FRAMES, resulting in a data reporting format that incorporates existing field practices to maximize data-entry efficiency. Thus, FRAMES has a modular organization that streamlines metadata reporting and can be expanded to incorporate additional data types. With FRAMES's multi-scale measurement position hierarchy, data can be reported at observed spatial resolutions and then easily aggregated and linked across measurement types to support model-data integration. FRAMES is in early use by both data originators (persons generating data) and consumers (persons using data and metadata). In this paper, we describe FRAMES, identify lessons learned, and discuss areas of future development.
Global Change Biology | 2016
Craig A. Emmerton; Vincent L. St. Louis; Elyn R. Humphreys; John A. Gamon; Joel D. Barker; Gilberto Pastorello
Remote Sensing | 2016
Ran Wang; John A. Gamon; Craig A. Emmerton; Haitao Li; Enrica Nestola; Gilberto Pastorello; Olaf Menzer
2014 AGU Fall Meeting | 2013
Olaf Menzer; Gilberto Pastorello; Stefan Metzger; Cristina Poindexter; Deb Agarwal; Dario Papale
Archive | 2018
Shyue Ping Ong; Dan Gunter; Will Richards; shreddd; Anubhav Jain; gmatteo; Patrick Huck; Donny Winston; montegoode; Shyam Dwaraknath; Gilberto Pastorello; Brandon Bocklund; Zhi Deng; Miguel Dias Costa; Hanmei Tang
Archive | 2018
Boris Faybishenko; Steve Paton; Thomas L. Powell; Ryan G. Knox; Gilberto Pastorello; Charuleka Varadharajan; Danielle Christianson; Deb Agarwal
2017 Joint NACP and AmeriFlux Principal Investigators Meeting, Bethesda North Marriott Hotel & Conference Center, North Bethesda, MD, 2017-03-27 to 2017-03-30 | 2017
Stefan Metzger; Deborah A. Agarwal; Sebastian Biraud; Ankur R. Desai; David Durden; Jörg Hartmann; Jiahong Li; Hongyan Luo; Dario Papale; Gilberto Pastorello; Natchaya Pingintha-Durden; Torsten Sachs; Andrei Serafimovic; Cove Sturtevant; Margaret Torn; Ke Xu