Beth Plale

Indiana University Bloomington

Publication

Featured research published by Beth Plale.

International Conference on Management of Data | 2005

A survey of data provenance in e-science

Yogesh Simmhan; Beth Plale; Dennis Gannon

Data management is growing in complexity as large-scale applications take advantage of the loosely coupled resources brought together by grid middleware and by abundant storage capacity. Metadata describing the data products used in and generated by these applications is essential to disambiguate the data and enable reuse. Data provenance, one kind of metadata, pertains to the derivation history of a data product starting from its original sources. In this paper we create a taxonomy of data provenance characteristics and apply it to current research efforts in e-science, focusing primarily on scientific workflow approaches. The main aspect of our taxonomy categorizes provenance systems based on why they record provenance, what they describe, how they represent and store provenance, and ways to disseminate it. The survey culminates with an identification of open research problems in the field.

International Journal of Web Services Research | 2008

Karma2: Provenance Management for Data-Driven Workflows

Yogesh Simmhan; Beth Plale; Dennis Gannon

The increasing ability for the sciences to sense the world around us is resulting in a growing need for data-driven e-Science applications that are under the control of workflows composed of services on the Grid. The focus of our work is on provenance collection for these workflows, which is necessary to validate the workflow and to determine the quality of generated data products. The challenge we address is to record uniform and usable provenance metadata that meets the domain needs while minimizing the modification burden on the service authors and the performance overhead on the workflow engine and the services. The framework is based on generating discrete provenance activities during the lifecycle of a workflow execution that can be aggregated to form complex data and process provenance graphs that can span across workflows. The implementation uses a loosely coupled publish-subscribe architecture for propagating these activities, and the capabilities of the system satisfy the needs of detailed provenance collection. A performance evaluation of a prototype finds a minimal performance overhead (in the range of 1% for an eight-service workflow using 271 data products).

International Conference on Web Services | 2006

A Framework for Collecting Provenance in Data-Centric Scientific Workflows

Yogesh Simmhan; Beth Plale; Dennis Gannon

The increasing ability for the Earth sciences to sense the world around us is resulting in a growing need for data-driven applications that are under the control of data-centric workflows composed of grid- and Web-services. The focus of our work is on provenance collection for these workflows, necessary to validate the workflow and to determine the quality of generated data products. The challenge we address is to record uniform and usable provenance metadata that meets the domain needs while minimizing the modification burden on the service authors and the performance overhead on the workflow engine and the services. The framework, based on a loosely coupled publish-subscribe architecture for propagating provenance activities, satisfies the needs of detailed provenance collection, while a performance evaluation of a prototype finds a minimal performance overhead (in the range of 1% for an eight-service workflow using 271 data products).

Computing in Science and Engineering | 2005

Service-Oriented Environments for Dynamically Interacting with Mesoscale Weather

Kelvin K. Droegemeier; Dennis Gannon; Daniel A. Reed; Beth Plale; Jay Alameda; Tom Baltzer; Keith Brewster; Richard D. Clark; Ben Domenico; Sara J. Graves; Everette Joseph; Donald Murray; Mohan Ramamurthy; Lavanya Ramakrishnan; John A. Rushing; Daniel B. Weber; Robert B. Wilhelmson; Anne Wilson; Ming Xue; Sepideh Yalda

Within a decade after John von Neumann and colleagues conducted the first experimental weather forecast on the ENIAC computer in the late 1940s, numerical models of the atmosphere became the foundation of modern-day weather forecasting and one of the driving application areas in computer science. This article describes research that is enabling a major shift toward dynamically adaptive responses to rapidly changing environmental conditions.

IEEE Computer | 2006

CASA and LEAD: adaptive cyberinfrastructure for real-time multiscale weather forecasting

Beth Plale; Dennis Gannon; Jerry Brotzge; Kelvin K. Droegemeier; James F. Kurose; David J. McLaughlin; Robert B. Wilhelmson; Sara J. Graves; Mohan Ramamurthy; Richard D. Clark; Sepi Yalda; Daniel A. Reed; Everette Joseph; V. Chandrasekar

Two closely linked projects aim to dramatically improve storm forecasting speed and accuracy. CASA is creating a distributed, collaborative, adaptive sensor network of low-power, high-resolution radars that respond to user needs. LEAD offers dynamic workflow orchestration and data management in a Web services framework designed to support on-demand, real-time, dynamically adaptive systems.

International Provenance and Annotation Workshop | 2006

Performance evaluation of the Karma provenance framework for scientific workflows

Yogesh Simmhan; Beth Plale; Dennis Gannon; Suresh Marru

Provenance about workflow executions and data derivations in scientific applications helps estimate data quality, track resources, and validate in silico experiments. The Karma provenance framework provides a means to collect workflow, process, and data provenance from data-driven scientific workflows and is used in the Linked Environments for Atmospheric Discovery (LEAD) project. This article presents a performance analysis of the Karma service as compared against the contemporary PReServ provenance service. Our study finds that Karma scales exceedingly well for collecting and querying provenance records, showing linear or sub-linear scaling with increasing number of provenance records and clients when tested against workloads on the order of 10,000 application-service invocations and over 36 concurrent clients.

International Conference on Computational Science | 2005

Towards dynamically adaptive weather analysis and forecasting in LEAD

Beth Plale; Dennis Gannon; Daniel A. Reed; Sara J. Graves; Kelvin K. Droegemeier; Bob Wilhelmson; Mohan K. Ramamurthy

LEAD is a large-scale effort to build a service-oriented infrastructure that allows atmospheric science researchers to dynamically and adaptively respond to weather patterns to produce better-than-real-time predictions of tornadoes and other “mesoscale” weather events. In this paper we discuss an architectural framework that is forming our thinking about adaptability and give early solutions in workflow and monitoring.

IEEE Transactions on Parallel and Distributed Systems | 2003

Dynamic querying of streaming data with the dQUOB system

Beth Plale; Karsten Schwan

Data streaming has established itself as a viable communication abstraction in data-intensive parallel and distributed computations, occurring in applications such as scientific visualization, performance monitoring, and large-scale data transfer. A known problem in large-scale event communication is tailoring the data received at the consumer. It is the general problem of extracting data of interest from a data source, a problem that the database community has successfully addressed with SQL queries, a time-tested, user-friendly way for non-computer scientists to access data. By leveraging the efficiency of query processing provided by relational queries, the dQUOB system provides a conceptual relational data model and SQL query access over streaming data. Queries can be used to extract data, combine streams, and create new streams. The language augments queries with an action to enable more complex data transformations such as Fourier transforms. The dQUOB system has been applied to two large-scale distributed applications: a safety-critical autonomous robotics simulation and scientific software visualization for global atmospheric transport modeling. In this paper, we present the dQUOB system and the results of performance evaluation undertaken to assess its applicability in data-intensive wide-area computations, where the benefit of portable data transformation must be evaluated against the cost of continuous query evaluation.

Proceedings of the IEEE | 2005

Building Grid Portal Applications From a Web Service Component Architecture

Dennis Gannon; Jay Alameda; Octav Chipara; Marcus Christie; Vinayak Dukle; Liang Fang; Matthew Farrellee; Gopi Kandaswamy; Deepti Kodeboyina; Sriram Krishnan; Charles W. Moad; Marlon E. Pierce; Beth Plale; Al Rossi; Yogesh Simmhan; Anuraag Sarangi; Aleksander Slominski; Satoshi Shirasuna; Thomas Thomas

This work describes an approach to building Grid applications based on the premise that users who wish to access and run these applications prefer to do so without becoming experts on Grid technology. We describe an application architecture based on wrapping user applications and application workflows as Web services and Web service resources. These services are visible to the users and to resource providers through a family of Grid portal components that can be used to configure, launch, and monitor complex applications in the scientific language of the end user. The applications in this model are instantiated by an application factory service. The layered design of the architecture makes it possible for an expert to configure an application factory service with a custom user interface client that may be dynamically loaded into the portal.

IEEE Internet Computing | 2005

Active management of scientific data

Beth Plale; Dennis Gannon; Jay Alameda; Bob Wilhelmson; Shawn Hampton; Al Rossi; Kelvin K. Droegemeier

Sophisticated data-distribution schemes and recent developments in sensors and instruments that can monitor the lower kilometers of the atmosphere at high levels of resolution have rapidly expanded the quantity of information available to mesoscale meteorology. The myLEAD personalized information-management tool helps geoscience users make sense of this vastly expanded information space. MyLEAD extends the general Globus metadata catalog service and leverages a well-known general and extensible schema. Its orientation makes it an active player in large-scale distributed computation environments characterized by interacting grid and Web services.
