
Chad Berkley

University of California, Santa Barbara


Publication

Featured research published by Chad Berkley.

Statistical and Scientific Database Management | 2004

Kepler: an extensible system for design and execution of scientific workflows

Ilkay Altintas; Chad Berkley; Efrat Jaeger; Matthew Jones; Bertram Ludäscher; Steve Mock

Most scientists conduct analyses and run models in several different software and hardware environments, mentally coordinating the export and import of data from one environment to another. The Kepler scientific workflow system provides domain scientists with an easy-to-use yet powerful system for capturing scientific workflows (SWFs). SWFs are a formalization of the ad-hoc process that a scientist may go through to get from raw data to publishable results. Kepler attempts to streamline the workflow creation and execution process so that scientists can design, execute, monitor, re-run, and communicate analytical procedures repeatedly with minimal effort. Kepler is unique in that it seamlessly combines high-level workflow design with execution and runtime interaction, access to local and remote data, and local and remote service invocation. SWFs are superficially similar to business process workflows but have several challenges not present in the business workflow scenario. For example, they often operate on large, complex and heterogeneous data, can be computationally intensive and produce complex derived data products that may be archived for use in reparameterized runs or other workflows. Moreover, unlike business workflows, SWFs are often dataflow-oriented as witnessed by a number of recent academic systems (e.g., DiscoveryNet, Taverna and Triana) and commercial systems (Scitegic/Pipeline-Pilot, Inforsense). In a sense, SWFs are often closer to signal-processing and data streaming applications than they are to control-oriented business workflow applications.
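The dataflow orientation described in this abstract can be illustrated with a minimal sketch. It uses plain Python generators, not Kepler's own API, and the step names (`read_samples`, `drop_missing`, `running_mean`) and sample readings are hypothetical. Each step consumes a stream and emits a new one, so data moves through the workflow the way it would in a streaming or signal-processing application.

```python
def read_samples(values):
    """Source step: emit raw readings one at a time."""
    for value in values:
        yield value


def drop_missing(stream):
    """Filter step: skip readings recorded as None."""
    for value in stream:
        if value is not None:
            yield value


def running_mean(stream):
    """Transform step: emit the mean of all readings seen so far."""
    total = 0.0
    count = 0
    for value in stream:
        total += value
        count += 1
        yield total / count


def run_pipeline(values):
    """Wire the steps together; data flows from source to sink."""
    return list(running_mean(drop_missing(read_samples(values))))


# Hypothetical readings, with one missing value.
print(run_pipeline([2.0, None, 4.0, 6.0]))  # prints [2.0, 3.0, 4.0]
```

Because every step only depends on the stream it receives, steps can be reordered or swapped without touching the others, which is the property that distinguishes this style from a control-oriented workflow.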

IEEE Internet Computing | 2001

Managing scientific metadata

Matthew Jones; Chad Berkley; Jivka Bojilova; Mark Schildhauer

Metacat is a network-enabled database framework that lets users store, query, and retrieve XML documents with arbitrary schemas in SQL-compliant relational database systems. The system (available from the Knowledge Network for Biocomplexity, http://knb.ecoinformatics.org/) incorporates RDF-like methods for packaging data sets to allow researchers to customize and revise their metadata. It is extensible and flexible enough to preserve utility and interpretability when working with future content standards. Metacat solves several key challenges that impede data confederation efforts in ecological research, or any field in which independent agencies collect heterogeneous data that they wish to control locally while enabling networked access. This distributed solution integrates with existing site infrastructures because it works with any SQL-compliant database system. The framework's open-source components are widely available, and individual sites can extend and customize the system to support their data and metadata needs.

Statistical and Scientific Database Management | 2001

Metacat: a schema-independent XML database system

Chad Berkley; Matthew Jones; Jivka Bojilova; Daniel David Higgins

The ecological sciences represent a challenging community from the perspective of scientific data management. Ecological data are collected by investigators who are spread out over a large geographic area and who use a wide variety of research protocols and data-handling techniques. The resulting heterogeneous data are stored in autonomous database systems that are dispersed throughout the ecological community. The Knowledge Network for Biocomplexity is seeking to address these issues through the use of structured metadata encoded in the Extensible Markup Language (XML). The main goal of this project has been to design and implement a schema-independent data storage system for XML, called Metacat. Metacat takes a hybrid XML storage approach, using a commercial relational DBMS back end while still allowing any arbitrary XML document to be stored. This paper describes the Metacat XML data storage system and its relevance to scientific data management in the ecological sciences.
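One way to picture schema-independent XML storage in a relational back end is a generic node table, where every element and attribute becomes a row regardless of the document's schema. The sketch below is an illustration under assumptions, not Metacat's actual schema or code: the table name `xml_nodes`, its columns, the helper functions, and the sample documents are all hypothetical, and SQLite stands in for the relational DBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_xml(conn, doc_id, xml_text):
    """Shred an XML document into generic node rows (hypothetical layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_nodes ("
        "node_id INTEGER PRIMARY KEY, doc_id TEXT, parent_id INTEGER, "
        "kind TEXT, name TEXT, value TEXT)"
    )

    def insert(parent_id, kind, name, value):
        cur = conn.execute(
            "INSERT INTO xml_nodes (doc_id, parent_id, kind, name, value) "
            "VALUES (?, ?, ?, ?, ?)",
            (doc_id, parent_id, kind, name, value),
        )
        return cur.lastrowid

    def walk(element, parent_id):
        # One row per element; its leading text (if any) is stored as the value.
        text = (element.text or "").strip() or None
        node_id = insert(parent_id, "element", element.tag, text)
        # One row per attribute, parented to its element.
        for attr_name, attr_value in element.attrib.items():
            insert(node_id, "attribute", attr_name, attr_value)
        for child in element:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)
    conn.commit()


def find_values(conn, name):
    """Return (doc_id, value) pairs for every element with the given name."""
    rows = conn.execute(
        "SELECT doc_id, value FROM xml_nodes "
        "WHERE kind = 'element' AND name = ? ORDER BY node_id",
        (name,),
    )
    return rows.fetchall()


# Two made-up documents with different schemas share the same table.
conn = sqlite3.connect(":memory:")
store_xml(conn, "doc1", "<dataset><title>Kelp survey</title><site id='sb'>Santa Barbara</site></dataset>")
store_xml(conn, "doc2", "<report><title>Rainfall</title></report>")
print(find_values(conn, "title"))  # prints [('doc1', 'Kelp survey'), ('doc2', 'Rainfall')]
```

The table layout never changes when a new document type arrives, which is what lets a single query span documents with unrelated schemas. The trade-off is that reconstructing a whole document requires walking the `parent_id` links.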

IEEE International Conference on eScience | 2008

A High-Level Distributed Execution Framework for Scientific Workflows

Jianwu Wang; Ilkay Altintas; Chad Berkley; Lucas Gilbert; Matthew Jones

Domain scientists synthesize different data and computing resources to solve their scientific problems. Making use of distributed execution within scientific workflows is a growing and promising way to achieve better execution performance and efficiency. This paper presents a high-level distributed execution framework, designed around the distributed execution requirements identified within the Kepler community. It also discusses mechanisms to make the presented framework easy to use, comprehensive, adaptable, extensible and efficient.

Concurrency and Computation: Practice and Experience | 2006