Ron Musick

Lawrence Livermore National Laboratory

Publication

Featured research published by Ron Musick.

international conference on management of data | 1999

Practical lessons in supporting large-scale computational science

Ron Musick; Terence Critchlow

Business needs have driven the development of commercial database systems since their inception. As a result, there has been a strong focus on supporting many users, minimizing the potential corruption or loss of data, and maximizing performance metrics such as transactions-per-second and benchmark results [Gra93]. These goals have little to do with supporting business intelligence needs such as the decision support and data mining activities common in on-line analytic processing (OLAP) applications. As a result, business data are typically off-loaded to secondary systems before these activities occur. In addition, they have little to do with the needs of the scientific community, which typically revolve around a great deal of compute and I/O intensive analysis, often over large data with high dimensionality. For scientific data, in many cases the data was never collected in a DBMS in the first place, and so the analysis and visualization take place over specialized flat-file formats. This is a painful solution, because a DBMS has much to offer in the overall process of managing and exploring data.

international conference of the ieee engineering in medicine and biology society | 2000

DataFoundry: information management for scientific data

Terence Critchlow; Krzysztof Fidelis; Madhavan Ganesh; Ron Musick; Tom Slezak

Data warehouses and data marts have been successfully applied to a multitude of commercial business applications. They have proven to be invaluable tools by integrating information from distributed, heterogeneous sources and summarizing this data for use throughout the enterprise. Although the need for information dissemination is as vital in science as in business, working warehouses in this community are scarce because traditional warehousing techniques do not transfer to scientific environments. There are two primary reasons for this difficulty. First, schema integration is more difficult for scientific databases than for business sources because of the complexity of the concepts and the associated relationships. Second, scientific data sources have highly dynamic data representations (schemata). When a data source participating in a warehouse changes its schema, both the mediator transferring data to the warehouse and the warehouse itself need to be updated to reflect these modifications. The cost of repeatedly performing these updates in a traditional warehouse, as is required in a dynamic environment, is prohibitive. The paper discusses these issues within the context of the DataFoundry project, an ongoing research effort at Lawrence Livermore National Laboratory. DataFoundry utilizes a unique integration strategy to identify corresponding instances while maintaining differences between data from different sources, and a novel architecture and an extensive meta-data infrastructure, which reduce the cost of maintaining a warehouse.

cooperative information systems | 1998

Meta-data based mediator generation

Terence Critchlow; Madhavan Ganesh; Ron Musick

Mediators are a critical component of any data warehouse; they transform data from source formats to the warehouse representation while resolving semantic and syntactic conflicts. The close relationship between mediators and databases requires a mediator to be updated whenever an associated schema is modified. Failure to quickly perform these updates significantly reduces the reliability of the warehouse because queries do not have access to the most current data. This may result in incorrect or misleading responses, and reduce user confidence in the warehouse. Unfortunately, this maintenance may be a significant undertaking if a warehouse integrates several dynamic data sources. This paper describes a meta-data framework, and associated software, designed to automate a significant portion of the mediator generation task and thereby reduce the effort involved in adapting to schema changes. By allowing the DBA to concentrate on identifying the modifications at a high level, instead of reprogramming the mediator, turnaround time is reduced and warehouse reliability is improved.
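The idea of generating a mediator from meta-data, so that a schema change becomes an edit to a description rather than a reprogramming task, can be illustrated with a minimal sketch. This is not the framework from the paper: the mapping format, the field names (`protein_id`, `length_aa`, `len`, `seq_length`) and the `make_mediator` helper are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical meta-data format (an assumption, not the paper's):
# warehouse field -> (source field, conversion function).
Mapping = Dict[str, Tuple[str, Callable[[Any], Any]]]


def make_mediator(mapping: Mapping) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Build a record-level mediator from a declarative mapping."""
    def mediate(source_record: Dict[str, Any]) -> Dict[str, Any]:
        # Rename each source field and convert its value to the warehouse type.
        return {
            target: convert(source_record[source])
            for target, (source, convert) in mapping.items()
        }
    return mediate


# Illustrative mapping for version 1 of an invented source schema.
mapping_v1: Mapping = {
    "protein_id": ("id", str),
    "length_aa": ("len", int),
}

# Hypothetical schema change: the source renames "len" to "seq_length".
# Only the meta-data entry is edited; the mediator is regenerated from it.
mapping_v2: Mapping = {**mapping_v1, "length_aa": ("seq_length", int)}

mediate_v1 = make_mediator(mapping_v1)
mediate_v2 = make_mediator(mapping_v2)

old_row = mediate_v1({"id": 7, "len": "120"})
new_row = mediate_v2({"id": 7, "seq_length": "120"})
```

Both calls yield the same warehouse record, so queries against the warehouse are unaffected by the source-side rename. A real mediator would also have to resolve semantic conflicts, which this sketch does not attempt.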

acm/ieee joint conference on digital libraries | 2001

Approximate ad-hoc query engine for simulation data

Ghaleb Abdulla; Chuck Baldwin; Terence Critchlow; Roy Kamimura; Ida Lozares; Ron Musick; Nu Ai Tang; Byung Suk Lee; Robert R. Snapp

In this paper, we describe AQSim, an ongoing effort to design and implement a system to manage terabytes of scientific simulation data. The goal of this project is to reduce data storage requirements and access times while permitting ad-hoc queries using statistical and mathematical models of the data. In order to facilitate data exchange between models based on different representations, we are evaluating the ASCI common data model, which comprises several layers of increasing semantic complexity. To support queries over the spatial-temporal mesh structured data, we are in the process of defining and implementing a grammar for MeshSQL.

uncertainty in artificial intelligence | 1993

Minimal assumption distribution propagation in belief networks

Ron Musick

As belief networks are used to model increasingly complex situations, the need to automatically construct them from large databases will become paramount. This paper concentrates on solving a part of the belief network induction problem: that of learning the quantitative structure (the conditional probabilities), given the qualitative structure. In particular, a theory is presented that shows how to propagate inference distributions in a belief network, with the only assumption being that the given qualitative structure is correct. Most inference algorithms must make at least this assumption. The theory is based on four network transformations that are sufficient for any inference in a belief network. Furthermore, the claim is made that contrary to popular belief, error will not necessarily grow as the inference chain grows. Instead, for QBN belief nets induced from large enough samples, the error is more likely to decrease as the size of the inference chain increases.

Information Sciences | 2004

MeshSQL: the query language for simulation mesh data

Byung Suk Lee; Ron Musick

Mesh data has been a common form of data produced and searched in scientific simulations, and has been growing rapidly in size thanks to increasing computing power. Today, there are visualization tools that help scientists explore and examine the data, but their query capabilities are limited to a small set of fixed visualization operations, which falls far short of the needs of most users. Thus, it is imperative to provide ad hoc query tools for them. In this paper, we propose an ad hoc query language, MeshSQL, which has been extended from ANSI SQL99 to support the features unique to simulation mesh data, such as temporality, spatial regions, statistics, and similarity. After classifying MeshSQL queries based on three criteria related to efficient implementations of the queries, we present the syntax and semantics of MeshSQL, and support them with examples. We also discuss implementing MeshSQL queries in SQL99 in an object-relational database system that allows incorporating user-defined types and functions. To our knowledge, MeshSQL is the first and the only query language for simulation mesh data.

database systems for advanced applications | 2001

Toward a query language on simulation mesh data: an object-oriented approach

Byung Suk Lee; Robert R. Snapp; Ron Musick

As simulation is gaining popularity as an inexpensive means of experimentation in diverse fields of industry and government, the attention to the data generated by scientific simulation is also increasing. Scientific simulation generates mesh data, i.e. data configured in a grid structure, in a sequence of time steps. Its model is complex - understanding it involves mathematical topology and geometry in addition to fields (in the relational sense). Moreover, there is no query language developed on mesh data at all. We develop a comprehensive model of mesh data in an object-oriented manner, propose a set of primitive algebraic operators, show their object-oriented implementation and demonstrate that the well-known object query language OQL (from the ODMG) is powerful enough to express queries on mesh data, whether the queries are on a mesh topology, geometry, fields, or a combination of them. Finally, we discuss some physical implementation issues that are pertinent to executing queries efficiently.

Information Sciences | 2003

The framework for approximate queries on simulation data

Byung Suk Lee; Terence Critchlow; Ghaleb Abdulla; Chuck Baldwin; Roy Kamimura; Ron Musick; Robert R. Snapp; Nu Ai Tang

AQSim is a system intended to enable scientists to query and analyze a large volume of scientific simulation data. The system uses the state of the art in approximate query processing techniques to build a novel framework for progressive data analysis. These techniques are used to define a multi-resolution index, where each node contains multiple models of the data. The benefits of these models are twofold: (1) they have compact representations, reconstructing only the information relevant to the analysis, and (2) the variety of models capture different aspects of the data which may be of interest to the user but are not readily apparent in their raw form. To be able to deal with the data interactively, AQSim allows the scientist to make an informed tradeoff between query response accuracy and time. In this paper, we present the framework of AQSim with a focus on its architectural design. We also show the results from an initial proof-of-concept prototype developed at LLNL. The presented framework is generic enough to handle more than just simulation data.
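The trade-off between accuracy and response time over a multi-resolution index can be sketched roughly as follows. This is an illustrative toy, not AQSim's design. It assumes a binary tree over a one-dimensional array, with each node storing a very simple "model" (mean, minimum, maximum) of the values it covers. The names `Node`, `build` and `approx_sum` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    lo: int       # first index covered (inclusive)
    hi: int       # last index covered (exclusive)
    mean: float   # summary model of the covered values
    vmin: float
    vmax: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(values: List[float], lo: int = 0, hi: Optional[int] = None) -> Node:
    """Recursively build the index; leaves cover a single value."""
    if hi is None:
        hi = len(values)
    chunk = values[lo:hi]
    node = Node(lo, hi, sum(chunk) / len(chunk), min(chunk), max(chunk))
    if hi - lo > 1:
        mid = (lo + hi) // 2
        node.left = build(values, lo, mid)
        node.right = build(values, mid, hi)
    return node


def approx_sum(node: Node, lo: int, hi: int, tolerance: float) -> Tuple[float, int]:
    """Estimate the sum over [lo, hi); return (estimate, nodes answered from)."""
    start, end = max(lo, node.lo), min(hi, node.hi)
    if start >= end:
        return 0.0, 0  # no overlap with the query range
    fully_covered = start == node.lo and end == node.hi
    # Answer from this node's model when it is exact (full coverage or leaf)
    # or when its value spread is within the tolerance the user accepts.
    if fully_covered or node.left is None or node.vmax - node.vmin <= tolerance:
        return node.mean * (end - start), 1
    left_sum, left_visits = approx_sum(node.left, lo, hi, tolerance)
    right_sum, right_visits = approx_sum(node.right, lo, hi, tolerance)
    return left_sum + right_sum, 1 + left_visits + right_visits


values = [1.0, 1.0, 1.0, 1.0, 5.0, 6.0, 7.0, 8.0]
root = build(values)
exact, exact_visits = approx_sum(root, 1, 7, tolerance=0.0)
rough, rough_visits = approx_sum(root, 1, 7, tolerance=3.0)
```

With a tolerance of zero the query descends until every contribution is exact. With a looser tolerance it stops earlier and touches fewer nodes, at the cost of a per-value error no larger than the tolerance.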

Information Sciences | 2001

Experiences applying meta-data to bioinformatics

Terence Critchlow; Ron Musick; Tom Slezak

Bioinformatics is facing the daunting challenge of providing geneticists and biologists effective, efficient access to data currently distributed among dynamic, heterogeneous data sources. Complicating the problem is the speed at which the underlying science and technology evolve, leaving the terminology, databases and interfaces to catch up. As the genomics community moves from sequences to functional genomics, the pressure to find a solution is increasing. Realistically addressing this problem, whether through a data warehouse, multi-database, federated database, or other approach, requires development of a scalable, flexible infrastructure that can quickly adapt to meet user needs in this extremely dynamic environment. This is best accomplished by extensively using meta-data to reduce application maintenance costs. Using the DataFoundry project as an example, this paper discusses the first steps and practical problems of developing a meta-data-based infrastructure capable of meeting the demands of an active scientific community. It also demonstrates how much bioinformatics must still progress before it can truly satisfy its users.

international conference on machine learning | 1993