Karen L. Schuchardt
Pacific Northwest National Laboratory
Publications
Featured research published by Karen L. Schuchardt.
Challenges of Large Applications in Distributed Environments | 2004
J.D. Myers; Thomas C. Allison; Sandra Bittner; Brett T. Didier; Michael Frenklach; William H. Green; Y.-L. Ho; John C. Hewson; Wendy S. Koegler; L. Lansing; David Leahy; M. Lee; R. McCoy; Michael Minkoff; Sandeep Nijsure; G. von Laszewski; David W. Montoya; Carmen M. Pancerella; Reinhardt E. Pinzon; William J. Pitz; Larry A. Rahn; Branko Ruscic; Karen L. Schuchardt; Eric G. Stephan; Albert F. Wagner; Theresa L. Windus; Christine L. Yang
The Collaboratory for Multi-scale Chemical Science (CMCS) is developing a powerful informatics-based approach to synthesizing multi-scale information in support of systems-based research and is applying it within combustion science. An open source multi-scale informatics toolkit is being developed that addresses a number of issues core to the emerging concept of knowledge grids including provenance tracking and lightweight federation of data and application resources into cross-scale information flows. The CMCS portal is currently in use by a number of high-profile pilot groups and is playing a significant role in enabling their efforts to improve and extend community maintained chemical reference information.
International Parallel and Distributed Processing Symposium | 2013
Tekin Bicer; Jian Yin; David Chiu; Gagan Agrawal; Karen L. Schuchardt
Compute cycles in high performance systems are increasing at a much faster pace than both storage and wide-area bandwidths. To continue improving the performance of large-scale data analytics applications, compression has therefore become a promising approach. In this context, this paper makes the following contributions. First, we develop a new compression methodology, which exploits the similarities between spatial and/or temporal neighbors in a popular climate simulation dataset and enables high compression ratios and low decompression costs. Second, we develop a framework that can be used to incorporate a variety of compression and decompression algorithms. This framework also supports a simple API to allow integration with an existing application or data processing middleware. Once a compression algorithm is implemented, this framework automatically mechanizes multi-threaded retrieval, multi-threaded data decompression, and the use of informed prefetching and caching. By integrating this framework with a data-intensive middleware, we have applied our compression methodology and framework to three applications over two datasets, including the Global Cloud-Resolving Model (GCRM) climate dataset. We obtained an average compression ratio of 51.68% and up to a 53.27% improvement in the execution time of data analysis applications, amortizing I/O time by moving compressed data.
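The neighbor-similarity idea behind the methodology can be illustrated with a minimal sketch. The class and method names below are hypothetical (the paper does not publish its API); the sketch only shows the general pattern of a pluggable compressor that delta-encodes smoothly varying neighboring samples so a generic entropy coder compresses them well.

```python
import struct
import zlib


class Compressor:
    """Hypothetical pluggable interface, in the spirit of the framework's API."""

    def compress(self, values):
        raise NotImplementedError

    def decompress(self, blob):
        raise NotImplementedError


class DeltaCompressor(Compressor):
    """Exploit temporal/spatial-neighbor similarity: store the first value
    plus the (small, highly repetitive) differences between neighbors,
    then entropy-code the result."""

    def compress(self, values):
        deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
        raw = struct.pack(f"{len(deltas)}i", *deltas)
        return zlib.compress(raw)

    def decompress(self, blob):
        raw = zlib.decompress(blob)
        deltas = struct.unpack(f"{len(raw) // 4}i", raw)
        out = [deltas[0]]
        for d in deltas[1:]:
            out.append(out[-1] + d)
        return out


# A smoothly varying series, standing in for neighboring climate-grid samples.
series = [1000 + i for i in range(1024)]
c = DeltaCompressor()
blob = c.compress(series)
assert c.decompress(blob) == series          # lossless round trip
assert len(blob) < 4 * len(series)           # smaller than the raw int32 data
```

Real climate fields are floating point and the paper's scheme is more sophisticated, but the round-trip structure (compress on write, multi-threaded decompress on read) is the part the framework automates.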
Concurrency and Computation: Practice and Experience | 2002
Karen L. Schuchardt; Brett T. Didier; Gary D. Black
The Extensible Computational Chemistry Environment (Ecce), an innovative problem‐solving environment, was designed a decade ago, before the emergence of the Web and Grid computing services. In this paper, we briefly examine the original Ecce architecture and discuss how it is evolving to incorporate both Grid services and components of the Web to increase its range of services, reduce deployment and maintenance costs, and reach a wider audience. We show that Ecce operates in both Grid and non‐Grid environments, an important consideration given Ecce's broad range of uses and user community, and discuss the strategies for loosely coupled components that make this possible. Both in‐progress work and conceptual plans for how Ecce will evolve are presented.
Intelligent User Interfaces | 2002
George Chin; L. Ruby Leung; Karen L. Schuchardt; Deborah K. Gracio
Computer and computational scientists at Pacific Northwest National Laboratory (PNNL) are studying and designing collaborative problem solving environments (CPSEs) for scientific computing in various domains. Where most scientific computing efforts focus on the level of the scientific codes, file systems, data archives, and networked computers, our analysis and design efforts are aimed at developing enabling technologies that are directly meaningful and relevant to domain scientists at the level of the practice and the science. We seek to characterize the nature of scientific problem solving and look for innovative ways to improve it. Moreover, we aim to glimpse beyond current systems and technical limitations to derive a design that expresses the scientists' own perspective on research activities, processes, and resources. The product of our analysis and design work is a conceptual scientific CPSE prototype that specifies a complete simulation and modeling user environment and a suite of high-level problem solving tools.
Computing in Science and Engineering | 2012
Ian Gorton; Chandrika Sivaramakrishnan; Gary D. Black; Signe K. White; Sumit Purohit; Carina S. Lansing; Michael C. Madison; Karen L. Schuchardt; Yan Liu
Velo is a reusable, domain-independent knowledge-management infrastructure for modeling and simulation. Velo leverages, integrates, and extends Web-based open source collaborative and data-management technologies to create a scalable and flexible core platform tailored to specific scientific domains. As the examples here describe, Velo has been used in both the carbon sequestration and climate modeling domains.
International Conference on Computational Science | 2003
Gary D. Black; Karen L. Schuchardt; Deborah K. Gracio; Bruce J. Palmer
The Extensible Computational Chemistry Environment (Ecce) is a suite of distributed applications that are integrated as a comprehensive problem solving environment for computational chemistry. Ecce provides scientists with an easily used graphical user interface to the tasks of setting up complex molecular modeling calculations, distributed use of high performance computers, and scientific visualization and analysis. Ecce's flexible, standards-based architecture is an extensible framework that represents a significant milestone in production systems, both in the field of computational chemistry and problem solving environment research. Its base problem solving architecture components and concepts are applicable to problem solving environments beyond the computational chemistry domain.
Concurrency and Computation: Practice and Experience | 2007
Karen L. Schuchardt; Carmen M. Pancerella; Larry A. Rahn; Brett T. Didier; Deepti Kodeboyina; David J. Leahy; James D. Myers; Oluwayemisi O. Oluwole; William J. Pitz; Branko Ruscic; Jing Song; Gregor von Laszewski; Christine L. Yang
The Knowledge Environment for Collaborative Science (KnECS) is an open‐source informatics toolkit designed to enable knowledge Grids that interconnect science communities, unique facilities, data, and tools. KnECS features a Web portal with team and data collaboration tools, lightweight federation of data, provenance tracking, and multi‐level support for application integration. We identify the capabilities of KnECS and discuss extensions from the Collaboratory for Multi‐Scale Chemical Sciences (CMCS) which enable diverse combustion science communities to create and share verified, documented data sets and reference data, thereby demonstrating new methods of community interaction and data interoperability required by systems science approaches. Finally, we summarize the challenges we encountered and foresee for knowledge environments.
High Performance Distributed Computing | 2001
Karen L. Schuchardt; James D. Myers; Eric G. Stephan
Next-generation problem solving environments (PSEs) promise significant advances over those now available. They will span scientific disciplines and incorporate collaboration capabilities. They will host feature-detection and other agents, allow data mining and pedigree tracking, and provide access from a wide range of devices. Fundamental changes in PSE architecture are required to realize these and other PSE goals. This paper focuses specifically on issues related to data management and recommends an approach based on open, metadata-driven repositories with loosely defined, dynamic schemas. Benefits of this approach are discussed and the redesign of the Extensible Computational Chemistry Environment's (Ecce) data storage architecture to use such a repository is described, based on the Distributed Authoring and Versioning (DAV) standard. The suitability of DAV for scientific data, the mapping of the Ecce schema to DAV, and promising initial results are presented.
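The key property DAV brings to this design is arbitrary per-resource "dead properties", which is what lets the repository schema stay loosely defined and evolve without server changes. A minimal in-memory sketch of that pattern follows; the store, paths, and property names are all hypothetical illustrations, not the actual Ecce schema or a WebDAV client.

```python
class DavResource:
    """Stand-in for a WebDAV resource: a body plus arbitrary
    (namespace, name) -> value metadata properties."""

    def __init__(self):
        self.body = b""
        self.props = {}


class DavStore:
    """Toy repository mimicking the PUT / PROPPATCH / PROPFIND verbs."""

    def __init__(self):
        self.resources = {}

    def put(self, path, body):
        self.resources.setdefault(path, DavResource()).body = body

    def proppatch(self, path, updates):
        # Set or overwrite metadata without any fixed schema.
        self.resources[path].props.update(updates)

    def propfind(self, path, names=None):
        props = self.resources[path].props
        return dict(props) if names is None else {n: props[n] for n in names}


store = DavStore()
store.put("/Ecce/users/karen/calc1/input", b"molecule geometry ...")
store.proppatch("/Ecce/users/karen/calc1/input",
                {("ecce:", "theory"): "DFT",      # hypothetical property names
                 ("ecce:", "basis"): "6-31G*"})
found = store.propfind("/Ecce/users/karen/calc1/input")
assert found[("ecce:", "basis")] == "6-31G*"
```

Because properties are just name/value pairs attached to resources, new metadata (pedigree links, agent annotations) can be added later without migrating existing data, which is the dynamic-schema benefit the paper argues for.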
Environmental Modelling and Software | 2011
Bruce J. Palmer; Annette Koontz; Karen L. Schuchardt; Ross Heikes; David A. Randall
Execution of a Global Cloud Resolving Model (GCRM) at target resolutions of 2-4 km will generate, at a minimum, tens of gigabytes of data per variable per snapshot. Writing this data to disk, without creating a serious bottleneck in the execution of the GCRM code, while also supporting efficient post-execution data analysis, is a significant challenge. This paper discusses an Input/Output (IO) application programming interface (API) for the GCRM that efficiently moves data from the model to disk while maintaining support for community standard formats, avoiding the creation of very large numbers of files, and supporting efficient analysis. Several aspects of the API will be discussed in detail. First, we discuss the output data layout, which linearizes the data in a consistent way that is independent of the number of processors used to run the simulation and provides a convenient format for subsequent analyses of the data. Second, we discuss the flexible API interface that enables modelers to easily add variables to the output stream by specifying where in the GCRM code these variables are located, and to flexibly configure the choice of outputs and the distribution of data across files. The flexibility of the API is designed to allow model developers to add new data fields to the output as the model develops and new physics is added. It also provides a mechanism for allowing users of the GCRM code to adjust the output frequency and the number of fields written depending on the needs of individual calculations. Third, we describe the mapping to the NetCDF data model with an emphasis on the grid description. Fourth, we describe our messaging algorithms and IO aggregation strategies that are used to achieve high bandwidth while simultaneously writing concurrently from many processors to shared files. We conclude with initial performance results.
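The first point, a processor-independent linearization, can be shown with a toy example. This is a sketch of the general idea only (the function name and block representation are hypothetical, and the real API orders cells along the GCRM's geodesic grid): each processor contributes (global index, value) pairs, and the output is always written in global-index order, so the file layout does not depend on the domain decomposition.

```python
def linearize(blocks):
    """Merge per-processor blocks of (global_index, value) pairs into one
    canonical array ordered by global index, so the on-disk layout is
    identical regardless of how many processors produced the data."""
    merged = [pair for block in blocks for pair in block]
    merged.sort(key=lambda p: p[0])
    return [value for _, value in merged]


# The same 8-cell field, decomposed across 2 processors and across 4,
# linearizes to the same canonical layout.
field = [10.0 * i for i in range(8)]
two_procs = [[(i, field[i]) for i in range(0, 8, 2)],
             [(i, field[i]) for i in range(1, 8, 2)]]
four_procs = [[(i, field[i]) for i in range(p, 8, 4)] for p in range(4)]

assert linearize(two_procs) == field
assert linearize(four_procs) == field
```

In the real system this reordering happens during IO aggregation rather than via a global sort, but the invariant is the same: analysis tools read one layout no matter how the run was parallelized.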
Cluster Computing | 2002
Karen L. Schuchardt; James D. Myers; Eric G. Stephan
Next-generation problem-solving environments (PSEs) promise significant advances over those now available. They will span scientific disciplines and incorporate collaboration capabilities. They will host feature-detection and other agents, allow data mining and pedigree tracking, and provide access from a wide range of devices. Fundamental changes in PSE architecture are required to realize these and other PSE goals. This paper focuses specifically on issues related to data management and recommends an approach based on open, metadata-driven repositories with loosely defined, dynamic schemas. Benefits of this approach are discussed, and the redesign of the Extensible Computational Chemistry Environment's (Ecce) data storage architecture to use such a repository is described, based on the Distributed Authoring and Versioning (DAV) standard. The suitability of DAV for scientific data, the mapping of the Ecce schema to DAV, and promising initial results are presented.