Karen L. Schuchardt
Pacific Northwest National Laboratory
Publications
Featured research published by Karen L. Schuchardt.
Challenges of Large Applications in Distributed Environments | 2004
J.D. Myers; Thomas C. Allison; Sandra Bittner; Brett T. Didier; Michael Frenklach; William H. Green; Y.-L. Ho; John C. Hewson; Wendy S. Koegler; L. Lansing; David Leahy; M. Lee; R. McCoy; Michael Minkoff; Sandeep Nijsure; G. von Laszewski; David W. Montoya; Carmen M. Pancerella; Reinhardt E. Pinzon; William J. Pitz; Larry A. Rahn; Branko Ruscic; Karen L. Schuchardt; Eric G. Stephan; Albert F. Wagner; Theresa L. Windus; Christine L. Yang
The Collaboratory for Multi-scale Chemical Science (CMCS) is developing a powerful informatics-based approach to synthesizing multi-scale information in support of systems-based research and is applying it within combustion science. An open source multi-scale informatics toolkit is being developed that addresses a number of issues core to the emerging concept of knowledge grids including provenance tracking and lightweight federation of data and application resources into cross-scale information flows. The CMCS portal is currently in use by a number of high-profile pilot groups and is playing a significant role in enabling their efforts to improve and extend community maintained chemical reference information.
International Parallel and Distributed Processing Symposium | 2013
Tekin Bicer; Jian Yin; David Chiu; Gagan Agrawal; Karen L. Schuchardt
Compute cycles in high performance systems are increasing at a much faster pace than both storage and wide-area bandwidths. To continue improving the performance of large-scale data analytics applications, compression has therefore become a promising approach. In this context, this paper makes the following contributions. First, we develop a new compression methodology, which exploits the similarities between spatial and/or temporal neighbors in a popular climate simulation dataset and enables high compression ratios and low decompression costs. Second, we develop a framework that can be used to incorporate a variety of compression and decompression algorithms. This framework also supports a simple API to allow integration with an existing application or data processing middleware. Once a compression algorithm is implemented, this framework automatically mechanizes multi-threaded retrieval, multi-threaded data decompression, and the use of informed prefetching and caching. By integrating this framework with a data-intensive middleware, we have applied our compression methodology and framework to three applications over two datasets, including the Global Cloud-Resolving Model (GCRM) climate dataset. We obtained an average compression ratio of 51.68% and up to a 53.27% improvement in the execution time of data analysis applications, amortizing I/O time by moving compressed data.
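The neighbor-similarity idea behind the methodology can be illustrated with a minimal sketch. The class and method names below are hypothetical (the paper does not publish its API); the sketch only shows the general pattern of a pluggable compressor that delta-encodes smoothly varying neighboring samples so a generic entropy coder compresses them well.

```python
import struct
import zlib


class Compressor:
    """Hypothetical pluggable interface, in the spirit of the framework's API."""

    def compress(self, values):
        raise NotImplementedError

    def decompress(self, blob):
        raise NotImplementedError


class DeltaCompressor(Compressor):
    """Exploit temporal/spatial-neighbor similarity: store the first value
    plus the (small, highly repetitive) differences between neighbors,
    then entropy-code the result."""

    def compress(self, values):
        deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
        raw = struct.pack(f"{len(deltas)}i", *deltas)
        return zlib.compress(raw)

    def decompress(self, blob):
        raw = zlib.decompress(blob)
        deltas = struct.unpack(f"{len(raw) // 4}i", raw)
        out = [deltas[0]]
        for d in deltas[1:]:
            out.append(out[-1] + d)
        return out


# A smoothly varying series, standing in for neighboring climate-grid samples.
series = [1000 + i for i in range(1024)]
c = DeltaCompressor()
blob = c.compress(series)
assert c.decompress(blob) == series          # lossless round trip
assert len(blob) < 4 * len(series)           # smaller than the raw int32 data
```

Real climate fields are floating point and the paper's scheme is more sophisticated, but the round-trip structure (compress on write, multi-threaded decompress on read) is the part the framework automates.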
Concurrency and Computation: Practice and Experience | 2002
Karen L. Schuchardt; Brett T. Didier; Gary D. Black
The Extensible Computational Chemistry Environment (Ecce), an innovative problem‐solving environment, was designed a decade ago, before the emergence of the Web and Grid computing services. In this paper, we briefly examine the original Ecce architecture and discuss how it is evolving to incorporate both Grid services and components of the Web to increase its range of services, reduce deployment and maintenance costs, and reach a wider audience. We show that Ecce operates in both Grid and non‐Grid environments, an important consideration given Ecce's broad range of uses and user community, and discuss the strategies for loosely coupled components that make this possible. Both in‐progress work and conceptual plans for how Ecce will evolve are presented.
Intelligent User Interfaces | 2002
George Chin; L. Ruby Leung; Karen L. Schuchardt; Deborah K. Gracio
Computer and computational scientists at Pacific Northwest National Laboratory (PNNL) are studying and designing collaborative problem solving environments (CPSEs) for scientific computing in various domains. Where most scientific computing efforts focus on the level of the scientific codes, file systems, data archives, and networked computers, our analysis and design efforts are aimed at developing enabling technologies that are directly meaningful and relevant to domain scientists at the level of the practice and the science. We seek to characterize the nature of scientific problem solving and look for innovative ways to improve it. Moreover, we aim to glimpse beyond current systems and technical limitations to derive a design that expresses the scientists' own perspective on research activities, processes, and resources. The product of our analysis and design work is a conceptual scientific CPSE prototype that specifies a complete simulation and modeling user environment and a suite of high-level problem solving tools.
Computing in Science and Engineering | 2012
Ian Gorton; Chandrika Sivaramakrishnan; Gary D. Black; Signe K. White; Sumit Purohit; Carina S. Lansing; Michael C. Madison; Karen L. Schuchardt; Yan Liu
Velo is a reusable, domain-independent knowledge-management infrastructure for modeling and simulation. Velo leverages, integrates, and extends Web-based open source collaborative and data-management technologies to create a scalable and flexible core platform tailored to specific scientific domains. As the examples here describe, Velo has been used in both the carbon sequestration and climate modeling domains.
International Conference on Computational Science | 2003
Gary D. Black; Karen L. Schuchardt; Deborah K. Gracio; Bruce J. Palmer
The Extensible Computational Chemistry Environment (Ecce) is a suite of distributed applications that are integrated as a comprehensive problem solving environment for computational chemistry. Ecce provides scientists with an easily used graphical user interface to the tasks of setting up complex molecular modeling calculations, distributed use of high performance computers, and scientific visualization and analysis. Ecce's flexible, standards-based architecture is an extensible framework that represents a significant milestone in production systems, both in the field of computational chemistry and problem solving environment research. Its base problem solving architecture components and concepts are applicable to problem solving environments beyond the computational chemistry domain.
Concurrency and Computation: Practice and Experience | 2007
Karen L. Schuchardt; Carmen M. Pancerella; Larry A. Rahn; Brett T. Didier; Deepti Kodeboyina; David J. Leahy; James D. Myers; Oluwayemisi O. Oluwole; William J. Pitz; Branko Ruscic; Jing Song; Gregor von Laszewski; Christine L. Yang
The Knowledge Environment for Collaborative Science (KnECS) is an open‐source informatics toolkit designed to enable knowledge Grids that interconnect science communities, unique facilities, data, and tools. KnECS features a Web portal with team and data collaboration tools, lightweight federation of data, provenance tracking, and multi‐level support for application integration. We identify the capabilities of KnECS and discuss extensions from the Collaboratory for Multi‐Scale Chemical Sciences (CMCS) which enable diverse combustion science communities to create and share verified, documented data sets and reference data, thereby demonstrating new methods of community interaction and data interoperability required by systems science approaches. Finally, we summarize the challenges we encountered and foresee for knowledge environments.
High Performance Distributed Computing | 2001
Karen L. Schuchardt; James D. Myers; Eric G. Stephan
Next-generation problem solving environments (PSEs) promise significant advances over those now available. They will span scientific disciplines and incorporate collaboration capabilities. They will host feature-detection and other agents, allow data mining and pedigree tracking, and provide access from a wide range of devices. Fundamental changes in PSE architecture are required to realize these and other PSE goals. This paper focuses specifically on issues related to data management and recommends an approach based on open, metadata-driven repositories with loosely defined, dynamic schemas. Benefits of this approach are discussed and the redesign of the Extensible Computational Chemistry Environment's (Ecce) data storage architecture to use such a repository is described, based on the Distributed Authoring and Versioning (DAV) standard. The suitability of DAV for scientific data, the mapping of the Ecce schema to DAV, and promising initial results are presented.
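The key property DAV brings to this design is arbitrary per-resource "dead properties", which is what lets the repository schema stay loosely defined and evolve without server changes. A minimal in-memory sketch of that pattern follows; the store, paths, and property names are all hypothetical illustrations, not the actual Ecce schema or a WebDAV client.

```python
class DavResource:
    """Stand-in for a WebDAV resource: a body plus arbitrary
    (namespace, name) -> value metadata properties."""

    def __init__(self):
        self.body = b""
        self.props = {}


class DavStore:
    """Toy repository mimicking the PUT / PROPPATCH / PROPFIND verbs."""

    def __init__(self):
        self.resources = {}

    def put(self, path, body):
        self.resources.setdefault(path, DavResource()).body = body

    def proppatch(self, path, updates):
        # Set or overwrite metadata without any fixed schema.
        self.resources[path].props.update(updates)

    def propfind(self, path, names=None):
        props = self.resources[path].props
        return dict(props) if names is None else {n: props[n] for n in names}


store = DavStore()
store.put("/Ecce/users/karen/calc1/input", b"molecule geometry ...")
store.proppatch("/Ecce/users/karen/calc1/input",
                {("ecce:", "theory"): "DFT",      # hypothetical property names
                 ("ecce:", "basis"): "6-31G*"})
found = store.propfind("/Ecce/users/karen/calc1/input")
assert found[("ecce:", "basis")] == "6-31G*"
```

Because properties are just name/value pairs attached to resources, new metadata (pedigree links, agent annotations) can be added later without migrating existing data, which is the dynamic-schema benefit the paper argues for.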
Environmental Modelling and Software | 2011
Bruce J. Palmer; Annette Koontz; Karen L. Schuchardt; Ross Heikes; David A. Randall
Execution of a Global Cloud Resolving Model (GCRM) at target resolutions of 2-4 km will generate, at a minimum, tens of gigabytes of data per variable per snapshot. Writing this data to disk, without creating a serious bottleneck in the execution of the GCRM code, while also supporting efficient post-execution data analysis, is a significant challenge. This paper discusses an Input/Output (IO) application programming interface (API) for the GCRM that efficiently moves data from the model to disk while maintaining support for community standard formats, avoiding the creation of very large numbers of files, and supporting efficient analysis. Several aspects of the API will be discussed in detail. First, we discuss the output data layout, which linearizes the data in a consistent way that is independent of the number of processors used to run the simulation and provides a convenient format for subsequent analyses of the data. Second, we discuss the flexible API interface that enables modelers to easily add variables to the output stream by specifying where in the GCRM code these variables are located, and to flexibly configure the choice of outputs and the distribution of data across files. The flexibility of the API is designed to allow model developers to add new data fields to the output as the model develops and new physics is added. It also provides a mechanism for allowing users of the GCRM code to adjust the output frequency and the number of fields written depending on the needs of individual calculations. Third, we describe the mapping to the NetCDF data model with an emphasis on the grid description. Fourth, we describe our messaging algorithms and IO aggregation strategies that are used to achieve high bandwidth while simultaneously writing concurrently from many processors to shared files. We conclude with initial performance results.
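The first point, a processor-independent linearization, can be shown with a toy example. This is a sketch of the general idea only (the function name and block representation are hypothetical, and the real API orders cells along the GCRM's geodesic grid): each processor contributes (global index, value) pairs, and the output is always written in global-index order, so the file layout does not depend on the domain decomposition.

```python
def linearize(blocks):
    """Merge per-processor blocks of (global_index, value) pairs into one
    canonical array ordered by global index, so the on-disk layout is
    identical regardless of how many processors produced the data."""
    merged = [pair for block in blocks for pair in block]
    merged.sort(key=lambda p: p[0])
    return [value for _, value in merged]


# The same 8-cell field, decomposed across 2 processors and across 4,
# linearizes to the same canonical layout.
field = [10.0 * i for i in range(8)]
two_procs = [[(i, field[i]) for i in range(0, 8, 2)],
             [(i, field[i]) for i in range(1, 8, 2)]]
four_procs = [[(i, field[i]) for i in range(p, 8, 4)] for p in range(4)]

assert linearize(two_procs) == field
assert linearize(four_procs) == field
```

In the real system this reordering happens during IO aggregation rather than via a global sort, but the invariant is the same: analysis tools read one layout no matter how the run was parallelized.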
Cluster Computing | 2002
Karen L. Schuchardt; James D. Myers; Eric G. Stephan
Next-generation problem-solving environments (PSEs) promise significant advances over those now available. They will span scientific disciplines and incorporate collaboration capabilities. They will host feature-detection and other agents, allow data mining and pedigree tracking, and provide access from a wide range of devices. Fundamental changes in PSE architecture are required to realize these and other PSE goals. This paper focuses specifically on issues related to data management and recommends an approach based on open, metadata-driven repositories with loosely defined, dynamic schemas. Benefits of this approach are discussed, and the redesign of the Extensible Computational Chemistry Environment's (Ecce) data storage architecture to use such a repository is described, based on the Distributed Authoring and Versioning (DAV) standard. The suitability of DAV for scientific data, the mapping of the Ecce schema to DAV, and promising initial results are presented.