
Publication


Featured research published by Todd O. Elsethagen.


Journal of Chemical Information and Modeling | 2007

Basis Set Exchange: A Community Database for Computational Sciences

Karen L. Schuchardt; Brett T. Didier; Todd O. Elsethagen; Lisong Sun; Vidhya Gurumoorthi; Jared M. Chase; Jun Li; Theresa L. Windus

Basis sets are some of the most important input data for computational models in the chemistry, materials, biology, and other science domains that utilize computational quantum mechanics methods. Providing a shared, Web-accessible environment where researchers can not only download basis sets in their required format but browse the data, contribute new basis sets, and ultimately curate and manage the data as a community will facilitate growth of this resource and encourage sharing both data and knowledge. We describe the Basis Set Exchange (BSE), a Web portal that provides advanced browsing and download capabilities, facilities for contributing basis set data, and an environment that incorporates tools to foster development and interaction of communities. The BSE leverages and enables continued development of the basis set library originally assembled at the Environmental Molecular Sciences Laboratory.
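At bottom, each basis set the portal serves is a list of Gaussian exponents and contraction coefficients per shell. A minimal sketch of how such data is used, evaluating a contracted s-type Gaussian (the exponents and coefficients below are illustrative, not taken from the BSE):

```python
import math

# A contracted Gaussian basis function is a fixed linear combination of
# primitive Gaussians: phi(r) = sum_i c_i * N_i * exp(-alpha_i * r^2).
# Normalization for an s-type primitive: N = (2*alpha/pi)**(3/4).

def s_primitive_norm(alpha):
    return (2.0 * alpha / math.pi) ** 0.75

def contracted_s(r, primitives):
    """Evaluate a contracted s function at distance r from the nucleus.
    `primitives` is a list of (exponent, coefficient) pairs."""
    return sum(c * s_primitive_norm(a) * math.exp(-a * r * r)
               for a, c in primitives)

# Illustrative three-primitive contraction (made-up numbers).
sto_like = [(3.425, 0.154), (0.624, 0.535), (0.169, 0.445)]
value_at_origin = contracted_s(0.0, sto_like)
```

Downloading a basis set "in the required format" amounts to serializing these (exponent, coefficient) tables in each quantum chemistry code's input syntax.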


Component-Based Software Engineering | 2013

Build less code, deliver more science: an experience report on composing scientific environments using component-based and commodity software platforms

Ian Gorton; Yan Liu; Carina S. Lansing; Todd O. Elsethagen; Kerstin Kleese van Dam

Modern scientific software is daunting in its diversity and complexity. From massively parallel simulations running on the world's largest supercomputers, to visualizations and user support environments that manage ever-growing complex data collections, the challenges for software engineers are plentiful. While high-performance simulators are necessarily specialized codes that maximize performance on specific supercomputer architectures, we argue the vast majority of supporting infrastructure, data management, and analysis tools can leverage commodity open source and component-based technologies. This approach can significantly drive down the effort and costs of building complex, collaborative scientific user environments, as well as increase their reliability and extensibility. In this paper we describe our experiences in creating an initial user environment for scientists involved in modeling the detailed effects of climate change on the environment of selected geographical regions. Our approach composes the user environment using the Velo scientific knowledge management platform and the MeDICi Integration Framework for scientific workflows. These established platforms leverage component-based technologies and extend commodity open source platforms with abstractions and capabilities that make them amenable for broad use in science. Using this approach we were able to deliver an operational user environment capable of running thousands of simulations within a 7-month period, and to achieve significant software reuse.


Journal of Physics: Conference Series | 2007

IO strategies and data services for petascale data sets from a global cloud resolving model

Karen L. Schuchardt; Bruce J. Palmer; Jeff Daily; Todd O. Elsethagen; Annette Koontz

Global cloud resolving models at resolutions of 4 km or less create significant challenges for simulation output, data storage, data management, and post-simulation analysis and visualization. To support efficient model output as well as data analysis, new methods for IO and data organization must be evaluated. The model we are supporting, the Global Cloud Resolving Model being developed at Colorado State University, uses a geodesic grid. The non-monotonic nature of the grid's coordinate variables requires enhancements to existing data processing tools and community standards for describing and manipulating grids. The resolution, size, and extent of the data suggest the need for parallel analysis tools and allow for the possibility of new techniques in data mining, filtering, and comparison to observations. We describe the challenges posed by various aspects of data generation, management, and analysis, our work exploring IO strategies for the model, and a preliminary architecture, web portal, and tool enhancements which, when complete, will enable broad community access to the data sets in ways familiar to the community.


2016 New York Scientific Data Summit (NYSDS) | 2016

Data provenance hybridization supporting extreme-scale scientific workflow applications

Todd O. Elsethagen; Eric G. Stephan; Bibi Raju; Malachi Schram; Matt C. Macduff; Darren J. Kerbyson; Kerstin Kleese van Dam; Alok Singh; Ilkay Altintas

As high performance computing (HPC) infrastructures continue to grow in capability and complexity, so do the applications that they serve. HPC and distributed-area computing (DAC) (e.g. grid and cloud) users are looking increasingly toward workflow solutions to orchestrate their complex application coupling, pre- and post-processing needs. To that end, the US Department of Energy Integrated end-to-end Performance Prediction and Diagnosis for Extreme Scientific Workflows (IPPD) project is currently investigating an integrated approach to prediction and diagnosis of these extreme-scale scientific workflows. To gain insight and a more quantitative understanding of a workflow's performance, our method includes not only the capture of traditional provenance information, but also the capture and integration of system environment metrics, helping to give context and explanation for a workflow's execution. In this paper, we describe IPPD's provenance management solution (ProvEn) and its hybrid data store combining both of these data provenance perspectives. We discuss design and implementation details that include provenance disclosure, scalability, data integration, and a discussion on query and analysis capabilities. We also present use case examples for climate modeling and thermal modeling application domains.


Journal of Physics: Conference Series | 2007

Process integration, data management, and visualization framework for subsurface sciences

Karen L. Schuchardt; Gary D. Black; Jared M. Chase; Todd O. Elsethagen; Lisong Sun

Applying subsurface simulation codes to understand heterogeneous flow and transport problems is a complex process potentially involving multiple models, multiple scales, and spanning multiple scientific disciplines. A typical end-to-end process involves many tools, scripts, and data sources usually shared only through informal channels. Additionally, the process contains many sub-processes that are repeated frequently and could be automated and shared. Finally, keeping records of the models, processes, and correlation between inputs and outputs is currently manual, time consuming, and error prone. We are developing a software framework that integrates a workflow execution environment, shared data repository, and analysis and visualization tools to support development and use of new hybrid subsurface simulation codes. We are taking advantage of recent advances in scientific process automation using the Kepler system and advances in data services based on content management. Extensibility and flexibility are key underlying design considerations to support the constantly changing set of tools, scripts, and models available. We describe the architecture and components of this system with early examples of applying it to a continuum subsurface model.


Information Systems Frontiers | 2016

Semantic catalog of things, services, and data to support a wind data management facility

Eric G. Stephan; Todd O. Elsethagen; Larry K. Berg; Matthew C. Macduff; Patrick R. Paulson; Will Shaw; Chitra Sivaraman; William P. Smith; Adam Wynne

Transparency and data integrity are crucial to any scientific study wanting to garner impact and credibility in the scientific community. The purpose of this paper is to discuss how this can be achieved using what we define as the Semantic Catalog. The catalog exploits community vocabularies as well as linked open data best practices to seamlessly describe and link things, data, and off-the-shelf (OTS) services to support scientific offshore wind energy research for the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE) Wind and Water Power Program. This is largely made possible by leveraging collaborative advances in the Internet of Things (IoT), Semantic Web, Linked Services, Linked Open Data (LOD), and Resource Description Framework (RDF) vocabulary communities, which provide the foundation for our design. By adapting these linked community best practices, we designed a wind characterization Data Management Facility (DMF) capable of continuous data collection, processing, and preservation of in situ and remote sensing instrument measurements. The design incorporates the aforementioned Semantic Catalog, which provides a transparent and ubiquitous interface for its user community to the things, data, and services of which the DMF is composed.
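A catalog of this kind reduces to RDF-style triples linking things (instruments), data (measurement files), and services. A toy in-memory sketch, using the W3C SOSA sensor vocabulary namespace but otherwise hypothetical URIs (no real DMF identifiers):

```python
# Minimal triple store sketch: (subject, predicate, object) tuples in the
# RDF style. EX is a made-up namespace; SOSA is the W3C sensor vocabulary.
EX = "http://example.org/dmf/"          # hypothetical namespace
SOSA = "http://www.w3.org/ns/sosa/"     # W3C Sensor/Observation vocabulary

triples = {
    (EX + "lidar-1", "rdf:type", SOSA + "Sensor"),
    (EX + "lidar-1", SOSA + "madeObservation", EX + "obs-2016-001"),
    (EX + "obs-2016-001", SOSA + "hasResult", EX + "windspeed.csv"),
    (EX + "windspeed.csv", EX + "servedBy", EX + "download-service"),
}

def match(s=None, p=None, o=None):
    """Pattern query: None acts as a wildcard, as in SPARQL triple patterns."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Which observations did lidar-1 make?
obs = match(s=EX + "lidar-1", p=SOSA + "madeObservation")
```

Because every node is a URI, the same pattern lets the catalog link a sensor to its data files and onward to the service that serves them, which is the "things, data, and services" chain the paper describes.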


Many-Task Computing on Grids and Supercomputers | 2011

Design and implementation of "many parallel task" hybrid subsurface model

Khushbu Agarwal; Jared M. Chase; Karen L. Schuchardt; Timothy D. Scheibe; Bruce J. Palmer; Todd O. Elsethagen

Continuum scale models have been used to study subsurface flow, transport, and reactions for many years. Recently, pore scale models, which operate at scales of individual soil grains, have been developed to more accurately model pore scale phenomena, such as precipitation, that may not be well represented at the continuum scale. However, particle-based models become prohibitively expensive for modeling realistic domains. Instead, we are developing a hybrid model that simulates the full domain at continuum scale and applies the pore model only to areas of high reactivity. The hybrid model uses a dimension reduction approach to formulate the mathematical exchange of information across scales. Since the location, size, and number of pore regions in the model vary, an adaptive Pore Generator is being implemented to define pore regions at each iteration. A fourth code will provide data transformation from the pore scale back to the continuum scale. These components are coupled into a single hybrid model using the Swift workflow system. Our hybrid model workflow simulates a kinetic controlled mixing reaction in which multiple pore-scale simulations occur for every continuum scale time step. Each pore-scale simulation is itself parallel, thus exhibiting multi-level parallelism. Our workflow manages these multiple parallel tasks simultaneously, with the number of tasks changing across iterations. It also supports dynamic allocation of job resources and visualization processing at each iteration. We discuss the design, implementation and challenges associated with building a scalable, Many Parallel Task, hybrid model to run efficiently on thousands to tens of thousands of processors.
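The coordination pattern described above, several pore-scale tasks per continuum time step with the task count changing each iteration, can be sketched with a task pool. The stand-in functions below are hypothetical, not the Swift workflow or the actual simulation codes:

```python
import concurrent.futures

def pore_scale_step(region):
    """Hypothetical stand-in for one pore-scale simulation of a reactive region."""
    # Pretend the result is an updated concentration for the region.
    return {"region": region, "concentration": region * 0.1}

def continuum_step(pore_results):
    """Hypothetical upscaling: fold pore results back into the continuum state."""
    return sum(r["concentration"] for r in pore_results)

def hybrid_iteration(reactive_regions):
    # The number of pore regions (and hence tasks) varies per iteration,
    # so the task list is built fresh each step rather than fixed up front.
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        pore_results = list(pool.map(pore_scale_step, reactive_regions))
    return continuum_step(pore_results)

state = hybrid_iteration([1, 2, 3])   # three reactive regions this step
```

In the real system each pore-scale task is itself a parallel job, which is where the multi-level parallelism and the dynamic resource allocation come in.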


Journal of Physics: Conference Series, 180(1):Article No. 012065 | 2009

Application of the SALSSA framework to the validation of smoothed particle hydrodynamics simulations of low Reynolds number flows

Karen L. Schuchardt; Jared M. Chase; Jeffrey A. Daily; Todd O. Elsethagen; Bruce J. Palmer; Timothy D. Scheibe

The Support Architecture for Large-Scale Subsurface Analysis (SALSSA) provides an extensible framework, sophisticated graphical user interface (GUI), and underlying data management system that simplifies the process of running subsurface models, tracking provenance information, and analyzing the model results. The SALSSA software framework is currently being applied to validating the Smoothed Particle Hydrodynamics (SPH) model. SPH is a three-dimensional model of flow and transport in porous media at the pore scale. Because fluid flow at velocities common in natural porous media occurs at low Reynolds numbers, it is important to verify that the SPH model is producing accurate flow solutions in this regime. Validating SPH requires performing a series of simulations and comparing these simulation flow solutions to analytical results or numerical results using other methods. This validation study has been greatly aided by the application of the SALSSA framework, which provides capabilities to set up, execute, analyze, and administer these SPH simulations.
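The low-Reynolds-number regime the validation targets is defined by Re = rho * v * L / mu; a quick check with illustrative pore-scale values (the numbers are hypothetical, chosen only to show the order of magnitude):

```python
def reynolds_number(density, velocity, length, viscosity):
    """Re = rho * v * L / mu; Re << 1 indicates creeping (Stokes) flow."""
    return density * velocity * length / viscosity

# Illustrative pore-scale values: water moving slowly past ~0.1 mm grains.
re = reynolds_number(density=1000.0,      # kg/m^3
                     velocity=1e-5,       # m/s
                     length=1e-4,         # m (characteristic pore size)
                     viscosity=1e-3)      # Pa*s
low_re = re < 1.0                         # creeping-flow regime
```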


2017 New York Scientific Data Summit (NYSDS) | 2017

A scientific data provenance harvester for distributed applications

Eric G. Stephan; Bibi Raju; Todd O. Elsethagen; Line Pouchard; Carlos Gamboa

Data provenance provides a way for scientists to observe how experimental data originates, conveys process history, and explains influential factors such as experimental rationale and associated environmental factors from system metrics measured at runtime. The US Department of Energy Office of Science Integrated end-to-end Performance Prediction and Diagnosis for Extreme Scientific Workflows (IPPD) project has developed a provenance harvester that is capable of collecting observations from file-based evidence typically produced by distributed applications. To achieve this, file-based evidence is extracted and transformed into an intermediate data format inspired in part by W3C CSV on the Web recommendations, called the Harvester Provenance Application Interface (HAPI) syntax. This syntax provides a general means to pre-stage provenance into messages that are both human readable and capable of being written to a provenance store, Provenance Environment (ProvEn). HAPI is being applied to harvest provenance from climate ensemble runs for the Accelerated Climate Modeling for Energy (ACME) project, funded under the U.S. Department of Energy's Office of Biological and Environmental Research (BER) Earth System Modeling (ESM) program. ACME informally provides provenance in a native form through configuration files, directory structures, and log files that contain success/failure indicators, code traces, and performance measurements. Because of its generic format, HAPI is also being applied to harvest tabular job management provenance from Belle II DIRAC scheduler relational database tables as well as other scientific applications that log provenance related information.
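A harvester of this sort reads file-based evidence (logs, configuration) and normalizes it into records ready for a provenance store. A toy sketch with a made-up log format; the real HAPI syntax and the ACME/DIRAC evidence formats are not reproduced here:

```python
import re

# Hypothetical log lines a distributed job might emit, containing the kinds
# of success/failure indicators and performance measurements mentioned above.
LOG = """\
2016-08-01T12:00:03 step=atm_run status=success wallclock=412.7
2016-08-01T12:07:41 step=ocn_run status=failure wallclock=88.2
"""

LINE = re.compile(
    r"(?P<time>\S+) step=(?P<step>\S+) status=(?P<status>\S+) "
    r"wallclock=(?P<wallclock>[\d.]+)")

def harvest(text):
    """Transform raw log text into uniform provenance records."""
    records = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rec = m.groupdict()
            rec["wallclock"] = float(rec["wallclock"])  # numeric metric
            records.append(rec)
    return records

records = harvest(LOG)
```

The point of the intermediate format is exactly this normalization step: once every evidence source is reduced to uniform records, a single store can ingest climate logs and scheduler tables alike.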


International Conference on Big Data | 2016

Leveraging large sensor streams for robust cloud control

Alok Singh; Eric G. Stephan; Todd O. Elsethagen; Matt C. Macduff; Bibi Raju; Malachi Schram; Kerstin Kleese van Dam; Darren J. Kerbyson; Ilkay Altintas

Today's dynamic computing deployment for commercial and scientific applications is propelling us to an era where minor inefficiencies can snowball into significant performance and operational bottlenecks. Data center operations increasingly rely on sensor-based control systems for key decision insights. The increased sampling frequencies, cheaper storage costs, and prolific deployment of sensors are producing massive volumes of operational data. However, there is a lag between the rapid development of analytical techniques and their widespread practical deployment. We present empirical evidence of the potential carried by analytical techniques for operations management in computing and data centers. Using machine learning modeling techniques on data from a real instrumented cluster, we demonstrate that predictive modeling on operational sensor data can directly reduce systems operations monitoring costs and improve system reliability.
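The predictive-modeling idea can be illustrated with the simplest possible case: fit a one-variable linear model to historical sensor readings and flag new readings that stray from the prediction. The numbers below are synthetic, not the paper's cluster data, and the model is deliberately minimal:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Synthetic history: node temperature (C) observed at various CPU loads (%).
load = [10, 30, 50, 70, 90]
temp = [31, 35, 39, 43, 47]
a, b = fit_line(load, temp)

def is_anomalous(x, y, tol=3.0):
    """Flag a reading whose temperature deviates from the fitted trend."""
    return abs(y - (a * x + b)) > tol

alert = is_anomalous(60, 52)   # 52 C at 60% load is well above the trend
```

Replacing the fitted line with a richer model (more sensors, nonlinear terms) gives the kind of predictive monitoring the paper evaluates, but the operational loop, fit on history and flag deviations, stays the same.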

Collaboration

Todd O. Elsethagen's top co-authors and their affiliations:

Eric G. Stephan (Pacific Northwest National Laboratory)
Bibi Raju (Pacific Northwest National Laboratory)
Jared M. Chase (Pacific Northwest National Laboratory)
Kerstin Kleese van Dam (Pacific Northwest National Laboratory)
Bruce J. Palmer (Pacific Northwest National Laboratory)
Timothy D. Scheibe (Pacific Northwest National Laboratory)
Ilkay Altintas (University of California)
Khushbu Agarwal (Pacific Northwest National Laboratory)
Matt C. Macduff (Pacific Northwest National Laboratory)