Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Daniel K. Gunter is active.

Publication


Featured research published by Daniel K. Gunter.


Concurrency and Computation: Practice and Experience | 2015

FireWorks: a dynamic workflow system designed for high-throughput applications

Anubhav Jain; Shyue Ping Ong; Wei Chen; Bharat Medasani; Xiaohui Qu; Michael Kocher; Miriam Brafman; Guido Petretto; Gian-Marco Rignanese; Geoffroy Hautier; Daniel K. Gunter; Kristin A. Persson

This paper introduces FireWorks, a workflow software package for running high-throughput calculation workflows at supercomputing centers. FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Scientific Computing Center (NERSC). It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found that these features are highly relevant to enabling modern data-driven and high-throughput science applications, and we discuss our implementation strategy that rests on Python and NoSQL databases (MongoDB). Finally, we present performance data and limitations of our approach along with planned future work.
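
As a rough illustration of the workflow model described above, the sketch below defines and runs a trivial two-step workflow with the fireworks Python package. The MongoDB connection defaults and the ScriptTask payloads are placeholders for this example, not details taken from the paper.

```python
# Minimal sketch of defining and running a FireWorks workflow
# (assumes the `fireworks` package and a local MongoDB instance).
from fireworks import Firework, Workflow, LaunchPad, ScriptTask
from fireworks.core.rocket_launcher import rapidfire

# The LaunchPad stores workflow state in MongoDB (localhost by default).
launchpad = LaunchPad()

# Two Fireworks (steps), the second depending on the first.
fw1 = Firework(ScriptTask.from_str('echo "setup calculation"'), name="setup")
fw2 = Firework(ScriptTask.from_str('echo "analyze results"'), name="analyze")
wf = Workflow([fw1, fw2], {fw1: [fw2]}, name="toy_workflow")

launchpad.add_wf(wf)
rapidfire(launchpad)  # pull and run ready jobs until none remain
```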


Scientific Cloud Computing | 2013

Performance evaluation of a MongoDB and Hadoop platform for scientific data analysis

Elif Dede; Madhusudhan Govindaraju; Daniel K. Gunter; Richard Shane Canon; Lavanya Ramakrishnan

Scientific facilities such as the Advanced Light Source (ALS) and the Joint Genome Institute, and projects such as the Materials Project, have an increasing need to capture, store, and analyze dynamic semi-structured data and metadata. A similar growth of semi-structured data within large Internet service providers has led to the creation of NoSQL data stores for scalable indexing and MapReduce for scalable parallel analysis. MapReduce and NoSQL stores have been applied to scientific data. Hadoop, the most popular open-source implementation of MapReduce, has been evaluated, utilized, and modified for addressing the needs of different scientific analysis problems. The ALS and the Materials Project are using MongoDB, a document-oriented NoSQL store. However, there is a limited understanding of the performance trade-offs of using these two technologies together. In this paper we evaluate the performance, scalability, and fault-tolerance of using MongoDB with Hadoop, towards the goal of identifying the right software environment for scientific data analysis.
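
To make the "semi-structured data in a document store" side of this concrete, the sketch below inserts records with differing fields into MongoDB and aggregates over them with PyMongo. The database, collection, and field names are hypothetical, chosen only to illustrate the kind of data involved.

```python
# Illustrative only: semi-structured scientific metadata in MongoDB,
# queried with an aggregation pipeline (assumes `pymongo` and a local server).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
runs = client["science_db"]["runs"]  # hypothetical database/collection names

# Documents can carry different fields per record -- the semi-structured case.
runs.insert_many([
    {"facility": "ALS", "beamline": "8.3.2", "exposure_s": 0.5, "tags": ["tomo"]},
    {"facility": "JGI", "sample": "soil-17", "reads": 1200000},
])

# Group records by facility and count them, server-side.
pipeline = [{"$group": {"_id": "$facility", "n": {"$sum": 1}}}]
for row in runs.aggregate(pipeline):
    print(row["_id"], row["n"])
```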


Computing Frontiers | 2007

Reconfigurable hybrid interconnection for static and dynamic scientific applications

Shoaib Kamil; Ali Pinar; Daniel K. Gunter; Michael J. Lijewski; Leonid Oliker; John Shalf

As we enter the era of peta-scale computing, system architects must plan for machines composed of tens or even hundreds of thousands of processors. Although fully connected networks such as fat-tree configurations currently dominate HPC interconnect designs, such approaches are inadequate for ultra-scale concurrencies due to the superlinear growth of component costs. Traditional low-degree interconnect topologies, such as 3D tori, have reemerged as a competitive solution due to the linear scaling of system components relative to the node count; however, such networks are poorly suited for the requirements of many scientific applications at extreme concurrencies. To address these limitations, we propose HFAST, a hybrid switch architecture that uses circuit switches to dynamically reconfigure lower-degree interconnects to suit the topological requirements of a given scientific application. This work presents several new research contributions. We develop an optimization strategy for HFAST mappings and demonstrate that efficiency gains can be attained across a broad range of static numerical computations. Additionally, we conduct an extensive analysis of the communication characteristics of a dynamically adapting mesh calculation and show that the HFAST approach can achieve significant advantages, even when compared with traditional fat-tree configurations. Overall results point to the promising potential of utilizing hybrid reconfigurable networks to interconnect future peta-scale architectures, for both static and dynamically adapting applications.
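
At its core, the optimization problem sketched in this abstract is one of mapping an application's communication graph onto a limited number of configurable links. The toy heuristic below, which greedily gives direct circuits to the heaviest-traffic process pairs, is only meant to illustrate that problem shape; it is not the HFAST optimization strategy from the paper.

```python
# Toy greedy mapping: assign direct circuit-switched links to the process pairs
# that exchange the most traffic, up to a per-node port budget.
# Illustrative only -- not the optimization strategy used by HFAST.
from collections import defaultdict

def assign_circuits(traffic, ports_per_node):
    """traffic: dict mapping (src, dst) pairs to bytes exchanged."""
    used = defaultdict(int)          # ports consumed per node
    circuits = []
    for (src, dst), volume in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if used[src] < ports_per_node and used[dst] < ports_per_node:
            circuits.append((src, dst, volume))
            used[src] += 1
            used[dst] += 1
    return circuits  # remaining pairs would fall back to the packet network

if __name__ == "__main__":
    demo = {(0, 1): 900, (1, 2): 850, (0, 2): 100, (2, 3): 700}
    print(assign_circuits(demo, ports_per_node=1))
```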


Many-Task Computing on Grids and Supercomputers | 2011

Riding the elephant: managing ensembles with Hadoop

Elif Dede; Madhusudhan Govindaraju; Daniel K. Gunter; Lavanya Ramakrishnan

Many important scientific applications do not fit the traditional model of a monolithic simulation running on thousands of nodes. Scientific workflows -- such as the Materials Genome project, the Energy Frontier Research Center for Gas Separations Relevant to Clean Energy Technologies, climate simulations, and uncertainty quantification in fluid and solid dynamics -- all run large numbers of parallel analyses, which we call scientific ensembles. These scientific ensembles have a large number of tasks with control and data dependencies. Current tools for creating and managing these ensembles in HPC environments are limited and difficult to use; this is proving to be a limiting factor to running scientific ensembles at the large scale enabled by these HPC environments. MapReduce, with its open-source implementation Hadoop, is an attractive paradigm due to the simplicity of the programming model and intrinsic mechanisms for handling scalability and fault-tolerance. In this paper, we evaluate the programmability of MapReduce and Hadoop for scientific workflow ensembles.
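
One common way to express an ensemble in the MapReduce model is to have each map task run a single ensemble member. The Hadoop Streaming style mapper below is a generic sketch of that idea, with a placeholder analyze() function; it is not code from the paper.

```python
#!/usr/bin/env python
# Sketch of a Hadoop Streaming mapper: each input line names one ensemble
# member (e.g., a parameter set), and the mapper runs that member's analysis.
# The analyze() function is a placeholder, not from the paper.
import sys

def analyze(params):
    # Stand-in for a real simulation or post-processing step.
    return sum(float(p) for p in params.split(","))

for line in sys.stdin:
    member_id, params = line.rstrip("\n").split("\t", 1)
    result = analyze(params)
    # Emit key<TAB>value pairs for the reduce phase to aggregate.
    print(f"{member_id}\t{result}")
```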


IEEE International Conference on High Performance Computing, Data, and Analytics | 2012

A General Approach to Real-Time Workflow Monitoring

Karan Vahi; Ian Harvey; Taghrid Samak; Daniel K. Gunter; Kieran Evans; David Rogers; Ian J. Taylor; Monte Goode; Fabio Silva; Eddie Al-Shakarchi; Gaurang Mehta; Andrew Clifford Jones; Ewa Deelman

Scientific workflow systems support different workflow representations, operational modes, and configurations. However, independent of the system used, end users need to track the status of their workflows in real time, be notified of execution anomalies and failures automatically, perform troubleshooting, and automate the analysis of the workflow to help categorize and qualify the results. In this paper, we describe how the Stampede monitoring infrastructure, which was previously integrated in the Pegasus Workflow Management System, was employed in Triana in order to add generic real-time monitoring and troubleshooting capabilities across both systems. Stampede is an infrastructure that attempts to address interoperable monitoring needs by providing a three-layer model: a common data model to describe workflow and job executions; high-performance tools to load workflow logs conforming to the data model into a data store; and a querying interface for extracting information from the data store in a standard fashion. The resulting integration demonstrates the generic nature of the Stampede monitoring infrastructure, which has the potential to provide a common platform for monitoring across scientific workflow engines.
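
To make the three-layer idea concrete, the sketch below stores workflow and job events in a small common relational schema and answers a status query against it. The table layout and field names are invented for illustration and are not the actual Stampede data model.

```python
# Illustrative mini event store: a common schema for workflow/job events
# plus a status query. Field names are hypothetical, not the Stampede schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE events (
    wf_id TEXT, job_id TEXT, event TEXT, ts REAL)""")

# Layer 2: a loader inserts normalized log records into the store.
log_records = [
    ("wf-1", "job-a", "job.start", 100.0),
    ("wf-1", "job-a", "job.end", 130.0),
    ("wf-1", "job-b", "job.start", 131.0),
    ("wf-1", "job-b", "job.failure", 140.0),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", log_records)

# Layer 3: a query interface -- here, list jobs that recorded a failure event,
# along with the time of the most recent failure.
rows = conn.execute("""
    SELECT job_id, MAX(ts) FROM events
    WHERE wf_id = 'wf-1' AND event = 'job.failure'
    GROUP BY job_id""").fetchall()
print(rows)
```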


International Conference on e-Science | 2014

Experiences with User-Centered Design for the Tigres Workflow API

Lavanya Ramakrishnan; Sarah S. Poon; Valerie Hendrix; Daniel K. Gunter; Gilberto Pastorello; Deborah A. Agarwal

Scientific data volumes have been growing exponentially. This has resulted in the need for new tools that enable users to operate on and analyze data. Cyberinfrastructure tools, including workflow tools, that have been developed in the last few years have often fallen short of user needs and suffered from a lack of wider adoption. The User-Centered Design (UCD) process has been used as an effective approach to develop usable software with high adoption rates. However, UCD has largely been applied to user interfaces, and there has been limited work in applying UCD to application programming interfaces and cyberinfrastructure tools. We use an adapted version of UCD that we refer to as Scientist-Centered Design (SCD) to engage with users in the design and development of Tigres, a workflow application programming interface. Tigres provides a simple set of programming templates (e.g., sequence, parallel, split, merge) that can be used to compose and execute computational and data transformation pipelines. In this paper, we describe Tigres and discuss our experiences with the use of UCD for the initial development of Tigres. Our experience to date is that the UCD process not only resulted in better requirements gathering but also heavily influenced the architecture design and implementation details. User engagement during the development of tools such as Tigres is critical to ensure usability and increase adoption.
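
The template vocabulary mentioned above (sequence, parallel, split, merge) can be pictured with a few lines of plain Python. The functions below mimic the shape of sequence and parallel templates but are not the actual Tigres API.

```python
# Toy versions of sequence/parallel templates to illustrate the idea.
# These mimic the template concept only; they are not the Tigres API.
from concurrent.futures import ThreadPoolExecutor

def sequence(tasks, value):
    """Run tasks one after another, feeding each output to the next."""
    for task in tasks:
        value = task(value)
    return value

def parallel(tasks, values):
    """Run one task per input value concurrently and collect the results."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda tv: tv[0](tv[1]), zip(tasks, values)))

clean = lambda x: x.strip()
count = lambda x: len(x)
print(sequence([clean, count], "  some record  "))   # -> 11
print(parallel([count, count], ["abc", "defgh"]))    # -> [3, 5]
```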


Grid Computing | 2013

A Case Study into Using Common Real-Time Workflow Monitoring Infrastructure for Scientific Workflows

Karan Vahi; Ian Harvey; Taghrid Samak; Daniel K. Gunter; Kieran Evans; David Mckendrick Rogers; Ian J. Taylor; Monte Goode; Fabio Silva; Eddie Al-Shakarchi; Gaurang Mehta; Ewa Deelman; Andrew C. Jones

Scientific workflow systems support various workflow representations, operational modes, and configurations. Regardless of the system used, end users have common needs: to track the status of their workflows in real time, be notified of execution anomalies and failures automatically, perform troubleshooting, and automate the analysis of the workflow results. In this paper, we describe how the Stampede monitoring infrastructure was integrated with the Pegasus Workflow Management System and the Triana Workflow System, in order to add generic real-time monitoring and troubleshooting capabilities across both systems. Stampede is an infrastructure that provides interoperable monitoring using a three-layer model: (1) a common data model to describe workflow and job executions; (2) high-performance tools to load workflow logs conforming to the data model into a data store; and (3) a common query interface. This paper describes the integration of the Stampede monitoring architecture with Pegasus and Triana and shows the new analysis capabilities that Stampede provides to these workflow systems. The successful integration of Stampede with these workflow engines demonstrates the generic nature of the Stampede monitoring infrastructure and its potential to provide a common platform for monitoring across scientific workflow engines.


Network Operations and Management Symposium | 2012

Scalable analysis of network measurements with Hadoop and Pig

Taghrid Samak; Daniel K. Gunter; Valerie Hendrix

The deployment of ubiquitous distributed monitoring infrastructure such as perfSONAR is greatly increasing the availability and quality of network performance data. Cross-cutting analyses are now possible that can detect anomalies and provide real-time automated alerts to network management services. However, scaling these analyses to the volumes of available data remains a difficult task. Although there is significant research into offline analysis techniques, most of these approaches do not address the systems and scalability issues. This work presents an analysis framework incorporating industry best practices and tools to perform large-scale analyses. Our framework integrates the expressiveness of Pig, the scalability of Hadoop, and the analysis and visualization capabilities of R to achieve a significant increase in both the speed and power of analysis. Evaluation of our framework on a large dataset of real measurements from perfSONAR demonstrates a large speedup and novel statistical capabilities.
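
The core of such an analysis is a grouped aggregation over measurement records; in the paper this style of query is expressed in Pig and executed on Hadoop. The plain-Python sketch below shows the same shape of computation on a toy set of perfSONAR-like throughput records, with illustrative field names that are not taken from the paper.

```python
# Illustrative grouped aggregation over network measurements: mean throughput
# per (source, destination) pair. Field names are hypothetical; at scale this
# kind of query would be written in Pig and run on Hadoop.
from collections import defaultdict
from statistics import mean

measurements = [
    {"src": "lbl.gov", "dst": "anl.gov", "throughput_mbps": 920.0},
    {"src": "lbl.gov", "dst": "anl.gov", "throughput_mbps": 150.0},
    {"src": "lbl.gov", "dst": "ornl.gov", "throughput_mbps": 870.0},
]

groups = defaultdict(list)
for m in measurements:
    groups[(m["src"], m["dst"])].append(m["throughput_mbps"])

for (src, dst), values in groups.items():
    print(src, dst, round(mean(values), 1))
```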


International Conference on e-Science | 2013

Automatic Outlier Detection for Genome Assembly Quality Assessment

Taghrid Samak; Rob Egan; Brian Bushnell; Daniel K. Gunter; Alex Copeland; Zhong Wang

In this work we describe a method to automatically detect errors in de novo assembled genomes. The method extends a Bayesian assembly quality evaluation framework, ALE, which computes the likelihood of an assembly given a set of unassembled data. Starting from ALE output, this method applies outlier detection algorithms to identify the precise locations of assembly errors. We show results from a microbial genome with manually curated assembly errors. Our method detects all deletions, 82.3% of insertions, and 88.8% of single base substitutions. It was also able to detect an inversion error that spans more than 400 bases.
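
A simple way to picture the outlier-detection step is to flag positions whose per-base likelihood score falls far from the bulk of the distribution. The sketch below applies a median-absolute-deviation rule to synthetic scores; the actual method in the paper operates on ALE output and differs in detail.

```python
# Toy outlier detector over per-base likelihood scores using a
# median-absolute-deviation (MAD) rule. Synthetic data; the paper's method
# works on real ALE output and is more sophisticated.
from statistics import median

def mad_outliers(scores, threshold=3.5):
    med = median(scores)
    mad = median(abs(s - med) for s in scores) or 1e-9
    # Positions whose robust z-score exceeds the threshold are flagged.
    return [i for i, s in enumerate(scores)
            if abs(0.6745 * (s - med) / mad) > threshold]

scores = [-1.0, -1.1, -0.9, -1.05, -8.0, -1.0, -0.95]   # one obvious dip
print(mad_outliers(scores))   # -> [4]
```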


International Conference on e-Science | 2017

Ten Principles for Creating Usable Software for Science

Lavanya Ramakrishnan; Daniel K. Gunter

The volume and variety of scientific data being generated at experimental facilities require the seamless interaction of the scientist's knowledge with the large-scale machines and software needed to process the data. In the last few years, scientific software tools have been developed to address these increasingly complex workflow and data management needs. However, current approaches for designing systems and tools focus on the hardware and software of the machine and do not consider the user. Our experience shows that user experience research needs to be tightly integrated with the software development life cycle for building sustainable software for science. It has become not just necessary, but critical, to consider user interaction in the design of the entire system for data-intensive sciences that have complex human interaction with the data, software, and systems. The dynamic nature of science projects and the complex roles of personnel in the projects make it difficult to apply classical user research methodologies from industry. In this paper, we make three specific contributions towards improving the usability and sustainability of scientific software. First, we examine the software life cycle in science environments and identify the differences with commercial software development. Next, we outline ten principles we have developed to guide user engagement and software development, and illustrate them with examples from our projects over the last several years. Finally, we provide guidelines to other eScience projects on applying the ten principles in the software development life cycle.

Collaboration


Dive into Daniel K. Gunter's collaborations.

Top Co-Authors

Brian Tierney (Lawrence Berkeley National Laboratory)
Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory)
Taghrid Samak (Lawrence Berkeley National Laboratory)
Ali Pinar (Sandia National Laboratories)
Ewa Deelman (University of Southern California)
Gaurang Mehta (University of Southern California)
John Shalf (Lawrence Berkeley National Laboratory)
Leonid Oliker (Lawrence Berkeley National Laboratory)
Michael J. Lijewski (Lawrence Berkeley National Laboratory)