David L. Hart
National Center for Atmospheric Research
Publications
Featured research published by David L. Hart.
IEEE International Conference on High Performance Computing, Data and Analytics | 2011
David L. Hart
TeraGrid has deployed a significant monitoring and accounting infrastructure in order to understand its operational success. In this paper, we present an analysis of the jobs reported by TeraGrid for 2008. We consider the workload from several perspectives: traditional high-performance computing (HPC) workload characteristics; grid-oriented workload characteristics; and finally user- and group-oriented characteristics. We use metrics reported in prior studies of HPC and grid systems to understand whether such metrics provide useful information for managing and studying resource federations. This study highlights the importance of distinguishing between analyses of job patterns and work patterns; it shows that small sets of users dominate the workload in terms of both job and work patterns, and that aggregate analyses across even loosely coupled federations, with incomplete information for individual systems, reflect patterns seen in more tightly coupled grids and in single HPC systems.
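The distinction between job patterns and work patterns can be made concrete with a small sketch. The helper below is illustrative only (the field names and job-log layout are assumptions, not the paper's actual data format): it computes each user's share of the job count versus their share of total work, measured in core-hours as in most HPC accounting systems.

```python
from collections import defaultdict

def user_shares(jobs):
    """Summarize a job log as per-user (job share, work share) pairs.

    `jobs` is a list of (user, cores, hours) tuples; "work" is
    measured in core-hours. Hypothetical layout for illustration.
    """
    job_counts = defaultdict(int)
    core_hours = defaultdict(float)
    for user, cores, hours in jobs:
        job_counts[user] += 1
        core_hours[user] += cores * hours
    total_jobs = sum(job_counts.values())
    total_work = sum(core_hours.values())
    return {
        user: (job_counts[user] / total_jobs, core_hours[user] / total_work)
        for user in job_counts
    }

# A user can dominate the job count without dominating the work, or vice versa:
shares = user_shares([
    ("alice", 1024, 12.0),   # one large job: few jobs, most of the core-hours
    ("bob", 1, 0.1),         # many tiny jobs: most of the jobs, little work
    ("bob", 1, 0.1),
    ("bob", 1, 0.1),
])
```

Ranking users by either column gives two quite different views of "who dominates the workload", which is why the two analyses must be kept separate.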
extreme science and engineering discovery environment | 2013
Thomas R. Furlani; Barry L. Schneider; Matthew D. Jones; John Towns; David L. Hart; Steven M. Gallo; Robert L. DeLeon; Charng Da Lu; Amin Ghadersohi; Ryan J. Gentner; Abani K. Patra; Gregor von Laszewski; Fugang Wang; Jeffrey T. Palmer; Nikolay Simakov
The XDMoD auditing tool provides, for the first time, a comprehensive tool to measure both utilization and performance of high-end cyberinfrastructure (CI), with initial focus on XSEDE. Here, we demonstrate, through several case studies, its utility for providing important metrics regarding resource utilization and performance of TeraGrid/XSEDE that can be used for detailed analysis and planning as well as improving operational efficiency and performance. Measuring the utilization of high-end cyberinfrastructure such as XSEDE helps provide a detailed understanding of how a given CI resource is being utilized and can lead to improved performance of the resource in terms of job throughput or any number of desired job characteristics. In the case studies considered here, a detailed historical analysis of XSEDE usage data using XDMoD clearly demonstrates the tremendous growth in the number of users, overall usage, and scale of the simulations routinely carried out. Not surprisingly, physics, chemistry, and the engineering disciplines are shown to be heavy users of the resources. However, as the data clearly show, the molecular biosciences are now a significant and growing user of XSEDE resources, accounting for more than 20 percent of all SUs consumed in 2012. XDMoD shows that the resources required by the various scientific disciplines are very different. Physics, the astronomical sciences, and the atmospheric sciences tend to solve large problems requiring many cores. Molecular bioscience applications, on the other hand, require many cycles but do not employ core counts that are as large. Such distinctions are important in guiding future cyberinfrastructure design decisions. XDMoD's implementation of a novel application kernel-based auditing system to measure overall CI system performance and quality of service is shown, through several examples, to provide a useful means to automatically detect underperforming hardware and software.
This capability is especially critical given the complex composition of today's advanced CI. Examples include an application kernel based on a widely used quantum chemistry program that uncovered a software bug in the I/O stack of a commercial parallel file system, which was subsequently fixed by the vendor in the form of a software patch that is now part of their standard release. This error, which resulted in dramatically increased execution times as well as outright job failure, would likely have gone unnoticed for some time and was only uncovered as a result of the implementation of XDMoD's suite of application kernels.
IEEE International Conference on High Performance Computing, Data and Analytics | 2011
Davide Del Vento; David L. Hart; Thomas Engel; Rory Kelly; Richard A. Valent; Siddhartha S. Ghosh; Si Liu
NCAR's Bluefire supercomputer is instrumented with a set of low-overhead processes that continually monitor the floating-point counters of its 3,840 batch-compute cores. We extract performance numbers for each batch job by correlating the data from the corresponding nodes. Drawing on experience and heuristics for good performance, we use this data, in part, to identify poorly performing jobs and then work with the users to improve their jobs' efficiency. Often, the solution involves simple steps such as spawning an adequate number of processes or threads, binding the processes or threads to cores, using large memory pages, or using adequate compiler optimization. These efforts typically result in performance improvements and a wall-clock runtime reduction of 10% to 20%. With more involved changes to codes and scripts, some users have obtained performance improvements of 40% to 90%. We discuss our instrumentation, some successful cases, and the approach's general applicability to other systems.
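The per-job correlation step can be sketched as follows. This is a minimal illustration under assumed data layouts (the sample format, threshold value, and function names are hypothetical, not Bluefire's actual monitoring interface): cumulative per-node FLOP counter readings are windowed to a job's runtime and nodes, differenced, and compared against a heuristic per-core rate.

```python
def job_flop_rate(samples, job_start, job_end, job_nodes):
    """Estimate a batch job's aggregate FLOP rate from node-level samples.

    `samples` maps node name -> list of (timestamp, cumulative_flop_count)
    readings, assumed sorted by timestamp. Hypothetical layout.
    """
    total_flops = 0.0
    for node in job_nodes:
        in_window = [(t, c) for t, c in samples[node]
                     if job_start <= t <= job_end]
        if len(in_window) >= 2:
            # Counter is cumulative: work done is last reading minus first.
            total_flops += in_window[-1][1] - in_window[0][1]
    return total_flops / (job_end - job_start)  # FLOP/s over the runtime

def flag_if_slow(rate, cores, per_core_threshold=1e8):
    """Heuristic: flag a job whose per-core rate falls below a threshold
    (the threshold here is an arbitrary placeholder)."""
    return rate / cores < per_core_threshold
```

Flagged jobs would then be reviewed by hand, since a low FLOP rate alone does not distinguish an inefficient code from, say, an I/O-bound one.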
Journal of the Association for Information Science and Technology | 2017
Matthew S. Mayernik; David L. Hart; Keith E. Maull; Nicholas M. Weber
Recent policy shifts on the part of funding agencies and journal publishers are causing changes in the acknowledgment and citation behaviors of scholars. A growing emphasis on open science and reproducibility is changing how authors cite and acknowledge “research infrastructures”—entities that are used as inputs to or as underlying foundations for scholarly research, including data sets, software packages, computational models, observational platforms, and computing facilities. At the same time, stakeholder interest in quantitative understanding of impact is spurring increased collection and analysis of metrics related to use of research infrastructures. This article reviews work spanning several decades on tracing and assessing the outcomes and impacts from these kinds of research infrastructures. We discuss how research infrastructures are identified and referenced by scholars in the research literature and how those references are being collected and analyzed for the purposes of evaluating impact. Synthesizing common features of a wide range of studies, we identify notable challenges that impede the analysis of impact metrics for research infrastructures and outline key open research questions that can guide future research and applications related to such metrics.
International Conference on Cluster Computing | 2015
Gregor von Laszewski; Fugang Wang; Geoffrey C. Fox; David L. Hart; Thomas R. Furlani; Robert L. DeLeon; Steven M. Gallo
We present a framework that compares publication impact based on a comprehensive peer analysis of papers produced by scientists using XSEDE and NCAR resources. The analysis introduces a percentile-ranking-based comparison of citations of XSEDE and NCAR papers against peer publications in the same journal that do not use these resources. This analysis is unique in that it evaluates the impact of the two facilities by comparing the reported publications from them to their peers from within the same journal issue. From this analysis, we can see that papers that utilize XSEDE and NCAR resources are cited statistically significantly more often. Hence the reported publications indicate that XSEDE and NCAR resources exert a strong positive impact on scientific research.
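The core of a percentile-ranking comparison is easy to state. The sketch below is a simplified illustration of the general technique, not the published framework's exact procedure: a paper's citation count is ranked against the citation counts of its same-journal peers, with ties counted as half.

```python
def citation_percentile(paper_citations, peer_citations):
    """Percentile rank of one paper's citation count among its
    same-journal peers (mid-rank convention for ties).

    Simplified sketch; the published analysis has additional details.
    """
    below = sum(1 for c in peer_citations if c < paper_citations)
    ties = sum(1 for c in peer_citations if c == paper_citations)
    return 100.0 * (below + 0.5 * ties) / len(peer_citations)
```

Comparing within the same journal (or issue) controls for field- and venue-level differences in citation practice, so the percentiles of resource-using papers can be compared across journals on a common scale.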
Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure | 2015
Rory Kelly; Si Liu; Siddhartha S. Ghosh; Davide Del Vento; David L. Hart; Dan Nagle; B. J. Smith; Richard A. Valent
Scientists and engineers using supercomputer clusters should be able to focus on their scientific and technical work instead of worrying about operating their user environment. However, creating a convenient and effective user environment on modern supercomputers becomes more and more challenging due to the complexity of these large-scale systems. In this report, we discuss important design issues and goals for a user environment that must support multiple compiler suites, various applications, and diverse libraries on heterogeneous computing architectures. We present our implementation on the latest high-performance computing system, Yellowstone, a powerful dedicated resource for earth system science deployed by the National Center for Atmospheric Research. Our newly designed user environment is built upon a hierarchical module structure, customized wrapper scripts, pre-defined system modules, an Lmod-based module implementation, and several supporting tools. The resulting implementation provides streamlined control, versioning, user customization, automated documentation, and other features, and accommodates both novice and experienced users. The design and implementation also minimize the effort the administrators and support team spend managing users' environments. The smooth rollout and positive feedback from our users demonstrate that our design and implementation on the Yellowstone system have been well received and have supported thousands of users all over the world.
Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure | 2015
Bill Anderson; Marc Genty; David L. Hart; Erich Thanhardt
Data storage needs continue to grow in most fields, and the cost per byte for tape remains lower than the cost for disk, making tape storage a good candidate for cost-effective long-term storage. However, the workloads suitable for tape archives differ from those for disk file systems, and archives must handle internally generated workloads that can be more demanding than those generated by end users (e.g., migration of data from an old tape technology to a new one). To better understand these varied workloads, we have followed the first steps of a data science methodology. For anyone considering the use or deployment of a tape-based data archive, or for anyone interested in the details of data archives in the context of data science, this paper describes key aspects of data archive workloads.
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2011
Daniel S. Katz; David L. Hart; Chris Jordan; Amit Majumdar; John-Paul Navarro; Warren Smith; John Towns; Von Welch; Nancy Wilkins-Diehr
TeraGrid Conference | 2011
Richard L. Moore; David L. Hart; Wayne Pfeiffer; Mahidhar Tatineni; Kenneth Yoshimoto; William S. Young
Archive | 1999
Craig A. Stewart; Tin Wee Tan; Markus Buckhorn; David L. Hart; Donald K. Berry; Louxin Zhang; Eric A. Wernert; Meena Kishore Sakharkar; Will Fischer; Donald F. McMullen