Daniel Buettner | Researchain

Publication

Featured research published by Daniel Buettner.

International Parallel and Distributed Processing Symposium | 2010

Analyzing and adjusting user runtime estimates to improve job scheduling on the Blue Gene/P

Wei Tang; Narayan Desai; Daniel Buettner; Zhiling Lan

Backfilling and short-job-first are widely acknowledged enhancements to the simple but popular first-come, first-served job scheduling policy. However, both enhancements depend on user-provided estimates of job runtime, which research has repeatedly shown to be inaccurate. We have investigated the effects of this inaccuracy on backfilling and different queue prioritization policies, determining which part of the scheduling policy is most sensitive. Using these results, we have designed and implemented several estimation-adjusting schemes based on historical data. We have evaluated these schemes using workload traces from the Blue Gene/P system at Argonne National Laboratory. Our experimental results demonstrate that dynamically adjusting job runtime estimates can improve job scheduling performance by up to 20%.
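The abstract does not spell out the adjustment schemes, so the following is only a minimal sketch of one plausible history-based approach: scaling a user's requested runtime by the mean ratio of actual to requested runtime over that user's past jobs. The function name `adjust_estimate`, the `(requested, actual)` history format, and the cap at the user's request are all assumptions made for illustration, not the authors' method.

```python
def adjust_estimate(user_estimate, history):
    """Scale a requested runtime by the mean actual/requested ratio of past jobs.

    history: list of (requested, actual) runtime pairs for earlier jobs
             (hypothetical format chosen for this sketch).
    """
    # Ignore malformed records with a non-positive request.
    ratios = [actual / requested for requested, actual in history if requested > 0]
    if not ratios:
        # No usable history: fall back to the user's own estimate.
        return user_estimate
    mean_ratio = sum(ratios) / len(ratios)
    # Assumption: the adjusted value never exceeds the user's request.
    return min(user_estimate, user_estimate * mean_ratio)


# A user whose past jobs ran for half the requested time.
past_jobs = [(100, 50), (200, 100)]
print(adjust_estimate(60, past_jobs))
```

In this sketch the adjusted value would feed only scheduling decisions such as backfilling; the user's original request would still be the limit enforced on the job.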

International Conference on Cluster Computing | 2009

Fault-aware, utility-based job scheduling on Blue Gene/P systems

Wei Tang; Zhiling Lan; Narayan Desai; Daniel Buettner

Job scheduling on large-scale systems is an increasingly complicated affair, with numerous factors influencing scheduling policy. Addressing these concerns results in sophisticated scheduling policies that can be difficult to reason about. In this paper, we present a general utility-based scheduling framework to balance various scheduling requirements and priorities. It enables system owners to customize scheduling policies under different circumstances without changing the scheduling code. We also develop a fault-aware job allocation strategy for Blue Gene/P systems to address the increasing concern of system failures. We demonstrate the effectiveness of these facilities by means of event-driven simulations with real job traces collected from the production Blue Gene/P system at Argonne National Laboratory.
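One way to picture a utility-based framework is a scheduler that orders the queue by a pluggable scoring function, so that changing policy means swapping the function rather than editing the scheduler. The sketch below illustrates that general idea only; the `Job` fields, the two utility functions, and `order_queue` are hypothetical names and are not taken from the paper or from any real resource manager.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    wait_time: float           # seconds spent in the queue so far
    requested_walltime: float  # user-requested runtime in seconds


def fcfs_utility(job):
    # First-come, first-served: the longest-waiting job scores highest.
    return job.wait_time


def short_job_first_utility(job):
    # Short-job-first: a smaller request yields a higher score.
    return -job.requested_walltime


def order_queue(jobs, utility):
    # The scheduler core stays fixed; only the utility function changes.
    return sorted(jobs, key=utility, reverse=True)


queue = [Job("a", wait_time=10, requested_walltime=100),
         Job("b", wait_time=5, requested_walltime=10)]
print([j.name for j in order_queue(queue, fcfs_utility)])
print([j.name for j in order_queue(queue, short_job_first_utility)])
```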

International Parallel and Distributed Processing Symposium | 2011

Reducing Fragmentation on Torus-Connected Supercomputers

Wei Tang; Zhiling Lan; Narayan Desai; Daniel Buettner; Yongen Yu

Torus-based networks are prevalent on leadership-class petascale systems, providing a good balance between network cost and performance. The major disadvantage of this network architecture is its susceptibility to fragmentation. Many studies have attempted to reduce resource fragmentation in this architecture. Although the suggested approaches can make good allocation decisions that reduce fragmentation at job start time, none of them considers a job's wall time, which can cause resource fragmentation when neighboring jobs do not complete at around the same time. In this paper, we propose a wall time-aware job allocation strategy, which adjacently packs jobs that finish around the same time, in order to minimize resource fragmentation caused by job length discrepancy. Event-driven simulations using real job traces from a production Blue Gene/P system at Argonne National Laboratory demonstrate that our wall time-aware strategy can effectively reduce system fragmentation and improve overall system performance.
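The packing idea can be illustrated with a deliberately simplified sketch: among the free partitions, choose the one whose neighboring running job is expected to finish closest to the new job's expected finish time. The function `pick_slot` and the `(slot_id, neighbor_end)` candidate format are assumptions for illustration; the actual strategy operates on torus partitions and is not reproduced here.

```python
def pick_slot(candidates, job_end):
    """Pick the free slot whose neighbor finishes closest to job_end.

    candidates: list of (slot_id, neighbor_end) pairs, where neighbor_end is
                the expected end time of the job running next to that slot
                (hypothetical format chosen for this sketch).
    job_end:    expected end time of the job being placed.
    """
    if not candidates:
        return None  # nothing free, so the job must wait
    # Minimizing the gap between end times means adjacent partitions
    # tend to be released together.
    slot_id, _ = min(candidates, key=lambda c: abs(c[1] - job_end))
    return slot_id


free = [("A", 100), ("B", 250), ("C", 400)]
print(pick_slot(free, job_end=260))
```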

Journal of Parallel and Distributed Computing | 2013

Job scheduling with adjusted runtime estimates on production supercomputers

Wei Tang; Narayan Desai; Daniel Buettner; Zhiling Lan

The estimate of a parallel job’s running time (walltime) is an important attribute used by resource managers and job schedulers in various scenarios, such as backfilling and short-job-first scheduling. This value is provided by the user, however, and has been repeatedly shown to be inaccurate. We studied the workload characteristics based on a large amount of historical data (over 275,000 jobs in two and a half years) from a production leadership-class computer. Based on that study, we proposed a set of walltime adjustment schemes producing more accurate estimates. To ensure the utility of these schemes on production systems, we analyzed their potential impact in scheduling and evaluated the schemes with an event-driven simulator. Our experimental results show that our method can achieve not only better overall estimation accuracy but also improved overall system performance. Specifically, the average estimation accuracy of the tested workload can be improved by up to 35%, and the system performance in terms of average waiting time and weighted average waiting time can be improved by up to 22% and 28%, respectively.

International Conference on Parallel Processing | 2009

Improving Resource Availability by Relaxing Network Allocation Constraints on Blue Gene/P

Narayan Desai; Darius Buntinas; Daniel Buettner; Pavan Balaji; Anthony Chan

High-end computing (HEC) systems have passed the petaflop barrier and continue to move toward the next frontier of exascale computing. As companies and research institutes continue to work toward architecting these enormous systems, it is becoming increasingly clear that these systems will utilize a significant amount of shared hardware between processing units, including shared caches, memory management engines, and network infrastructure. While these systems are optimized to use all of the hardware available in a dedicated manner to achieve the best performance, in practice, the shared nature of this hardware makes scheduling applications on it difficult and wasteful. For example, while the IBM Blue Gene/P system has been designed to use a torus network for efficient communication, some of the torus links (especially those connecting different racks) are shared between multiple racks. Thus, a job running on one rack might preclude another job from running on a second rack even though the second rack's compute resources are completely idle. In this paper, we assess the relative performance degradation noticed by real applications when such shared network hardware is left completely unutilized in some cases. Our measurements on Intrepid, one of the largest Blue Gene/P installations in the world, demonstrate less than 5% degradation for several leadership applications commonly run on the Intrepid system. Further, we demonstrate that the additional scheduling flexibility offered by not sharing such hardware can improve the overall job turnaround time by nearly 40% in some cases.

Many-Task Computing on Grids and Supercomputers | 2010

Automatic and coordinated job recovery for high performance computing

Wei Tang; Zhiling Lan; Narayan Desai; Daniel Buettner

As the scale of high-performance computing systems continues to grow, the impact of failures on the systems is increasingly critical. Research has been performed on fault prediction and associated precautionary actions. While this approach is valuable, it is not adequate because of the inevitability of failures. Postfailure recovery is equally important; however, most current work relies mainly on checkpoint/restart, not addressing the problem from the system level. We propose AuCoRe, an automatic and coordinated job recovery framework. AuCoRe provides a coordination mechanism for failed-job recovery, taking the execution of regular jobs into account; users specify job recovery policy for their jobs, and an incentive mechanism minimizes gaming. We have implemented AuCoRe in Cobalt, a production resource manager, and evaluated it using real workloads from the Blue Gene/P system at Argonne National Laboratory. Experimental results demonstrate that AuCoRe improves system performance by efficiently managing job recovery.

International Conference on Parallel Processing | 2011

Job Coscheduling on Coupled High-End Computing Systems

Wei Tang; Narayan Desai; Venkatram Vishwanath; Daniel Buettner; Zhiling Lan

Supercomputer centers often deploy large-scale computing systems together with an associated data analysis or visualization system. In this paper, we propose a coscheduling mechanism, providing the ability to coordinate execution between jobs on different systems. The mechanism is built on top of a lightweight protocol for coordination between policy domains without manual intervention. We have evaluated this system using real job traces from Intrepid and Eureka, the production Blue Gene/P and data analysis systems, respectively, deployed at Argonne National Laboratory. Our experimental results quantify the costs of coscheduling and demonstrate that coscheduling can be achieved with limited impact on system performance under varying workloads.

The Journal of Supercomputing | 2013

Multi-domain job coscheduling for leadership computing systems

Wei Tang; Narayan Desai; Venkatram Vishwanath; Daniel Buettner; Zhiling Lan

Current supercomputing centers usually deploy a large-scale compute system together with an associated data analysis or visualization system. Multiple scenarios have driven the demand that some associated jobs co-execute on different machines. We propose a multi-domain coscheduling mechanism, providing the ability to coordinate execution between jobs on multiple resource management domains without manual intervention. We have evaluated our mechanism based on real job traces from Intrepid and Eureka, the production Blue Gene/P system and a cluster with the largest GPU installation, deployed at Argonne National Laboratory. The experimental results show that coscheduling can be achieved with limited impact on system performance under varying workloads.

International Conference on Parallel Processing | 2008

Simulating Failures on Large-Scale Systems

Narayan Desai; Ewing L. Lusk; Daniel Buettner; Andrew Cherry; Theron Voran

Developing fault management mechanisms is a difficult task because of the unpredictable nature of failures. In this paper, we present a fault simulation framework for Blue Gene/P systems implemented as a part of the Cobalt resource manager. The primary goal of this framework is to support system software development. We also present a hardware diagnostic system that we have implemented using this framework.

International Conference on Parallel Processing | 2009