A. Woodard
University of Notre Dame
Publications
Featured research published by A. Woodard.
Journal of Physics: Conference Series | 2015
Haiyan Meng; Matthias Wolf; Peter Ivie; A. Woodard; Mike Hildreth; Douglas Thain
The reproducibility of scientific results increasingly depends upon the preservation of computational artifacts. Although preserving a computation for later use sounds easy, it is surprisingly difficult due to the complexity of existing software and systems. Implicit dependencies, networked resources, and shifting compatibility all conspire to break applications that appear to work well. To investigate these issues, we present a case study of a complex high energy physics application. We analyze the application and attempt several methods of extracting its dependencies for the purposes of preservation. We propose a fine-grained dependency management toolkit to preserve the application and demonstrate its correctness in three different environments: the original machine, a virtual machine on the Notre Dame Cloud Platform, and a virtual machine on the Amazon EC2 platform. We report on the completeness, performance, and efficiency of each technique, and offer some guidance for future work in application preservation.
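As a rough illustration of the dependency-capture idea discussed above, the sketch below enumerates a binary's shared-library dependencies with ldd and copies them into an archive directory. This is only a coarse approximation for illustration; the toolkit described in the paper works at a much finer grain (it traces file accesses at run time), and the default binary path used here is arbitrary.

```python
#!/usr/bin/env python3
"""Minimal sketch: enumerate shared-library dependencies of a binary so they
can be copied into a preservation archive.  Illustrative only; this is NOT
the fine-grained toolkit described in the paper."""

import shutil
import subprocess
import sys
from pathlib import Path

def shared_libs(binary: str) -> list[str]:
    """Return resolved paths of shared libraries reported by ldd."""
    out = subprocess.run(["ldd", binary], capture_output=True, text=True, check=True)
    paths = []
    for line in out.stdout.splitlines():
        parts = line.split()
        # Typical line: "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
        if "=>" in parts:
            target = parts[parts.index("=>") + 1]
            if target.startswith("/"):
                paths.append(target)
        elif parts and parts[0].startswith("/"):
            # e.g. the dynamic linker itself: "/lib64/ld-linux-x86-64.so.2 (0x...)"
            paths.append(parts[0])
    return paths

def archive(binary: str, dest: str = "preserved") -> None:
    """Copy the binary and its library dependencies into a flat archive directory."""
    dest_dir = Path(dest)
    dest_dir.mkdir(exist_ok=True)
    for path in [binary] + shared_libs(binary):
        shutil.copy2(path, dest_dir / Path(path).name)

if __name__ == "__main__":
    archive(sys.argv[1] if len(sys.argv) > 1 else "/bin/ls")
```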
Cluster Computing and the Grid | 2014
Dillon Skeehan; Paul Brenner; Ben Tovar; Douglas Thain; N. Valls; A. Woodard; Matthias Wolf; T. Pearson; S. Lynch; Kevin Lannon
The computing needs of high energy physics experiments like the Compact Muon Solenoid experiment at the Large Hadron Collider currently exceed the available dedicated computational resources, hence motivating a push to leverage opportunistic resources. However, access to opportunistic resources faces many obstacles, not the least of which is making available the complex software stack typically associated with such computations. This paper describes a framework constructed using existing software packages to distribute the needed software to opportunistic resources without the need for the job to have root-level privileges. Preliminary tests with this framework have demonstrated the feasibility of the approach and identified bottlenecks as well as reliability issues which must be resolved in order to make this approach viable for broad use.
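The sketch below shows, under stated assumptions, how a job wrapper might expose a software stack on /cvmfs through Parrot without root privileges. parrot_run is a real CCTools utility, but the proxy setting, repository path, and payload command here are illustrative assumptions rather than the paper's actual framework.

```python
#!/usr/bin/env python3
"""Illustrative sketch only: launch a payload through parrot_run so that the
CMS software stack on /cvmfs becomes visible without root privileges or a
locally installed CVMFS client.  Proxy and paths are assumptions."""

import os
import subprocess
import sys

def run_with_parrot(payload: list[str], proxy: str = "http://squid.example.edu:3128") -> int:
    """Run `payload` inside Parrot; returns the payload's exit code."""
    env = dict(os.environ)
    env["HTTP_PROXY"] = proxy          # CVMFS data is fetched over HTTP via a site squid (assumed setup)
    cmd = ["parrot_run"] + payload     # no root needed: Parrot interposes on system calls in user space
    return subprocess.call(cmd, env=env)

if __name__ == "__main__":
    # Hypothetical payload: source the CMS environment from /cvmfs and report the architecture
    script = "source /cvmfs/cms.cern.ch/cmsset_default.sh && echo $SCRAM_ARCH"
    sys.exit(run_with_parrot(["/bin/sh", "-c", script]))
```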
International Conference on e-Science | 2016
Haiyan Meng; Douglas Thain; Alexander Vyushkov; Matthias Wolf; A. Woodard
Publishing scientific results without the detailed execution environments describing how the results were collected makes it difficult or even impossible for the reader to reproduce the work. However, the configurations of execution environments are too complex to be described easily by authors. To solve this problem, we propose a framework that facilitates reproducible research by tracking, creating, and preserving comprehensive execution environments with Umbrella. The framework includes a lightweight, persistent and deployable execution environment specification; an execution engine which creates the specified execution environments; and an archiver which archives an execution environment into persistent storage services like Amazon S3 and the Open Science Framework (OSF). The execution engine utilizes sandbox techniques such as virtual machines (VMs), Linux containers, and user-space tracers to create an execution environment, and allows common dependencies like base OS images to be shared by sandboxes for different applications. We evaluate our framework by using it to reproduce three scientific applications from epidemiology, scene rendering, and high energy physics. We evaluate the time and space overhead of reproducing these applications, and the effectiveness of the chosen archive unit and mounting mechanism for allowing different applications to share dependencies. Our results show that these applications can be reproduced successfully and efficiently using different sandbox techniques, even though the overhead and performance vary slightly.
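To make the idea of a lightweight execution-environment specification concrete, here is a small sketch that writes a JSON spec of the general kind described above. The key names and storage URL are illustrative assumptions loosely modeled on Umbrella-style specs, not the exact Umbrella schema.

```python
#!/usr/bin/env python3
"""Sketch of a lightweight execution-environment specification.  The key
names below are illustrative assumptions, loosely modeled on Umbrella-style
specs, not the exact Umbrella schema."""

import json

spec = {
    "comment": "hypothetical HEP analysis environment",
    "hardware": {"arch": "x86_64", "cores": "1", "memory": "2GB", "disk": "10GB"},
    "kernel": {"name": "linux", "version": ">=2.6.32"},
    "os": {"name": "redhat", "version": "6.5"},
    "software": {
        # Dependencies are referenced by source and fetched from persistent
        # storage (e.g. Amazon S3 or OSF) rather than copied into the spec.
        "cmssw-5.3.11": {
            "source": ["s3+https://s3.amazonaws.com/example-bucket/cmssw-5.3.11.tar.gz"],
            "mountpoint": "/cvmfs/cms.cern.ch",
        }
    },
    "environ": {"CMS_VERSION": "CMSSW_5_3_11"},
}

with open("environment.json", "w") as f:
    json.dump(spec, f, indent=2)
print("wrote environment.json")
```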
Journal of Physics: Conference Series | 2017
Matthias Wolf; A. Woodard; Wenzhao Li; Kenyi Hurtado Anampa; Anna Yannakopoulos; Benjamín Tovar; Patrick Donnelly; Paul Brenner; Kevin Lannon; Mike Hildreth; Douglas Thain
We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. We discuss the various challenges encountered while scaling up the simultaneous CPU core utilization and the software improvements required to overcome them.
Task categories: Workflows can now be divided into categories based on their required system resources. This allows the batch queueing system to optimize the assignment of tasks to nodes with the appropriate capabilities. Within each category, limits can be specified on the number of running jobs to regulate the utilization of communication bandwidth. System resource specifications for a task category can now be modified while a project is running, avoiding the need to restart the project if resource requirements differ from the initial estimates. Lobster also implements time limits on each task category to voluntarily terminate tasks, allowing partially completed work to be recovered.
Workflow dependency specification: One workflow often requires data from other workflows as input. Rather than waiting for earlier workflows to be completed before beginning later ones, Lobster now allows dependent tasks to begin as soon as sufficient input data has accumulated.
Resource monitoring: Lobster utilizes a new capability in Work Queue to monitor the system resources each task requires, in order to identify bottlenecks and assign tasks optimally.
With these additions, the capability of the Lobster opportunistic workflow management system for HEP computation has been significantly increased. We have demonstrated efficient utilization of 25 000 non-dedicated cores and achieved a data input rate of 30 Gb/s and an output rate of 500 GB/h; reaching this scale required the new capabilities in task categorization, workflow dependency specification, and resource monitoring described above.
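A rough sketch of the per-category resource limits and monitoring described above, expressed with the CCTools Work Queue Python bindings that Lobster builds on. This is not Lobster's own configuration interface; the category names, resource numbers, and payload script are assumptions, and method names can differ between CCTools versions.

```python
#!/usr/bin/env python3
"""Rough sketch of per-category resource limits with the Work Queue Python
bindings (requires CCTools).  Category names, resource numbers, and the
payload script are illustrative assumptions."""

import work_queue as wq

q = wq.WorkQueue(port=9123)
q.enable_monitoring()  # measure the resources each task actually uses

# Each workflow category gets its own resource envelope, so the scheduler
# can match tasks to workers with the appropriate capabilities.
q.specify_category_max_resources("simulation", {"cores": 1, "memory": 2000, "disk": 4000})
q.specify_category_max_resources("reconstruction", {"cores": 1, "memory": 3000, "disk": 8000})

for i in range(10):
    t = wq.Task("./run_step.sh %d" % i)   # hypothetical payload script
    t.specify_category("simulation")
    q.submit(t)

while not q.empty():
    t = q.wait(60)
    if t:
        print("task", t.id, "finished with result", t.result)
```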
Journal of Physics: Conference Series | 2017
Matthias Wolf; A. Woodard; Wenzhao Li; Kenyi Hurtado Anampa; Benjamín Tovar; Paul Brenner; Kevin Lannon; Mike Hildreth; Douglas Thain
The University of Notre Dame (ND) CMS group operates a modest-sized Tier-3 site suitable for local, final-stage analysis of CMS data. However, through the ND Center for Research Computing (CRC), Notre Dame researchers have opportunistic access to roughly 25k CPU cores of computing and a 100 Gb/s WAN network link. To understand the limits of what might be possible in this scenario, we undertook to use these resources for a wide range of CMS computing tasks, from user analysis through large-scale Monte Carlo production (including both detector simulation and data reconstruction). We discuss the challenges inherent in effectively utilizing CRC resources for these tasks and the solutions deployed to overcome them.
International Conference on Cluster Computing | 2015
A. Woodard; Matthias Wolf; C. Mueller; N. Valls; Ben Tovar; Patrick Donnelly; Peter Ivie; Kenyi Hurtado Anampa; Paul Brenner; Douglas Thain; Kevin Lannon; Michael Hildreth
The high energy physics (HEP) community relies upon a global network of computing and data centers to analyze data produced by multiple experiments at the Large Hadron Collider (LHC). However, this global network does not satisfy all research needs. Ambitious researchers often wish to harness computing resources that are not integrated into the global network, including private clusters, commercial clouds, and other production grids. To enable these use cases, we have constructed Lobster, a system for deploying data intensive high throughput applications on non-dedicated clusters. This requires solving multiple problems related to non-dedicated resources, including work decomposition, software delivery, concurrency management, data access, data merging, and performance troubleshooting. With these techniques, we demonstrate Lobster running effectively on 10k cores, producing throughput at a level comparable with some of the largest dedicated clusters in the LHC infrastructure.
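As a toy illustration of the work-decomposition problem mentioned above, the sketch below splits a dataset's luminosity sections into fixed-size chunks, one per task, so that partial results can later be merged. The chunk size and dataset are made up for illustration; Lobster's actual decomposition is considerably more sophisticated.

```python
#!/usr/bin/env python3
"""Toy sketch of work decomposition: split a dataset's luminosity sections
into task-sized units for non-dedicated workers.  Numbers are made up."""

from itertools import islice
from typing import Iterator

def decompose(lumi_sections: list[int], per_task: int) -> Iterator[list[int]]:
    """Yield consecutive chunks of luminosity sections, one chunk per task."""
    it = iter(lumi_sections)
    while chunk := list(islice(it, per_task)):
        yield chunk

if __name__ == "__main__":
    dataset = list(range(1, 101))              # pretend the dataset has 100 lumi sections
    for n, chunk in enumerate(decompose(dataset, per_task=25)):
        print(f"task {n}: lumis {chunk[0]}-{chunk[-1]}")
```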
Journal of Physics: Conference Series | 2015
A. Woodard; Matthias Wolf; C. Mueller; Ben Tovar; Patrick Donnelly; Kenyi Hurtado Anampa; Paul Brenner; Kevin Lannon; Mike Hildreth; Douglas Thain
Analysis of high energy physics experiments using the Compact Muon Solenoid (CMS) at the Large Hadron Collider (LHC) can be limited by the availability of computing resources. As a joint effort involving computer scientists and CMS physicists at Notre Dame, we have developed an opportunistic workflow management tool, Lobster, to harvest available cycles from university campus computing pools. Lobster consists of a management server, a file server, and worker processes which can be submitted to any available computing resource without requiring root access. Lobster makes use of the Work Queue system to perform task management, while the CMS-specific software environment is provided via CVMFS and Parrot. Data is handled via Chirp and Hadoop for local data storage and XrootD for access to the CMS wide-area data federation. An extensive set of monitoring and diagnostic tools has been developed to facilitate system optimisation. We have tested Lobster using the 20 000-core cluster at Notre Dame, achieving approximately 8-10k tasks running simultaneously and sustaining approximately 9 Gbit/s of input data and 340 Mbit/s of output data.
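The data-handling pattern described above (XrootD for wide-area input, Chirp backed by local storage for output) might look roughly like the following on a worker. xrdcp and chirp_put are real command-line tools, but the hostnames, paths, and payload script are illustrative assumptions, not the actual Lobster implementation.

```python
#!/usr/bin/env python3
"""Minimal sketch of the data-handling pattern: a worker pulls its input
over XrootD and stages its output to a Chirp server backed by site storage.
Hostnames, paths, and the payload are illustrative assumptions."""

import subprocess

INPUT_URL = "root://cmsxrootd.example.org//store/data/example/file.root"   # hypothetical
OUTPUT_HOST = "chirp.example.edu"                                          # hypothetical

def fetch_input(url: str, local: str = "input.root") -> None:
    # xrdcp is the standard XRootD copy client
    subprocess.run(["xrdcp", url, local], check=True)

def stage_output(local: str, remote_path: str) -> None:
    # chirp_put (CCTools) writes a file into a Chirp server
    subprocess.run(["chirp_put", local, OUTPUT_HOST, remote_path], check=True)

if __name__ == "__main__":
    fetch_input(INPUT_URL)
    subprocess.run(["./analyze.sh", "input.root", "output.root"], check=True)  # hypothetical payload
    stage_output("output.root", "/lobster/output/output.root")
```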