Sean Wallace
Illinois Institute of Technology
Publications
Featured research published by Sean Wallace.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2013
Xu Yang; Zhou Zhou; Sean Wallace; Zhiling Lan; Wei Tang; Susan Coghlan; Michael E. Papka
The research literature to date has mainly aimed at reducing energy consumption in HPC environments. In this paper we propose a job power-aware scheduling mechanism to reduce an HPC system's electricity bill without degrading system utilization. The novelty of our job scheduling mechanism is its ability to take the variation of electricity prices into consideration as a means to make better decisions about when to schedule jobs with diverse power profiles. We verified the effectiveness of our design by conducting trace-based experiments on an IBM Blue Gene/P and a cluster system, as well as a case study on Argonne's 48-rack IBM Blue Gene/Q system. Our preliminary results show that our power-aware algorithm can reduce the electricity bill of HPC systems by as much as 23%.
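A minimal sketch of the pricing idea (all names hypothetical; this is not the authors' scheduler): under on-peak electricity pricing, prefer low-power jobs from the queue, and save power-hungry jobs for off-peak hours.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    nodes: int
    est_power_kw: float  # estimated power draw while running

def pick_next_job(queue, price, on_peak_threshold):
    """Price-aware dispatch: when electricity is expensive, start the
    lowest-power job; when it is cheap, start the highest-power one."""
    if not queue:
        return None
    key = lambda j: j.est_power_kw
    return min(queue, key=key) if price >= on_peak_threshold else max(queue, key=key)

queue = [Job("cfd", 512, 120.0), Job("genomics", 256, 45.0)]
print(pick_next_job(queue, price=0.12, on_peak_threshold=0.10).name)  # genomics
```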
IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2013
Sean Wallace; Venkatram Vishwanath; Susan Coghlan; Zhiling Lan; Michael E. Papka
In addition to pushing what is possible computationally, state-of-the-art supercomputers are also pushing what is acceptable in terms of power consumption. Despite hardware manufacturers researching and developing efficient system components (e.g., processor, memory, etc.), the power consumption of a complete system remains an understudied research area. Because of the complexity and unpredictable workloads of these systems, estimating the power consumption of a full system is a nontrivial task. In this paper, we provide system-level power usage and temperature analysis of early access to Argonne's latest generation of IBM Blue Gene supercomputers, the Mira Blue Gene/Q system. The analysis is provided from the point of view of jobs running on the system. We describe the important implications these system-level measurements have as well as the challenges they present. Using profiling code in benchmarks, we also examine the new tools this latest generation of supercomputer provides, gauging their usefulness and how well they match up against the environmental data.
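As a rough illustration of the job-centric sampling described above (the sensor call below is a synthetic stand-in, not Blue Gene/Q's actual interface):

```python
import random
import time

def read_node_power_watts(node_id):
    """Stand-in for a platform environmental-data query; real systems
    expose this through vendor interfaces such as Blue Gene/Q's EMON."""
    return 1600.0 + random.uniform(-50.0, 50.0)

def profile_job(nodes, duration_s, interval_s=1.0):
    """Sample the total power of a job's nodes over its lifetime and
    report average draw (W) and energy (Wh): the job-level view taken
    in the paper's analysis."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append(sum(read_node_power_watts(n) for n in nodes))
        time.sleep(interval_s)
    avg_w = sum(samples) / len(samples)
    return avg_w, avg_w * duration_s / 3600.0

avg_w, wh = profile_job(nodes=range(4), duration_s=3)
print(f"average power {avg_w:.0f} W, energy {wh:.2f} Wh")
```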
Parallel Computing | 2016
Sean Wallace; Zhou Zhou; Venkatram Vishwanath; Susan Coghlan; John R. Tramm; Zhiling Lan; Michael E. Papka
Highlights:
- We describe our power profiling library, MonEQ, built on the IBM-provided API.
- We integrate MonEQ into several benchmarks to show the data it produces.
- Applications have different power profiles based on their usage of domains in the node.
- There is no difference in network power consumption for a given network topology.
- Scale can reduce power consumption but is far less important than execution time.

The power consumption of state-of-the-art supercomputers, because of their complexity and unpredictable workloads, is extremely difficult to estimate. Accurate and precise results, as are now possible with the latest generation of IBM Blue Gene/Q, are therefore a welcome addition to the landscape. Only recently have end users been afforded the ability to access the power consumption of their applications. However, just because it is possible for end users to obtain this data does not mean it is a trivial task. This newly emerging data is therefore not only understudied, but also not fully understood. In this paper, we describe our open-source power profiling library, MonEQ, built on the IBM-provided Environmental Monitoring (EMON) API. We show that it is lightweight, has extremely low overhead, is flexible, and has advanced features that end users can take advantage of. We then integrate MonEQ into several benchmarks, show the data it produces, and discuss what analysis of this data can teach us. Going one step further, we also describe how seemingly simple changes in scale or network topology can have dramatic effects on power consumption. As a result, previously well-understood applications now have new facets open to analysis.
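MonEQ itself is a C library linked into MPI applications; the Python shim below is an invented stand-in that mirrors only the integration pattern implied above (bracket a code region so environmental samples can be attributed to it), not MonEQ's actual API:

```python
import time
from contextlib import contextmanager

class PowerProfilerShim:
    """Invented stand-in for a MonEQ-style profiler. The real library
    would attribute EMON power samples collected inside a tagged region
    to that tag; here we record only elapsed seconds as a placeholder."""
    def __init__(self):
        self.regions = {}

    @contextmanager
    def tagged(self, name):
        start = time.time()
        try:
            yield
        finally:
            self.regions[name] = self.regions.get(name, 0.0) + time.time() - start

prof = PowerProfilerShim()
with prof.tagged("solve"):
    sum(i * i for i in range(10**6))  # stand-in for a benchmark kernel
print(prof.regions)
```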
IEEE International Conference on High Performance Computing, Data, and Analytics | 2016
Sean Wallace; Xu Yang; Venkatram Vishwanath; William E. Allcock; Susan Coghlan; Michael E. Papka; Zhiling Lan
Modern schedulers running on HPC systems traditionally consider the number of resources and the time requested for each job when making scheduling decisions. Until recently this was sufficient; however, as systems get larger, other metrics like power consumption become necessary to ensure system stability. In this paper, we propose a data-driven scheduling approach for controlling the power consumption of the entire system under any user-defined budget. Here, “data-driven” means that our approach actively observes, analyzes, and assesses the power behaviors of the system and user jobs to guide scheduling decisions for power management. This design is based on the key observation that HPC jobs have distinct power profiles. Our work comprises an empirical analysis of workload power characteristics on a production system, a dynamic learner to estimate job power profiles for scheduling, and an online power-aware scheduler for managing overall system power. Using real workload traces, we demonstrate that our design effectively controls system power consumption while minimizing the impact on system utilization.
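A minimal sketch of budget-constrained dispatch (names invented; `predict_power` stands in for the paper's learned per-job power model): jobs are started only while their predicted draw keeps the system under budget.

```python
def schedule_under_budget(queue, running_power_kw, budget_kw, predict_power):
    """Start queued jobs, in order, only while their predicted power
    keeps the whole system under the user-defined budget."""
    started, headroom = [], budget_kw - running_power_kw
    for job in queue:
        p = predict_power(job)
        if p <= headroom:
            started.append(job)
            headroom -= p
    return started

queue = [{"name": "a", "kw": 80}, {"name": "b", "kw": 300}, {"name": "c", "kw": 40}]
picked = schedule_under_budget(queue, running_power_kw=500, budget_kw=650,
                               predict_power=lambda j: j["kw"])
print([j["name"] for j in picked])  # ['a', 'c']: 'b' would exceed the budget
```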
American Journal of Sports Medicine | 2015
Erich J. Petushek; Edward T. Cokely; Paul Ward; John J. Durocher; Sean Wallace; Gregory D. Myer
Background: Available methods for screening anterior cruciate ligament (ACL) injury risk are effective but limited in application as they generally rely on expensive and time-consuming biomechanical movement analysis. A potentially efficient alternative to biomechanical screening is skilled movement analysis via visual inspection (ie, having experts estimate injury risk factors based on observations of athletes’ movements). Purpose: To develop a brief, valid psychometric assessment of ACL injury risk factor estimation skill: the ACL Injury Risk Estimation Quiz (ACL-IQ). Study Design: Cohort study (diagnosis); Level of evidence, 3. Methods: A total of 660 individuals participated in various stages of the study, including athletes, physicians, physical therapists, athletic trainers, exercise science researchers/students, and members of the general public in the United States. The ACL-IQ was fully computerized and made available online (www.ACL-IQ.org). Item sampling/reduction, reliability analysis, cross-validation, and convergent/discriminant validity analyses were conducted to refine the efficiency and validity of the assessment. Results: Psychometric optimization techniques identified a short (mean time, 2 min 24 s), robust, 5-item assessment with high reliability (test-retest: r = 0.90) and high test sensitivity (average difference of exercise science professionals vs general population: Cohen d = 2). Exercise science professionals and individuals from the general population scored 74% and 53% correct, respectively. Convergent and discriminant validity was demonstrated. Scores on the ACL-IQ were best predicted by ACL knowledge and specific judgment strategies (ie, cue use) and were largely unrelated to domain-general spatial/decision-making ability, personality, or other demographic variables. Overall, 23% of the total sample (40% of exercise science professionals; 6% of general population) performed better than or equal to the ACL nomogram. Conclusion: This study presents the results of a systematic approach to assess individual differences in ACL injury risk factor estimation skill; the assessment approach is efficient (ie, it can be completed in <3 min) and psychometrically robust. The results provide evidence that some individuals have the ability to visually estimate ACL injury risk factors more accurately than other instrument-based ACL risk estimation methods (ie, ACL nomogram). The ACL-IQ provides the foundation for assessing the efficacy of observational ACL injury risk factor assessment (ie, does simple skilled visual inspection reduce ACL injuries?). The ACL-IQ can also be used to increase our understanding of the perceptual-cognitive mechanisms underlying injury risk assessment expertise, which can be leveraged to accelerate learning and improve performance.
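For readers unfamiliar with the effect size reported above (Cohen d), a short sketch of how it is computed from two groups' scores; the numbers below are synthetic, not study data:

```python
import statistics

def cohens_d(group1, group2):
    """Effect size: difference of group means divided by the pooled
    standard deviation."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

experts = [74, 78, 70, 76, 72]   # synthetic % correct scores
general = [53, 50, 57, 51, 54]
print(f"d = {cohens_d(experts, general):.2f}")
```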
International Conference on Cluster Computing | 2014
Eduardo Berrocal; Li Yu; Sean Wallace; Michael E. Papka; Zhiling Lan
Mean Time Between Failures (MTBF), now measured in days or hours, is expected to drop to minutes on exascale machines. The advancement of resilience technologies greatly depends on a deeper understanding of faults arising from hardware and software components. This understanding has the potential to help us build better fault-tolerance technologies. For instance, it has been shown that combining checkpointing with failure prediction leads to longer checkpoint intervals, which in turn leads to fewer total checkpoints. In this paper we present a new approach for fault detection based on the Void Search (VS) algorithm. VS is used primarily in astrophysics for finding areas of space that have a very low density of galaxies. We evaluate our algorithm using real environmental logs from the Mira Blue Gene/Q supercomputer at Argonne National Laboratory. Our experiments show that our approach can detect almost all faults (i.e., sensitivity close to 1) with a low false-positive rate (i.e., specificity values above 0.7). We also compare our algorithm with a number of existing detection algorithms and find that ours outperforms all of them.
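As a rough illustration of the underlying void idea (my simplification, not the authors' algorithm), the 1D sketch below histograms healthy sensor readings and reports the sparse "voids" between occupied regions; a new reading landing in such a void would be flagged as a potential fault:

```python
def find_voids(values, bin_width, min_count=1):
    """Histogram readings and return the (low, high) ranges of bins with
    fewer than min_count samples; these sparse regions are the 'voids'."""
    lo = min(values)
    nbins = int((max(values) - lo) / bin_width) + 1
    counts = [0] * nbins
    for v in values:
        counts[int((v - lo) / bin_width)] += 1
    return [(lo + i * bin_width, lo + (i + 1) * bin_width)
            for i in range(nbins) if counts[i] < min_count]

healthy = [25.0, 25.3, 25.1, 24.8, 31.9, 32.2, 32.0]  # e.g., coolant temps (C)
print(find_voids(healthy, bin_width=1.0))  # the gap near 26-31 C is a void
```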
International Conference on Cluster Computing | 2015
Sean Wallace; Venkatram Vishwanath; Susan Coghlan; Zhiling Lan; Michael E. Papka
The high performance computing landscape is filled with diverse hardware components. A large part of understanding how these components compare is examining their environmental aspects, such as power consumption and temperature. Thankfully, hardware vendors have supported this by providing mechanisms to obtain this data. However, products commonly differ not only in how this data is obtained but also in what data is provided. In this paper, we take a comprehensive look at the data that is available for the most common pieces of today's HPC landscape, as well as how this data is obtained and how accurate it is. Having surveyed these components, we compare and contrast them, noting key differences as well as providing insight into what features future components should have.
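As one concrete example of such a vendor mechanism (my addition; the paper surveys several), Intel exposes RAPL energy counters through the Linux powercap sysfs interface, which can be read directly (reading energy_uj may require elevated privileges on recent kernels):

```python
import time

RAPL = "/sys/class/powercap/intel-rapl:0"  # CPU package 0 on Linux

def read_uj(path):
    with open(path) as f:
        return int(f.read())

def package_watts(interval_s=1.0):
    """Average package power over an interval, derived from the
    cumulative RAPL energy counter (handling counter wraparound)."""
    max_uj = read_uj(f"{RAPL}/max_energy_range_uj")
    e0 = read_uj(f"{RAPL}/energy_uj")
    time.sleep(interval_s)
    e1 = read_uj(f"{RAPL}/energy_uj")
    return ((e1 - e0) % max_uj) / 1e6 / interval_s

print(f"package power: {package_watts():.1f} W")
```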
Journal of Parallel and Distributed Computing | 2015
Li Yu; Zhou Zhou; Sean Wallace; Michael E. Papka; Zhiling Lan
As high performance computing (HPC) continues to grow in scale and complexity, energy becomes a critical constraint in the race to exascale computing. The days of “performance at all cost” are coming to an end. While performance is still a major objective, future HPC will have to deliver desired performance under an energy constraint. Among various power management methods, power capping is a widely used approach. Unfortunately, the impact of power capping on system performance, user jobs, and power-performance efficiency is not well studied due to many interfering factors imposed by system workload and configurations. To fully understand power management in extreme-scale systems with a fixed power budget, we introduce a power-performance modeling tool named PuPPET (Power Performance PETri net). Unlike traditional performance modeling approaches such as analytical methods or trace-based simulators, we explore a new approach, colored Petri nets, for the design of PuPPET. PuPPET is fast and extensible for navigating through different configurations. More importantly, it can scale to hundreds of thousands of processor cores while providing high levels of modeling accuracy. We validate PuPPET using system traces (i.e., workload logs and power data) collected from the production 48-rack IBM Blue Gene/Q supercomputer at Argonne National Laboratory. Our trace-based validation demonstrates that PuPPET is capable of modeling the dynamic execution of parallel jobs on the machine by providing an accurate approximation of energy consumption. In addition, we present two case studies of using PuPPET to study power-performance tradeoffs on petascale systems.
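PuPPET itself models these dynamics with colored Petri nets; the toy function below (my illustration, not the tool) only sketches the tradeoff such a model navigates, assuming a job's runtime stretches when capped below its nominal power draw:

```python
def runtime_under_cap(base_runtime_s, nominal_kw, cap_kw, slowdown_exp=1.0):
    """Toy power-capping model: runtime dilates with the power deficit."""
    if cap_kw >= nominal_kw:
        return base_runtime_s
    return base_runtime_s * (nominal_kw / cap_kw) ** slowdown_exp

for cap in (100, 80, 60):
    t = runtime_under_cap(3600, nominal_kw=100, cap_kw=cap)
    print(f"cap {cap} kW -> runtime {t:.0f} s, energy {cap * t / 3600:.0f} kWh")
```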
Archive | 2015
Erich J. Petushek; Edward T. Cokely; Paul Ward; John J. Durocher; Sean Wallace; Gregory D. Myer
Archive | 2014
Erich J. Petushek; Edward T. Cokely; Paul Ward; Gregory D. Myer; Sean Wallace