Lucas A. Wilson
University of Texas at Austin
Publications
Featured research published by Lucas A. Wilson.
extreme science and engineering discovery environment | 2014
Lucas A. Wilson; John M. Fonner
Petascale computing systems have enabled tremendous advances for traditional simulation and modeling algorithms that are built around parallel execution. Unfortunately, scientific domains using data-oriented or high-throughput paradigms have difficulty taking full advantage of these resources without custom software development. This paper describes our solution for rapidly developing parallel parametric studies using sequential or threaded tasks: the Launcher. We detail how to get ensembles executing quickly through the common job schedulers SGE and SLURM, and the various user-customizable options that the Launcher provides. We illustrate the efficiency of our tool by presenting execution results at large scale (over 65,000 cores) for varying workloads, including a virtual screening workload with indeterminate runtimes using the drug docking software AutoDock Vina.
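The scheduling idea at the heart of the paper (treat each line of a job file as an independent command and hand lines to whichever worker is free) can be sketched in a few lines of Python. This is an illustrative toy, not the Launcher implementation; the job-file name and worker count here are arbitrary.

```python
# Toy sketch of a launcher-style dynamic dispatcher: each line of the
# job file is an independent shell command, handed to the next free
# worker. Illustrative only -- not the actual Launcher implementation.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_task(line):
    # Each task is a self-contained shell command (e.g., one docking run).
    return subprocess.run(line, shell=True).returncode

def launch(job_file, workers=4):
    with open(job_file) as f:
        tasks = [l.strip() for l in f if l.strip() and not l.startswith("#")]
    # Dynamic scheduling tolerates indeterminate runtimes (cf. AutoDock Vina):
    # a long task occupies one worker while short tasks flow through the rest.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, tasks))

if __name__ == "__main__":
    launch("jobfile")  # jobfile: one command per line
```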
international parallel and distributed processing symposium | 2008
Lucas A. Wilson
As high performance and distributed computing become more important tools for enabling scientists and engineers to solve large computational problems, the need for methods to fairly and efficiently schedule tasks across multiple, possibly geographically distributed, computing resources becomes more crucial. Given the nature of distributed systems and the immense numbers of resources to be managed in distributed and large-scale cluster environments, traditional centralized schedulers cannot effectively provide timely scheduling information. In order to manage large numbers of resources quickly, less computationally intensive methods for scheduling tasks must be explored. This paper proposes a novel resource management system based on the immune system metaphor, making use of concepts from Immune Network Theory and Danger Theory. By emulating various elements of the immune system, the proposed manager can efficiently execute tasks on very large systems of heterogeneous resources across geographic and/or administrative domains. The distributed nature of the immune system is also exploited to allow efficient scheduling of tasks, even in extremely large environments, without the use of a centralized or hierarchical scheduler.
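One possible reading of the metaphor, reduced to a toy: each resource computes its own "affinity" to an incoming task (the antigen), and the best-matching idle resource claims it. The affinity function, resource attributes, and the centralized claim() helper below are all hypothetical simplifications; the paper's actual mechanism is fully decentralized.

```python
# Toy reading of the immune-system metaphor: each resource scores its own
# affinity to an incoming task (antigen), and the best-matching idle
# resource claims it. Hypothetical illustration, not the paper's algorithm;
# in a decentralized version each resource would score itself locally.
class Resource:
    def __init__(self, name, cores, mem_gb):
        self.name, self.cores, self.mem_gb = name, cores, mem_gb
        self.busy = False

    def affinity(self, task):
        # Zero affinity if requirements aren't met; otherwise prefer the
        # tightest fit, leaving large resources free for large tasks.
        if self.busy or self.cores < task["cores"] or self.mem_gb < task["mem_gb"]:
            return 0.0
        return 1.0 / (1 + (self.cores - task["cores"]) + (self.mem_gb - task["mem_gb"]))

def claim(resources, task):
    best = max(resources, key=lambda r: r.affinity(task))
    if best.affinity(task) > 0:
        best.busy = True
        return best.name
    return None  # no resource can host the task: a "danger signal"

pool = [Resource("a", 4, 8), Resource("b", 16, 64), Resource("c", 8, 16)]
print(claim(pool, {"cores": 6, "mem_gb": 12}))  # -> "c" (tightest fit)
```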
international conference on parallel processing | 2014
Lucas A. Wilson; Jeffery von Ronne
High Performance Computing (HPC) systems now consist of many thousands of individual servers. While relatively scalable and cost effective, these systems suffer from a complexity of scale that will not improve with increasing machine size. It will become increasingly difficult, if not impossible, for HPC systems to maintain node availability long enough for worthwhile scientific calculations to be performed. Existing execution and programming models, which depend on guaranteed hardware reliability, are not well suited to future distributed memory parallel systems where hardware reliability cannot be guaranteed. We propose a distributed dataflow execution model which utilizes a distributed dictionary for data memoization, allowing each parallel task to schedule instructions without direct inter-task coordination. We provide a description of the proposed execution model, including program formulation and autonomous dataflow task selection. Experiments demonstrate the proposed model's ability to automatically distribute work across tasks, as well as its ability to scale in both shared-memory and distributed-memory environments.
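A minimal sketch of the memoization idea, assuming a 1-D three-point stencil as the example program: every value is keyed by (step, index) in a shared dictionary, and an instruction may fire as soon as its operands exist, in any order. A plain Python dict stands in for the distributed dictionary.

```python
# Minimal sketch of memoized dataflow: values are keyed by (step, index);
# any task may fire an instruction once its operands exist, with no direct
# inter-task coordination. Hypothetical illustration of the model.
import random

store = {}  # stands in for a *distributed* dictionary (e.g., a key-value store)

def ready(step, i, n):
    # A 3-point stencil instruction is enabled once its operands exist.
    return all((step - 1, j) in store
               for j in (max(i - 1, 0), i, min(i + 1, n - 1)))

def fire(step, i, n):
    left  = store[(step - 1, max(i - 1, 0))]
    here  = store[(step - 1, i)]
    right = store[(step - 1, min(i + 1, n - 1))]
    store[(step, i)] = (left + here + right) / 3.0   # memoized: written once

n, steps = 8, 3
for i in range(n):
    store[(0, i)] = float(i)                          # initial condition

# Instructions fire in arbitrary order; correctness comes from the data
# dependencies, not from coordination between tasks.
pending = [(s, i) for s in range(1, steps + 1) for i in range(n)]
random.shuffle(pending)
while pending:
    next_pending = []
    for p in pending:
        if ready(*p, n):
            fire(*p, n)
        else:
            next_pending.append(p)
    pending = next_pending
print([store[(steps, i)] for i in range(n)])
```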
ieee international conference on high performance computing data and analytics | 2011
Lucas A. Wilson; John A. Lockman
The possibility of hardware failures occurring during the execution of application software continues to increase along with the scale of modern systems. Existing parallel development approaches cannot effectively recover from these errors except by means of expensive checkpoint/restart files. As a result, many CPU hours of scientific simulation are lost due to hardware failures. Relentless Computing is a data-oriented approach to software development that allows many classes of distributed and parallel algorithms, from no data-sharing to intense data-sharing, to be solved in both loosely- and tightly-coupled environments. Each process requires no knowledge of the current runtime status of the others to begin contributing, meaning that the execution pool can shrink and grow, as well as recover from hardware failure, automatically. We present motivations for the development of Relentless Computing, how it works, examples of using Relentless Computing to solve several types of problems, and initial scaling results.
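The recovery property can be illustrated with a toy: workers look only at the shared store, fill in any missing value whose input is present, and write idempotently, so the pool can shrink, grow, or restart without coordination. The chain-of-values computation below is a made-up stand-in for a real workload.

```python
# Sketch of the data-oriented idea: a worker needs no knowledge of other
# workers' status -- it inspects only the shared store, computes any
# missing value whose input exists, and writes idempotently. Workers can
# join, leave, or crash and restart; the store converges either way.
# Illustrative toy, not the Relentless Computing system itself.
import threading

store = {0: 1.0}   # seed value; in practice a fault-tolerant shared store
lock = threading.Lock()
N = 20

def worker():
    progressed = True
    while progressed:
        progressed = False
        for k in range(1, N + 1):
            with lock:
                if k not in store and (k - 1) in store:
                    # Idempotent put: recomputing after a crash is harmless.
                    store[k] = store[k - 1] * 0.5
                    progressed = True

# The pool can shrink or grow: 1 worker or 8 yield the same final store.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(store[N])
```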
parallel computing | 2016
Lucas A. Wilson; Jeffery von Ronne
Highlights: We describe a novel model for executing distributed memory parallel programs using uncoordinated tasks. We describe several off-line optimizations for the proposed model. We examine the effects of these optimizations on modern processors with wider vector units. Increasing levels of task coalescence can improve throughput and increase performance. Increases in performance are observed in both single-node and multi-node experiments.

We propose a distributed dataflow execution model which utilizes a distributed dictionary for data memoization, allowing each parallel task to schedule instructions without direct inter-task coordination. We provide a description of the proposed model, including autonomous dataflow task selection. We also describe a set of optimization strategies which improve the overall throughput of stencil programs executed using this model on modern multi-core and vectorized architectures.
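The coalescence optimization can be sketched concretely, assuming a 1-D averaging stencil: rather than one dataflow task per cell, each task updates a contiguous block of cells with one vectorized expression, so each scheduling decision feeds more work to the vector units. The block size below is an arbitrary tunable, not a value from the paper.

```python
# Sketch of task coalescence: one task per *block* of stencil cells
# instead of one per cell, so wide vector units do more work per
# scheduling decision. Illustrative; block size is an assumed tunable.
import numpy as np

def fine_grained(u):
    # One "task" per interior cell: heavy scheduling overhead, scalar math.
    v = u.copy()
    for i in range(1, len(u) - 1):
        v[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return v

def coalesced(u, block=1024):
    # One "task" per block: the same stencil, but each firing covers
    # `block` cells with a single vectorized expression.
    v = u.copy()
    for start in range(1, len(u) - 1, block):
        stop = min(start + block, len(u) - 1)
        v[start:stop] = (u[start-1:stop-1] + u[start:stop] + u[start+1:stop+1]) / 3.0
    return v

u = np.linspace(0.0, 1.0, 1 << 16)
assert np.allclose(fine_grained(u), coalesced(u))  # same stencil, fewer tasks
```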
Journal of Open Source Software | 2017
Lucas A. Wilson; John M. Fonner; Jason Allison; Oscar Esteban; Harry Kenya; Marshall Lerner
Launcher (Wilson and Fonner 2014; Wilson 2016; Wilson 2017) is a utility for performing simple, data parallel, high throughput computing (HTC) workflows on clusters, massively parallel processor (MPP) systems, workgroups of computers, and personal machines. It can be easily run from userspace or installed for system-wide use on shared resources. Launcher will perform automatic process binding on multi-/many-core architectures where hwloc (“Portable Hardware Locality (hwloc)” n.d.) is installed.
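For illustration, the kind of per-task core binding that hwloc enables might look like the following on Linux, using the standard-library sched_setaffinity call. This is a hypothetical sketch, not Launcher's binding logic, and the round-robin core assignment is an assumption.

```python
# Sketch of per-task process binding on a multi-core node, the sort of
# placement hwloc makes portable. Uses Linux's sched_setaffinity;
# illustrative only -- Launcher's own binding logic is more general.
import os
import subprocess

def bind_and_run(cmd, task_id, cores_per_task=2):
    ncores = os.cpu_count()
    first = (task_id * cores_per_task) % ncores
    cpus = {(first + c) % ncores for c in range(cores_per_task)}

    def pin():  # runs in the child process just before exec
        os.sched_setaffinity(0, cpus)

    return subprocess.Popen(cmd, shell=True, preexec_fn=pin)

procs = [bind_and_run(f"echo task {t}", t) for t in range(4)]
for p in procs:
    p.wait()
```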
Proceedings of the Second Workshop on Optimizing Stencil Computations | 2014
Lucas A. Wilson; Jeffery von Ronne
Many different scientific domains make use of stencil-based algorithms to solve mathematical equations for computational modeling and simulation. Existing imperative languages map well onto physical hardware, but can be difficult for domain scientists to map to mathematical stencil algorithms. StenSAL is a domain-specific language tailored to the expression of explicit stencil algorithms through deterministic tasks chained together by single-assignment data dependencies; it generates programs that map to the Relentless execution model of computation. We provide a description of the StenSAL language and grammar, some of the sanity checks that can be performed on StenSAL programs before code generation, and how the compiler translates StenSAL into Python.
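StenSAL's grammar is not reproduced here, but the shape of its target is easy to hedge a guess at: Python in which every stencil value is assigned exactly once and each assignment names its step t-1 dependencies explicitly. The heat-equation example below is hypothetical compiler output in that spirit, not actual StenSAL-generated code.

```python
# Hypothetical shape of compiler output for a single-assignment stencil:
# every value u[t][i] is written exactly once, and its data dependencies
# (step t-1 neighbors) are explicit in the assignment. The real StenSAL
# grammar and code generator are not reproduced here.
def heat_1d(u0, steps, alpha=0.1):
    n = len(u0)
    u = {0: dict(enumerate(u0))}          # u[t][i]: single-assignment cells
    for t in range(1, steps + 1):
        u[t] = {}
        for i in range(n):
            left  = u[t - 1][max(i - 1, 0)]
            here  = u[t - 1][i]
            right = u[t - 1][min(i + 1, n - 1)]
            # One deterministic task, depending only on step t-1 values.
            u[t][i] = here + alpha * (left - 2 * here + right)
    return [u[steps][i] for i in range(n)]

print(heat_1d([0, 0, 1, 0, 0], steps=2))
```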
Big Data Research | 2017
Lucas A. Wilson
For many scientific disciplines, the transition to using advanced cyberinfrastructure comes not out of a desire to use the most advanced or most powerful resources available, but because their current operational model is no longer sufficient to meet their computational needs. Many researchers begin their computations on a desktop or local workstation, only to discover that simulating their problem, analyzing their instrument data, or scoring the multitude of entities they are interested in would require far more time than they have available.

Launcher is a simple utility that enables the execution of high throughput computing workloads on managed HPC systems quickly and with as little effort as possible on the part of the user. Basic usage of Launcher is straightforward, but it also provides several more advanced capabilities, including use of Intel Xeon Phi coprocessor cards and task binding support for multi-/many-core architectures. We step through the process of setting up a basic Launcher job, including creating a job file, setting appropriate environment variables, and using scheduler integration. We also describe how to enable use of the Intel Xeon Phi coprocessor cards, take advantage of Launcher's task binding system, and execute many parallel (OpenMP/MPI) applications at once.
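As a sketch of those setup steps driven from Python rather than a batch script: create the job file, export the configuration variables, and start the run. The variable names (LAUNCHER_DIR, LAUNCHER_JOB_FILE, LAUNCHER_WORKDIR) and the paramrun entry point follow the Launcher README as I recall it; treat them, and the score_entity command, as assumptions.

```python
# Minimal sketch of driving a Launcher run from Python. The variable
# names (LAUNCHER_DIR, LAUNCHER_JOB_FILE, LAUNCHER_WORKDIR) and the
# paramrun entry point follow the Launcher README; treat them as
# assumptions here, not a verified API. score_entity is hypothetical.
import os
import pathlib
import subprocess

workdir = pathlib.Path.cwd()
jobfile = workdir / "jobfile"

# 1. Create the job file: one independent command per line.
jobfile.write_text("\n".join(
    f"./score_entity --id {i} > out.{i}" for i in range(100)
) + "\n")

# 2. Set the environment Launcher reads its configuration from.
env = dict(os.environ,
           LAUNCHER_WORKDIR=str(workdir),
           LAUNCHER_JOB_FILE=str(jobfile))

# 3. Start the run; under SLURM/SGE this line would sit inside the batch
#    script so Launcher can pick up the scheduler's node list itself.
subprocess.run([os.environ.get("LAUNCHER_DIR", "/opt/launcher") + "/paramrun"],
               env=env, check=True)
```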
ieee international conference on high performance computing data and analytics | 2016
Lucas A. Wilson; S. Charlie Dey
Archive | 2010
Lucas A. Wilson; Michael Scherger; John A. Lockman