
Publications


Featured research published by Justin M. Wozniak.


Parallel Computing | 2011

Swift: A language for distributed parallel scripting

Michael Wilde; Mihael Hategan; Justin M. Wozniak; Ben Clifford; Daniel S. Katz; Ian T. Foster

Scientists, engineers, and statisticians must execute domain-specific application programs many times on large collections of file-based data. This activity requires complex orchestration and data management as data is passed to, from, and among application invocations. Distributed and parallel computing resources can accelerate such processing, but their use further increases programming complexity. The Swift parallel scripting language reduces these complexities by making file system structures accessible via language constructs and by allowing ordinary application programs to be composed into powerful parallel scripts that can efficiently utilize parallel and distributed resources. We present Swift's implicitly parallel and deterministic programming model, which applies external applications to file collections using a functional style that abstracts and simplifies distributed parallel execution.
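The implicitly parallel dataflow model described above can be approximated in plain Python. The sketch below is an analogy only, not Swift syntax or its runtime: futures stand in for file-valued variables, so each analysis step waits only on the simulation it consumes. The names `simulate` and `analyze` are hypothetical stand-ins for external application invocations.

```python
# Analogy (not Swift): implicit parallelism over a collection, with
# per-item dependencies expressed through futures.
from concurrent.futures import ThreadPoolExecutor

def simulate(params):
    # stand-in for an external application producing an output "file"
    return {"output": params * 2}

def analyze(sim_result):
    # stand-in for a downstream application consuming that output
    return sim_result["output"] + 1

def run_campaign(param_list):
    with ThreadPoolExecutor() as pool:
        # foreach-style parallelism over the parameter collection
        sim_futures = [pool.submit(simulate, p) for p in param_list]
        # each analyze depends only on its own simulate result
        ana_futures = [pool.submit(analyze, f.result()) for f in sim_futures]
        return [f.result() for f in ana_futures]

results = run_campaign([1, 2, 3])
```

In Swift itself the dependency graph is inferred from variable usage rather than built explicitly with futures, and the "applications" are real programs operating on files.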


IEEE International Conference on Utility and Cloud Computing | 2011

Coasters: Uniform Resource Provisioning and Access for Clouds and Grids

Mihael Hategan; Justin M. Wozniak; Ketan Maheshwari

In this paper we present the Coaster System, an automatically deployed node provisioning (Pilot Job) system for grids, clouds, and ad hoc desktop-computer networks that supports file staging, on-demand opportunistic multi-node allocation, remote logging, and remote monitoring. The Coaster System has previously been shown [32] to work at scales of thousands of cores. It has been used since 2009 for applications in fields that include biochemistry, earth systems science, energy modeling, and neuroscience. The system has been used successfully on the Open Science Grid, the TeraGrid [1], supercomputers (IBM Blue Gene/P [15], Cray XT and XE systems [5], and Sun Constellation [26]), a number of smaller clusters, and three cloud infrastructures (BioNimbus [2], FutureGrid [20], and Amazon EC2 [16]).


Fundamenta Informaticae | 2013

Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications

Justin M. Wozniak; Timothy G. Armstrong; Ketan Maheshwari; Ewing L. Lusk; Daniel S. Katz; Michael Wilde; Ian T. Foster

Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly-coupled parallel function or program. “Many-task” programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower level through multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and intertask data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
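The central mechanism of a dataflow engine, releasing a task only once all of its input data items exist, can be sketched in a few lines. This is a toy, single-process illustration of the idea; it is not Turbine's architecture or API, and all names are invented.

```python
# Toy dataflow engine: tasks register the data IDs they need, and fire
# automatically once every input has been stored.
class DataflowEngine:
    def __init__(self):
        self.values = {}    # resolved data items, by ID
        self.waiting = []   # (needed_ids, fn, out_id) not yet runnable

    def task(self, needed, fn, out):
        self.waiting.append((needed, fn, out))
        self._fire()

    def store(self, data_id, value):
        self.values[data_id] = value
        self._fire()

    def _fire(self):
        # keep releasing tasks until no more become runnable
        progressed = True
        while progressed:
            progressed = False
            for t in list(self.waiting):
                needed, fn, out = t
                if all(n in self.values for n in needed):
                    self.waiting.remove(t)
                    self.values[out] = fn(*[self.values[n] for n in needed])
                    progressed = True

engine = DataflowEngine()
engine.task(["a", "b"], lambda x, y: x + y, "sum")
engine.task(["sum"], lambda s: s * 10, "scaled")
engine.store("a", 2)
engine.store("b", 3)  # releases both tasks, in dependency order
```

Turbine distributes this bookkeeping itself, partitioning the data store and task queues across many servers so no single node holds the whole graph.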


IEEE International Conference on High Performance Computing, Data, and Analytics | 2012

Design and analysis of data management in scalable parallel scripting

Zhao Zhang; Daniel S. Katz; Justin M. Wozniak; Allan Espinosa; Ian T. Foster

We seek to enable efficient large-scale parallel execution of applications in which a shared filesystem abstraction is used to couple many tasks. Such parallel scripting (many-task computing, MTC) applications suffer poor performance and utilization on large parallel computers because of the volume of filesystem I/O and a lack of appropriate optimizations in the shared filesystem. Thus, we design and implement a scalable MTC data management system that uses aggregated compute node local storage for more efficient data movement strategies. We co-design the data management system with the data-aware scheduler to enable dataflow pattern identification and automatic optimization. The framework reduces the time to solution of parallel stages of an astronomy data analysis application, Montage, by 83.2% on 512 cores; decreases the time to solution of a seismology application, CyberShake, by 7.9% on 2,048 cores; and delivers BLAST performance better than mpiBLAST at various scales up to 32,768 cores, while preserving the flexibility of the original BLAST application.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2014

Compiler techniques for massively scalable implicit task parallelism

Timothy G. Armstrong; Justin M. Wozniak; Michael Wilde; Ian T. Foster

Swift/T is a high-level language for writing concise, deterministic scripts that compose serial or parallel codes implemented in lower-level programming models into large-scale parallel applications. It executes using a data-driven task parallel execution model that is capable of orchestrating millions of concurrently executing asynchronous tasks on homogeneous or heterogeneous resources. Producing code that executes efficiently at this scale requires sophisticated compiler transformations: poorly optimized code inhibits scaling with excessive synchronization and communication. We present a comprehensive set of compiler techniques for data-driven task parallelism, including novel compiler optimizations and intermediate representations. We report application benchmark studies, including unbalanced tree search and simulated annealing, and demonstrate that our techniques greatly reduce communication overhead and enable extreme scalability, distributing up to 612 million dynamically load balanced tasks per second at scales of up to 262,144 cores without explicit parallelism, synchronization, or load balancing in application code.


Petascale Data Storage Workshop | 2009

Case studies in storage access by loosely coupled petascale applications

Justin M. Wozniak; Michael Wilde

A large number of real-world scientific applications can be characterized as loosely coupled: the communication among tasks is infrequent and can be performed by using file operations. While these applications may be ported to large-scale machines designed for tightly coupled, massively parallel jobs, direct implementations do not perform well because of the large number of small, latency-bound file accesses. This problem may be overcome through the use of a variety of custom, hand-coded strategies applied at various subsystems of modern near-petascale computers, but this is a labor-intensive process that will become increasingly difficult at the petascale and beyond. This work profiles the essential operations in the I/O workload for five loosely coupled scientific applications. We characterize the I/O workload induced by these applications and offer an analysis to motivate and aid the development of programming tools, I/O subsystems, and filesystems.


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2013

Swift/T: scalable data flow programming for many-task applications

Justin M. Wozniak; Timothy G. Armstrong; Michael Wilde; Daniel S. Katz; Ewing L. Lusk; Ian T. Foster

Swift/T, a novel programming language implementation for highly scalable data flow programs, is presented.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2013

Parallelizing the execution of sequential scripts

Zhao Zhang; Daniel S. Katz; Timothy G. Armstrong; Justin M. Wozniak; Ian T. Foster

Scripting is often used in science to create applications via the composition of existing programs. Parallel scripting systems allow the creation of such applications, but each system introduces the need to adopt a somewhat specialized programming model. We present an alternative scripting approach, AMFS Shell, that lets programmers express parallel scripting applications via minor extensions to existing sequential scripting languages, such as Bash, and then execute them in-memory on large-scale computers. We define a small set of commands between the scripts and a parallel scripting runtime system, so that programmers can compose their scripts in a familiar scripting language. The underlying AMFS implements both collective (fast file movement) and functional (transformation based on content) file management. Tasks are handled by AMFS's built-in execution engine. AMFS Shell is expressive enough for a wide range of applications, and the framework can run such applications efficiently on large-scale computers.
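The general pattern of parallelizing a sequential script's independent loop iterations can be loosely illustrated as follows. This Python sketch is not AMFS Shell: the per-item work stays an ordinary command, and a small shim fans the iterations out over a pool. It assumes a POSIX `echo` command is available for the demo.

```python
# Pattern sketch: each iteration of a sequential script's loop becomes an
# independent command run by a pool, with output order preserved.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_cmd(arg):
    # one iteration: an ordinary command, as it would appear in a script
    out = subprocess.run(["echo", arg], capture_output=True, text=True)
    return out.stdout.strip()

def parallel_foreach(args):
    # map preserves input order even though commands run concurrently
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_cmd, args))
```

AMFS Shell goes further by also managing the intermediate files in memory, which is where most of the performance on large machines comes from.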


Proceedings of the First Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization | 2015

Lessons Learned from Building In Situ Coupling Frameworks

Matthieu Dorier; Matthieu Dreher; Tom Peterka; Justin M. Wozniak; Gabriel Antoniu; Bruno Raffin

Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and visualization. In situ processing began as the coupling of a parallel simulation with an analysis or visualization library, motivated primarily by avoiding the high cost of accessing storage. Going beyond this simple pairwise tight coupling, complex analysis workflows today are graphs with one or more data sources and several interconnected analysis components. In this paper, we review four tools that we have developed to address the challenges of coupling simulations with visualization packages or analysis workflows: Damaris, Decaf, FlowVR and Swift. This self-critical inquiry aims to shed light not only on their potential, but most importantly on the forthcoming software challenges that these and other in situ analysis and visualization frameworks will face in order to move toward exascale.


Proceedings of the 20th European MPI Users' Group Meeting | 2013

Dataflow coordination of data-parallel tasks via MPI 3.0

Justin M. Wozniak; Tom Peterka; Timothy G. Armstrong; James Dinan; Ewing L. Lusk; Michael Wilde; Ian T. Foster

Scientific applications are often complex collections of many large-scale tasks. Mature tools exist for describing task-parallel workflows consisting of serial tasks, and a variety of tools exist for programming a single data-parallel operation. However, few tools cover the intersection of these two models. In this work, we extend the load balancing library ADLB to support parallel tasks. We demonstrate how applications can easily be composed of parallel tasks using Swift dataflow scripts, which are compiled to ADLB programs with performance comparable to hand-coded equivalents. By combining this framework with data-parallel analysis libraries, we are able to dynamically execute many instances of a parallel data analysis application in support of a parameter exploration workload.
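The master/worker load-balancing idea behind ADLB, in which idle workers pull the next available task rather than being assigned work up front, can be sketched with a shared queue. This toy uses threads within one process, whereas ADLB itself distributes tasks across MPI ranks; the names here are illustrative, not ADLB's API.

```python
# Pull-based load balancing sketch: a shared queue hands each task to
# whichever worker asks next, so fast workers naturally take more work.
import queue
import threading

def run_tasks(tasks, n_workers=4):
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                t = q.get_nowait()   # pull the next available task
            except queue.Empty:
                return               # no work left: worker exits
            r = t()
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)           # completion order is nondeterministic
```

The paper's extension is to let a single "task" itself be a parallel (multi-rank) job, which a thread-based toy like this cannot express.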

Collaboration


An overview of Justin M. Wozniak's collaborations.

Top Co-Authors

Michael Wilde, Argonne National Laboratory
Ian T. Foster, Argonne National Laboratory
Ketan Maheshwari, Argonne National Laboratory
Robert B. Ross, Argonne National Laboratory
Ewing L. Lusk, Argonne National Laboratory
Matthew Krogstad, Northern Illinois University
Stephan Rosenkranz, Argonne National Laboratory
Daniel Phelan, Argonne National Laboratory