
Publication


Featured research published by Jonathan C. Beard.


Programming Models and Applications for Multicores and Manycores | 2015

RaftLib: a C++ template library for high performance stream parallel processing

Jonathan C. Beard; Peng Li; Roger D. Chamberlain

Stream processing or data-flow programming is a compute paradigm that has been around for decades in many forms, yet has failed to garner the same attention as other mainstream languages and libraries (e.g., C++ or OpenMP [15]). Stream processing has great promise: the ability to safely exploit extreme levels of parallelism. There have been many implementations, both libraries and full languages. The full languages implicitly assume that the streaming paradigm cannot be fully exploited in legacy languages, while library approaches are often preferred for being integrable with the vast expanse of legacy code that exists in the wild. Libraries, however, are often criticized for yielding to the shape of their respective languages. RaftLib aims to fully exploit the stream processing paradigm, enabling a full spectrum of streaming graph optimizations while providing a platform for the exploration of integrability with legacy C/C++ code. RaftLib is built as a C++ template library, enabling end users to utilize the robust C++ standard library along with RaftLib's pipeline parallel framework. RaftLib supports dynamic queue optimization, automatic parallelization, and real-time low overhead performance monitoring.
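
The paradigm the abstract describes is compute kernels connected by FIFO streams and executed in parallel. As a rough, hedged illustration of that pattern only, the sketch below builds a two-stage pipeline in plain C++ with a mutex-guarded channel; it does not use RaftLib's actual API, and the Channel type is a hypothetical stand-in for a stream between kernels.

// Minimal pipeline-parallel sketch in plain C++ (not RaftLib's API):
// a producer kernel streams integers to a consumer kernel over a
// FIFO channel guarded by a mutex and condition variable.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class Channel {            // hypothetical FIFO "stream" between kernels
public:
    void push(T v) {
        { std::lock_guard<std::mutex> g(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> g(m_); closed_ = true; }
        cv_.notify_all();
    }
    std::optional<T> pop() {   // blocks until data arrives or channel closes
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&]{ return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;
        T v = std::move(q_.front());
        q_.pop();
        return v;
    }
private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
};

int main() {
    Channel<int> stream;
    std::thread producer([&]{            // "source" kernel
        for (int i = 0; i < 10; ++i) stream.push(i * i);
        stream.close();
    });
    std::thread consumer([&]{            // "sink" kernel
        while (auto v = stream.pop()) std::cout << *v << '\n';
    });
    producer.join();
    consumer.join();
    return 0;
}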


Modeling, Analysis, and Simulation of Computer and Telecommunication Systems | 2013

Analysis of a Simple Approach to Modeling Performance for Streaming Data Applications

Jonathan C. Beard; Roger D. Chamberlain

Current state-of-the-art systems contain various types of multicore processors, General Purpose Graphics Processing Units (GPGPUs) and occasionally Digital Signal Processors (DSPs) or Field-Programmable Gate Arrays (FPGAs). With heterogeneity comes multiple abstraction layers that hide underlying complexity. While necessary to ease programmability of these systems, this hidden complexity makes quantitative performance modeling a difficult task. This paper outlines a computationally simple approach to modeling the overall throughput and buffering needs of a streaming application deployed on heterogeneous hardware.
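
One computationally simple way to estimate per-queue buffering of the kind the abstract describes is a closed-form queueing formula. The sketch below uses standard textbook M/M/1 results (utilization, mean occupancy, and mean latency via Little's law) as a hedged stand-in; the paper's actual model and parameters are not reproduced here.

// Hedged illustration of the kind of closed-form queueing estimate such
// models rely on: standard M/M/1 utilization and mean occupancy.
// (Generic textbook formulas; not the paper's exact model.)
#include <iostream>
#include <stdexcept>

struct QueueEstimate {
    double utilization;     // rho = lambda / mu
    double mean_in_system;  // L = rho / (1 - rho)
    double mean_wait;       // W = L / lambda  (Little's law)
};

QueueEstimate mm1(double lambda, double mu) {
    if (lambda >= mu) throw std::invalid_argument("unstable: lambda >= mu");
    QueueEstimate e;
    e.utilization    = lambda / mu;
    e.mean_in_system = e.utilization / (1.0 - e.utilization);
    e.mean_wait      = e.mean_in_system / lambda;
    return e;
}

int main() {
    // e.g., 800 items/s arriving at a stage that services 1000 items/s
    const QueueEstimate est = mm1(800.0, 1000.0);
    std::cout << "rho = " << est.utilization
              << ", L = " << est.mean_in_system
              << ", W = " << est.mean_wait << " s\n";
    return 0;
}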


European Workshop on Performance Engineering | 2014

Use of a Lévy Distribution for Modeling Best Case Execution Time Variation

Jonathan C. Beard; Roger D. Chamberlain

Minor variations in execution time can lead to outsized effects on the behavior of an application as a whole. There are many sources of such variation within modern multi-core computer systems. For an otherwise deterministic application, we would expect the execution time variation to be non-existent (effectively zero). Unfortunately, this expectation is in error. For instance, variance in the realized execution time tends to increase as the number of processes per compute core increases. Recognizing that characterizing the exact variation or the maximal variation might be a futile task, we take a different approach, focusing instead on the best case variation. We propose a modified (truncated) Lévy distribution to characterize this variation. Using empirical sampling, we also derive a model to parametrize this distribution that doesn't require expensive distribution fitting, relying only on known parameters of the system. The distributional assumptions and parametrization model are evaluated on multi-core systems with the Linux Completely Fair Scheduler.
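
For context, a Lévy(mu, c) sample can be generated from a standard normal draw using the identity mu + c / Z^2. The sketch below uses that identity plus a simple rejection step to mimic truncation; the truncation bound and parameter values are illustrative assumptions, not the paper's parametrization.

// Sketch: drawing samples from a (truncated) Levy distribution using the
// standard identity that mu + c / Z^2 is Levy(mu, c) when Z ~ N(0, 1).
// The truncation bound below is an illustrative assumption, not the
// paper's parametrization.
#include <iostream>
#include <random>

double sample_truncated_levy(std::mt19937_64 &rng,
                             double mu, double c, double upper_bound) {
    std::normal_distribution<double> normal(0.0, 1.0);
    for (;;) {
        const double z = normal(rng);
        if (z == 0.0) continue;             // avoid division by zero
        const double x = mu + c / (z * z);  // Levy(mu, c) sample
        if (x <= upper_bound) return x;     // simple rejection-style truncation
    }
}

int main() {
    std::mt19937_64 rng(42);
    // e.g., best-case execution time offset mu = 1.0 us, scale c = 0.05,
    // truncated at 2.0 us (all values are illustrative)
    for (int i = 0; i < 5; ++i)
        std::cout << sample_truncated_levy(rng, 1.0, 0.05, 2.0) << '\n';
    return 0;
}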


Embedded and Ubiquitous Computing | 2011

Crossing Boundaries in TimeTrial: Monitoring Communications across Architecturally Diverse Computing Platforms

Joseph M. Lancaster; Joseph G. Wingbermuehle; Jonathan C. Beard; Roger D. Chamberlain

TimeTrial is a low-impact performance monitor that supports streaming data applications deployed on a variety of architecturally diverse computational platforms, including multicore processors and field-programmable gate arrays. Communication between resources in architecturally diverse systems is frequently a limitation to overall application performance. Understanding these bottlenecks is crucial to understanding overall application performance. Direct measurement of inter-resource communications channel occupancy is not readily achievable without significantly impacting performance of the application itself. Here, we present TimeTrial's approach to monitoring those queues that cross platform boundaries. Since the approach includes a combination of direct measurement and modeling, we also describe circumstances under which the model can be shown to be inappropriate. Examples with several micro-benchmark applications (for which the true measurement is known) and an application that uses Monte Carlo techniques to solve Laplace's equation are used for illustrative purposes.
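
As a hedged illustration of why low-impact monitoring avoids per-event instrumentation on the critical path, the sketch below samples an atomic occupancy counter from a separate monitor thread at a low rate. It is a generic pattern, not TimeTrial's implementation.

// Generic sketch of low-overhead occupancy monitoring (not TimeTrial's
// implementation): worker threads update an atomic counter, and a monitor
// thread samples it periodically rather than logging every queue event.
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    std::atomic<long> occupancy{0};
    std::atomic<bool> done{false};

    std::thread worker([&] {                 // stands in for the real pipeline
        for (int i = 0; i < 100000; ++i) {
            occupancy.fetch_add(1, std::memory_order_relaxed);  // enqueue
            occupancy.fetch_sub(1, std::memory_order_relaxed);  // dequeue
        }
        done.store(true);
    });

    std::thread monitor([&] {                // periodic, low-rate sampling
        do {
            std::cout << "sampled occupancy: "
                      << occupancy.load(std::memory_order_relaxed) << '\n';
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        } while (!done.load());
    });

    worker.join();
    monitor.join();
    return 0;
}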


Archive | 2013

Simple Analytic Performance Models for Streaming Data Applications Deployed on Diverse Architectures

Jonathan C. Beard; Roger D. Chamberlain; Mark A. Franklin

Modern hardware is inherently heterogeneous. With heterogeneity comes multiple abstraction layers that hide underlying complex systems. While hidden, this complexity makes quantitative performance modeling a difficult task. Designers of high-performance streaming applications for heterogeneous systems must contend with unpredictable and often non-generalizable models to predict performance of a particular application and hardware mapping. This paper outlines a computationally simple approach that can be used to model the overall throughput and buffering needs of a streaming application on heterogeneous hardware. The model presented is based upon a hybrid maximum flow and decomposed discrete queueing model. The utility of the model is assessed using a set of real and synthetic benchmarks with model predictions compared to measured application performance.
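
A hedged sketch of the intuition behind a maximum-flow-style throughput bound: for a linear pipeline, sustained throughput cannot exceed the slowest stage's service rate. The paper's hybrid model also accounts for buffering and more general topologies; the code below shows only this simple bound, with illustrative rates.

// Hedged sketch of a bottleneck-style throughput bound for a linear
// pipeline: sustained throughput cannot exceed the minimum stage service
// rate (the max-flow of a simple chain).
#include <algorithm>
#include <iostream>
#include <vector>

double pipeline_throughput_bound(const std::vector<double> &service_rates) {
    // items/sec each stage can sustain in isolation; the chain's capacity
    // is limited by its slowest stage
    return *std::min_element(service_rates.begin(), service_rates.end());
}

int main() {
    const std::vector<double> rates = {1200.0, 950.0, 2000.0};  // illustrative
    std::cout << "throughput bound: "
              << pipeline_throughput_bound(rates) << " items/s\n";
    return 0;
}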


International Journal of High Performance Computing Applications | 2017

RaftLib: A C++ template library for high performance stream parallel processing

Jonathan C. Beard; Peng Li; Roger D. Chamberlain

Stream processing is a compute paradigm that has been around for decades, yet until recently has failed to garner the same attention as other mainstream languages and libraries (e.g., C++, OpenMP, MPI). Stream processing has great promise: the ability to safely exploit extreme levels of parallelism to process huge volumes of streaming data. There have been many implementations, both libraries and full languages. The full languages implicitly assume that the streaming paradigm cannot be fully exploited in legacy languages, while library approaches are often preferred for being integrable with the vast expanse of extant legacy code. Libraries, however, are often criticized for yielding to the shape of their respective languages. RaftLib aims to fully exploit the stream processing paradigm, enabling a full spectrum of streaming graph optimizations, while providing a platform for the exploration of integrability with legacy C/C++ code. RaftLib is built as a C++ template library, enabling programmers to utilize the robust C++ standard library, and other legacy code, along with RaftLib's parallelization framework. RaftLib supports several online optimization techniques: dynamic queue optimization, automatic parallelization, and real-time low overhead performance monitoring.


European Conference on Parallel Processing | 2015

Online Automated Reliability Classification of Queueing Models for Streaming Processing Using Support Vector Machines

Jonathan C. Beard; Cooper Epstein; Roger D. Chamberlain

When do you trust a performance model? More specifically, when can a particular model be used for a specific application? Once a stochastic model is selected, its parameters must be determined. This involves instrumentation, data collection, and finally interpretation, all of which are very time consuming. Even when done correctly, the results hold only for the conditions under which the system was characterized. For modern, dynamic stream processing systems, this is far too slow if a model-based approach to performance tuning is to be considered. This work demonstrates the use of a Support Vector Machine (SVM) to determine if a stochastic queueing model is usable or not for a particular queueing station within a streaming application. When combined with methods for online service rate approximation, our SVM approach can select models while the application is executing (online). The method is tested on a variety of hardware and software platforms. The technique is shown to be highly effective for determining the applicability of M/M/1 and M/D/1 queueing models to stream processing applications.
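
At runtime, the classification step amounts to evaluating a trained decision function on features measured online. The sketch below shows a generic linear SVM decision rule, sign(w . x + b); the feature names, weights, and bias are illustrative placeholders, and the paper's actual feature set and kernel may differ.

// Generic linear SVM decision rule, sign(w . x + b): given features
// measured online for a queueing station, classify whether a given
// stochastic model is usable. Weights, bias, and feature names here are
// illustrative placeholders, not values from the paper.
#include <iostream>
#include <numeric>
#include <vector>

bool model_usable(const std::vector<double> &features,
                  const std::vector<double> &weights, double bias) {
    const double score = std::inner_product(features.begin(), features.end(),
                                            weights.begin(), bias);
    return score >= 0.0;   // positive side of the hyperplane => "usable"
}

int main() {
    // e.g., (utilization, coefficient of variation of service time,
    //        coefficient of variation of inter-arrival time)
    const std::vector<double> features = {0.7, 1.1, 0.9};
    const std::vector<double> weights  = {-2.0, -0.5, -0.4};  // placeholder
    const double bias = 2.5;                                  // placeholder
    std::cout << (model_usable(features, weights, bias)
                      ? "M/M/1 judged usable\n"
                      : "M/M/1 judged not usable\n");
    return 0;
}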


International Symposium on Performance Analysis of Systems and Software | 2013

Use of simple analytic performance models for streaming data applications deployed on diverse architectures

Jonathan C. Beard; Roger D. Chamberlain

Modern hardware is often heterogeneous. With heterogeneity comes multiple abstraction layers that hide underlying complex systems. This complexity makes quantitative performance modeling a difficult task. Designers of high-performance streaming applications for heterogeneous systems must contend with unpredictable and often non-generalizable models to predict performance of a particular application and hardware mapping. This paper outlines a computationally simple approach that can be used to model the overall throughput and buffering needs of a streaming application on heterogeneous hardware. The model presented is based upon a hybrid maximum flow and decomposed discrete queueing model. The utility of the model is assessed using a set of real and synthetic benchmarks with model predictions compared to measured application performance.


Programming Models and Applications for Multicores and Manycores | 2015

Deadlock-free buffer configuration for stream computing

Peng Li; Jonathan C. Beard; Jeremy Buhler

Stream computing is a popular paradigm for parallel and distributed computing, which features computing nodes connected by first-in first-out (FIFO) data channels. To increase the efficiency of communication links and boost application throughput, output buffers are often used. However, the connection between the configuration of output buffers and application deadlocks has not been studied. In this paper, we show that a bad configuration of output buffers can lead to application deadlock. We prove a necessary and sufficient condition for deadlock-free buffer configurations. We also propose an efficient method based on all-pairs shortest path algorithms to detect unsafe buffer configurations, and we sketch a method to adjust an unsafe buffer configuration to a safe one.
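
The detection method is described as being based on all-pairs shortest path algorithms. The sketch below is a plain Floyd-Warshall implementation of that building block over a small stream graph with illustrative edge weights; the paper's actual safety condition relating path lengths to buffer sizes is not reproduced here.

// Floyd-Warshall all-pairs shortest paths, the algorithmic building block
// the abstract names for detecting unsafe buffer configurations.
#include <iostream>
#include <limits>
#include <vector>

constexpr double INF = std::numeric_limits<double>::infinity();

// dist[i][j] holds edge weights (INF if no edge); updated in place to
// all-pairs shortest path distances.
void floyd_warshall(std::vector<std::vector<double>> &dist) {
    const std::size_t n = dist.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}

int main() {
    // tiny 3-node stream graph with illustrative channel weights
    std::vector<std::vector<double>> dist = {
        {0.0, 4.0, INF},
        {INF, 0.0, 1.0},
        {2.0, INF, 0.0},
    };
    floyd_warshall(dist);
    std::cout << "shortest 0 -> 2: " << dist[0][2] << '\n';  // prints 5
    return 0;
}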


International Conference on Performance Engineering | 2015

Automated Reliability Classification of Queueing Models for Streaming Computation

Jonathan C. Beard; Cooper Epstein; Roger D. Chamberlain

When do you trust a model? More specifically, when can a model be used for a specific application? This question often takes years of experience and specialized knowledge to answer correctly. Once this knowledge is acquired, it must be applied to each application. This involves instrumentation, data collection and finally interpretation. We propose the use of a trained Support Vector Machine (SVM) to give an automated system the ability to make an educated guess as to model applicability. We demonstrate a proof of concept which trains an SVM to correctly determine if a particular queueing model is suitable for a specific queue within a streaming system. The SVM is demonstrated using a micro-benchmark to simulate a wide variety of queueing conditions.

Collaboration


Dive into Jonathan C. Beard's collaborations.

Top Co-Authors

Roger D. Chamberlain (Washington University in St. Louis)
Peng Li (Washington University in St. Louis)
Cooper Epstein (Washington University in St. Louis)
Mark A. Franklin (Washington University in St. Louis)
Jeremy Buhler (Washington University in St. Louis)
Joseph G. Wingbermuehle (Washington University in St. Louis)
Joseph M. Lancaster (Washington University in St. Louis)