John Feo
Pacific Northwest National Laboratory
Publication
Featured research published by John Feo.
Journal of Parallel and Distributed Computing | 1990
John Feo; David C. Cann; Rodney R. Oldehoeft
Sisal (Streams and Iterations in Single Assignment Language) is a general-purpose applicative language intended for use on both conventional and novel multiprocessor systems. In this report we discuss the project's objectives, philosophy, and accomplishments, and state our future plans. Four significant results of the Sisal project are compilation techniques for high-performance parallel applicative computation, a microtasking environment that supports dataflow on conventional shared-memory architectures, execution times comparable to those of Fortran, and cost-effective speedup on shared-memory multiprocessors.
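By way of illustration, here is a minimal Python sketch (not Sisal) of the single-assignment, implicitly parallel style the language encourages: the loop body is a pure function of its index, so a runtime is free to schedule iterations concurrently, loosely analogous to Sisal's microtasking on shared memory. The function names and the use of ProcessPoolExecutor are illustrative assumptions, not part of the paper.

# Illustrative only: a Python analogue of the single-assignment, loop-parallel
# style that Sisal expressions encourage. Each iteration is a pure function of
# its index, so iterations carry no hidden dependences and a runtime may
# schedule them in parallel.

from concurrent.futures import ProcessPoolExecutor

def body(i, y, z):
    # Pure loop body: reads its inputs, produces one result, mutates nothing.
    return 0.5 * (y[i] + z[i])

def parallel_map(y, z):
    n = len(y)
    with ProcessPoolExecutor() as pool:
        # The runtime partitions the index space across workers, loosely like
        # Sisal's microtasking environment on shared-memory machines.
        return list(pool.map(body, range(n), [y] * n, [z] * n))

if __name__ == "__main__":
    y = [float(i) for i in range(8)]
    z = [float(2 * i) for i in range(8)]
    print(parallel_map(y, z))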
parallel computing | 1988
John Feo
This paper presents and analyzes the computational and parallel complexity of the Livermore Loops. The Loops represent the type of computational kernels typically found in large-scale scientific computing and have been used to benchmark computer systems since the mid-60s. On parallel systems, a process's computational structure can greatly affect its efficiency. If the Loops are to be used to benchmark such systems, their computations must be understood thoroughly, so that efficient implementations may be written. This paper addresses that concern.
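As a hint of what the Loops look like, below is a Python sketch of the hydro fragment, commonly listed as Kernel 1. Each iteration writes a distinct element from fixed read-only offsets, so this particular Loop has no loop-carried dependences, while others (recurrences, for example) are much harder to parallelize. The code is illustrative, not taken from the paper.

import numpy as np

# Sketch of a Livermore-style kernel (the "hydro fragment", commonly listed
# as Kernel 1). Each iteration reads z at fixed offsets and writes its own
# x[k], so the loop is trivially parallel. Assumes len(z) >= n + 11.

def hydro_fragment(n, q, r, t, y, z):
    x = np.empty(n)
    for k in range(n):
        x[k] = q + y[k] * (r * z[k + 10] + t * z[k + 11])
    return x

# Equivalent vectorized form that NumPy (or a parallelizing compiler) can
# execute without any serial dependence:
def hydro_fragment_vec(n, q, r, t, y, z):
    return q + y[:n] * (r * z[10:n + 10] + t * z[11:n + 11])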
international conference on e-science | 2009
Ian Gorton; Zhenyu Huang; Yousu Chen; Benson K. Kalahar; Shuangshuang Jin; Daniel G. Chavarría-Miranda; Douglas J. Baxter; John Feo
Operating the electrical power grid to prevent power blackouts is a complex task. An important aspect of this is contingency analysis, which involves understanding and mitigating potential failures in power grid elements such as transmission lines. When taking into account the potential for multiple simultaneous failures (known as the N-x contingency problem), contingency analysis becomes a massive computational task. In this paper we describe a novel hybrid computational approach to contingency analysis. This approach exploits the unique graph processing performance of the Cray XMT in conjunction with a conventional massively parallel compute cluster to identify likely simultaneous failures that could cause widespread cascading power failures with massive economic and social impact on society. The approach has the potential to provide the first practical and scalable solution to the N-x contingency problem. When deployed in power grid operations, it will increase the grid operator's ability to deal effectively with outages and failures of power grid components while preserving stable and safe operation of the grid. The paper describes the architecture of our solution and presents preliminary performance results that validate the efficacy of our approach.
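A hypothetical Python sketch of why the N-x problem is computationally massive: even reduced to a pure connectivity question (ignoring the power-flow analysis a real contingency study performs), the number of outage combinations to screen grows combinatorially with x. The names and data layout below are assumptions for illustration only and do not reflect the paper's implementation.

from itertools import combinations

# Hypothetical sketch: treat the grid as a graph and ask which sets of x
# simultaneous line outages disconnect it. Real contingency analysis solves
# power-flow equations per contingency; this only shows how the case count
# explodes as x grows.

def is_connected(nodes, edges):
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] - seen)
    return len(seen) == len(nodes)

def screen_contingencies(nodes, lines, x):
    """Return every combination of x line outages that splits the grid."""
    risky = []
    for outage in combinations(lines, x):
        remaining = [e for e in lines if e not in outage]
        if not is_connected(nodes, remaining):
            risky.append(outage)
    return risky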
Proceedings of IEEE International Symposium on Parallel Algorithms/Architecture Synthesis | 1997
Jean-Luc Gaudiot; Wim Bohm; Walid A. Najjar; Tom DeBoni; John Feo; Patrick Miller
Programming a massively-parallel machine is a daunting task for any human programmer, and parallelization may even be impossible for any compiler. Instead, the functional programming paradigm may prove to be an ideal solution by providing an implicitly parallel interface to the programmer. We describe the Sisal (Stream and Iteration in a Single Assignment Language) project. Its goal is to provide a general-purpose user interface for a wide range of parallel processing platforms.
ieee high performance extreme computing conference | 2013
Tim Mattson; David A. Bader; Jonathan W. Berry; Aydin Buluç; Jack J. Dongarra; Christos Faloutsos; John Feo; John R. Gilbert; Joseph E. Gonzalez; Bruce Hendrickson; Jeremy Kepner; Charles E. Leiserson; Andrew Lumsdaine; David A. Padua; Stephen W. Poole; Steven P. Reinhardt; Michael Stonebraker; Steve Wallach; Andrew Yoo
It is our view that the state of the art in constructing a large collection of graph algorithms in terms of linear algebraic operations is mature enough to support the emergence of a standard set of primitive building blocks. This paper is a position paper defining the problem and announcing our intention to launch an open effort to define this standard.
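The flavor of such primitives can be suggested with a small NumPy sketch: one breadth-first-search level expressed as a matrix-vector product of the adjacency matrix with the current frontier, masked by already-visited vertices. The function below is purely illustrative and does not reflect the API the proposed standard eventually defines.

import numpy as np

# Graph traversal as linear algebra: one BFS level is a "matvec" of the
# adjacency matrix with the frontier vector, masked by visited vertices.
# adj[i, j] nonzero means there is an edge i -> j.

def bfs_levels(adj: np.ndarray, source: int) -> np.ndarray:
    n = adj.shape[0]
    levels = np.full(n, -1)
    frontier = np.zeros(n, dtype=bool)
    frontier[source] = True
    level = 0
    while frontier.any():
        levels[frontier] = level
        # Vertices reachable from the frontier, restricted to unvisited ones.
        reached = (adj.T.astype(int) @ frontier.astype(int)) > 0
        frontier = reached & (levels == -1)
        level += 1
    return levels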
parallel computing | 2012
Ümit V. Çatalyürek; John Feo; Assefaw Hadish Gebremedhin; Mahantesh Halappanavar; Alex Pothen
We explore the interplay between architectures and algorithm design in the context of shared-memory platforms and a specific graph problem of central importance in scientific and high-performance computing, distance-1 graph coloring. We introduce two different kinds of multithreaded heuristic algorithms for the stated, NP-hard, problem. The first algorithm relies on speculation and iteration, and is suitable for any shared-memory system. The second algorithm uses dataflow principles, and is targeted at the non-conventional, massively multithreaded Cray XMT system. We study the performance of the algorithms on the Cray XMT and two multi-core systems, Sun Niagara 2 and Intel Nehalem. Together, the three systems represent a spectrum of multithreading capabilities and memory structure. As testbed, we use synthetically generated large-scale graphs carefully chosen to cover a wide range of input types. The results show that the algorithms have scalable runtime performance and use nearly the same number of colors as the underlying serial algorithm, which in turn is effective in practice. The study provides insight into the design of high performance algorithms for irregular problems on many-core architectures.
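A minimal sequential Python sketch of the speculation-and-iteration idea behind the first class of algorithms: color optimistically, detect conflicts between neighbors, and re-color only the conflicted vertices. This is a simplification for illustration, not the paper's multithreaded implementation.

# Speculation-and-iteration greedy coloring, sequential sketch. The phases
# marked "parallel" are the ones that would run concurrently on a
# shared-memory machine. Assumes integer vertex ids.

def color_graph(adj):                     # adj: dict vertex -> set of neighbors
    colors = {v: None for v in adj}
    to_color = set(adj)
    while to_color:
        # Speculative phase (parallel): greedily color every pending vertex
        # with the smallest color not used by its neighbors, ignoring the
        # races this can cause between adjacent vertices.
        for v in to_color:
            used = {colors[u] for u in adj[v] if colors[u] is not None}
            colors[v] = next(c for c in range(len(adj)) if c not in used)
        # Conflict detection phase (parallel): adjacent vertices that ended
        # up with the same color send the lower-numbered one back for
        # another round.
        conflicts = set()
        for v in to_color:
            for u in adj[v]:
                if colors[u] == colors[v] and v < u:
                    conflicts.add(v)
        to_color = conflicts
    return colors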
computing frontiers | 2007
Jarek Nieplocha; Andres Marquez; John Feo; Daniel G. Chavarría-Miranda; George Chin; Chad Scherrer; Nathaniel Beagley
The resurgence of current and upcoming multithreaded architectures and programming models led us to conduct a detailed study to understand the potential of these platforms to increase the performance of data-intensive, irregular scientific applications. Our study is based on a power system state estimation application and a novel anomaly detection application applied to network traffic data. We also conducted a detailed evaluation of the platforms using microbenchmarks in order to gain insight into their architectural capabilities and their interaction with programming models and application software. The evaluation was performed on the Cray MTA-2 and the Sun Niagara.
ieee international symposium on parallel distributed processing workshops and phd forum | 2010
Eric Goodman; David J. Haglin; Chad Scherrer; Daniel G. Chavarría-Miranda; Jace A. Mogill; John Feo
Two of the most commonly used hashing strategies, linear probing and hashing with chaining, are adapted for efficient execution on a Cray XMT. These strategies are designed to minimize memory contention. Datasets that follow a power law distribution cause significant performance challenges to shared memory parallel hashing implementations. Experimental results show good scalability up to 128 processors on two power law datasets with different data types: integer and string. These implementations can be used in a wide range of applications.
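For reference, a minimal sequential Python sketch of linear probing, one of the two adapted strategies. The XMT-specific fine-grained synchronization that limits memory contention under power-law key skew is not shown here.

# Minimal linear-probing hash table (sequential sketch; assumes the table
# never fills). On a collision, the probe simply steps to the next slot.

class LinearProbingTable:
    EMPTY = object()

    def __init__(self, capacity):
        self.keys = [self.EMPTY] * capacity
        self.vals = [None] * capacity

    def _probe(self, key):
        i = hash(key) % len(self.keys)
        while self.keys[i] is not self.EMPTY and self.keys[i] != key:
            i = (i + 1) % len(self.keys)   # linear step to the next slot
        return i

    def insert(self, key, value):
        i = self._probe(key)
        self.keys[i], self.vals[i] = key, value

    def lookup(self, key):
        i = self._probe(key)
        return self.vals[i] if self.keys[i] == key else None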
Journal of Parallel and Distributed Computing | 1993
Richard Wolski; John Feo
Program partitioning and scheduling are essential steps in programming non-shared-memory computer systems. Partitioning is the separation of program operations into sequential tasks, and scheduling is the assignment of tasks to processors. To be effective, automatic methods require an accurate representation of the model of computation and the target architecture. Current partitioning methods assume today's most prevalent models: macro dataflow and a homogeneous/two-level multicomputer system. Because both are based on communication channels, neither model represents well the emerging class of NUMA multiprocessor computer systems consisting of hierarchical read/write memories. Consequently, the partitions generated by extant methods do not execute well on these systems. In this paper, we extend the conventional graph representation of the macro-dataflow model to enable mapping heuristics to consider the complex communication options supported by NUMA architectures. We describe two such heuristics. Simulated execution times of program graphs show that our model and heuristics generate higher quality program mappings than current methods for NUMA architectures.
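A toy Python sketch of the scheduling half of the problem, under the simplifying assumption of a flat machine with no communication costs: list scheduling that places each ready task on the processor giving the earliest finish time. The paper's heuristics additionally model the hierarchical communication costs of NUMA memories, which this sketch ignores.

# List scheduling of a task DAG onto processors (toy, communication-free).
# tasks: iterable of ids; deps: dict task -> set of predecessor tasks;
# cost: dict task -> execution time; returns task -> (proc, start, end).

def list_schedule(tasks, deps, cost, num_procs):
    proc_free = [0.0] * num_procs
    schedule = {}
    done = set()
    pending = list(tasks)
    while pending:
        # Pick any ready task (all of its predecessors already scheduled).
        t = next(t for t in pending if deps.get(t, set()) <= done)
        ready_at = max((schedule[p][2] for p in deps.get(t, set())), default=0.0)
        # Greedily choose the processor that lets the task finish earliest.
        proc = min(range(num_procs), key=lambda q: max(proc_free[q], ready_at))
        start = max(proc_free[proc], ready_at)
        end = start + cost[t]
        schedule[t] = (proc, start, end)
        proc_free[proc] = end
        done.add(t)
        pending.remove(t)
    return schedule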
conference on high performance computing (supercomputing) | 1990
David C. Cann; John Feo
The authors compare the performance of SISAL, an applicative language for parallel numerical computations, and Fortran. The intent is to show that applicative programs, when compiled using a set of powerful yet simple optimization techniques, can achieve sequential execution speeds comparable to Fortran, and automatically utilize conventional shared-memory multiprocessors.