
Publication


Featured research published by Robert W. Leland.


Conference on High Performance Computing (Supercomputing) | 1995

A Multi-Level Algorithm For Partitioning Graphs

Bruce Hendrickson; Robert W. Leland

The graph partitioning problem is that of dividing the vertices of a graph into sets of specified sizes such that few edges cross between sets. This NP-complete problem arises in many important scientific and engineering problems. Prominent examples include the decomposition of data structures for parallel computation, the placement of circuit elements and the ordering of sparse matrix computations. We present a multilevel algorithm for graph partitioning in which the graph is approximated by a sequence of increasingly smaller graphs. The smallest graph is then partitioned using a spectral method, and this partition is propagated back through the hierarchy of graphs. A variant of the Kernighan-Lin algorithm is applied periodically to refine the partition. The entire algorithm can be implemented to execute in time proportional to the size of the original graph. Experiments indicate that, relative to other advanced methods, the multilevel algorithm produces high quality partitions at low cost.
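
To make the multilevel scheme concrete, the sketch below is a minimal illustration, not the authors' Chaco implementation: it coarsens a graph by contracting a greedy matching, bisects the coarsest graph spectrally via the Fiedler vector, and projects the partition back up, applying a Kernighan-Lin-style refinement at each level. It assumes the networkx, numpy, and scipy packages are available; the test graph, coarsening threshold, and matching strategy are arbitrary choices made for illustration.

    # Minimal multilevel bisection sketch (illustrative only, not Chaco).
    import networkx as nx
    import numpy as np

    def coarsen(G):
        """Contract a greedy maximal matching; return the coarser graph and a vertex map."""
        matched, vmap = set(), {}
        for u, v in G.edges():
            if u not in matched and v not in matched:
                matched.update((u, v))
                vmap[u] = vmap[v] = u          # merge v into u
        for v in G.nodes():
            vmap.setdefault(v, v)              # unmatched vertices map to themselves
        H = nx.Graph()
        H.add_nodes_from(set(vmap.values()))
        for u, v in G.edges():
            cu, cv = vmap[u], vmap[v]
            if cu != cv:
                w = G[u][v].get("weight", 1) + (H[cu][cv]["weight"] if H.has_edge(cu, cv) else 0)
                H.add_edge(cu, cv, weight=w)   # accumulate weights of merged edges
        return H, vmap

    def spectral_bisect(G):
        """Split G by the median of the Fiedler vector (second Laplacian eigenvector)."""
        nodes = list(G.nodes())
        fiedler = nx.fiedler_vector(G, weight="weight")
        left = {n for n, x in zip(nodes, fiedler) if x < np.median(fiedler)}
        return left, set(nodes) - left

    def multilevel_bisect(G, coarse_size=20):
        if G.number_of_nodes() <= coarse_size:
            return spectral_bisect(G)
        H, vmap = coarsen(G)
        left_H, _ = multilevel_bisect(H, coarse_size)
        # Project the coarse partition back to this level, then refine locally.
        projected = ({v for v in G if vmap[v] in left_H},
                     {v for v in G if vmap[v] not in left_H})
        return nx.algorithms.community.kernighan_lin_bisection(G, partition=projected)

    if __name__ == "__main__":
        G = nx.grid_2d_graph(12, 12)
        A, B = multilevel_bisect(G)
        cut = sum(1 for u, v in G.edges() if (u in A) != (v in A))
        print(f"|A|={len(A)}, |B|={len(B)}, cut edges={cut}")

Production multilevel partitioners use weighted matchings and more careful balance handling, but the coarsen, partition, project, refine structure shown here is the same.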


SIAM Journal on Scientific Computing | 1995

An improved spectral graph partitioning algorithm for mapping parallel computations

Bruce Hendrickson; Robert W. Leland

Efficient use of a distributed memory parallel computer requires that the computational load be balanced across processors in a way that minimizes interprocessor communication. A new domain mapping algorithm is presented that extends recent work in which ideas from spectral graph theory have been applied to this problem. The generalization of spectral graph bisection involves a novel use of multiple eigenvectors to allow for division of a computation into four or eight parts at each stage of a recursive decomposition. The resulting method is suitable for scientific computations like irregular finite elements or differences performed on hypercube or mesh architecture machines. Experimental results confirm that the new method provides better decompositions arrived at more economically and robustly than with previous spectral methods. This algorithm allows for arbitrary nonnegative weights on both vertices and edges to model inhomogeneous computation and communication. A new spectral lower bound for graph bisection is also presented.
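
As a rough illustration of the multiple-eigenvector idea, the toy sketch below assigns each vertex to one of four parts using the sign pattern of the second and third Laplacian eigenvectors. It assumes networkx and numpy are available and omits the vertex/edge weights, balance constraints, and optimized assignment that the paper's algorithm handles, so it shows only the principle, not the method itself.

    # Toy spectral quadrisection: quadrant of each vertex from signs of eigenvectors 2 and 3.
    import networkx as nx
    import numpy as np

    def spectral_quadrisection(G):
        nodes = list(G.nodes())
        # A dense Laplacian is fine at this toy scale; eigh sorts eigenvalues in ascending order.
        L = nx.laplacian_matrix(G, nodelist=nodes).toarray().astype(float)
        _, vecs = np.linalg.eigh(L)
        u2, u3 = vecs[:, 1], vecs[:, 2]          # skip the trivial constant eigenvector
        return {n: 2 * int(u2[i] >= 0) + int(u3[i] >= 0) for i, n in enumerate(nodes)}

    if __name__ == "__main__":
        G = nx.grid_2d_graph(16, 16)
        parts = spectral_quadrisection(G)
        sizes = [sum(1 for p in parts.values() if p == k) for k in range(4)]
        cut = sum(1 for u, v in G.edges() if parts[u] != parts[v])
        print("part sizes:", sizes, "cut edges:", cut)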


International Journal of High Speed Computing | 1995

An Efficient Parallel Algorithm for Matrix-Vector Multiplication

Bruce Hendrickson; Robert W. Leland; Steve Plimpton

The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scientific computation. A fast and efficient parallel algorithm for this calculation is therefore desirable. This paper describes a parallel matrix-vector multiplication algorithm which is particularly well suited to dense matrices or matrices with an irregular sparsity pattern. Such matrices can arise from discretizing partial differential equations on irregular grids or from problems exhibiting nearly random connectivity between data structures. The communication cost of the algorithm is independent of the matrix sparsity pattern and is shown to scale as n/√p for an n×n matrix on p processors. The algorithm's performance is demonstrated by using it within the well-known NAS conjugate gradient benchmark. This resulted in the fastest run times achieved to date on both the 1024-node nCUBE 2 and the 128-node Intel iPSC/860. Additional improvements to the algorithm which are possible when integrating it with the conjugate gradient algorithm are also discussed.
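
A serial emulation of the underlying 2D block decomposition, which is what makes the communication volume depend only on n and p rather than on where the nonzeros fall, might look like the following. The sizes and the random test matrix are arbitrary, only numpy is assumed, and none of the machine-specific expand/fold collectives from the paper are reproduced.

    # Single-process emulation of a 2D block-decomposed matrix-vector multiply.
    import numpy as np

    n, q = 12, 3              # n x n matrix on a q x q grid of "processors" (p = q*q = 9)
    rng = np.random.default_rng(0)
    A = (rng.random((n, n)) < 0.2) * rng.random((n, n))   # irregular sparsity pattern
    x = rng.random(n)

    blk = n // q
    blocks = [slice(i * blk, (i + 1) * blk) for i in range(q)]

    # Expand phase: processor (i, j) needs only the slice of x matching its column block
    # (n/q values per processor, regardless of where the nonzeros are).
    # Local multiply: y_ij = A[block i, block j] @ x[block j].
    partial = [[A[blocks[i], blocks[j]] @ x[blocks[j]] for j in range(q)]
               for i in range(q)]

    # Fold phase: sum the q partial vectors across each processor row
    # (again n/q values per processor, independent of sparsity).
    y = np.concatenate([sum(partial[i][j] for j in range(q)) for i in range(q)])

    assert np.allclose(y, A @ x)
    print("per-processor communication volume ~ 2 * n / q =", 2 * n // q, "values")

Each of the p = q² block owners handles vector segments of length n/q in both phases, which is where a per-processor communication volume on the order of n/√p comes from.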


Communications of the ACM | 1994

Massively parallel methods for engineering and science problems

William J. Camp; Steve Plimpton; Bruce Hendrickson; Robert W. Leland

Scientific research and engineering development are relying increasingly on computational simulation to augment theoretical analysis, experimentation, and testing. Many of today's problems are far too complex to yield to mathematical analyses. Likewise, large-scale experimental testing is often infeasible for a variety of economic, political, or environmental reasons. At the very least, testing adds to the time and expense of product development.


IEEE International Conference on High Performance Computing, Data, and Analytics | 1994

An empirical study of static load balancing algorithms

Robert W. Leland; Bruce Hendrickson

This paper empirically compares a variety of current algorithms used to map scientific computations onto massively parallel computers. The comparison is performed using Chaco, a publicly available graph partitioning code written by the authors. Algorithms are evaluated in terms of both computing cost and quality of partition, as judged by the execution time of the parallel application.
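
The two evaluation axes described here, the cost of computing a partition versus the quality of the result, can be mocked up in a few lines. The snippet below uses wall-clock partitioning time together with edge cut and load imbalance as simple stand-in quality metrics, with networkx's Kernighan-Lin bisection as a placeholder partitioner rather than the set of algorithms actually compared in the paper, where quality is judged by parallel application runtime.

    # Sketch of cost-versus-quality evaluation for a partitioner (illustrative metrics only).
    import time
    import networkx as nx

    def evaluate(G, parts, k):
        sizes = [sum(1 for p in parts.values() if p == i) for i in range(k)]
        cut = sum(1 for u, v in G.edges() if parts[u] != parts[v])
        imbalance = max(sizes) / (len(G) / k)     # 1.0 means perfectly balanced
        return cut, imbalance

    G = nx.grid_2d_graph(32, 32)

    t0 = time.perf_counter()
    A, B = nx.community.kernighan_lin_bisection(G, seed=0)
    elapsed = time.perf_counter() - t0

    parts = {v: 0 for v in A} | {v: 1 for v in B}
    cut, imbalance = evaluate(G, parts, 2)
    print(f"time={elapsed:.3f}s  cut={cut}  imbalance={imbalance:.2f}")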


IEEE Computer | 2015

Computing beyond Moore's Law

John Shalf; Robert W. Leland

Photolithography systems are on pace to reach atomic scale by the mid-2020s, necessitating alternatives to continue realizing faster, more predictable, and cheaper computing performance. If the end of Moore's law is real, a research agenda is needed to assess the viability of novel semiconductor technologies and navigate the ensuing challenges.


Concurrency and Computation: Practice and Experience | 2005

Architectural specification for massively parallel computers: an experience and measurement‐based approach

Ron Brightwell; William J. Camp; Benjamin Cole; Erik P. DeBenedictis; Robert W. Leland; James L. Tomkins; Arthur B. Maccabe

In this paper, we describe the hardware and software architecture of the Red Storm system developed at Sandia National Laboratories. We discuss the evolution of this architecture and provide reasons for the different choices that have been made. We contrast our approach of leveraging high‐volume, mass‐market commodity processors to that taken for the Earth Simulator. We present a comparison of benchmarks and application performance that support our approach. We also project the performance of Red Storm and the Earth Simulator. This projection indicates that the Red Storm architecture is a much more cost‐effective approach to massively parallel computing. Published in 2005 by John Wiley & Sons, Ltd.


White paper | 2003

Creating science-driven computer architecture: A new path to scientific leadership

Horst D. Simon; C. William McCurdy; T.C. Kramer; Rick Stevens; Mike McCoy; Mark Seager; Thomas Zacharia; Ray Bair; Scott Studham; William J. Camp; Robert W. Leland; John Morrison; William Feiereisen

We believe that it is critical for the future of high end computing in the United States to bring into existence a new class of computational capability that is optimal for science. In recent years scientific computing has increasingly become dependent on hardware that is designed and optimized for commercial applications. Science in this country has greatly benefited from the improvements in computers that derive from advances in microprocessors following Moore's Law, and a strategy of relying on machines optimized primarily for business applications. However, within the last several years, in part because of the challenge presented by the appearance of the Japanese Earth Simulator, the sense has been growing in the scientific community that a new strategy is needed. A more aggressive strategy than reliance only on market forces driven by business applications is necessary in order to achieve a better alignment between the needs of scientific computing and the platforms available. The United States should undertake a program that will result in scientific computing capability that durably returns the advantage to American science, because doing so is crucial to the country's future. Such a strategy must also be sustainable. New classes of computer designs will not only revolutionize the power of supercomputing for science, but will also affect scientific computing at all scales. What is called for is the opening of a new frontier of scientific capability that will ensure that American science is greatly enabled in its pursuit of research in critical areas such as nanoscience, climate prediction, combustion, modeling in the life sciences, and fusion energy, as well as in meeting essential needs for national security. In this white paper we propose a strategy for accomplishing this mission, pursuing different directions of hardware development and deployment, and establishing a highly capable networking and grid infrastructure connecting these platforms to the broad research community.


Conference on High Performance Computing (Supercomputing) | 1994

A 65+ Gflop/s unstructured finite element simulation of chemically reacting flows on the Intel Paragon

John N. Shadid; Scott A. Hutchinson; Harry K. Moffat; Gary L. Hennigan; Bruce Hendrickson; Robert W. Leland

Many scientific and engineering applications require a detailed analysis of complex systems with strongly coupled fluid flow, thermal energy transfer, mass transfer, and nonequilibrium chemical reactions. Here we describe the performance of a newly developed application code, SALSA, designed to simulate these complex flows on large-scale parallel machines such as the Intel Paragon. SALSA uses 3D unstructured finite element methods to model geometrically complex flow systems. Fully implicit time integration, multicomponent mass transport, and general gas phase and surface species non-equilibrium chemical kinetics are employed. Using these techniques we have obtained over 65 Gflop/s on a 3D chemically reacting flow CVD problem for silicon carbide (SiC) deposition. This represents 46% of the peak performance of our 1904-node Intel Paragon, an outstanding computational rate in view of the required unstructured data communication.
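
As a quick back-of-the-envelope check of the quoted figures, 65 Gflop/s sustained at 46% of peak across 1904 nodes implies roughly the aggregate and per-node rates computed below. The per-node peak is inferred from the abstract's own numbers, not stated in it.

    # Arithmetic implied by the abstract: 65 Gflop/s sustained = 46% of peak on 1904 nodes.
    sustained_gflops = 65.0
    fraction_of_peak = 0.46
    nodes = 1904

    aggregate_peak = sustained_gflops / fraction_of_peak          # ~141 Gflop/s machine peak
    per_node_peak_mflops = aggregate_peak / nodes * 1000          # ~74 Mflop/s peak per node
    per_node_sustained_mflops = sustained_gflops / nodes * 1000   # ~34 Mflop/s sustained per node

    print(f"aggregate peak     ~ {aggregate_peak:.0f} Gflop/s")
    print(f"per-node peak      ~ {per_node_peak_mflops:.0f} Mflop/s")
    print(f"per-node sustained ~ {per_node_sustained_mflops:.1f} Mflop/s")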


Archive | 1993

The Chaco User's Guide

Bruce Hendrickson; Robert W. Leland

Collaboration


Dive into Robert W. Leland's collaborations.

Top Co-Authors

Bruce Hendrickson | Sandia National Laboratories
William J. Camp | Sandia National Laboratories
Steve Plimpton | Sandia National Laboratories
Benjamin Cole | Sandia National Laboratories
Erik P. DeBenedictis | Sandia National Laboratories
James L. Tomkins | Sandia National Laboratories
David R. White | Sandia National Laboratories
Ron Brightwell | Sandia National Laboratories
Timothy J. Tautges | Sandia National Laboratories