Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Andrew S. Grimshaw is active.

Publication


Featured research published by Andrew S. Grimshaw.


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2012

Scalable GPU graph traversal

Duane Merrill; Michael Garland; Andrew S. Grimshaw

Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to focus on asymptotically inefficient algorithms that perform poorly on graphs with non-trivial diameter. We present a BFS parallelization focused on fine-grained task management constructed from an efficient prefix sum that achieves an asymptotically optimal O(|V|+|E|) work complexity. Our implementation delivers excellent performance on diverse graphs, achieving traversal rates in excess of 3.3 billion and 8.3 billion traversed edges per second using single- and quad-GPU configurations, respectively. This level of performance is several times faster than state-of-the-art implementations on both CPU and GPU platforms.
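
The prefix-sum idea at the heart of this paper can be illustrated without a GPU. Below is a minimal sequential Python sketch, not the authors' CUDA code: an exclusive prefix sum over the frontier's out-degrees assigns each frontier vertex a disjoint slot range in the next frontier's output buffer, which is what allows a GPU to expand every frontier vertex in parallel without write contention.

    from itertools import accumulate

    def bfs_prefix_sum(adj, source):
        """adj: dict vertex -> neighbor list; returns BFS depth of each reached vertex."""
        depth = {source: 0}
        frontier = [source]
        level = 0
        while frontier:
            # Exclusive prefix sum of out-degrees: each frontier vertex gets a
            # disjoint offset range in the gathered edge list (on a GPU, each
            # thread would scatter its neighbors at these offsets in parallel).
            degrees = [len(adj[v]) for v in frontier]
            offsets = [0] + list(accumulate(degrees))
            out = [None] * offsets[-1]
            for v, start in zip(frontier, offsets):
                for i, w in enumerate(adj[v]):
                    out[start + i] = w
            # Keep only unvisited vertices for the next level.
            level += 1
            frontier = []
            for w in out:
                if w not in depth:
                    depth[w] = level
                    frontier.append(w)
        return depth

    # Example: a small diamond-shaped graph.
    g = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(bfs_prefix_sum(g, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}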


IEEE Computer | 1993

Easy-to-use object-oriented parallel processing with Mentat

Andrew S. Grimshaw

Mentat, an object-oriented parallel processing system designed to directly address the difficulty of developing architecture-independent parallel programs, is discussed. The Mentat system consists of two components: the Mentat programming language and the Mentat runtime system. The Mentat programming language, which is based on C++, is described. Performance results from implementing the Mentat runtime system on a network of Sun 3 and 4 workstations, the Silicon Graphics Iris, the Intel iPSC/2, and the Intel iPSC/860 are presented.
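
Mentat's central idea, method invocations on parallel objects that return immediately while the runtime tracks the data dependencies between results, has a rough modern analogue in futures. The toy Python sketch below illustrates only that idea; it is not Mentat's C++-based syntax or runtime.

    from concurrent.futures import ThreadPoolExecutor

    class MatrixWorker:
        """Stand-in for a Mentat class whose methods are invoked asynchronously."""
        def row_sum(self, row):
            return sum(row)

    worker = MatrixWorker()
    rows = [[1, 2, 3], [4, 5, 6]]

    with ThreadPoolExecutor() as pool:
        # Each call returns a future at once; the calls may run concurrently.
        futures = [pool.submit(worker.row_sum, r) for r in rows]
        # Blocking on .result() plays the role of the runtime resolving a
        # dataflow dependency when a later computation needs the value.
        print(sum(f.result() for f in futures))  # 21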


Job Scheduling Strategies for Parallel Processing | 1999

The Legion Resource Management System

Steve J. Chapin; Dimitrios Katramatos; John F. Karpovich; Andrew S. Grimshaw

Recent technological developments, including gigabit networking technology and low-cost, high-performance microprocessors, have given rise to metacomputing environments. Metacomputing environments combine hosts from multiple administrative domains via transnational and world-wide networks. Managing the resources in such a system is a complex task, but is necessary to efficiently and economically execute user programs. The Legion resource management system is flexible both in its support for system-level resource management and in its adaptability to user-level scheduling policies.


High Performance Distributed Computing | 1996

The core Legion object model

Michael J. Lewis; Andrew S. Grimshaw

The Legion project at the University of Virginia is developing an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides secure shared object and shared name spaces, application-adjustable fault tolerance, improved response time, and greater throughput. Legion targets wide-area assemblies of workstations, supercomputers, and parallel supercomputers. Legion tackles problems not solved by existing workstation-based parallel processing tools; the system will enable fault tolerance, wide-area parallel processing, interoperability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. The paper describes the core Legion object model, which specifies the composition and functionality of Legion's core objects: those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible, extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.
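
To make the naming discussion concrete, here is a miniature registry-style core object in Python. Every name and interface is hypothetical, not Legion's actual API; the point is only that a globally unique, location-independent name lets any participant create, locate, and remove objects through a shared service.

    import uuid

    class ObjectRegistry:
        """Toy analogue of a naming/location service for distributed objects."""
        def __init__(self):
            self._table = {}  # object ID -> (site, local address)

        def create(self, site, local_address):
            oid = str(uuid.uuid4())  # globally unique, site-independent name
            self._table[oid] = (site, local_address)
            return oid

        def locate(self, oid):
            return self._table[oid]  # every participant resolves the same name

        def remove(self, oid):
            del self._table[oid]

    registry = ObjectRegistry()
    oid = registry.create(site="uva.cs.node7", local_address=0x1F40)
    print(registry.locate(oid))  # ('uva.cs.node7', 8000)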


Future Generation Computer Systems | 1999

Resource management in Legion

Steve J. Chapin; Dimitrios Katramatos; John F. Karpovich; Andrew S. Grimshaw

The recent development of gigabit networking technology, combined with the proliferation of low-cost, high-performance microprocessors, has given rise to metacomputing environments. These environments can combine many thousands of hosts, from hundreds of administrative domains, connected by transnational and world-wide networks. Managing the resources in such a system is a complex task, but is necessary to efficiently and economically execute user programs. In this paper, we describe the resource management portions of the Legion metacomputing system, including the basic model and its implementation. These mechanisms are flexible both in their support for system-level resource management and in their adaptability to user-level scheduling policies. We show this by implementing a simple scheduling policy and demonstrating how it can be adapted to more complex algorithms.
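
The mechanism/policy separation described here can be sketched in a few lines of Python: the system supplies the placement machinery, while the scheduling policy is a swappable function. The interfaces below are hypothetical illustrations, not Legion's.

    def least_loaded(job, hosts):
        """Default policy: place the job on the least-loaded host."""
        return min(hosts, key=lambda h: h["load"])

    def prefer_big_memory(job, hosts):
        """A user-supplied policy: hosts with enough memory, then least load."""
        eligible = [h for h in hosts if h["mem_gb"] >= job["mem_gb"]]
        return min(eligible, key=lambda h: h["load"])

    def schedule(job, hosts, policy=least_loaded):
        host = policy(job, hosts)  # the mechanism defers placement to the policy
        host["load"] += 1          # record the placement
        return host["name"]

    hosts = [{"name": "a", "load": 2, "mem_gb": 8},
             {"name": "b", "load": 0, "mem_gb": 64}]
    print(schedule({"mem_gb": 32}, hosts))                     # 'b' (least loaded)
    print(schedule({"mem_gb": 32}, hosts, prefer_big_memory))  # 'b' (only host that fits)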


High Performance Distributed Computing | 1996

Legion-a view from 50,000 feet

Andrew S. Grimshaw; William A. Wulf

The coming of gigabit networks makes possible the realization of a single nationwide virtual computer composed of a variety of geographically distributed high-performance machines and workstations. To realize the potential that the physical infrastructure provides, software must be developed that is easy to use, supports a large degree of parallelism in application code, and manages the complexity of the underlying physical system for the user. Legion is a metasystem project at the University of Virginia designed to provide users with a transparent interface to the available resources, both at the programming-interface level and at the user level. Legion addresses issues such as parallelism, fault tolerance, security, autonomy, heterogeneity, resource management, and access transparency in a multi-language environment. In this paper, we present a high-level overview of Legion: its vision and objectives, a brief sketch of how some of those objectives will be met, and the current status of the project.


IEEE Computer | 1999

Wide area computing: resource sharing on a large scale

Andrew S. Grimshaw; Adam J. Ferrari; Frederick C. Knabe; Marty Humphrey

Consider almost any computing resource today, whether hardware, software, or data, and it will invariably be networked. Computing over wide area networks has been largely ad hoc, but as needs increase, piecemeal solutions no longer make sense. The authors set out to design and build a wide-area operating system that would allow multiple organizations with diverse platforms to share and combine their resources. This system, Legion, is a network-level operating system designed from scratch to target wide-area computing demands.


International Conference on Parallel Architectures and Compilation Techniques | 2010

Revisiting sorting for GPGPU stream architectures

Duane Merrill; Andrew S. Grimshaw

This poster presents efficient strategies for sorting large sequences of fixed-length keys (and values) using GPGPU stream processors. Compared to the state-of-the-art, our radix sorting methods exhibit speedup of at least 2x for all generations of NVIDIA GPGPUs, and up to 3.7x for current GT200-based models. Our implementations demonstrate sorting rates of 482 million key-value pairs per second, and 550 million keys per second (32-bit). For this domain of sorting problems, we believe our sorting primitive to be the fastest available for any fully-programmable microarchitecture. These results motivate a different breed of parallel primitives for GPGPU stream architectures that can better exploit the memory and computational resources while maintaining the flexibility of a reusable component. Our sorting performance is derived from a parallel scan stream primitive that has been generalized in two ways: (1) with local interfaces for producer/consumer operations (visiting logic), and (2) with interfaces for performing multiple related, concurrent prefix scans (multi-scan).
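
The scan-centered structure of a radix sort pass is easy to show sequentially. The Python sketch below is an illustrative CPU analogue, not the authors' GPGPU code: each pass histograms one digit of the keys, converts the counts to exclusive prefix sums, and stably scatters the keys to the resulting offsets.

    def radix_sort(keys, bits=32, radix_bits=4):
        radix = 1 << radix_bits
        for shift in range(0, bits, radix_bits):
            # Histogram the current digit.
            counts = [0] * radix
            for k in keys:
                counts[(k >> shift) & (radix - 1)] += 1
            # Exclusive prefix sum gives each digit value its start offset.
            offsets, total = [], 0
            for c in counts:
                offsets.append(total)
                total += c
            # Stable scatter pass into the computed offsets.
            out = [0] * len(keys)
            for k in keys:
                d = (k >> shift) & (radix - 1)
                out[offsets[d]] = k
                offsets[d] += 1
            keys = out
        return keys

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
    # [2, 24, 45, 66, 75, 90, 170, 802]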


Journal of Parallel and Distributed Computing | 1994

Metasystems: An Approach Combining Parallel Processing and Heterogeneous Distributed Computing Systems

Andrew S. Grimshaw; Jon B. Weissman; Emily A. West; Edmond C. Loyot

A metasystem is a single computing resource composed of a heterogeneous group of autonomous computers linked together by a network. The interconnection network needed to construct large metasystems will soon be in place. To fully exploit these new systems, software that is easy to use, supports large degrees of parallelism, and hides the complexity of the underlying physical architecture must be developed. In this paper we describe our metasystem vision, our approach to constructing a metasystem testbed, and early experimental results. Our approach combines features from earlier work on both parallel processing systems and heterogeneous distributed computing systems. Using the testbed, we have found that data coercion costs are not a serious obstacle to high performance, but that load imbalance induced by differing processor capabilities can limit performance. We then present a mechanism to overcome load imbalance that utilizes user-provided callbacks.
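
The callback mechanism can be sketched simply: the user supplies a function that rates each host's relative capability, and the work is partitioned proportionally rather than evenly. The interface and the speed ratings below are illustrative assumptions, not the paper's actual code or measurements.

    def partition(n_items, hosts, capability):
        """capability: user-provided callback mapping a host to a relative speed."""
        weights = [capability(h) for h in hosts]
        total = sum(weights)
        shares = [int(n_items * w / total) for w in weights]
        shares[-1] += n_items - sum(shares)  # rounding remainder goes to the last host
        return dict(zip(hosts, shares))

    # Hypothetical ratings: an iPSC/860 node rated 4x a Sun 4 workstation.
    speed = {"sun4": 1.0, "iris": 2.0, "ipsc860": 4.0}
    print(partition(700, ["sun4", "iris", "ipsc860"], lambda h: speed[h]))
    # {'sun4': 100, 'iris': 200, 'ipsc860': 400}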


Conference on High Performance Computing (Supercomputing) | 2001

LegionFS: A Secure and Scalable File System Supporting Cross-Domain High-Performance Applications

Brian S. White; Michael Pittman Walker; Marty Humphrey; Andrew S. Grimshaw

Realizing that current file systems cannot cope with the diverse requirements of wide-area collaborations, researchers have developed data access facilities to meet their needs. Recent work has focused on comprehensive data access architectures. In order to fulfill the evolving requirements in this environment, we suggest a more fully integrated architecture built upon the fundamental tenets of naming, security, scalability, extensibility, and adaptability. These form the underpinning of the Legion File System (LegionFS). This paper motivates the need for these requirements and presents benchmarks that highlight the scalability of LegionFS. LegionFS's aggregate throughput follows the linear growth of the network, yielding an aggregate read bandwidth of 193.8 MB/s on a 100 Mbps Ethernet backplane with 50 simultaneous readers. The serverless architecture of LegionFS is shown to benefit important scientific applications, such as those accessing the Protein Data Bank, within both local- and wide-area environments.

Collaboration


Dive into Andrew S. Grimshaw's collaboration network.

Top Co-Authors


H. Howie Huang (George Washington University)

Morris Riedel (Forschungszentrum Jülich)