Publications
Featured research published by Kurt Walter Pinnow.
IBM Journal of Research and Development | 2005
George S. Almasi; Charles J. Archer; José G. Castaños; John A. Gunnels; C. Christopher Erway; Philip Heidelberger; Xavier Martorell; José E. Moreira; Kurt Walter Pinnow; Joe Ratterman; Burkhard Steinmacher-Burow; William Gropp; Brian R. Toonen
The Blue Gene®/L (BG/L) supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of the Message Passing Interface (MPI), which leverages the hardware features of BG/L. MPI for BG/L is implemented on top of a more basic message-passing infrastructure called the message layer. This message layer can serve as the foundation for other higher-level libraries and can also be used directly by applications. MPI and the message layer are used in the two BG/L modes of operation: the coprocessor mode and the virtual node mode. Performance measurements show that our message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one of the processors of a node to communication functions (coprocessor mode) greatly improves the message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can have a positive impact on application performance.
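To illustrate the kind of message-passing program this MPI implementation serves, here is a minimal generic-MPI ping-pong bandwidth microbenchmark. This is a sketch only, not code from the paper: nothing in it is BG/L-specific, and the message size and iteration count are arbitrary choices.

/* Minimal MPI ping-pong bandwidth microbenchmark (generic MPI;
 * illustrative only, not code from the paper). Run with 2+ ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_BYTES (1 << 20)  /* 1 MiB message, an arbitrary choice */
#define ITERS 100

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    char *buf = malloc(MSG_BYTES);
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0) {
        /* Each iteration moves the message twice (there and back). */
        double gb = 2.0 * ITERS * MSG_BYTES / 1e9;
        printf("bandwidth: %.2f GB/s\n", gb / (t1 - t0));
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

On a machine like BG/L, a benchmark of this shape is what "performance close to the hardware limits" is measured against; the coprocessor-mode versus virtual-node-mode comparison in the paper amounts to running such workloads under the two configurations.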
European Conference on Parallel Processing | 2004
George S. Almasi; Charles J. Archer; José G. Castaños; C. Christopher Erway; Philip Heidelberger; Xavier Martorell; José E. Moreira; Kurt Walter Pinnow; Joe Ratterman; Nils Smeds; Burkhard Steinmacher-Burow; William Gropp; Brian R. Toonen
The BlueGene/L supercomputer will consist of 65,536 dual-processor compute nodes interconnected by two high-speed networks: a three-dimensional torus network and a tree topology network. Each compute node can address only its own local memory, making message passing the natural programming model for BlueGene/L. In this paper we present our implementation of MPI for BlueGene/L. In particular, we discuss how we leveraged the architectural features of BlueGene/L to arrive at an efficient implementation of MPI on this machine. We validate our approach by comparing MPI performance against the hardware limits and also the relative performance of the different modes of operation of BlueGene/L. We show that dedicating one of the processors of a node to communication functions greatly improves the bandwidth achieved by MPI operations, whereas running two MPI tasks per compute node can have a positive impact on application performance.
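The three-dimensional torus described above can be mirrored at the MPI level with a Cartesian communicator. The following generic-MPI sketch is illustrative only, not the authors' implementation; letting MPI_Dims_create pick the grid shape is an assumption of convenience.

/* Sketch: mapping MPI ranks onto a 3D torus with a Cartesian
 * communicator (generic MPI; grid dimensions are illustrative). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int dims[3] = {0, 0, 0};    /* zeros let MPI choose a balanced grid */
    int periods[3] = {1, 1, 1}; /* wraparound links, as on a torus */
    int nprocs, rank, coords[3];

    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Dims_create(nprocs, 3, dims);

    MPI_Comm torus;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &torus);

    MPI_Comm_rank(torus, &rank);
    MPI_Cart_coords(torus, rank, 3, coords);

    /* Nearest neighbors along the x dimension. */
    int left, right;
    MPI_Cart_shift(torus, 0, 1, &left, &right);

    printf("rank %d at (%d,%d,%d), x-neighbors %d/%d\n",
           rank, coords[0], coords[1], coords[2], left, right);

    MPI_Comm_free(&torus);
    MPI_Finalize();
    return 0;
}

Matching the logical communicator layout to the physical torus keeps nearest-neighbor exchanges on short hardware paths, which is the general spirit of the architecture-aware MPI work the paper describes.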
IBM Journal of Research and Development | 2008
Yuan-Ping Pang; Timothy J. Mullins; Brent Allen Swartz; Jeff S. McAllister; Brian E. Smith; Charles J. Archer; Roy Glenn Musselman; Amanda Peters; Brian Paul Wallenfelt; Kurt Walter Pinnow
EUDOC™ is a molecular docking program that has successfully helped to identify new drug leads. This virtual screening (VS) tool identifies drug candidates by computationally testing the binding of candidate compounds to biologically important protein targets. This approach can reduce the research time required of biochemists, accelerating the identification of therapeutically useful drugs and helping to transfer discoveries from the laboratory to the patient. Migration of the EUDOC application code to the IBM Blue Gene/L™ (BG/L) supercomputer has been highly successful. This migration led to a 200-fold improvement in elapsed time for a representative VS application benchmark. Three focus areas provided benefits. First, we enhanced the performance of serial code through application redesign, hand-tuning, and increased use of SIMD (single-instruction, multiple-data) floating-point unit operations. Second, we studied computational load-balancing schemes to maximize processor utilization and application scalability for the massively parallel architecture of the BG/L system. Third, we greatly enhanced system I/O interaction design. We also identified and resolved severe performance bottlenecks, allowing for efficient performance on more than 4,000 processors. This paper describes specific improvements in each of the areas of focus.
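As a rough illustration of the kind of load-balancing scheme the abstract alludes to (EUDOC's actual scheme is not reproduced here), the sketch below distributes hypothetical docking tasks dynamically from a master rank to worker ranks, so that fast and slow tasks even out across processors. The function score_ligand, the constant N_TASKS, and the tag names are all invented placeholders.

/* Sketch of a dynamic master-worker load-balancing scheme of the
 * general kind a virtual-screening run might use (illustrative only).
 * Rank 0 hands out "ligand" indices; workers request more as they finish. */
#include <mpi.h>

#define N_TASKS 1000   /* hypothetical number of ligands to screen */
#define TAG_WORK 1
#define TAG_DONE 2

static void score_ligand(int id) { (void)id; /* placeholder docking work */ }

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                       /* master */
        int next = 0, active = size - 1, req;
        MPI_Status st;
        while (active > 0) {
            /* A worker signals readiness by sending its rank. */
            MPI_Recv(&req, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (next < N_TASKS) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {                       /* no work left: retire worker */
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_DONE,
                         MPI_COMM_WORLD);
                active--;
            }
        }
    } else {                               /* worker */
        int task;
        MPI_Status st;
        for (;;) {
            MPI_Send(&rank, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_DONE) break;
            score_ligand(task);            /* do the docking computation */
        }
    }

    MPI_Finalize();
    return 0;
}

Dynamic hand-out of work like this is one standard way to keep thousands of processors busy when individual task times vary, which is the utilization problem the paper's load-balancing study addresses.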
Archive | 1995
Daniel Manual Dias; Randy L. Egan; Roy Louis Hoffman; Richard P. King; Kurt Walter Pinnow; Christos A. Polyzois
Archive | 2000
Abdo Esmail Abdo; Kevin James Kathmann; Kurt Walter Pinnow
Archive | 2007
Thomas M. Gooding; David L. Hermsmeier; Roy Glenn Musselman; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz
Archive | 2007
Charles J. Archer; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz
Archive | 2007
Charles J. Archer; Roy Glenn Musselman; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz; Brian Paul Wallenfelt
Archive | 2006
Charles J. Archer; Roy Glenn Musselman; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz; Brian Paul Wallenfelt
Archive | 2007
Charles J. Archer; Roy Glenn Musselman; Amanda Peters; Kurt Walter Pinnow; Brent Allen Swartz