Craig Edward Rasmussen
Los Alamos National Laboratory
Publications
Featured research published by Craig Edward Rasmussen.
International Journal of Parallel Programming | 2003
Richard L. Graham; Sung-Eun Choi; David Daniel; Nehal N. Desai; Ronald Minnich; Craig Edward Rasmussen; L. Dean Risinger; Mitchel W. Sukalski
The Los Alamos Message Passing Interface (LA-MPI) is an end-to-end network-failure-tolerant message-passing system designed for terascale clusters. LA-MPI is a standard-compliant implementation of MPI designed to tolerate network-related failures including I/O bus errors, network card errors, and wire-transmission errors. This paper details the distinguishing features of LA-MPI, including support for concurrent use of multiple types of network interface, and reliable message transmission utilizing multiple network paths and routes between a given source and destination. In addition, performance measurements on production-grade platforms are presented.
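Because LA-MPI is standard-compliant, application code written against the MPI interface needs no changes to benefit from its fault tolerance. As a purely illustrative sketch (written here with the mpi4py Python bindings, which are not part of LA-MPI itself), a typical point-to-point exchange looks like this:

```python
# Minimal point-to-point exchange; under a fault-tolerant MPI such as
# LA-MPI, retransmission over alternate network paths is transparent
# to application code like this. Run with: mpirun -np 2 python exchange.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"step": 1, "payload": [1.0, 2.0, 3.0]}, dest=1, tag=0)
elif rank == 1:
    data = comm.recv(source=0, tag=0)
    print("rank 1 received:", data)
```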
Conference on High Performance Computing (Supercomputing) | 2000
Kathleen Lindlan; Janice E. Cuny; Allen D. Malony; Sameer Shende; Reid Rivenburgh; Craig Edward Rasmussen; Bernd Mohr
The developers of high-performance scientific applications often work in complex computing environments that place heavy demands on program analysis tools. The developers need tools that interoperate, are portable across machine architectures, and provide source-level feedback. In this paper, we describe a tool framework, the Program Database Toolkit (PDT), that supports the development of program analysis tools meeting these requirements. PDT uses compile-time information to create a complete database of high-level program information that is structured for well-defined and uniform access by tools and applications. PDT’s current applications make heavy use of advanced features of C++, in particular, templates. We describe the toolkit, focussing on its most important contribution -- its handling of templates -- as well as its use in existing applications.
International Parallel and Distributed Processing Symposium | 2009
Gregory R. Watson; Craig Edward Rasmussen; Beth R. Tibbitts
The development of parallel applications is becoming increasingly important to a broad range of industries. Traditionally, parallel programming was a niche area that was primarily exploited by scientists trying to model extremely complicated physical phenomena. It is becoming increasingly clear, however, that continued hardware performance improvements through clock scaling and feature-size reduction are simply not going to be achievable for much longer. The hardware vendors' approach to addressing this issue is to employ parallelism through multi-processor and multi-core technologies. While there is little doubt that this approach produces scaling improvements, there are still many significant hurdles to be overcome before parallelism can be employed as a general replacement for more traditional programming techniques. The Parallel Tools Platform (PTP) Project was created in 2005 in an attempt to provide developers with new tools aimed at addressing some of the parallel development issues. Since then, the introduction of a new generation of peta-scale and multi-core systems has highlighted the need for such a platform. In this paper, we describe some of the challenges facing parallel application developers, present the current state of PTP, and provide a simple case study that demonstrates how PTP can be used to locate a potential deadlock situation in an MPI code.
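To make the deadlock scenario concrete, consider the classic head-to-head blocking send: each rank waits for the other to post a receive that never comes. The sketch below uses the mpi4py bindings for brevity (an illustrative assumption; PTP itself targets C/C++ and Fortran MPI codes):

```python
# Potential deadlock: both ranks issue a blocking synchronous send before
# posting a receive, so neither send can complete and the program hangs.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                     # assumes exactly two ranks

out = np.ones(1_000_000, dtype='d')
buf = np.zeros_like(out)

comm.Ssend(out, dest=peer, tag=0)   # synchronous send: blocks until matched
comm.Recv(buf, source=peer, tag=0)  # never reached on either rank
```

Reordering the operations on one rank, or switching to nonblocking Isend/Irecv, breaks the cycle; an analysis tool can flag the mismatched communication order before the code hangs at scale.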
Concurrency and Computation: Practice and Experience | 2005
Allen D. Malony; Sameer Shende; Nick Trebon; Jaideep Ray; Robert C. Armstrong; Craig Edward Rasmussen; Matthew J. Sottile
This work targets the emerging use of software component technology for high-performance scientific parallel and distributed computing. While component software engineering will benefit the construction of complex science applications, its use presents several challenges to performance measurement, analysis, and optimization. The performance of a component application depends on the interaction (possibly nonlinear) of the composed component set. Furthermore, a component is a 'binary unit of composition' and the only information users have is the interface the component provides to the outside world. A performance engineering methodology and development approach is presented to address evaluation and optimization issues in high-performance component environments. We describe a prototype implementation of a performance measurement infrastructure for the Common Component Architecture (CCA) system. A case study demonstrating the use of this technology for integrated measurement, monitoring, and optimization in CCA component-based applications is given.
Applied Imagery Pattern Recognition Workshop | 2009
Steven P. Brumby; Garrett T. Kenyon; Will Landecker; Craig Edward Rasmussen; Sriram Swaminarayan; Luís M. A. Bettencourt
Neuroscience has revealed many properties of neurons and of the functional organization of visual cortex that are believed to be essential to human vision, but are missing in standard artificial neural networks. Equally important may be the sheer scale of visual cortex, requiring ~1 petaflop of computation, while the scale of human visual experience greatly exceeds standard computer vision datasets: the retina delivers ~1 petapixel/year to the brain, driving learning at many levels of the cortical system. We describe work at Los Alamos National Laboratory (LANL) to develop large-scale functional models of visual cortex on LANL's Roadrunner petaflop supercomputer. An initial run of a simple region V1 code achieved 1.144 petaflops during trials at the IBM facility in Poughkeepsie, NY (June 2008). Here, we present criteria for assessing when a set of learned local representations is "complete", along with general criteria for assessing computer vision models based on their projected scaling behavior. Finally, we extend one class of biologically-inspired learning models to problems of remote sensing imagery.
PLOS Computational Biology | 2011
Vadas Gintautas; Michael I. Ham; Benjamin S. Kunsberg; Shawn Barr; Steven P. Brumby; Craig Edward Rasmussen; John S. George; Ilya Nemenman; Luís M. A. Bettencourt; Garrett T. Kenyon
Can lateral connectivity in the primary visual cortex account for the time dependence and intrinsic task difficulty of human contour detection? To answer this question, we created a synthetic image set that prevents sole reliance on either low-level visual features or high-level context for the detection of target objects. Rendered images consist of smoothly varying, globally aligned contour fragments (amoebas) distributed among groups of randomly rotated fragments (clutter). The time course and accuracy of amoeba detection by humans was measured using a two-alternative forced choice protocol with self-reported confidence and variable image presentation time (20-200 ms), followed by an image mask optimized so as to interrupt visual processing. Measured psychometric functions were well fit by sigmoidal functions with exponential time constants of 30-91 ms, depending on amoeba complexity. Key aspects of the psychophysical experiments were accounted for by a computational network model, in which simulated responses across retinotopic arrays of orientation-selective elements were modulated by cortical association fields, represented as multiplicative kernels computed from the differences in pairwise edge statistics between target and distractor images. Comparing the experimental and the computational results suggests that each iteration of the lateral interactions takes at least ms of cortical processing time. Our results provide evidence that cortical association fields between orientation selective elements in early visual areas can account for important temporal and task-dependent aspects of the psychometric curves characterizing human contour perception, with the remaining discrepancies postulated to arise from the influence of higher cortical areas.
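For readers unfamiliar with the fitting form, a saturating exponential is one simple way to write a "sigmoidal function with an exponential time constant"; the paper's exact parameterization may differ, so the following is only an assumed illustration of how accuracy rises from chance toward an asymptote as presentation time increases:

```latex
% Assumed illustrative form: accuracy P(t) rises from chance level P_0
% toward asymptote P_inf with time constant tau (reported range 30-91 ms).
P(t) = P_{\infty} - \left(P_{\infty} - P_{0}\right)\, e^{-(t - t_{0})/\tau},
\qquad 30~\mathrm{ms} \lesssim \tau \lesssim 91~\mathrm{ms}
```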
The Journal of Supercomputing | 2006
Christopher D. Rickett; Sung-Eun Choi; Craig Edward Rasmussen; Matthew J. Sottile
In this paper, we describe a Python-based framework for the rapid prototyping of scientific applications. A case study was performed using a problem specification developed for Marmot, a project at the Los Alamos National Laboratory aimed at re-factoring standard physics codes into reusable and extensible components. Components were written in Python, ZPL, Fortran, and C++ following the Marmot component design. We evaluate our solution both qualitatively and quantitatively by comparing it to a single-language version written in C.
Journal of Chemical Physics | 2004
Xiongce Zhao; J. Karl Johnson; Craig Edward Rasmussen
We present the first molecular simulations of the vapor-liquid surface tension of quantum liquids. The path integral formalism of Feynman was used to account for the quantum mechanical behavior of both the liquid and the vapor. A replica-data parallel algorithm was implemented to achieve good parallel performance of the simulation code on at least 32 processors. We have computed the surface tension and the vapor-liquid phase diagram of pure hydrogen over the temperature range 18-30 K and pure deuterium from 19 to 34 K. The simulation results for surface tension and vapor-liquid orthobaric densities are in very good agreement with experimental data. We have computed the interfacial properties of hydrogen-deuterium mixtures over the entire concentration range at 20.4 and 24 K. The calculated equilibrium compositions of the mixtures are in excellent agreement with experimental data. The computed mixture surface tension shows negative deviations from ideal solution behavior, in agreement with experimental data and predictions from Prigogine's theory. The magnitude of the deviations at 20.4 K is substantially larger in the simulations and the theory than in the experiments. We conclude that the experimentally measured mixture surface tension values are systematically too high. Analysis of the concentration profiles in the interfacial region shows that the nonideal behavior can be described entirely by segregation of H2 to the interface, indicating that H2 acts as a surfactant in H2-D2 mixtures.
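To clarify what "negative deviations from ideal solution behavior" means here, the ideal reference is commonly taken as the mole-fraction-weighted average of the pure-component surface tensions (this linear reference is a conventional assumption, not a detail stated in the abstract):

```latex
% Negative deviation: the mixture surface tension falls below the
% linear (ideal-mixing) reference built from the pure components.
\gamma_{\mathrm{ideal}} = x_{\mathrm{H_2}}\,\gamma_{\mathrm{H_2}} + x_{\mathrm{D_2}}\,\gamma_{\mathrm{D_2}},
\qquad
\Delta\gamma = \gamma_{\mathrm{mix}} - \gamma_{\mathrm{ideal}} < 0
```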
European Conference on Parallel Processing | 2004
Craig Edward Rasmussen; Matthew J. Sottile; Jarek Nieplocha; Robert W. Numrich; Eric Jones
A parallel extension to the Python language is introduced that is modeled after the Co-Array Fortran extensions to Fortran 95. A new Python module, CoArray, has been developed to provide co-array syntax that allows a Python programmer to address co-array data on a remote processor. An example of Jacobi iteration using the CoArray module is shown and corresponding performance results are presented.
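The local computation in such a Jacobi solver is easy to sketch. The code below shows only a serial NumPy update for the interior points, with a comment marking where a co-array version would exchange boundary (halo) rows with neighboring processors; the CoArray module's actual syntax is not reproduced here:

```python
# Serial sketch of one Jacobi sweep on a 2-D grid (NumPy).
# In a co-array version, the halo exchange marked below would read
# boundary rows from neighboring images instead of being a no-op.
import numpy as np

def jacobi_sweep(u):
    """Return a new grid where each interior point is the average of
    its four neighbors; boundary values are left unchanged."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

u = np.zeros((64, 64))
u[0, :] = 1.0                      # fixed boundary condition
for _ in range(200):
    # (co-array version: exchange halo rows with neighbors here)
    u = jacobi_sweep(u)
```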
Computational Science and Engineering | 2013
Matthew J. Sottile; Craig Edward Rasmussen; Wayne Weseloh; Robert W. Robey; Daniel J. Quinlan; Jeffrey Overbey
Emerging GPU architectures for high performance computing are well suited to a data-parallel programming model. This paper presents preliminary work examining a programming methodology that provides Fortran programmers with access to these emerging systems. We use array constructs in Fortran to show how this infrequently exploited, standardised language feature is easily transformed to lower-level accelerator code. The transformations in ForOpenCL are based on a simple mapping from Fortran to OpenCL. We demonstrate, using a stencil code solving the shallow-water fluid equations, that the performance of the ForOpenCL compiler-generated transformations is comparable with that of hand-optimised OpenCL code.
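The key idea, whole-array expressions instead of explicit loops, carries over to other array languages. As a loose analogy in NumPy (not the Fortran source or the generated OpenCL from the paper, and only a toy stencil rather than the shallow-water solver), a stencil update can be written as a single array expression whose loop-free, element-wise form is what makes the mapping to a GPU kernel straightforward:

```python
# NumPy analogy of a Fortran array-syntax stencil: one whole-array
# expression with no explicit loops, so every output element can be
# computed independently -- the property a translator exploits when
# emitting a data-parallel accelerator kernel.
import numpy as np

h = np.random.rand(258, 258)   # toy field with a 1-cell halo

# Average of the four neighbors of each interior cell, written as
# shifted slices (roughly what Fortran array sections express).
h_new = 0.25 * (h[:-2, 1:-1] + h[2:, 1:-1] +
                h[1:-1, :-2] + h[1:-1, 2:])
```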