Michael L. Welcome
Lawrence Berkeley National Laboratory
Publications
Featured research published by Michael L. Welcome.
9th Computational Fluid Dynamics Conference | 1989
John Bell; Phillip Colella; John A. Trangenstein; Michael L. Welcome
This paper describes an adaptive mesh refinement algorithm for unsteady gas dynamics. The algorithm is based on an unsplit, second-order Godunov integration scheme for logically rectangular moving quadrilateral grids. The integration scheme is conservative and provides a robust, high-resolution discretization of the equations of gas dynamics for problems with strong nonlinearities. The integration scheme is coupled to a local adaptive mesh refinement algorithm that dynamically adjusts the location of refined grid patches to preserve the accuracy of the solution; conservation is maintained at interfaces between coarse and fine grids, and the geometry of the fine grids is adjusted so that grid lines remain smooth under refinement. Numerical results are presented illustrating the performance of the algorithm.
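As a rough illustration of the refinement step described above, the sketch below flags cells where a density jump exceeds a threshold and groups them into patches. The gradient criterion, the function names, and the one-dimensional layout are simplifying assumptions made for this example, not the error estimator or grid generator used in the paper.

/* Minimal sketch of the cell-flagging step that drives local mesh
 * refinement: cells whose normalized density jump exceeds a threshold
 * are marked, and contiguous runs of marked cells become patches.
 * The criterion and 1-D layout are illustrative assumptions only. */
#include <math.h>
#include <stdio.h>

/* Flag interior cells where the normalized density jump is large. */
static void flag_cells(const double *rho, int n, double tol, int *flag)
{
    for (int i = 0; i < n; i++)
        flag[i] = 0;
    for (int i = 1; i < n - 1; i++) {
        double jump = fabs(rho[i + 1] - rho[i - 1]) / fmax(fabs(rho[i]), 1e-12);
        if (jump > tol)
            flag[i] = 1;
    }
}

/* Group contiguous runs of flagged cells into patches [lo, hi]. */
static int build_patches(const int *flag, int n, int *lo, int *hi, int max_patches)
{
    int np = 0;
    for (int i = 0; i < n && np < max_patches; i++) {
        if (!flag[i]) continue;
        lo[np] = i;
        while (i + 1 < n && flag[i + 1]) i++;
        hi[np++] = i;
    }
    return np;
}

int main(void)
{
    enum { N = 64 };
    double rho[N];
    int flag[N], lo[N], hi[N];

    /* Smooth background with a sharp jump, mimicking a shock. */
    for (int i = 0; i < N; i++)
        rho[i] = (i < N / 2) ? 1.0 : 0.125;

    flag_cells(rho, N, 0.1, flag);
    int np = build_patches(flag, N, lo, hi, N);
    for (int p = 0; p < np; p++)
        printf("refine cells [%d, %d]\n", lo[p], hi[p]);
    return 0;
}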
conference on high performance computing (supercomputing) | 2005
Andrew W. Cook; William H. Cabot; Peter L. Williams; Brian Miller; Bronis R. de Supinski; Robert Kim Yates; Michael L. Welcome
We describe Miranda, a massively parallel spectral/compact solver for variable-density incompressible flow, including viscosity and species diffusivity effects. Miranda utilizes FFTs and band-diagonal matrix solvers to compute spatial derivatives to at least 10th-order accuracy. We have successfully ported this communication-intensive application to BlueGene/L and have explored both direct block parallel and transpose-based parallelization strategies for its implicit solvers. We have discovered a mapping strategy that results in virtually perfect scaling of the transpose method up to 65,536 processors of the BlueGene/L machine. Sustained global communication rates in Miranda typically run at 85% of the theoretical peak speed of the BlueGene/L torus network, while sustained communication plus computation speeds reach 2.76 TeraFLOPS. This effort represents the first time that a high-order variable-density incompressible flow solver with species diffusion has demonstrated sustained performance in the TeraFLOPS range.
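The transpose-based strategy can be pictured with a small sketch: block-transpose a distributed array with MPI_Alltoall so that each rank holds complete lines along the formerly distributed dimension, then differentiate locally. A second-order centered difference stands in here for Miranda's 10th-order compact/spectral derivatives, and the grid size and names are illustrative assumptions rather than the production code.

/* Sketch of transpose-based differentiation along a distributed
 * dimension: pack nb x nb blocks, MPI_Alltoall them, unpack into a
 * transposed slab, then apply a local stencil along full lines. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 512   /* global grid size; assumed divisible by the rank count */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (N % nprocs != 0) MPI_Abort(MPI_COMM_WORLD, 1);
    int nb = N / nprocs;              /* rows owned by this rank */

    double *local   = malloc((size_t)nb * N * sizeof *local);
    double *sendbuf = malloc((size_t)nb * N * sizeof *sendbuf);
    double *recvbuf = malloc((size_t)nb * N * sizeof *recvbuf);
    double *trans   = malloc((size_t)nb * N * sizeof *trans);
    double *deriv   = malloc((size_t)nb * N * sizeof *deriv);

    /* Fill with a linear function of the global row index, f(row) = row. */
    for (int i = 0; i < nb; i++)
        for (int j = 0; j < N; j++)
            local[(size_t)i * N + j] = (double)(rank * nb + i);

    /* Pack nb x nb blocks: block k carries the columns owned by rank k. */
    for (int k = 0; k < nprocs; k++)
        for (int i = 0; i < nb; i++)
            for (int j = 0; j < nb; j++)
                sendbuf[(size_t)k * nb * nb + (size_t)i * nb + j] =
                    local[(size_t)i * N + k * nb + j];

    MPI_Alltoall(sendbuf, nb * nb, MPI_DOUBLE,
                 recvbuf, nb * nb, MPI_DOUBLE, MPI_COMM_WORLD);

    /* Unpack: trans[a][b] holds global column rank*nb+a at global row b. */
    for (int k = 0; k < nprocs; k++)
        for (int i = 0; i < nb; i++)
            for (int a = 0; a < nb; a++)
                trans[(size_t)a * N + k * nb + i] =
                    recvbuf[(size_t)k * nb * nb + (size_t)i * nb + a];

    /* Differentiate along the formerly distributed (row) dimension. */
    double dx = 1.0 / N;
    for (int a = 0; a < nb; a++) {
        deriv[(size_t)a * N] = deriv[(size_t)a * N + N - 1] = 0.0;
        for (int b = 1; b < N - 1; b++)
            deriv[(size_t)a * N + b] =
                (trans[(size_t)a * N + b + 1] - trans[(size_t)a * N + b - 1]) / (2.0 * dx);
    }

    if (rank == 0)   /* f(row) = row, so df/dx = 1/dx = N everywhere */
        printf("derivative at interior point: %g (expected %d)\n", deriv[1], N);

    free(local); free(sendbuf); free(recvbuf); free(trans); free(deriv);
    MPI_Finalize();
    return 0;
}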
Archive | 2009
Dan Bonachea; Paul Hargrove; Michael L. Welcome; Katherine A. Yelick
Partitioned Global Address Space (PGAS) languages are an emerging alternative to MPI for HPC application development. The GASNet library from Lawrence Berkeley National Lab and the University of California at Berkeley provides the network runtime for multiple implementations of four PGAS languages: Unified Parallel C (UPC), Co-Array Fortran (CAF), Titanium, and Chapel. GASNet provides a low-overhead one-sided communication layer that has enabled portability and high performance of PGAS languages. This paper describes our experiences porting GASNet to the Portals network API on the Cray XT series.
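The one-sided model that GASNet exposes to PGAS runtimes can be sketched with standard MPI RMA, used here instead of GASNet's own put/get API purely for illustration: the initiating rank writes directly into memory the target has exposed, with no matching receive posted on the target side. Window size and payload values are arbitrary for the example.

/* One-sided put into a remote window; the target takes no active part
 * in the transfer beyond exposing its buffer.  Run with >= 2 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs < 2) MPI_Abort(MPI_COMM_WORLD, 1);   /* needs at least 2 ranks */

    double buf[4] = {0.0, 0.0, 0.0, 0.0};
    MPI_Win win;
    /* Every rank exposes buf as a remotely accessible segment. */
    MPI_Win_create(buf, sizeof buf, sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);               /* open access epoch */
    if (rank == 0) {
        double payload[4] = {1.0, 2.0, 3.0, 4.0};
        /* One-sided write into rank 1's window; rank 1 does nothing. */
        MPI_Put(payload, 4, MPI_DOUBLE, 1, 0, 4, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);               /* close epoch: data now visible */

    if (rank == 1)
        printf("rank 1 received %g %g %g %g\n", buf[0], buf[1], buf[2], buf[3]);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}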
ieee international conference on high performance computing data and analytics | 2008
Bronis R. de Supinski; Martin Schulz; Vasily V. Bulatov; William H. Cabot; Bor Chan; Andrew W. Cook; Erik W. Draeger; James N. Glosli; Jeffrey Greenough; Keith Henderson; Alison Kubota; Steve Louis; Brian Miller; Mehul Patel; Thomas E. Spelce; Frederick H. Streitz; Peter L. Williams; Robert Kim Yates; Andy Yoo; George S. Almasi; Gyan Bhanot; Alan Gara; John A. Gunnels; Manish Gupta; José E. Moreira; James C. Sexton; Bob Walkup; Charles J. Archer; Francois Gygi; Timothy C. Germann
BlueGene/L (BG/L), developed through a partnership between IBM and Lawrence Livermore National Laboratory (LLNL), is currently the world's largest system both in terms of scale, with 131,072 processors, and absolute performance, with a peak rate of 367 Tflop/s. BG/L has led the last four Top500 lists with a Linpack rate of 280.6 Tflop/s for the full machine installed at LLNL and is expected to remain the fastest computer for the next few editions. However, the real value of a machine such as BG/L derives from the scientific breakthroughs that real applications can produce by successfully using its unprecedented scale and computational power. In this paper, we describe our experiences with eight large-scale applications on BG/L from several application domains, ranging from molecular dynamics to dislocation dynamics and from turbulence simulations to searches in semantic graphs. We also discuss the challenges we faced when scaling these codes and present several successful optimization techniques. All applications show excellent scaling behavior, even at very large processor counts, with one code even achieving a sustained performance of more than 100 Tflop/s, clearly demonstrating the real success of the BG/L design.
8th Computational Fluid Dynamics Conference | 1987
John Bell; Phillip Colella; John A. Trangenstein; Michael L. Welcome
This paper describes some adaptive techniques suitable for modeling high Mach number reacting flows. Two basic types of methods are considered: adaptive mesh techniques and front tracking. Numerical results are described using one of these methods, local mesh refinement, for a simple model of planar detonation. The computational results show the formation of Mach triple points in the detonation fronts and provide an initial step toward understanding the factors that influence their spacing.
computing frontiers | 2006
Michael L. Welcome; Charles A. Rendleman; Leonid Oliker; Rupak Biswas
Adaptive mesh refinement (AMR) is a powerful technique that reduces the resources necessary to solve otherwise intractable problems in computational science. The AMR strategy solves the problem on a relatively coarse grid and dynamically refines it in regions requiring higher resolution. However, AMR codes tend to be far more complicated than their uniform-grid counterparts due to the software infrastructure necessary to dynamically manage the hierarchical grid framework. Despite this complexity, it is generally believed that future multi-scale applications will increasingly rely on adaptive methods to study problems at unprecedented scale and resolution. Recently, a new generation of parallel-vector architectures has become available that promises to achieve extremely high sustained performance for a wide range of applications, and these architectures are the foundation of many leadership-class computing systems worldwide. It is therefore imperative to understand the tradeoffs between conventional scalar and parallel-vector platforms for solving AMR-based calculations. In this paper, we examine the HyperCLaw AMR framework to compare and contrast performance on the Cray X1E, IBM Power3 and Power5, and SGI Altix. To the best of our knowledge, this is the first work that investigates and characterizes the performance of an AMR calculation on modern parallel-vector systems.
conference on high performance computing (supercomputing) | 2006
Dan Bonachea; Paul Hargrove; Rajesh Nishtala; Michael L. Welcome; Katherine A. Yelick
Optimized collective operations are a crucial performance factor for many scientific applications. This work investigates the design and optimization of collectives in the context of Partitioned Global Address Space (PGAS) languages such as Unified Parallel C (UPC). Languages with one-sided communication permit a more flexible and expressive collective interface with application code, in turn enabling more aggressive optimization and more effective utilization of system resources. We investigate the design tradeoffs in a collectives implementation for UPC, ranging from resource management to synchronization mechanisms and target-dependent selection of optimal communication patterns. Our collectives are implemented in the Berkeley UPC compiler using the GASNet communication system, tuned across a wide variety of supercomputing platforms, and benchmarked against MPI collectives. Special emphasis is placed on the newly added Cray XT3 backend for UPC, whose characteristics are benchmarked in detail.
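As an illustration of the kind of pattern selection mentioned above, the sketch below implements a binomial-tree broadcast over point-to-point messages. It is a generic textbook pattern written against plain MPI, not the Berkeley UPC/GASNet collectives themselves, and the function name and payload are assumptions made for the example.

/* Binomial-tree broadcast: each round doubles the number of ranks
 * holding the data, giving O(log P) steps for small payloads. */
#include <mpi.h>
#include <stdio.h>

static void tree_bcast(double *buf, int count, int root, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int rel = (rank - root + size) % size;   /* rank relative to the root */

    /* Receive once from the parent in the binomial tree. */
    int mask = 1;
    while (mask < size) {
        if (rel & mask) {
            int src = (rank - mask + size) % size;
            MPI_Recv(buf, count, MPI_DOUBLE, src, 0, comm, MPI_STATUS_IGNORE);
            break;
        }
        mask <<= 1;
    }
    /* Forward to children at decreasing distances. */
    mask >>= 1;
    while (mask > 0) {
        if (rel + mask < size) {
            int dst = (rank + mask) % size;
            MPI_Send(buf, count, MPI_DOUBLE, dst, 0, comm);
        }
        mask >>= 1;
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double value = (rank == 0) ? 42.0 : 0.0;
    tree_bcast(&value, 1, 0, MPI_COMM_WORLD);
    printf("rank %d has value %g\n", rank, value);

    MPI_Finalize();
    return 0;
}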
Technical report | 2003
Gregory F. Butler; Rei Chi Lee; Michael L. Welcome
The Global Unified Parallel File System (GUPFS) project is a multiple-phase, five-year project at the National Energy Research Scientific Computing (NERSC) Center to provide a scalable, high performance, high bandwidth, shared file system for all the NERSC production computing and support systems. The primary purpose of the GUPFS project is to make it easier to conduct advanced scientific research using the NERSC systems. This is to be accomplished through the use of a shared file system providing a unified file namespace, operating on consolidated shared storage that is directly accessed by all the NERSC production computing and support systems. During its first year, FY 2002, the GUPFS project focused on identifying, testing, and evaluating existing and emerging shared/cluster file system, SAN fabric, and storage technologies; identifying NERSC user input/output (I/O) requirements, methods, and mechanisms; and developing appropriate benchmarking methodologies and benchmark codes for a parallel environment. This report presents the activities and progress of the GUPFS project during its first year, the results of the evaluations conducted, and plans for near-term and longer-term investigations.
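A flavor of the benchmarking work described above can be given with a minimal parallel-write microbenchmark: every rank writes a disjoint block of one shared file and the aggregate bandwidth is reported. The use of MPI-IO, the file name, and the block size are illustrative assumptions and not the actual GUPFS benchmark codes.

/* Minimal shared-file write-bandwidth sketch: each rank writes a
 * disjoint contiguous region of one file with collective MPI-IO. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_BYTES (16 * 1024 * 1024)   /* 16 MiB per rank (arbitrary) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    char *buf = malloc(BLOCK_BYTES);
    for (int i = 0; i < BLOCK_BYTES; i++)
        buf[i] = (char)(rank + i);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "gupfs_bench.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    /* Each rank writes a disjoint, contiguous region of the shared file. */
    MPI_Offset offset = (MPI_Offset)rank * BLOCK_BYTES;
    MPI_File_write_at_all(fh, offset, buf, BLOCK_BYTES, MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("aggregate write bandwidth: %.1f MiB/s\n",
               (double)nprocs * BLOCK_BYTES / (1024.0 * 1024.0) / (t1 - t0));

    free(buf);
    MPI_Finalize();
    return 0;
}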
Journal of Computational Physics | 1999
Mark Sussman; Ann S. Almgren; John B. Bell; Phillip Colella; Louis H. Howell; Michael L. Welcome
Journal of Computational Physics | 1998
Ann S. Almgren; John B. Bell; Phillip Colella; Louis H. Howell; Michael L. Welcome