Scott B. Baden

University of California, San Diego

Publications

Featured research published by Scott B. Baden.

SIAM Journal on Scientific Computing | 2008

Fast Monte Carlo Simulation Methods for Biological Reaction-Diffusion Systems in Solution and on Surfaces

Rex Kerr; Thomas M. Bartol; Boris Kaminsky; Markus Dittrich; Jen-Chien Jack Chang; Scott B. Baden; Terrence J. Sejnowski; Joel R. Stiles

Many important physiological processes operate at time and space scales far beyond those accessible to atom-realistic simulations, and yet discrete stochastic rather than continuum methods may best represent finite numbers of molecules interacting in complex cellular spaces. We describe and validate new tools and algorithms developed for a new version of the MCell simulation program (MCell3), which supports generalized Monte Carlo modeling of diffusion and chemical reaction in solution, on surfaces representing membranes, and combinations thereof. A new syntax for describing the spatial directionality of surface reactions is introduced, along with optimizations and algorithms that can substantially reduce computational costs (e.g., event scheduling, variable time and space steps). Examples for simple reactions in simple spaces are validated by comparison to analytic solutions. Thus we show how spatially realistic Monte Carlo simulations of biological systems can be far more cost-effective than often is assumed, and provide a level of accuracy and insight beyond that of continuum methods.

IEEE Transactions on Parallel and Distributed Systems | 1996

Dynamic partitioning of non-uniform structured workloads with spacefilling curves

John R. Pilkington; Scott B. Baden

We discuss inverse spacefilling partitioning (ISP), a partitioning strategy for non-uniform scientific computations running on distributed memory MIMD parallel computers. We consider the case of a dynamic workload distributed on a uniform mesh, and compare ISP against orthogonal recursive bisection (ORB) and a median of medians variant of ORB, ORB-MM. We present two results. First, ISP and ORB-MM are superior to ORB in rendering balanced workloads, because they are more fine-grained, and incur communication overheads that are comparable to ORB. Second, ISP is more attractive than ORB-MM from a software engineering standpoint because it avoids elaborate bookkeeping. Whereas ISP partitionings can be described succinctly as logically contiguous segments of the line, ORB-MM's partitionings are inherently unstructured. We describe the general d-dimensional ISP algorithm and report empirical results with two- and three-dimensional, non-hierarchical particle methods.
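The idea of describing a partitioning as contiguous segments of a line can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a Morton (Z-order) curve, which the abstract does not specify, it handles only two dimensions, and the names `morton_key` and `partition_by_curve` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of (x, y) to get a position along a Z-order curve.

    Assumption: a Morton curve stands in for whatever spacefilling curve
    the paper actually uses.
    """
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b)
        key |= ((y >> b) & 1) << (2 * b + 1)
    return key


def partition_by_curve(work, nparts):
    """Cut the curve into nparts contiguous segments of roughly equal work.

    work: dict mapping (x, y) mesh cells to a non-negative work estimate.
    Returns a dict mapping each cell to a part index in range(nparts).
    """
    cells = sorted(work, key=lambda c: morton_key(*c))
    total = sum(work.values())
    owner = {}
    running = 0.0
    for cell in cells:
        # Place the cell according to the midpoint of its work interval
        # on the cumulative-work line [0, total).
        mid = running + work[cell] / 2.0
        part = min(int(mid * nparts / total), nparts - 1) if total > 0 else 0
        owner[cell] = part
        running += work[cell]
    return owner


if __name__ == "__main__":
    uniform = {(x, y): 1.0 for x in range(4) for y in range(4)}
    owner = partition_by_curve(uniform, 4)
    for part in range(4):
        print(part, sorted(c for c, p in owner.items() if p == part))
```

Because each part is a contiguous run of the sorted cell list, a partitioning is fully described by its cut points along the curve, which is the bookkeeping advantage the abstract attributes to ISP.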

International Conference on Supercomputing | 2011

Mint: realizing CUDA performance in 3D stencil methods with annotated C

Didem Unat; Xing Cai; Scott B. Baden

We present Mint, a programming model that enables the non-expert to enjoy the performance benefits of hand-coded CUDA without becoming entangled in the details. Mint targets stencil methods, which are an important class of scientific applications. We have implemented the Mint programming model with a source-to-source translator that generates optimized CUDA C from traditional C source. The translator relies on annotations to guide translation at a high level. The set of pragmas is small, and the model is compact and simple. Yet, Mint is able to deliver performance competitive with painstakingly hand-optimized CUDA. We show that, for a set of widely used stencil kernels, Mint realized 80% of the performance obtained from aggressively optimized CUDA on the 200 series NVIDIA GPUs. Our optimizations target three-dimensional kernels, which present a daunting array of optimizations.
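For readers unfamiliar with stencil methods, the following is a minimal Python sketch of one sweep of a 7-point three-dimensional stencil, the kind of kernel the abstract refers to. It is plain Python for illustration only: it does not show Mint's annotations or the CUDA that Mint generates, and the function name `stencil_step_3d` and the coefficients `c0` and `c1` are hypothetical.

```python
def stencil_step_3d(u, c0, c1):
    """Apply one 7-point stencil sweep to a 3D grid stored as nested lists.

    Each interior point becomes c0 times itself plus c1 times the sum of
    its six face neighbours. Boundary points are copied unchanged.
    """
    nx, ny, nz = len(u), len(u[0]), len(u[0][0])
    # Write into a fresh grid so every update reads only old values.
    new = [[[u[i][j][k] for k in range(nz)] for j in range(ny)] for i in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                neighbours = (
                    u[i - 1][j][k] + u[i + 1][j][k]
                    + u[i][j - 1][k] + u[i][j + 1][k]
                    + u[i][j][k - 1] + u[i][j][k + 1]
                )
                new[i][j][k] = c0 * u[i][j][k] + c1 * neighbours
    return new


if __name__ == "__main__":
    grid = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    grid[1][1][1] = 1.0  # single point source at the centre
    print(stencil_step_3d(grid, 0.4, 0.1)[1][1][1])
```

Every interior update is independent of the others within a sweep, which is what makes this class of kernel a natural fit for GPUs.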

Field-Programmable Custom Computing Machines | 2010

Accelerating Viola-Jones Face Detection to FPGA-Level Using GPUs

Daniel Hefenbrock; Jason Oberg; Nhat Tan Nguyen Thanh; Ryan Kastner; Scott B. Baden

Face detection is an important aspect for biometrics, video surveillance and human computer interaction. We present a multi-GPU implementation of the Viola-Jones face detection algorithm that meets the performance of the fastest known FPGA implementation. The GPU design offers far lower development costs, but the FPGA implementation consumes less power. We discuss the performance programming required to realize our design, and describe future research directions.

Journal of Parallel and Distributed Computing | 1998

Efficient Run-Time Support for Irregular Block-Structured Applications

Stephen J. Fink; Scott B. Baden; Scott R. Kohn

Parallel implementations of scientific applications often rely on elaborate dynamic data structures with complicated communication patterns. We describe a set of intuitive geometric programming abstractions that simplify coordination of irregular block-structured scientific calculations without sacrificing performance. We have implemented these abstractions in KeLP, a C++ run-time library. KeLP's abstractions enable the programmer to express complicated communication patterns for dynamic applications and to tune communication activity with a high-level, abstract interface. We show that KeLP's flexible communication model effectively manages elaborate data motion patterns arising in structured adaptive mesh refinement and achieves performance comparable to hand-coded message-passing on several structured numerical kernels.

International Workshop on Parallel Algorithms for Irregularly Structured Problems | 1996

Flexible Communication Mechanisms for Dynamic Structured Applications

Stephen J. Fink; Scott B. Baden; Scott R. Kohn

Irregular scientific applications are often difficult to parallelize due to elaborate dynamic data structures with complicated communication patterns. We describe flexible data orchestration abstractions that enable the programmer to express customized communication patterns arising in an important class of irregular computations: adaptive finite difference methods for partial differential equations. These abstractions are supported by KeLP, a C++ run-time library. KeLP enables the programmer to manage spatial data dependence patterns and express data motion handlers as first-class mutable objects. Using two finite difference applications, we show that KeLP's flexible communication model effectively manages elaborate data motion arising in semi-structured adaptive methods.

SIAM Journal on Scientific and Statistical Computing | 1991

Programming Abstractions for Dynamically Partitioning and Coordinating Localized Scientific Calculations Running on Multiprocessors

Scott B. Baden

Certain software abstractions help to automate load balancing during various math-physics calculations on a team of concurrently executing processors. These abstractions have been tested on a vortex method for computational fluid dynamics. Experiments exhibited good parallel speedups of 24 and 3.6, respectively, on 32 processors of the Intel iPSC-1 (a message-passing hypercube architecture) and on 4 processors of a Cray X-MP (a shared-memory vector architecture). The abstractions should apply to diverse applications, including finite difference methods, and to diverse architectures without requiring that the application be reprogrammed extensively for each new architecture.
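The reported speedups can be put on a common scale as parallel efficiency, that is, speedup divided by processor count. The short Python calculation below uses only the figures quoted in the abstract; the helper name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Return speedup per processor (1.0 would be ideal linear scaling)."""
    return speedup / processors


# Speedups and processor counts as quoted in the abstract.
reported = [("Intel iPSC-1", 24.0, 32), ("Cray X-MP", 3.6, 4)]

for machine, speedup, procs in reported:
    eff = parallel_efficiency(speedup, procs)
    print(f"{machine}: speedup {speedup} on {procs} processors, efficiency {eff:.2f}")
```

This gives an efficiency of 0.75 on the iPSC-1 and 0.90 on the X-MP.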

International Conference on Parallel Processing | 1996

Analysis of the numerical effects of parallelism on a parallel genetic algorithm

William E. Hart; Scott B. Baden; Richard K. Belew; Scott R. Kohn

We examine the effects of relaxed synchronization on both the numerical and parallel efficiency of parallel genetic algorithms (GAs). We describe a coarse-grain geographically structured parallel genetic algorithm. Our experiments provide preliminary evidence that asynchronous versions of these algorithms have a lower run-time than synchronous GAs. Our analysis shows that this improvement is due to (1) reduced synchronization costs and (2) higher numerical efficiency (e.g., fewer function evaluations) for the asynchronous GAs. This analysis includes a critique of the utility of traditional parallel performance measures for parallel GAs.

Conference on High Performance Computing (Supercomputing) | 1998

Communication overlap in multi-tier parallel algorithms

Scott B. Baden; Stephen J. Fink

Hierarchically organized multicomputers such as SMP clusters offer new opportunities and new challenges for high-performance computation, but realizing their full potential remains a formidable task. We present a hierarchical model of communication targeted to block-structured, bulk-synchronous applications running on dedicated clusters of symmetric multiprocessors. Our model supports node-level rather than processor-level communication as the fundamental operation, and is optimized for aggregate patterns of regular section moves rather than point-to-point messages. These two capabilities work synergistically. They provide flexibility in overlapping communication and overcome deficiencies in the underlying communication layer on systems where inter-node communication bandwidth is at a premium. We have implemented our communication model in the KeLP2.0 run-time library. We present empirical results for five applications running on a cluster of Digital AlphaServer 2100s. Four of the applications were able to overlap communication on a system which does not support overlap via non-blocking message passing using MPI. Overall performance improvements due to our overlap strategy ranged from 12% to 28%.

IEEE International Conference on High Performance Computing Data and Analytics | 1994

A robust parallel programming model for dynamic non-uniform scientific computations

Scott R. Kohn; Scott B. Baden

LPARX provides efficient run-time support for dynamic, non-uniform scientific calculations running on MIMD distributed memory architectures. It extends HPF's data decomposition model to provide support for dynamic, block irregular data structures. LPARX represents data decompositions as first-class objects and expresses data dependencies in a manner which is logically independent of data decomposition and problem dimension. LPARX applications are portable across a diversity of MIMD machines. We have implemented a number of applications in LPARX, including a 3D particle calculation and 2D and 3D adaptive multigrid solvers, which could not have been efficiently implemented in HPF.
