Craig C. Douglas
University of Wyoming
Publication
Featured research published by Craig C. Douglas.
Architectural Support for Programming Languages and Operating Systems | 1996
James Philbin; Jan Edler; Craig C. Douglas; Kai Li
This paper describes a method to improve the cache locality of sequential programs by scheduling fine-grained threads. The algorithm relies upon hints provided at the time of thread creation to determine a thread execution order likely to reduce cache misses. This technique may be particularly valuable when compiler-directed tiling is not feasible. Experiments with several application programs, on two systems with different cache structures, show that our thread scheduling method can improve program performance by reducing second-level cache misses.
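The idea of ordering threads by creation-time hints can be illustrated with a small sketch. It assumes, hypothetically, that each hint is a memory address the thread will touch and that threads whose hints fall in the same cache-sized block should run back to back. The function name `schedule_by_hints`, the `(name, hints)` tuple layout, and the `block_size` parameter are illustrative choices, not details taken from the paper.

```python
from collections import defaultdict

def schedule_by_hints(threads, block_size):
    """Return thread names in an order that groups nearby hint addresses.

    threads    -- list of (name, hints) pairs; hints is a tuple of addresses
                  supplied when the thread is created (hypothetical layout)
    block_size -- size in bytes of the address block mapped to one bin,
                  assumed here to be chosen relative to the cache size
    """
    bins = defaultdict(list)
    for name, hints in threads:
        # Threads whose hints land in the same block share a bin.
        key = tuple(address // block_size for address in hints)
        bins[key].append(name)

    order = []
    # Visit bins in sorted key order so each block's data is used
    # by all of its threads before moving on to the next block.
    for key in sorted(bins):
        order.extend(bins[key])
    return order

# Usage: creation order interleaves two address regions.
created = [("t0", (0,)), ("t1", (5000,)), ("t2", (10,)), ("t3", (5010,))]
print(schedule_by_hints(created, block_size=4096))
```

In this toy run the scheduler executes the two threads that touch the first block before the two that touch the second, instead of alternating between blocks as the creation order would.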
Mathematics and Computers in Simulation | 2008
Jan Mandel; Lynn S. Bennethum; Jonathan D. Beezley; Janice L. Coen; Craig C. Douglas; Minjeong Kim; Anthony Vodacek
A wildfire model is formulated based on balance equations for energy and fuel, where the fuel loss due to combustion corresponds to the fuel reaction rate. The resulting coupled partial differential equations have coefficients that can be approximated from prior measurements of wildfires. An ensemble Kalman filter technique with regularization is then used to assimilate temperatures measured at selected points into running wildfire simulations. The assimilation technique is able to modify the simulations to track the measurements correctly even if the simulations were started with an erroneous ignition location that is quite far away from the correct one.
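The assimilation step can be sketched with a basic stochastic ensemble Kalman filter update. This is a minimal illustration, not the regularized filter described in the abstract: it omits the regularization term, assumes a single scalar temperature observation of one state component, and uses hypothetical names (`enkf_update`, `obs_index`, `obs_var`).

```python
import random

def enkf_update(ensemble, obs_index, obs_value, obs_var, rng):
    """One stochastic EnKF analysis step for a single scalar observation.

    ensemble  -- list of state vectors (lists of floats), one per member
    obs_index -- index of the state component that is observed directly
    obs_value -- the measured value
    obs_var   -- variance of the measurement error
    rng       -- random.Random instance used to perturb the observation
    """
    n = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(member[i] for member in ensemble) / n for i in range(dim)]

    # Sample covariance of each state component with the observed component.
    cov_xy = [
        sum((member[i] - mean[i]) * (member[obs_index] - mean[obs_index])
            for member in ensemble) / (n - 1)
        for i in range(dim)
    ]
    var_y = cov_xy[obs_index]

    # Kalman gain for a scalar observation: cov(x, y) / (var(y) + r).
    gain = [c / (var_y + obs_var) for c in cov_xy]

    updated = []
    for member in ensemble:
        # Each member assimilates its own perturbed copy of the observation.
        perturbed = obs_value + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed - member[obs_index]
        updated.append([member[i] + gain[i] * innovation for i in range(dim)])
    return updated

# Usage: 20 members, two temperatures each, one accurate measurement.
rng = random.Random(0)
members = [[rng.gauss(300.0, 5.0), rng.gauss(300.0, 5.0)] for _ in range(20)]
analysis = enkf_update(members, obs_index=0, obs_value=350.0, obs_var=1e-6, rng=rng)
```

With a very small `obs_var`, the observed component of every member is pulled almost entirely onto the measurement, while unobserved components move only as far as their sample correlation with the observed one allows.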
IEEE International Conference on High Performance Computing, Data, and Analytics | 2009
Gundolf Haase; Manfred Liebmann; Craig C. Douglas; Gernot Plank
The paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. An efficient parallel sparse matrix-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a single Nvidia Tesla C1060 GPU board delivers the performance of a sixteen-node Infiniband cluster and a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core.
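The structure of a preconditioned conjugate gradient solve built on a sparse matrix-vector product can be sketched serially as follows. This is an assumption-laden illustration: it runs on the CPU in plain Python, stores the matrix in CSR format, and swaps in a simple Jacobi (diagonal) preconditioner where the paper uses algebraic multigrid. The function names are hypothetical.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg_jacobi(values, col_idx, row_ptr, diag, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    diag holds the diagonal of A; dividing by it is the Jacobi
    preconditioner, a stand-in for the AMG preconditioner of the paper.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual for x = 0
    z = [r[i] / diag[i] for i in range(n)]        # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x

# Usage: 3x3 tridiagonal matrix with 2 on the diagonal and -1 off it.
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
solution = pcg_jacobi(values, col_idx, row_ptr, [2.0, 2.0, 2.0], [1.0, 0.0, 1.0])
```

The matrix-vector product dominates each iteration, which is why an efficient parallel version of `csr_matvec` is the natural target when moving such a solver to a GPU.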
Archive | 2003
Craig C. Douglas; Gundolf Haase; Ulrich Langer
From the Publisher: This compact yet thorough tutorial is the perfect introduction to the basic concepts of solving partial differential equations (PDEs) using parallel numerical methods. In just eight short chapters, the authors provide readers with enough basic knowledge of PDEs, discretization methods, solution techniques, parallel computers, parallel programming, and the run-time behavior of parallel algorithms to allow them to understand, develop, and implement parallel PDE solvers. Examples throughout the book are intentionally kept simple so that the parallelization strategies are not dominated by technical details. A Tutorial on Elliptic PDE Solvers and Their Parallelization is a valuable aid for learning about the possible errors and bottlenecks in parallel computing. One of the highlights of the tutorial is that the course material can run on a laptop, not just on a parallel computer or cluster of PCs, thus allowing readers to experience their first successes in parallel computing in a relatively short amount of time. This tutorial is intended for advanced undergraduate and graduate students in computational sciences and engineering; however, it may also be helpful to professionals who use PDE-based parallel computer simulations in the field.
IEEE International Conference on High Performance Computing, Data, and Analytics | 1995
Alexandre Ern; Craig C. Douglas; Mitchell D. Smooke
We present a numerical simulation of an axisymmetric, laminar diffusion flame with finite-rate chemistry on serial and distributed-memory parallel computers. We use the total mass, momentum, energy, and species conservation equations with the compressible Navier-Stokes equations written in vorticity-velocity form. The computational algorithm for solving the resulting nonlinear coupled elliptic partial differential equations involves damped Newton iterations, Krylov-type linear-system solvers, and adaptive mesh refinement. The results presented here are the first in which a lifted diffusion flame structure is obtained on a non-staggered grid. The numerical solution is in very good agreement with previous numerical and experimental data.
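The damping idea behind damped Newton iterations can be shown on a scalar equation. The sketch below is only a toy: the paper applies the technique to large coupled PDE systems with Krylov solvers for the linear steps, whereas this example solves one equation and uses a simple step-halving rule. The function name and the halving strategy are assumptions made for illustration.

```python
import math

def damped_newton(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f using Newton steps shortened until |f| decreases."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            break
        step = -fx / df(x)          # full Newton step
        damping = 1.0
        # Halve the step until the residual actually shrinks.
        while abs(f(x + damping * step)) >= abs(fx) and damping > 1e-8:
            damping *= 0.5
        x += damping * step
    return x

# Usage: undamped Newton diverges for atan(x) from x0 = 3, damped does not.
root = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), 3.0)
```

From this starting point the full Newton step overshoots the root and increases the residual, so the damping loop shortens the first step; once the iterate is close to the root, full steps are accepted again.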
Computing | 2009
Oliver Lass; Michelle Vallejos; Alfio Borzì; Craig C. Douglas
The detailed implementation and analysis of a finite element multigrid scheme for the solution of elliptic optimal control problems are presented. A particular focus is on the definition of smoothing strategies for the case of constrained control problems. For this setting, convergence of the multigrid scheme is discussed based on the BPX framework. Results of numerical experiments are reported to illustrate and validate the optimal efficiency and robustness of the present multigrid strategy.
International Conference on Computational Science | 2004
Jan Mandel; Mingshi Chen; Leopoldo P. Franca; Craig J. Johns; A. Puhalskii; Janice L. Coen; Craig C. Douglas; Robert Kremens; Anthony Vodacek; Wei Zhao
A proposed system for real-time modeling of wildfires is described. The system involves numerical weather and fire prediction, automated data acquisition from Internet sources, and input from aerial photographs and sensors. The system will be controlled by a non-Gaussian ensemble filter capable of assimilating out-of-order data. The computational model will run on remote supercomputers, with visualization on PDAs in the field connected to the Internet via a satellite.
Parallel Algorithms and Applications | 1996
Craig C. Douglas
Multigrid methods combine a number of standard sparse matrix techniques. Usual implementations separate the individual components (e.g., an iterative method, residual computation, and interpolation between grids) into nicely structured routines. However, many computers today employ quite sophisticated and potentially large caches whose correct use is instrumental in gaining much of the peak performance of the processors. We investigate when it makes sense to combine several of the multigrid components into one, using block-oriented algorithms. We determine how large (or small) the blocks must be in order for the data in the block to just fit into the processor's primary cache. By re-using the data in cache several times, a potential savings in run time can be predicted. This is analyzed for a set of examples.
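A back-of-the-envelope version of the block-size question can be written down directly. The sketch assumes a square 2D block, a fixed number of arrays touched per grid point (for example solution, right-hand side, and residual), and a cache that is fully usable by the block; it ignores cache-line size, associativity, and halo points. The function name and parameters are hypothetical and the formula is not taken from the paper.

```python
import math

def largest_square_block(cache_bytes, arrays_per_point, bytes_per_value=8):
    """Largest side length s such that s*s points of all arrays fit in cache."""
    bytes_per_point = arrays_per_point * bytes_per_value
    points_that_fit = cache_bytes // bytes_per_point
    return math.isqrt(points_that_fit)

# Usage: 16 KiB primary cache, three double-precision arrays per grid point.
side = largest_square_block(16 * 1024, arrays_per_point=3)
print(side)
```

For these inputs the block side comes out at 26 points, since 26 x 26 points at 24 bytes each fit in 16 KiB while 27 x 27 do not.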
International Conference on Conceptual Structures | 2007
Jan Mandel; Jonathan D. Beezley; Lynn S. Bennethum; Soham Chakraborty; Janice L. Coen; Craig C. Douglas; Jay Hatcher; Minjeong Kim; Anthony Vodacek
We present an overview of an ongoing project to build a DDDAS that uses all available data for short-term wildfire prediction. The project involves new data assimilation methods to inject data into a running simulation, a physics-based model coupled with weather prediction, on-site data acquisition using sensors that can survive a passing fire, and on-line visualization using Google Earth.
International Conference on Computational Science | 2005
Jan Mandel; Lynn S. Bennethum; Mingshi Chen; Janice L. Coen; Craig C. Douglas; Leopoldo P. Franca; Craig J. Johns; Minjeong Kim; Andrew V. Knyazev; Robert Kremens; Vaibhav V. Kulkarni; Guan Qin; Anthony Vodacek; Jianjia Wu; Wei Zhao; Adam Zornes
We report on an ongoing effort to build a Dynamic Data Driven Application System (DDDAS) for short-range forecast of wildfire behavior from real-time weather data, images, and sensor streams. The system should change the forecast when new data is received. The basic approach is to encapsulate the model code and use an ensemble Kalman filter in time-space. Several variants of the ensemble Kalman filter are presented, for out-of-sequence data assimilation, hidden model states, and highly nonlinear problems. Parallel implementation and web-based visualization are also discussed.