Is this you? Create Your Porfile

James Demmel

University of California, Berkeley

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where James Demmel is active.

Explore More

Publication

Featured researches published by James Demmel.

Archive | 1997

ScaLAPACK users' guide

L. S. Blackford; Jaeyoung Choi; Andrew J. Cleary; Eduardo F. D'Azevedo; James Demmel; Inderjit S. Dhillon; Jack J. Dongarra; Sven Hammarling; Greg Henry; Antoine Petitet; K. Stanley; David Walker; R. C. Whaley

Where you can find the scalapack users guide easily? Is it in the book store? On-line book store? are you sure? Keep in mind that you will find the book in this site. This book is very referred for you because it gives not only the experience but also lesson. The lessons are very valuable to serve for you, thats not about who are reading this scalapack users guide book. It is about this book that will give wellness for all people from many societies.

Archive | 2000

Templates for the solution of algebraic eigenvalue problems: a practical guide

James Demmel; Jack J. Dongarra; Axel Ruhe; Henk A. van der Vorst; Zhaojun Bai

List of symbols and acronyms List of iterative algorithm templates List of direct algorithms List of figures List of tables 1: Introduction 2: A brief tour of Eigenproblems 3: An introduction to iterative projection methods 4: Hermitian Eigenvalue problems 5: Generalized Hermitian Eigenvalue problems 6: Singular Value Decomposition 7: Non-Hermitian Eigenvalue problems 8: Generalized Non-Hermitian Eigenvalue problems 9: Nonlinear Eigenvalue problems 10: Common issues 11: Preconditioning techniques Appendix: of things not treated Bibliography Index .

information processing in sensor networks | 2007

Health monitoring of civil infrastructures using wireless sensor networks

Sukun Kim; Shamim N. Pakzad; David E. Culler; James Demmel; Gregory L. Fenves; Steve Glaser; Martin Turon

A Wireless Sensor Network (WSN) for Structural Health Monitoring (SHM) is designed, implemented, deployed and tested on the 4200 ft long main span and the south tower of the Golden Gate Bridge (GGB). Ambient structural vibrations are reliably measured at a low cost and without interfering with the operation of the bridge. Requirements that SHM imposes on WSN are identified and new solutions to meet these requirements are proposed and implemented. In the GGB deployment, 64 nodes are distributed over the main span and the tower, collecting ambient vibrations synchronously at 1 kHz rate, with less than 10 mus jitter, and with an accuracy of 30 muG. The sampled data is collected reliably over a 46-hop network, with a bandwidth of 441 B/s at the 46th hop. The collected data agrees with theoretical models and previous studies of the bridge. The deployment is the largest WSN for SHM.

ieee international conference on high performance computing data and analytics | 2008

Benchmarking GPUs to tune dense linear algebra

Vasily Volkov; James Demmel

We present performance results for dense linear algebra using recent NVIDIA GPUs. Our matrix-matrix multiply routine (GEMM) runs up to 60% faster than the vendors implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80--90% of the peak GEMM rate. Our parallel LU running on two GPUs achieves up to ~540 Gflop/s. These results are accomplished by challenging the accepted view of the GPU architecture and programming guidelines. We argue that modern GPUs should be viewed as multithreaded multicore vector units. We exploit blocking similarly to vector computers and heterogeneity of the system by computing both on GPU and CPU. This study includes detailed benchmarking of the GPU memory system that reveals sizes and latencies of caches and TLB. We present a couple of algorithmic optimizations aimed at increasing parallelism and regularity in the problem that provide us with slightly higher performance.

SIAM Journal on Matrix Analysis and Applications | 1999

A Supernodal Approach to Sparse Partial Pivoting

James Demmel; Stanley C. Eisenstat; John R. Gilbert; Xiaoye S. Li; Joseph W. H. Liu

We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. We introduce the notion of unsymmetric supernodes to perform most of the numerical computation in dense matrix kernels. We introduce unsymmetric supernode-panel updates and two-dimensional data partitioning to better exploit the memory hierarchy. We use Gilbert and Peierlss depth-first search with Eisenstat and Lius symmetric structural reductions to speed up symbolic factorization. We have developed a sparse LU code using all these ideas. We present experiments demonstrating that it is significantly faster than earlier partial pivoting codes. We also compare its performance with UMFPACK, which uses a multifrontal approach; our code is very competitive in time and storage requirements, especially for large problems.

parallel computing | 2009

A view of the parallel computing landscape

Krste Asanovic; Rastislav Bodik; James Demmel; Tony M. Keaveny; Kurt Keutzer; John Kubiatowicz; Nelson Morgan; David A. Patterson; Koushik Sen; John Wawrzynek; David Wessel; Katherine A. Yelick

Writing programs that scale with increasing numbers of cores should be as easy as writing programs for sequential computers.

ACM Transactions on Mathematical Software | 2002

An Updated Set of Basic Linear Algebra Subprograms (BLAS)

L. Susan Blackford; Antoine Petitet; Roldan Pozo; Karin A. Remington; R. Clint Whaley; James Demmel; Jack J. Dongarra; Iain S. Duff; Sven Hammarling; Greg Henry; Michael A. Heroux; Linda Kaufman; Andrew Lumsdaine

L. SUSAN BLACKFORD Myricom, Inc. JAMES DEMMEL University of California, Berkeley JACK DONGARRA The University of Tennessee IAIN DUFF Rutherford Appleton Laboratory and CERFACS SVEN HAMMARLING Numerical Algorithms Group, Ltd. GREG HENRY Intel Corporation MICHAEL HEROUX Sandia National Laboratories LINDA KAUFMAN William Patterson University ANDREW LUMSDAINE Indiana University ANTOINE PETITET Sun Microsystems ROLDAN POZO National Institute of Standards and Technology KARIN REMINGTON The Center for Advancement of Genomics and R. CLINT WHALEY Florida State University

ACM Transactions on Mathematical Software | 2003

SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems

Xiaoye S. Li; James Demmel

We present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the softwares parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication patterns, which lets us exploit techniques used in parallel sparse Cholesky algorithms to better parallelize both LU decomposition and triangular solution on large-scale distributed machines.

Presented at: SciDAC 2005 Proceedings (Journal of Physics), San Francisco, CA, United States, Jun 26 - Jun 30, 2005 | 2005

OSKI: A library of automatically tuned sparse matrix kernels

Richard W. Vuduc; James Demmel; Katherine A. Yelick

The Optimized Sparse Kernel Interface (OSKI) is a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. These kernels include sparse matrix-vector multiply and sparse triangular solve, among others. The primary aim of this interface is to hide the complex decision-making process needed to tune the performance of a kernel implementation for a particular users sparse matrix and machine, while also exposing the steps and potentially non-trivial costs of tuning at run-time. This paper provides an overview of OSKI, which is based on our research on automatically tuned sparse kernels for modern cache-based superscalar machines.

conference on high performance computing (supercomputing) | 1990

LAPACK: a portable linear algebra library for high-performance computers

E. Anderson; Z. Bai; Jack J. Dongarra; A. Greenbaum; A. McKenney; J. Du Croz; S. Hammarling; James Demmel; Christian H. Bischof; Danny C. Sorensen

The goal of the LAPACK project is to design and implement a portable linear algebra library for efficient use on a variety of high-performance computers. The library is based on the widely used LINPACK and EISPACK packages for solving linear equations, eigenvalue problems, and linear least-squares problems, but extends their functionality in a number of ways. The major methodology for making the algorithms run faster is to restructure them to perform block matrix operations (e.g., matrix-matrix multiplication) in their inner loops. These block operations may be optimized to exploit the memory hierarchy of a specific architecture. The LAPACK project is also working on new algorithms that yield higher relative accuracy for a variety of linear algebra problems. >

Explore More