Yozo Hida
University of California, Berkeley
Publications
Featured research published by Yozo Hida.
ACM Transactions on Mathematical Software | 2006
James Demmel; Yozo Hida; William Kahan; Xiaoye S. Li; Sonil Mukherjee; E. Jason Riedy
We present the design and testing of an algorithm for iterative refinement of the solution of linear equations where the residual is computed with extra precision. This algorithm was originally proposed in 1948 and analyzed in the 1960s as a means to compute very accurate solutions to all but the most ill-conditioned linear systems. However, two obstacles have until now prevented its adoption in standard subroutine libraries like LAPACK: (1) there was no standard way to access the higher precision arithmetic needed to compute residuals, and (2) it was unclear how to compute a reliable error bound for the computed solution. The completion of the new BLAS Technical Forum Standard has essentially removed the first obstacle. To overcome the second obstacle, we show how the application of iterative refinement can be used to compute an error bound in any norm at small cost, and use this to compute both an error bound in the usual infinity norm and a componentwise relative error bound. We report extensive test results on over 6.2 million matrices of dimensions 5, 10, 100, and 1000. As long as a normwise (componentwise) condition number computed by the algorithm is less than 1/(max{10, √n} · ϵ_w), the computed normwise (componentwise) error bound is at most 2 max{10, √n} · ϵ_w, and indeed bounds the true error. Here, n is the matrix dimension and ϵ_w = 2^−24 is the working precision. Residuals were computed in double precision (53 bits of precision). In other words, the algorithm always computed a tiny error at negligible extra cost for most linear systems. For worse conditioned problems (which we can detect using condition estimation), we obtained small correct error bounds in over 90% of cases.
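To make the scheme concrete, here is a minimal numpy/scipy sketch of the refinement loop (the function name and the simple stopping rule are ours; the paper's termination criteria and error bounds are considerably more careful). Working precision is float32 (ϵ_w = 2^−24) and residuals are computed in float64 (53 bits), matching the test setup above.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def refine(A, b, max_iter=10):
    """Iterative refinement with extra-precise residuals (sketch)."""
    A32 = A.astype(np.float32)
    lu_piv = lu_factor(A32)                  # one LU factorization, reused below
    x = lu_solve(lu_piv, b.astype(np.float32))
    for _ in range(max_iter):
        # The key step: the residual is evaluated in double precision.
        r = b.astype(np.float64) - A.astype(np.float64) @ x.astype(np.float64)
        dx = lu_solve(lu_piv, r.astype(np.float32))
        x = (x + dx).astype(np.float32)
        if np.linalg.norm(dx) <= 2.0**-24 * np.linalg.norm(x):
            break                            # correction below working precision
    return x
```

Reusing the single LU factorization keeps each refinement step at O(n²) work, negligible next to the O(n³) factorization, which is why the extra cost of the accurate answer is small.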
SIAM Journal on Scientific Computing | 2003
James Demmel; Yozo Hida
We present and analyze several simple algorithms for accurately computing the sum of n floating point numbers using a wider accumulator. Let f and F be the number of significant bits in the summands and the accumulator, respectively. Then assuming gradual underflow, no overflow, and round-to-nearest arithmetic, up to approximately 2^(F−f) numbers can be added accurately by simply summing the terms in decreasing order of exponents, yielding a sum correct to within about 1.5 units in the last place (ulps). We apply this result to the floating point formats in the IEEE floating point standard. For example, a dot product of single precision vectors of length at most 33 computed using double precision and sorting is guaranteed correct to nearly 1.5 ulps. If double-extended precision is used, the vector length can be as large as 65,537. We also investigate how the cost of sorting can be reduced or eliminated while retaining accuracy.
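A compact numpy illustration of the result (the function name is ours, and this is a sketch, not the paper's code): float32 summands give f = 24 and a float64 accumulator gives F = 53, so roughly 2^29 terms can be summed accurately once they are ordered by decreasing exponent.

```python
import numpy as np

def sorted_wide_sum(x):
    """Sum float32 terms in a float64 accumulator, adding them in
    decreasing order of exponent (sketch of the sorted-sum result)."""
    x = np.asarray(x, dtype=np.float32)      # f = 24 significant bits
    _, exps = np.frexp(x)                    # binary exponent of each term
    acc = np.float64(0.0)                    # F = 53 significant bits
    for t in x[np.argsort(-exps)]:           # largest exponents first
        acc += np.float64(t)
    return acc                               # ~1.5 ulps for n up to ~2**29
```

For the dot-product figure quoted above, each product of two 24-bit significands is exact in 48 bits, so the effective f is 48 and the guaranteed length shrinks to about 2^(53−48) = 32 terms, hence 33.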
ACM Transactions on Mathematical Software | 2009
James Demmel; Yozo Hida; E. Jason Riedy; Xiaoye S. Li
We present the algorithm, error bounds, and numerical results for extra-precise iterative refinement applied to overdetermined linear least squares (LLS) problems. We apply our linear system refinement algorithm to Björck's augmented linear system formulation of an LLS problem. Our algorithm reduces the forward normwise and componentwise errors to O(ϵ_w), where ϵ_w is the working precision, unless the system is too ill conditioned. In contrast to linear systems, we provide two separate error bounds, for the solution x and for the residual r. The refinement algorithm requires only limited use of extra precision and adds only O(mn) work to the O(mn²) cost of QR factorization for problems of size m-by-n. The extra-precision calculation is facilitated in a portable way by the new extended-precision BLAS standard; the refinement algorithm will be included in a future release of LAPACK and can be extended to other types of least squares problems.
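Since the augmented formulation is central here, the following schematic numpy sketch may help (all names are ours, and it assumes A has full column rank; the actual algorithm reuses A's QR factors to solve each augmented system in O(mn) work, whereas this sketch factors the augmented matrix directly only to expose the structure).

```python
import numpy as np

def lls_refine(A, b, steps=3):
    """Sketch of extra-precise refinement for min ||b - A x||_2 via
    Bjorck's augmented system  [[I, A], [A^T, 0]] [r; x] = [b; 0].
    Working precision float32, residuals in float64."""
    m, n = A.shape
    A32, b32 = A.astype(np.float32), b.astype(np.float32)
    x = np.linalg.lstsq(A32, b32, rcond=None)[0]   # initial QR-based solve
    r = b32 - A32 @ x
    K = np.block([[np.eye(m, dtype=np.float32), A32],
                  [A32.T, np.zeros((n, n), dtype=np.float32)]])
    rhs = np.concatenate([b, np.zeros(n)])
    for _ in range(steps):
        v = np.concatenate([r, x])
        # Residual of the augmented system, evaluated in double precision.
        f = rhs.astype(np.float64) - K.astype(np.float64) @ v.astype(np.float64)
        d = np.linalg.solve(K, f.astype(np.float32))
        r = r + d[:m]
        x = x + d[m:]
    return x, r   # separate iterates admit separate bounds for x and r
```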
Numerical Algorithms | 2004
James Demmel; Yozo Hida
We present several simple algorithms for accurately computing the sum of n floating point numbers using a wider accumulator. Let f and F be the number of significant bits in the summands and the accumulator, respectively. Then assuming gradual underflow, no overflow, and round-to-nearest arithmetic, up to ⌊2^(F−f)/(1−2^−f)⌋+1 numbers can be accurately added by just summing the terms in decreasing order of exponents, yielding a sum correct to within about 1.5 units in the last place. In particular, if the sum is zero, it is computed exactly. We apply this result to the floating point formats in the IEEE floating point standard, and investigate its performance. Our results show that in the absence of massive cancellation (the most common case) the cost of guaranteed accuracy is about 30–40% more than that of straightforward summation. If massive cancellation does occur, the cost of computing the accurate sum is about a factor of ten higher. Finally, we apply our algorithm in computing a robust geometric predicate (used in computational geometry), where our accurate summation algorithm improves the existing algorithm by a factor of two on a nearly coplanar set of points.
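As a quick sanity check of the bound (plain Python; the names are ours), the formula reproduces the dot-product lengths quoted in the companion 2003 paper above once one accounts for single-precision products being exact in 2f = 48 bits:

```python
from math import floor

def capacity(F, f):
    """Max count of f-bit summands an F-bit accumulator can add
    accurately when terms are summed in decreasing exponent order."""
    return floor(2**(F - f) / (1 - 2.0**(-f))) + 1

print(capacity(53, 24))   # 536870945: float32 terms, float64 accumulator
print(capacity(53, 48))   # 33: single-precision dot product in double
print(capacity(64, 48))   # 65537: same, with a double-extended accumulator
```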
Parallel Computing | 2006
James Demmel; Jack J. Dongarra; Beresford N. Parlett; William Kahan; Ming Gu; David Bindel; Yozo Hida; Xiaoye S. Li; Osni Marques; E. Jason Riedy; Christof Vömel; Julien Langou; Piotr Luszczek; Jakub Kurzak; Alfredo Buttari; Julie Langou; Stanimire Tomov
New releases of the widely used LAPACK and ScaLAPACK numerical linear algebra libraries are planned. Based on an ongoing user survey (www.netlib.org/lapack-dev) and research by many people, we are proposing the following improvements: (1) faster algorithms, including better numerical methods, memory hierarchy optimizations, parallelism, and automatic performance tuning to accommodate new architectures; (2) more accurate algorithms, including better numerical methods and use of extra precision; (3) expanded functionality, including updating and downdating, new eigenproblems, etc., and putting more of LAPACK into ScaLAPACK; and (4) improved ease of use, e.g., via friendlier interfaces in multiple languages. To accomplish these goals we are also relying on better software engineering techniques and contributions from collaborators at many institutions.
SIAM Journal on Scientific Computing | 2009
James Demmel; Mark Hoemmen; Yozo Hida; Jason Riedy
The Householder reflections used in LAPACK's QR factorization leave positive and negative real entries along R's diagonal. This is sufficient for most applications of …
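The abstract above is truncated, but the sign behavior it describes is easy to demonstrate. The numpy sketch below (names ours, and showing only the naive post-hoc fix rather than anything from the paper itself) absorbs D = diag(sign(r_ii)) into both factors, so the product QR is unchanged while R's diagonal becomes nonnegative:

```python
import numpy as np

# A standard Householder QR may leave negative entries on R's diagonal.
A = np.random.default_rng(0).standard_normal((5, 3))
Q, R = np.linalg.qr(A)

# Absorb D = diag(sign(r_ii)) into both factors: (Q D)(D R) = Q R,
# since D is its own inverse.
s = np.sign(np.diag(R))
s[s == 0] = 1.0                       # leave zero diagonal entries alone
Q_pos, R_pos = Q * s, s[:, None] * R  # scale Q's columns and R's rows

assert np.allclose(Q_pos @ R_pos, A)
assert (np.diag(R_pos) >= 0).all()
```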
IEEE Std 754-2008 | 2008
Dan Zuras; Mike Cowlishaw; Alex Aiken; Matthew Applegate; David H. Bailey; Steve Bass; Dileep Bhandarkar; Mahesh Bhat; David Bindel; Sylvie Boldo; Stephen Canon; Steven R. Carlough; Marius Cornea; John H. Crawford; Joseph D. Darcy; Debjit Das Sarma; Marc Daumas; Bob Davis; Mark Davis; Dick Delp; James Demmel; Mark A. Erle; Hossam A. H. Fahmy; J. P. Fasano; Richard Fateman; Eric Feng; Warren E. Ferguson; Alex Fit-Florea; Laurent Fournier; Chip Freitag
Lawrence Berkeley National Laboratory | 2000
Yozo Hida; Xiaoye S. Li; David H. Bailey
Archive | 2007
Yozo Hida; Xiaoye S. Li; David H. Bailey
Archive | 2002
James Demmel; Yozo Hida