Publication


Featured research published by Christian H. Bischof.


ACM Transactions on Mathematical Software | 1998

Computing rank-revealing QR factorizations of dense matrices

Christian H. Bischof; Gregorio Quintana-Ortí

We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliable algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 and SGI R8000 platforms show that this approach performs up to three times faster than the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in many circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.
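To make the baseline concrete, the following is a minimal sketch of plain QR factorization with column pivoting (the traditional Golub strategy the abstract refers to), not the windowed block algorithm of the paper. It assumes a small dense matrix given as a list of rows, uses modified Gram-Schmidt instead of Householder reflections for brevity, and recomputes column norms at every step rather than downdating them. The function name `qr_column_pivoting` and the parameter `rel_tol` are hypothetical.

```python
import math

def qr_column_pivoting(A, rel_tol=1e-10):
    """Return (diag of R, column permutation, numerical rank estimate)."""
    m, n = len(A), len(A[0])
    # Store the matrix column by column so columns can be swapped cheaply.
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    perm = list(range(n))
    diag = []
    for k in range(min(m, n)):
        # Golub pivoting: pick the remaining column with the largest norm.
        norms = [math.sqrt(sum(v * v for v in cols[j])) for j in range(k, n)]
        p = k + max(range(len(norms)), key=norms.__getitem__)
        r_kk = norms[p - k]
        if r_kk == 0.0:
            break  # every remaining column is exactly zero
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        diag.append(r_kk)
        # Normalize the pivot column and remove its component from the rest.
        q = [v / r_kk for v in cols[k]]
        for j in range(k + 1, n):
            r_kj = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [v - r_kj * qi for v, qi in zip(cols[j], q)]
    # Heuristic rank decision: count diagonal entries that are not tiny
    # relative to the first (largest) one.
    rank = sum(1 for d in diag if d > rel_tol * diag[0])
    return diag, perm, rank
```

For the matrix with rows (1, 2, 3), (4, 5, 6), (7, 8, 9), this sketch picks the third column as the first pivot and reports rank 2. The diagonal test is only a heuristic, which is one reason the abstract describes plain column pivoting as less reliable.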


SIAM Journal on Matrix Analysis and Applications | 1990

Incremental condition estimation

Christian H. Bischof

This paper introduces a new technique for estimating the smallest singular value, and hence the condition number, of a dense triangular matrix as it is generated one row or column at a time. It is also shown how this condition estimator can be interpreted as trying to approximate the secular equation with a simpler rational function. While one can construct examples where this estimator fails, numerical experiments demonstrate that despite its small computational cost, it produces reliable estimates. Also given is an example that shows the advantage of incorporating the incremental condition estimation strategy into the QR factorization algorithm with column pivoting to guard against near rank deficiency going unnoticed.
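The idea of updating an estimate as the triangular factor grows can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: it assumes an upper triangular matrix `R` with nonzero diagonal, stored as a list of rows, and it keeps a vector `x` satisfying R^T x = d with ||d|| = 1, so that 1/||x|| is an upper bound on the smallest singular value. Each new column leads to a 2-by-2 symmetric eigenproblem that chooses how to extend `d`. The name `ice_sigma_min` is hypothetical.

```python
import math

def ice_sigma_min(R):
    """Estimate sigma_min of upper triangular R, one column at a time."""
    n = len(R)
    # 1-by-1 case: R[0][0] * x = 1, so 1/|x| equals |R[0][0]| exactly.
    x = [1.0 / R[0][0]]
    for k in range(1, n):
        w = [R[i][k] for i in range(k)]   # new column above the diagonal
        gamma = R[k][k]                   # new diagonal entry (assumed nonzero)
        alpha = sum(wi * xi for wi, xi in zip(w, x))
        # With d' = (s*d, t) and s^2 + t^2 = 1, the new solution is
        # x' = (s*x, (t - s*alpha)/gamma) and
        # ||x'||^2 = a*s^2 + 2*b*s*t + c*t^2 for the entries below.
        a = sum(xi * xi for xi in x) + (alpha / gamma) ** 2
        b = -alpha / gamma ** 2
        c = 1.0 / gamma ** 2
        # Largest eigenvalue of [[a, b], [b, c]] and a matching eigenvector.
        lam = 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)
        v1 = (b, lam - a)
        v2 = (lam - c, b)
        # Both are eigenvectors; take the one with the larger magnitude.
        if abs(v1[0]) + abs(v1[1]) >= abs(v2[0]) + abs(v2[1]):
            s, t = v1
        else:
            s, t = v2
        nrm = math.hypot(s, t)
        if nrm == 0.0:
            s, t = 1.0, 0.0               # matrix is a multiple of I
        else:
            s, t = s / nrm, t / nrm
        x = [s * xi for xi in x] + [(t - s * alpha) / gamma]
    return 1.0 / math.sqrt(sum(xi * xi for xi in x))
```

Each step costs work proportional to the current dimension, which is what makes the estimate cheap enough to run alongside a factorization. Because the choice at each step is greedy, the result can overestimate the smallest singular value, consistent with the abstract's remark that failing examples can be constructed.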


SIAM Journal on Scientific and Statistical Computing | 1991

Structure-preserving and rank-revealing QR-factorizations

Christian H. Bischof; Per Christian Hansen

The rank-revealing QR-factorization (RRQR factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compu...


SIAM Journal on Scientific and Statistical Computing | 1991

A parallel QR factorization algorithm with controlled local pivoting

Christian H. Bischof

This paper presents a new version of the Householder algorithm with column pivoting for computing a QR factorization that identifies rank and range space of a given matrix. The standard pivoting technique is not well suited for parallel computation, since it requires synchronization at every step in order to choose the next pivot column. In contrast, a restricted pivoting scheme that restricts the choice of pivot columns and avoids this synchronization constraint is employed. Incremental condition estimation is used to assess the effect that the addition of a candidate pivot column would have on the condition number of the matrix being generated. This safeguard ensures that this local strategy selects pivot columns that make sense in the global context of the computation. The resulting algorithm is well suited for implementation on a parallel machine, in particular, a MIMD machine with distributed memory. Simulations demonstrate that the numerical behavior of the restricted pivoting strategy is comparable to the traditional global pivoting strategy. Implementation results of the QR factorization algorithm without pivoting and with local and traditional pivoting on the Intel iPSC/1 and iPSC/2 hypercubes show that our scheme about halves the extra time required for pivoting.


Parallel Computing | 1987

Computing the Singular Value Decomposition on a Distributed System of Vector Processors

Christian H. Bischof

Jacobi methods for computing the singular value decomposition (SVD) of a matrix are ideally suited for multiprocessor environments due to their inherent parallelism. In this paper we show how a block version of the two-sided Jacobi method can be used to compute the SVD efficiently on a distributed architecture. We compare two variants of this method that differ mainly in the degree to which they diagonalize a given subproblem. The first method is a true block generalization of the scalar scheme in that each off-diagonal block is completely annihilated. The second method is a scalar Jacobi algorithm reorganized in such a manner that it conforms to the block decomposition of the problem. We have performed experiments on the Loosely Coupled Array Processor (LCAP) system at IBM Kingston which for the purposes of this article can be viewed as a ring of up to ten FPS-164/MAX array processors. These experiments show that the block Jacobi algorithm performs well on a distributed system, especially when the processors have vector processing hardware. As an example, we were able to achieve a sustained performance of 159 MFlops on a 960-by-720 SVD problem using eight processors. A surprising outcome of these experiments is that the determining factor for the performance of the algorithm on a high-performance architecture is the subproblem solver, not the communication overhead of the algorithm.
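As a small illustration of why Jacobi methods parallelize well, here is a sketch of a scalar one-sided (Hestenes) Jacobi iteration for singular values. It is a deliberately simpler relative of the block two-sided method described in the abstract, not that method itself. Each rotation touches only two columns, so rotations on disjoint column pairs are independent and could run on different processors. The function name `jacobi_singular_values` and its parameters are hypothetical.

```python
import math

def jacobi_singular_values(A, sweeps=30, tol=1e-14):
    """Singular values of A (list of rows) by one-sided Jacobi rotations."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(v * v for v in cols[p])
                beta = sum(v * v for v in cols[q])
                gamma = sum(u * v for u, v in zip(cols[p], cols[q]))
                # Skip pairs that are already orthogonal to working accuracy.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that makes the two columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                cp, cq = cols[p], cols[q]
                cols[p] = [c * u - s * v for u, v in zip(cp, cq)]
                cols[q] = [s * u + c * v for u, v in zip(cp, cq)]
        if not rotated:
            break  # all column pairs are orthogonal: converged
    # The singular values are the norms of the orthogonalized columns.
    return sorted((math.sqrt(sum(v * v for v in col)) for col in cols), reverse=True)
```

A block variant would replace each pair of columns with a pair of column blocks and each 2-by-2 rotation with the solution of a small subproblem, which is where the abstract's two variants differ.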


The Journal of Supercomputing | 1989

Adaptive blocking in the QR factorization

Christian H. Bischof

On most high-performance architectures, data movement is slow compared to floating point (in particular, vector) performance. On these architectures block algorithms have been successful for matrix computations. By considering a matrix as a collection of submatrices (the so-called blocks), one naturally arrives at algorithms that require little data movement. The optimal blocking strategy, however, depends on the computing environment and on the problem parameters. On parallel machines, tradeoffs between individual floating point performance and overall system performance also come into play. Current approaches use fixed-width blocking strategies which are not optimal. This paper presents an adaptive blocking methodology for determining a good blocking strategy systematically. We demonstrate this technique on a block QR factorization routine on a distributed-memory machine. Using timing models for the high-level kernels of the algorithm, we can formulate in a recurrence relation a blocking strategy that avoids adding extra delays along the critical path of the algorithm. This recurrence relation predicts performance well since we base our timing models on observed data, not other simplistic measures. Experiments on the Intel iPSC/1 hypercube show that, in fact, the resulting blocking strategy is as good as any fixed-width blocking strategy, independent of problem size and the number of processors employed. So while we do not know the optimum fixed-width blocking strategy unless we rerun the same problem several times, adaptive blocking provides close to optimum performance in the first run. We also mention how adaptive blocking can result in performance portable code by automating the generation of the timing models.


Archive | 1991

Robust incremental condition estimation

Christian H. Bischof; P.T.P. Tang

This paper presents an improved version of incremental condition estimation, a technique for tracking the extremal singular values of a triangular matrix as it is being constructed one column at a time. We present a new motivation for this estimation technique using orthogonal projections. The paper focuses on an implementation of this estimation scheme in an accurate and consistent fashion. In particular, we address the subtle numerical issues arising in the computation of the eigensystem of a symmetric rank-one perturbed diagonal 2 × 2 matrix. Experimental results show that the resulting scheme does a good job in estimating the extremal singular values of triangular matrices, independent of matrix size and matrix condition number, and that it performs qualitatively in the same fashion as some of the commonly used nonincremental condition estimation schemes.


Conference on High Performance Computing (Supercomputing) | 1988

A parallel QR factorization algorithm using local pivoting

Christian H. Bischof

A parallel version of the Householder algorithm with column pivoting is introduced for computing the QR factorization of a matrix. Local pivoting allows efficient implementation of the algorithm on a parallel machine; in particular, it is implemented on one with a distributed architecture. An inexpensive but reliable incremental condition estimator is used to control the selection of pivot columns by obtaining cheap estimates for the smallest singular value of the currently created upper triangular matrix R. Numerical experiments show that the local pivoting strategy behaves about as well as the traditional global pivoting strategy. They also show the advantages of incorporating the controlled pivoting strategy into the traditional QR algorithm to guard against the known pathological cases.


North-Holland Mathematics Studies | 1986

Computing the Singular Value Decomposition on a Ring of Array Processors

Christian H. Bischof; Charles Van Loan

The parallel matrix computation community has lately devoted considerable attention to the Jacobi family of methods for computing eigenvalues and singular values. These methods “map” rather neatly onto various nearest neighbor architectures. If the processors are reasonably powerful and have significant local memory then block Jacobi procedures are appealing because they render a more favorable computation/communication ratio. We examined the scope of these claims by implementing a block Jacobi SVD procedure on the IBM Kingston Loosely Coupled Array Processor (LCAP) system. The LCAP system consists of ten FPS-164/MAX array processors connected in a ring via some large bulk memories. Our basic finding is that the algorithm is well-suited to the architecture but that its advantage over a single processor implementation of the Golub-Reinsch procedure is rather unclear.


Selected Papers from the Second Conference on Parallel Processing for Scientific Computing | 1985

The WY representation for products of Householder matrices

Christian H. Bischof; Charles Van Loan

Collaboration

Top Co-Authors

Gautam M. Shroff

Rensselaer Polytechnic Institute

Per Christian Hansen

Technical University of Denmark
