Yousef Saad
University of Minnesota
Publication
Featured research published by Yousef Saad.
Numerical Linear Algebra With Applications | 1994
Yousef Saad
In this paper we describe an incomplete LU factorization technique based on a strategy which combines two heuristics. This ILUT factorization extends the usual ILU(0) factorization without using the concept of level of fill-in. There are two traditional ways of developing incomplete factorization preconditioners. The first uses a symbolic factorization approach in which a level of fill is attributed to each fill-in element using only the graph of the matrix; each fill-in that is introduced is then dropped whenever its level of fill exceeds a certain threshold. The second class of methods consists of techniques derived from modifications of a given direct solver by including a drop-off rule based on the numerical size of the fill-ins introduced, traditionally referred to as threshold preconditioners. The first type of approach may not be reliable for indefinite problems, since it does not consider numerical values. The second is often far more expensive than the standard ILU(0). The strategy we propose is a compromise between these two extremes.
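A rough illustration of the dual-dropping idea (not the original ILUT code): SciPy's incomplete LU wraps SuperLU's threshold ILU, and its drop_tol and fill_factor parameters play roles analogous to ILUT's numerical drop tolerance and its cap on fill-in. The matrix and parameter values below are arbitrary test choices.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
# Nonsymmetric, diagonally dominant test matrix (convection-diffusion-like stencil).
A = sp.diags([-1.0, -1.2, 4.4, -0.8, -1.0], [-32, -1, 0, 1, 32],
             shape=(n, n), format="csc")
b = np.ones(n)

# drop_tol and fill_factor play roles analogous to ILUT's two dropping rules:
# a numerical drop tolerance and a cap on the amount of fill-in kept per column.
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)
M = spla.LinearOperator((n, n), matvec=ilu.solve)   # preconditioner M ~ A^{-1}

x, info = spla.gmres(A, b, M=M)
print("GMRES info:", info, " residual:", np.linalg.norm(b - A @ x))
```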
SIAM Journal on Numerical Analysis | 1992
Yousef Saad
In this note a theoretical analysis of some Krylov subspace approximations to the matrix exponential operation \exp(A)v is presented, and a priori and a posteriori error estimates are established. Several such approximations are considered. The main idea of these techniques is to approximately project the exponential operator onto a small Krylov subspace and to carry out the resulting small exponential matrix computation accurately. This general approach, which has been used with success in several applications, provides a systematic way of defining high-order explicit-type schemes for solving systems of ordinary differential equations or time-dependent partial differential equations.
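A minimal sketch of the projection idea described above, assuming a small dense test matrix and a hypothetical helper name krylov_expv: build an Arnoldi basis V_m with Hessenberg matrix H_m, then take \exp(A)v approximately equal to ||v|| V_m \exp(H_m) e_1, with the small exponential computed exactly.

```python
import numpy as np
from scipy.linalg import expm

def krylov_expv(A, v, m=30):
    """Approximate exp(A) @ v by projection onto the Krylov subspace K_m(A, v)."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt step
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:                # invariant subspace: stop early
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m); e1[0] = 1.0
    # The large exp(A) is never formed; only exp of the small m x m matrix H_m.
    return beta * V[:, :m] @ (expm(H[:m, :m]) @ e1)

rng = np.random.default_rng(0)
A = -np.diag(np.linspace(0.1, 20.0, 100)) + 0.01 * rng.standard_normal((100, 100))
v = rng.standard_normal(100)
print("error vs dense expm:", np.linalg.norm(krylov_expv(A, v, m=30) - expm(A) @ v))
```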
Mathematics of Computation | 1981
Yousef Saad
Some algorithms based upon a projection process onto the Krylov subspace K_m = span(r_0, Ar_0, ..., A^{m-1}r_0) are developed, generalizing the method of conjugate gradients to unsymmetric systems. These methods are extensions of Arnoldi's algorithm for solving eigenvalue problems. The convergence is analyzed in terms of the distance of the solution to the subspace K_m, and some error bounds are established, showing in particular a similarity with the conjugate gradient method (for symmetric matrices) when the eigenvalues are real. Several numerical experiments are described and discussed.
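A compact sketch of the projection process, written here with a Galerkin (full-orthogonalization) condition on the residual; the function name and test matrix are illustrative choices, not taken from the paper.

```python
import numpy as np

def arnoldi_galerkin_solve(A, b, x0, m=30):
    """Project A x = b onto K_m(A, r0): Arnoldi basis V_m, then solve H_m y = beta e1."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                     # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:                    # invariant subspace reached
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    y = np.linalg.solve(H[:m, :m], beta * np.eye(m)[:, 0])
    return x0 + V[:, :m] @ y

rng = np.random.default_rng(1)
A = 2.0 * np.eye(200) + 0.05 * rng.standard_normal((200, 200))  # unsymmetric test matrix
b = rng.standard_normal(200)
x = arnoldi_galerkin_solve(A, b, np.zeros(200), m=30)
print("residual norm:", np.linalg.norm(b - A @ x))
```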
SIAM Journal on Scientific and Statistical Computing | 1992
Efstratios Gallopoulos; Yousef Saad
This paper takes a new look at numerical techniques for solving parabolic equations by the method of lines. The main motivation for the proposed approach is the possibility of exploiting a high degree of parallelism in a simple manner. The basic idea of the method is to approximate the action of the evolution operator on a given state vector by means of a projection process onto a Krylov subspace. Thus the resulting approximation consists of applying an evolution operator of very small dimension to a known vector, which is, in turn, computed accurately by exploiting high-order rational Chebyshev and Padé approximations to the exponential. Because the rational approximation is only applied to a small matrix, the only operations required with the original large matrix are matrix-by-vector multiplications and, as a result, the algorithm can easily be parallelized and vectorized. Further parallelism is introduced by expanding the rational approximations into partial fractions. Some relevant approximation and ...
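To show only the method-of-lines structure, the sketch below semi-discretizes the 1D heat equation and advances each step with the action of the matrix exponential; that action is delegated here to SciPy's expm_multiply rather than the Krylov and rational-Chebyshev machinery of the paper, so this is an illustration of the overall approach, not its implementation.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import expm_multiply

n, dt, steps = 200, 1e-3, 50
h = 1.0 / (n + 1)
# Standard centered-difference Laplacian with Dirichlet boundaries: u'(t) = -A u(t).
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr") / h**2

x = np.linspace(h, 1.0 - h, n)
u = np.exp(-100.0 * (x - 0.5) ** 2)          # initial condition: a Gaussian bump

for _ in range(steps):
    u = expm_multiply(-dt * A, u)            # exact step of the semi-discrete system

print("peak value after t =", steps * dt, ":", u.max())
```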
Journal of Computational and Applied Mathematics | 1997
Edmond Chow; Yousef Saad
Incomplete LU factorization preconditioners have been surprisingly successful for many cases of general nonsymmetric and indefinite matrices. However, their failure rate is still too high for them to be useful as black-box library software for general matrices. Besides fatal breakdowns due to zero pivots, the major causes of failure are inaccuracy and instability of the triangular solves. When there are small pivots, both of these problems can occur, but they can also occur without small pivots. Through examples from actual problems, this paper shows how these problems manifest themselves, how they can be detected, and how they can sometimes be circumvented through pivoting, reordering, scaling, perturbing diagonal elements, and preserving symmetric structure. The goal of this paper is to gain a better practical understanding of ILU preconditioners and to help improve their reliability.
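One diagnostic commonly used in this setting (the exact procedure of the paper may differ) is the size of (LU)^{-1}e for a vector of ones, which flags unstable triangular solves; the sketch below computes it for SciPy's incomplete LU and shows a generic diagonal-shift remedy as an assumption-labeled workaround.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 400
# Nonsymmetric test matrix with small diagonal entries relative to the couplings.
A = sp.diags([-1.0, 0.05, 1.0], [-1, 0, 1], shape=(n, n), format="csc")

ilu = spla.spilu(A, drop_tol=1e-3, fill_factor=5)
e = np.ones(n)
condest = np.abs(ilu.solve(e)).max()          # || (L U)^{-1} e ||_inf
print("stability indicator:", condest)

if condest > 1e8:
    # One common remedy: perturb the diagonal (or reorder/pivot) and refactor.
    ilu = spla.spilu((A + 0.1 * sp.identity(n)).tocsc(),
                     drop_tol=1e-3, fill_factor=5)
    print("after diagonal shift:", np.abs(ilu.solve(e)).max())
```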
The Journal of Supercomputing | 2013
Ruipeng Li; Yousef Saad
This work is an overview of our preliminary experience in developing a high-performance iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the advantages and difficulties encountered when deploying GPU technology to perform sparse linear algebra computations. Techniques for speeding up sparse matrix-vector product (SpMV) kernels and finding suitable preconditioning methods are discussed. Our experiments with an NVIDIA TESLA M2070 show that for unstructured matrices SpMV kernels can be up to 8 times faster on the GPU than the Intel MKL on the host Intel Xeon X5675 processor. The overall performance of the GPU-accelerated incomplete Cholesky (IC) factorization preconditioned CG method can outperform its CPU counterpart by a smaller factor, up to 3, and the GPU-accelerated incomplete LU (ILU) factorization preconditioned GMRES method can achieve a speed-up nearing 4. However, with preconditioning techniques better suited to GPUs, this performance can be further improved.
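A minimal SpMV-on-GPU sketch using CuPy, assuming a CUDA-capable device and the cupy package; it only illustrates moving a CSR matrix to device memory and applying it to a vector, and is unrelated to the paper's own CUDA kernels or benchmark setup.

```python
import numpy as np
import scipy.sparse as sp
import cupy as cp
import cupyx.scipy.sparse as cusp

n = 500_000
# Structured sparse test matrix (about five nonzeros per row), kept in CSR format.
A_cpu = sp.diags([4.0, -1.0, -1.0, -0.5, -0.5], [0, -1, 1, -700, 700],
                 shape=(n, n), format="csr")
x_cpu = np.ones(n)

A_gpu = cusp.csr_matrix(A_cpu)               # copy the CSR structure to device memory
x_gpu = cp.asarray(x_cpu)

y_cpu = A_cpu @ x_cpu                        # SciPy's SpMV on the host
y_gpu = A_gpu @ x_gpu                        # cuSPARSE-backed SpMV on the device

print("max |y_gpu - y_cpu|:", np.abs(cp.asnumpy(y_gpu) - y_cpu).max())
```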
SIAM Journal on Numerical Analysis | 1980
Yousef Saad
Theoretical error bounds are established, improving those given by S. Kaniel. Similar inequalities are found for the eigenvectors by using bounds on the acute angle between the exact eigenvectors a...
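A small numerical illustration of what such bounds quantify, assuming a synthetic symmetric matrix with a well-separated largest eigenvalue: the largest Ritz value from m Lanczos steps approaches the largest eigenvalue rapidly as m grows. Full reorthogonalization is used for clarity; this is a demo, not the analysis of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
eigs = np.linspace(1.0, 100.0, n)
eigs[-1] = 120.0                               # well-separated largest eigenvalue
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigs) @ Q.T                    # symmetric test matrix

def largest_ritz_value(A, m):
    """Largest eigenvalue of the tridiagonal T_m produced by m Lanczos steps."""
    n = A.shape[0]
    V = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m)
    v = rng.standard_normal(n); v /= np.linalg.norm(v)
    for j in range(m):
        V[:, j] = v
        w = A @ v
        alpha[j] = v @ w
        w -= V[:, : j + 1] @ (V[:, : j + 1].T @ w)   # full reorthogonalization
        beta[j] = np.linalg.norm(w)
        v = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return np.linalg.eigvalsh(T)[-1]

for m in (5, 10, 20, 40):
    print(f"m = {m:2d}  error in largest eigenvalue: {120.0 - largest_ritz_value(A, m):.2e}")
```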
Numerical Linear Algebra With Applications | 1997
Andrew Chapman; Yousef Saad
We present a general framework for a number of techniques based on projection methods on ‘augmented Krylov subspaces’. These methods include the deflated GMRES algorithm, an inner–outer FGMRES iteration algorithm, and the class of block Krylov methods. Augmented Krylov subspace methods often show a significant improvement in convergence rate when compared with their standard counterparts using the subspaces of the same dimension. The methods can all be implemented with a variant of the FGMRES algorithm.
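A naive dense sketch of the augmented-subspace idea, not the deflated GMRES or FGMRES implementations of the paper: the residual is minimized over a Krylov subspace augmented with a few fixed deflation vectors (here, eigenvectors associated with tiny eigenvalues of a synthetic matrix).

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, k = 300, 15, 5
# SPD test matrix with a few tiny eigenvalues that ordinarily stall Krylov convergence.
eigs = np.concatenate([np.full(k, 1e-4), np.linspace(1.0, 2.0, n - k)])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(n)

def augmented_residual_min(A, b, W, m):
    """Minimize ||b - A x|| over x in K_m(A, b) + span(W)."""
    K = np.empty((b.size, m)); K[:, 0] = b
    for j in range(1, m):
        K[:, j] = A @ K[:, j - 1]
    Z, _ = np.linalg.qr(np.hstack([K, W]))     # orthonormal basis of the augmented space
    y = np.linalg.lstsq(A @ Z, b, rcond=None)[0]
    return Z @ y

W = Q[:, :k]                                   # deflation vectors: the "bad" eigenvectors
for label, basis in (("plain    ", np.zeros((n, 0))), ("augmented", W)):
    x = augmented_residual_min(A, b, basis, m)
    print(label, "residual:", np.linalg.norm(b - A @ x))
```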
SIAM Journal on Scientific Computing | 1999
Yousef Saad; Manshung Yeung; Jocelyne Erhel; Frédéric Guyomarc'h
SIAM Journal on Scientific Computing | 1996
Yousef Saad