Publication


Featured research published by Isak Jonsson.


SIAM Review | 2004

Recursive blocked algorithms and hybrid data structures for dense matrix library software

Erik Elmroth; Fred G. Gustavson; Isak Jonsson; Bo Kågström

Matrix computations are both fundamental and ubiquitous in computational science and its vast application areas. Along with the development of more advanced computer systems with complex memory hierarchies, there is a continuing demand for new algorithms and library software that efficiently utilize and adapt to new architecture features. This article reviews and details some of the recent advances made by applying the paradigm of recursion to dense matrix computations on today's memory-tiered computer systems. Recursion allows for efficient utilization of a memory hierarchy and generalizes existing fixed blocking by introducing automatic variable blocking that has the potential of matching every level of a deep memory hierarchy. Novel recursive blocked algorithms offer new ways to compute factorizations such as Cholesky and QR and to solve matrix equations. In fact, the whole gamut of existing dense linear algebra factorizations is beginning to be reexamined in view of the recursive paradigm. The use of recursion has led to new hybrid data structures and optimized superscalar kernels. The results we survey include new algorithms and library software implementations for level 3 kernels, matrix factorizations, and the solution of general systems of linear equations and several common matrix equations. The software implementations we survey are robust and show impressive performance on today's high performance computing systems.
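
The recursive blocking idea is easiest to see on a concrete factorization. Below is a minimal Python/NumPy sketch of recursive blocked Cholesky, not the library code the article surveys: halving the matrix yields a smaller factorization, a triangular solve, and a GEMM-like symmetric update, and the split point adapts to each submatrix, which is exactly the automatic variable blocking described above. The leaf cutoff min_block and the use of NumPy/SciPy routines as stand-ins for optimized superscalar kernels are assumptions of the sketch.

```python
import numpy as np
from scipy.linalg import solve_triangular

def recursive_cholesky(A, min_block=64):
    """Recursive blocked Cholesky: A = L L^T for symmetric positive
    definite A. The recursion generates variable block sizes that can
    match every level of the memory hierarchy."""
    n = A.shape[0]
    if n <= min_block:
        return np.linalg.cholesky(A)              # leaf kernel (stand-in)
    k = n // 2
    A11, A21, A22 = A[:k, :k], A[k:, :k], A[k:, k:]
    L11 = recursive_cholesky(A11, min_block)
    # Solve L21 @ L11.T = A21 for L21 (level 3 triangular solve)
    L21 = solve_triangular(L11, A21.T, lower=True).T
    # GEMM-like symmetric update of the trailing block, then recurse
    L22 = recursive_cholesky(A22 - L21 @ L21.T, min_block)
    L = np.zeros_like(A)
    L[:k, :k], L[k:, :k], L[k:, k:] = L11, L21, L22
    return L
```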


ACM Transactions on Mathematical Software | 2002

Recursive blocked algorithms for solving triangular systems—Part I: one-sided and coupled Sylvester-type matrix equations

Isak Jonsson; Bo Kågström

Triangular matrix equations appear naturally in estimating the condition numbers of matrix equations and in different eigenspace computations, including block-diagonalization of matrices and matrix pairs and computation of functions of matrices. Solving a triangular matrix equation is also a major step in the classical Bartels-Stewart method for solving the standard continuous-time Sylvester equation (AX − XB = C). We present novel recursive blocked algorithms for solving one-sided triangular matrix equations, including the continuous-time Sylvester and Lyapunov equations, and a generalized coupled Sylvester equation. The main parts of the computations are performed as level-3 general matrix multiply and add (GEMM) operations. In contrast to explicit standard blocking techniques, our recursive approach leads to an automatic variable blocking that has the potential of matching the memory hierarchies of today's HPC systems. Different implementation issues are discussed, including when to terminate the recursion, the design of new optimized superscalar kernels for solving leaf-node triangular matrix equations efficiently, and how parallelism is utilized in our implementations. Uniprocessor and SMP parallel performance results of our recursive blocked algorithms and corresponding routines in the state-of-the-art libraries LAPACK and SLICOT are presented. The performance improvements of our recursive algorithms are remarkable, including 10-fold speedups compared to standard algorithms.
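
As an illustration of the recursion on a one-sided equation, here is a minimal Python/NumPy sketch for the triangular Sylvester equation AX − XB = C with A and B upper triangular. It is not the authors' Fortran implementation: SciPy's solve_sylvester stands in for the optimized superscalar leaf kernels, and the cutoff min_block is an assumed tuning parameter.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def recsy_trsyl(A, B, C, min_block=32):
    """Recursive blocked solve of A X - X B = C, with A (m x m) and
    B (n x n) upper triangular. The larger dimension is halved at each
    step; every inter-block update is a single GEMM operation."""
    m, n = A.shape[0], B.shape[0]
    if max(m, n) <= min_block:
        return solve_sylvester(A, -B, C)     # leaf: A X + X (-B) = C
    if m >= n:                               # split A and the rows of X, C
        k = m // 2
        A11, A12, A22 = A[:k, :k], A[:k, k:], A[k:, k:]
        X2 = recsy_trsyl(A22, B, C[k:], min_block)
        X1 = recsy_trsyl(A11, B, C[:k] - A12 @ X2, min_block)  # GEMM update
        return np.vstack([X1, X2])
    else:                                    # split B and the columns of X, C
        k = n // 2
        B11, B12, B22 = B[:k, :k], B[:k, k:], B[k:, k:]
        X1 = recsy_trsyl(A, B11, C[:, :k], min_block)
        X2 = recsy_trsyl(A, B22, C[:, k:] + X1 @ B12, min_block)
        return np.hstack([X1, X2])
```

Splitting the larger dimension keeps the subproblems roughly square, and the recursion bottoms out in leaf problems small enough for a kernel solver.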


Parallel Computing | 1998

Recursive Blocked Data Formats and BLAS's for Dense Linear Algebra Algorithms

Fred G. Gustavson; André Henriksson; Isak Jonsson; Bo Kågström; Per Ling

Recursive blocked data formats and recursive blocked BLAS's are introduced and applied to dense linear algebra algorithms that are typified by LAPACK. The new data formats allow for maintaining data locality at every level of the memory hierarchy and hence provide high performance on today's memory-tiered processors. This new data format is hybrid. It contains blocking parameters which are chosen so that the associated submatrices of a block-partitioned matrix A fit into level 1 cache. The recursive part of the data format chooses a linear order of the blocks that maintains two-dimensional data locality of A in a one-dimensional tiered memory structure. We argue that, out of the NB factorial choices for ordering the NB blocks, our recursive ordering leads to one of the best. This is because our algorithms are also recursive and will do their computations on submatrices that follow the new recursive data structure definition. This is in analogy with the well-known principle that the data structure should be matched to the algorithm. Performance results in support of our recursive approach are also presented.
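
To make the ordering concrete, here is a small Python sketch of the principle (an illustration, not the exact format of the paper): the blocks of a block-partitioned matrix are enumerated by recursively halving the larger grid dimension, and the matrix is then copied into contiguous nb x nb tiles in that order.

```python
import numpy as np

def block_order(mb, nb):
    """Recursive linear ordering of an mb x nb grid of blocks: halve the
    larger dimension and concatenate the orderings of the two halves, so
    blocks that a recursive algorithm touches together stay adjacent."""
    if mb == 1 and nb == 1:
        return [(0, 0)]
    if mb >= nb:
        k = mb // 2
        return block_order(k, nb) + [(i + k, j)
                                     for i, j in block_order(mb - k, nb)]
    k = nb // 2
    return block_order(mb, k) + [(i, j + k)
                                 for i, j in block_order(mb, nb - k)]

def pack_recursive(A, nb=64):
    """Copy A (m x n, both multiples of nb for simplicity) into a 1-D
    buffer of contiguous nb x nb tiles laid out in recursive block order,
    with nb chosen so that each tile fits in level 1 cache."""
    m, n = A.shape
    buf = np.empty(m * n, dtype=A.dtype)
    for t, (bi, bj) in enumerate(block_order(m // nb, n // nb)):
        tile = A[bi*nb:(bi+1)*nb, bj*nb:(bj+1)*nb]
        buf[t*nb*nb:(t+1)*nb*nb] = tile.ravel()
    return buf
```

Because the algorithms recurse with the same splitting rule, they traverse this buffer nearly sequentially, preserving two-dimensional locality in one-dimensional memory.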


ACM Transactions on Mathematical Software | 2002

Recursive blocked algorithms for solving triangular systems—Part II: two-sided and generalized Sylvester and Lyapunov matrix equations

Isak Jonsson; Bo Kågström

We continue our study of high-performance algorithms for solving triangular matrix equations. They appear naturally in different condition estimation problems for matrix equations and various eigenspace computations, and as reduced systems in standard algorithms. Building on our successful recursive approach applied to one-sided matrix equations (Part I), we now present novel recursive blocked algorithms for two-sided matrix equations, which include matrix product terms such as AXB^T. Examples are the discrete-time standard and generalized Sylvester and Lyapunov equations. The means for achieving high performance are the recursive variable blocking, which has the potential of matching the memory hierarchies of today's high-performance computing systems, and level-3 computations, which are mainly performed as GEMM operations. Different implementation issues are discussed, including the design of efficient new algorithms for two-sided matrix products. We present uniprocessor and SMP parallel performance results of recursive blocked algorithms and routines in the state-of-the-art SLICOT library. Although our recursive algorithms with optimized kernels for the two-sided matrix equations perform more operations, the performance improvements are remarkable, including 10-fold speedups or more compared to standard algorithms.
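
The following Python/NumPy sketch shows how the recursion handles a representative two-sided equation, the triangular discrete-time Sylvester equation AXB^T − X = C. It is a simplified illustration: A and B are assumed strictly upper triangular (not quasi-triangular), and a dense Kronecker-product solve replaces the paper's optimized leaf kernels. Note that each inter-block update now costs two GEMMs rather than one.

```python
import numpy as np

def recsy_dtsyl(A, B, C, min_block=16):
    """Recursive blocked solve of A X B^T - X = C, with A (m x m) and
    B (n x n) upper triangular."""
    m, n = A.shape[0], B.shape[0]
    if max(m, n) <= min_block:
        # Leaf: solve (kron(B, A) - I) vec(X) = vec(C), column-major vec
        K = np.kron(B, A) - np.eye(m * n)
        x = np.linalg.solve(K, C.ravel(order='F'))
        return x.reshape((m, n), order='F')
    if m >= n:                               # split the rows of X
        k = m // 2
        A11, A12, A22 = A[:k, :k], A[:k, k:], A[k:, k:]
        X2 = recsy_dtsyl(A22, B, C[k:], min_block)
        # Two-sided update: two GEMMs instead of one
        X1 = recsy_dtsyl(A11, B, C[:k] - A12 @ X2 @ B.T, min_block)
        return np.vstack([X1, X2])
    else:                                    # split the columns of X
        k = n // 2
        B11, B12, B22 = B[:k, :k], B[:k, k:], B[k:, k:]
        X2 = recsy_dtsyl(A, B22, C[:, k:], min_block)
        X1 = recsy_dtsyl(A, B11, C[:, :k] - A @ X2 @ B12.T, min_block)
        return np.hstack([X1, X2])
```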


IBM Journal of Research and Development | 2000

Minimal-storage high-performance Cholesky factorization via blocking and recursion

Fred G. Gustavson; Isak Jonsson

We present a novel practical algorithm for Cholesky factorization when the matrix is stored in packed format, combining blocking and recursion. The algorithm simultaneously obtains Level 3 performance, conserves about half the storage, and avoids the need for Level 3 BLAS routines for packed format. We use recursive packed format, which was first described by Andersen et al. [1]. Our algorithm uses only DGEMM and Level 3 kernel routines; it first transforms standard packed format to packed recursive lower row format. Our new algorithm outperforms the Level 3 LAPACK routine DPOTRF even when we include the cost of data transformation. (This is true for three IBM platforms: the POWER3, the POWER2, and the PowerPC 604e.) For large matrices, blocking is not required for acceptable Level 3 performance. However, for small matrices the overhead of pure recursion and/or data transformation is too high. We analyze these overheads and provide detailed cost estimates. We show that blocking combined with recursion reduces all overheads to a tiny, acceptable level. However, a new problem of nonlinear addressing arises. We use two-dimensional mappings (tables) or data copying to overcome the high costs of directly computing addresses that are nonlinear functions of i and j.
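
The nonlinear addressing problem mentioned above is already visible in standard packed format. A short sketch of the index computation for a lower triangle stored column by column (standard packed storage) follows.

```python
def packed_lower_index(i, j, n):
    """Offset of element (i, j), i >= j, in an n x n lower triangle stored
    in standard packed column-major format (n*(n+1)//2 elements in all,
    about half of n*n). Column j starts after j*n - j*(j-1)//2 elements;
    the quadratic term makes the address a nonlinear function of i and j."""
    return i + j * n - j * (j + 1) // 2
```

Recursive packed format trades this per-element arithmetic for block-level addressing; where nonlinear addresses remain, the paper uses precomputed two-dimensional tables or data copying.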


Parallel Computing | 1998

Superscalar GEMM-based Level 3 BLAS - The On-going Evolution of a Portable and High-Performance Library

Fred G. Gustavson; André Henriksson; Isak Jonsson; Bo Kågström; Per Ling

Recently, a first version of our GEMM-based level 3 BLAS for superscalar-type processors was announced. A new feature is the inclusion of DGEMM itself. This DGEMM routine contains inline what we call a level 3 kernel routine, which is based on register blocking. Additionally, it features level 1 cache blocking and data copying of submatrix operands for the level 3 kernel. Our other BLAS's that possess triangular operands, e.g., DTRSM and DSYRK, use a similar level 3 kernel routine to handle the triangular blocks that appear on the diagonal of the larger input triangular operand. As in our previous GEMM-based work, all other BLAS's perform the dominating part of the computations in calls to DGEMM. We are seeing the adoption of our BLAS's by several organizations, including the ATLAS and PHiPAC projects on automatic generation of fast DGEMM kernels for superscalar processors, and some computer vendors. The evolution of the superscalar GEMM-based level 3 BLAS is presented. We also describe new developments, including techniques that make the library applicable to symmetric multiprocessing (SMP) systems.
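
A minimal Python sketch of the cache-blocking structure described here; register blocking and operand copying happen below NumPy's level, so np.matmul stands in for the register-blocked level 3 kernel, and the panel sizes are assumed tuning parameters.

```python
import numpy as np

def gemm_blocked(A, B, C, mc=256, kc=256, nc=256):
    """C <- C + A @ B with level 1 cache blocking: loop over submatrix
    panels small enough to stay in cache and hand each triple of panels
    to the kernel (here NumPy's matmul)."""
    m, k = A.shape
    _, n = B.shape
    for jj in range(0, n, nc):
        for kk in range(0, k, kc):
            for ii in range(0, m, mc):
                C[ii:ii+mc, jj:jj+nc] += (A[ii:ii+mc, kk:kk+kc]
                                          @ B[kk:kk+kc, jj:jj+nc])
    return C
```

In the GEMM-based approach, routines such as DTRSM and DSYRK are organized so that loops like these, i.e., calls to DGEMM, perform the dominant share of the flops, with a small kernel handling the triangular diagonal blocks.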


Parallel Computing | 2006

Recursive blocked algorithms for solving periodic triangular Sylvester-type matrix equations

Robert Granat; Isak Jonsson; Bo Kågström

Recently, recursive blocked algorithms for solving triangular one-sided and two-sided Sylvester-type equations were introduced by Jonsson and Kågström. This elegant yet simple technique enables an automatic variable blocking that has the potential of matching the memory hierarchies of today's HPC systems. The main parts of the computations are performed as level 3 general matrix multiply and add (GEMM) operations. We extend and apply the recursive blocking technique to solving periodic Sylvester-type matrix equations. Successive recursive splittings are performed on 3-dimensional arrays, where the third dimension represents the periodicity of a matrix equation.
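
A Python/NumPy sketch of this structure, under simplifying assumptions (upper triangular coefficients, and a dense coupled solve at the leaves in place of optimized kernels): the coefficients and right-hand sides for all p period indices are stored as 3-D arrays with the period as the leading dimension, and the recursion splits only the matrix dimensions, carrying the whole period along.

```python
import numpy as np

def periodic_trsyl(A, B, C, min_block=8):
    """Recursive blocked solve of the periodic triangular Sylvester equation
        A[k] X[k] - X[(k+1) % p] B[k] = C[k],   k = 0, ..., p-1,
    with each A[k] (m x m) and B[k] (n x n) upper triangular; A, B, C are
    3-D arrays whose leading dimension is the period p."""
    p, m, n = C.shape
    if max(m, n) <= min_block:
        # Leaf: assemble the coupled system in vec(X[0]), ..., vec(X[p-1])
        N = m * n
        K = np.zeros((p * N, p * N))
        rhs = np.empty(p * N)
        for k in range(p):
            K[k*N:(k+1)*N, k*N:(k+1)*N] = np.kron(np.eye(n), A[k])
            k1 = (k + 1) % p
            K[k*N:(k+1)*N, k1*N:(k1+1)*N] -= np.kron(B[k].T, np.eye(m))
            rhs[k*N:(k+1)*N] = C[k].ravel(order='F')
        x = np.linalg.solve(K, rhs)
        X = np.empty((p, m, n))
        for k in range(p):
            X[k] = x[k*N:(k+1)*N].reshape((m, n), order='F')
        return X
    if m >= n:                            # split the rows of every X[k]
        j = m // 2
        X2 = periodic_trsyl(A[:, j:, j:], B, C[:, j:, :], min_block)
        X1 = periodic_trsyl(A[:, :j, :j], B,
                            C[:, :j, :] - A[:, :j, j:] @ X2, min_block)
        return np.concatenate([X1, X2], axis=1)
    else:                                 # split the columns of every X[k]
        j = n // 2
        X1 = periodic_trsyl(A, B[:, :j, :j], C[:, :, :j], min_block)
        # The coupling shifts the period index: the update needs X1[(k+1) % p]
        X2 = periodic_trsyl(A, B[:, j:, j:],
                            C[:, :, j:] + np.roll(X1, -1, axis=0) @ B[:, :j, j:],
                            min_block)
        return np.concatenate([X1, X2], axis=2)
```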


European Conference on Parallel Processing | 2003

RECSY — A High Performance Library for Sylvester-Type Matrix Equations

Isak Jonsson; Bo Kågström

RECSY is a library for solving triangular Sylvester-type matrix equations. Its objectives are both speed and reliability. In order to achieve these goals, RECSY is based on novel recursive blocked algorithms, which call high-performance kernels for solving small-sized leaf problems of the recursion tree. In contrast to explicit standard blocking techniques, our recursive approach leads to an automatic variable blocking that has the potential of matching the memory hierarchies of today's HPC systems. The RECSY library comprises a set of Fortran 90 routines, which use recursion and OpenMP for shared memory parallelism to solve eight different matrix equations, including continuous-time as well as discrete-time standard and generalized Sylvester and Lyapunov equations. Uniprocessor and SMP parallel performance results of our recursive blocked algorithms and corresponding routines in the state-of-the-art libraries LAPACK and SLICOT are presented. The performance improvements of our recursive algorithms are remarkable, including 10-fold speedups compared to standard algorithms.
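
The shared memory parallelism comes from independent subproblems in the recursion tree. Below is a rough Python analogue of the OpenMP tasking, not RECSY's Fortran code: threads in place of tasks, SciPy's solver in place of the kernels, and a simple depth bound on task nesting. NumPy and SciPy release the GIL inside BLAS/LAPACK calls, so the two tasks can genuinely overlap.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.linalg import solve_sylvester

def par_trsyl(A, B, C, min_block=64, depth=1):
    """Recursive solve of A X - X B = C (A, B upper triangular, with m
    and n assumed comparable), splitting both dimensions. After X21, the
    (1,1) and (2,2) subproblems are independent and run concurrently."""
    m, n = A.shape[0], B.shape[0]
    if min(m, n) <= min_block:
        return solve_sylvester(A, -B, C)          # leaf: A X + X (-B) = C
    i, j = m // 2, n // 2
    A11, A12, A22 = A[:i, :i], A[:i, i:], A[i:, i:]
    B11, B12, B22 = B[:j, :j], B[:j, j:], B[j:, j:]
    X21 = par_trsyl(A22, B11, C[i:, :j], min_block, depth)
    args11 = (A11, B11, C[:i, :j] - A12 @ X21, min_block, depth - 1)
    args22 = (A22, B22, C[i:, j:] + X21 @ B12, min_block, depth - 1)
    if depth > 0:
        with ThreadPoolExecutor(max_workers=2) as pool:
            f11 = pool.submit(par_trsyl, *args11)
            f22 = pool.submit(par_trsyl, *args22)
            X11, X22 = f11.result(), f22.result()
    else:
        X11, X22 = par_trsyl(*args11), par_trsyl(*args22)
    X12 = par_trsyl(A11, B22, C[:i, j:] - A12 @ X22 + X11 @ B12,
                    min_block, depth)
    return np.block([[X11, X12], [X21, X22]])
```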


European Conference on Parallel Processing | 2008

Parallel Algorithms for Triangular Periodic Sylvester-Type Matrix Equations

Per Ola Andersson; Robert Granat; Isak Jonsson; Bo Kågström

We present parallel algorithms for triangular periodic Sylvester-type matrix equations, conceptually the third step of a periodic Bartels-Stewart-like solution method for general periodic Sylvester-type matrix equations based on variants of the periodic Schur decomposition. The presented algorithms are designed and implemented in the framework of the recently developed HPC library SCASY and are based on explicit blocking, 2-dimensional block cyclic data distribution, and a wavefront-like traversal of the right-hand-side matrices. High performance is obtained by rich usage of level 3 BLAS operations. It is also demonstrated how several key concepts of SCASY regarding communication and the treatment of quasi-triangular coefficient matrices generalize to the periodic case. Experimental results from a distributed memory Linux cluster are also presented.
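
A shared-memory Python sketch of the wavefront traversal, without the 2-dimensional block cyclic distribution or SCASY's communication machinery, and with SciPy's solver standing in for the node solver: for AX − XB = C with upper triangular A and B, block X[i][j] depends only on blocks below it and to its left, so all blocks on one anti-diagonal are independent (here they are processed sequentially to keep the sketch short).

```python
import numpy as np
from scipy.linalg import solve_sylvester

def wavefront_trsyl(A, B, C, nb=64):
    """Explicitly blocked solve of A X - X B = C (A, B upper triangular)
    in wavefront order: sweep the anti-diagonals of the block grid; each
    block on one anti-diagonal could be assigned to a different process."""
    m, n = C.shape
    p, q = -(-m // nb), -(-n // nb)          # number of block rows / cols
    R = lambda i: slice(i * nb, min((i + 1) * nb, m))
    S = lambda j: slice(j * nb, min((j + 1) * nb, n))
    X = np.zeros_like(C)
    for d in range(p + q - 1):               # anti-diagonal sweeps
        for i in range(p - 1, -1, -1):
            j = d - (p - 1 - i)
            if not 0 <= j < q:
                continue
            # Level 3 updates from already-computed blocks, then a block solve
            rhs = (C[R(i), S(j)]
                   - A[R(i), R(i).stop:] @ X[R(i).stop:, S(j)]
                   + X[R(i), :S(j).start] @ B[:S(j).start, S(j)])
            X[R(i), S(j)] = solve_sylvester(A[R(i), R(i)],
                                            -B[S(j), S(j)], rhs)
    return X
```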


Archive | 2009

RECSY and SCASY library software: recursive blocked and parallel algorithms for Sylvester-type matrix equations with some applications

Robert Granat; Isak Jonsson; Bo Kågström

In this contribution, we review state-of-the-art high-performance computing software for solving common standard and generalized continuous-time and discrete-time Sylvester-type matrix equations. The analysis is based on the RECSY and SCASY software libraries. Our algorithms and software rely on the standard Schur method. Two ways of introducing blocking for solving matrix equations in reduced (quasi-triangular) form are reviewed. The most common is to perform a fixed block partitioning of the matrices involved and rearrange the loop nests of a single-element algorithm so that the computations are performed on submatrices (matrix blocks). Another successful approach is to combine recursion and blocking.

Collaboration


Dive into Isak Jonsson's collaboration.

Top Co-Authors

Per Ola Andersson

Swedish Defence Research Agency
