Publication


Featured research published by Jonathan D. Hogg.


SIAM Journal on Scientific Computing | 2010

Design of a Multicore Sparse Cholesky Factorization Using DAGs

Jonathan D. Hogg; J. K. Reid; Jennifer A. Scott

The rapid emergence of multicore machines has led to the need to design new algorithms that are efficient on these architectures. Here, we consider the solution of sparse symmetric positive-definite linear systems by Cholesky factorization. We were motivated by the successful division of the computation in the dense case into tasks on blocks and use of a task manager to exploit all the parallelism that is available between these tasks, whose dependencies may be represented by a directed acyclic graph (DAG). Our sparse algorithm is built on the assembly tree and subdivides the work at each node into tasks on blocks of the Cholesky factor. The dependencies between these tasks may again be represented by a DAG. To limit memory requirements, blocks are updated directly rather than through generated-element matrices. Our algorithm is implemented within a new efficient and portable solver HSL_MA87. It is written in Fortran 95 plus OpenMP and is available as part of the software library HSL. Using problems arising from a range of applications, we present experimental results that support our design choices and demonstrate that HSL_MA87 obtains good serial and parallel times on our 8-core test machines. Comparisons are made with existing modern solvers and show that HSL_MA87 performs well, particularly in the case of very large problems.
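The three task types and their DAG dependencies are easiest to see in the dense case. Below is a minimal serial Python sketch of a blocked dense Cholesky factorization, illustrative only: HSL_MA87 itself works on the sparse assembly tree in Fortran 95 plus OpenMP, with a task manager executing tasks as soon as their DAG dependencies are satisfied.

```python
import numpy as np

def blocked_cholesky(A, nb):
    """Blocked dense Cholesky A = L L^T with block size nb.

    Each loop body is one task: FACTORIZE(k) must precede SOLVE(i,k),
    which must precede UPDATE(i,j,k) -- the dependencies form a DAG,
    and a task manager may run any tasks whose inputs are ready.
    """
    n = A.shape[0]
    L = np.tril(A).astype(float)
    nblk = (n + nb - 1) // nb
    blk = lambda i: slice(i * nb, min((i + 1) * nb, n))
    for k in range(nblk):
        # FACTORIZE(k): Cholesky of the diagonal block.
        L[blk(k), blk(k)] = np.linalg.cholesky(L[blk(k), blk(k)])
        for i in range(k + 1, nblk):
            # SOLVE(i, k): L_ik = A_ik L_kk^{-T}; depends on FACTORIZE(k).
            L[blk(i), blk(k)] = np.linalg.solve(
                L[blk(k), blk(k)], L[blk(i), blk(k)].T).T
        for i in range(k + 1, nblk):
            for j in range(k + 1, i + 1):
                # UPDATE(i, j, k): depends on SOLVE(i, k) and SOLVE(j, k).
                L[blk(i), blk(j)] -= L[blk(i), blk(k)] @ L[blk(j), blk(k)].T
    return L

A = np.random.rand(8, 8)
A = A @ A.T + 8 * np.eye(8)          # make it positive definite
L = blocked_cholesky(A, nb=3)
assert np.allclose(L @ L.T, A)
```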


Mathematical Programming Computation | 2009

A structure-conveying modelling language for mathematical and stochastic programming

Marco Colombo; Andreas Grothey; Jonathan D. Hogg; Kristian Woodsend; Jacek Gondzio

We present a structure-conveying algebraic modelling language for mathematical programming. The proposed language extends AMPL with object-oriented features that allow the user to construct models from sub-models, and is implemented as a combination of pre- and post-processing phases for AMPL. Unlike traditional modelling languages, the new approach does not scramble the block structure of the problem, and thus it enables the passing of this structure on to the solver. Interior point solvers that exploit block linear algebra and decomposition-based solvers can therefore directly take advantage of the problem's structure. The language contains features to conveniently model stochastic programming problems, although it is designed for a much broader application spectrum.
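The core idea, keeping the block structure explicit instead of flattening it, can be illustrated in Python (the paper's language is an AMPL extension; the block names and sizes below are hypothetical):

```python
import numpy as np
from scipy import sparse

# A two-scenario block-angular constraint matrix: linking variables x
# plus per-scenario recourse variables y_i, with constraints
# T_i x + W_i y_i = h_i. Block names and sizes are made up.
rng = np.random.default_rng(0)
T = [sparse.random(3, 2, density=0.8, random_state=rng) for _ in range(2)]
W = [sparse.random(3, 3, density=0.8, random_state=rng) for _ in range(2)]

# Hand the solver the blocks, not just the flattened matrix, so a
# decomposition or block interior point method can exploit the structure.
blocks = [[T[0], W[0], None],
          [T[1], None, W[1]]]
A = sparse.bmat(blocks, format="csr")
print(A.shape)  # (6, 8): 2 linking columns, then two 3-column scenario blocks
```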


ACM Transactions on Mathematical Software | 2016

A Sparse Symmetric Indefinite Direct Solver for GPU Architectures

Jonathan D. Hogg; Evgueni E. Ovtchinnikov; Jennifer A. Scott

In recent years, there has been considerable interest in the potential for graphics processing units (GPUs) to speed up the performance of sparse direct linear solvers. Efforts have focused on symmetric positive-definite systems, for which no pivoting is required, while little progress has been reported for the much harder indefinite case. We address this challenge by designing and developing a sparse symmetric indefinite solver, SSIDS. This new library-quality LDL^T factorization is designed for use on GPU architectures and incorporates threshold partial pivoting within a multifrontal approach. Both the factorize and the solve phases are performed on the GPU. Another important feature is that the solver produces bit-compatible results. Numerical results for indefinite problems arising from a range of practical applications demonstrate that, for large problems, SSIDS achieves performance improvements of up to a factor of 4.6 compared with a state-of-the-art multifrontal solver on a multicore CPU.
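For a sense of what the threshold partial pivoting test does, here is a dense Python sketch of an LDL^T factorization restricted to 1x1 pivots. This is illustrative only: SSIDS is sparse, multifrontal, and GPU-resident, and also uses 2x2 pivots, which the indefinite case needs in general.

```python
import numpy as np

def ldlt_threshold(A, u=0.01):
    """Dense LDL^T with 1x1 pivots chosen by the threshold test
    |a_jj| >= u * max_i |a_ij| over the remaining column."""
    A = A.astype(float).copy()
    n = A.shape[0]
    perm = list(range(n))
    for k in range(n):
        # Find a remaining column whose diagonal entry passes the test.
        for j in range(k, n):
            col_max = np.abs(A[k:, j]).max()
            if col_max > 0 and abs(A[j, j]) >= u * col_max:
                break
        else:
            raise ValueError("no acceptable 1x1 pivot; 2x2 pivots needed")
        # Symmetric row/column swap brings the chosen pivot to position k.
        A[[k, j], :] = A[[j, k], :]
        A[:, [k, j]] = A[:, [j, k]]
        perm[k], perm[j] = perm[j], perm[k]
        # Eliminate column k (rank-1 update of the trailing submatrix).
        d = A[k, k]
        l = A[k + 1:, k] / d
        A[k + 1:, k + 1:] -= np.outer(l, A[k, k + 1:])
        A[k + 1:, k] = l
    return A, perm  # unit-lower L below the diagonal, D on it, pivot order
```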


Algorithms | 2013

New Parallel Sparse Direct Solvers for Multicore Architectures

Jonathan D. Hogg; Jennifer A. Scott

At the heart of many computations in science and engineering lies the need to efficiently and accurately solve large sparse linear systems of equations. Direct methods are frequently the method of choice because of their robustness, accuracy and potential for use as black-box solvers. In the last few years, there have been many new developments, and a number of new modern parallel general-purpose sparse solvers have been written for inclusion within the HSL mathematical software library. In this paper, we introduce and briefly review these solvers for symmetric sparse systems. We describe the algorithms used, highlight key features (including bit-compatibility and out-of-core working) and then, using problems arising from a range of practical applications, we illustrate and compare their performances. We demonstrate that modern direct solvers are able to accurately solve systems of order 10^6 in less than 3 minutes on a 16-core machine.
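For a sense of the black-box usage model these solvers target, here is a sketch using SciPy's general sparse direct solver (SuperLU via splu) in the same role; the HSL codes reviewed in the paper additionally exploit symmetry, multicore parallelism and, in some cases, out-of-core storage.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu

# Build a sparse symmetric positive-definite system (2D Laplacian).
n = 100
I = sparse.identity(n)
T = sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sparse.kron(I, T) + sparse.kron(T, I)).tocsc()   # order n^2 = 10,000
b = np.ones(A.shape[0])

lu = splu(A)                      # factorize once ...
x = lu.solve(b)                   # ... then solve cheaply, possibly many times
print(np.linalg.norm(A @ x - b))  # small residual
```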


European Conference on Parallel Processing | 2016

A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves

Weifeng Liu; Ang Li; Jonathan D. Hogg; Iain S. Duff; Brian Vinter

The sparse triangular solve kernel, SpTRSV, is an important building block for a number of numerical linear algebra routines. Parallelizing SpTRSV on today's manycore platforms, such as GPUs, is not an easy task, since computing a component of the solution may depend on previously computed components, enforcing a degree of sequential processing. As a consequence, most existing work introduces a preprocessing stage to partition the components into a group of level-sets or colour-sets so that components within a set are independent and can be processed simultaneously during the subsequent solution stage. However, this class of methods requires a long preprocessing time as well as significant runtime synchronization overhead between the sets. To address this, we propose in this paper a novel approach for SpTRSV in which the ordering between components is naturally enforced within the solution stage. In this way, the cost of preprocessing can be greatly reduced, and the synchronizations between sets are completely eliminated. A comparison with the state-of-the-art library supplied by the GPU vendor, using 11 sparse matrices on the latest GPU device, shows that our approach obtains an average speedup of 2.3 times in single precision and 2.14 times in double precision. The maximum speedups are 5.95 and 3.65, respectively. In addition, our method is an order of magnitude faster for the preprocessing stage than existing methods.
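The level-set baseline that this paper's synchronization-free approach replaces is easy to sketch in Python (a serial illustration, not the paper's GPU code):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve_triangular

def levelset_sptrsv(L, b):
    """Solve Lx = b for sparse lower-triangular L via level-sets.

    Preprocessing assigns each component the length of its longest
    dependency chain; components sharing a level are independent, so a
    GPU kernel could process each level in parallel, synchronizing
    between levels.
    """
    L = L.tocsr()
    n = L.shape[0]
    diag = L.diagonal()
    # Preprocessing: level(i) = 1 + max level among i's dependencies.
    level = np.zeros(n, dtype=int)
    for i in range(n):
        deps = L.indices[L.indptr[i]:L.indptr[i + 1]]
        deps = deps[deps < i]
        level[i] = 1 + (level[deps].max() if deps.size else 0)
    x = np.zeros(n)
    for lev in range(1, level.max() + 1):
        for i in np.where(level == lev)[0]:   # independent within a level
            s = slice(L.indptr[i], L.indptr[i + 1])
            cols, vals = L.indices[s], L.data[s]
            m = cols < i
            x[i] = (b[i] - vals[m] @ x[cols[m]]) / diag[i]
    return x

L = sparse.csr_matrix(np.tril(np.random.rand(6, 6)) + 6 * np.eye(6))
b = np.ones(6)
assert np.allclose(levelset_sptrsv(L, b), spsolve_triangular(L, b))
```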


ACM Transactions on Mathematical Software | 2013

Pivoting strategies for tough sparse indefinite systems

Jonathan D. Hogg; Jennifer A. Scott

The performance of a sparse direct solver is dependent upon the pivot sequence that is chosen before the factorization begins. In the case of symmetric indefinite systems, it may be necessary to modify this sequence during the factorization to ensure numerical stability. These modifications can have serious consequences in terms of time as well as the memory and flops required for the factorization and subsequent solves. This study focuses on hard-to-solve sparse symmetric indefinite problems for which standard threshold partial pivoting leads to significant modifications. We perform a detailed review of pivoting strategies that are aimed at reducing the modifications without compromising numerical stability. Extensive numerical experiments are performed on a set of tough problems arising from practical applications. Based on our findings, we make recommendations on which strategy to use; in particular, a matching-based approach is recommended for numerically challenging problems.


ACM Transactions on Mathematical Software | 2010

A fast and robust mixed-precision solver for the solution of sparse symmetric linear systems

Jonathan D. Hogg; Jennifer A. Scott

On many current and emerging computing architectures, single-precision calculations are at least twice as fast as double-precision calculations. In addition, the use of single precision may reduce pressure on memory bandwidth. The penalty for using single precision for the solution of linear systems is a potential loss of accuracy in the computed solutions. For sparse linear systems, the use of mixed precision, in which double-precision iterative methods are preconditioned by a single-precision factorization, can recover high-precision solutions more quickly, and with less memory, than a sparse direct solver run entirely in double-precision arithmetic. In this article, we consider the use of single precision within direct solvers for sparse symmetric linear systems, exploiting both the reduction in memory requirements and the performance gains. We develop a practical algorithm to apply a mixed-precision approach and suggest parameters and techniques to minimize the number of solves required by the iterative recovery process. These experiments provide the basis for our new code HSL_MA79—a fast, robust, mixed-precision sparse symmetric solver that is included in the mathematical software library HSL. Numerical results for a wide range of problems from practical applications are presented.
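The underlying idea can be sketched with a dense LU factorization standing in for the sparse symmetric one (a hypothetical helper, not HSL_MA79 itself; the loop below is the "iterative recovery process" referred to above):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def mixed_precision_solve(A, b, tol=1e-12, maxit=20):
    """Factorize in single precision, refine in double precision."""
    lu32 = lu_factor(A.astype(np.float32))   # fast factorization, half the memory
    x = lu_solve(lu32, b.astype(np.float32)).astype(np.float64)
    for _ in range(maxit):
        r = b - A @ x                        # residual in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Each refinement step reuses the cheap single-precision factors.
        x += lu_solve(lu32, r.astype(np.float32)).astype(np.float64)
    return x

A = np.random.rand(300, 300) + 300 * np.eye(300)   # well-conditioned test matrix
b = np.random.rand(300)
x = mixed_precision_solve(A, b)
print(np.linalg.norm(A @ x - b))                   # small double-precision residual
```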


Concurrency and Computation: Practice and Experience | 2017

Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides

Weifeng Liu; Ang Li; Jonathan D. Hogg; Iain S. Duff; Brian Vinter

The sparse triangular solve kernels, SpTRSV and SpTRSM, are important building blocks for a number of numerical linear algebra routines. Parallelizing SpTRSV and SpTRSM on today's manycore platforms, such as GPUs, is not an easy task, since computing a component of the solution may depend on previously computed components, enforcing a degree of sequential processing. As a consequence, most existing work introduces a preprocessing stage to partition the components into a group of level-sets or colour-sets so that components within a set are independent and can be processed simultaneously during the subsequent solution stage. However, this class of methods requires a long preprocessing time as well as significant runtime synchronization overheads between the sets. To address this, we propose in this paper novel approaches for SpTRSV and SpTRSM in which the ordering between components is naturally enforced within the solution stage. In this way, the cost of preprocessing can be greatly reduced, and the synchronizations between sets are completely eliminated. To further exploit the data-parallelism, we also develop an adaptive scheme for efficiently processing multiple right-hand sides in SpTRSM. A comparison with a state-of-the-art library supplied by the GPU vendor, using 20 sparse matrices on the latest GPU device, shows that the proposed approach obtains an average speedup of over two for SpTRSV and up to an order of magnitude speedup for SpTRSM. In addition, our method is up to two orders of magnitude faster for the preprocessing stage than existing SpTRSV and SpTRSM methods.
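The additional data-parallelism of the multiple right-hand-side case is visible even in a serial sketch: each substitution step operates on an entire row of the solution block across all right-hand sides at once (a hypothetical helper, not the paper's adaptive GPU scheme):

```python
import numpy as np
from scipy import sparse

def sptrsm(L, B):
    """Forward substitution for L X = B, with B of shape (n, nrhs).

    The inner update is one vectorized operation across all right-hand
    sides -- the extra parallel dimension SpTRSM offers over SpTRSV.
    """
    L = L.tocsr()
    diag = L.diagonal()
    X = np.array(B, dtype=float)
    for i in range(L.shape[0]):
        s = slice(L.indptr[i], L.indptr[i + 1])
        cols, vals = L.indices[s], L.data[s]
        m = cols < i
        X[i, :] -= vals[m] @ X[cols[m], :]   # one dot product per RHS, vectorized
        X[i, :] /= diag[i]
    return X

L = sparse.csr_matrix(np.tril(np.random.rand(5, 5)) + 5 * np.eye(5))
B = np.random.rand(5, 3)                     # three right-hand sides
assert np.allclose(L.toarray() @ sptrsm(L, B), B)
```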


SIAM Journal on Scientific Computing | 2013

A Fast Dense Triangular Solve in CUDA

Jonathan D. Hogg

The level 2 BLAS operation _trsv performs a dense triangular solve and is often used in the solve phase of a direct solver following a matrix factorization. With the advent of manycore architectures reducing the cost of compute-bound parts of the computation, memory-bound operations such as this kernel become increasingly important. This is particularly noticeable in sparse direct solvers used for optimization applications, where multiple memory-bound solves follow each (traditionally expensive) compute-bound factorization. In this paper, a high performance implementation of the triangular solve is developed through an analysis of theoretical and practical bounds on its run time. This implementation outperforms CUBLAS by a factor of 5 to 15.
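The key observation is that _trsv is memory-bound: it must read each of the roughly n^2/2 stored entries of the triangle once, so its run time is bounded below by memory bandwidth rather than flops. A rough check of that bound on the CPU, using SciPy's interface to the BLAS triangular solve (the paper's analysis and implementation target CUDA):

```python
import time
import numpy as np
from scipy.linalg import solve_triangular

n = 4096
L = np.tril(np.random.rand(n, n)) + n * np.eye(n)
b = np.random.rand(n)

t0 = time.perf_counter()
x = solve_triangular(L, b, lower=True)   # dense triangular solve (trsv)
elapsed = time.perf_counter() - t0

# The solve streams the n(n+1)/2 stored float64 entries of L at least
# once, so effective bandwidth bounds the achievable run time.
bytes_read = 8 * n * (n + 1) / 2
print(f"time {elapsed:.4f} s, "
      f"effective bandwidth {bytes_read / elapsed / 1e9:.1f} GB/s")
```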


Numerical Linear Algebra With Applications | 2015

On the use of suboptimal matchings for scaling and ordering sparse symmetric matrices

Jonathan D. Hogg; Jennifer A. Scott

The use of matchings is a powerful technique for scaling and ordering sparse matrices prior to the solution of a linear system Ax = b. Traditional methods, such as that implemented by the HSL software package MC64, use the Hungarian algorithm to solve the maximum weight maximum cardinality matching problem. However, with advances in the algorithms and hardware used by direct methods for the parallelization of the factorization and solve phases, the serial Hungarian algorithm can represent an unacceptably large proportion of the total solution time for such solvers. Recently, auction algorithms and approximation algorithms have been suggested as alternatives for achieving near-optimal solutions for the maximum weight maximum cardinality matching problem. In this paper, the efficacy of auction and approximation algorithms as replacements for the Hungarian algorithm is assessed in the context of sparse symmetric direct solvers when used in problems arising from a range of practical applications. High-cardinality suboptimal matchings are shown to be as effective as optimal matchings for the purposes of scaling. However, matching-based ordering techniques require that matchings are much closer to optimality before they become effective. The auction algorithm is demonstrated to be capable of finding such matchings significantly faster than the Hungarian algorithm, but our 1/2-approximation matching approach fails to consistently achieve a sufficient cardinality.
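As a small illustration of the optimal matching being approximated, SciPy's linear_sum_assignment (an optimal assignment solver in the same role as the Hungarian algorithm in MC64) finds the row-column matching that maximizes the product of matched magnitudes by maximizing the sum of their logarithms; auction and approximation algorithms trade this optimality for speed:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

A = np.array([[0.1, 3.0, 0.0],
              [2.0, 0.0, 0.5],
              [0.0, 1.0, 4.0]])

absA = np.abs(A)
W = np.full(A.shape, -1e9)          # large penalty stands in for structural zeros
W[absA > 0] = np.log(absA[absA > 0])

rows, cols = linear_sum_assignment(W, maximize=True)
print(list(zip(rows, cols)))        # matched (row, column) pairs
```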

Collaboration


Dive into Jonathan D. Hogg's collaborations.

Top Co-Authors

Jennifer A. Scott, Rutherford Appleton Laboratory
Iain S. Duff, Rutherford Appleton Laboratory
J. K. Reid, Rutherford Appleton Laboratory
Brian Vinter, University of Copenhagen
Weifeng Liu, University of Copenhagen
Ang Li, Pacific Northwest National Laboratory