Sverker Holmgren | Researchain

Publication

Featured research published by Sverker Holmgren.

Bioinformatics | 2004

Simultaneous search for multiple QTL using the global optimization algorithm DIRECT

Kajsa Ljungberg; Sverker Holmgren; Örjan Carlborg

MOTIVATION: A simultaneous search is necessary for maximizing the power to detect epistatic quantitative trait loci (QTL). The computational complexity demands that the traditional exhaustive search be replaced by a more efficient global optimization algorithm. RESULTS: We have adapted DIRECT, a previously known algorithm, to the problem of simultaneous mapping of multiple QTL. We have compared DIRECT with standard exhaustive search and a genetic algorithm previously used for QTL mapping in two dimensions. In all two- and three-QTL test cases, DIRECT accurately finds the global optimum two to four orders of magnitude faster than an exhaustive search, and one order of magnitude faster than the genetic algorithm. Thus, randomization testing for determining empirical significance thresholds for at least three QTL is made feasible by the use of DIRECT. AVAILABILITY: The code of the prototype implementation is available at http://user.it.uu.se/~kl/qtl_software.html

International Conference on Supercomputing | 2005

Affinity-on-next-touch: increasing the performance of an industrial PDE solver on a cc-NUMA system

Henrik Löf; Sverker Holmgren

The non-uniform memory access times of modern cc-NUMA systems often impair performance for shared memory applications. This is especially true for applications exhibiting complex access patterns. To improve performance, a mechanism for co-locating threads and data during the execution is needed. In this paper, we study how an affinity-on-next-touch procedure can be used to attain this goal. Such a procedure can increase thread-data affinity by migrating data across nodes to better match the access pattern. The migration is triggered by a directive and it can often be implemented as a re-invocation of a standard first-touch page placement procedure. We study an industrial-class scientific application where the thread-data affinity is small due to serial initializations of data structures accessed indirectly. Adding a single affinity-on-next-touch directive, we observed a performance improvement of 69% for 22 threads. We also perform experiments to study the scalability of the affinity-on-next-touch procedure. Our results indicate that the overhead associated with the procedure is highly dependent on the efficiency of the mechanism used to keep TLBs consistent. Using larger but fewer memory pages in the virtual memory sub-system we measured a total performance improvement of 166% compared to the original code.

Journal of Chemical Physics | 2008

Accurate time propagation for the Schrödinger equation with an explicitly time-dependent Hamiltonian

Katharina Kormann; Sverker Holmgren; Hans O. Karlsson

Several different numerical propagation techniques for explicitly time-dependent Hamiltonians are discussed and compared, with the focus on models of pump-probe experiments. The quality of the rotating wave approximation is analyzed analytically, and we point out under which circumstances the modeling becomes inaccurate. For calculations with the fully time-dependent Hamiltonian, we show that for multistate systems, with either time or space dependence in the interstate coupling, the fourth order truncated Magnus expansion can be reformulated so that no commutators appear. Our results show that the split-operator method should only be used when low accuracy is acceptable. For accurate and efficient time stepping, the Magnus-Lanczos approach appears to be the best choice.

SIAM Journal on Matrix Analysis and Applications | 1992

Iterative solution methods and preconditioners for block-tridiagonal systems of equations

Sverker Holmgren; Kurt Otto

Systems of equations arising from implicit time discretizations and finite difference space discretizations of systems of partial differential equations in two space dimensions are considered. The ...

International Conference on Supercomputing | 2006

Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors

Dan Wallin; Henrik Löf; Erik Hagersten; Sverker Holmgren

Efficient solution of partial differential equations requires a match between the algorithm and the target architecture. Many recent chip multiprocessors, CMPs (a.k.a. multi-core), feature low intra-thread communication costs and smaller per-thread caches compared to previous shared memory multi-processor systems. From an algorithmic point of view this means that data locality issues become more important than communication overheads, a fact that may require a re-evaluation of many existing algorithms. We have investigated parallel implementations of multigrid methods using a parallel temporally blocked, naturally ordered smoother. Compared to the standard multigrid solution based on a red-black ordering, we improve the data locality often as much as ten times, while our use of a fine-grained locking scheme keeps the parallel efficiency high. Our algorithm was initially inspired by CMPs and it was surprising to see that our OpenMP multigrid implementation ran up to 40 percent faster than the standard red-black algorithm on a contemporary 8-way SMP system. Thanks to the temporal blocking introduced, our smoother implementation often allowed us to apply the smoother two times at the same cost as a single application of a red-black smoother. By executing our smoother on a 32-thread UltraSPARC T1 (Niagara) SMT/CMP and a simulated 32-way CMP we demonstrate that such architectures can tolerate the increased communication costs implied by the tradeoffs made in our implementation.
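
As a rough illustration of the two smoother orderings named in the abstract above, the following Python sketch applies naturally ordered and red-black Gauss-Seidel sweeps to a 1D Poisson model problem. It is not the authors' implementation: the model problem, the grid size, and all function names are assumptions chosen for illustration, and the sketch omits the temporal blocking, fine-grained locking, and multigrid hierarchy described in the paper.

```python
# Illustrative sketch only: Gauss-Seidel sweeps for -u'' = f on [0, 1]
# with u(0) = u(1) = 0, discretized by second-order finite differences.
# All names and parameters here are hypothetical.

def residual_norm(u, f, h):
    """Max-norm of the residual f - A u over the interior grid points."""
    return max(
        abs(f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h))
        for i in range(1, len(u) - 1)
    )

def update_point(u, f, h, i):
    """Solve the i-th discrete equation for u[i] using current neighbours."""
    u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def sweep_natural(u, f, h):
    """One sweep in natural (left-to-right) ordering."""
    for i in range(1, len(u) - 1):
        update_point(u, f, h, i)

def sweep_red_black(u, f, h):
    """One sweep in red-black ordering: odd-indexed points, then even."""
    for start in (1, 2):
        for i in range(start, len(u) - 1, 2):
            update_point(u, f, h, i)

n = 17                      # grid points including both boundaries
h = 1.0 / (n - 1)
f = [1.0] * n               # constant right-hand side
u_nat = [0.0] * n           # zero initial guess, boundaries stay zero
u_rb = [0.0] * n

r0 = residual_norm(u_nat, f, h)
for _ in range(1000):
    sweep_natural(u_nat, f, h)
    sweep_red_black(u_rb, f, h)

r_nat = residual_norm(u_nat, f, h)
r_rb = residual_norm(u_rb, f, h)
print(r0, r_nat, r_rb)
```

Within one colour of the red-black sweep, no point depends on another point of the same colour, which is what makes that ordering easy to parallelize. The natural ordering carries a dependency from each point to its left neighbour, so parallelizing it requires extra coordination between threads.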

Journal of Computational Biology | 2002

Efficient algorithms for quantitative trait loci mapping problems.

Kajsa Ljungberg; Sverker Holmgren; Örjan Carlborg

Rapid advances in molecular genetics push the need for efficient data analysis. Advanced algorithms are necessary for extracting all possible information from large experimental data sets. We present a general linear algebra framework for quantitative trait loci (QTL) mapping, using both linear regression and maximum likelihood estimation. The formulation simplifies future comparisons between and theoretical analyses of the methods. We show how the common structure of QTL analysis models can be used to improve the kernel algorithms, drastically reducing the computational effort while retaining the original analysis results. We have evaluated our new algorithms on data sets originating from two large F2 populations of domestic animals. Using an updating approach, we show that 1-3 orders of magnitude reduction in computational demand can be achieved for matrix factorizations. For interval-mapping/composite-interval-mapping settings using a maximum likelihood model, we also show how to use the original EM algorithm instead of the ECM approximation, significantly improving the convergence and further reducing the computational time. The algorithmic improvements make it feasible to perform analyses which have previously been deemed impractical or even impossible. For example, using the new algorithms, it is reasonable to perform permutation testing using exhaustive search on populations of 200 individuals using an epistatic two-QTL model.

International Conference on Bioinformatics | 2009

cnF2freq: Efficient Determination of Genotype and Haplotype Probabilities in Outbred Populations Using Markov Models

Carl Nettelblad; Sverker Holmgren; Lucy Crooks; Örjan Carlborg

We have applied and implemented HMM (Hidden Markov Model) algorithms to calculate QTL genotype probabilities from marker and pedigree data in general population structures. These algorithms have a linear complexity in memory. In nearly all experimental pedigrees they result in more precise genotype estimates than the most commonly used approaches for determining genotypes at non-marker positions in QTL analysis in outbred F2 line intercrosses [1], which include an exponential complexity factor as well as a data-reducing sampling step [2]. With a proper choice of parameters, the results from the existing methods can also be reproduced exactly. We show how the relative run times differ by a factor of 50 when 24 SNP markers are used, with our run time practically independent of marker count. The new method can also provide multi-generational probability estimates and perform haplotype inference from unphased data, which further improves accuracy and flexibility. An important future application of this method is for computationally efficient QTL genotype estimation in maps based on data from SNP chips containing thousands of markers with mixed information content, for which there are no other suitable methods available at present.

SIAM Journal on Scientific Computing | 1994

Semicirculant preconditioners for first-order partial differential equations

Sverker Holmgren; Kurt Otto

This paper considers solving time-independent systems of first-order partial differential equations (PDEs) in two space dimensions using a conjugate gradient (CG)-like iterative method. The systems of equations are preconditioned using semicirculant preconditioners. Analytical formulas for the eigenvalues and the eigenvectors are derived for a scalar model problem with constant coefficients. The main problems in constructing and analyzing the numerical methods are caused by the numerical boundary conditions required at the outflow boundaries. It is proved that, when the grid ratio is less than one, the spectrum asymptotically becomes two finite curve segments that are independent of the number of gridpoints. The same type of result for a time-dependent problem has previously been established. For the restarted generalized minimal residual (GMRES) iteration, a slight reduction of the grid ratio from one substantially improves the convergence rate. This is also predicted by an asymptotic analysis of the eig...

International Workshop on OpenMP | 2005

Geographical locality and dynamic data migration for OpenMP implementations of adaptive PDE solvers

Markus Nordén; Henrik Löf; Jarmo Rantakokko; Sverker Holmgren

On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality geographical locality. In this article, we study the performance of a parallel PDE solver with adaptive mesh refinement. The solver is parallelized using OpenMP and the adaptive mesh refinement makes dynamic load balancing necessary. Due to the dynamically changing memory access pattern caused by the runtime adaption, it is a challenging task to achieve a high degree of geographical locality. The main conclusions of the study are: (1) that geographical locality is very important for the performance of the solver, (2) that the performance can be improved significantly using dynamic page migration of misplaced data, (3) that a migrate-on-next-touch directive works well whereas the first-touch strategy is less advantageous for programs exhibiting dynamically changing memory access patterns, and (4) that the overhead for such migration is low compared to the total execution time.

Networking Architecture and Storages | 2012

Investigating an Open Source Cloud Storage Infrastructure for CERN-specific Data Analysis

Salman Zubair Toor; Rainer Toebbicke; Maitane Zotes Resines; Sverker Holmgren

We present a first case study where an open source storage cloud based on OpenStack SWIFT is used for handling data from CERN experiments using the ROOT software framework. This type of storage cloud promises to be easy to deploy and to provide transparent access to data using standardized protocols. We examine the scalability and performance of the system using test cases which are derived from the normal usage and the structure of the ROOT software. The results show that cloud solutions like the SWIFT storage system could fulfill the requirements of the CERN scientific community. To verify this, a more extensive effort with many more tests and use-cases is needed. However, the impact of providing alternate storage solutions is large and further work is motivated.
