Adam Dziekonski | Researchain

Publication

Featured research published by Adam Dziekonski.

IEEE Conference on Electromagnetic Field Computation | 2009

How to Render FDTD Computations More Effective Using a Graphics Accelerator

Piotr Sypek; Adam Dziekonski; Michal Mrozowski

Graphics processing units (GPUs) have for years been dedicated mostly to real-time rendering. Recently, leading GPU manufacturers have extended their research area and decided to also support graphics computing. In this paper, we describe the impact of new GPU features on the development of an efficient finite difference time domain (FDTD) implementation.

Progress In Electromagnetics Research (PIER) | 2011

A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU

Adam Dziekonski; Adam Lamecki; Michal Mrozowski

This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats it has both low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single precision arithmetic. Compared to the optimized six core Central Processing Unit (CPU) (Intel Xeon 5680) this performance implies a speedup by a factor of six. In terms of speed the new format is as fast as the best format published so far and at the same time it does not introduce redundant zero elements which have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low cost commodity GPUs with limited amount of on-board memory.

Progress In Electromagnetics Research (PIER) | 2012

Finite Element Matrix Generation on a GPU

Adam Dziekonski; Piotr Sypek; Adam Lamecki; Michal Mrozowski

This paper presents an efficient technique for fast generation of sparse systems of linear equations arising in computational electromagnetics in a finite element method using higher order elements. The proposed approach employs a graphics processing unit (GPU) for both numerical integration and matrix assembly. The performance results obtained on a test platform consisting of a Fermi GPU (1x Tesla C2075) and a CPU (2x twelve-core Opterons) indicate that the GPU implementation of the matrix generation allows one to achieve speedups by a factor of 81 and 19 over the optimized single- and multi-threaded CPU-only implementations, respectively.

IEEE Microwave and Wireless Components Letters | 2011

GPU Acceleration of Multilevel Solvers for Analysis of Microwave Components With Finite Element Method

Adam Dziekonski; Adam Lamecki; Michal Mrozowski

The letter discusses a fast implementation of the conjugate gradient iterative method with an E-field multilevel preconditioner applied to solving real symmetric and sparse systems obtained with the vector finite element method. In order to accelerate computations, a graphics processing unit (GPU) was used and a significant speed-up (2.61-fold) was achieved compared to a central processing unit (CPU) based approach. These results indicate that the performance of electromagnetic simulations can be significantly improved, thereby enabling full wave optimization of microwave components in more manageable time.

IEEE Antennas and Wireless Propagation Letters | 2011

Tuning a Hybrid GPU-CPU V-Cycle Multilevel Preconditioner for Solving Large Real and Complex Systems of FEM Equations

Adam Dziekonski; Adam Lamecki; Michal Mrozowski

This letter presents techniques for tuning an accelerated preconditioned conjugate gradient solver with a multilevel preconditioner. The solver is optimized for a fast solution of sparse systems of equations arising in computational electromagnetics in a finite element method using higher-order elements. The goal of the tuning is to increase the throughput while at the same time reducing the memory requirements in order to allow one to process very large complex or real systems in single and double precision using commodity graphics processing units (GPUs). A threefold memory footprint reduction is achieved by means of a new format of storing sparse matrices. The acceleration is achieved by optimizing a sparse matrix-vector product on a GPU by applying new features of the Fermi architecture. Further improvements are obtained by introducing more levels into the preconditioner and applying a fast sparse direct solver for the operations executed on a CPU. Numerical results for a setup consisting of a Fermi GPU (GTX 480) and a Xeon six-core CPU showed that the proposed approach allows one to handle systems involving millions of unknowns and reach a speedup factor of almost 4 compared to the CPU-only implementation.

IEEE Antennas and Wireless Propagation Letters | 2012

Accuracy, Memory, and Speed Strategies in GPU-Based Finite-Element Matrix-Generation

Adam Dziekonski; Piotr Sypek; Adam Lamecki; Michal Mrozowski

This letter presents strategies on how to optimize graphics processing unit (GPU)-based finite-element matrix-generation that occurs in the finite element method (FEM) using higher-order curvilinear elements. The goal of the optimization is to increase the speed of evaluation and assembly of large finite-element matrices on a single GPU while maintaining the accuracy of numerical integration at the desired level. For this reason, the choice of the optimal Gaussian quadratures for curvilinear finite elements focused on accuracy, memory usage, and runtime of numerical integration is discussed. Moreover, we show how to efficiently utilize symmetry of local mass and stiffness matrices on a GPU in the numerical integration step. The performance results, obtained on a workstation equipped with one Tesla C2075, indicate that the proposed strategies retain the accuracy of computations, allow generation of larger sparse linear systems, and provide 2.5-fold acceleration of GPU-based finite-element matrix-generation.

IEEE Antennas and Propagation Magazine | 2014

GPU-Accelerated Finite-Element Matrix Generation for Lossless, Lossy, and Tensor Media [EM Programmer's Notebook]

Adam Dziekonski; Piotr Sypek; Adam Lamecki; Michal Mrozowski

This paper presents an optimization approach for limiting memory requirements and enhancing the performance of GPU-accelerated finite-element matrix generation applied in the implementation of the higher-order finite-element method (FEM). It emphasizes the details of the implementation of the matrix-generation algorithm for the simulation of electromagnetic wave propagation in lossless, lossy, and tensor media. Moreover, the impact of GPU RAM memory requirements on the performance of the finite-element matrix-generation process is discussed. The numerical results were obtained using a workstation equipped with a Tesla K40 GPU and two Intel Xeon Sandy Bridge E5-2687W CPUs. The results obtained for the high-end test platform indicated that the utilization of a GPU in the finite-element matrix-generation process allowed significant time reduction. With double-precision arithmetic, the GPU-accelerated matrix generation of over 5 million unknowns could be carried out in a matter of tens of seconds, as opposed to a CPU that required several minutes.

IEEE Transactions on Microwave Theory and Techniques | 2017

Communication and Load Balancing Optimization for Finite Element Electromagnetic Simulations Using Multi-GPU Workstation

Adam Dziekonski; Piotr Sypek; Adam Lamecki; Michal Mrozowski

This paper considers a method for accelerating finite-element simulations of electromagnetic problems on a workstation using graphics processing units (GPUs). The focus is on finite-element formulations using higher order elements and tetrahedral meshes that lead to sparse matrices too large to be dealt with on a typical workstation using direct methods. We discuss the problem of rapid matrix generation and assembly, as well as accelerating preconditioned iterative solvers in the context of limited on-board GPU memory, and we show how to mitigate some of these problems using multiple GPUs. We propose a new fast data-distribution technique for multi-GPU platforms that allows optimal splitting of finite-element method (FEM) matrices between graphics accelerators. The technique draws upon the graph partitioning approach used in nonoverlapping domain-decomposition methods and provides information that drives the FEM matrix-generation and assembly process in such a way that it produces data structures for each GPU; this not only ensures load balancing and minimizes communication between GPUs, but also reflects the hierarchy of the basis functions. The concepts proposed in this paper are illustrated with examples involving sparse matrices of up to 13.9 million rows and over a billion nonzero elements.

International Conference on Microwaves, Radar & Wireless Communications | 2012

Multi-core and multiprocessor implementation of numerical integration in Finite Element Method

J. Mamza; P. Makyla; Adam Dziekonski; Adam Lamecki; Michal Mrozowski

The paper presents techniques for accelerating the numerical integration process which appears in the Finite Element Method. The acceleration is achieved by taking advantage of multi-core and multiprocessor devices. It is shown that a multi-core implementation with OpenMP and a GPU acceleration using the CUDA architecture allow one to achieve speedups by factors of 5 and 10 on a CPU and GPUs, respectively.

IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization | 2017

GPU acceleration of block Krylov methods for FEM problems in electromagnetics

Adam Dziekonski; Michal Mrozowski

This paper presents an approach to performing computations of a block sparse matrix-vector product, a cornerstone operation of the GPU-accelerated block Krylov subspace methods used in the solution phase of finite-element analysis. The results obtained for a high-end test platform (GPU: Tesla K40; CPU: Intel Xeon E5-2680 v3) indicate that the proposed code optimization allows significant time reductions in comparison with the functions available in the MAGMA, cuSPARSE, and Intel MKL libraries.
