Dennis D. Giannacopoulos

Publication

Featured research published by Dennis D. Giannacopoulos.

IEEE Transactions on Magnetics | 2010

Finite-Element Sparse Matrix Vector Multiplication on Graphic Processing Units

Maryam Mehri Dehnavi; David M. Fernández; Dennis D. Giannacopoulos

A wide class of finite-element (FE) electromagnetic applications requires computing very large sparse matrix vector multiplications (SMVM). Due to the sparsity pattern and size of the matrices, solvers can run relatively slowly. The rapid evolution of graphic processing units (GPUs) in performance, architecture, and programmability makes them very attractive platforms for accelerating computationally intensive kernels such as SMVM. This work presents a new algorithm to accelerate the SMVM kernel on graphic processing units.
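
For readers unfamiliar with the SMVM kernel, the following is a minimal serial sketch of a sparse matrix vector product in the common compressed sparse row (CSR) layout. It is not the GPU algorithm from the paper; the function name `csr_matvec`, the CSR layout, and the small test matrix are assumptions chosen for illustration only.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Only the stored nonzeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 tridiagonal matrix [[4, 1, 0], [1, 4, 1], [0, 1, 4]].
values = [4.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Each row's dot product is independent of the others, which is the property that makes this kernel a natural candidate for parallel hardware such as GPUs.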

IEEE Transactions on Magnetics | 2006

Response surface space mapping for electromagnetic optimization

Mark Dorica; Dennis D. Giannacopoulos

An electromagnetic system can be described in a variety of ways. Coarse models provide fast evaluations but lack the required accuracy in the final stages of design. Fine models are highly accurate, but prohibitively expensive. Finding a compromise between these extremes may assist in overcoming bottlenecks in design automation and optimization. One approach is to carry out optimization in the coarse model space and use fine model simulations to fine-tune the result via space mapping. A new response surface space mapping (RSSM) strategy is presented and applied to an E-shaped patch antenna test case. The solutions that emerge are comparable to full fine model optimization at a fraction of the cost.

IEEE Conference on Electromagnetic Field Computation | 2010

Enhancing the performance of conjugate gradient solvers on graphic processing units

Maryam Mehri Dehnavi; David M. Fernández; Dennis D. Giannacopoulos

A study of the fundamental obstacles to accelerating the preconditioned conjugate gradient (PCG) method on modern graphic processing units (GPUs) is presented, and several techniques are proposed to enhance its performance over previous work, independent of the GPU generation and the matrix sparsity pattern. The proposed enhancements increase the performance of PCG by up to 23 times compared to vector-optimized PCG results on modern CPUs and by up to 3.4 times compared to previous GPU results.
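
As background, the PCG iteration itself can be sketched in a few lines. The version below is a plain serial sketch with a Jacobi (diagonal) preconditioner, which is an assumption made here for simplicity; the abstract does not state which preconditioner or data layout the paper uses, and the names `pcg_jacobi` and `dense_matvec` are hypothetical.

```python
def pcg_jacobi(matvec, diag, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A.

    matvec(v) returns A @ v and diag holds the diagonal of A, which
    serves as the (assumed) Jacobi preconditioner.
    Returns (x, iterations_used).
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                # residual for x = 0
    z = [r[i] / diag[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                # first search direction
    rz = sum(r[i] * z[i] for i in range(n))
    for it in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            return x, it
        Ap = matvec(p)                         # dominant cost: one SMVM
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Hypothetical 3x3 test system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]


def dense_matvec(v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


x, iters = pcg_jacobi(dense_matvec, [4.0, 4.0, 4.0], [6.0, 12.0, 14.0])
```

Every iteration consists of one matrix vector product plus a handful of vector updates and dot products, so any GPU acceleration of PCG has to address both the product and the reductions.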

Computer Physics Communications | 2008

FPGA architecture and implementation of sparse matrix–vector multiplication for the finite element method

Yousef El-Kurdi; David M. Fernández; Evgueni Souleimanov; Dennis D. Giannacopoulos; Warren J. Gross

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), and interest in exploiting this technology has grown in the scientific community. We present an architecture and implementation of an FPGA-based sparse matrix–vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGA's computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily large matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz, obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth, typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved.
Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
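
The quoted peak figure can be checked with simple arithmetic. The sketch below assumes that each PE completes one multiply and one add per clock cycle (2 floating-point operations); the abstract does not state this, but it is the assumption under which 8 PEs at 110 MHz give exactly 1.76 GFLOPS. The function name and the `num_pipelines` parameter are hypothetical.

```python
def peak_gflops(num_pes, clock_hz, flops_per_pe_per_cycle=2, num_pipelines=1):
    """Peak throughput in GFLOPS for a linear array of processing elements.

    flops_per_pe_per_cycle=2 is an assumption: one multiply plus one add
    per PE per cycle.
    """
    return num_pipelines * num_pes * clock_hz * flops_per_pe_per_cycle / 1e9


# Prototype from the abstract: 8 PEs clocked at 110 MHz.
prototype = peak_gflops(8, 110_000_000)

# Linear scaling with pipeline instances, as the abstract describes.
two_pipelines = peak_gflops(8, 110_000_000, num_pipelines=2)
```

Under that assumption `prototype` evaluates to the 1.76 GFLOPS stated above, and doubling the number of pipeline instances doubles the peak.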

IEEE Conference on Electromagnetic Field Computation | 2007

Hardware Acceleration for Finite-Element Electromagnetics: Efficient Sparse Matrix Floating-Point Computations With FPGAs

Yousef El-Kurdi; Dennis D. Giannacopoulos; Warren J. Gross

Custom hardware acceleration of electromagnetics computation leverages favorable industry trends, which indicate that reconfigurable hardware devices such as field-programmable gate arrays (FPGAs) may soon outperform general-purpose CPUs. We present a new striping method for efficient sparse matrix-vector multiplication implemented in a deeply pipelined FPGA design. The effectiveness of the new method is illustrated for a representative set of finite-element matrices computed on our highly scalable and fully pipelined FPGA-based implementation.

IEEE Transactions on Magnetics | 1994

Towards optimal h-p adaptation near singularities in finite element electromagnetics

Dennis D. Giannacopoulos; Steve McFee

One of the most important problems of hybrid h-p adaption in finite element electromagnetics has been the accurate and efficient resolution of the singularities associated with sharp material edges and corners. One of the key obstacles has been the lack of objective standards by which to evaluate and compare adaptive control strategies. A set of optimal adaption benchmarks for the fundamental electromagnetic point and line singularity models is presented. The primary adaption procedures and control schemes are evaluated and compared. The absolute and relative performance of the competing approaches is discussed.

IEEE Transactions on Magnetics | 1996

Optimal discretization based refinement criteria for finite element adaption

Steve McFee; Dennis D. Giannacopoulos

One of the major research issues in adaptive finite element analysis is the feedback control system used to guide the adaption. Essentially, one needs to resolve which error data to feed back after each iteration, and how to use it to initialize the next adaptive step. Variational aspects of optimal discretizations for scalar Poisson and Helmholtz systems are used to derive new refinement criteria for adaptive finite element solvers. They are shown to be effective and economical for h-, p- and hp-schemes.

IEEE Transactions on Parallel and Distributed Systems | 2013

Parallel Sparse Approximate Inverse Preconditioning on Graphic Processing Units

Maryam Mehri Dehnavi; David M. Fernández; Jean-Luc Gaudiot; Dennis D. Giannacopoulos

Accelerating numerical algorithms for solving sparse linear systems on parallel architectures has attracted the attention of many researchers due to their applicability to many engineering and scientific problems. The solution of sparse systems often dominates the overall execution time of such problems and is mainly obtained by iterative methods. Preconditioners are used to accelerate the convergence rate of these solvers and reduce the total execution time. Sparse approximate inverse (SAI) preconditioners are a popular class of preconditioners designed to improve the condition number of large sparse matrices. We propose a GPU-accelerated SAI preconditioning technique called GSAI, which parallelizes the computation of this preconditioner on NVIDIA graphic cards. The preconditioner is then used to enhance the convergence rate of the BiConjugate Gradient Stabilized (BiCGStab) iterative solver on the GPU. The SAI preconditioner is generated on average 28 and 23 times faster on the NVIDIA GTX480 and TESLA M2070 graphic cards, respectively, compared to single processor/core results from ParaSails (a popular implementation of SAI preconditioners on CPU). The proposed GSAI technique computes the SAI preconditioner in approximately the same time as ParaSails generates the same preconditioner on 16 AMD Opteron 252 processors.

IEEE Transactions on Magnetics | 2012

Alternate Parallel Processing Approach for FEM

David M. Fernández; Maryam Mehri Dehnavi; Warren J. Gross; Dennis D. Giannacopoulos

In this work we present an alternate way to formulate the finite element method (FEM) for parallel processing, based on the solution of single mesh elements and called FEM-SES. The key idea is to decouple the solution of a single element from that of the whole mesh, thus exposing parallelism at the element level. Individual element solutions are then superimposed node-wise using a weighted sum over concurrent nodes. A classic 2-D electrostatic problem is used to validate the proposed method, obtaining accurate results. Results show that the number of iterations of the proposed FEM-SES method scales sublinearly with the number of unknowns. Two generations of CUDA-enabled NVIDIA GPUs were used to implement the FEM-SES method, and the execution times were compared to the classic FEM, showing important performance benefits.

IEEE Transactions on Antennas and Propagation | 2013

Finite-Element Time-Domain Solution of the Vector Wave Equation in Doubly Dispersive Media Using Möbius Transformation Technique

Ali Akbarzadeh-Sharbaf; Dennis D. Giannacopoulos

Several finite-element time-domain (FETD) formulations to model inhomogeneous and electrically/magnetically/doubly dispersive materials based on the second-order vector wave equation discretized by the Newmark-β scheme are developed. In contrast to the existing formulations, which employ recursive convolution (RC) approaches, we use a Möbius transformation method to derive our new formulations. Hence, the obtained equations are not only simpler in form and easier to derive and implement, but also do not suffer from the intrinsic limitations of the RC methods in modeling arbitrary high-order media. To obtain the formulations, we first demonstrate that the update equation for the electric field strength {e} in the mixed Crank-Nicolson (CN) FETD formulation, which is based on expanding the electric and magnetic field in terms of the edge and face elements in space and discretizing the resultant first-order differential equations using the Crank-Nicolson scheme in time, is equivalent to the unconditionally stable (US) second-order vector wave equation for the same variable ({e}) discretized by the Newmark-β method with β = 1/4. In addition, we show that the update equation for the magnetic flux density {b} in CN-FETD is the same as the second-order vector wave equation for {b} on the dual grid discretized again by a similar Newmark-β method. Subsequently, thanks to the mixed FETD formulation properties, we derive update equations for the constitutive relations using a Möbius transformation method separately. In addition, we use the shown equivalence to derive formulations based on the vector wave equation. Finally, several numerical examples are solved to validate the developed formulations.
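
For orientation, the textbook form of the Newmark-β discretization mentioned in the abstract can be written out for a lossless, non-dispersive medium. This is a generic sketch rather than the paper's dispersive formulation; the mass matrix [T], stiffness matrix [S], source vector {f}, and time step Δt are notation assumed here, not taken from the abstract.

```latex
% Semi-discrete second-order vector wave equation (assumed notation)
[T]\,\frac{d^{2}\{e\}}{dt^{2}} + [S]\,\{e\} = \{f\}

% Newmark-\beta time discretization at step n
[T]\,\frac{\{e\}^{n+1} - 2\{e\}^{n} + \{e\}^{n-1}}{\Delta t^{2}}
  + [S]\left(\beta\{e\}^{n+1} + (1-2\beta)\{e\}^{n} + \beta\{e\}^{n-1}\right)
  = \beta\{f\}^{n+1} + (1-2\beta)\{f\}^{n} + \beta\{f\}^{n-1}

% With \beta = 1/4, solving for the new field gives
\left(\frac{[T]}{\Delta t^{2}} + \frac{[S]}{4}\right)\{e\}^{n+1}
  = \left(\frac{2[T]}{\Delta t^{2}} - \frac{[S]}{2}\right)\{e\}^{n}
  - \left(\frac{[T]}{\Delta t^{2}} + \frac{[S]}{4}\right)\{e\}^{n-1}
  + \frac{\{f\}^{n+1} + 2\{f\}^{n} + \{f\}^{n-1}}{4}
```

The scheme is unconditionally stable for β ≥ 1/4, which is why β = 1/4 is the value singled out in the abstract.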