Publication


Featured research published by Alexei Strelchenko.


Computer Physics Communications | 2014

Evaluation of disconnected quark loops for hadron structure using GPUs

Constantia Alexandrou; Martha Constantinou; Vincent Drach; Kyriakos Hadjiyiannakou; Karl Jansen; Giannis Koutsou; Alexei Strelchenko; Alejandro Vaquero

A number of stochastic methods developed for the calculation of fermion loops are investigated and compared, in particular with respect to their efficiency when implemented on Graphics Processing Units (GPUs). We assess the performance of the various methods by studying the convergence and statistical accuracy obtained for observables that require a large number of stochastic noise vectors, such as the isoscalar nucleon axial charge. The various methods are also examined for the evaluation of sigma-terms, where noise reduction techniques specific to the twisted mass formulation can be utilized, thus reducing the required number of stochastic noise vectors.
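The role of the stochastic noise vectors can be illustrated with a minimal sketch of stochastic trace estimation. This is not the paper's code: the toy matrix `M` and the helper names are assumptions made for illustration. In a lattice calculation the matrix would involve the inverse Dirac operator, so each noise vector would cost one linear solve, which is why the required number of vectors matters.

```python
import random

def z2_noise(n, rng):
    """Return a Z2 noise vector with entries drawn uniformly from {+1, -1}."""
    return [rng.choice((1.0, -1.0)) for _ in range(n)]

def matvec(m, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def stochastic_trace(m, n_noise, seed=0):
    """Estimate Tr(m) as the average of eta^T m eta over n_noise Z2 vectors."""
    rng = random.Random(seed)
    n = len(m)
    total = 0.0
    for _ in range(n_noise):
        eta = z2_noise(n, rng)
        m_eta = matvec(m, eta)
        total += sum(e * me for e, me in zip(eta, m_eta))
    return total / n_noise

# Toy 4x4 symmetric matrix standing in for the loop operator (hypothetical).
M = [[2.0 if i == j else 0.1 for j in range(4)] for i in range(4)]
exact = sum(M[i][i] for i in range(4))
estimate = stochastic_trace(M, n_noise=4000)
```

With Z2 noise the diagonal entries contribute exactly, so the statistical error comes only from the off-diagonal entries and falls off as the inverse square root of the number of noise vectors.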


Computer Physics Communications | 2012

Evaluation of fermion loops applied to the calculation of the η′ mass and the nucleon scalar and electromagnetic form factors

Constantia Alexandrou; Kyriakos Hadjiyiannakou; Giannis Koutsou; A. OʼCais; Alexei Strelchenko

The exact evaluation of the disconnected diagram contributions to the flavor-singlet pseudo-scalar meson mass, the nucleon σ-term and the nucleon electromagnetic form factors is carried out utilizing GPGPU technology with the NVIDIA CUDA platform. The disconnected loops are also computed using stochastic methods with several noise reduction techniques. Various dilution schemes as well as the truncated solver method are studied. We make a comparison of these stochastic techniques to the exact results and show that the number of noise vectors depends on the operator insertion in the fermion loop.


Physical Review D | 2013

Determination of Δ-resonance parameters from lattice QCD

Constantia Alexandrou; John W. Negele; Marcus Petschlies; Alexei Strelchenko; A. Tsapalis

A method suitable for extracting resonance parameters of unstable baryons in lattice QCD is examined. The method is applied to the strong decay of the Delta to a pion-nucleon state, extracting the pion-nucleon-Delta coupling constant and the Delta decay width.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2016

Accelerating lattice QCD multigrid on GPUs using fine-grained parallelization

Michael Clark; Balint Joo; Alexei Strelchenko; Michael Cheng; Arjun Singh Gambhir; Richard C. Brower

The past decade has witnessed a dramatic acceleration of lattice quantum chromodynamics calculations in nuclear and particle physics. This has been due to both significant progress in accelerating the iterative linear solvers using multigrid algorithms, and due to the throughput improvements brought by GPUs. Deploying hierarchical algorithms optimally on GPUs is non-trivial owing to the lack of parallelism on the coarse grids, and as such, these advances have not proved multiplicative. Using the QUDA library, we demonstrate that by exposing all sources of parallelism that the underlying stencil problem possesses, and through appropriate mapping of this parallelism to the GPU architecture, we can achieve high efficiency even for the coarsest of grids. Results are presented for the Wilson-Clover discretization, where we demonstrate up to 10x speedup over present state-of-the-art GPU-accelerated methods on Titan. Finally, we look to the future, and consider the software implications of our findings.


Computer Physics Communications | 2018

Pushing memory bandwidth limitations through efficient implementations of Block-Krylov space solvers on GPUs

M. A. Clark; Alexei Strelchenko; Alejandro Vaquero; Mathias Wagner; Evan Weinberg

The cost of the iterative solution of a sparse matrix–vector system against multiple vectors is a common challenge within scientific computing. A tremendous number of algorithmic advances, such as eigenvector deflation and domain-specific multi-grid algorithms, have been ubiquitously beneficial in reducing this cost. However, they do not address the intrinsic memory-bandwidth constraints of the matrix–vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speedup. Block-Krylov solvers can naturally take advantage of such batched matrix–vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. Practical implementations typically suffer from the quadratic scaling in the number of vector–vector operations. We present an implementation of the block Conjugate Gradient algorithm on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector–vector operations from quadratic to linear. As a representative case, we consider the domain of lattice quantum chromodynamics and present results for one of the fermion discretizations. Using the QUDA library as a framework, we demonstrate a 5× speedup compared to highly-optimized independent Krylov solves on NVIDIA’s SaturnV cluster.
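The bandwidth argument for batching can be sketched in a few lines. This is a toy illustration, not QUDA's implementation: the CSR layout, the function names, and the explicit load counter are all assumptions chosen to make the memory traffic visible. Applying the matrix to each vector separately reads every matrix element once per vector, while the batched version reads it once for all vectors.

```python
def csr_matvec(values, col_idx, row_ptr, x, counter):
    """Compute y = A x for one vector; counter[0] tallies matrix-element loads."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = values[k]          # one load of a matrix element
            counter[0] += 1
            y[i] += a * x[col_idx[k]]
    return y

def csr_matvec_batched(values, col_idx, row_ptr, xs, counter):
    """Compute A x for every x in xs, loading each matrix element only once."""
    n_rows = len(row_ptr) - 1
    ys = [[0.0] * n_rows for _ in xs]
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = values[k]          # loaded once, reused for all vectors
            counter[0] += 1
            j = col_idx[k]
            for y, x in zip(ys, xs):
                y[i] += a * x[j]
    return ys

# 4x4 tridiagonal (1D Laplacian) matrix in CSR form, used as a stand-in operator.
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
row_ptr = [0, 2, 5, 8, 10]
xs = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0]]

loads_single = [0]
ys_single = [csr_matvec(values, col_idx, row_ptr, x, loads_single) for x in xs]

loads_batched = [0]
ys_batched = csr_matvec_batched(values, col_idx, row_ptr, xs, loads_batched)
```

Both paths produce the same results, but the batched path performs one third of the matrix-element loads for three vectors. On real hardware the gain also depends on cache and register blocking, which this sketch does not model.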


arXiv: High Energy Physics - Lattice | 2016

Accelerating Twisted Mass LQCD with QPhiX

Mario Schröck; S. Simula; Alexei Strelchenko

We present the implementation of twisted mass fermion operators for the QPhiX library. We analyze the performance on the Intel Xeon Phi (Knights Corner) coprocessor as well as on Intel Xeon Haswell CPUs. In particular, we demonstrate that on the Xeon Phi 7120P the Dslash kernel is able to reach 80% of the theoretical peak bandwidth, while on a Xeon Haswell E5-2630 CPU our generated code for the Dslash operator with AVX2 instructions outperforms the corresponding implementation in the tmLQCD library by a factor of ∼5× in single precision. We strong scale the code up to 6.8 (14.1) Tflops in single (half) precision on 64 Xeon Haswell CPUs.


arXiv: High Energy Physics - Lattice | 2014

Implementation of the twisted mass fermion operator in the QUDA library

Alexei Strelchenko; Constantia Alexandrou; Giannis Koutsou; Alejandro Vaquero Avilés-Casco


arXiv: High Energy Physics - Lattice | 2014

A QUDA-branch to compute disconnected diagrams in GPUs

Alejandro Vaquero Avilés-Casco; Constantia Alexandrou; Kyriakos Hadjiyiannakou; Giannis Koutsou; Alexei Strelchenko


Proceedings of 34th annual International Symposium on Lattice Field Theory — PoS(LATTICE2016) | 2017

Progress Report on Staggered Multigrid

Evan Weinberg; Richard C. Brower; Kate Clark; Alexei Strelchenko


Proceedings of The 32nd International Symposium on Lattice Field Theory — PoS(LATTICE2014) | 2015

Extending the QUDA library with the eigCG solver

Alexei Strelchenko; Andreas Stathopoulos

Collaboration


Alexei Strelchenko's collaborators.

Top Co-Authors

Richard C. Brower

Massachusetts Institute of Technology

K. Jansen

Humboldt University of Berlin

A. Vaquero

University of Zaragoza
