Publication


Featured research published by Constantine Bekas.


Computer Science - Research and Development | 2010

A new energy aware performance metric

Constantine Bekas; Alessandro Curioni

Energy aware algorithms are the wave of the future. The development of exascale systems has made it clear that extrapolations of current technologies, algorithmic practices and performance metrics are simply inadequate. The community reacted by introducing the FLOPS/Watt metric in order to promote energy awareness. In this work we take a step forward and argue that what one should aim for is a reduction of the total energy spent, in conjunction with minimization of the time to solution. Thus, we propose to use f(time to solution) · energy (FTTSE) as the performance metric, where f(·) is an application dependent function of time. In this paper we introduce our ideas and showcase them with a recently developed framework for solving large dense linear systems.
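To make the metric concrete, below is a minimal Python sketch of how an FTTSE-style figure of merit could be evaluated and compared across runs; the quadratic choice of f and the run times, power draws, and energies are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch of the FTTSE idea: combine time to solution and energy
# into a single figure of merit f(time) * energy. The quadratic f used here is
# only an example; the paper leaves f application dependent.

def fttse(time_to_solution_s: float, energy_joules: float, f=lambda t: t**2) -> float:
    """Return f(time to solution) * energy for a given run."""
    return f(time_to_solution_s) * energy_joules

# Hypothetical comparison of two runs of the same solver:
# run A is faster but draws more power, run B is slower but frugal.
run_a = fttse(time_to_solution_s=100.0, energy_joules=100.0 * 2000.0)  # 2 kW for 100 s
run_b = fttse(time_to_solution_s=150.0, energy_joules=150.0 * 1000.0)  # 1 kW for 150 s

print(f"FTTSE run A: {run_a:.3e}")
print(f"FTTSE run B: {run_b:.3e}")  # lower is better under this metric
```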


High Performance Computational Finance | 2009

Low cost high performance uncertainty quantification

Constantine Bekas; Alessandro Curioni; I. Fedulova

Uncertainty quantification in risk analysis has become a key application. In this context, computing the diagonal of inverse covariance matrices is of paramount importance. Standard techniques that employ matrix factorizations incur a cubic cost, which quickly becomes intractable with the current explosion of data sizes. In this work we reduce this complexity to quadratic through the synergy of two algorithms that gracefully complement each other and lead to a radically different approach. First, we turn to stochastic estimation of the diagonal. This allows us to cast the problem as a linear system with a relatively small number of multiple right-hand sides. Second, for this linear system we develop a novel, mixed precision, iterative refinement scheme, which uses iterative solvers instead of matrix factorizations. We demonstrate that the new framework not only achieves the much needed quadratic cost but also offers excellent opportunities for scaling in massively parallel environments. We based our implementation on BLAS 3 kernels that ensure very high processor performance. We achieved a peak performance of 730 TFlops on 72 BG/P racks, with a sustained performance of 73% of the theoretical peak. We stress that the techniques presented in this work are quite general and applicable to several other important applications.
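The core idea can be sketched in a few lines of NumPy: estimate diag(A^{-1}) stochastically, obtaining A^{-1}v with an iterative solver rather than a factorization. The plain conjugate gradient solver and Rademacher probe vectors below are simplifying assumptions for illustration; the paper's mixed precision iterative refinement scheme and massively parallel implementation are not reproduced.

```python
import numpy as np

def cg(A, b, tol=1e-8, maxiter=1000):
    """Plain conjugate gradient for SPD A; a simple stand-in for the paper's
    mixed-precision iterative refinement solver."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def estimate_diag_inverse(A, num_probes=200, seed=0):
    """Stochastic estimator: diag(A^{-1}) ~ (sum_k v_k * A^{-1}v_k) / (sum_k v_k * v_k)
    with Rademacher (+/-1) probe vectors v_k and iterative solves for A^{-1}v_k."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    num = np.zeros(n)
    den = np.zeros(n)
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)
        num += v * cg(A, v)
        den += v * v
    return num / den

# Small dense SPD matrix standing in for a covariance (precision) matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((300, 300))
A = B @ B.T / 300 + np.eye(300)
approx = estimate_diag_inverse(A)
exact = np.diag(np.linalg.inv(A))          # reference only; avoided at scale
print("max abs error:", np.max(np.abs(approx - exact)))
```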


SIAM Journal on Scientific Computing | 2005

Computation of Smallest Eigenvalues using Spectral Schur Complements

Constantine Bekas; Yousef Saad

The automated multilevel substructuring method (AMLS) was recently presented as an alternative to well-established methods for computing eigenvalues of large matrices in the context of structural engineering. This technique is based on exploiting a high level of dimensional reduction via domain decomposition and projection methods. This paper takes a purely algebraic look at the method and explains that it can be viewed as a combination of three ingredients: (a) A first order expansion to a nonlinear eigenvalue problem that approximates the restriction of the original eigenproblem on the interface between the subdomains, (b) judicious projections on partial eigenbases that correspond to the interior of the subdomains, (c) recursivity. This viewpoint leads us to explore variants of the method which use Krylov subspaces instead of eigenbases to construct subspaces of approximants. The nonlinear eigenvalue problem viewpoint yields a second order approximation as an enhancement to the first order technique inherent to AMLS. Numerical experiments are reported to validate the approaches presented.
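To make the algebraic viewpoint concrete, here is a small NumPy/SciPy sketch of ingredient (a), the first-order linearization of the interface eigenproblem, on a single two-subdomain split of a 2-D Laplacian; the grid size and interface placement are illustrative assumptions, and the interior eigenbases, Krylov variants, and recursion of the full method are omitted.

```python
import numpy as np
import scipy.linalg as la

# Build a 2-D Laplacian on an nx-by-ny grid and split it into two subdomains
# separated by one column of interface nodes (illustrative sizes).
nx, ny = 13, 12
T = 2 * np.eye(ny) - np.eye(ny, k=1) - np.eye(ny, k=-1)
A = np.kron(np.eye(nx), T) + np.kron(2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1), np.eye(ny))

iface_col = 5                                    # interface column (not the exact middle)
interior = np.concatenate([np.arange(c * ny, (c + 1) * ny) for c in range(nx) if c != iface_col])
iface = np.arange(iface_col * ny, (iface_col + 1) * ny)

B = A[np.ix_(interior, interior)]                # interior block (both subdomains)
E = A[np.ix_(interior, iface)]                   # interior-interface coupling
C = A[np.ix_(iface, iface)]                      # interface block

# Exact interface eigenproblem: S(lam) y = 0 with
#   S(lam) = C - lam*I - E^T (B - lam*I)^{-1} E.
# A first-order expansion of (B - lam*I)^{-1} about lam = 0 gives the linear
# generalized problem  (C - E^T B^{-1} E) y = lam (I + E^T B^{-2} E) y,
# whose accuracy improves as the wanted eigenvalues become small relative to
# the spectrum of B.
X = np.linalg.solve(B, E)                        # B^{-1} E
S0 = C - E.T @ X                                 # spectral Schur complement at 0
M = np.eye(len(iface)) + X.T @ X                 # first-order "mass" matrix

approx = la.eigh(S0, M, eigvals_only=True)[:4]   # first-order approximations
exact = la.eigh(A, eigvals_only=True)[:4]        # smallest eigenvalues of A
print("first-order:", np.round(approx, 4))
print("exact      :", np.round(exact, 4))
```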


Computer Physics Communications | 2005

Computing charge densities with partially reorthogonalized Lanczos

Constantine Bekas; Yousef Saad; Murilo L. Tiago; James R. Chelikowsky

This paper considers the problem of computing charge densities in a density functional theory (DFT) framework. In contrast to traditional, diagonalization-based methods, we utilize a technique which exploits a Lanczos basis, without explicit reference to individual eigenvectors. The key ingredient of this new approach is a partial reorthogonalization strategy whose goal is to ensure a good level of orthogonality of the basis vectors. The experiments reveal that the method can be a few times faster than ARPACK's implicitly restarted Lanczos method. This is achievable by exploiting more memory and BLAS3 (dense) computations while avoiding the frequent updates of eigenvectors inherent to all restarted Lanczos methods.
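A rough NumPy sketch of the underlying idea follows: run Lanczos on a toy symmetric "Hamiltonian" and read the charge density off the diagonal of Q·χ(T)·Q^T, where χ(T) projects onto Ritz values below the Fermi level. Full reorthogonalization, the synthetic matrix, and taking the Fermi level from a reference diagonalization are simplifying assumptions; the paper's contribution is making the long Lanczos run affordable at realistic scales via partial reorthogonalization.

```python
import numpy as np

def lanczos(A, m, seed=0):
    """m-step Lanczos with full reorthogonalization (a simple stand-in for the
    partial reorthogonalization of the paper); returns basis Q and tridiagonal T."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = q
        w = A @ q - (beta[j - 1] * Q[:, j - 1] if j > 0 else 0.0)
        alpha[j] = q @ w
        w -= alpha[j] * q
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)   # full reorthogonalization
        beta[j] = np.linalg.norm(w)
        q = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

# Toy symmetric "Hamiltonian": known eigenvalue spread plus mild coupling.
n, m, n_occ = 200, 150, 12
rng = np.random.default_rng(1)
W = 0.05 * rng.standard_normal((n, n))
H = np.diag(np.linspace(0.0, 10.0, n)) + 0.5 * (W + W.T)

# Reference: occupied charge density from explicit eigenvectors.
evals, evecs = np.linalg.eigh(H)
rho_exact = np.sum(evecs[:, :n_occ] ** 2, axis=1)
e_fermi = 0.5 * (evals[n_occ - 1] + evals[n_occ])   # taken from the reference run

# Lanczos route: density = diag(Q * chi(T) * Q^T), where chi(T) = U_occ U_occ^T
# is the projector of T onto Ritz values below the Fermi level.
Q, T = lanczos(H, m)
theta, U = np.linalg.eigh(T)
F = U[:, theta < e_fermi] @ U[:, theta < e_fermi].T
rho_lanczos = np.einsum('ij,jk,ik->i', Q, F, Q)
print("max abs density error:", np.max(np.abs(rho_lanczos - rho_exact)))
```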


Parallel Computing | 2001

Cobra: Parallel path following for computing the matrix pseudospectrum

Constantine Bekas; Efstratios Gallopoulos

The construction of an accurate approximation of the ε-pseudospectrum of a matrix by means of the standard grid method is a very demanding computational task. In this paper, we describe Cobra, a domain-based method for the computation of pseudospectra that combines predictor-corrector path following with a one-dimensional grid. The algorithm offers large and medium grain parallelism and becomes particularly attractive when we seek fine resolution of the pseudospectrum boundary. We implement Cobra using standard LAPACK components and show that it is more robust than the existing path-following technique and faster than both that technique and the traditional grid method. Cobra is also combined with a partial SVD algorithm to produce an effective parallel method for computing the matrix pseudospectrum.
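For orientation, the sketch below implements the standard grid method that Cobra improves on: evaluate σ_min(zI − A) at every mesh point (one SVD per point) and read the ε-pseudospectrum boundary off the level set σ_min = ε. The test matrix, mesh, and ε are illustrative assumptions; the predictor-corrector path following itself is not shown.

```python
import numpy as np

def sigma_min_grid(A, re, im):
    """Evaluate sigma_min(z*I - A) on a rectangular mesh; the eps-pseudospectrum
    boundary is the level curve sigma_min = eps. Each grid point costs one SVD,
    which is what makes the plain grid method so expensive."""
    n = A.shape[0]
    S = np.empty((len(im), len(re)))
    for i, y in enumerate(im):
        for j, x in enumerate(re):
            S[i, j] = np.linalg.svd((x + 1j * y) * np.eye(n) - A, compute_uv=False)[-1]
    return S

# Illustrative non-normal test matrix: a nilpotent shift (Jordan-like) block.
n = 30
A = np.diag(np.ones(n - 1), 1)
re = np.linspace(-1.5, 1.5, 61)
im = np.linspace(-1.5, 1.5, 61)
S = sigma_min_grid(A, re, im)
# Points inside the eps-pseudospectrum for eps = 1e-2:
print("grid points with sigma_min < 1e-2:", int(np.sum(S < 1e-2)))
```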


Numerical Algorithms | 2013

Accelerating data uncertainty quantification by solving linear systems with multiple right-hand sides

Vassilis Kalantzis; Constantine Bekas; Alessandro Curioni; Efstratios Gallopoulos

The subject of this work is accelerating data uncertainty quantification. In particular, we are interested in expediting the stochastic estimation of the diagonal of the inverse covariance (precision) matrix that holds a wealth of information concerning the quality of data collections, especially when the matrices are symmetric positive definite and dense. Schemes built on direct methods incur a prohibitive cubic cost. Recently proposed iterative methods can remedy this but the overall cost is raised again as the convergence of stochastic estimators can be slow. The motivation behind our approach stems from the fact that the computational bottleneck in stochastic estimation is the application of the precision matrix on a set of appropriately selected vectors. The proposed method combines block conjugate gradient with a block-seed approach for multiple right-hand sides, taking advantage of the nature of the right-hand sides and the fact that the diagonal is not sought to high accuracy. Our method is applicable if the matrix is only known implicitly and also produces a matrix-free diagonal preconditioner that can be applied to further accelerate the method. Numerical experiments confirm that the approach is promising and helps contain the overall cost of diagonal estimation as the number of samples grows.
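One of the two ingredients, block conjugate gradient for a block of right-hand sides, can be sketched in NumPy as follows; the SPD test matrix and Rademacher right-hand sides are illustrative assumptions, and the block-seed mechanism and matrix-free diagonal preconditioner of the paper are not reproduced.

```python
import numpy as np

def block_cg(A, B, tol=1e-8, maxiter=500):
    """Block conjugate gradient for A X = B with SPD A and a block of
    right-hand sides B (O'Leary-style recurrence). A production code would add
    deflation of converged or linearly dependent columns; this sketch does not."""
    X = np.zeros_like(B)
    R = B - A @ X
    P = R.copy()
    RtR = R.T @ R
    norm_B = np.linalg.norm(B, axis=0)
    for _ in range(maxiter):
        Q = A @ P
        alpha = np.linalg.solve(P.T @ Q, RtR)     # (P^T A P)^{-1} (R^T R)
        X += P @ alpha
        R -= Q @ alpha
        RtR_new = R.T @ R
        if np.all(np.linalg.norm(R, axis=0) <= tol * norm_B):
            break
        beta = np.linalg.solve(RtR, RtR_new)      # (R^T R)^{-1} (R_new^T R_new)
        P = R + P @ beta
        RtR = RtR_new
    return X

# Dense SPD test matrix with a block of Rademacher right-hand sides, as used
# by stochastic diagonal estimators.
rng = np.random.default_rng(0)
n, s = 400, 16
G = rng.standard_normal((n, n))
A = G @ G.T / n + np.eye(n)
B = rng.choice([-1.0, 1.0], size=(n, s))
X = block_cg(A, B)
print("relative residual:", np.linalg.norm(B - A @ X) / np.linalg.norm(B))
```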


SIAM Journal on Matrix Analysis and Applications | 2008

Computation of Large Invariant Subspaces Using Polynomial Filtered Lanczos Iterations with Applications in Density Functional Theory

Constantine Bekas; Effrosini Kokiopoulou; Yousef Saad

The most expensive part of all electronic structure calculations based on density functional theory lies in the computation of an invariant subspace associated with some of the smallest eigenvalues of a discretized Hamiltonian operator. The dimension of this subspace typically depends on the total number of valence electrons in the system, and can easily reach hundreds or even thousands when large systems with many atoms are considered. At the same time, the discretization of Hamiltonians associated with large systems yields very large matrices, whether with planewave or real-space discretizations. The combination of these two factors results in one of the most significant bottlenecks in computational materials science. In this paper we show how to efficiently compute a large invariant subspace associated with the smallest eigenvalues of a symmetric/Hermitian matrix using polynomially filtered Lanczos iterations. The proposed method does not try to extract individual eigenvalues and eigenvectors. Instead, it constructs an orthogonal basis of the invariant subspace by combining two main ingredients. The first is a filtering technique to dampen the undesirable contribution of the largest eigenvalues at each matrix-vector product in the Lanczos algorithm. This technique employs a well-selected low pass filter polynomial, obtained via a conjugate residual-type algorithm in polynomial space. The second ingredient is the Lanczos algorithm with partial reorthogonalization. Experiments are reported to illustrate the efficiency of the proposed scheme compared to state-of-the-art implicitly restarted techniques.
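The following NumPy sketch conveys the filtering idea using a plain Chebyshev polynomial on a synthetic matrix with a spectral gap, in place of the paper's conjugate residual-derived polynomial and partial reorthogonalization; the matrix, interval bounds, filter degree, and number of Lanczos steps are illustrative assumptions.

```python
import numpy as np

def cheb_filter_matvec(H, v, deg, a, b):
    """Apply p(H) v where p is the degree-`deg` Chebyshev polynomial scaled so
    that the unwanted interval [a, b] maps to [-1, 1]; eigenvalues below a are
    strongly amplified, which is the low-pass filtering effect."""
    c, e = (a + b) / 2.0, (b - a) / 2.0
    t_prev = v
    t_curr = (H @ v - c * v) / e
    for _ in range(deg - 1):
        t_next = 2.0 * (H @ t_curr - c * t_curr) / e - t_prev
        t_prev, t_curr = t_curr, t_next
    return t_curr

def krylov_basis(matvec, n, m, seed=0):
    """Builds an orthonormal Krylov basis of the filtered operator (Lanczos with
    full reorthogonalization; the tridiagonal matrix is not needed here because
    Rayleigh-Ritz is done with H directly afterwards)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, m))
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = q
        w = matvec(q)
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)   # full reorthogonalization
        q = w / np.linalg.norm(w)
    return Q

# Synthetic symmetric matrix with 20 wanted eigenvalues in [0, 1] and the rest
# in [3, 10]; the spectral gap keeps the illustration clean.
rng = np.random.default_rng(2)
n, n_want = 400, 20
evals = np.concatenate([np.linspace(0.0, 1.0, n_want), np.linspace(3.0, 10.0, n - n_want)])
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = (V * evals) @ V.T

# Lanczos on the filtered operator p(H), then Rayleigh-Ritz with H itself.
a, b = 2.0, 10.5                                     # bounds of the unwanted interval
Q = krylov_basis(lambda v: cheb_filter_matvec(H, v, deg=15, a=a, b=b), n, m=30)
ritz = np.sort(np.linalg.eigvalsh(Q.T @ H @ Q))[:n_want]
print("max eigenvalue error:", np.max(np.abs(ritz - evals[:n_want])))
```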


Concurrency and Computation: Practice and Experience | 2012

Low-cost data uncertainty quantification

Constantine Bekas; Alessandro Curioni; I. Fedulova

The analysis of an ever-accumulating backlog of data presents a huge challenge in all aspects of computing. Inverse covariance matrices are very important in this respect. We target data uncertainty quantification, a very useful measure of which is provided by the diagonal entries of the inverse covariance matrix. In previous work, we introduced a novel method that reduces the overall complexity by at least two orders of magnitude. At the same time, a state-of-the-art message-passing interface (MPI) implementation allowed us to reach a sustained performance of up to 73% (730 TFLOPS on the full 72-rack Blue Gene/P configuration at Jülich). Thanks to its reduced complexity, this work has attracted significant interest, and we have received numerous requests concerning its exploitation in various fields. A common denominator of these requests is that they almost all came from people with no, or at best limited, high-performance computing background. Nevertheless, all the interest is in analyzing huge data sets, suitably adapting the method to the particular application. A bottleneck, then, is that potential users are reluctant to pay for a steep learning curve to become proficient in parallel computing using the de facto standard, MPI. Thus, we turned to the Partitioned Global Address Space programming model and in particular the Unified Parallel C language. In this work, we give a comprehensive description of the framework and demonstrate the efficiency of the state-of-the-art MPI implementation. In addition, we show that one can develop an easy-to-follow yet efficient Unified Parallel C implementation, which is also easy to debug and maintain, features that significantly boost overall productivity.


Future Generation Computer Systems | 2005

The design of a distributed MATLAB-based environment for computing pseudospectra

Constantine Bekas; Effrosini Kokiopoulou; Efstratios Gallopoulos

It has been documented in the literature that the pseudospectrum of a matrix is a powerful concept that broadens our understanding of phenomena based on matrix computations. When the matrix A is non-normal, however, computing the pseudospectrum becomes a very expensive task. Thus, the use of high performance computing resources becomes key to obtaining useful answers in acceptable amounts of time. In this work we describe the design and implementation of an environment that integrates a suite of state-of-the-art algorithms running on a cluster of workstations to enable the matrix pseudospectrum to become a practical tool for scientists and engineers. The user interacts with the environment via the graphical user interface PPsGUI. The environment is constructed on top of CMTM, an existing environment that enables distributed computation via an MPI API for MATLAB.


International Conference on Supercomputing | 2001

Towards the effective parallel computation of matrix pseudospectra

Constantine Bekas; Effrosini Kokiopoulou; Ioannis Koutis; Efstratios Gallopoulos

Given a matrix A, the computation of its ε-pseudospectrum Λ_ε(A) is a far more expensive task than the computation of characteristics such as the condition number and the matrix spectrum. As research of the last 15 years has shown, however, the matrix pseudospectrum provides valuable information that is not included in other indicators. So we ask: how can we compute it efficiently and build a tool that would help engineers and scientists perform such analyses? In this paper we focus on parallel algorithms for computing pseudospectra. The most widely used algorithm for computing pseudospectra is embarrassingly parallel; nevertheless, it is extremely costly, and one cannot hope to achieve truly high performance with it. We describe algorithms that have drastically improved performance while maintaining a high degree of large grain parallelism. We evaluate the effectiveness of these methods in the context of a MATLAB-based environment for parallel programming using MPI on small, off-the-shelf parallel systems.

Collaboration


Dive into Constantine Bekas's collaborations.

Top Co-Authors

James R. Chelikowsky

University of Texas at Austin


Murilo L. Tiago

Oak Ridge National Laboratory
