
Osni Marques

Lawrence Berkeley National Laboratory

Publication

Featured researches published by Osni Marques.

Proteins | 2000

Building-Block Approach for Determining Low-Frequency Normal Modes of Macromolecules

Florence Tama; Florent Xavier Gadéa; Osni Marques; Yves-Henri Sanejouand

Normal mode analyses of proteins of various sizes, ranging from 46 residues (crambin) up to 858 residues (dimeric citrate synthase), were performed using standard approaches as well as a recently proposed method that rests on the hypothesis that low‐frequency normal modes of proteins can be described as pure rigid‐body motions of blocks of consecutive amino‐acid residues. This hypothesis is strongly supported by our results: the latter method, named RTB, yields very accurate approximations for the low‐frequency normal modes of all proteins considered. Moreover, the quality of the normal modes thus obtained depends very little on the way the polypeptide chain is split into blocks. Notably, with six amino acids per block, the normal modes are almost as accurate as with a single amino acid per block. In this case, for a protein of n residues and N atoms, the RTB method requires the diagonalization of an n × n matrix, whereas standard procedures require the diagonalization of a 3N × 3N matrix. Being fast, our approach can be useful for normal mode analyses of large systems, paving the way for further developments and applications in contexts where the normal modes are needed frequently, for example during molecular dynamics calculations. Proteins 2000;41:1–7.

SIAM Journal on Scientific Computing | 2008

Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers

James Demmel; Osni Marques; Beresford N. Parlett; Christof Vömel

Abstract. We compare four algorithms from the latest LAPACK 3.1 release for computing eigenpairs of a symmetric tridiagonal matrix. These include QR iteration, bisection and inverse iteration (BI), the Divide-and-Conquer method (DC), and the method of Multiple Relatively Robust Representations (MR). Our evaluation considers speed and accuracy when computing all eigenpairs, and additionally subset computations. Using a variety of carefully selected test problems, our study includes a variety of today's computer architectures. Our conclusions can be summarized as follows. (1) DC and MR are generally much faster than QR and BI on large matrices. (2) MR almost always does the fewest floating point operations, but at a lower MFlop rate than all the other algorithms. (3) The exact performance of MR and DC strongly depends on the matrix at hand. (4) DC and QR are the most accurate algorithms with observed accuracy O(√n ε). The accuracy of BI and MR is generally O(nε). (5) MR is preferable to BI for subset computations.

Key words. LAPACK, symmetric eigenvalue problem, inverse iteration, Divide & Conquer, QR algorithm, MRRR algorithm, accuracy, performance, benchmark.

AMS subject classifications. 15A18, 15A23.

1. Introduction. One goal of the latest 3.1 release [25] of LAPACK [1] is to produce the fastest possible symmetric eigensolvers subject to the constraint of delivering small residuals and orthogonal eigenvectors. For an input matrix A that may be dense or banded, one standard approach is the conversion to tridiagonal form T; then the eigenvalues and eigenvectors of T are found, and last the eigenvectors of T are transformed to eigenvectors of A.

Depending on the situation, all the eigenpairs or just some of them may be desired. LAPACK, for some algorithms, allows selection by eigenvalue indices ('find λ_i, λ_{i+1}, ..., λ_j, where λ_1 ≤ λ_2 ≤ · · · ≤ λ_n are all the eigenvalues in increasing order, and their eigenvectors') or by an interval ('find all the eigenvalues in [a, b] and their eigenvectors').

This paper analyzes the performance and accuracy of four algorithms:

1. QR iteration, in LAPACK's driver STEV (QR for short),
2. Bisection and Inverse Iteration, in STEVX (BI for short),
3. Divide and Conquer, in STEVD (DC for short),
4. Multiple Relatively Robust Representations, in STEVR (MR for short).

Section 2 gives a brief description of these algorithms with references.

For a representative picture of each algorithm's capacities, we developed an extensive set of test matrices [7], broken into two classes: (1) 'practical matrices' based on reducing matrices from a variety of practical applications to tridiagonal form, and generating some other tridiagonals with similar spectra, and (2) synthetic 'testing matrices' chosen to have extreme distributions of eigenvalues or other properties designed to exercise one or more of the algorithms; see Section 3.1 for details. The timing and …

Affiliations. James W. Demmel and Beresford N. Parlett: Mathematics Department and Computer Science Division, University of California, Berkeley, CA 94720, USA ({demmel@cs,parlett@math}.berkeley.edu). Osni A. Marques and Christof Vömel: Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA ({oamarques,cvoemel}@lbl.gov).
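To make selection by eigenvalue index concrete, the following is a minimal pure-Python sketch of the bisection idea for a symmetric tridiagonal matrix, using a Sturm-sequence count. It is an illustration only and not LAPACK's STEVX code; the function names `count_below` and `kth_eigenvalue`, the tolerance, and the zero-pivot guard are assumptions of this sketch.

```python
def count_below(d, e, x):
    """Count eigenvalues smaller than x for the symmetric tridiagonal
    matrix with diagonal d and off-diagonal e (len(e) == len(d) - 1)."""
    tiny = 1e-300  # assumed guard against an exactly zero pivot
    count = 0
    q = 1.0
    for i in range(len(d)):
        q = d[i] - x - (e[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = -tiny
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(d, e, k, tol=1e-12):
    """Approximate the k-th smallest eigenvalue (k is 0-based) by bisection."""
    n = len(d)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [(abs(e[i - 1]) if i > 0 else 0.0) +
              (abs(e[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(d[i] - radius[i] for i in range(n))
    hi = max(d[i] + radius[i] for i in range(n))
    for _ in range(200):  # hard cap on the number of bisection steps
        if hi - lo <= tol * max(1.0, abs(lo), abs(hi)):
            break
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:
            hi = mid  # eigenvalue k lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For example, `kth_eigenvalue([2.0, 2.0], [1.0], 0)` approximates the smallest eigenvalue of the 2 × 2 matrix with diagonal 2 and off-diagonal 1. Only eigenvalues are computed here; the inverse iteration step that produces eigenvectors is omitted.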

ACM Transactions on Mathematical Software | 2002

On computing Givens rotations reliably and efficiently

David Bindel; James Demmel; William Kahan; Osni Marques

We consider the efficient and accurate computation of Givens rotations. When f and g are positive real numbers, this simply amounts to computing the values of c = f/√(f² + g²), s = g/√(f² + g²), and r = √(f² + g²). This apparently trivial computation merits closer consideration for the following three reasons. First, while the definitions of c, s and r seem obvious in the case of two nonnegative arguments f and g, there is enough freedom of choice when one or more of f and g are negative, zero or complex that LAPACK auxiliary routines SLARTG, CLARTG, SLARGV and CLARGV can compute rather different values of c, s and r for mathematically identical values of f and g. To eliminate this unnecessary ambiguity, the BLAS Technical Forum chose a single consistent definition of Givens rotations that we will justify here. Second, computing accurate values of c, s and r as efficiently as possible and reliably despite over/underflow is surprisingly complicated. For complex Givens rotations, the most efficient formulas require only one real square root and one real divide (as well as several much cheaper additions and multiplications), but a reliable implementation using only working precision has a number of cases. On a Sun Ultra-10, the new implementation is slightly faster than the previous LAPACK implementation in the most common case, and 2.7 to 4.6 times faster than the corresponding vendor, reference or ATLAS routines. It is also more reliable; all previous codes occasionally suffer from large inaccuracies due to over/underflow. For real Givens rotations, there are also improvements in speed and accuracy, though not as striking. Third, the design process that led to this reliable implementation is quite systematic, and could be applied to the design of similarly reliable subroutines.
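As a small illustration of the real case, the sketch below computes c, s and r so that c·f + s·g = r and −s·f + c·g = 0. It is not the paper's implementation: the function name `givens` is hypothetical, the sign choices for zero or negative inputs are an assumption of this sketch (c is kept nonnegative), and overflow is avoided simply by delegating to `math.hypot` instead of the case analysis the abstract describes.

```python
import math


def givens(f, g):
    """Return (c, s, r) with c*f + s*g == r and -s*f + c*g == 0 (real f, g).

    Assumed convention for this sketch: c >= 0, and r takes the sign of f
    when f is nonzero.
    """
    if g == 0.0:
        return 1.0, 0.0, f
    if f == 0.0:
        return 0.0, math.copysign(1.0, g), abs(g)
    # math.hypot avoids the overflow/underflow of a naive sqrt(f*f + g*g).
    r = math.copysign(math.hypot(f, g), f)
    return f / r, g / r, r
```

Calling `givens(3.0, 4.0)` gives values close to c = 0.6, s = 0.8, r = 5, and `givens(1e200, 1e200)` stays finite even though f² + g² would overflow in double precision.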

Linear Algebra and its Applications | 2000

An Implementation of the dqds Algorithm (Positive Case)

Beresford N. Parlett; Osni Marques

The dqds algorithm was introduced in 1994 to compute singular values of bidiagonal matrices to high relative accuracy, but it may also be used to compute eigenvalues of tridiagonal matrices. This paper discusses in detail the issues that have to be faced when the algorithm is to be realized on a computer: criteria for accepting a value, for splitting the matrix, and for choosing a shift to reduce the number of iterations, as well as the relative advantages of using IEEE arithmetic when available. Ways to avoid unnecessary over/underflows are described. In addition, some new formulae are developed to approximate the smallest eigenvalue from a twisted factorization of a matrix. The results of extensive testing are presented at the end. The list of contents is a valuable guide to the reader interested in specific features of the algorithm.
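For orientation, here is a minimal sketch of a single dqds transform in its textbook form, where q and e hold the squared diagonal and squared off-diagonal entries of a bidiagonal matrix. It deliberately omits everything the paper is actually about (acceptance criteria, splitting, shift selection, over/underflow safeguards); the function name `dqds_step` and the calling convention are assumptions of this sketch.

```python
def dqds_step(q, e, shift=0.0):
    """Apply one dqds transform with the given shift.

    q: squared diagonal entries (length n); e: squared off-diagonal
    entries (length n - 1). Returns the transformed (q_new, e_new).
    """
    n = len(q)
    q_new = [0.0] * n
    e_new = [0.0] * (n - 1)
    d = q[0] - shift
    for i in range(n - 1):
        q_new[i] = d + e[i]
        t = q[i + 1] / q_new[i]
        e_new[i] = e[i] * t
        d = d * t - shift
    q_new[n - 1] = d
    return q_new, e_new


def squared_singular_values(q, e, steps=200):
    """Repeat zero-shift steps; q then tends to the squared singular values."""
    for _ in range(steps):
        q, e = dqds_step(q, e)
    return q
```

With zero shift the iteration converges only linearly, which is why a practical implementation depends on a good shift strategy. Each step lowers sum(q) + sum(e) by n times the shift, which gives a quick sanity check on the transform.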

Journal of Computational Physics | 2008

State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems

Christof Vömel; Stanimire Tomov; Osni Marques; Andrew Canning; Lin-Wang Wang; Jack J. Dongarra

The band edge states determine optical and electronic properties of semiconductor nano-structures which can be computed from an interior eigenproblem. We study the reliability and performance of state-of-the-art iterative eigensolvers on large quantum dots and wires, focusing on variants of preconditioned CG, Lanczos, and Davidson methods. One Davidson variant, the GD+k (Olsen) method, is identified to be as reliable as the commonly used preconditioned CG while consistently being between two and three times faster.

Lecture Notes in Computer Science | 1998

Large-Scale SVD and Subspace-Based Methods for Information Retrieval

Hongyuan Zha; Osni Marques; Horst D. Simon

A theoretical foundation for latent semantic indexing (LSI) is proposed by adapting a model first used in array signal processing to the context of information retrieval using the concept of subspaces. It is shown that this subspace-based model, coupled with the minimum description length (MDL) principle, leads to a statistical test to determine the dimensions of the latent-concept subspaces in LSI. The effect of weighting on the choice of the optimal dimensions of latent-concept subspaces is illustrated. It is also shown that the model imposes a so-called low-rank-plus-shift structure that is approximately satisfied by the cross-product of the term-document matrices. This structure can be exploited to give a more accurate updating scheme for LSI and to correct some of the misconceptions about the achievable retrieval accuracy in LSI updating. Variants of Lanczos algorithms are illustrated with numerical test results on a Cray T3E using document collections generated from the World Wide Web.

ACM Transactions on Mathematical Software | 2008

Algorithm 880: A testing infrastructure for symmetric tridiagonal eigensolvers

Osni Marques; Christof Vömel; James Demmel; Beresford N. Parlett

LAPACK is often mentioned as a positive example of a software library that encapsulates complex, robust, and widely used numerical algorithms for a wide range of applications. At installation time, the user has the option of running a (limited) number of test cases to verify the integrity of the installation process. On the algorithm developer's side, however, more exhaustive tests are usually performed to study algorithm behavior on a variety of problem settings and computer architectures. In this process, difficult test cases need to be found that reflect particular challenges of an application or push algorithms to extreme behavior. These tests are then assembled into a comprehensive collection, making it possible for any new or competing algorithm to be stressed in a similar way. This article describes an infrastructure for exhaustively testing the symmetric tridiagonal eigensolvers implemented in LAPACK. It consists of two parts: a selection of carefully chosen test matrices with particular idiosyncrasies and a portable testing framework that allows for easy testing and data processing. The tester facilitates experiments with algorithmic choices, parameter and threshold studies, and performance comparisons on different architectures.

Parallel Computing | 2006

Prospectus for the next LAPACK and ScaLAPACK libraries

James Demmel; Jack J. Dongarra; Beresford N. Parlett; William Kahan; Ming Gu; David Bindel; Yozo Hida; Xiaoye S. Li; Osni Marques; E. Jason Riedy; Christof Vömel; Julien Langou; Piotr Luszczek; Jakub Kurzak; Alfredo Buttari; Julie Langou; Stanimire Tomov

New releases of the widely used LAPACK and ScaLAPACK numerical linear algebra libraries are planned. Based on an ongoing user survey (www.netlib.org/lapack-dev) and research by many people, we are proposing the following improvements: faster algorithms, including better numerical methods, memory hierarchy optimizations, parallelism, and automatic performance tuning to accommodate new architectures; more accurate algorithms, including better numerical methods and use of extra precision; expanded functionality, including updating and downdating, new eigenproblems, etc., and putting more of LAPACK into ScaLAPACK; and improved ease of use, e.g., via friendlier interfaces in multiple languages. To accomplish these goals we are also relying on better software engineering techniques and contributions from collaborators at many institutions.

Archive | 2013

High Performance Computing for Computational Science - VECPAR 2012

Michel Daydé; Osni Marques; Kengo Nakajima

This book constitutes the thoroughly refereed post-conference proceedings of the 10th International Conference on High Performance Computing for Computational Science, VECPAR 2012, held in Kobe, Japan, in July 2012. The 28 papers presented together with 7 invited talks were carefully selected during two rounds of reviewing and revision. The papers are organized in topical sections on CPU computing, applications, finite element method from various viewpoints, cloud and visualization performance, method and tools for advanced scientific computing, algorithms and data analysis, parallel iterative solvers on multicore architectures.

ACM Transactions on Mathematical Software | 2005

An overview of the Advanced CompuTational Software (ACTS) collection

Leroy A. Drummond; Osni Marques

The ACTS Collection brings together a number of general-purpose computational tools that were developed by independent research projects mostly funded and supported by the U.S. Department of Energy. These tools tackle a number of common computational issues found in many applications, mainly implementation of numerical algorithms, and support for code development, execution, and optimization. In this article, we introduce the numerical tools in the collection and their functionalities, present a model for developing more complex computational applications on top of ACTS tools, and summarize applications that use these tools. Last, we present a vision of the ACTS project for deployment of the ACTS Collection by the computational sciences community.
