Publication


Featured research published by Eduardo F. D'Azevedo.


Archive | 1997

ScaLAPACK Users' Guide

L. S. Blackford; Jaeyoung Choi; Andrew J. Cleary; Eduardo F. D'Azevedo; James Demmel; Inderjit S. Dhillon; Jack J. Dongarra; Sven Hammarling; Greg Henry; Antoine Petitet; K. Stanley; David Walker; R. C. Whaley



Concurrency and Computation: Practice and Experience | 2000

The design and implementation of the parallel out‐of‐core ScaLAPACK LU, QR, and Cholesky factorization routines

Eduardo F. D'Azevedo; Jack J. Dongarra

This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. The full matrix is stored on disk, and the factorization routines transfer submatrix panels into memory as needed. The 'left-looking' column-oriented variant of the factorization algorithm is implemented to reduce disk I/O traffic. The routines are built on a portable I/O interface and use high-performance ScaLAPACK factorization routines as in-core computational kernels. We present the details of the implementation of the out-of-core ScaLAPACK factorization routines, as well as performance and scalability results on a Beowulf Linux cluster.
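
As a concrete illustration of the access pattern described above, here is a minimal, self-contained C sketch of a left-looking, column-oriented out-of-core LU factorization. It shows only the shape of the algorithm under simplifying assumptions: a heap buffer stands in for the disk file, panels are single columns, and pivoting is omitted, whereas the actual routines use blocked panels, pivoting, and ScaLAPACK kernels.

    /* Left-looking out-of-core LU sketch: a heap buffer plays the disk,
       panels are single columns, no pivoting. */
    #include <stdio.h>
    #include <string.h>

    #define N 4
    static double disk[N * N];            /* stand-in for the matrix file */

    static void read_col(int j, double *c)        { memcpy(c, disk + j * N, N * sizeof(double)); }
    static void write_col(int j, const double *c) { memcpy(disk + j * N, c, N * sizeof(double)); }

    static void ooc_lu(void)
    {
        double colj[N], colk[N];
        for (int j = 0; j < N; j++) {
            read_col(j, colj);                  /* only column j is "in core"   */
            for (int k = 0; k < j; k++) {       /* left-looking: re-read each   */
                read_col(k, colk);              /* factored column and apply    */
                for (int i = k + 1; i < N; i++) /* its update to column j       */
                    colj[i] -= colk[i] * colj[k];
            }
            for (int i = j + 1; i < N; i++)     /* scale to form L entries      */
                colj[i] /= colj[j];
            write_col(j, colj);                 /* exactly one write per column */
        }
    }

    int main(void)
    {
        /* small diagonally dominant test matrix, stored column-major */
        double a[N * N] = { 4,1,1,1,  1,4,1,1,  1,1,4,1,  1,1,1,4 };
        memcpy(disk, a, sizeof a);
        ooc_lu();
        for (int i = 0; i < N; i++, puts(""))
            for (int j = 0; j < N; j++)
                printf("%8.4f ", disk[j * N + i]);  /* combined L\U factors */
        return 0;
    }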


International Conference on Computational Science | 2005

Vectorized sparse matrix multiply for compressed row storage format

Eduardo F. D'Azevedo; Mark R. Fahey; Richard Tran Mills

The innovation of this work is a simple vectorizable algorithm for performing sparse matrix-vector multiply in compressed sparse row (CSR) storage format. Unlike the vectorizable jagged diagonal (JAD) format, this algorithm requires no data rearrangement and can easily be adapted to a sophisticated library framework such as PETSc. Numerical experiments on the Cray X1 show an order of magnitude improvement over the non-vectorized algorithm.
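
For reference, this is the baseline (non-vectorized) CSR matrix-vector multiply in C; the short inner loop over each row's nonzeros is what defeats vectorization on vector hardware such as the X1. The paper's vectorized reformulation is not reproduced here.

    /* y = A*x for A in CSR format: row_ptr has n+1 entries; col_idx and
       val hold the nonzeros of each row contiguously. */
    #include <stdio.h>

    static void csr_matvec(int n, const int *row_ptr, const int *col_idx,
                           const double *val, const double *x, double *y)
    {
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                sum += val[k] * x[col_idx[k]];   /* gather from x */
            y[i] = sum;
        }
    }

    int main(void)
    {
        /* 3x3 example: [[2,0,1],[0,3,0],[4,0,5]] */
        int    row_ptr[] = {0, 2, 3, 5};
        int    col_idx[] = {0, 2, 1, 0, 2};
        double val[]     = {2, 1, 3, 4, 5};
        double x[] = {1, 1, 1}, y[3];
        csr_matvec(3, row_ptr, col_idx, val, x, y);
        printf("%g %g %g\n", y[0], y[1], y[2]);  /* expect 3 3 9 */
        return 0;
    }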


SIAM Journal on Scientific Computing | 2000

Are Bilinear Quadrilaterals Better Than Linear Triangles?

Eduardo F. D'Azevedo

This paper compares the theoretical effectiveness of bilinear approximation over quadrilaterals with linear approximation over triangles. Anisotropic mesh transformation is used to generate asymptotically optimally efficient meshes for piecewise linear interpolation over triangles and bilinear interpolation over quadrilaterals. For approximating a convex function, although bilinear quadrilaterals are more efficient, linear triangles are more accurate and may be preferred in finite element computations; whereas for saddle-shaped functions, quadrilaterals may offer a higher-order approximation on a well-designed mesh. A surprising finding is that different grid orientations may yield an order of magnitude improvement in approximation accuracy.
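
A small worked check of the saddle-shaped case: with f(x,y) = xy on the unit square (our choice of saddle function; the paper's analysis is far more general), the bilinear interpolant reproduces the twist term exactly, while linear interpolation on the two triangles cut by the diagonal leaves an error of 0.25.

    /* Compare bilinear-on-quad vs linear-on-triangles interpolation of
       the saddle function f(x,y) = xy over the unit square. */
    #include <stdio.h>
    #include <math.h>

    static double f(double x, double y) { return x * y; }

    /* bilinear interpolant from the four corner values */
    static double bilinear(double x, double y)
    {
        return f(0,0)*(1-x)*(1-y) + f(1,0)*x*(1-y)
             + f(0,1)*(1-x)*y     + f(1,1)*x*y;
    }

    /* linear interpolant on the two triangles cut by the diagonal y = x */
    static double tri_linear(double x, double y)
    {
        if (y <= x)  /* triangle (0,0),(1,0),(1,1) */
            return f(0,0) + (f(1,0) - f(0,0))*x + (f(1,1) - f(1,0))*y;
        else         /* triangle (0,0),(0,1),(1,1) */
            return f(0,0) + (f(1,1) - f(0,1))*x + (f(0,1) - f(0,0))*y;
    }

    int main(void)
    {
        double equad = 0.0, etri = 0.0;
        for (int i = 0; i <= 100; i++)
            for (int j = 0; j <= 100; j++) {
                double x = i / 100.0, y = j / 100.0;
                equad = fmax(equad, fabs(f(x,y) - bilinear(x,y)));
                etri  = fmax(etri,  fabs(f(x,y) - tri_linear(x,y)));
            }
        /* expect: bilinear error 0 (xy reproduced exactly), triangle error 0.25 */
        printf("max bilinear error: %g\nmax triangle error: %g\n", equad, etri);
        return 0;
    }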


Computers & Geosciences | 2001

HBGC123D: a high-performance computer model of coupled hydrogeological and biogeochemical processes

Jin P. Gwo; Eduardo F. D'Azevedo; Hartmut Frenzel; Melaine Mayes; Gour-Tsyh Yeh; Philip M. Jardine; Karen M. Salvage; Forrest M. Hoffman

Groundwater flow and transport models have been used to assist management of subsurface water resources and water quality. The need for more efficient use of technical and financial resources has recently motivated the development of more effective remediation techniques and complex models of coupled hydrogeological and biogeochemical processes. We present a high-performance computer model of the coupled processes, HBGC123D. The model uses a hybrid Eulerian-Lagrangian finite element method to solve the solute transport equation and Newton's method to solve the system of nonlinear, mixed kinetics and equilibrium reaction equations. Application of the model to a laboratory soil column with multispecies tracer injection suggests that one may use the model to derive important parameters of subsurface solute fate and transport. These parameters may be used for predictive purposes in similar field problems. To this end, we present a three-dimensional, hypothetical bioremediation simulation of an aquifer contaminated by CoNTA. The simulation suggests that, using oxygen alone to stimulate biodegradation of the contaminant, one may reduce the waste to 40% in 10 years. Using a refined mesh of this three-dimensional model, we also conduct a performance study of HBGC123D on an array of SGI Origin 2000 distributed shared-memory processors. Both the computational kernels and the entire model show very good performance up to 32 processors, and the CPU time is reduced essentially 20-fold using 64 processors. This result suggests that HBGC123D may be a useful tool in assisting environmental restoration efforts such as waste site characterization and remediation.
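
The abstract mentions Newton's method for the mixed kinetics and equilibrium reaction equations. Purely as a toy illustration, with invented parameters and nothing taken from HBGC123D itself, the sketch below uses a scalar Newton iteration to partition a total concentration between an aqueous phase and a Langmuir-sorbed phase.

    /* Toy Newton iteration: split total concentration ctot between
       aqueous c and Langmuir-sorbed s = smax*c/(Kd + c) by solving
       g(c) = c + smax*c/(Kd + c) - ctot = 0.  All parameters invented. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double smax = 2.0, Kd = 0.5, ctot = 1.0;  /* assumed values */
        double c = ctot;                    /* initial guess: all aqueous */
        for (int it = 0; it < 20; it++) {
            double g  = c + smax * c / (Kd + c) - ctot;
            double dg = 1.0 + smax * Kd / ((Kd + c) * (Kd + c));
            double dc = g / dg;
            c -= dc;                        /* Newton update */
            printf("iter %d: c = %.10f\n", it, c);
            if (fabs(dc) < 1e-12) break;    /* converged */
        }
        return 0;
    }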


Computers & Geosciences | 2010

Application of a hybrid MPI/OpenMP approach for parallel groundwater model calibration using multi-core computers

Guoping Tang; Eduardo F. D'Azevedo; Fan Zhang; Jack C. Parker; David B. Watson; Philip M. Jardine

Calibration of groundwater models involves hundreds to thousands of forward solutions, each of which may solve many transient coupled nonlinear partial differential equations, resulting in a computationally intensive problem. We describe a hybrid MPI/OpenMP approach that exploits two levels of parallelism in software and hardware to reduce calibration time on multi-core computers. HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for direct solution of a reactive transport model application and a field-scale coupled flow and transport model application. In the reactive transport model, a single parallelizable loop is identified, using GPROF, to account for over 97% of the total computational time. Adding a few lines of OpenMP compiler directives to this loop yields a speedup of about 10 on a 16-core compute node. For the field-scale model, parallelizable loops in 14 of 174 HGC5 subroutines, which account for 99% of the execution time, are identified. As these loops are parallelized incrementally, scalability is found to be limited by a loop in which Cray PAT detects a cache miss rate of over 90%. With this loop rewritten, a speedup similar to that of the first application is achieved. The OpenMP-parallelized code can be run efficiently on multiple workstations in a network or on multiple compute nodes of a cluster, as slaves under parallel PEST, to speed up model calibration. To run calibration on clusters as a single task, the Levenberg-Marquardt algorithm is added to HGC5, with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, 100-200 compute cores are used to reduce the calibration time from weeks to a few hours for these two applications. The approach is applicable to most existing groundwater model codes for many applications.
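
A generic illustration of the kind of change described: one OpenMP directive on a dominant loop with independent iterations. The loop body and problem size are invented for the example and are not HGC5 code; compile with -fopenmp (the pragma is simply ignored otherwise).

    /* Parallelizing a hot loop by adding a single OpenMP directive. */
    #include <stdio.h>
    #include <math.h>

    #define N 1000000

    static double conc[N], rate[N];

    int main(void)
    {
        for (int i = 0; i < N; i++)
            conc[i] = 1.0 + i % 7;

        /* the added directive: iterations are independent, so one pragma
           suffices; schedule and reduction clauses are tuned per loop in
           a real code */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            rate[i] = conc[i] * exp(-conc[i]);

        printf("rate[0] = %f\n", rate[0]);
        return 0;
    }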


Physics of Plasmas | 2006

Self-consistent full-wave and Fokker-Planck calculations for ion cyclotron heating in non-Maxwellian plasmas

E. F. Jaeger; Lee A. Berry; S. D. Ahern; Richard Frederick Barrett; D. B. Batchelor; Mark Dwain Carter; Eduardo F. D'Azevedo; R. D. Moore; R.W. Harvey; J. R. Myra; D. A. D’Ippolito; R. J. Dumont; C. K. Phillips; H. Okuda; David Smithe; P.T. Bonoli; John Wright; M. Choi

Magnetically confined plasmas can contain significant concentrations of nonthermal plasma particles arising from fusion reactions, neutral beam injection, and wave-driven diffusion in velocity space. Initial one-dimensional studies and experimental results show that nonthermal energetic ions can significantly affect wave propagation and heating in the ion cyclotron range of frequencies. In addition, these ions can absorb power at high harmonics of the cyclotron frequency, where conventional two-dimensional global-wave models are not valid. In this work, the all-orders global-wave solver AORSA [E. F. Jaeger et al., Phys. Rev. Lett. 90, 195001 (2003)] is generalized to treat non-Maxwellian velocity distributions. Quasilinear diffusion coefficients are derived directly from the wave fields and used to calculate energetic ion velocity distributions with the CQL3D Fokker-Planck code [R. W. Harvey and M. G. McCoy, Proceedings of the IAEA Technical Committee Meeting on Simulation and Modeling of Thermonuclear ...


Journal of Physics: Conference Series | 2009

Scaling to 150K cores: recent algorithm and performance engineering developments enabling XGC1 to run at scale

Mark Adams; Seung-Hoe Ku; Patrick H. Worley; Eduardo F. D'Azevedo; Julian Cummings; Cindy Chang

Particle-in-cell (PIC) methods have for many decades proven effective in discretizing the Vlasov-Maxwell system of equations describing the core of toroidal burning plasmas. Recent physical understanding of the importance of edge physics for stability and transport in tokamaks has led to the development of the first fully toroidal edge PIC code, XGC1. The edge region poses special meshing problems for PIC methods because of the lack of closed flux surfaces, which makes field-line-following meshes and coordinate systems problematic. We present a solution to this problem: a semi-field-line-following mesh method in a cylindrical coordinate system. Additionally, modern supercomputers require highly concurrent algorithms and implementations, with all levels of the memory hierarchy efficiently utilized to realize optimal code performance. This paper presents a mesh and particle partitioning method, suited to our meshing strategy, for use on highly concurrent cache-based computing platforms.
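
As background to the PIC discretization discussed above, the sketch below shows the core scatter step of a PIC cycle: 1D charge deposition with linear (cloud-in-cell) weights. Its irregular, particle-driven grid accesses are what make mesh and particle partitioning matter on cache-based machines; the example is invented and none of it is XGC1 code.

    /* 1D PIC charge deposition with linear (cloud-in-cell) weights. */
    #include <stdio.h>

    #define NCELL 8
    #define NPART 4

    int main(void)
    {
        double xp[NPART] = {0.4, 2.5, 2.9, 6.1};  /* particle positions  */
        double qp = 1.0;                          /* charge per particle */
        double rho[NCELL + 1] = {0};              /* grid charge density */

        for (int p = 0; p < NPART; p++) {
            int    i = (int)xp[p];        /* cell containing the particle */
            double w = xp[p] - i;         /* fractional position in cell  */
            rho[i]     += qp * (1.0 - w); /* scatter to the two nearest   */
            rho[i + 1] += qp * w;         /* grid points                  */
        }
        for (int i = 0; i <= NCELL; i++)
            printf("rho[%d] = %.2f\n", i, rho[i]);
        return 0;
    }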


Journal of Computational Physics | 2016

A fully non-linear multi-species Fokker-Planck-Landau collision operator for simulation of fusion plasma

R. Hager; Eisung Yoon; S. Ku; Eduardo F. D'Azevedo; Patrick H. Worley; Choong-Seock Chang

Fusion edge plasmas can be far from thermal equilibrium and require the use of a non-linear collision operator for accurate numerical simulations. In this article, the non-linear single-species Fokker-Planck-Landau collision operator developed by Yoon and Chang (2014) is generalized to include multiple particle species. The finite volume discretization used in this work naturally yields exact conservation of mass, momentum, and energy. The implementation of this new non-linear Fokker-Planck-Landau operator in the gyrokinetic particle-in-cell codes XGC1 and XGCa is described, and results of a verification study are discussed. Finally, the numerical techniques that make our non-linear collision operator viable on high-performance computing systems are described, including specialized load balancing algorithms and nested OpenMP parallelization. The collision operator's good weak and strong scaling behavior is shown.
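
The abstract cites nested OpenMP parallelization as one of the enabling techniques. A minimal, self-contained sketch of that pattern, with invented patch counts and a stand-in work loop (none of this is XGC code): an outer thread team over patches, each member spawning an inner team.

    /* Nested OpenMP: an outer team over patches, an inner team per patch.
       Compile with -fopenmp. */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        omp_set_max_active_levels(2);      /* allow two nested levels */
        double total = 0.0;

        #pragma omp parallel for num_threads(2) reduction(+:total)
        for (int patch = 0; patch < 4; patch++) {
            double local = 0.0;
            #pragma omp parallel for num_threads(2) reduction(+:local)
            for (int i = 0; i < 1000; i++)
                local += patch * 0.001 + i * 1e-6;  /* stand-in work */
            total += local;
        }
        printf("total = %f\n", total);
        return 0;
    }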


Journal of Computational Physics | 2007

Compression of magnetohydrodynamic simulation data using singular value decomposition

Diego del-Castillo-Negrete; S.P. Hirshman; Donald A. Spong; Eduardo F. D'Azevedo

Numerical calculations of magnetic and flow fields in magnetohydrodynamic (MHD) simulations can result in extensive data sets. Particle-based calculations in these MHD fields, needed to provide closure relations for the MHD equations, will require communication of this data to multiple processors and rapid interpolation at numerous particle orbit positions. To facilitate this analysis, it is advantageous to compress the data using singular value decomposition (SVD, or proper orthogonal decomposition, POD) methods. As an example of the compression technique, SVD is applied to magnetic field data arising from a dynamic nonlinear MHD code. The performance of the SVD compression algorithm is analyzed by calculating Poincaré plots for electron orbits in a three-dimensional magnetic field and comparing the results with uncompressed data.
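
A minimal rank-k compression in the spirit of the technique: factor a small field with LAPACKE's dgesvd, keep the leading K singular triplets, and reconstruct on demand. The 4x3 test matrix and retained rank are invented for the example; compile with -llapacke -llapack -lblas.

    /* Rank-K SVD compression of a small row-major "field". */
    #include <stdio.h>
    #include <math.h>
    #include <lapacke.h>

    #define M 4
    #define N 3
    #define K 2   /* retained rank */

    int main(void)
    {
        double a[M * N]    = { 1, 2, 3,  2, 4, 6.1,  3, 6, 9.2,  4, 8, 12 };
        double orig[M * N] = { 1, 2, 3,  2, 4, 6.1,  3, 6, 9.2,  4, 8, 12 };
        double s[N], u[M * M], vt[N * N], superb[N - 1];

        /* dgesvd overwrites a, so we kept a copy in orig */
        if (LAPACKE_dgesvd(LAPACK_ROW_MAJOR, 'A', 'A', M, N, a, N,
                           s, u, M, vt, N, superb) != 0)
            return 1;

        /* reconstruct A_K = sum_{r<K} s_r u_r v_r^T and report the error */
        double maxerr = 0.0;
        for (int i = 0; i < M; i++)
            for (int j = 0; j < N; j++) {
                double aij = 0.0;
                for (int r = 0; r < K; r++)
                    aij += s[r] * u[i * M + r] * vt[r * N + j];
                double e = fabs(orig[i * N + j] - aij);
                if (e > maxerr) maxerr = e;
            }
        printf("singular values: %g %g %g\n", s[0], s[1], s[2]);
        printf("max abs error of rank-%d reconstruction: %g\n", K, maxerr);
        return 0;
    }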

Collaboration


Eduardo F. D'Azevedo's most frequent co-authors and their affiliations:

Patrick H. Worley, Oak Ridge National Laboratory
S. Ku, Princeton Plasma Physics Laboratory
E. F. Jaeger, Oak Ridge National Laboratory
Lee A. Berry, Oak Ridge National Laboratory
D. B. Batchelor, Oak Ridge National Laboratory
Choong-Seock Chang, Princeton Plasma Physics Laboratory
C.S. Chang, Courant Institute of Mathematical Sciences
Eisung Yoon, Rensselaer Polytechnic Institute
John Wright, Massachusetts Institute of Technology