
Publication


Featured research published by Wael R. Elwasif.


IEEE International Conference on High Performance Computing, Data and Analytics | 2006

A Component Architecture for High-Performance Scientific Computing

Benjamin A. Allan; Robert C. Armstrong; David E. Bernholdt; Felipe Bertrand; Kenneth Chiu; Tamara L. Dahlgren; Kostadin Damevski; Wael R. Elwasif; Thomas Epperly; Madhusudhan Govindaraju; Daniel S. Katz; James Arthur Kohl; Manoj Kumar Krishnan; Gary Kumfert; J. Walter Larson; Sophia Lefantzi; Michael J. Lewis; Allen D. Malony; Lois C. McInnes; Jarek Nieplocha; Boyana Norris; Steven G. Parker; Jaideep Ray; Sameer Shende; Theresa L. Windus; Shujia Zhou

The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.
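The provides/uses port idea at the heart of the CCA can be sketched in a few lines. The sketch below is a hypothetical, framework-free illustration in Python (the CCA itself uses SIDL/Babel interfaces across languages); all class, method, and port names here are invented for the example.

```python
class Port:
    """Marker base class for a port: a small, public interface."""

class IntegratorPort(Port):
    def integrate(self, f, a, b):
        raise NotImplementedError

class MidpointIntegrator(IntegratorPort):
    """A component that *provides* IntegratorPort."""
    def integrate(self, f, a, b):
        # One-point midpoint rule over [a, b].
        return (b - a) * f((a + b) / 2.0)

class Driver:
    """A component that *uses* an IntegratorPort; the framework injects
    the connection, so Driver never names a concrete implementation."""
    def __init__(self):
        self.integrator = None  # wired by the framework, not constructed

    def connect(self, port_name, port):
        if port_name == "integrator":
            self.integrator = port

    def run(self):
        return self.integrator.integrate(lambda x: x * x, 0.0, 1.0)

# Framework-style wiring: any IntegratorPort provider can be swapped in
# without changing Driver, which is the plug-and-play property.
driver = Driver()
driver.connect("integrator", MidpointIntegrator())
result = driver.run()
```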


IEEE International Symposium on Fault-Tolerant Computing | 1998

Experimental assessment of workstation failures and their impact on checkpointing systems

James S. Plank; Wael R. Elwasif

In the past twenty years, there has been a wealth of theoretical research on minimizing the expected running time of a program in the presence of failures by employing checkpointing and rollback recovery. In the same time period, there has been little experimental research to corroborate these results. We study three separate projects that monitor failure in workstation networks. Our goals are twofold. The first is to see how these results correlate with the theoretical results, and the second is to assess their impact on strategies for checkpointing long-running computations on workstations and networks of workstations. A significant result of our work is that although the base assumptions of the theoretical research do not hold, many of the results are still applicable.
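A representative result from the theoretical literature the paper tests against failure data is Young's first-order approximation for the checkpoint interval that minimizes expected lost work. The sketch below is classical background under the idealized failure model (independent, memoryless failures), not a result of the paper itself:

```python
import math

def young_interval(checkpoint_cost, mtbf):
    """Young's first-order estimate of the optimal time between
    checkpoints: tau ~= sqrt(2 * C * MTBF), valid when the checkpoint
    cost C is much smaller than the mean time between failures."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)

# E.g. a 60 s checkpoint cost and a 24 h mean time between failures
# suggest checkpointing roughly every 54 minutes.
tau = young_interval(60.0, 24 * 3600.0)
```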


Component-Based Software Engineering | 2004

Computational Quality of Service for Scientific Components

Boyana Norris; Jaideep Ray; Robert C. Armstrong; Lois Curfman McInnes; David E. Bernholdt; Wael R. Elwasif; Allen D. Malony; Sameer Shende

Scientific computing on massively parallel computers presents unique challenges to component-based software engineering (CBSE). While CBSE is at least as enabling for scientific computing as it is for other arenas, the requirements are different. We briefly discuss how these requirements shape the Common Component Architecture, and we describe some recent research on quality-of-service issues to address the computational performance and accuracy of scientific simulations.


Journal of Computational Physics | 2012

Event-based parareal: A data-flow based implementation of parareal

Lee A. Berry; Wael R. Elwasif; José Miguel Reynolds-Barredo; D. Samaddar; Raul Sanchez; David E. Newman

Parareal is an iterative algorithm that, in effect, achieves temporal decomposition for a time-dependent system of differential or partial differential equations. A solution is obtained in a shorter wall-clock time, but at the expense of increased compute cycles. The algorithm combines a fine solver that solves the system to acceptable accuracy with an approximate coarse solver. The critical task for the successful implementation of parareal on any system is the development of a coarse solver that leads to convergence in a small number of iterations compared to the number of time slices in the full time interval, and is, at the same time, much faster than the fine solver. Very fast coarse solvers may not lead to sufficiently rapid convergence, and slow coarse solvers may not lead to significant gains even if the number of iterations to convergence is satisfactory. We find that the difficulty of meeting these conflicting demands can be substantially eased by using a data-driven, event-based implementation of parareal. As a result, tasks for one iteration do not wait for the previous iteration to complete, but are started when the needed data are available. For given convergence properties, the event-based approach relaxes the speed requirements on the coarse solver by a factor of ~K, where K is the number of iterations required for a converged solution. This may, for many problems, lead to an efficient parareal implementation that would otherwise not be possible or would require substantial coarse solver development. In addition, the framework used for this implementation executes a task when the data dependencies are satisfied and computational resources are available. This leads to improved computational efficiency over previous approaches that pipeline or schedule groups of tasks to a particular processor or group of processors.
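Stripped of the event-based machinery, the core parareal iteration the abstract describes can be sketched serially. The propagators below are illustrative stand-ins chosen for this example; in practice the fine solves within an iteration are the part run in parallel (or, in this paper, fired as data-flow events as soon as their inputs are ready):

```python
import math

def parareal(fine, coarse, u0, t0, t1, n_slices, n_iters):
    """Serial sketch of the parareal iteration for du/dt = f(u).

    fine, coarse: propagators mapping (u, ta, tb) -> approximate u(tb).
    The fine solves inside each iteration are mutually independent.
    """
    ts = [t0 + (t1 - t0) * i / n_slices for i in range(n_slices + 1)]
    # Initial guess: one sequential sweep of the cheap coarse solver.
    u = [u0]
    for i in range(n_slices):
        u.append(coarse(u[i], ts[i], ts[i + 1]))
    for _ in range(n_iters):
        fine_vals = [fine(u[i], ts[i], ts[i + 1]) for i in range(n_slices)]
        coarse_old = [coarse(u[i], ts[i], ts[i + 1]) for i in range(n_slices)]
        new_u = [u0]
        for i in range(n_slices):
            # Sequential coarse sweep plus the parareal correction term.
            c_new = coarse(new_u[i], ts[i], ts[i + 1])
            new_u.append(c_new + fine_vals[i] - coarse_old[i])
        u = new_u
    return u

# Example: du/dt = -u on [0, 1]; exact exponential step as the "fine"
# solver and one backward-Euler step as the "coarse" solver.
fine_step = lambda u, a, b: u * math.exp(-(b - a))
coarse_step = lambda u, a, b: u / (1.0 + (b - a))
u = parareal(fine_step, coarse_step, 1.0, 0.0, 1.0, n_slices=4, n_iters=4)
```

With an exact fine solver, slice k becomes exact after k iterations, which illustrates why convergence in K much smaller than the number of slices is the condition for a useful speedup.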


Journal of Computational Physics | 2012

Mechanisms for the convergence of time-parallelized, parareal turbulent plasma simulations

José Miguel Reynolds-Barredo; David E. Newman; Raul Sanchez; D. Samaddar; Lee A. Berry; Wael R. Elwasif

Parareal is a recent algorithm able to parallelize the time dimension in spite of its sequential nature. It has been applied to several linear and nonlinear problems and, very recently, to a simulation of fully-developed, two-dimensional drift wave turbulence. The mere fact that parareal works in such a turbulent regime is in itself somewhat unexpected, due to the characteristic sensitivity of turbulence to any change in initial conditions. This fundamental property of any turbulent system should render the iterative correction procedure characteristic of the parareal method inoperative, but this seems not to be the case. In addition, the choices that must be made to implement parareal (division of the temporal domain, selection of the coarse solver and so on) are currently made using trial-and-error approaches. Here, we identify the mechanisms responsible for the convergence of parareal in these simulations of drift wave turbulence. We also investigate which conditions these mechanisms impose on any successful parareal implementation. The results reported here should be useful to guide future implementations of parareal within the much wider context of fully-developed fluid and plasma turbulent simulations.


Parallel, Distributed and Network-Based Processing | 2010

The Design and Implementation of the SWIM Integrated Plasma Simulator

Wael R. Elwasif; David E. Bernholdt; Aniruddha G. Shet; Samantha S. Foley; Randall Bramley; D. B. Batchelor; Lee A. Berry

As computing capabilities have increased, the coupling of computational models has become an increasingly viable and therefore important way of improving the physical fidelity of simulations. Applications currently using some form of multicode or multi-component coupling include climate modeling, rocket simulations, and chemistry. In recent years, the plasma physics community has also begun to pursue integrated multiphysics simulations for space weather and fusion energy applications. Such model coupling generally exposes new issues in the physical, mathematical, and computational aspects of the problem. This paper focuses on the computational aspects of one such effort, detailing the design and implementation of the Integrated Plasma Simulator (IPS) for the Center for Simulation of Wave Interactions with Magnetohydrodynamics (SWIM). The IPS framework focuses on maximizing flexibility for the creators of loosely-coupled component-based simulations, and provides services for execution coordination, resource management, data management, and inter-component communication. It also serves as a proving ground for a concurrent “multi-tasking” execution model to improve resource utilization, and application-level fault tolerance. We also briefly describe how the IPS has been applied to several problems of interest to the fusion community.
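The "multi-tasking" execution idea, running component tasks concurrently whenever they have no mutual data dependencies instead of one at a time, can be sketched with a thread pool. This is an illustrative analogy, not the IPS API; the task names are invented:

```python
from concurrent.futures import ThreadPoolExecutor

def run_ready_tasks(tasks, workers=4):
    """Launch independent tasks concurrently and collect their results.

    A toy stand-in for a multi-tasking execution model: tasks whose
    inputs are all available need not wait for one another, which
    improves utilization of the allocated resources."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: f.result() for name, f in futures.items()}

# Two hypothetical component tasks with no dependency between them.
results = run_ready_tasks({
    "rf_solve": lambda: "rf done",
    "mhd_solve": lambda: "mhd done",
})
```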


Proceedings of the 2007 Symposium on Component and Framework Technology in High-Performance and Scientific Computing | 2007

Component framework for coupled integrated fusion plasma simulation

Wael R. Elwasif; David E. Bernholdt; Lee A. Berry; D. B. Batchelor

Successful simulation of the complex physics that affect magnetically confined fusion plasma remains an important target milestone towards the development of viable fusion energy. Major advances in the underlying physics formulations, mathematical modeling, and computational tools and techniques are needed to enable a complete fusion simulation on the emerging class of large-scale capability parallel computers that are coming on-line in the next few years. Several pilot projects are currently being undertaken to explore different (partial) code integration and coupling problems, and possible solutions that may guide the larger integration endeavor. In this paper, we present the design and implementation details of one such project, a component-based approach to couple existing codes to model the interaction between high power radio frequency (RF) electromagnetic waves and magnetohydrodynamics (MHD) aspects of the burning plasma. The framework and component design utilize a light coupling approach based on a high-level view of constituent codes that facilitates rapid incorporation of new components into the integrated simulation framework. The work illustrates the viability of the light coupling approach to better understand physics and stand-alone computer code dependencies and interactions, as a precursor to a more tightly coupled integrated simulation environment.


Journal of Applied Physics | 2015

Multiscale modeling and characterization for performance and safety of lithium-ion batteries

Sreekanth Pannala; John A. Turner; Srikanth Allu; Wael R. Elwasif; Sergiy Kalnaus; Srdjan Simunovic; Abhishek Kumar; Jay Jay Billings; Hsin Wang; Jagjit Nanda

Lithium-ion batteries are highly complex electrochemical systems whose performance and safety are governed by coupled nonlinear electrochemical-electrical-thermal-mechanical processes over a range of spatiotemporal scales. Gaining an understanding of the role of these processes, as well as developing predictive capabilities for the design of better performing batteries, requires synergy between theory, modeling, and simulation, and fundamental experimental work to support the models. This paper presents an overview of the work performed by the authors aligned with both experimental and computational efforts. In this paper, we describe a new, open source computational environment for battery simulations with an initial focus on lithium-ion systems but designed to support a variety of model types and formulations. This system has been used to create three-dimensional cell and battery-pack models that explicitly simulate all the battery components (current collectors, electrodes, and separator). The models are used to predict battery performance under normal operations and to study thermal and mechanical safety aspects under adverse conditions. This paper also provides an overview of the experimental techniques to obtain crucial validation data to benchmark the simulations at various scales for performance as well as abuse. We detail some initial validation using characterization experiments such as infrared and neutron imaging and micro-Raman mapping. In addition, we identify opportunities for future integration of theory, modeling, and experiments.


Proceedings of the 2007 Symposium on Component and Framework Technology in High-Performance and Scientific Computing | 2007

Bocca: a development environment for HPC components

Wael R. Elwasif; Boyana Norris; Benjamin A. Allan; Robert C. Armstrong

In high-performance scientific software development, the emphasis is often on short time to first solution. Even when the development of new components mostly reuses existing components or libraries and only small amounts of new code must be created, dealing with the component glue code and software build processes to obtain complete applications is still tedious and error-prone. Component-based software meant to reduce complexity at the application level increases complexity with the attendant glue code. To address these needs, we introduce Bocca, the first tool to enable application developers to perform rapid component prototyping while maintaining robust software-engineering practices suitable to HPC environments. Bocca provides project management and a comprehensive build environment for creating and managing applications composed of Common Component Architecture components. Of critical importance for HPC applications, Bocca is designed to operate in a language-agnostic way, simultaneously handling components written in any of the languages commonly used in scientific applications: C, C++, Fortran, Fortran77, Python, and Java. Bocca automates the tasks related to the component glue code, freeing the user to focus on the scientific aspects of the application. Bocca embraces the philosophy pioneered by Ruby on Rails for web applications: start with something that works and evolve it to the user's purpose.


International Conference on Conceptual Structures | 2011

Strategies for Fault Tolerance in Multicomponent Applications

Aniruddha G. Shet; Wael R. Elwasif; Samantha S. Foley; Byung H. Park; David E. Bernholdt; Randall Bramley

This paper discusses on-going work with the Integrated Plasma Simulator (IPS), a framework for coupled multiphysics simulations of plasmas, to allow simulations to run through the loss of nodes on which the simulation is executing. While many different techniques are available to improve the fault tolerance of computational science applications on high-performance computer systems, checkpoint/restart (C/R) remains virtually the only one that sees widespread use in practice. Our focus here is to augment the traditional C/R approach with additional techniques that can provide a more localized and tailored response to faults, based on the ability to restart failed tasks on an individual basis and the use of information external to the application itself to guide decision-making, in many cases avoiding the need to stop and restart the entire simulation. This capability involves several features within the IPS framework, and leverages the Fault Tolerance Backplane, a publish/subscribe event service that disseminates fault-related information throughout HPC systems, to obtain information from the Reliability, Availability and Serviceability (RAS) subsystem of the HPC system. This work is described in the context of Cray XT-series computer systems for concreteness, but is applicable to other environments as well. As part of the analysis of this work, we discuss the requirements to generalize this approach to other complex simulation applications beyond the Integrated Plasma Simulator.
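The localized response the paper argues for, restarting a failed task individually rather than rolling the whole coupled simulation back to its last global checkpoint, can be sketched simply. This is an illustrative toy, not the IPS implementation; all names are invented:

```python
def run_with_task_restart(tasks, max_retries=2):
    """Retry each failed task individually instead of restarting the
    whole simulation; only when retries are exhausted does the failure
    escalate (e.g. to a traditional full checkpoint/restart)."""
    results = {}
    for name, task in tasks.items():
        for attempt in range(max_retries + 1):
            try:
                results[name] = task()
                break
            except RuntimeError:
                if attempt == max_retries:
                    raise  # exhausted retries: escalate to full restart
    return results

# Example: a task that fails once (say, its node was lost) and then
# succeeds when restarted on fresh resources.
attempts = {"count": 0}
def flaky_task():
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("node lost")
    return "converged"

results = run_with_task_restart({"solve": flaky_task})
```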

Collaboration


Dive into Wael R. Elwasif's collaboration network.

Top Co-Authors

David E. Bernholdt, Oak Ridge National Laboratory
Sreekanth Pannala, Oak Ridge National Laboratory
Srikanth Allu, Oak Ridge National Laboratory
Lee A. Berry, Oak Ridge National Laboratory
John A. Turner, Oak Ridge National Laboratory
Srdjan Simunovic, Oak Ridge National Laboratory
Sergiy Kalnaus, Oak Ridge National Laboratory
D. B. Batchelor, Oak Ridge National Laboratory
Aniruddha G. Shet, Oak Ridge National Laboratory
Samantha S. Foley, Indiana University Bloomington